consistency reliability estimates: Topics by Science.gov

Sample records for consistency reliability estimates

Use of Internal Consistency Coefficients for Estimating Reliability of Experimental Tasks Scores

PubMed Central

Green, Samuel B.; Yang, Yanyun; Alt, Mary; Brinkley, Shara; Gray, Shelley; Hogan, Tiffany; Cowan, Nelson

2017-01-01

Reliabilities of scores for experimental tasks are likely to differ from one study to another to the extent that the task stimuli change, the number of trials varies, the type of individuals taking the task changes, the administration conditions are altered, or the focal task variable differs. Given reliabilities vary as a function of the design of these tasks and the characteristics of the individuals taking them, making inferences about the reliability of scores in an ongoing study based on reliability estimates from prior studies is precarious. Thus, it would be advantageous to estimate reliability based on data from the ongoing study. We argue that internal consistency estimates of reliability are underutilized for experimental task data and in many applications could provide this information using a single administration of a task. We discuss different methods for computing internal consistency estimates with a generalized coefficient alpha and the conditions under which these estimates are accurate. We illustrate use of these coefficients using data for three different tasks. PMID:26546100
Assessment of the Maximal Split-Half Coefficient to Estimate Reliability

ERIC Educational Resources Information Center

Thompson, Barry L.; Green, Samuel B.; Yang, Yanyun

2010-01-01

The maximal split-half coefficient is computed by calculating all possible split-half reliability estimates for a scale and then choosing the maximal value as the reliability estimate. Osburn compared the maximal split-half coefficient with 10 other internal consistency estimates of reliability and concluded that it yielded the most consistently…
The reliability and internal consistency of one-shot and flicker change detection for measuring individual differences in visual working memory capacity.

PubMed

Pailian, Hrag; Halberda, Justin

2015-04-01

We investigated the psychometric properties of the one-shot change detection task for estimating visual working memory (VWM) storage capacity, and also introduced and tested an alternative flicker change detection task for estimating these limits. In three experiments, we found that the one-shot whole-display task returns estimates of VWM storage capacity (K) that are unreliable across set sizes, suggesting that the whole-display task is measuring different things at different set sizes. In two additional experiments, we found that the one-shot single-probe variant shows improvements in the reliability and consistency of K estimates. In another additional experiment, we found that a one-shot whole-display-with-click task (requiring target localization) also showed improvements in reliability and consistency. The latter results suggest that the one-shot task can return reliable and consistent estimates of VWM storage capacity (K), and they highlight the possibility that the requirement to localize the changed target is what engenders this enhancement. Through a final series of four experiments, we introduced and tested an alternative flicker change detection method that also requires the observer to localize the changing target and that generates, from response times, an estimate of VWM storage capacity (K). We found that estimates of K from the flicker task correlated with estimates from the traditional one-shot task and also had high reliability and consistency. We highlight the flicker method's ability to estimate executive functions as well as VWM storage capacity, and discuss the potential for measuring multiple abilities with the one-shot and flicker tasks.
Internal Consistency, Retest Reliability, and their Implications For Personality Scale Validity

PubMed Central

McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio

2010-01-01

We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
Reliability and validity of generalizable skills instruments for students who are deaf, blind, or visually impaired.

PubMed

Loeding, B L; Greenan, J P

1998-12-01

The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Evaluation of General Classes of Reliability Estimators Often Used in Statistical Analyses of Quasi-Experimental Designs

NASA Astrophysics Data System (ADS)

Saini, K. K.; Sehgal, R. K.; Sethi, B. L.

2008-10-01

In this paper, the major reliability estimators are analyzed and their comparative results are discussed. Their strengths and weaknesses are evaluated in this case study. Each of the reliability estimators has certain advantages and disadvantages. Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation. However, it requires multiple raters or observers. As an alternative, you could look at the correlation of ratings of the same single observer repeated on two different occasions. Each of the reliability estimators will give a different value for reliability. In general, the test-retest and inter-rater reliability estimates will be lower in value than the parallel forms and internal consistency ones because they involve measuring at different times or with different raters. Reliability estimates are often used in statistical analyses of quasi-experimental designs.
Influences on and Limitations of Classical Test Theory Reliability Estimates.

ERIC Educational Resources Information Center

Arnold, Margery E.

It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Reliability Generalization of the Psychopathy Checklist Applied in Youthful Samples

ERIC Educational Resources Information Center

Campbell, Justin S.; Pulos, Steven; Hogan, Mike; Murry, Francie

2005-01-01

This study examines the average reliability of Hare Psychopathy Checklists (PCLs) adapted for use in samples of youthful offenders (aged 12 to 21 years). Two forms of reliability are examined: 18 alpha estimates of internal consistency and 18 intraclass correlation (two or more raters) estimates of interrater reliability. The results, an average…
Reliability of Summed Item Scores Using Structural Equation Modeling: An Alternative to Coefficient Alpha

ERIC Educational Resources Information Center

Green, Samuel B.; Yang, Yanyun

2009-01-01

A method is presented for estimating reliability using structural equation modeling (SEM) that allows for nonlinearity between factors and item scores. Assuming the focus is on consistency of summed item scores, this method for estimating reliability is preferred to those based on linear SEM models and to the most commonly reported estimate of…
Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.

ERIC Educational Resources Information Center

Feldt, Leonard S.

2002-01-01

Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
Alternative Estimates of the Reliability of College Grade Point Averages. Professional File. Article 130, Spring 2013

ERIC Educational Resources Information Center

Saupe, Joe L.; Eimers, Mardy T.

2013-01-01

The purpose of this paper is to explore differences in the reliabilities of cumulative college grade point averages (GPAs), estimated for unweighted and weighted, one-semester, 1-year, 2-year, and 4-year GPAs. Using cumulative GPAs for a freshman class at a major university, we estimate internal consistency (coefficient alpha) reliabilities for…
Processes and Procedures for Estimating Score Reliability and Precision

ERIC Educational Resources Information Center

Bardhoshi, Gerta; Erford, Bradley T.

2017-01-01

Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Reliability of the Raven Coloured Progressive Matrices for Anglo and for Mexican-American Children.

ERIC Educational Resources Information Center

Valencia, Richard R.

1984-01-01

Investigated the internal consistency reliability estimates of the Raven Coloured Progressive Matrices (CPM) for 96 Anglo and Mexican American third-grade boys from low socioeconomic status background. The results showed that the reliability estimates of the CPM for the two ethnic groups were acceptably high and extremely similar in magnitude.…
An overview of coefficient alpha and a reliability matrix for estimating adequacy of internal consistency coefficients with psychological research measures.

PubMed

Ponterotto, Joseph G; Ruckdeschel, Daniel E

2007-12-01

The present article addresses issues in reliability assessment that are often neglected in psychological research such as acceptable levels of internal consistency for research purposes, factors affecting the magnitude of coefficient alpha (alpha), and considerations for interpreting alpha within the research context. A new reliability matrix anchored in classical test theory is introduced to help researchers judge adequacy of internal consistency coefficients with research measures. Guidelines and cautions in applying the matrix are provided.
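
For readers who want to see how the magnitude of coefficient alpha depends on the item and total-score variances, the standard formula α = k/(k − 1) · (1 − Σ item variances / total-score variance) can be computed directly. The sketch below is illustrative only; the function name is hypothetical, sample (n − 1) variances are assumed, and it is not taken from the article or its reliability matrix.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient (Cronbach's) alpha for a respondents-by-items score table.

    scores: list of rows, one row of k item scores per respondent.
    Uses sample variances (n - 1 denominator), an assumption of this sketch.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the summed (total) score across respondents.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When items covary strongly, the total-score variance is much larger than the sum of the item variances and alpha approaches 1; when items are uncorrelated, the two quantities are equal and alpha is 0.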
Reliability Generalization of Scores on the Spielberger State-Trait Anxiety Inventory.

ERIC Educational Resources Information Center

Barnes, Laura L. B.; Harp, Diane; Jung, Woo Sik

2002-01-01

Conducted a reliability generalization study for the State-Trait Anxiety Inventory (C. Spielberger, 1983) by reviewing and classifying 816 research articles. Average reliability coefficients were acceptable for both internal consistency and test-retest reliability, but variation was present among the estimates. Other differences are discussed.…
Calculating system reliability with SRFYDO

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morzinski, Jerome; Anderson-Cook, Christine M; Klamann, Richard M

2010-01-01

SRFYDO is a process for estimating reliability of complex systems. Using information from all applicable sources, including full-system (flight) data, component test data, and expert (engineering) judgment, SRFYDO produces reliability estimates and predictions. It is appropriate for series systems with possibly several versions of the system which share some common components. It models reliability as a function of age and up to 2 other lifecycle (usage) covariates. Initial output from its Exploratory Data Analysis mode consists of plots and numerical summaries so that the user can check data entry and model assumptions, and help determine a final form for the system model. The System Reliability mode runs a complete reliability calculation using Bayesian methodology. This mode produces results that estimate reliability at the component, sub-system, and system level. The results include estimates of uncertainty, and can predict reliability at some not-too-distant time in the future. This paper presents an overview of the underlying statistical model for the analysis, discusses model assumptions, and demonstrates usage of SRFYDO.
Reliability reporting across studies using the Buss Durkee Hostility Inventory.

PubMed

Vassar, Matt; Hale, William

2009-01-01

Empirical research on anger and hostility has pervaded the academic literature for more than 50 years. Accurate measurement of anger/hostility and subsequent interpretation of results requires that the instruments yield strong psychometric properties. For consistent measurement, reliability estimates must be calculated with each administration, because changes in sample characteristics may alter the scale's ability to generate reliable scores. Therefore, the present study was designed to address reliability reporting practices for a widely used anger assessment, the Buss Durkee Hostility Inventory (BDHI). Of the 250 published articles reviewed, 11.2% calculated and presented reliability estimates for the data at hand, 6.8% cited estimates from a previous study, and 77.1% made no mention of score reliability. Mean alpha estimates of scores for BDHI subscales generally fell below acceptable standards. Additionally, no detectable pattern was found between reporting practices and publication year or journal prestige. Areas for future research are also discussed.
Estimating the Reliability of Single-Item Life Satisfaction Measures: Results from Four National Panel Studies

ERIC Educational Resources Information Center

Lucas, Richard E.; Donnellan, M. Brent

2012-01-01

Life satisfaction is often assessed using single-item measures. However, estimating the reliability of these measures can be difficult because internal consistency coefficients cannot be calculated. Existing approaches use longitudinal data to isolate occasion-specific variance from variance that is either completely stable or variance that…
The Riso-Hudson Enneagram Type Indicator: Estimates of Reliability and Validity

ERIC Educational Resources Information Center

Newgent, Rebecca A.; Parr, Patricia E.; Newman, Isadore; Higgins, Kristin K.

2004-01-01

This investigation was conducted to estimate the reliability and validity of scores on the Riso-Hudson Enneagram Type Indicator (D. R. Riso & R. Hudson, 1999a). Results of 287 participants were analyzed. Alpha suggests an adequate degree of internal consistency. Evidence provides mixed support for construct validity using correlational and…
Examining Readability Estimates' Predictions of Students' Oral Reading Rate: Spache, Lexile, and Forcast

ERIC Educational Resources Information Center

Ardoin, Scott P.; Williams, Jessica C.; Christ, Theodore J.; Klubnik, Cynthia; Wellborn, Claire

2010-01-01

Beyond reliability and validity, measures used to model student growth must consist of multiple probes that are equivalent in level of difficulty to establish consistent measurement conditions across time. Although existing evidence supports the reliability of curriculum-based measurement in reading (CBMR), few studies have empirically evaluated…

Evaluation of Scale Reliability with Binary Measures Using Latent Variable Modeling

ERIC Educational Resources Information Center

Raykov, Tenko; Dimitrov, Dimiter M.; Asparouhov, Tihomir

2010-01-01

A method for interval estimation of scale reliability with discrete data is outlined. The approach is applicable with multi-item instruments consisting of binary measures, and is developed within the latent variable modeling methodology. The procedure is useful for evaluation of consistency of single measures and of sum scores from item sets…
Measuring eating disorder attitudes and behaviors: a reliability generalization study

PubMed Central

2014-01-01

Background: Although score reliability is a sample-dependent characteristic, researchers often only report reliability estimates from previous studies as justification for employing particular questionnaires in their research. The present study followed reliability generalization procedures to determine the mean score reliability of the Eating Disorder Inventory and its most commonly employed subscales (Drive for Thinness, Bulimia, and Body Dissatisfaction) and the Eating Attitudes Test as a way to better identify those characteristics that might impact score reliability. Methods: Published studies that used these measures were coded based on their reporting of reliability information and additional study characteristics that might influence score reliability. Results: Score reliability estimates were included in 26.15% of studies using the EDI and 36.28% of studies using the EAT. Mean Cronbach’s alphas for the EDI (total score = .91; subscales = .75 to .89), EAT-40 (total score = .81) and EAT-26 (total score = .86; subscales = .56 to .80) suggested variability in estimated internal consistency. Whereas some EDI subscales exhibited higher score reliability in clinical eating disorder samples than in nonclinical samples, other subscales did not exhibit these differences. Score reliability information for the EAT was primarily reported for nonclinical samples, making it difficult to characterize the effect of type of sample on these measures. However, there was a tendency for mean score reliability to be higher in the adult (vs. adolescent) samples and in female (vs. male) samples. Conclusions: Overall, this study highlights the importance of assessing and reporting internal consistency during every test administration because reliability is affected by characteristics of the participants being examined. PMID:24764530
Score Reliability of Adolescent Alcohol Screening Measures: A Meta-Analytic Inquiry

ERIC Educational Resources Information Center

Shields, Alan L.; Campfield, Delia C.; Miller, Christopher S.; Howell, Ryan T.; Wallace, Kimberly; Weiss, Roger D.

2008-01-01

This study describes the reliability reporting practices in empirical studies using eight adolescent alcohol screening tools and characterizes and explores variability in internal consistency estimates across samples. Of 119 observed administrations of these instruments, 40 (34%) reported usable reliability information. The Personal Experience…
Coefficient Alpha and Reliability of Scale Scores

ERIC Educational Resources Information Center

Almehrizi, Rashid S.

2013-01-01

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Psychometric considerations in the measurement of event-related brain potentials: Guidelines for measurement and reporting.

PubMed

Clayson, Peter E; Miller, Gregory A

2017-01-01

Failing to consider psychometric issues related to reliability and validity, differential deficits, and statistical power potentially undermines the conclusions of a study. In research using event-related brain potentials (ERPs), numerous contextual factors (population sampled, task, data recording, analysis pipeline, etc.) can impact the reliability of ERP scores. The present review considers the contextual factors that influence ERP score reliability and the downstream effects that reliability has on statistical analyses. Given the context-dependent nature of ERPs, it is recommended that ERP score reliability be formally assessed on a study-by-study basis. Recommended guidelines for ERP studies include 1) reporting the threshold of acceptable reliability and reliability estimates for observed scores, 2) specifying the approach used to estimate reliability, and 3) justifying how trial-count minima were chosen. A reliability threshold for internal consistency of at least 0.70 is recommended, and a threshold of 0.80 is preferred. The review also advocates the use of generalizability theory for estimating score dependability (the generalizability theory analog to reliability) as an improvement on classical test theory reliability estimates, suggesting that the latter is less well suited to ERP research. To facilitate the calculation and reporting of dependability estimates, an open-source Matlab program, the ERP Reliability Analysis Toolbox, is presented. Copyright © 2016 Elsevier B.V. All rights reserved.
Reliability and Validity of the Evidence-Based Practice Confidence (EPIC) Scale

ERIC Educational Resources Information Center

Salbach, Nancy M.; Jaglal, Susan B.; Williams, Jack I.

2013-01-01

Introduction: The reliability, minimal detectable change (MDC), and construct validity of the evidence-based practice confidence (EPIC) scale were evaluated among physical therapists (PTs) in clinical practice. Methods: A longitudinal mail survey was conducted. Internal consistency and test-retest reliability were estimated using Cronbach's alpha…
Automation of reliability evaluation procedures through CARE - The computer-aided reliability estimation program.

NASA Technical Reports Server (NTRS)

Mathur, F. P.

1972-01-01

Description of an on-line interactive computer program called CARE (Computer-Aided Reliability Estimation) which can model self-repair and fault-tolerant organizations and perform certain other functions. Essentially CARE consists of a repository of mathematical equations defining the various basic redundancy schemes. These equations, under program control, are then interrelated to generate the desired mathematical model to fit the architecture of the system under evaluation. The mathematical model is then supplied with ground instances of its variables and is then evaluated to generate values for the reliability-theoretic functions applied to the model.
SAS and SPSS macros to calculate standardized Cronbach's alpha using the upper bound of the phi coefficient for dichotomous items.

PubMed

Sun, Wei; Chou, Chih-Ping; Stacy, Alan W; Ma, Huiyan; Unger, Jennifer; Gallaher, Peggy

2007-02-01

Cronbach's α is widely used in social science research to estimate the internal consistency of reliability of a measurement scale. However, when items are not strictly parallel, the Cronbach's α coefficient provides a lower-bound estimate of true reliability, and this estimate may be further biased downward when items are dichotomous. The estimation of standardized Cronbach's α for a scale with dichotomous items can be improved by using the upper bound of coefficient phi. SAS and SPSS macros have been developed in this article to obtain standardized Cronbach's α via this method. The simulation analysis showed that Cronbach's α from upper-bound phi might be appropriate for estimating the real reliability when standardized Cronbach's α is problematic.
Reliability models applicable to space telescope solar array assembly system

NASA Technical Reports Server (NTRS)

Patil, S. A.

1986-01-01

A complex system may consist of a number of subsystems with several components in series, parallel, or combination of both series and parallel. In order to predict how well the system will perform, it is necessary to know the reliabilities of the subsystems and the reliability of the whole system. The objective of the present study is to develop mathematical models of the reliability which are applicable to complex systems. The models are determined by assuming k failures out of n components in a subsystem. By taking k = 1 and k = n, these models reduce to parallel and series models; hence, the models can be specialized to parallel, series combination systems. The models are developed by assuming the failure rates of the components as functions of time and as such, can be applied to processes with or without aging effects. The reliability models are further specialized to Space Telescope Solar Array (STSA) System. The STSA consists of 20 identical solar panel assemblies (SPA's). The reliabilities of the SPA's are determined by the reliabilities of solar cell strings, interconnects, and diodes. The estimates of the reliability of the system for one to five years are calculated by using the reliability estimates of solar cells and interconnects given in ESA documents. Aging effects in relation to breaks in interconnects are discussed.
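
The k-failures-out-of-n idea in this abstract can be illustrated with a short binomial calculation. The sketch below is not the report's model: it assumes n independent, identical components with a common reliability r at the time of interest, and it adopts the convention (an assumption here) that the subsystem fails once k or more components have failed. The function name is hypothetical.

```python
from math import comb


def k_failure_reliability(r, n, k):
    """Probability that fewer than k of n components have failed.

    r: reliability of each component at the time of interest (0 <= r <= 1).
    n: number of independent, identical components (assumed).
    k: number of component failures that brings the subsystem down.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    q = 1 - r  # probability that a single component has failed
    # Sum the binomial probabilities of 0, 1, ..., k - 1 failures.
    return sum(comb(n, j) * q**j * r ** (n - j) for j in range(k))
```

Under this convention, k = 1 gives the series result r**n (any single failure is fatal) and k = n gives the parallel result 1 - (1 - r)**n (all components must fail). Aging could be represented by making r a function of time before calling the function.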
Psychometric Evaluation of the Young Children's Participation and Environment Measure (YC-PEM) for use in Singapore.

PubMed

Lim, Chun Yi; Law, Mary; Khetani, Mary; Rosenbaum, Peter; Pollock, Nancy

2018-08-01

To estimate the psychometric properties of a culturally adapted version of the Young Children's Participation and Environment Measure (YC-PEM) for use among Singaporean families. This is a prospective cohort study. Caregivers of 151 Singaporean children with (n = 83) and without (n = 68) developmental disabilities, between 0 and 7 years, completed the YC-PEM (Singapore) questionnaire with 3 participation scales (frequency, involvement, and change desired) and 1 environment scale for three settings: home, childcare/preschool, and community. Setting-specific estimates of internal consistency, test-retest reliability, and construct validity were obtained. Internal consistency estimates varied from .59 to .92 for the participation scales and .73 to .79 for the environment scale. Test-retest reliability estimates from the YC-PEM conducted on two occasions, 2-3 weeks apart, varied from .39 to .89 for the participation scales and from .65 to .80 for the environment scale. Moderate to large differences were found in participation and perceived environmental support between children with and without a disability. YC-PEM (Singapore) scales have adequate psychometric properties except for low internal consistency for the childcare/preschool participation frequency scale and low test-retest reliability for home participation frequency scale. The YC-PEM (Singapore) may be used for population-level studies involving young children with and without developmental disabilities.
Parts and Components Reliability Assessment: A Cost Effective Approach

NASA Technical Reports Server (NTRS)

Lee, Lydia

2009-01-01

System reliability assessment is a methodology which incorporates reliability analyses performed at parts and components level such as Reliability Prediction, Failure Modes and Effects Analysis (FMEA) and Fault Tree Analysis (FTA) to assess risks, perform design tradeoffs, and therefore, to ensure effective productivity and/or mission success. The system reliability is used to optimize the product design to accommodate today's mandated budget, manpower, and schedule constraints. Standard-based reliability assessment is an effective approach consisting of reliability predictions together with other reliability analyses for electronic, electrical, and electro-mechanical (EEE) complex parts and components of large systems based on failure rate estimates published by the United States (U.S.) military or commercial standards and handbooks. Many of these standards are globally accepted and recognized. The reliability assessment is especially useful during the initial stages when the system design is still in development and hard failure data is not yet available or manufacturers are not contractually obliged by their customers to publish the reliability estimates/predictions for their parts and components. This paper presents a methodology to assess system reliability using parts and components reliability estimates to ensure effective productivity and/or mission success efficiently, at low cost, and on a tight schedule.
Test-retest reliability and construct validity of the Helplessness, Hopelessness, and Haplessness Scale in patients with anxiety disorders.

PubMed

Vatan, Sevginar; Ertaş, Sedar; Lester, David

2011-04-01

In a sample of 100 Turkish psychiatric patients with diagnoses of anxiety disorders, Lester's Helplessness, Hopelessness, and Haplessness inventory had moderate estimates of internal consistency, test-retest reliability, and construct validity.
Psychometrics Matter in Health Behavior: A Long-term Reliability Generalization Study.

PubMed

Pickett, Andrew C; Valdez, Danny; Barry, Adam E

2017-09-01

Despite numerous calls for increased understanding and reporting of reliability estimates, social science research, including the field of health behavior, has been slow to respond and adopt such practices. Therefore, we offer a brief overview of reliability and common reporting errors; we then perform analyses to examine and demonstrate the variability of reliability estimates by sample and over time. Using meta-analytic reliability generalization, we examined the variability of coefficient alpha scores for a well-designed, consistent, nationwide health study, covering a span of nearly 40 years. For each year and sample, reliability varied. Furthermore, reliability was predicted by a sample characteristic that differed among age groups within each administration. We demonstrated that reliability is influenced by the methods and individuals from which a given sample is drawn. Our work echoes previous calls that psychometric properties, particularly reliability of scores, are important and must be considered and reported before drawing statistical conclusions.
Psychometric evaluation of a unified Portuguese-language version of the Body Shape Questionnaire in female university students.

PubMed

Silva, Wanderson Roberto; Costa, David; Pimenta, Filipa; Maroco, João; Campos, Juliana Alvares Duarte Bonini

2016-07-21

The objectives of this study were to develop a unified Portuguese-language version, for use in Brazil and Portugal, of the Body Shape Questionnaire (BSQ) and to estimate its validity, reliability, and internal consistency in Brazilian and Portuguese female university students. Confirmatory factor analysis was performed using both original (34-item) and shortened (8-item) versions. The model's fit was assessed with χ²/df, CFI, NFI, and RMSEA. Concurrent and convergent validity were assessed. Reliability was estimated through internal consistency and composite reliability (α). Transnational invariance of the BSQ was tested using multi-group analysis. The original 34-item model was refined to present a better fit and adequate validity and reliability. The shortened model was stable in both independent samples and in transnational samples (Brazil and Portugal). The use of this unified version is recommended for the assessment of body shape concerns in both Brazilian and Portuguese college students.
Reliability of Use, Abuse, and Dependence of Four Types of Inhalants in Adolescents and Young Adults

PubMed Central

Ridenour, Ty A.; Bray, Bethany C.; Cottler, Linda B.

2007-01-01

Inhalants, as a class of drugs, consist of heterogeneous substances that include some of the most dangerous drugs on a per use basis. Research on inhalant abuse has lagged behind other drugs partly because of the need for a diagnostic instrument of different types of inhalants. This study was conducted to obtain reliability estimates for the new Substance Abuse Module DSM-IV inhalants diagnoses for four types of inhalants: aerosols, gases, nitrites, and solvents as well as different diagnostic configurations of inhalant use. Participants were 162 community sample adolescents or young adults (mean age = 20.3 years, SD = 2.4). Two-thirds of the sample was male and 83.3% was Caucasian. Kappas and intraclass correlation coefficients were computed to estimate test-retest reliabilities. Results suggested (a) abuse was more common than dependence (34.6% vs. 12.3%), (b) reliabilities of abuse criteria and diagnosis were good to excellent across subtypes, and (c) reliabilities of dependence criteria and diagnoses were poor to good across subtypes. Alternative configurations of DSM-IV criteria that were consistent with previous research on adolescents provided excellent reliabilities across subtypes of inhalants. Moreover, 11.1% of participants experienced inhalants withdrawal. PMID:17576041
An Improved Internal Consistency Reliability Estimate.

ERIC Educational Resources Information Center

Cliff, Norman

1984-01-01

The proposed coefficient is derived by assuming that the average Goodman-Kruskal gamma between items of identical difficulty would be the same for items of different difficulty. An estimate of covariance between items of identical difficulty leads to an estimate of the correlation between two tests with identical distributions of difficulty.…
Neurology objective structured clinical examination reliability using generalizability theory

PubMed Central

Park, Yoon Soo; Lukas, Rimas V.; Brorson, James R.

2015-01-01

Objectives: This study examines factors affecting reliability, or consistency of assessment scores, from an objective structured clinical examination (OSCE) in neurology through generalizability theory (G theory). Methods: Data include assessments from a multistation OSCE taken by 194 medical students at the completion of a neurology clerkship. Facets evaluated in this study include cases, domains, and items. Domains refer to areas of skill (or constructs) that the OSCE measures. G theory is used to estimate variance components associated with each facet, derive reliability, and project the number of cases required to obtain a reliable (consistent, precise) score. Results: Reliability using G theory is moderate (Φ coefficient = 0.61, G coefficient = 0.64). Performance is similar across cases but differs by the particular domain, such that the majority of variance is attributed to the domain. Projections in reliability estimates reveal that students need to participate in 3 OSCE cases in order to increase reliability beyond the 0.70 threshold. Conclusions: This novel use of G theory in evaluating an OSCE in neurology provides meaningful measurement characteristics of the assessment. Differing from prior work in other medical specialties, the cases students were randomly assigned did not influence their OSCE score; rather, scores varied in expected fashion by domain assessed. PMID:26432851
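The projection step described in the abstract above (how many cases are needed for a score to reach a target reliability) can be sketched for the simplest possible design. This is a minimal sketch, assuming a fully crossed persons x cases design with only a case facet; the study itself also models domains and items, and the variance components below are made-up illustrative values, not estimates from the study. The function names are hypothetical.

```python
def g_coefficient(var_p, var_pc, n_cases):
    """Relative (G) coefficient for a persons x cases design.

    var_p  : person (true-score) variance
    var_pc : person-by-case interaction variance, confounded with error
    """
    return var_p / (var_p + var_pc / n_cases)


def phi_coefficient(var_p, var_c, var_pc, n_cases):
    """Absolute (Phi) coefficient: case difficulty also counts as error."""
    return var_p / (var_p + (var_c + var_pc) / n_cases)


def cases_needed(var_p, var_pc, target=0.70, max_cases=50):
    """Smallest number of cases whose projected G coefficient exceeds target."""
    for n in range(1, max_cases + 1):
        if g_coefficient(var_p, var_pc, n) > target:
            return n
    return None  # target not reachable within max_cases


# Illustrative variance components only (assumed, not from the study).
VAR_P, VAR_C, VAR_PC = 1.0, 0.1, 1.1

for n in (1, 2, 3, 4):
    print(n,
          round(g_coefficient(VAR_P, VAR_PC, n), 3),
          round(phi_coefficient(VAR_P, VAR_C, VAR_PC, n), 3))
print("cases needed for G > 0.70:", cases_needed(VAR_P, VAR_PC))
```

Because the Phi coefficient adds the case main effect to the error term, it can never exceed the G coefficient for the same number of cases, which matches the ordering of the two values reported in the abstract.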
Neurology objective structured clinical examination reliability using generalizability theory.

PubMed

Blood, Angela D; Park, Yoon Soo; Lukas, Rimas V; Brorson, James R

2015-11-03

This study examines factors affecting reliability, or consistency of assessment scores, from an objective structured clinical examination (OSCE) in neurology through generalizability theory (G theory). Data include assessments from a multistation OSCE taken by 194 medical students at the completion of a neurology clerkship. Facets evaluated in this study include cases, domains, and items. Domains refer to areas of skill (or constructs) that the OSCE measures. G theory is used to estimate variance components associated with each facet, derive reliability, and project the number of cases required to obtain a reliable (consistent, precise) score. Reliability using G theory is moderate (Φ coefficient = 0.61, G coefficient = 0.64). Performance is similar across cases but differs by the particular domain, such that the majority of variance is attributed to the domain. Projections in reliability estimates reveal that students need to participate in 3 OSCE cases in order to increase reliability beyond the 0.70 threshold. This novel use of G theory in evaluating an OSCE in neurology provides meaningful measurement characteristics of the assessment. Differing from prior work in other medical specialties, the cases students were randomly assigned did not influence their OSCE score; rather, scores varied in expected fashion by domain assessed. © 2015 American Academy of Neurology.
Evaluation of Thompson-type trend and monthly weather data models for corn yields in Iowa, Illinois, and Indiana

NASA Technical Reports Server (NTRS)

French, V. (Principal Investigator)

1982-01-01

An evaluation was made of Thompson-Type models which use trend terms (as a surrogate for technology), meteorological variables based on monthly average temperature, and total precipitation to forecast and estimate corn yields in Iowa, Illinois, and Indiana. Pooled and unpooled Thompson-type models were compared. Neither was found to be consistently superior to the other. Yield reliability indicators show that the models are of limited use for large area yield estimation. The models are objective and consistent with scientific knowledge. Timely yield forecasts and estimates can be made during the growing season by using normals or long range weather forecasts. The models are not costly to operate and are easy to use and understand. The model standard errors of prediction do not provide a useful current measure of modeled yield reliability.
Reliability estimation of a N-M-cold-standby redundancy system in a multicomponent stress-strength model with generalized half-logistic distribution

NASA Astrophysics Data System (ADS)

Liu, Yiming; Shi, Yimin; Bai, Xuchao; Zhan, Pei

2018-01-01

In this paper, we study the estimation for the reliability of a multicomponent system, named N-M-cold-standby redundancy system, based on progressive Type-II censoring sample. In the system, there are N subsystems consisting of M statistically independent distributed strength components, and only one of these subsystems works under the impact of stresses at a time and the others remain as standbys. Whenever the working subsystem fails, one from the standbys takes its place. The system fails when all subsystems fail. It is supposed that the underlying distributions of random strength and stress both belong to the generalized half-logistic distribution with different shape parameters. The reliability of the system is estimated by using both classical and Bayesian statistical inference. Uniformly minimum variance unbiased estimator and maximum likelihood estimator for the reliability of the system are derived. Under squared error loss function, the exact expression of the Bayes estimator for the reliability of the system is developed by using the Gauss hypergeometric function. The asymptotic confidence interval and corresponding coverage probabilities are derived based on both the Fisher and the observed information matrices. The approximate highest probability density credible interval is constructed by using Monte Carlo method. Monte Carlo simulations are performed to compare the performances of the proposed reliability estimators. A real data set is also analyzed for an illustration of the findings.

The Brazilian version of the effort-reward imbalance questionnaire to assess job stress.

PubMed

Chor, Dóra; Werneck, Guilherme Loureiro; Faerstein, Eduardo; Alves, Márcia Guimarães de Mello; Rotenberg, Lúcia

2008-01-01

The effort-reward imbalance (ERI) model has been used to assess the health impact of job stress. We aimed at describing the cross-cultural adaptation of the ERI questionnaire into Portuguese and some psychometric properties, in particular internal consistency, test-retest reliability, and factorial structure. We developed a Brazilian version of the ERI using a back-translation method and tested its reliability. The test-retest reliability study was conducted with 111 health workers and University staff. The current analyses are based on 89 participants, after exclusion of those with missing data. Reproducibility (intraclass correlation coefficients) for the "effort", "reward", and "overcommitment" dimensions of the scale was estimated at 0.76, 0.86, and 0.78, respectively. Internal consistency (Cronbach's alpha) estimates for these same dimensions were 0.68, 0.78, and 0.78, respectively. The exploratory factorial structure was fairly consistent with the model's theoretical components. We conclude that the results of this study represent the first evidence in favor of the application of the Brazilian Portuguese version of the ERI scale in health research in populations with similar socioeconomic characteristics.
Reliability of a structured interview for admission to an emergency medicine residency program.

PubMed

Blouin, Danielle

2010-10-01

Interviews are most important in resident selection. Structured interviews are more reliable than unstructured ones. We sought to measure the interrater reliability of a newly designed structured interview during the selection process to an Emergency Medicine residency program. The critical incident technique was used to extract the desired dimensions of performance. The interview tool consisted of 7 clinical scenarios and 1 global rating. Three trained interviewers marked each candidate on all scenarios without discussing candidates' responses. Interitem consistency and estimates of variance were computed. Twenty-eight candidates were interviewed. The generalizability coefficient was 0.67. Removing the central tendency ratings increased the coefficient to 0.74. Coefficients of interitem consistency ranged from 0.64 to 0.74. The structured interview tool provided good although suboptimal interrater reliability. Increasing the number of scenarios improves reliability as does applying differential weights to the rating scale anchors. The latter would also facilitate the identification of those candidates with extreme ratings.
A reliability generalization meta-analysis of coefficient alpha and test-retest coefficient for the aging males' symptoms (AMS) scale.

PubMed

Lee, Chin-Pang; Chiu, Yu-Wen; Chu, Chun-Lin; Chen, Yu; Jiang, Kun-Hao; Chen, Jiun-Liang; Chen, Ching-Yen

2016-12-01

The aging males' symptoms (AMS) scale is an instrument used to determine the health-related quality of life in adult and elderly men. The purpose of this study was to synthesize internal consistency (Cronbach's alpha) and test-retest reliability for the AMS scale and its three subscales. Of the 123 studies reviewed, 12 provided alpha coefficients which were then used in the meta-analyses of internal consistency. Seven of the 12 included studies provided test-retest coefficients, and these were used in the meta-analyses of test-retest reliability. The AMS scale had excellent internal consistency [α = 0.89 (95% CI 0.88-0.90)]; the mean alpha estimates across the AMS subscales ranged from 0.79 to 0.82. The AMS scale also had good test-retest reliability [r = 0.85 (95% CI 0.82-0.88)]; the test-retest reliability coefficients of the AMS subscales ranged from 0.76 to 0.83. There was significant heterogeneity among the included studies. The AMS scale and the three subscales had fairly good internal consistency and test-retest reliability. Future psychometric studies of the AMS scale should report important characteristics of the participants, details of item scores, and test-retest reliability.
Study samples are too small to produce sufficiently precise reliability coefficients.

PubMed

Charter, Richard A

2003-04-01

In a survey of journal articles, test manuals, and test critique books, the author found that a mean sample size (N) of 260 participants had been used for reliability studies on 742 tests. The distribution was skewed because the median sample size for the total sample was only 90. The median sample sizes for the internal consistency, retest, and interjudge reliabilities were 182, 64, and 36, respectively. The author presented sample size statistics for the various internal consistency methods and types of tests. In general, the author found that the sample sizes that were used in the internal consistency studies were too small to produce sufficiently precise reliability coefficients, which in turn could cause imprecise estimates of examinee true-score confidence intervals. The results also suggest that larger sample sizes have been used in the last decade compared with those that were used in earlier decades.
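The link between sample size and the precision of a reliability coefficient can be illustrated with a short calculation. The sketch below assumes the coefficient is a Pearson-type correlation (as in retest or interjudge designs) and uses the standard Fisher z approximation for its confidence interval; it does not apply to coefficient alpha, which needs a different interval. The value r = 0.80 is a hypothetical example, and the function name `fisher_ci` is an assumption; the sample sizes 36 and 64 are the interjudge and retest medians reported in the abstract above, and 400 is added for contrast.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation r from n cases.

    Uses Fisher's z transform: z = atanh(r), with standard error 1/sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical reliability coefficient; sample sizes as described in the lead-in.
for n in (36, 64, 400):
    lower, upper = fisher_ci(0.80, n)
    print(f"n={n:4d}  CI=({lower:.3f}, {upper:.3f})  width={upper - lower:.3f}")
```

The interval narrows as n grows, which is the sense in which small reliability studies yield imprecise coefficients.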
Reliability and validity of the McDonald Play Inventory.

PubMed

McDonald, Ann E; Vigen, Cheryl

2012-01-01

This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
The Reliability of the OWLS Written Expression Scale with ESL Kindergarten Students

ERIC Educational Resources Information Center

Harrison, Gina L.; Ogle, Keira C.; Keilty, Megan

2011-01-01

A reliability analysis was conducted on the Written Expression Scale from the Oral and Written Language Scales (OWLS, Carrow-Woolfolk, 1996), with 68 ESL and 56 non-ESL kindergarten students. Interrater and internal consistency estimates for the Written Expression Scale were examined separately for each language group. Despite lower oral English…
Changes in School Climate in a Long-Term Perspective

ERIC Educational Resources Information Center

Kallestad, Jan Helge

2010-01-01

In a previous report five school climate instruments were explored (1983 and 1985), and four scales were regarded as meaningful climate measures according to suggested criteria. These scales were re-inspected in the present study (1997 and 1998) by analyses of internal consistency, estimates of reliability (unit and aggregated reliability), and…
[Estimators of internal consistency in health research: the use of the alpha coefficient].

PubMed

da Silva, Franciele Cascaes; Gonçalves, Elizandra; Arancibia, Beatriz Angélica Valdivia; Bento, Gisele Graziele; Castro, Thiago Luis da Silva; Hernandez, Salma Stephany Soleman; da Silva, Rudney

2015-01-01

Academic production has increased in the area of health, increasingly demanding high quality in publications of great impact. One of the ways to consider quality is through methods that increase the consistency of data analysis, such as reliability which, depending on the type of data, can be evaluated by different coefficients, especially the alpha coefficient. Based on this, the present review systematically gathers scientific articles produced in the last five years that used the α coefficient as a psychometric estimator of internal consistency and reliability in the construction, adaptation and validation of instruments. The identification of the studies was conducted systematically in the databases BioMed Central Journals, Web of Science, Wiley Online Library, Medline, SciELO, Scopus, Journals@Ovid, BMJ and Springer, using inclusion and exclusion criteria. Data analyses were performed by means of triangulation, content analysis and descriptive analysis. It was found that most studies were conducted in Iran (f=3), Spain (f=2) and Brazil (f=2). These studies aimed to test the psychometric properties of instruments, with eight studies using the α coefficient to assess reliability and nine for assessing internal consistency. All studies were classified as methodological research when their objectives were analyzed. In addition, four studies were also classified as correlational and one as descriptive-correlational. It can be concluded that though the α coefficient is widely used as one of the main parameters for assessing internal consistency of questionnaires in health sciences, its use as an estimator of the trustworthiness of the methodology and of internal consistency has drawn some critiques that should be considered.
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment

ERIC Educational Resources Information Center

Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua

2012-01-01

This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
[KON-2006--Neurotic Personality Questionnaire].

PubMed

Aleksandrowicz, Jerzy W; Klasa, Katarzyna; Sobański, Jerzy A; Stolarska, Dorota

2007-01-01

Construction of a questionnaire describing personality traits connected to the occurrence and persistence of neurotic disorders. Responses of 794 patients (before treatment) and 520 persons from the control group on items of the constructed personality questionnaire and the symptom checklist "0". Analyses of subscales reliability and item-scale correlations, test-retest and split-half reliability. Factor analyses estimating internal reliability of the questionnaire. Cross-validation with the KO"0" symptom checklist. Psychometric properties of KON-2006 questionnaire indicate that it is consistent and reliable enough. Validity analyses indicate a large probability that the X-KON coefficient informs on personality dysfunctions related to neurotic disorders. The Neurotic Personality Questionnaire KON-2006 may serve to estimate personality traits connected to the occurrence and persistence of neurotic disorders as well as changes resulting from psychotherapy.
Confirmatory Factor Analysis of Persian Adaptation of Multidimensional Students' Life Satisfaction Scale (MSLSS)

ERIC Educational Resources Information Center

Hatami, Gissou; Motamed, Niloofar; Ashrafzadeh, Mahshid

2010-01-01

Validity and reliability of Persian adaptation of MSLSS in the 12-18 years, middle and high school students (430 students in grades 6-12 in Bushehr port, Iran) using confirmatory factor analysis by means of LISREL statistical package were checked. Internal consistency reliability estimates (Cronbach's coefficient [alpha]) were all above the…
A Validation of the Ski Hi Language Development Scale.

ERIC Educational Resources Information Center

Tonelson, Stephen W.

The purpose of the study was to assess the reliability and the validity of the Ski Hi Language Development Scale which was designed to determine the receptive and the expressive language levels of hearing impaired children from birth to age 5. The reliability of the instrument was estimated through: (1) internal consistency, (2) inter-rater…
Development and Validation of a Coping with Discrimination Scale: Factor Structure, Reliability, and Validity

ERIC Educational Resources Information Center

Wei, Meifen; Alvarez, Alvin N.; Ku, Tsun-Yao; Russell, Daniel W.; Bonett, Douglas G.

2010-01-01

Four studies were conducted to develop and validate the Coping With Discrimination Scale (CDS). In Study 1, an exploratory factor analysis (N = 328) identified 5 factors: Education/Advocacy, Internalization, Drug and Alcohol Use, Resistance, and Detachment, with internal consistency reliability estimates ranging from 0.72 to 0.90. In Study 2, a…
Psychometric Evaluation of the D-Catch, an Instrument to Measure the Accuracy of Nursing Documentation.

PubMed

D'Agostino, Fabio; Barbaranelli, Claudio; Paans, Wolter; Belsito, Romina; Juarez Vela, Raul; Alvaro, Rosaria; Vellone, Ercole

2017-07-01

To evaluate the psychometric properties of the D-Catch instrument. A cross-sectional methodological study. Validity and reliability were estimated with confirmatory factor analysis (CFA) and internal consistency and inter-rater reliability, respectively. A sample of 250 nursing documentations was selected. CFA showed the adequacy of a 1-factor model (chronologically descriptive accuracy) with an outlier item (nursing diagnosis accuracy). Internal consistency and inter-rater reliability were adequate. The D-Catch is a valid and reliable instrument for measuring the accuracy of nursing documentation. Caution is needed when measuring diagnostic accuracy since only one item measures this dimension. The D-Catch can be used as an indicator of the accuracy of nursing documentation and the quality of nursing care. © 2015 NANDA International, Inc.
Reliable estimation of orbit errors in spaceborne SAR interferometry. The network approach

NASA Astrophysics Data System (ADS)

Bähr, Hermann; Hanssen, Ramon F.

2012-12-01

An approach to improve orbital state vectors by orbit error estimates derived from residual phase patterns in synthetic aperture radar interferograms is presented. For individual interferograms, an error representation by two parameters is motivated: the baseline error in cross-range and the rate of change of the baseline error in range. For their estimation, two alternatives are proposed: a least squares approach that requires prior unwrapping and a less reliable gridsearch method handling the wrapped phase. In both cases, reliability is enhanced by mutual control of error estimates in an overdetermined network of linearly dependent interferometric combinations of images. Thus, systematic biases, e.g., due to unwrapping errors, can be detected and iteratively eliminated. Regularising the solution by a minimum-norm condition results in quasi-absolute orbit errors that refer to particular images. For the 31 images of a sample ENVISAT dataset, orbit corrections with a mutual consistency on the millimetre level have been inferred from 163 interferograms. The method itself qualifies by reliability and rigorous geometric modelling of the orbital error signal but does not consider interfering large scale deformation effects. However, a separation may be feasible in a combined processing with persistent scatterer approaches or by temporal filtering of the estimates.
Allometric scaling theory applied to FIA biomass estimation

Treesearch

David C. Chojnacky

2002-01-01

Tree biomass estimates in the Forest Inventory and Analysis (FIA) database are derived from numerous methodologies whose abundance and complexity raise questions about consistent results throughout the U.S. A new model based on allometric scaling theory ("WBE") offers simplified methodology and a theoretically sound basis for improving the reliability and...
Compound estimation procedures in reliability

NASA Technical Reports Server (NTRS)

Barnes, Ron

1990-01-01

At NASA, components and subsystems of components in the Space Shuttle and Space Station generally go through a number of redesign stages. While data on failures for various design stages are sometimes available, the classical procedures for evaluating reliability only utilize the failure data on the present design stage of the component or subsystem. Often, few or no failures have been recorded on the present design stage. Previously, Bayesian estimators for the reliability of a single component, conditioned on the failure data for the present design, were developed. These new estimators permit NASA to evaluate the reliability, even when few or no failures have been recorded. Point estimates for the latter evaluation were not possible with the classical procedures. Since different design stages of a component (or subsystem) generally have a good deal in common, the development of new statistical procedures for evaluating the reliability, which consider the entire failure record for all design stages, has great intuitive appeal. A typical subsystem consists of a number of different components and each component has evolved through a number of redesign stages. The present investigations considered compound estimation procedures and related models. Such models permit the statistical consideration of all design stages of each component and thus incorporate all the available failure data to obtain estimates for the reliability of the present version of the component (or subsystem). A number of models were considered to estimate the reliability of a component conditioned on its total failure history from two design stages. It was determined that reliability estimators for the present design stage, conditioned on the complete failure history for two design stages have lower risk than the corresponding estimators conditioned only on the most recent design failure data. 
Several models were explored and preliminary models involving bivariate Poisson distribution and the Consael Process (a bivariate Poisson process) were developed. Possible shortcomings of the models are noted. An example is given to illustrate the procedures. These investigations are ongoing with the aim of developing estimators that extend to components (and subsystems) with three or more design stages.
Reliability of the ecSatter Inventory as a tool to measure eating competence.

PubMed

Stotts, Jodi L; Lohse, Barbara

2007-01-01

To examine the reliability of the ecSatter Inventory (ecSI), a measure of eating competence. Self-report questionnaires were administered in person or by mail. Retesting occurred 2 to 6 weeks after completion of the first questionnaire. Both administrations of the questionnaire were completed by 259 participants who were mostly food secure, white females with some college education; mean age was 26.9 ± 10.4 years. Test-retest reliability and internal consistency. Spearman's rank correlation coefficients to estimate test-retest reliability and Cronbach alpha coefficients to estimate internal consistency. Spearman's rank correlation coefficient for ecSI total score was 0.68; subscale coefficients were 0.70 for eating attitudes, 0.70 for contextual skills, 0.65 for food acceptance, and 0.52 for internal regulation. Cronbach alpha coefficient for ecSI total score was 0.77. Subscale alpha coefficients were 0.80 for eating attitudes, 0.69 for contextual skills, 0.68 for food acceptance, and 0.66 for internal regulation. This study provides psychometric evidence about the reliability of ecSI as a measure of eating competence in this sample. Although some ecSI items may require revision, results suggest that the instrument may be used to evaluate nutrition education designed to improve eating competence.
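For readers unfamiliar with how a Cronbach alpha coefficient such as those reported above is obtained, the following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical illustration data, not ecSI data, and the function name `cronbach_alpha` is an assumption.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

With real questionnaire data, rows containing missing item responses would need to be handled before calling the function; this sketch assumes complete data.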
Developing a Danish version of the "Impact on Participation and Autonomy Questionnaire".

PubMed

Ghaziani, Emma; Krogh, Anne Grethe; Lund, Hans

2013-05-01

To translate the "Impact on Participation and Autonomy Questionnaire" into Danish (IPAQ-DK), and estimate its internal consistency and test-retest reliability in order to promote participation-based interventions and research. Translation and two successive reliability assessments through test-retest. 137 adults with varying degrees of impairment; of these, 67 participated in the final reliability assessment. The translation followed guidelines set forth by the "European Group for Quality of Life Assessment and Health Measurement". Internal consistency for subscales was estimated by Cronbach's alpha. Weighted kappa coefficients and intraclass correlation coefficients were calculated to assess the test-retest reliability at item and subscale level, respectively. A preliminary reliability assessment revealed residual issues regarding the translation and cultural adaptation of the instrument. The revised version (IPAQ-DK) was subsequently subjected to a similar assessment demonstrating Cronbach's alpha values from 0.698 to 0.817. Weighted kappa ranged from 0.370 to 0.880; 78% of these values were higher than 0.600. The intraclass correlation coefficient covered values from 0.701 to 0.818. IPAQ-DK is a useful instrument for identifying person-perceived participation restrictions and satisfaction with participation. Further studies of IPAQ-DK's floor/ceiling effects and responsiveness to change are recommended, and whether there is a need for further linguistic improvement of certain items.
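Item-level test-retest agreement on an ordinal scale, as assessed above with weighted kappa, can be sketched as follows. The abstract does not state which weighting scheme was used, so the linear default here (with a quadratic option via `power=2`) is an assumption, as are the function name and the example ratings, which are hypothetical.

```python
def weighted_kappa(first, second, categories, power=1):
    """Weighted Cohen's kappa for two sets of ordinal ratings.

    categories : ordered list of all possible rating values
    power      : 1 for linear weights, 2 for quadratic weights
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(first)
    # Observed proportions for each (first, second) rating pair.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the distance between categories.
            w = (abs(i - j) / (k - 1)) ** power
            observed_dis += w * obs[i][j]
            expected_dis += w * row[i] * col[j]
    return 1.0 - observed_dis / expected_dis


# Hypothetical item ratings (scale 1-4) at test and retest.
test = [1, 2, 3, 4, 2, 3]
retest = [1, 2, 3, 3, 2, 4]
kappa = weighted_kappa(test, retest, categories=[1, 2, 3, 4])
print(round(kappa, 3))
```

The function divides by the chance-expected disagreement, so it is undefined when every rating falls in a single category; a production version would need to guard against that case.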
Development of a self-report questionnaire designed for population-based surveillance of gingivitis in adolescents: assessment of content validity and reliability.

PubMed

Quiroz, Viviana; Reinero, Daniela; Hernández, Patricia; Contreras, Johanna; Vernal, Rolando; Carvajal, Paola

2017-01-01

This study aimed to develop and assess the content validity and reliability of a cognitively adapted self-report questionnaire designed for surveillance of gingivitis in adolescents. Ten predetermined self-report questions evaluating early signs and symptoms of gingivitis were preliminarily assessed by a panel of clinical experts. Eight questions were selected and cognitively tested in 20 adolescents aged 12 to 18 years from Santiago de Chile. The questionnaire was then conducted and answered by 178 Chilean adolescents. Internal consistency was measured using Cronbach's alpha and temporal stability was calculated using the Kappa-index. A reliable final self-report questionnaire consisting of 5 questions was obtained, with a total Cronbach's alpha of 0.73 and a Kappa-index ranging from 0.41 to 0.77 between the different questions. The proposed questionnaire is reliable, with an acceptable internal consistency and a temporal stability from moderate to substantial, and it is promising for estimating the prevalence of gingivitis in adolescents.

On the Use, the Misuse, and the Very Limited Usefulness of Cronbach's Alpha

ERIC Educational Resources Information Center

Sijtsma, Klaas

2009-01-01

This discussion paper argues that both the use of Cronbach's alpha as a reliability estimate and as a measure of internal consistency suffer from major problems. First, alpha always has a value, which cannot be equal to the test score's reliability given the inter-item covariance matrix and the usual assumptions about measurement error. Second, in…
Lifetime prediction and reliability estimation methodology for Stirling-type pulse tube refrigerators by gaseous contamination accelerated degradation testing

NASA Astrophysics Data System (ADS)

Wan, Fubin; Tan, Yuanyuan; Jiang, Zhenhua; Chen, Xun; Wu, Yinong; Zhao, Peng

2017-12-01

Lifetime and reliability are the two performance parameters of premium importance for modern space Stirling-type pulse tube refrigerators (SPTRs), which are required to operate in excess of 10 years. Demonstration of these parameters provides a significant challenge. This paper proposes a lifetime prediction and reliability estimation method that utilizes accelerated degradation testing (ADT) for SPTRs related to gaseous contamination failure. The method was experimentally validated via three groups of gaseous contamination ADT. First, the performance degradation model based on mechanism of contamination failure and material outgassing characteristics of SPTRs was established. Next, a preliminary test was performed to determine whether the mechanism of contamination failure of the SPTRs during ADT is consistent with normal life testing. Subsequently, the experimental program of ADT was designed for SPTRs. Then, three groups of gaseous contamination ADT were performed at elevated ambient temperatures of 40 °C, 50 °C, and 60 °C, respectively, and the estimated lifetimes of the SPTRs under normal condition were obtained through acceleration model (Arrhenius model). The results show good fitting of the degradation model with the experimental data. Finally, we obtained the reliability estimation of SPTRs using the Weibull distribution. The proposed novel methodology makes it possible to estimate, in less than one year, the reliability of SPTRs designed for more than 10 years.
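The two generic building blocks named in the abstract above, an Arrhenius acceleration model and a Weibull reliability function, can be sketched in a few lines. This is a minimal sketch under stated assumptions: the activation energy (0.6 eV), the normal-use temperature (20 °C), and the Weibull shape and scale values are hypothetical placeholders, not parameters from the paper; only the stress temperatures of 40, 50, and 60 °C come from the abstract. The function names are also assumptions.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(t_use_c, t_stress_c, ea_ev):
    """Arrhenius acceleration factor between a use and a stress temperature.

    Temperatures are given in degrees Celsius and converted to kelvin.
    """
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use - 1.0 / t_stress))


def weibull_reliability(t, shape, scale):
    """Two-parameter Weibull survival function R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


# Hypothetical activation energy and use temperature (assumed values).
EA_EV = 0.6
T_USE_C = 20.0

for t_stress_c in (40.0, 50.0, 60.0):  # stress temperatures from the abstract
    af = arrhenius_af(T_USE_C, t_stress_c, EA_EV)
    print(f"{t_stress_c:.0f} C: acceleration factor = {af:.1f}")

# Hypothetical Weibull parameters: shape 2.0, scale 15 years; reliability at 10 years.
print(round(weibull_reliability(10.0, 2.0, 15.0), 3))
```

Under this model, a lifetime observed at a stress temperature is multiplied by the acceleration factor to project the lifetime at the use temperature, which is how a test lasting under a year can inform a multi-year estimate.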
CREATING A DECISION CONTEXT FOR COMPARATIVE ANALYSIS AND CONSISTENT APPLICATION OF INHALATION DOSIMETRY MODELS IN CHILDREN'S RISK ASSESSMENT

EPA Science Inventory

Estimation of risks to children from exposure to airborne pollutants is often complicated by the lack of reliable epidemiological data specific to this age group. As a result, risks are generally estimated from extrapolations based on data obtained in other human age groups (e.g....
Assessment of the reliability and consistency of the "malnutrition inflammation score" (MIS) in Mexican adults with chronic kidney disease for diagnosis of protein-energy wasting syndrome (PEW).

PubMed

González-Ortiz, Ailema Janeth; Arce-Santander, Celene Viridiana; Vega-Vega, Olynka; Correa-Rotter, Ricardo; Espinosa-Cuevas, María de Los Angeles

2014-10-04

The protein-energy wasting syndrome (PEW) is a condition of malnutrition, inflammation, anorexia and wasting of body reserves resulting from inflammatory and non-inflammatory conditions in patients with chronic kidney disease (CKD). One way of assessing PEW, extensively described in the literature, is using the Malnutrition Inflammation Score (MIS). To assess the reliability and consistency of MIS for diagnosis of PEW in Mexican adults with CKD on hemodialysis (HD). Study of diagnostic tests. A sample of 45 adults with CKD on HD were analyzed during the period June-July 2014. The instrument was applied on 2 occasions; the test-retest reliability was calculated using the Intraclass Correlation Coefficient (ICC); the internal consistency of the questionnaire was analyzed using Cronbach's α coefficient. A weighted Kappa test was used to estimate the validity of the instrument; the result was subsequently compared with the Bilbrey nutritional index (BNI). The reliability of the questionnaires, evaluated in the patient sample, was ICC = 0.829. The agreement between MIS observations was considered adequate, k = 0.585 (p < 0.001); when comparing it with BNI, a value of k = 0.114 was obtained (p < 0.001). In order to estimate the tendency, a correlation test was performed. The r² correlation coefficient was 0.488 (P < 0.001). MIS has adequate reliability and validity for diagnosing PEW in the population with chronic kidney disease on HD. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
Reliability and criterion validity of two applications of the iPhone™ to measure cervical range of motion in healthy participants

PubMed Central

2013-01-01

Summary of background data: Recent smartphones, such as the iPhone, are often equipped with an accelerometer and magnetometer, which, through software applications, can perform various inclinometric functions. Although these applications are intended for recreational use, they have the potential to measure and quantify range of motion. The purpose of this study was to estimate the intra- and inter-rater reliability as well as the criterion validity of the clinometer and compass applications of the iPhone in the assessment of cervical range of motion in healthy participants. Methods: The sample consisted of 28 healthy participants. Two examiners measured cervical range of motion of each participant twice using the iPhone (for the estimation of intra- and inter-rater reliability) and once with the CROM (for the estimation of criterion validity). Estimates of reliability and validity were then established using the intraclass correlation coefficient (ICC). Results: We observed a moderate intra-rater reliability for each movement (ICC = 0.65-0.85) but a poor inter-rater reliability (ICC < 0.60). For the criterion validity, the ICCs are moderate (>0.50) to good (>0.65) for movements of flexion, extension, lateral flexions and right rotation, but poor (<0.50) for the movement left rotation. Conclusion: We found good intra-rater reliability and lower inter-rater reliability. When compared to the gold standard, these applications showed moderate to good validity. However, before using the iPhone as an outcome measure in clinical settings, studies should be done on patients presenting with cervical problems. PMID:23829201
Reliability of the Cooking Task in adults with acquired brain injury.

PubMed

Poncet, Frédérique; Swaine, Bonnie; Taillefer, Chantal; Lamoureux, Julie; Pradat-Diehl, Pascale; Chevignard, Mathilde

2015-01-01

Acquired brain injury (ABI) often leads to deficits in executive functioning (EF) responsible for severe and long-standing disabilities in daily life activities. The Cooking Task is an ecological and valid test of EF involving multi-tasking in a real environment. Given its complex scoring system, it is important to establish the tool's reliability. The objective of the study was to examine the reliability of the Cooking Task (internal consistency, inter-rater and test-retest reliability). A total of 160 patients with ABI (113 men, mean age 37 years, SD = 14.3) were tested using the Cooking Task. For test-retest reliability, patients were assessed by the same rater on two occasions (mean interval 11 days), while two raters independently and simultaneously observed and scored patients' performances to estimate inter-rater reliability. Internal consistency was high for the global scale (Cronbach α = .74). Inter-rater reliability (n = 66) for total errors was also high (ICC = .93); however, the test-retest reliability (n = 11) was poor (ICC = .36). In general the Cooking Task appears to be a reliable tool. The low test-retest results were expected given the importance of EF in the performance of novel tasks.
Reliability of air displacement plethysmography.

PubMed

Anderson, Dawn E

2007-02-01

The purpose of this study was to examine the reliability of an air displacement plethysmography device (BOD POD) over trials performed on 3 different days. Subjects consisted of 24 healthy adults (8 men, 16 women), ages 18-38 years, with body weights 46.8-93.6 kg, body mass indexes of 19.1-30.1 kg·m⁻², and percentage body fats (BF) of 7.9-43.1%. Two estimates of BF were performed on 3 days. Paired t-tests revealed no significant within-day differences in body volume (BV), thoracic gas volume (V(TG)), body density (BD), and BF. Correlations between the two V(TG) measures on a day were r = 0.86 for day 1, r = 0.93 for day 2, and r = 0.96 for day 3. BF estimates within a day had high correlations of r = 0.98. Significant differences were found between days for measures of BV, V(TG), BD, and BF. These results indicate a high reliability for within-day estimates of BF and significant differences in between-day estimates of BF using air displacement plethysmography. Reliability of BF may be increased by requiring subjects to practice the procedure for V(TG) measurement.
Measuring Fisher Information Accurately in Correlated Neural Populations

PubMed Central

Kohn, Adam; Pouget, Alexandre

2015-01-01

Neural responses are known to be variable. In order to understand how this neural variability constrains behavioral performance, we need to be able to measure the reliability with which a sensory stimulus is encoded in a given population. However, such measures are challenging for two reasons: First, they must take into account noise correlations, which can have a large influence on reliability. Second, they need to be as efficient as possible, since the number of trials available in a set of neural recordings is usually limited by experimental constraints. Traditionally, cross-validated decoding has been used as a reliability measure, but it only provides a lower bound on reliability and underestimates reliability substantially in small datasets. We show that, if the number of trials per condition is larger than the number of neurons, there is an alternative, direct estimate of reliability which consistently leads to smaller errors and is much faster to compute. The superior performance of the direct estimator is evident both for simulated data and for neuronal population recordings from macaque primary visual cortex. Furthermore, we propose generalizations of the direct estimator which measure changes in stimulus encoding across conditions and the impact of correlations on encoding and decoding, typically denoted by I_shuffle and I_diag respectively. PMID:26030735
Bayesian Approach for Reliability Assessment of Sunshield Deployment on JWST

NASA Technical Reports Server (NTRS)

Kaminskiy, Mark P.; Evans, John W.; Gallo, Luis D.

2013-01-01

Deployable subsystems are essential to mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications, for which various standardized data sources are not applicable for estimating reliability and for assessing risks. In this study, a Bayesian approach for reliability estimation of spacecraft deployment was developed for this purpose. This approach was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the observatory's telescope and science instruments. In order to collect the prior information on deployable systems, detailed studies of "heritage information" were conducted, extending over 45 years of spacecraft launches. The NASA Goddard Space Flight Center (GSFC) Spacecraft Operational Anomaly and Reporting System (SOARS) data were then used to estimate the parameters of the conjugate beta prior distribution for anomaly and failure occurrence, as these were the most consistent set of available data that could be matched to launch histories. This allows for an empirical Bayesian prediction for the risk of an anomaly occurrence of the complex Sunshield deployment, with credibility limits, using prior deployment data and test information.
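The conjugate beta update that this abstract alludes to can be illustrated with a minimal sketch. It assumes a binomial model for anomaly occurrence; the prior parameters and test counts below are hypothetical and are not taken from the SOARS data or the JWST study. Credibility limits are approximated by Monte Carlo sampling so that only the Python standard library is needed.

```python
import random


def beta_posterior(prior_a, prior_b, anomalies, trials):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior."""
    return prior_a + anomalies, prior_b + (trials - anomalies)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_credible_interval(a, b, level=0.95, draws=20000, seed=0):
    """Equal-tailed credible interval approximated from sorted random draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical numbers: a prior worth 2 anomalies in 40 heritage deployments,
# followed by 10 ground deployment tests with no anomaly.
post_a, post_b = beta_posterior(2.0, 38.0, anomalies=0, trials=10)
low, high = beta_credible_interval(post_a, post_b)
print(f"posterior mean anomaly probability: {beta_mean(post_a, post_b):.3f}")
print(f"approximate 95% credibility limits: {low:.3f} to {high:.3f}")
```

In an empirical Bayes setting such as the one described, the prior parameters would themselves be estimated from historical data rather than chosen by hand.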
Temporal validation for landsat-based volume estimation model

Treesearch

Renaldo J. Arroyo; Emily B. Schultz; Thomas G. Matney; David L. Evans; Zhaofei Fan

2015-01-01

Satellite imagery can potentially reduce the costs and time associated with ground-based forest inventories; however, for satellite imagery to provide reliable forest inventory data, it must produce consistent results from one time period to the next. The objective of this study was to temporally validate a Landsat-based volume estimation model in a four county study...
Orofacial Pain during Mastication in People with Dementia: Reliability Testing of the Orofacial Pain Scale for Non-Verbal Individuals.

PubMed

de Vries, Merlijn W; Visscher, Corine; Delwel, Suzanne; van der Steen, Jenny T; Pieper, Marjoleine J C; Scherder, Erik J A; Achterberg, Wilco P; Lobbezoo, Frank

2016-01-01

Objectives. The aim of this study was to establish the reliability of the "chewing" subscale of the OPS-NVI, a novel tool designed to estimate presence and severity of orofacial pain in nonverbal patients. Methods. The OPS-NVI consists of 16 items for observed behavior, classified into four categories and a subjective estimate of pain. Two observers used the OPS-NVI for 237 video clips of people with dementia in Dutch nursing homes during their meal to observe their behavior and to estimate the intensity of orofacial pain. Six weeks later, the same observers rated the video clips a second time. Results. Bottom and ceiling effects for some items were found. This resulted in exclusion of these items from the statistical analyses. The categories which included the remaining items (n = 6) showed reliability varying between fair-to-good and excellent (interobserver reliability, ICC: 0.40-0.47; intraobserver reliability, ICC: 0.40-0.92). Conclusions. The "chewing" subscale of the OPS-NVI showed a fair-to-good to excellent interobserver and intraobserver reliability in this dementia population. This study contributes to the validation process of the OPS-NVI as a whole and stresses the need for further assessment of the reliability of the OPS-NVI with subjects that might already show signs of orofacial pain.
Is the encoding of Reward Prediction Error reliable during development?

PubMed

Keren, Hanna; Chen, Gang; Benson, Brenda; Ernst, Monique; Leibenluft, Ellen; Fox, Nathan A; Pine, Daniel S; Stringaris, Argyris

2018-05-16

Reward Prediction Errors (RPEs), defined as the difference between the expected and received outcomes, are integral to reinforcement learning models and play an important role in development and psychopathology. In humans, RPE encoding can be estimated using fMRI recordings; however, a basic measurement property of RPE signals, their test-retest reliability across different time scales, remains an open question. In this paper, we examine the 3-month and 3-year reliability of RPE encoding in youth (mean age at baseline = 10.6 ± 0.3 years), a period of developmental transitions in reward processing. We show that RPE encoding is differentially distributed, with positive values being encoded predominantly in the striatum and negative RPEs primarily encoded in the insula. The encoding of negative RPE values is highly reliable in the right insula, across both the long and the short time intervals. Insula reliability for RPE encoding is the most robust finding, while other regions, such as the striatum, are less consistent. Striatal reliability appeared significant as well once covarying for factors which were possibly confounding the signal-to-noise ratio. By contrast, task activation during feedback in the striatum is highly reliable across both time intervals. These results demonstrate the valence-dependent differential encoding of RPE signals between the insula and striatum, and the consistency of RPE signals, or lack thereof, during childhood and into adolescence. Characterizing the regions where the RPE signal in BOLD fMRI is a reliable marker is key for estimating reward-processing alterations in longitudinal designs, such as developmental or treatment studies. Copyright © 2018 Elsevier Inc. All rights reserved.
Validity of an adaptation of the Framingham cardiovascular risk function: the VERIFICA study

PubMed Central

Marrugat, Jaume; Subirana, Isaac; Comín, Eva; Cabezas, Carmen; Vila, Joan; Elosua, Roberto; Nam, Byung‐Ho; Ramos, Rafel; Sala, Joan; Solanas, Pascual; Cordón, Ferran; Gené‐Badia, Joan; D'Agostino, Ralph B

2007-01-01

Background To assess the reliability and accuracy of the Framingham coronary heart disease (CHD) risk function adapted by the Registre Gironí del Cor (REGICOR) investigators in Spain. Methods A 5‐year follow‐up study was completed in 5732 participants aged 35–74 years. The adaptation consisted of using in the function the average population risk factor prevalence and the cumulative incidence observed in Spain instead of those from Framingham in a Cox proportional hazards model. Reliability and accuracy in estimating the observed cumulative incidence were tested with the area under the curve comparison and goodness‐of‐fit test, respectively. Results The Kaplan–Meier CHD cumulative incidence during the follow‐up was 4.0% in men and 1.7% in women. The original Framingham function and the REGICOR adapted estimates were 10.4% and 4.8%, and 3.6% and 2.0%, respectively. The REGICOR‐adapted function's estimate did not differ from the observed cumulated incidence (goodness of fit in men, p = 0.078, in women, p = 0.256), whereas all the original Framingham function estimates differed significantly (p<0.001). Reliabilities of the original Framingham function and of the best Cox model fit with the study data were similar in men (area under the receiver operator characteristic curve 0.68 and 0.69, respectively, p = 0.273), whereas the best Cox model fitted better in women (0.73 and 0.81, respectively, p<0.001). Conclusion The Framingham function adapted to local population characteristics accurately and reliably predicted the 5‐year CHD risk for patients aged 35–74 years, in contrast with the original function, which consistently overestimated the actual risk. PMID:17183014
Reliability study on high power 638-nm triple emitter broad area laser diode

NASA Astrophysics Data System (ADS)

Yagi, T.; Kuramoto, K.; Kadoiwa, K.; Wakamatsu, R.; Miyashita, M.

2016-03-01

The reliability of the 638-nm triple emitter broad area laser diode (BA-LD) with the window-mirror structure was studied. A methodology to estimate mean time to failure (MTTF) due to catastrophic optical mirror degradation (COMD) within a reasonable aging duration is newly proposed. The power at which the LD failed due to COMD (PCOMD) was measured for LDs aged under several aging conditions. It was revealed that PCOMD was proportional to the logarithm of aging duration, and that MTTF due to COMD (MTTF(COMD)) could be estimated by using this relation. MTTF(COMD) estimated by the methodology with an aging duration of approximately 2,000 hours was consistent with that estimated by long-term aging. By using this methodology, the MTTF of the BA-LD was estimated to exceed 100,000 hours under an output of 2.5 W and a duty cycle of 30%.
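One plausible reading of the extrapolation described here is a straight-line fit of PCOMD against the logarithm of aging time, solved for the time at which PCOMD falls to the operating power. The sketch below shows that idea only; the function names and the aging data are hypothetical, and the abstract does not say whether the authors' procedure takes exactly this form.

```python
import math


def fit_log_linear(hours, pcomd_watts):
    """Least-squares fit of PCOMD = intercept + slope * log10(hours)."""
    xs = [math.log10(h) for h in hours]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(pcomd_watts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, pcomd_watts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def hours_until_power(intercept, slope, operating_watts):
    """Aging time at which the fitted PCOMD line reaches the operating power."""
    return 10 ** ((operating_watts - intercept) / slope)


# Made-up aging data for illustration only (hours aged, measured PCOMD in W).
aged_hours = [100.0, 500.0, 1000.0, 2000.0]
measured_pcomd = [7.1, 6.0, 5.6, 5.1]

intercept, slope = fit_log_linear(aged_hours, measured_pcomd)
projected = hours_until_power(intercept, slope, operating_watts=2.5)
print(f"slope = {slope:.2f} W per decade, projected time to COMD = {projected:.0f} h")
```

Because the time axis is logarithmic, small errors in the fitted slope translate into large changes in the projected lifetime, which is presumably why the abstract checks the short-duration estimate against long-term aging.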
Measurement of cochlear length using the 'A' value for cochlea basal diameter: A feasibility study.

PubMed

Deep, Nicholas L; Howard, Brittany E; Holbert, Sarah O; Hoxworth, Joseph M; Barrs, David M

2017-07-01

To determine whether the cochlea basal diameter (A value) measurement can be consistently and precisely obtained from high-resolution temporal bone imaging for use in cochlear length estimation. A feasibility study at a tertiary referral center was performed using the temporal bone CTs of 40 consecutive patients. The distance from the round window to the lateral wall was measured for each cochlea by two independent reviewers, a neuroradiologist and an otolaryngologist. The interrater reliability was calculated using the intraclass correlation coefficient (ICC) and the Bland-Altman plot. Forty patients (19 males, 21 females) for a total of 80 cochleae were included. Interrater reliability on the same ear had a high level of agreement by both the ICC and the Bland-Altman plot. ICCs were 0.90 (95% CI: 0.82, 0.94) for the left ear and 0.96 (95% CI: 0.92, 0.98) for the right ear. The Bland-Altman plot confirmed interrater reliability, with 96% of measurements falling within the 95% limits of agreement. Measurement between the round window and lateral cochlear wall can be consistently and reliably obtained from high-resolution temporal bone CT scans. Thus, it is feasible to utilize this method to estimate the cochlear length of patients undergoing cochlear implantation.
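The Bland-Altman quantities mentioned in this abstract (bias, 95% limits of agreement, and the share of paired differences inside those limits) follow the standard definitions sketched below. The paired A-value measurements are invented for illustration and are not data from the study.

```python
import statistics


def bland_altman(rater_a, rater_b):
    """Return bias, lower and upper 95% limits of agreement, and fraction inside."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical paired A-value measurements in millimetres from two reviewers.
reviewer_1 = [9.1, 8.8, 9.4, 9.0, 8.6, 9.3]
reviewer_2 = [9.0, 8.9, 9.3, 9.2, 8.6, 9.1]

bias, lower, upper, inside = bland_altman(reviewer_1, reviewer_2)
print(f"bias = {bias:.3f} mm, limits of agreement = [{lower:.3f}, {upper:.3f}] mm")
print(f"fraction of differences within limits = {inside:.2f}")
```

The limits of agreement describe the size of disagreement in the measurement's own units, which complements the unit-free ICC reported alongside them.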
Validation and reliability of the sex estimation of the human os coxae using freely available DSP2 software for bioarchaeology and forensic anthropology.

PubMed

Brůžek, Jaroslav; Santos, Frédéric; Dutailly, Bruno; Murail, Pascal; Cunha, Eugenia

2017-10-01

A new tool for skeletal sex estimation based on measurements of the human os coxae is presented using skeletons from a metapopulation of identified adult individuals from twelve independent population samples. For reliable sex estimation, a posterior probability greater than 0.95 was considered to be the classification threshold: below this value, estimates are considered indeterminate. By providing free software, we aim to develop an even more disseminated method for sex estimation. Ten metric variables collected from 2,040 ossa coxa of adult subjects of known sex were recorded between 1986 and 2002 (reference sample). To test both the validity and reliability, a target sample consisting of two series of adult ossa coxa of known sex (n = 623) was used. The DSP2 software (Diagnose Sexuelle Probabiliste v2) is based on Linear Discriminant Analysis, and the posterior probabilities are calculated using an R script. For the reference sample, any combination of four dimensions provides a correct sex estimate in at least 99% of cases. The percentage of individuals for whom sex can be estimated depends on the number of dimensions; for all ten variables it is higher than 90%. Those results are confirmed in the target sample. Our posterior probability threshold of 0.95 for sex estimate corresponds to the traditional sectioning point used in osteological studies. DSP2 software is replacing the former version that should not be used anymore. DSP2 is a robust and reliable technique for sexing adult os coxae, and is also user friendly. © 2017 Wiley Periodicals, Inc.
Oil and gas reserves estimates

USGS Publications Warehouse

Harrell, R.; Gajdica, R.; Elliot, D.; Ahlbrandt, T.S.; Khurana, S.

2005-01-01

This article is a summary of a panel session at the 2005 Offshore Technology Conference. Oil and gas reserves estimates are further complicated with the expanding importance of the worldwide deepwater arena. These deepwater reserves can be analyzed, interpreted, and conveyed in a consistent, reliable way to investors and other stakeholders. Continually improving technologies can lead to improved estimates of production and reserves, but the estimates are not necessarily recognized by regulatory authorities as an indicator of "reasonable certainty," a term used since 1964 to describe proved reserves in several venues. Solutions are being debated in the industry to arrive at a reporting mechanism that generates consistency and at the same time leads to useful parameters in assessing a company's value without compromising confidentiality. Copyright 2005 Offshore Technology Conference.
Measurement Myths and Misconceptions.

ERIC Educational Resources Information Center

Goodwin, Laura D.; Goodwin, William L.

1999-01-01

Presents frequently encountered measurement misconceptions and various measurement "rules." Origins of the misconceptions and rules are described, along with the reasons why they are problematic. Alternate approaches or considerations are given. Misconceptions discussed pertain to the estimation of internal consistency reliability and item…
Reliability Estimation of Parameters of Helical Wind Turbine with Vertical Axis

PubMed Central

Dumitrascu, Adela-Eliza; Lepadatescu, Badea; Dumitrascu, Dorin-Ion; Nedelcu, Anisor; Ciobanu, Doina Valentina

2015-01-01

Because wind turbines are used for prolonged periods, they must be highly reliable. This can be achieved through rigorous design, appropriate simulation and testing, and proper construction. Reliability prediction and analysis of these systems helps to identify the critical components, increase the operating time, and minimize the failure rate and maintenance costs. To estimate the energy produced by the wind turbine, an evaluation approach based on a Monte Carlo simulation model is developed, which enables us to estimate the probability of minimum and maximum parameters. In our simulation process we used triangular distributions. The analysis of simulation results focused on the interpretation of the relative frequency histograms and the cumulative distribution curve (ogive diagram), which indicates the probability of obtaining the daily or annual energy output depending on wind speed. The experimental research consists of estimating the reliability and unreliability functions and the hazard rate of the helical vertical axis wind turbine, designed and patented for the climatic conditions of Romanian regions. Also, the variation of power produced for different wind speeds, the Weibull distribution of wind probability, and the power generated were determined. The analysis of experimental results indicates that this type of wind turbine is efficient at low wind speed. PMID:26167524
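A Monte Carlo estimate of energy output with triangular input distributions, as described in this abstract, might look like the sketch below. Everything numeric here is an assumption for illustration: the wind-speed bounds, swept area and power coefficient are hypothetical, and the textbook relation P = 0.5 · ρ · A · Cp · v³ stands in for whatever turbine model the authors actually used.

```python
import random

AIR_DENSITY = 1.225  # kg/m^3, standard sea-level air density


def simulate_daily_energy_kwh(n_draws, v_low, v_mode, v_high,
                              swept_area_m2, power_coeff, seed=0):
    """Draw wind speeds from a triangular distribution and convert to daily energy."""
    rng = random.Random(seed)
    energies = []
    for _ in range(n_draws):
        v = rng.triangular(v_low, v_high, v_mode)  # argument order: low, high, mode
        power_w = 0.5 * AIR_DENSITY * swept_area_m2 * power_coeff * v ** 3
        energies.append(power_w * 24.0 / 1000.0)  # W over 24 h -> kWh
    return energies


def exceedance_probability(values, threshold):
    """Empirical probability that a simulated value is at least the threshold."""
    return sum(v >= threshold for v in values) / len(values)


# Hypothetical inputs: wind speed between 2 and 9 m/s, most likely 4 m/s.
energies = simulate_daily_energy_kwh(10000, 2.0, 4.0, 9.0,
                                     swept_area_m2=3.0, power_coeff=0.3)
print(f"P(daily energy >= 1 kWh) = {exceedance_probability(energies, 1.0):.3f}")
```

Sorting the simulated energies and plotting their cumulative relative frequency would give the ogive diagram the abstract refers to.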
High inter-rater reliability, agreement, and convergent validity of Constant score in patients with clavicle fractures.

PubMed

Ban, Ilija; Troelsen, Anders; Kristensen, Morten Tange

2016-10-01

The Constant score (CS) has been the primary endpoint in most studies on clavicle fractures. However, the CS was not developed to assess patients with clavicle fractures. Our aim was to examine inter-rater reliability and agreement of the CS in patients with clavicle fractures. The secondary aim was to estimate the correlation between the CS and the Disabilities of the Arm, Shoulder and Hand score and the internal consistency of the 2 scores. On the basis of sample sizing, 36 patients (31 male and 5 female patients; mean age, 41.3 years) with clavicle fractures underwent standardized CS assessment at a mean of 6.8 weeks (SD, 1.0 weeks) after injury. Reliability and agreement of the CS were determined by 2 raters. The intraclass correlation coefficient (ICC2,1), standard error of measurement, minimal detectable change, Cronbach α coefficient, and Pearson correlation coefficient were estimated. Inter-rater reliability of the total CS was excellent (intraclass correlation coefficient, 0.94; 95% confidence interval, 0.88-0.97), with no systematic difference between the 2 raters (P = .75). The standard error of measurement (measurement error at the group level) was 4.9, whereas the minimal detectable change (smallest change needed to indicate a real change for an individual) was 13.6 CS points. The internal consistency of the 10 CS items was good, with a Cronbach α of .85, and we found a strong correlation (r = -0.92) between the CS and Disabilities of the Arm, Shoulder and Hand score. The CS was found to be reliable for assessing patients with clavicle fractures, especially at the group level. With high inter-rater reliability and agreement, in addition to good internal consistency, the standardized CS used in this study can be used for comparison of results from different settings. Copyright © 2016 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
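The relation between the standard error of measurement (SEM) and the minimal detectable change (MDC) reported above can be checked with the commonly used formulas SEM = SD · √(1 − ICC) and MDC95 = 1.96 · √2 · SEM. The abstract does not state which formulas the authors applied, and the pooled standard deviation of 20 CS points used below is a hypothetical value; only the SEM of 4.9 and the ICC of 0.94 come from the abstract.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z, for a difference of two measurements."""
    return z * math.sqrt(2.0) * sem


# SEM reported in the abstract -> MDC95
print(f"MDC95 from SEM 4.9: {minimal_detectable_change(4.9):.1f} CS points")

# Hypothetical SD of 20 CS points combined with the reported ICC of 0.94
print(f"SEM from SD 20, ICC 0.94: {standard_error_of_measurement(20.0, 0.94):.1f}")
```

With the reported SEM of 4.9, the formula gives an MDC of about 13.6 CS points, which agrees with the value in the abstract.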
[Validating the Spanish version of the Nursing Activities Score].

PubMed

Sánchez-Sánchez, M M; Arias-Rivera, S; Fraile-Gamo, M P; Thuissard-Vasallo, I J; Frutos-Vivar, F

2015-01-01

Validating workload scores ensures that they are appropriate for the purpose for which they were developed. To validate the Nursing Activities Score (NAS) Spanish version. Observational and prospective study. 1,045 patients who were admitted to a medical-surgical unit and a serious burns unit in 2006 were included. The nurse in charge assessed patient workloads by Nine Equivalent of Nursing Manpower use Score and NAS. To assess the internal consistency of the measurements of NAS, item-test correlations, Cronbach's α and Cronbach's α corrected by omitting each of the items were calculated. The intraobserver and interobserver reliability were assessed with the intraclass correlation coefficient by viewing recordings, and Kappa (interobserver reliability) was estimated. For the analysis of internal validity, a factorial principal components analysis was performed. Convergent validity was assessed using the Spearman correlation coefficient values obtained from the Nine Equivalent of Nursing Manpower use Score and Spanish-NAS scales. For internal consistency, 164 questionnaires were analysed and a Cronbach's α of 0.373 was calculated. The intraclass correlation coefficient for intraobserver reliability estimate was 0.837 (95% CI: 0.466-0.950) and 0.662 (95% CI: 0.033-0.882) for interobserver reliability. The estimated kappa was 0.371. For internal validity, exploratory factor analysis showed that the first item explained 58.9% of the variance of the questionnaire. For convergent validity 1006 questionnaires were included and a Spearman correlation coefficient of 0.746 was observed. The psychometric properties of Spanish-NAS are acceptable. Copyright © 2014 Elsevier España, S.L.U. y SEEIUC. All rights reserved.
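Cronbach's α and the "α with each item omitted" statistic mentioned in this abstract follow the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total score). A minimal sketch is given below; the score matrix is invented for illustration (rows are respondents, columns are items) and has nothing to do with the NAS data.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [statistics.variance([row[j] for row in rows]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn (needs at least 3 items)."""
    k = len(rows[0])
    return [cronbach_alpha([row[:j] + row[j + 1:] for row in rows]) for j in range(k)]


# Hypothetical scores: 5 respondents by 4 items.
scores = [
    [3, 4, 3, 5],
    [2, 2, 3, 1],
    [4, 5, 4, 4],
    [1, 2, 1, 3],
    [5, 4, 5, 2],
]

print(f"alpha = {cronbach_alpha(scores):.3f}")
for j, a in enumerate(alpha_if_item_deleted(scores), start=1):
    print(f"alpha without item {j} = {a:.3f}")
```

An item whose removal raises α noticeably is a candidate for being inconsistent with the rest of the scale, which is how the item-deleted statistic is usually read.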
Measuring the Reliability of Picture Story Exercises like the TAT

PubMed Central

Gruber, Nicole; Kreuzpointner, Ludwig

2013-01-01

As frequently reported, psychometric assessments on Picture Story Exercises, especially variations of the Thematic Apperception Test, mostly reveal inadequate scores for internal consistency. We demonstrate that this apparent shortcoming is caused not by the coding system itself but by the incorrect use of internal consistency coefficients, especially Cronbach's α. This problem could be eliminated by using the category-scores as items instead of the picture-scores. In addition to a theoretical explanation, we prove mathematically why the use of category-scores produces an adequate internal consistency estimation and examine our idea empirically with the original data set of the Thematic Apperception Test by Heckhausen and two additional data sets. We found generally higher values when using the category-scores as items instead of picture-scores. From an empirical and theoretical point of view, the estimated reliability is also superior to that obtained when each category within a picture is treated as an item. When comparing our suggestion with a multifaceted Rasch-model, we provide evidence that our procedure better fits the underlying principles of PSE. PMID:24348902
Reliability studies of diagnostic methods in Indian traditional Ayurveda medicine: An overview

PubMed Central

Kurande, Vrinda Hitendra; Waagepetersen, Rasmus; Toft, Egon; Prasad, Ramjee

2013-01-01

Recently, a need to develop supportive new scientific evidence for contemporary Ayurveda has emerged. One of the research objectives is an assessment of the reliability of diagnoses and treatment. Reliability is a quantitative measure of consistency. It is a crucial issue in classification (such as prakriti classification), method development (pulse diagnosis), quality assurance for diagnosis and treatment and in the conduct of clinical studies. Several reliability studies are conducted in western medicine. The investigation of the reliability of traditional Chinese, Japanese and Sasang medicine diagnoses is in the formative stage. However, reliability studies in Ayurveda are in the preliminary stage. In this paper, examples are provided to illustrate relevant concepts of reliability studies of diagnostic methods and their implication in practice, education, and training. An introduction to reliability estimates and different study designs and statistical analysis is given for future studies in Ayurveda. PMID:23930037
Psychophysical measurements in children: challenges, pitfalls, and considerations.

PubMed

Witton, Caroline; Talcott, Joel B; Henning, G Bruce

2017-01-01

Measuring sensory sensitivity is important in studying development and developmental disorders. However, with children, there is a need to balance reliable but lengthy sensory tasks with the child's ability to maintain motivation and vigilance. We used simulations to explore the problems associated with shortening adaptive psychophysical procedures, and suggest how these problems might be addressed. We quantify how adaptive procedures with too few reversals can over-estimate thresholds, introduce substantial measurement error, and make estimates of individual thresholds less reliable. The associated measurement error also obscures group differences. Adaptive procedures with children should therefore use as many reversals as possible, to reduce the effects of both Type 1 and Type 2 errors. Differences in response consistency, resulting from lapses in attention, further increase the over-estimation of threshold. Comparisons between data from individuals who may differ in lapse rate are therefore problematic, but measures to estimate and account for lapse rates in analyses may mitigate this problem.
Reliability of perceived neighbourhood conditions and the effects of measurement error on self-rated health across urban and rural neighbourhoods.

PubMed

Pruitt, Sandi L; Jeffe, Donna B; Yan, Yan; Schootman, Mario

2012-04-01

Limited psychometric research has examined the reliability of self-reported measures of neighbourhood conditions, the effect of measurement error on associations between neighbourhood conditions and health, and potential differences in the reliabilities between neighbourhood strata (urban vs rural and low vs high poverty). We assessed overall and stratified reliability of self-reported perceived neighbourhood conditions using five scales (social and physical disorder, social control, social cohesion, fear) and four single items (multidimensional neighbouring). We also assessed measurement error-corrected associations of these conditions with self-rated health. Using random-digit dialling, 367 women without breast cancer (matched controls from a larger study) were interviewed twice, 2-3 weeks apart. Test-retest (intraclass correlation coefficients (ICC)/weighted κ) and internal consistency reliability (Cronbach's α) were assessed. Differences in reliability across neighbourhood strata were tested using bootstrap methods. Regression calibration corrected estimates for measurement error. All measures demonstrated satisfactory internal consistency (α ≥ 0.70) and either moderate (ICC/κ=0.41-0.60) or substantial (ICC/κ=0.61-0.80) test-retest reliability in the full sample. Internal consistency did not differ by neighbourhood strata. Test-retest reliability was significantly lower among rural (vs urban) residents for two scales (social control, physical disorder) and two multidimensional neighbouring items; test-retest reliability was higher for physical disorder and lower for one multidimensional neighbouring item among the high (vs low) poverty strata. After measurement error correction, the magnitude of associations between neighbourhood conditions and self-rated health was larger, particularly in the rural population. Research is needed to develop and test reliable measures of perceived neighbourhood conditions relevant to the health of rural populations.
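The internal consistency statistic used in the record above, Cronbach's α, can be computed from a respondents-by-items score table. The following is a minimal sketch of the standard formula; the function name and the example responses are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                       # number of items in the scale
    items = list(zip(*rows))               # transpose: one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)   # sum of item variances
    total_var = variance([sum(r) for r in rows])         # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses: 4 respondents x 3 items
example = [[1, 2, 2], [2, 3, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(example)
```

Sample variances (n - 1 denominator) are used for both the items and the total score, so the ratio does not depend on that choice as long as it is applied consistently.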
Resting-state test-retest reliability of a priori defined canonical networks over different preprocessing steps.

PubMed

Varikuti, Deepthi P; Hoffstaedter, Felix; Genon, Sarah; Schwender, Holger; Reid, Andrew T; Eickhoff, Simon B

2017-04-01

Resting-state functional connectivity analysis has become a widely used method for the investigation of human brain connectivity and pathology. The measurement of neuronal activity by functional MRI, however, is impeded by various nuisance signals that reduce the stability of functional connectivity. Several methods exist to address this predicament, but little consensus has yet been reached on the most appropriate approach. Given the crucial importance of reliability for the development of clinical applications, we here investigated the effect of various confound removal approaches on the test-retest reliability of functional-connectivity estimates in two previously defined functional brain networks. Our results showed that gray matter masking improved the reliability of connectivity estimates, whereas denoising based on principal components analysis reduced it. We additionally observed that refraining from using any correction for global signals provided the best test-retest reliability, but failed to reproduce anti-correlations between what have been previously described as antagonistic networks. This suggests that improved reliability can come at the expense of potentially poorer biological validity. Consistent with this, we observed that reliability was proportional to the retained variance, which presumably included structured noise, such as reliable nuisance signals (for instance, noise induced by cardiac processes). We conclude that compromises are necessary between maximizing test-retest reliability and removing variance that may be attributable to non-neuronal sources.
Resting-state test-retest reliability of a priori defined canonical networks over different preprocessing steps

PubMed Central

Varikuti, Deepthi P.; Hoffstaedter, Felix; Genon, Sarah; Schwender, Holger; Reid, Andrew T.; Eickhoff, Simon B.

2016-01-01

Resting-state functional connectivity analysis has become a widely used method for the investigation of human brain connectivity and pathology. The measurement of neuronal activity by functional MRI, however, is impeded by various nuisance signals that reduce the stability of functional connectivity. Several methods exist to address this predicament, but little consensus has yet been reached on the most appropriate approach. Given the crucial importance of reliability for the development of clinical applications, we here investigated the effect of various confound removal approaches on the test-retest reliability of functional-connectivity estimates in two previously defined functional brain networks. Our results showed that grey matter masking improved the reliability of connectivity estimates, whereas de-noising based on principal components analysis reduced it. We additionally observed that refraining from using any correction for global signals provided the best test-retest reliability, but failed to reproduce anti-correlations between what have been previously described as antagonistic networks. This suggests that improved reliability can come at the expense of potentially poorer biological validity. Consistent with this, we observed that reliability was proportional to the retained variance, which presumably included structured noise, such as reliable nuisance signals (for instance, noise induced by cardiac processes). We conclude that compromises are necessary between maximizing test-retest reliability and removing variance that may be attributable to non-neuronal sources. PMID:27550015
The reliability of the Hendrich Fall Risk Model in a geriatric hospital.

PubMed

Heinze, Cornelia; Halfens, Ruud; Dassen, Theo

2008-12-01

Aims and objectives. The purpose of this study was to test the interrater reliability of the Hendrich Fall Risk Model, an instrument to identify patients in a hospital setting with a high risk of falling. Background. Falls are a serious problem in older patients. Valid and reliable fall risk assessment tools are required to identify high-risk patients and to take adequate preventive measures. Methods. Seventy older patients were independently and simultaneously assessed by six pairs of raters made up of nursing staff members. Consensus estimates were calculated using simple percentage agreement and consistency estimates using Spearman's rho and the intraclass correlation coefficient. Results. Percentage agreement ranged from 0.70 to 0.92 between the six pairs of raters. Spearman's rho coefficients were between 0.54 and 0.80 and the intraclass correlation coefficients were between 0.46 and 0.92. Conclusions. Whereas some pairs of raters obtained considerable interobserver agreement and internal consistency, the others did not. Therefore, it is concluded that the Hendrich Fall Risk Model is not a reliable instrument. The use of more unambiguous operationalized items is preferred. Relevance to clinical practice. In practice, well operationalized fall risk assessment tools are necessary. Observer agreement should always be investigated after introducing a standardized measurement tool. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.
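The record above distinguishes consensus estimates (do two raters give the same score?) from consistency estimates (do they rank patients in the same order?). The sketch below illustrates the difference with simple percentage agreement and Spearman's rho; the function names and the rater scores are hypothetical, and rho is computed as the Pearson correlation of average ranks.

```python
def percent_agreement(a, b):
    """Share of cases on which both raters gave exactly the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1        # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(a, b):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical risk scores from two raters for four patients
rater_a = [1, 2, 3, 4]
rater_b = [1, 2, 3, 5]
agreement = percent_agreement(rater_a, rater_b)   # raters differ on one patient
rho = spearman_rho(rater_a, rater_b)              # but rank all four identically
```

In this toy case the raters disagree on one score yet order the patients identically, so consistency is perfect while consensus is not.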
Identification of the contribution of the ankle and hip joints to multi-segmental balance control

PubMed Central

2013-01-01

Background: Human stance involves multiple segments, including the legs and trunk, and requires coordinated actions of both. A novel method was developed that reliably estimates the contribution of the left and right leg (i.e., the ankle and hip joints) to the balance control of individual subjects. Methods: The method was evaluated using simulations of a double-inverted pendulum model and the applicability was demonstrated with an experiment with seven healthy and one Parkinsonian participant. Model simulations indicated that two perturbations are required to reliably estimate the dynamics of a double-inverted pendulum balance control system. In the experiment, two multisine perturbation signals were applied simultaneously. The balance control system dynamic behaviour of the participants was estimated by Frequency Response Functions (FRFs), which relate ankle and hip joint angles to joint torques, using a multivariate closed-loop system identification technique. Results: In the model simulations, the FRFs were reliably estimated, also in the presence of realistic levels of noise. In the experiment, the participants responded consistently to the perturbations, indicated by low noise-to-signal ratios of the ankle angle (0.24), hip angle (0.28), ankle torque (0.07), and hip torque (0.33). The developed method could detect that the Parkinson patient controlled his balance asymmetrically, that is, the right ankle and hip joints produced more corrective torque. Conclusion: The method allows for a reliable estimate of the multisegmental feedback mechanism that stabilizes stance, of individual participants and of separate legs. PMID:23433148
The brief multidimensional students' life satisfaction scale-college version.

PubMed

Zullig, Keith J; Huebner, E Scott; Patton, Jon M; Murray, Karen A

2009-01-01

To investigate the psychometric properties of the BMSLSS-College among 723 college students. Internal consistency estimates explored scale reliability, factor analysis explored construct validity, and known-groups validity was assessed using the National College Youth Risk Behavior Survey and Harvard School of Public Health College Alcohol Study. Criterion-related validity was explored through analyses with the CDC's health-related quality of life scale and a social isolation scale. Acceptable internal consistency reliability, construct, known-groups, and criterion-related validity were established. Findings offer preliminary support for the BMSLSS-C; it could be useful in large-scale research studies, applied screening contexts, and for program evaluation purposes toward achieving Healthy People 2010 objectives.
Development and initial validation of the internalization of Asian American stereotypes scale.

PubMed

Shen, Frances C; Wang, Yu-Wei; Swanson, Jane L

2011-07-01

This research consists of four studies on the initial reliability and validity of the Internalization of Asian American Stereotypes Scale (IAASS), a self-report instrument that measures the degree Asian Americans have internalized racial stereotypes about their own group. The results from the exploratory and confirmatory factor analyses support a stable four-factor structure of the IAASS: Difficulties with English Language Communication, Pursuit of Prestigious Careers, Emotional Reservation, and Expected Academic Success. Evidence for concurrent and discriminant validity is presented. High internal-consistency and test-retest reliability estimates are reported. A discussion of how this scale can contribute to research and practice regarding internalized stereotyping among Asian Americans is provided.
Statistical properties of the anomalous scaling exponent estimator based on time-averaged mean-square displacement

NASA Astrophysics Data System (ADS)

Sikora, Grzegorz; Teuerle, Marek; Wyłomańska, Agnieszka; Grebenkov, Denis

2017-08-01

The most common way of estimating the anomalous scaling exponent from single-particle trajectories consists of a linear fit of the dependence of the time-averaged mean-square displacement on the lag time at the log-log scale. We investigate the statistical properties of this estimator in the case of fractional Brownian motion (FBM). We determine the mean value, the variance, and the distribution of the estimator. Our theoretical results are confirmed by Monte Carlo simulations. In the limit of long trajectories, the estimator is shown to be asymptotically unbiased, consistent, and with vanishing variance. These properties ensure an accurate estimation of the scaling exponent even from a single (long enough) trajectory. As a consequence, we prove that the usual way to estimate the diffusion exponent of FBM is correct from the statistical point of view. Moreover, the knowledge of the estimator distribution is the first step toward new statistical tests of FBM and toward a more reliable interpretation of the experimental histograms of scaling exponents in microbiology.
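The estimator described in the record above can be sketched in a few lines: compute the time-averaged mean-square displacement (TAMSD) at several lag times, then take the slope of a least-squares line on the log-log scale. This is an illustrative implementation under simple assumptions (a one-dimensional trajectory sampled at unit time steps); the function names and the choice of lags are hypothetical.

```python
import math


def tamsd(x, lag):
    """Time-averaged mean-square displacement of trajectory x at a given lag."""
    n = len(x) - lag
    return sum((x[i + lag] - x[i]) ** 2 for i in range(n)) / n


def scaling_exponent(x, lags):
    """Slope of log(TAMSD) versus log(lag), fitted by ordinary least squares."""
    lags = list(lags)
    log_t = [math.log(lag) for lag in lags]
    log_m = [math.log(tamsd(x, lag)) for lag in lags]
    mt = sum(log_t) / len(lags)
    mm = sum(log_m) / len(lags)
    num = sum((t - mt) * (m - mm) for t, m in zip(log_t, log_m))
    den = sum((t - mt) ** 2 for t in log_t)
    return num / den


# Deterministic check: ballistic motion x(t) = t has TAMSD(lag) = lag**2,
# so the fitted exponent should be 2.
ballistic = [float(t) for t in range(100)]
exponent = scaling_exponent(ballistic, range(1, 11))
```

For ordinary Brownian motion the same fit should return a value close to 1, with scatter that shrinks as the trajectory grows longer, in line with the asymptotic properties the abstract reports.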
Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 1. Theory

USGS Publications Warehouse

Cooley, Richard L.

1982-01-01

Prior information on the parameters of a groundwater flow model can be used to improve parameter estimates obtained from nonlinear regression solution of a modeling problem. Two scales of prior information can be available: (1) prior information having known reliability (that is, bias and random error structure) and (2) prior information consisting of best available estimates of unknown reliability. A regression method that incorporates the second scale of prior information assumes the prior information to be fixed for any particular analysis to produce improved, although biased, parameter estimates. Approximate optimization of two auxiliary parameters of the formulation is used to help minimize the bias, which is almost always much smaller than that resulting from standard ridge regression. It is shown that if both scales of prior information are available, then a combined regression analysis may be made.
Quality and rigor of the concept mapping methodology: a pooled study analysis.

PubMed

Rosas, Scott R; Kane, Mary

2012-05-01

The use of concept mapping in research and evaluation has expanded dramatically over the past 20 years. Researchers in academic, organizational, and community-based settings have applied concept mapping successfully without the benefit of systematic analyses across studies to identify the features of a methodologically sound study. Quantitative characteristics and estimates of quality and rigor that may guide for future studies are lacking. To address this gap, we conducted a pooled analysis of 69 concept mapping studies to describe characteristics across study phases, generate specific indicators of validity and reliability, and examine the relationship between select study characteristics and quality indicators. Individual study characteristics and estimates were pooled and quantitatively summarized, describing the distribution, variation and parameters for each. In addition, variation in the concept mapping data collection in relation to characteristics and estimates was examined. Overall, results suggest concept mapping yields strong internal representational validity and very strong sorting and rating reliability estimates. Validity and reliability were consistently high despite variation in participation and task completion percentages across data collection modes. The implications of these findings as a practical reference to assess the quality and rigor for future concept mapping studies are discussed. Copyright © 2011 Elsevier Ltd. All rights reserved.
Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump.

PubMed

Ruhi, S; Karim, M R

2016-01-01

Proper maintenance policy can play a vital role for effective investigation of product reliability. Every engineered object such as product, plant or infrastructure needs preventive and corrective maintenance. In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We obtain the data that the owner had collected, carry out an analysis, and build models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump. Different competitive mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function, mean time to failure, etc. are estimated to assess the reliability of the pump. Akaike Information Criterion, adjusted Anderson-Darling test statistic, Kolmogorov-Smirnov test statistic and root mean square error are considered to select the suitable models among a set of competitive models. The maximum likelihood estimation method via the EM algorithm is applied mainly for estimating the parameters of the models and reliability related quantities. In this study, it is found that a threefold mixture model (Weibull-Normal-Exponential) fits well for the hydraulic pump failures data set. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period at a minimum cost of a hydraulic pump.
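The quantities named in the record above (reliability function, mean time to failure, Akaike Information Criterion) have simple closed forms for a single Weibull component. The sketch below is a simplification for illustration only: the study fits a three-component mixture by EM, which is not reproduced here, and the parameter values are hypothetical.

```python
import math


def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t / scale) ** shape): probability of surviving past t."""
    return math.exp(-((t / scale) ** shape))


def weibull_mttf(shape, scale):
    """Mean time to failure: scale * Gamma(1 + 1 / shape)."""
    return scale * math.gamma(1 + 1 / shape)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical parameters: shape 1 reduces the Weibull to an exponential
# distribution, for which the MTTF equals the scale parameter.
mttf = weibull_mttf(shape=1.0, scale=100.0)
survival_at_scale = weibull_reliability(100.0, shape=1.0, scale=100.0)
```

When comparing candidate models on the same data, the AIC penalty of two per parameter is what keeps a richer mixture from being preferred on likelihood alone.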
Oral Health Disparities as Determined by Selected Healthy People 2020 Oral Health Objectives for the United States, ...

MedlinePlus

... status of the civilian noninstitutionalized U.S. population. The survey consists of interviews conducted in participants' homes and standardized physical examinations in mobile examination centers. The sample design includes oversampling to obtain reliable estimates of health ...
Measurement Properties of Performance-Specific Pain Ratings of Patients Awaiting Total Joint Arthroplasty as a Consequence of Osteoarthritis

PubMed Central

Stratford, Paul W.; Kennedy, Deborah M.; Woodhouse, Linda J.; Spadoni, Gregory

2008-01-01

Purpose: To estimate the test–retest reliability of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain sub-scale and performance-specific assessments of pain, as well as the association between these measures for patients awaiting primary total hip or knee arthroplasty as a consequence of osteoarthritis. Methods: A total of 164 patients awaiting unilateral primary hip or knee arthroplasty completed four performance measures (self-paced walk, timed up and go, stair test, six-minute walk) and the WOMAC. Scores for 22 of these patients provided test–retest reliability data. Estimates of test–retest reliability (Type 2,1 intraclass correlation coefficient [ICC] and standard error of measurement [SEM]) and the association between measures were examined. Results: ICC values for individual performance-specific pain ratings were between 0.70 and 0.86; SEM values were between 0.97 and 1.33 pain points. ICC estimates for the four-item performance pain ratings and the WOMAC pain sub-scale were 0.82 and 0.57 respectively. The correlation between the sum of the pain scores for the four performance measures and the WOMAC pain sub-scale was 0.62. Conclusion: Reliability estimates for the performance-specific assessments of pain using the numeric pain rating scale were consistent with values reported for patients with a spectrum of musculoskeletal conditions. The reliability estimate for the WOMAC pain sub-scale was lower than typically reported in the literature. The level of association between the WOMAC pain sub-scale and the various performance-specific pain scales suggests that the scores can be used interchangeably when applied to groups but not for individual patients. PMID:20145758
Reliability and validity of the Modified Erikson Psychosocial Stage Inventory in diverse samples.

PubMed

Leidy, N K; Darling-Fisher, C S

1995-04-01

The Modified Erikson Psychosocial Stage Inventory (MEPSI) is a relatively simple survey measure designed to assess the strength of psychosocial attributes that arise from progression through Erikson's eight stages of development. The purpose of this study was to employ secondary analysis to evaluate the internal-consistency reliability and construct validity of the MEPSI across four diverse samples: healthy young adults, hemophilic men, healthy older adults, and older adults with chronic obstructive pulmonary disease. Special attention was given to the performance of the measure across gender, with exploratory analyses examining possible age cohort and health status effects. Internal-consistency estimates for the aggregate measure were high, whereas subscale reliability levels varied across age groups. Construct validity was supported across samples. Gender, cohort, and health effects offered interesting psychometric and theoretical insights and direction for further research. Findings indicated that the MEPSI might be a useful instrument for operationalizing and testing Eriksonian developmental theory in adults.
Approximate Bayesian algorithm to estimate the basic reproduction number in an influenza pandemic using arrival times of imported cases.

PubMed

Chong, Ka Chun; Zee, Benny Chung Ying; Wang, Maggie Haitian

2018-04-10

In an influenza pandemic, arrival times of cases are a proxy of the epidemic size and disease transmissibility. Because of intense surveillance of travelers from infected countries, detection is more rapid and complete than in local surveillance. Travel information can provide a more reliable estimation of transmission parameters. We developed an Approximate Bayesian Computation algorithm to estimate the basic reproduction number (R0) in addition to the reporting rate and unobserved epidemic start time, utilizing travel and routine surveillance data in an influenza pandemic. A simulation was conducted to assess the sampling uncertainty. The estimation approach was further applied to the 2009 influenza A/H1N1 pandemic in Mexico as a case study. In the simulations, we showed that the estimation approach was valid and reliable in different simulation settings. We also found estimates of R0 and the reporting rate to be 1.37 (95% Credible Interval [CI]: 1.26-1.42) and 4.9% (95% CI: 0.1%-18%), respectively, in the 2009 influenza pandemic in Mexico, which were robust to variations in the fixed parameters. The estimated R0 was consistent with that in the literature. This method is useful for officials to obtain reliable estimates of disease transmissibility for strategic planning. We suggest that improvements to the flow of reporting for confirmed cases among patients arriving at different countries are required. Copyright © 2018 Elsevier Ltd. All rights reserved.

Reliability of Instruments Measuring At-Risk and Problem Gambling Among Young Individuals: A Systematic Review Covering Years 2009-2015.

PubMed

Edgren, Robert; Castrén, Sari; Mäkelä, Marjukka; Pörtfors, Pia; Alho, Hannu; Salonen, Anne H

2016-06-01

This review aims to clarify which instruments measuring at-risk and problem gambling (ARPG) among youth are reliable and valid in light of reported estimates of internal consistency, classification accuracy, and psychometric properties. A systematic search was conducted in PubMed, Medline, and PsycInfo covering the years 2009-2015. In total, 50 original research articles fulfilled the inclusion criteria: target age under 29 years, using an instrument designed for youth, and reporting a reliability estimate. Articles were evaluated with the revised Quality Assessment of Diagnostic Accuracy Studies tool. Reliability estimates were reported for five ARPG instruments. Most studies (66%) evaluated the South Oaks Gambling Screen Revised for Adolescents. The Gambling Addictive Behavior Scale for Adolescents was the only novel instrument. In general, the evaluation of instrument reliability was superficial. Despite its rare use, the Canadian Adolescent Gambling Inventory (CAGI) had a strong theoretical and methodological base. The Gambling Addictive Behavior Scale for Adolescents and the CAGI were the only instruments originally developed for youth. All studies, except the CAGI study, were population based. ARPG instruments for youth have not been rigorously evaluated yet. Further research is needed especially concerning instruments designed for clinical use. Copyright © 2016 The Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Reliability and Agreement of Neck Functional Capacity Evaluation Tests in Patients With Chronic Multifactorial Neck Pain.

PubMed

Reneman, M F; Roelofs, M; Schiphorst Preuper, H R

2017-07-01

To analyze test-retest reliability and agreement, and to explore the safety of neck functional capacity evaluation (Neck-FCE) tests in patients with chronic multifactorial neck pain. Test-retest; 2 FCE sessions were held with a 2-week interval. University-based outpatient rehabilitation center. Individuals (N=18; 14 women) with a mean age of 34 years. Not applicable. The Neck-FCE protocol consists of 6 tests: lifting waist to overhead (kg), 2-handed carrying (kg), overhead working (s), bending and overhead reaching (s), and repetitive side reaching (left and right) (s). Intraclass correlation coefficients (ICCs) and limits of agreement (LoA) were calculated. ICC point estimates between .75 and .90 were considered as good, and >.90 were considered as excellent reliability. ICC point estimates ranged between .39 and .96. Ratios of the LoA ranged between 32.0% and 56.5%. Mean ± SD numeric rating scale pain scores in the neck and shoulder 24 hours after the test were 6.7±2.6 and 6.3±3.0, respectively. Based on ICC point estimates and 95% confidence intervals, 3 tests had excellent reliability and 3 had poor reliability. LoA were substantial in all 6 tests. Safety was confirmed. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
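The limits of agreement (LoA) reported in the record above are conventionally computed from the paired differences between sessions. The sketch below assumes the usual Bland-Altman form (mean difference ± 1.96 standard deviations of the differences); the abstract does not state the exact variant used, and the session scores are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(session1, session2):
    """Return (lower, upper) 95% limits of agreement for paired measurements."""
    diffs = [b - a for a, b in zip(session1, session2)]
    bias = mean(diffs)                     # systematic change between sessions
    half_width = 1.96 * stdev(diffs)       # spread of individual differences
    return bias - half_width, bias + half_width


# Hypothetical lifting results (kg) for three participants on two occasions
first = [10, 20, 30]
second = [10, 22, 31]
lower, upper = limits_of_agreement(first, second)
```

Unlike an ICC, the limits are expressed in the units of the test itself, which is why a test can show a high ICC and still have wide limits for individual patients.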
Validity and reliability of Nike + Fuelband for estimating physical activity energy expenditure.

PubMed

Tucker, Wesley J; Bhammar, Dharini M; Sawyer, Brandon J; Buman, Matthew P; Gaesser, Glenn A

2015-01-01

The Nike + Fuelband is a commercially available, wrist-worn accelerometer used to track physical activity energy expenditure (PAEE) during exercise. However, validation studies assessing the accuracy of this device for estimating PAEE are lacking. Therefore, this study examined the validity and reliability of the Nike + Fuelband for estimating PAEE during physical activity in young adults. Secondarily, we compared PAEE estimation of the Nike + Fuelband with the previously validated SenseWear Armband (SWA). Twenty-four participants completed two 60-min semi-structured routines consisting of sedentary/light-intensity, moderate-intensity, and vigorous-intensity physical activity. Participants wore a Nike + Fuelband and SWA, while oxygen uptake was measured continuously with an Oxycon Mobile (OM) metabolic measurement system (criterion). The Nike + Fuelband (ICC = 0.77) and SWA (ICC = 0.61) both demonstrated moderate to good validity. PAEE estimates provided by the Nike + Fuelband (246 ± 67 kcal) and SWA (238 ± 57 kcal) were not statistically different from OM (243 ± 67 kcal). Both devices also displayed similar mean absolute percent errors for PAEE estimates (Nike + Fuelband = 16 ± 13 %; SWA = 18 ± 18 %). Test-retest reliability for PAEE indicated good stability for Nike + Fuelband (ICC = 0.96) and SWA (ICC = 0.90). The Nike + Fuelband provided valid and reliable estimates of PAEE that are similar to the previously validated SWA during a routine that included approximately equal amounts of sedentary/light-, moderate- and vigorous-intensity physical activity.
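The record above reports both group means and mean absolute percent errors, and the two can tell different stories. The sketch below shows one common definition of mean absolute percent error against a criterion measure; the abstract does not give its exact formula, and the energy expenditure values are hypothetical.

```python
def mean_absolute_percent_error(estimates, criterion):
    """Average of |estimate - criterion| / criterion, expressed in percent."""
    errors = [abs(e - c) / c * 100 for e, c in zip(estimates, criterion)]
    return sum(errors) / len(errors)


# Hypothetical PAEE values (kcal): one over-estimate and one under-estimate
device = [110, 90]
reference = [100, 100]
mape = mean_absolute_percent_error(device, reference)
mean_difference = sum(device) / len(device) - sum(reference) / len(reference)
```

Here the over- and under-estimates cancel in the group mean but not in the absolute error, which is how a device can match the criterion on average while still being off for individual participants.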
Using Internet search engines to estimate word frequency.

PubMed

Blair, Irene V; Urland, Geoffrey R; Ma, Jennifer E

2002-05-01

The present research investigated Internet search engines as a rapid, cost-effective alternative for estimating word frequencies. Frequency estimates for 382 words were obtained and compared across four methods: (1) Internet search engines, (2) the Kucera and Francis (1967) analysis of a traditional linguistic corpus, (3) the CELEX English linguistic database (Baayen, Piepenbrock, & Gulikers, 1995), and (4) participant ratings of familiarity. The results showed that Internet search engines produced frequency estimates that were highly consistent with those reported by Kucera and Francis and those calculated from CELEX, highly consistent across search engines, and very reliable over a 6-month period of time. Additional results suggested that Internet search engines are an excellent option when traditional word frequency analyses do not contain the necessary data (e.g., estimates for forenames and slang). In contrast, participants' familiarity judgments did not correspond well with the more objective estimates of word frequency. Researchers are advised to use search engines with large databases (e.g., AltaVista) to ensure the greatest representativeness of the frequency estimates.
The Depression Anxiety Stress Scales-21 (DASS-21): further examination of dimensions, scale reliability, and correlates.

PubMed

Osman, Augustine; Wong, Jane L; Bagge, Courtney L; Freedenthal, Stacey; Gutierrez, Peter M; Lozano, Gregorio

2012-12-01

We conducted two studies to examine the dimensions, internal consistency reliability estimates, and potential correlates of the Depression Anxiety Stress Scales-21 (DASS-21; Lovibond & Lovibond, 1995). Participants in Study 1 included 887 undergraduate students (363 men and 524 women, aged 18 to 35 years; mean [M] age = 19.46, standard deviation [SD] = 2.17) recruited from two public universities to assess the specificity of the individual DASS-21 items and to evaluate estimates of internal consistency reliability. Participants in a follow-up study (Study 2) included 410 students (168 men and 242 women, aged 18 to 47 years; M age = 19.65, SD = 2.88) recruited from the same universities to further assess factorial validity and to evaluate potential correlates of the original DASS-21 total and scale scores. Item bifactor and confirmatory factor analyses revealed that a general factor accounted for the greatest proportion of common variance in the DASS-21 item scores (Study 1). In Study 2, the fit statistics showed good fit for the bifactor model. In addition, the DASS-21 total scale score correlated more highly with scores on a measure of mixed depression and anxiety than with scores on the proposed specific scales of depression or anxiety. Coefficient omega estimates for the DASS-21 scale scores were good. Further investigations of the bifactor structure and psychometric properties of the DASS-21, specifically its incremental and discriminant validity, using known clinical groups are needed. © 2012 Wiley Periodicals, Inc.
Bayesian Meta-Analysis of Coefficient Alpha

ERIC Educational Resources Information Center

Brannick, Michael T.; Zhang, Nanhua

2013-01-01

The current paper describes and illustrates a Bayesian approach to the meta-analysis of coefficient alpha. Alpha is the most commonly used estimate of the reliability or consistency (freedom from measurement error) for educational and psychological measures. The conventional approach to meta-analysis uses inverse variance weights to combine…
Reliability of self-reported antisocial personality disorder symptoms among substance abusers.

PubMed

Cottler, L B; Compton, W M; Ridenour, T A; Ben Abdallah, A; Gallagher, T

1998-02-01

It is estimated that from 20 to 60% of substance abusers meet criteria for Antisocial Personality Disorder (APD). An accurate and reliable diagnosis is important because persons meeting criteria for APD, by the nature of their disorder, are less likely to change behaviors and more likely to relapse to both substance abuse and high risk behaviors. To understand more about the reliability of the disorder and symptoms of APD, the Diagnostic Interview Schedule Version III-R (DIS) was administered to 453 substance abusers ascertained from treatment programs and from the general population (St Louis Epidemiological Catchment Area (ECA) follow-up study). Estimates of the 1 week, test-retest reliability for the childhood conduct disorder criterion, the adult antisocial behavior criterion, and APD diagnosis fell in the good agreement range, as measured by kappa. The internal consistency of these DIS symptoms was adequate to acceptable. Individual DIS criteria designed to measure childhood conduct disorder ranged from fair to good for most items; reliability was slightly higher for the adult antisocial behavior symptom items. Finally, self-reported 'liars' were no more unreliable in their reports of their behaviors than 'non-liars'.
Pilot Study: Survey Tools for Assessing Parenting Styles and Family Contributors to the Development of Obesity in Arab Children Ages 6 to 12 Years.

PubMed

Tami, Suzan H; Reed, Debra B; Trejos, Elizabeth; Boylan, Mallory; Wang, Shu

2015-11-05

Our pilot study was conducted to test the reliability of the Caregiver's Feeding Styles Questionnaire (CFSQ) and the Family Nutrition and Physical Activity Assessment (FNPA) in a sample of Arab mothers. Twenty-five Arab mothers completed the CFSQ, FNPA, and the Participant Background Survey for the first administration. After 1-2 weeks, participants completed the CFSQ and the FNPA for the second administration. The two administrations of the surveys allowed us to estimate the test-retest reliability of the CFSQ and the FNPA and to measure the internal consistency of the two surveys. Pearson's correlations between the first and second administrations for the 19-item scale (demandingness) and the 7-item scale (responsiveness) of the CFSQ were .95 and .86, respectively. As for the FNPA, Pearson's correlation was .80. The estimated reliabilities (Cronbach's alpha) of the CFSQ increased from .86 for the first administration to .93 for the second administration. However, the estimated reliabilities of the FNPA slightly increased from .58 for the first administration to .59 for the second administration. In our pilot study of Arab mothers, the CFSQ and FNPA were shown to be promising in terms of reliability and content validity.
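As a rough illustration of the internal consistency statistic reported in this record, the sketch below computes Cronbach's alpha from a respondents-by-items score matrix. The function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 5 respondents x 4 items, for illustration only
toy_scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
]
alpha = cronbach_alpha(toy_scores)
```

The same function can be applied separately to the first and second administrations to compare the two alpha values, as the abstract does.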
Measurement properties of the WOMAC LK 3.1 pain scale.

PubMed

Stratford, P W; Kennedy, D M; Woodhouse, L J; Spadoni, G F

2007-03-01

The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is applied extensively to patients with osteoarthritis of the hip or knee. Previous work has challenged the validity of its physical function scale; however, an extensive evaluation of its pain scale has not been reported. Our purpose was to estimate internal consistency, factorial validity, test-retest reliability, and the standard error of measurement (SEM) of the WOMAC LK 3.1 pain scale. Four hundred and seventy-four patients with osteoarthritis of the hip or knee awaiting arthroplasty were administered the WOMAC. Estimates of internal consistency (coefficient alpha), factorial validity (confirmatory factor analysis), and the SEM based on internal consistency (SEM(IC)) were obtained. Test-retest reliability [Type 2,1 intraclass correlation coefficients (ICC)] and a corresponding SEM(TRT) were estimated on a subsample of 36 patients. Our estimates were: internal consistency alpha=0.84; SEM(IC)=1.48; Type 2,1 ICC=0.77; SEM(TRT)=1.69. Confirmatory factor analysis failed to support a single factor structure of the pain scale with uncorrelated error terms. Two comparable models provided excellent fit: (1) a model with correlated error terms between the walking and stairs items, and between night and sit items (chi2=0.18, P=0.98); (2) a two factor model with walking and stairs items loading on one factor, night and sit items loading on a second factor, and the standing item loading on both factors (chi2=0.18, P=0.98). Our examination of the factorial structure of the WOMAC pain scale failed to support a single factor, and internal consistency analysis yielded a coefficient less than optimal for individual patient use. An alternate strategy to summing the five-item responses when considering individual patient application would be to interpret item responses separately or to sum only those items which display homogeneity.
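The abstract does not state how the SEM values were derived. A minimal sketch, assuming the conventional formula SEM = SD × sqrt(1 − reliability), shows how an SEM relates to a reliability coefficient and what score standard deviation the reported figures would imply. The function names are hypothetical.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement under the conventional formula (an assumption here)."""
    return sd * math.sqrt(1.0 - reliability)

def implied_sd(sem_value, reliability):
    """Invert the formula to recover the score SD implied by a reported SEM."""
    return sem_value / math.sqrt(1.0 - reliability)

# Values reported in the abstract: alpha = 0.84 with SEM(IC) = 1.48,
# and ICC = 0.77 with SEM(TRT) = 1.69
sd_ic = implied_sd(1.48, 0.84)
sd_trt = implied_sd(1.69, 0.77)
```

Under that assumption, the reported figures imply a pain-score SD of about 3.7 in the full sample and about 3.5 in the retest subsample.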
Translation, cross-cultural adaptation, and validation of the Bulgarian version of the Liverpool Adverse Event Profile.

PubMed

Kuzmanova, Rumyana; Stefanova, Irina; Velcheva, Irena; Stambolieva, Katerina

2014-10-01

Adverse effects (AEs) of antiepileptic drugs (AEDs) affect the quality of life of patients with epilepsy and their outcomes. There are no questionnaires or studies on the reliability and validity of instruments measuring AEs of AEDs in patients with epilepsy in Bulgarian language. The aim of the present study was the translation, cross-cultural adaptation, and validation of the LAEP in the Bulgarian language in order to use it in the Bulgarian-speaking population in providing a reliable instrument for the clinical monitoring of patients with epilepsy. One hundred thirty-one patients (57 men and 74 women, mean age: 40.13±13.37 years) took part in the investigation. The internal consistency and test-retest reliability were tested by Cronbach's α and ICC estimations. The convergent construct validity was tested by estimating the correlation of the LAEP-BG with the QOLIE-89 and the discriminant validity by evaluating the difference between LAEP-BG scores and clinical parameters such as the type of epilepsy using Kruskal-Wallis ANOVA. The LAEP-BG showed high internal consistency and reliability. The Cronbach's α of the total scale was 0.86. No significant differences between the Cronbach's α coefficients of the total LAEP-BG and original English, Chinese, Spanish, Korean, and Portuguese-Brazilian versions of the questionnaire were observed. The ICCs, which evaluate the test-retest reliability, were higher than the recommended value of 0.75 and determined the strong positive correlations between the first and second examinations. The creation of two subscales "Neurological and psychiatric side effects" and "Non neurological side effects" of the LAEP-BG proposed by us showed good internal consistency (Cronbach's α of 0.85 and 0.71, respectively). 
The LAEP-BG scores significantly correlated with other questionnaires such as the Quality of Life in Epilepsy Inventory-89 (QOLIE-89) and showed a good discriminative validity between groups with different levels of self-assessed AEs of AEDs. The Bulgarian version of the Liverpool Adverse Event Profile (LAEP) is a reliable and valid tool in assessing the patient-reported AEs of AEDs and their impact on the patient's outcome. Copyright © 2014 Elsevier Inc. All rights reserved.
Mental Disorder Among Homeless Persons in the United States: An Overview of Recent Empirical Literature.

ERIC Educational Resources Information Center

Robertson, Marjorie J.

1986-01-01

Reviews literature on the homeless reporting higher rates of psychiatric disorder, psychological distress, and previous psychiatric hospitalization compared to the general population. However, unstandardized methodology and lack of consistent findings across studies prohibit reliable prevalence estimates of mental disorder among the homeless.…
Two-dimensional echo-cardiographic estimation of left atrial volume and volume load in patients with congenital heart disease.

PubMed

Kawaguchi, A; Linde, L M; Imachi, T; Mizuno, H; Akutsu, H

1983-12-01

To estimate the left atrial volume (LAV) and pulmonary blood flow in patients with congenital heart disease (CHD), we employed two-dimensional echocardiography (TDE). The LAV was measured in dimensions other than those obtained in conventional M-mode echocardiography (M-mode echo). Mathematical and geometrical models for LAV calculation using the standard long-axis, short-axis and apical four-chamber planes were devised and found to be reliable in a preliminary study using porcine heart preparations, although length (10%), area (20%) and volume (38%) were significantly and consistently underestimated with echocardiography. Those models were then applied and correlated with angiocardiograms (ACG) in 25 consecutive patients with suspected CHD. In terms of the estimation of the absolute LAV, accuracy seemed commensurate with the number of the dimensions measured. The correlation between data obtained by TDE and ACG varied with changing hemodynamics such as cardiac cycle, absolute LAV and presence or absence of volume load. The left atrium was found to become spherical and progressively underestimated with TDE at ventricular endsystole, in larger LAV and with increased volume load. Since this tendency became less pronounced in measuring additional dimensions, reliable estimation of the absolute LAV and volume load was possible when 2 or 3 dimensions were measured. Among those calculation models depending on 2 or 3 dimensional measurements, there was only a small difference in terms of accuracy and predictability, although algorithm used varied from one model to another. This suggests that accurate cross-sectional area measurement is critically important for volume estimation rather than any particular algorithm involved. Cross-sectional area measurement by TDE integrated into a three dimensional equivalent allowed a reliable estimate of the LAV or volume load in a variety of hemodynamic situations where M-mode echo was not reliable.
[New questionnaire to assess self-efficacy toward physical activity in children].

PubMed

Aedo, Angeles; Avila, Héctor

2009-10-01

To design a questionnaire for assessment of self-efficacy toward physical activity in school children, as well as to measure its construct validity, test-retest reliability, and internal consistency. A four-stage multimethod approach was used: (1) bibliographic research followed by exploratory study and the formulation of questions and responses based on a dichotomous scale of 14 items; (2) validation of the content by a panel of experts; (3) application of the preliminary version of the questionnaire to a sample of 900 school-aged children in Mexico City; and (4) determination of the construct validity, test-retest reliability, and internal consistency (Cronbach's alpha). Three factors were identified that explain 64.15% of the variance: the search for positive alternatives to physical activity, ability to deal with possible barriers to exercising, and expectations of skill or competence. The model was validated using the goodness of fit, and the result of 65% less than 0.05 indicated that the estimated factor model fit the data. Cronbach's consistency alpha was 0.733; test-retest reliability was 0.867. The scale designed has adequate reliability and validity. These results are a good indicator of self-efficacy toward physical activity in school children, which is important when developing programs intended to promote such behavior in this age group.
Structural validity and reliability of the Positive and Negative Affect Schedule (PANAS): evidence from a large Brazilian community sample.

PubMed

Carvalho, Hudson W de; Andreoli, Sérgio B; Lara, Diogo R; Patrick, Christopher J; Quintana, Maria Inês; Bressan, Rodrigo A; Melo, Marcelo F de; Mari, Jair de J; Jorge, Miguel R

2013-01-01

Positive and negative affect are the two psychobiological-dispositional dimensions reflecting proneness to positive and negative activation that influence the extent to which individuals experience life events as joyful or as distressful. The Positive and Negative Affect Schedule (PANAS) is a structured questionnaire that provides independent indexes of positive and negative affect. This study aimed to validate a Brazilian interview-version of the PANAS by means of factor and internal consistency analysis. A representative community sample of 3,728 individuals residing in the cities of São Paulo and Rio de Janeiro, Brazil, voluntarily completed the PANAS. Exploratory structural equation model analysis was based on maximum likelihood estimation and reliability was calculated via Cronbach's alpha coefficient. Our results provide support for the hypothesis that the PANAS reliably measures two distinct dimensions of positive and negative affect. The structure and reliability of the Brazilian version of the PANAS are consistent with those of its original version. Taken together, these results attest the validity of the Brazilian adaptation of the instrument.
Stable Estimation of a Covariance Matrix Guided by Nuclear Norm Penalties

PubMed Central

Chi, Eric C.; Lange, Kenneth

2014-01-01

Estimation of a covariance matrix or its inverse plays a central role in many statistical methods. For these methods to work reliably, estimated matrices must not only be invertible but also well-conditioned. The current paper introduces a novel prior to ensure a well-conditioned maximum a posteriori (MAP) covariance estimate. The prior shrinks the sample covariance estimator towards a stable target and leads to a MAP estimator that is consistent and asymptotically efficient. Thus, the MAP estimator gracefully transitions towards the sample covariance matrix as the number of samples grows relative to the number of covariates. The utility of the MAP estimator is demonstrated in two standard applications – discriminant analysis and EM clustering – in this sampling regime. PMID:25143662
Test-retest reliability and comparability of paper and computer questionnaires for the Finnish version of the Tampa Scale of Kinesiophobia.

PubMed

Koho, P; Aho, S; Kautiainen, H; Pohjolainen, T; Hurri, H

2014-12-01

To estimate the internal consistency, test-retest reliability and comparability of paper and computer versions of the Finnish version of the Tampa Scale of Kinesiophobia (TSK-FIN) among patients with chronic pain. In addition, patients' personal experiences of completing both versions of the TSK-FIN and preferences between these two methods of data collection were studied. Test-retest reliability study. Paper and computer versions of the TSK-FIN were completed twice on two consecutive days. The sample comprised 94 consecutive patients with chronic musculoskeletal pain participating in a pain management or individual rehabilitation programme. The group rehabilitation design consisted of physical and functional exercises, evaluation of the social situation, psychological assessment of pain-related stress factors, and personal pain management training in order to regain overall function and mitigate the inconvenience of pain and fear-avoidance behaviour. The mean TSK-FIN score was 37.1 [standard deviation (SD) 8.1] for the computer version and 35.3 (SD 7.9) for the paper version. The mean difference between the two versions was 1.9 (95% confidence interval 0.8 to 2.9). Test-retest reliability was 0.89 for the paper version and 0.88 for the computer version. Internal consistency was considered to be good for both versions. The intraclass correlation coefficient for comparability was 0.77 (95% confidence interval 0.66 to 0.85), indicating substantial reliability between the two methods. Both versions of the TSK-FIN demonstrated substantial intertest reliability, good test-retest reliability, good internal consistency and acceptable limits of agreement, suggesting their suitability for clinical use. However, subjects tended to score higher when using the computer version. As such, in an ideal situation, data should be collected in a similar manner throughout the course of rehabilitation or clinical research. Copyright © 2014 Chartered Society of Physiotherapy. 
Published by Elsevier Ltd. All rights reserved.
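The comparison of paper and computer scores described in this record can be sketched as a paired-difference analysis. The example below assumes Bland-Altman style 95% limits of agreement (mean difference ± 1.96 SD of the differences); the abstract does not say exactly how its limits were computed, and the data are hypothetical.

```python
from statistics import mean, stdev

def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper) for paired scores from two methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)        # systematic difference between methods
    spread = stdev(diffs)     # sample SD of the paired differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread

# Hypothetical paired scores, for illustration only
computer = [38, 41, 35, 44, 30, 37]
paper = [36, 40, 34, 41, 29, 36]
bias, lower, upper = limits_of_agreement(computer, paper)
```

A positive bias in this sketch corresponds to the pattern the abstract reports, where subjects tended to score higher on the computer version.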
Resting State Network Estimation in Individual Subjects

PubMed Central

Hacker, Carl D.; Laumann, Timothy O.; Szrama, Nicholas P.; Baldassarre, Antonello; Snyder, Abraham Z.

2014-01-01

Resting-state functional magnetic resonance imaging (fMRI) has been used to study brain networks associated with both normal and pathological cognitive function. The objective of this work is to reliably compute resting state network (RSN) topography in single participants. We trained a supervised classifier (multi-layer perceptron; MLP) to associate blood oxygen level dependent (BOLD) correlation maps corresponding to pre-defined seeds with specific RSN identities. Hard classification of maps obtained from a priori seeds was highly reliable across new participants. Interestingly, continuous estimates of RSN membership retained substantial residual error. This result is consistent with the view that RSNs are hierarchically organized, and therefore not fully separable into spatially independent components. After training on a priori seed-based maps, we propagated voxel-wise correlation maps through the MLP to produce estimates of RSN membership throughout the brain. The MLP generated RSN topography estimates in individuals consistent with previous studies, even in brain regions not represented in the training data. This method could be used in future studies to relate RSN topography to other measures of functional brain organization (e.g., task-evoked responses, stimulation mapping, and deficits associated with lesions) in individuals. The multi-layer perceptron was directly compared to two alternative voxel classification procedures, specifically, dual regression and linear discriminant analysis; the perceptron generated more spatially specific RSN maps than either alternative. PMID:23735260
Validity and reliability of global operative assessment of laparoscopic skills (GOALS) in novice trainees performing a laparoscopic cholecystectomy.

PubMed

Kramp, Kelvin H; van Det, Marc J; Hoff, Christiaan; Lamme, Bas; Veeger, Nic J G M; Pierie, Jean-Pierre E N

2015-01-01

Global Operative Assessment of Laparoscopic Skills (GOALS) assessment has been designed to evaluate skills in laparoscopic surgery. A longitudinal blinded study of randomized video fragments was conducted to estimate the validity and reliability of GOALS in novice trainees. In total, 10 trainees each performed 6 consecutive laparoscopic cholecystectomies. Sixty procedures were recorded on video. Video fragments of (1) opening of the peritoneum; (2) dissection of Calot's triangle and achievement of critical view of safety; and (3) dissection of the gallbladder from the liver bed were blinded, randomized, and rated by 2 consultant surgeons using GOALS. Also, a grade was given for overall competence. The correlation of GOALS with live observation Objective Structured Assessment of Technical Skills (OSATS) scores was calculated. Construct validity was estimated using the Friedman 2-way analysis of variance by ranks and the Wilcoxon signed-rank test. The interrater reliability was calculated using the absolute and consistency agreement 2-way random-effects model intraclass correlation coefficient. A high correlation was found between mean GOALS score (r = 0.879, p = 0.021) and mean OSATS score. The GOALS score increased significantly across the 6 procedures (p = 0.002). The trainees performed significantly better on their sixth when compared with their first cholecystectomy (p = 0.004). The consistency agreement interrater reliability was 0.37 for the mean GOALS score (p = 0.002) and 0.55 for overall competence (p < 0.001) of the 3 video fragments. The validity observed in this randomized blinded longitudinal study supports the existing evidence that GOALS is a valid tool for assessment of novice trainees. A relatively low reliability was found in this study. Copyright © 2014 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
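The distinction between consistency and absolute agreement in a two-way intraclass correlation can be made concrete with a short sketch. It assumes the single-measure forms computed from a two-way ANOVA without replication; the function name and the toy ratings are hypothetical, and the study's exact ICC variant may differ.

```python
def icc_two_way(ratings):
    """Single-measure consistency and absolute-agreement ICCs.

    ratings: list of subjects, each a list with one score per rater.
    Returns (consistency, agreement).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return consistency, agreement

# Hypothetical case: rater 2 always scores exactly 2 points higher than rater 1
consistency, agreement = icc_two_way([[1, 3], [2, 4], [3, 5], [4, 6]])
```

In this toy case the raters rank the subjects identically, so the consistency ICC is 1, while the constant offset between raters pulls the absolute-agreement ICC well below 1.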
Translation, cross-cultural adaptation, and validation of the Turkish version of the Harris Hip Score.

PubMed

Çelik, Derya; Can, Canan; Aslan, Yasemin; Ceylan, Hasan Huseyin; Bilsel, Kerem; Ozdincler, Arzu Razak

2014-01-01

The Harris Hip Score (HHS) was developed to assess function and pain from the perspective of patients with hip pathologies. The purpose of this study was to translate and culturally adapt the HHS into Turkish, and thereby determine the reliability and validity of the translated version. The HHS was translated into Turkish in accordance with the stages recommended by Beaton. The measurement properties of the HHS were tested in 80 patients (52 males; mean age 51 years, range 21-75 years) suffering from different hip pathologies. The test-retest reliability was tested in 58 patients (28 males; mean age 52 years, range 30-73 years) after an interval of seven days. Cronbach's alpha was used to assess internal consistency and the intra-class correlation coefficient (ICC) was used to estimate the test-retest reliability. Patients were asked to answer the Oxford Hip Score (OHS), the Western Ontario and McMaster Universities Arthritis Index (WOMAC), the VAS and the Short Form-36 (SF-36) to assess validity. The Turkish version of the HHS showed sufficient internal consistency (Cronbach's alpha = 0.70) and test-retest reliability (ICC = 0.91). The correlation coefficients between the HHS and the WOMAC and between the HHS and the OHS were 0.64 and 0.89, respectively. The highest correlations between the HHS and SF-36 were with the physical function scale (r = 0.72), and the lowest correlations were with the mental function scale (r = 0.10). We observed no floor or ceiling effects. The Turkish version of the HHS has sufficient reliability and validity to measure patient-reported outcome for Turkish-speaking individuals with a variety of hip disorders.
Reliability of self-rated tinnitus distress and association with psychological symptom patterns.

PubMed

Hiller, W; Goebel, G; Rief, W

1994-05-01

Psychological complaints were investigated in two samples of 60 and 138 in-patients suffering from chronic tinnitus. We administered the Tinnitus Questionnaire (TQ), a 52-item self-rating scale which differentiates between dimensions of emotional and cognitive distress, intrusiveness, auditory perceptual difficulties, sleep disturbances and somatic complaints. The test-retest reliability was .94 for the TQ global score and between .86 and .93 for subscales. Three independent analyses were conducted to estimate the split-half reliability (internal consistency), which was only slightly lower than the test-retest values for scales with a relatively small number of items. Reliability was also sufficient at the level of single items. Low correlations between the TQ and the Hopkins Symptom Checklist (SCL-90-R) indicate a distinct quality of tinnitus-related and general psychological disturbances.
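A split-half reliability estimate of the kind mentioned in this record can be sketched as follows. The odd/even item split and the Spearman-Brown step-up are assumptions chosen for illustration; the abstract does not specify how the halves were formed or whether a correction was applied.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(scores):
    """scores: list of respondents, each a list of item scores."""
    odd_half = [sum(row[0::2]) for row in scores]    # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in scores]   # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_half, even_half))
```

Because the result depends on which items land in each half, repeating the calculation over several different splits gives a sense of how stable the estimate is.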

Accuracy of the visual estimation method as a predictor of food intake in Alzheimer's patients provided with different types of food.

PubMed

Amano, Nobuko; Nakamura, Tomiyo

2018-02-01

The visual estimation method is commonly used in hospitals and other care facilities to evaluate food intake through estimation of plate waste. In Japan, no previous studies have investigated the validity and reliability of this method under the routine conditions of a hospital setting. The present study aimed to evaluate the validity and reliability of the visual estimation method, in long-term inpatients with different levels of eating disability caused by Alzheimer's disease. The patients were provided different therapeutic diets presented in various food types. This study was performed between February and April 2013, and 82 patients with Alzheimer's disease were included. Plate waste was evaluated for the 3 main daily meals, for a total of 21 days, 7 consecutive days during each of the 3 months, originating a total of 4851 meals, from which 3984 were included. Plate waste was measured by the nurses through the visual estimation method, and by the hospital's registered dietitians through the actual measurement method. The actual measurement method was first validated to serve as a reference, and the level of agreement between both methods was then determined. The month, time of day, type of food provided, and patients' physical characteristics were considered for analysis. For the 3984 meals included in the analysis, the level of agreement between the measurement methods was 78.4%. Disagreement of measurements consisted of 3.8% of underestimation and 17.8% of overestimation. Cronbach's α (0.60, P < 0.001) indicated that the reliability of the visual estimation method was within the acceptable range. The visual estimation method was found to be a valid and reliable method for estimating food intake in patients with different levels of eating impairment. The successful implementation and use of the method depends upon adequate training and motivation of the nurses and care staff involved. Copyright © 2017 European Society for Clinical Nutrition and Metabolism. 
Published by Elsevier Ltd. All rights reserved.
A new multidimensional measure of African adolescents' perceptions of teachers' behaviors.

PubMed

Mboya, M M

1994-04-01

The Perceived Teacher Behavior Inventory was designed to measure three dimensions of students' perceptions of the behaviors of their teachers. This research was conducted to assess the statistical validity and reliability of the instrument administered to 770 students attending two coeducational high schools in Cape Town, South Africa. Factor analysis clearly identified three subscales indicating that the instrument distinguished the students' perceptions of their teachers' behaviors in three areas. Estimates of internal consistency of the subscales were assessed using the squared multiple correlation as the index of reliability.
The Joint Confidence Level Paradox: A History of Denial

NASA Technical Reports Server (NTRS)

Butts, Glenn; Linton, Kent

2009-01-01

This paper is intended to provide a reliable methodology for those tasked with generating price tags on construction (CoF) and research and development (R&D) activities in the NASA performance world. This document consists of a collection of cost-related engineering detail and project fulfillment information from early agency days to the present. Accurate historical detail is the first place to start when determining improved methodologies for future cost and schedule estimating. This paper contains a beneficial proposed cost estimating method for arriving at more reliable numbers for future submits. When comparing current cost and schedule methods with earlier cost and schedule approaches, it became apparent that NASA's organizational performance paradigm has morphed. Mission fulfillment speed has slowed and cost calculating factors have increased in 21st Century space exploration.
The application of the statistical theory of extreme values to gust-load problems

NASA Technical Reports Server (NTRS)

Press, Harry

1950-01-01

An analysis is presented which indicates that the statistical theory of extreme values is applicable to the problems of predicting the frequency of encountering the larger gust loads and gust velocities for both specific test conditions as well as commercial transport operations. The extreme-value theory provides an analytic form for the distributions of maximum values of gust load and velocity. Methods of fitting the distribution are given along with a method of estimating the reliability of the predictions. The theory of extreme values is applied to available load data from commercial transport operations. The results indicate that the estimates of the frequency of encountering the larger loads are more consistent with the data and more reliable than those obtained in previous analyses. (author)
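To make the idea of fitting an extreme-value distribution to maximum loads concrete, the sketch below fits a Gumbel (Type I) distribution by the method of moments and evaluates the probability of exceeding a given load. Both the choice of the Gumbel form and the moment-based fit are assumptions made for illustration; the report's own fitting procedure is not described in this abstract, and the data are hypothetical.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fit_gumbel_moments(maxima):
    """Method-of-moments estimates of Gumbel location and scale."""
    scale = stdev(maxima) * math.sqrt(6) / math.pi
    location = mean(maxima) - EULER_GAMMA * scale
    return location, scale

def exceedance_probability(x, location, scale):
    """P(maximum > x) under the fitted Gumbel distribution."""
    z = (x - location) / scale
    return 1.0 - math.exp(-math.exp(-z))

# Hypothetical per-flight maximum gust load increments, illustration only
maxima = [0.62, 0.71, 0.55, 0.90, 0.68, 0.77, 1.05, 0.60, 0.83, 0.74]
location, scale = fit_gumbel_moments(maxima)
p_exceed = exceedance_probability(1.2, location, scale)
```

The reciprocal of the exceedance probability gives a rough return period, in numbers of flights, for a load of that size.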
Czech results at criticality dosimetry intercomparison 2002.

PubMed

Frantisek, Spurný; Jaroslav, Trousil

2004-01-01

Two criticality dosimetry systems were tested by Czech participants during the intercomparison held in Valduc, France, June 2002. The first consisted of the thermoluminescent detectors (TLDs) (Al-P glasses) and Si-diodes as passive neutron dosemeters. Second, it was studied to what extent the individual dosemeters used in the Czech routine personal dosimetry service can give a reliable estimation of criticality accident exposure. It was found that the first system furnishes quite reliable estimation of accidental doses. For routine individual dosimetry system, no important problems were encountered in the case of photon dosemeters (TLDs, film badge). For etched track detectors in contact with the 232Th or 235U-Al alloy, the track density saturation for the spark counting method limits the upper dose at approximately 1 Gy for neutrons with the energy >1 MeV.
Production Variability and Single Word Intelligibility in Aphasia and Apraxia of Speech

ERIC Educational Resources Information Center

Haley, Katarina L.; Martin, Gwenyth

2011-01-01

This study was designed to estimate test-retest reliability of orthographic speech intelligibility testing in speakers with aphasia and AOS and to examine its relationship to the consistency of speaker and listener responses. Monosyllabic single word speech samples were recorded from 13 speakers with coexisting aphasia and AOS. These words were…
National visitor use monitoring implementation in Alaska.

Treesearch

Eric M. White; Joshua B. Wilson

2008-01-01

The USDA Forest Service implemented the National Visitor Use Monitoring (NVUM) program across the entire National Forest System (NFS) in calendar year 2000. The primary objective of the NVUM program is to develop reliable estimates of recreation use on NFS lands via a nationally consistent, statistically valid sampling approach. Secondary objectives of NVUM are to...
Measures of Instruction for Creative Engagement: Making Metacognition, Modeling and Creative Thinking Visible

ERIC Educational Resources Information Center

Pitts, Christine; Anderson, Ross; Haney, Michele

2018-01-01

The purpose of the current study was to estimate reliability, internal consistency and construct validity of the Measure of Instruction for Creative Engagement (MICE) instrument. The MICE uses an iterative process of evidence collection and scoring through teacher observations to determine instructional domain ratings and overall scores. The…
Estimating distributions with increasing failure rate in an imperfect repair model.

PubMed

Kvam, Paul H; Singh, Harshinder; Whitaker, Lyn R

2002-03-01

A failed system is repaired minimally if after failure, it is restored to the working condition of an identical system of the same age. We extend the nonparametric maximum likelihood estimator (MLE) of a system's lifetime distribution function to test units that are known to have an increasing failure rate. Such items comprise a significant portion of working components in industry. The order-restricted MLE is shown to be consistent. Similar results hold for the Brown-Proschan imperfect repair model, which dictates that a failed component is repaired perfectly with some unknown probability, and is otherwise repaired minimally. The estimators derived are motivated and illustrated by failure data in the nuclear industry. Failure times for groups of emergency diesel generators and motor-driven pumps are analyzed using the order-restricted methods. The order-restricted estimators are consistent and show distinct differences from the ordinary MLEs. Simulation results suggest significant improvement in reliability estimation is available in many cases when component failure data exhibit the IFR property.
Evaluation of dynamic balance among community-dwelling older adult fallers: a generalizability study of the limits of stability test.

PubMed

Clark, S; Rose, D J

2001-04-01

To establish reliability estimates of the 75% Limits of Stability Test (75% LOS test) when administered to community-dwelling older adults with a history of falls. Generalizability theory was used to estimate both the relative contribution of identified error sources to the total measurement error and generalizability coefficients. A random effects repeated-measures analysis of variance (ANOVA) was used to assess consistency of LOS test movement variables across both days and targets. A motor control research laboratory in a university setting. Fifty community-dwelling older adults with 2 or more falls in the previous year. Spatial and temporal measures of dynamic balance derived from the 75% LOS test included average movement velocity, maximum center of gravity (COG) excursion, end-point COG excursion, and directional control. Estimated generalizability coefficients for 2 testing days ranged from .58 to .87. Total variance in LOS test measures attributable to inconsistencies in day-to-day test performance (Day and Subject x Day facets) ranged from 2.5% to 8.4%. The ANOVA results indicated that no significant differences were observed in the LOS test variables across the 2 testing days. The 75% LOS test administered to older adult fallers on 2 consecutive days provides consistent and reliable measures of dynamic balance.
NDE reliability and probability of detection (POD) evolution and paradigm shift

NASA Astrophysics Data System (ADS)

Singh, Surendra

2014-02-01

The subject of NDE Reliability and POD has gone through multiple phases since its humble beginning in the late 1960s. This was followed by several programs including the important one nicknamed "Have Cracks - Will Travel" or in short "Have Cracks" by Lockheed Georgia Company for US Air Force during 1974-1978. This and other studies ultimately led to a series of developments in the field of reliability and POD starting from the introduction of fracture mechanics and Damage Tolerant Design (DTD) to the statistical framework by Berens and Hovey in 1981 for POD estimation to MIL-HDBK-1823 (1999) and MIL-HDBK-1823A (2009). During the last decade, various groups and researchers have further studied the reliability and POD using Model Assisted POD (MAPOD), Simulation Assisted POD (SAPOD), and applying Bayesian Statistics. Each of these developments had one objective, i.e., improving accuracy of life prediction in components that to a large extent depends on the reliability and capability of NDE methods. Therefore, it is essential to have reliable detection and sizing of large flaws in components. Currently, POD is used for studying reliability and capability of NDE methods, though POD data offers no absolute truth regarding NDE reliability, i.e., system capability, effects of flaw morphology, and quantifying the human factors. Furthermore, reliability and POD have been reported alike in meaning but POD is not NDE reliability. POD is a subset of the reliability that consists of six phases: 1) samples selection using DOE, 2) NDE equipment setup and calibration, 3) System Measurement Evaluation (SME) including Gage Repeatability & Reproducibility (Gage R&R) and Analysis Of Variance (ANOVA), 4) NDE system capability and electronic and physical saturation, 5) acquiring and fitting data to a model, and data analysis, and 6) POD estimation.
This paper provides an overview of all major POD milestones for the last several decades and discusses the rationale for using Integrated Computational Materials Engineering (ICME), MAPOD, SAPOD, and Bayesian statistics for studying controllable and non-controllable variables including human factors for estimating POD. Another objective is to list gaps between "hoped for" versus validated or fielded failed hardware.
Examining the interrater reliability of the Hare Psychopathy Checklist-Revised across a large sample of trained raters.

PubMed

Blais, Julie; Forth, Adelle E; Hare, Robert D

2017-06-01

The goal of the current study was to assess the interrater reliability of the Psychopathy Checklist-Revised (PCL-R) among a large sample of trained raters (N = 280). All raters completed PCL-R training at some point between 1989 and 2012 and subsequently provided complete coding for the same 6 practice cases. Overall, 3 major conclusions can be drawn from the results: (a) reliability of individual PCL-R items largely fell below any appropriate standards while the estimates for Total PCL-R scores and factor scores were good (but not excellent); (b) the cases representing individuals with high psychopathy scores showed better reliability than did the cases of individuals in the moderate to low PCL-R score range; and (c) there was a high degree of variability among raters; however, rater specific differences had no consistent effect on scoring the PCL-R. Therefore, despite low reliability estimates for individual items, Total scores and factor scores can be reliably scored among trained raters. We temper these conclusions by noting that scoring standardized videotaped case studies does not allow the rater to interact directly with the offender. Real-world PCL-R assessments typically involve a face-to-face interview and much more extensive collateral information. We offer recommendations for new web-based training procedures. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Internal consistency and stability of the CANTAB neuropsychological test battery in children.

PubMed

Syväoja, Heidi J; Tammelin, Tuija H; Ahonen, Timo; Räsänen, Pekka; Tolvanen, Asko; Kankaanpää, Anna; Kantomaa, Marko T

2015-06-01

The Cambridge Neuropsychological Test Automated Battery (CANTAB) is a computer-assessed test battery widely used in different populations. The internal consistency and 1-year stability of CANTAB tests were examined in school-age children. Two hundred-thirty children (57% girls) from five schools in the Jyväskylä school district in Finland participated in the study in spring 2011. The children completed the following CANTAB tests: (a) visual memory (pattern recognition memory [PRM] and spatial recognition memory [SRM]), (b) executive function (spatial span [SSP], Stockings of Cambridge [SOC], and intra-extra dimensional set shift [IED]), and (c) attention (reaction time [RTI] and rapid visual information processing [RVP]). Seventy-four children participated in the follow-up measurements (64% girls) in spring 2012. Cronbach's alpha reliability coefficient was used to estimate the internal consistency of the nonhampering test, and structural equation models were applied to examine the stability of these tests. The reliability and the stability could not be determined for IED or SSP because of the nature of these tests. The internal consistency was acceptable only in the RTI task. The 1-year stability was moderate-to-good for the PRM, RTI, and RVP. The SSP and IED showed a moderate correlation between the two measurement points. The SRM and the SOC tasks were not reliable or stable measures in this study population. For research purposes, we recommend using structural equation modeling to improve reliability. The results suggest that the reliability and the stability of computer-based test batteries should be confirmed in the target population before using them for clinical or research purposes. (c) 2015 APA, all rights reserved.
A Bayesian approach to reliability and confidence

NASA Technical Reports Server (NTRS)

Barnes, Ron

1989-01-01

The historical evolution of NASA's interest in quantitative measures of reliability assessment is outlined. The introduction of some quantitative methodologies into the Vehicle Reliability Branch of the Safety, Reliability and Quality Assurance (SR and QA) Division at Johnson Space Center (JSC) was noted along with the development of the Extended Orbiter Duration--Weakest Link study which will utilize quantitative tools for a Bayesian statistical analysis. Extending the earlier work of NASA sponsor, Richard Heydorn, researchers were able to produce a consistent Bayesian estimate for the reliability of a component and hence by a simple extension for a system of components in some cases where the rate of failure is not constant but varies over time. Mechanical systems in general have this property since the reliability usually decreases markedly as the parts degrade over time. While they have been able to reduce the Bayesian estimator to a simple closed form for a large class of such systems, the form for the most general case needs to be attacked by the computer. Once a table is generated for this form, researchers will have a numerical form for the general solution. With this, the corresponding probability statements about the reliability of a system can be made in the most general setting. Note that the utilization of uniform Bayesian priors represents a worst case scenario in the sense that as researchers incorporate more expert opinion into the model, they will be able to improve the strength of the probability calculations.
Use of the Environment and Policy Evaluation and Observation as a Self-Report Instrument (EPAO-SR) to measure nutrition and physical activity environments in child care settings: validity and reliability evidence.

PubMed

Ward, Dianne S; Mazzucca, Stephanie; McWilliams, Christina; Hales, Derek

2015-09-26

Early care and education (ECE) centers are important settings influencing young children's diet and physical activity (PA) behaviors. To better understand their impact on diet and PA behaviors as well as to evaluate public health programs aimed at ECE settings, we developed and tested the Environment and Policy Assessment and Observation - Self-Report (EPAO-SR), a self-administered version of the previously validated, researcher-administered EPAO. Development of the EPAO-SR instrument included modification of items from the EPAO, community advisory group and expert review, and cognitive interviews with center directors and classroom teachers. Reliability and validity data were collected across 4 days in 3-5 year old classrooms in 50 ECE centers in North Carolina. Center teachers and directors completed relevant portions of the EPAO-SR on multiple days according to a standardized protocol, and trained data collectors completed the EPAO for 4 days in the centers. Reliability and validity statistics calculated included percent agreement, kappa, correlation coefficients, coefficients of variation, deviations, mean differences, and intraclass correlation coefficients (ICC), depending on the response option of the item. Data demonstrated a range of reliability and validity evidence for the EPAO-SR instrument. Reporting from directors and classroom teachers was consistent and similar to the observational data. Items that produced strongest reliability and validity estimates included beverages served, outside time, and physical activity equipment, while items such as whole grains served and amount of teacher-led PA had lower reliability (observation and self-report) and validity estimates. To overcome lower reliability and validity estimates, some items need administration on multiple days. This study demonstrated appropriate reliability and validity evidence for use of the EPAO-SR in the field. 
The self-administered EPAO-SR is an advancement of the measurement of ECE settings and can be used by researchers and practitioners to assess the nutrition and physical activity environments of ECE settings.
A dynamic Thurstonian item response theory of motive expression in the picture story exercise: solving the internal consistency paradox of the PSE.

PubMed

Lang, Jonas W B

2014-07-01

The measurement of implicit or unconscious motives using the picture story exercise (PSE) has long been a target of debate in the psychological literature. Most debates have centered on the apparent paradox that PSE measures of implicit motives typically show low internal consistency reliability on common indices like Cronbach's alpha but nevertheless predict behavioral outcomes. I describe a dynamic Thurstonian item response theory (IRT) model that builds on dynamic system theories of motivation, theorizing on the PSE response process, and recent advancements in Thurstonian IRT modeling of choice data. To assess the models' capability to explain the internal consistency paradox, I first fitted the model to archival data (Gurin, Veroff, & Feld, 1957) and then simulated data based on bias-corrected model estimates from the real data. Simulation results revealed that the average squared correlation reliability for the motives in the Thurstonian IRT model was .74 and that Cronbach's alpha values were similar to the real data (<.35). These findings suggest that PSE motive measures have long been reliable and increase the scientific value of extant evidence from motivational research using PSE motive measures. (c) 2014 APA, all rights reserved.
Lower Bounds to the Reliabilities of Factor Score Estimators.

PubMed

Hessen, David J

2016-10-06

Under the general common factor model, the reliabilities of factor score estimators might be of more interest than the reliability of the total score (the unweighted sum of item scores). In this paper, lower bounds to the reliabilities of Thurstone's factor score estimators, Bartlett's factor score estimators, and McDonald's factor score estimators are derived and conditions are given under which these lower bounds are equal. The relative performance of the derived lower bounds is studied using classic example data sets. The results show that estimates of the lower bounds to the reliabilities of Thurstone's factor score estimators are greater than or equal to the estimates of the lower bounds to the reliabilities of Bartlett's and McDonald's factor score estimators.
Validity and reliability of Optojump photoelectric cells for estimating vertical jump height.

PubMed

Glatthorn, Julia F; Gouge, Sylvain; Nussbaumer, Silvio; Stauffacher, Simone; Impellizzeri, Franco M; Maffiuletti, Nicola A

2011-02-01

Vertical jump is one of the most prevalent acts performed in several sport activities. It is therefore important to ensure that the measurements of vertical jump height made as a part of research or athlete support work have adequate validity and reliability. The aim of this study was to evaluate concurrent validity and reliability of the Optojump photocell system (Microgate, Bolzano, Italy) with force plate measurements for estimating vertical jump height. Twenty subjects were asked to perform maximal squat jumps and countermovement jumps, and flight time-derived jump heights obtained by the force plate were compared with those provided by Optojump, to examine its concurrent (criterion-related) validity (study 1). Twenty other subjects completed the same jump series on 2 different occasions (separated by 1 week), and jump heights of session 1 were compared with session 2, to investigate test-retest reliability of the Optojump system (study 2). Intraclass correlation coefficients (ICCs) for validity were very high (0.997-0.998), even if a systematic difference was consistently observed between force plate and Optojump (-1.06 cm; p < 0.001). Test-retest reliability of the Optojump system was excellent, with ICCs ranging from 0.982 to 0.989, low coefficients of variation (2.7%), and low random errors (±2.81 cm). The Optojump photocell system demonstrated strong concurrent validity and excellent test-retest reliability for the estimation of vertical jump height. We propose the following equation that allows force plate and Optojump results to be used interchangeably: force plate jump height (cm) = 1.02 × Optojump jump height + 0.29. In conclusion, the use of Optojump photoelectric cells is legitimate for field-based assessments of vertical jump height.
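The conversion equation reported in the abstract above can be applied directly. The following is a minimal sketch: the slope (1.02) and intercept (0.29 cm) come from the abstract, while the function names and the inverse conversion are illustrative additions, not part of the original study.

```python
SLOPE = 1.02         # from the abstract
INTERCEPT_CM = 0.29  # from the abstract


def optojump_to_force_plate_cm(optojump_height_cm: float) -> float:
    """Convert an Optojump jump height (cm) to its force-plate equivalent (cm)."""
    return SLOPE * optojump_height_cm + INTERCEPT_CM


def force_plate_to_optojump_cm(force_plate_height_cm: float) -> float:
    """Algebraic inverse of the equation above (hypothetical helper)."""
    return (force_plate_height_cm - INTERCEPT_CM) / SLOPE


# A 30.0 cm Optojump reading maps to 30.89 cm after rounding to 2 decimals.
print(round(optojump_to_force_plate_cm(30.0), 2))
```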
Modeling and experimental characterization of electromigration in interconnect trees

NASA Astrophysics Data System (ADS)

Thompson, C. V.; Hau-Riege, S. P.; Andleigh, V. K.

1999-11-01

Most modeling and experimental characterization of interconnect reliability is focussed on simple straight lines terminating at pads or vias. However, laid-out integrated circuits often have interconnects with junctions and wide-to-narrow transitions. In carrying out circuit-level reliability assessments it is important to be able to assess the reliability of these more complex shapes, generally referred to as 'trees.' An interconnect tree consists of continuously connected high-conductivity metal within one layer of metallization. Trees terminate at diffusion barriers at vias and contacts, and, in the general case, can have more than one terminating branch when they include junctions. We have extended the understanding of 'immortality' demonstrated and analyzed for straight stud-to-stud lines, to trees of arbitrary complexity. This leads to a hierarchical approach for identifying immortal trees for specific circuit layouts and models for operation. To complete a circuit-level-reliability analysis, it is also necessary to estimate the lifetimes of the mortal trees. We have developed simulation tools that allow modeling of stress evolution and failure in arbitrarily complex trees. We are testing our models and simulations through comparisons with experiments on simple trees, such as lines broken into two segments with different currents in each segment. Models, simulations and early experimental results on the reliability of interconnect trees are shown to be consistent.
Reliability of a self-report Italian version of the AUDIT-C questionnaire, used to estimate alcohol consumption by pregnant women in an obstetric setting.

PubMed

Bazzo, Stefania; Battistella, Giuseppe; Riscica, Patrizia; Moino, Giuliana; Dal Pozzo, Giuseppe; Bottarel, Mery; Geromel, Mariasole; Czerwinsky, Loredana

2015-01-01

Alcohol consumption during pregnancy can result in a range of harmful effects on the developing foetus and newborn, called Fetal Alcohol Spectrum Disorders (FASD). Identifying pregnant women who use alcohol makes it possible to provide information, support and treatment for the women and surveillance of their children. The AUDIT-C (the shortened consumption version of the Alcohol Use Disorders Identification Test) is used for investigating risky drinking with different populations, and has been applied to estimate alcohol use and risky drinking also in antenatal clinics. The aim of the study was to investigate the reliability of a self-report Italian version of the AUDIT-C questionnaire to detect alcohol consumption during pregnancy, regardless of its use as a screening tool. The questionnaire was filled in by two independent consecutive series of pregnant women at the 38th gestation week visit in the two birth locations of the Local Health Authority of Treviso (Italy), during the years 2010 and 2011 (n=220 and n=239). Reliability analysis was performed using internal consistency, item-total score correlations, and inter-item correlations. The "discriminatory power" of the test was also evaluated. Overall, about one third of women recalled alcohol consumption at least once during the current pregnancy. The questionnaire had an internal consistency of 0.565 for the group of the year 2010, of 0.516 for the year 2011, and of 0.542 for the overall group. The highest item-total correlation coefficient was 0.687 and the highest inter-item correlation coefficient was 0.675. As for the discriminatory power of the questionnaire, the highest Ferguson's delta coefficient was 0.623. These findings suggest that the Italian self-report version of the AUDIT-C has unsatisfactory reliability for estimating alcohol consumption during pregnancy when used as a self-report questionnaire in an obstetric setting.
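The abstract above reports Ferguson's delta as its index of discriminatory power but does not give the formula. As a rough sketch, one common formulation scales the spread of observed total scores by the number of possible scores, so that delta is 0 when everyone obtains the same score and 1 when scores are uniformly distributed. The function name, the sample data, and the figure of 13 possible scores (a total of 0 to 12 from three items scored 0 to 4) are assumptions for illustration; the study's exact computation is not stated.

```python
from collections import Counter


def fergusons_delta(total_scores, n_possible_scores):
    """Ferguson's delta: S * (n^2 - sum(f_i^2)) / ((S - 1) * n^2).

    total_scores: one total score per respondent.
    n_possible_scores: S, the number of distinct totals the scale can produce.
    """
    n = len(total_scores)
    if n == 0 or n_possible_scores < 2:
        raise ValueError("need at least one respondent and two possible scores")
    # f_i is the number of respondents obtaining each observed total score.
    sum_sq_freq = sum(f * f for f in Counter(total_scores).values())
    return n_possible_scores * (n * n - sum_sq_freq) / ((n_possible_scores - 1) * n * n)


# Hypothetical totals, heavily clustered at low scores.
example_totals = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
print(fergusons_delta(example_totals, n_possible_scores=13))
```

Scores that cluster at the bottom of the scale, as is likely when most respondents report little or no drinking, pull delta well below 1.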

Reliability of Direct Behavior Ratings - Social Competence (DBR-SC) data: How many ratings are necessary?

PubMed

Kilgus, Stephen P; Riley-Tillman, T Chris; Stichter, Janine P; Schoemann, Alexander M; Bellesheim, Katie

2016-09-01

The purpose of this investigation was to evaluate the reliability of Direct Behavior Ratings-Social Competence (DBR-SC) ratings. Participants included 60 students identified as possessing deficits in social competence, as well as their 23 classroom teachers. Teachers used DBR-SC to complete ratings of 5 student behaviors within the general education setting on a daily basis across approximately 5 months. During this time, each student was assigned to 1 of 2 intervention conditions, including the Social Competence Intervention-Adolescent (SCI-A) and a business-as-usual (BAU) intervention. Ratings were collected across 3 intervention phases, including pre-, mid-, and postintervention. Results suggested DBR-SC ratings were highly consistent across time within each student, with reliability coefficients predominantly falling in the .80 and .90 ranges. Findings further indicated such levels of reliability could be achieved with only a small number of ratings, with estimates varying between 2 and 10 data points. Group comparison analyses further suggested the reliability of DBR-SC ratings increased over time, such that student behavior became more consistent throughout the intervention period. Furthermore, analyses revealed that for 2 of the 5 DBR-SC behavior targets, the increase in reliability over time was moderated by intervention grouping, with students receiving SCI-A demonstrating greater increases in reliability relative to those in the BAU group. Limitations of the investigation as well as directions for future research are discussed herein. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Reliability and Validity of a Questionnaire for Physical Activity Assessment in South American Children and Adolescents: The SAYCARE Study.

PubMed

Nascimento-Ferreira, Marcus Vinícius; De Moraes, Augusto César Ferreira; Toazza-Oliveira, Paulo Vinícius; Forjaz, Claudia L M; Aristizabal, Juan Carlos; Santaliesra-Pasías, Alba M; Lepera, Candela; Nascimento-Junior, Walter Viana; Skapino, Estela; Delgado, Carlos Alberto; Moreno, Luis Alberto; Carvalho, Heráclito Barbosa

2018-03-01

The objective of this article is to test the reliability and validity of a new and innovative physical activity (PA) questionnaire. Subsamples from the South American Youth/Child Cardiovascular and Environment Study (SAYCARE) were included to examine its reliability (children: n = 161; adolescents: n = 177) and validity (children: n = 82; adolescents: n = 60). The questionnaire consists of three dimensions of PA (leisure, active commuting, and school) performed during the last week. To assess its validity, the subjects wore accelerometers for at least 3 days and 8 h/d (at least one weekend day). The reliability was analyzed by correlation coefficients. In addition, Bland-Altman analysis and a multilevel regression were applied to estimate the measurement bias, limits of agreement, and influence of contextual variables. In children, the questionnaire showed consistent reliability (ρ = 0.56) and moderate validity (ρ = 0.46), and contextual variables explained 43.0% of the variance, with a bias of -22.9 min/d. In adolescents, the reliability was higher (ρ = 0.76) and the validity was almost excellent (ρ = 0.88), with 66.7% of the variance explained at the city level and a PA bias of 16.0 min/d. The SAYCARE PA questionnaire shows acceptable (in children) to strong (in adolescents) reliability and strong validity in the measurement of PA in the pediatric population from low- to middle-income countries. © 2018 The Obesity Society.
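The Bland-Altman analysis mentioned in the abstract above summarizes agreement between two methods by the mean of their paired differences (the bias) and the limits of agreement around it. The sketch below uses the conventional bias ± 1.96 standard deviations; the function name and the sample values (questionnaire and accelerometer minutes per day) are hypothetical and are not data from the study.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (a - b); the limits of agreement are
    bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical minutes/day of PA from a questionnaire and an accelerometer.
questionnaire = [30.0, 45.0, 60.0, 20.0]
accelerometer = [25.0, 50.0, 50.0, 30.0]
bias, lower, upper = bland_altman(questionnaire, accelerometer)
print(bias, lower, upper)
```

A negative bias, such as the -22.9 min/d reported for children, would mean the first method reads lower on average than the second under this sign convention.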
Validation of the German revised version of the program in palliative care education and practice questionnaire (PCEP-GR).

PubMed

Fetz, Katharina; Wenzel-Meyburg, Ursula; Schulz-Quach, Christian

2017-12-28

The evaluation of the effectiveness of undergraduate palliative care education (UPCE) programs is an essential foundation to providing high-quality UPCE programs. Therefore, the implementation of valid evaluation tools is indispensable. To date, there has been no general consensus regarding concrete outcome parameters and their accurate measurement. The Program in Palliative Care Education and Practice Questionnaire (German Revised Version; PCEP-GR) is a promising assessment tool for UPCE. The aim of the current study was to evaluate the psychometric properties of PCEP-GR and to demonstrate its feasibility for the evaluation of UPCE programs. The practical feasibility of the PCEP-GR and its acceptance among medical students were investigated in a pilot study with 24 undergraduate medical students at Heinrich Heine University Dusseldorf, Germany. Subsequently, the PCEP-GR was administered to a representative sample (N = 680) of medical students in order to investigate its psychometric properties. Factorial validity was investigated by means of principal component analysis (PCA). Reliability was examined by means of split-half-reliability analysis and analysis of internal consistency. After taking into consideration the PCA and distribution analysis results, an evaluation instruction for the PCEP-GR was developed. The PCEP-GR proved to be feasible and well accepted among medical students. PCA revealed a four-factorial solution indicating four PCEP-GR subscales: preparation to provide palliative care, attitudes towards palliative care, self-estimation of competence in communication with dying patients and their relatives and self-estimation of knowledge and skills in palliative care. The PCEP-GR showed good split-half-reliability and acceptable to good internal consistency of subscales. Attitudes towards palliative care slightly missed the criterion of acceptable internal consistency. The evaluation instruction suggests a global PCEP-GR index and four subscales.
The PCEP-GR has proven to be a feasible, economic, valid and reliable tool for the assessment of UPCE that comprises self-efficacy expectation and relevant attitudes towards palliative care.
Improved protocol and data analysis for accelerated shelf-life estimation of solid dosage forms.

PubMed

Waterman, Kenneth C; Carella, Anthony J; Gumkowski, Michael J; Lukulay, Patrick; MacDonald, Bruce C; Roy, Michael C; Shamblin, Sheri L

2007-04-01

To propose and test a new accelerated aging protocol for solid-state, small molecule pharmaceuticals which provides faster predictions for drug substance and drug product shelf-life. The concept of an isoconversion paradigm, where times in different temperature and humidity-controlled stability chambers are set to provide a critical degradant level, is introduced for solid-state pharmaceuticals. Reliable estimates for temperature and relative humidity effects are handled using a humidity-corrected Arrhenius equation, where temperature and relative humidity are assumed to be orthogonal. Imprecision is incorporated into a Monte-Carlo simulation to propagate the variations inherent in the experiment. In early development phases, greater imprecision in predictions is tolerated to allow faster screening with reduced sampling. Early development data are then used to design appropriate test conditions for more reliable later stability estimations. Examples are reported showing that predicted shelf-life values for lower temperatures and different relative humidities are consistent with the measured shelf-life values at those conditions. The new protocols and analyses provide accurate and precise shelf-life estimations in a reduced time from current state of the art.
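The abstract above combines a humidity-corrected Arrhenius equation with Monte-Carlo propagation of imprecision. The sketch below assumes the commonly cited form ln k = ln A - Ea/(RT) + B·RH and a constant degradation rate up to the specification limit; both are assumptions, since the abstract does not state the equation. All parameter values, function names, and the choice of independent normal draws are hypothetical, and sampling the parameters independently ignores any correlation between fitted values.

```python
import math
import random

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def ln_rate(ln_a, ea_kj_mol, b, temp_c, rh_percent):
    """Assumed humidity-corrected Arrhenius form: ln k = ln A - Ea/(R*T) + B*RH."""
    temp_k = temp_c + 273.15
    return ln_a - ea_kj_mol / (R_KJ * temp_k) + b * rh_percent


def shelf_life_days(ln_a, ea_kj_mol, b, temp_c, rh_percent, spec_limit):
    """Days to reach spec_limit (% degradant), assuming a constant rate k in %/day."""
    return spec_limit / math.exp(ln_rate(ln_a, ea_kj_mol, b, temp_c, rh_percent))


def monte_carlo_shelf_life(means, sds, temp_c, rh_percent, spec_limit,
                           n_draws=2000, seed=0):
    """Return (5th percentile, median, 95th percentile) of simulated shelf life."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Independent normal draws for (ln A, Ea, B): a simplifying assumption.
        ln_a, ea, b = (rng.gauss(m, s) for m, s in zip(means, sds))
        draws.append(shelf_life_days(ln_a, ea, b, temp_c, rh_percent, spec_limit))
    draws.sort()
    return draws[int(0.05 * n_draws)], draws[n_draws // 2], draws[int(0.95 * n_draws)]


# Hypothetical fitted parameters (ln A, Ea in kJ/mol, B per %RH) and their SDs.
low, median, high = monte_carlo_shelf_life(
    means=(30.0, 100.0, 0.04), sds=(0.5, 1.0, 0.005),
    temp_c=25.0, rh_percent=60.0, spec_limit=0.5)
print(low, median, high)
```

Reporting a percentile band rather than a single value mirrors the abstract's point that greater imprecision can be tolerated in early development and tightened later.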
Fundamentals of endoscopic surgery: creation and validation of the hands-on test.

PubMed

Vassiliou, Melina C; Dunkin, Brian J; Fried, Gerald M; Mellinger, John D; Trus, Thadeus; Kaneva, Pepa; Lyons, Calvin; Korndorffer, James R; Ujiki, Michael; Velanovich, Vic; Kochman, Michael L; Tsuda, Shawn; Martinez, Jose; Scott, Daniel J; Korus, Gary; Park, Adrian; Marks, Jeffrey M

2014-03-01

The Fundamentals of Endoscopic Surgery™ (FES) program consists of online materials and didactic and skills-based tests. All components were designed to measure the skills and knowledge required to perform safe flexible endoscopy. The purpose of this multicenter study was to evaluate the reliability and validity of the hands-on component of the FES examination, and to establish the pass score. Expert endoscopists identified the critical skill set required for flexible endoscopy. These skills were then modeled in a virtual reality simulator (GI Mentor™ II, Simbionix™ Ltd., Airport City, Israel) to create five tasks and metrics. Scores were designed to measure both speed and precision. Validity evidence was assessed by correlating performance with self-reported endoscopic experience (surgeons and gastroenterologists [GIs]). Internal consistency of each test task was assessed using Cronbach's alpha. Test-retest reliability was determined by having the same participant perform the test a second time and comparing their scores. Passing scores were determined by a contrasting groups methodology and use of receiver operating characteristic curves. A total of 160 participants (17% GIs) performed the simulator test. Scores on the five tasks showed good internal consistency reliability and all had significant correlations with endoscopic experience. Total FES scores correlated 0.73 with participants' level of endoscopic experience, providing evidence of their validity, and their internal consistency reliability (Cronbach's alpha) was 0.82. Test-retest reliability was assessed in 11 participants, and the intraclass correlation was 0.85. The passing score was determined and is estimated to have a sensitivity (true positive rate) of 0.81 and a false positive rate (1 - specificity) of 0.21. The FES hands-on skills test examines the basic procedural components required to perform safe flexible endoscopy.
It meets rigorous standards of reliability and validity required for high-stakes examinations, and, together with the knowledge component, may help contribute to the definition and determination of competence in endoscopy.
The SASSI-3 Face Valid other Drugs Scale: A Psychometric Investigation (Substance Abuse Subtle Screening Inventory-3)

ERIC Educational Resources Information Center

Laux, John M.; Perera-Diltz, Dilani; Smirnoff, Jennifer B.; Salyers, Kathleen M.

2005-01-01

The authors investigated the psychometric capabilities of the Face Valid Other Drugs (FVOD) scale of the Substance Abuse Subtle Screening Inventory-3 (SASSI-3; G. A. Miller, 1999). Internal consistency reliability estimates and construct validity factor analysis for 230 college students provided initial support for the psychometric properties of…
Ultrasound semi-automated measurement of fetal nuchal translucency thickness based on principal direction estimation

NASA Astrophysics Data System (ADS)

Yoon, Heechul; Lee, Hyuntaek; Jung, Haekyung; Lee, Mi-Young; Won, Hye-Sung

2015-03-01

The objective of the paper is to introduce a novel method for boundary detection and thickness measurement of the nuchal translucency (NT), one of the most significant markers in the early screening of chromosomal defects such as Down syndrome. To improve the reliability and reproducibility of NT measurements, several automated methods have been introduced. However, the performance of these methods degrades when NT borders are tilted due to varying fetal movements. Therefore, we propose a principal direction estimation based NT measurement method to provide reliable and consistent performance regardless of both fetal positions and NT directions. First, the Radon transform and a cost function are used to estimate the principal direction of NT borders. Then, on the estimated angle bin, i.e., the main direction of NT, gradient based features are employed to find initial NT lines which are beginning points of the active contour fitting method to find real NT borders. Finally, the maximum thickness is measured from distances between the upper and lower border of NT by searching along lines orthogonal to the main NT direction. To evaluate the performance, 89 in vivo fetal images were collected and the ground-truth database was measured by clinical experts. Quantitative results using intraclass correlation coefficients and difference analysis verify that the proposed method can improve the reliability and reproducibility in the measurement of maximum NT thickness.
Is Coefficient Alpha Robust to Non-Normal Data?

PubMed Central

Sheng, Yanyan; Sheng, Zhaohui

2011-01-01

Coefficient alpha has been a widely used measure by which internal consistency reliability is assessed. In addition to essential tau-equivalence and uncorrelated errors, normality has been noted as another important assumption for alpha. Earlier work on evaluating this assumption considered either exclusively non-normal error score distributions, or limited conditions. In view of this and the availability of advanced methods for generating univariate non-normal data, Monte Carlo simulations were conducted to show that non-normal distributions for true or error scores do create problems for using alpha to estimate the internal consistency reliability. The sample coefficient alpha is affected by leptokurtic true score distributions, or skewed and/or kurtotic error score distributions. Increased sample sizes, not test lengths, help improve the accuracy, bias, or precision of using it with non-normal data. PMID:22363306
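For readers who want to see the statistic examined in the abstract above, the following is a minimal sketch of the sample coefficient alpha, computed from item variances and the variance of the total score. The function name and the toy data are illustrative; the sketch makes no attempt to address the non-normality issues the abstract raises.

```python
import statistics


def cronbach_alpha(scores):
    """Sample coefficient alpha.

    scores: list of rows, one per respondent, each a list of k item scores.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    k = len(scores[0])
    # Sample variance (n - 1 denominator) of each item across respondents.
    item_vars = [statistics.variance([row[j] for row in scores]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: 4 respondents, 2 items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3], [4, 5]]))
```

Because the estimate depends only on these sample variances, heavy-tailed or skewed score distributions affect it through the variance estimates, which is consistent with the abstract's finding that larger samples, rather than longer tests, improve its behaviour.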
Interval estimation of the overall treatment effect in a meta-analysis of a few small studies with zero events.

PubMed

Pateras, Konstantinos; Nikolakopoulos, Stavros; Mavridis, Dimitris; Roes, Kit C B

2018-03-01

When a meta-analysis consists of a few small trials that report zero events, accounting for heterogeneity in the (interval) estimation of the overall effect is challenging. Typically, we predefine meta-analytical methods to be employed. In practice, data poses restrictions that lead to deviations from the pre-planned analysis, such as the presence of zero events in at least one study arm. We aim to explore heterogeneity estimators behaviour in estimating the overall effect across different levels of sparsity of events. We performed a simulation study that consists of two evaluations. We considered an overall comparison of estimators unconditional on the number of observed zero cells and an additional one by conditioning on the number of observed zero cells. Estimators that performed modestly robust when (interval) estimating the overall treatment effect across a range of heterogeneity assumptions were the Sidik-Jonkman, Hartung-Makambi and improved Paul-Mandel. The relative performance of estimators did not materially differ between making a predefined or data-driven choice. Our investigations confirmed that heterogeneity in such settings cannot be estimated reliably. Estimators whose performance depends strongly on the presence of heterogeneity should be avoided. The choice of estimator does not need to depend on whether or not zero cells are observed.
Assessment of the reliability of protein-protein interactions and protein function prediction.

PubMed

Deng, Minghua; Sun, Fengzhu; Chen, Ting

2003-01-01

As more and more high-throughput protein-protein interaction data are collected, the task of estimating the reliability of different data sets becomes increasingly important. In this paper, we present our study of two groups of protein-protein interaction data, the physical interaction data and the protein complex data, and estimate the reliability of these data sets using three different measurements: (1) the distribution of gene expression correlation coefficients, (2) the reliability based on gene expression correlation coefficients, and (3) the accuracy of protein function predictions. We develop a maximum likelihood method to estimate the reliability of protein interaction data sets according to the distribution of correlation coefficients of gene expression profiles of putative interacting protein pairs. The results of the three measurements are consistent with each other. The MIPS protein complex data have the highest mean gene expression correlation coefficients (0.256) and the highest accuracy in predicting protein functions (70% sensitivity and specificity), while Ito's Yeast two-hybrid data have the lowest mean (0.041) and the lowest accuracy (15% sensitivity and specificity). Uetz's data are more reliable than Ito's data in all three measurements, and the TAP protein complex data are more reliable than the HMS-PCI data in all three measurements as well. The complex data sets generally perform better in function predictions than do the physical interaction data sets. Proteins in complexes are shown to be more highly correlated in gene expression. The results confirm that the components of a protein complex can be assigned to functions that the complex carries out within a cell. There are three interaction data sets different from the above two groups: the genetic interaction data, the in-silico data and the syn-express data. 
Their capability of predicting protein functions generally falls between that of the Y2H data and that of the MIPS protein complex data. The supplementary information is available at the following Web site: http://www-hto.usc.edu/-msms/AssessInteraction/.
Interrater Reliability Estimators Commonly Used in Scoring Language Assessments: A Monte Carlo Investigation of Estimator Accuracy

ERIC Educational Resources Information Center

Morgan, Grant B.; Zhu, Min; Johnson, Robert L.; Hodge, Kari J.

2014-01-01

Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of…
Reliability of self-reported childhood physical abuse by adults and factors predictive of inconsistent reporting.

PubMed

McKinney, Christy M; Harris, T Robert; Caetano, Raul

2009-01-01

Little is known about the reliability of self-reported child physical abuse (CPA) or CPA reporting practices. We estimated reliability and prevalence of self-reported CPA and identified factors predictive of inconsistent CPA reporting among 2,256 participants using surveys administered in 1995 and 2000. Reliability of CPA was fair to moderate (kappa = 0.41). Using a positive report from either survey, the prevalence of moderate (61.8%) and severe (12.0%) CPA was higher than at either survey alone. Compared to consistent reporters of having experienced CPA, inconsistent reporters were less likely to be ≥ 30 years old (vs. 18-29) or Black (vs. White) and more likely to have < 12 years of education (vs. 12), have no alcohol-related problems (vs. having problems), or report one type (vs. ≥ 2) of CPA. These findings may assist researchers conducting and interpreting studies of CPA.
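The kappa statistic reported in this record measures agreement between two administrations beyond what chance alone would produce. The sketch below shows the usual Cohen's kappa calculation for two sets of categorical reports; the function name `cohen_kappa` and the two short response vectors are hypothetical and are not the study's data.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical
    responses from the same respondents (hypothetical helper)."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need two non-empty sequences of equal length")
    categories = set(first) | set(second)
    # Observed proportion of respondents who answered the same both times
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal proportions
    p_expected = sum(
        (first.count(c) / n) * (second.count(c) / n) for c in categories
    )
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical yes/no (1/0) reports at two survey waves
wave_1 = [1, 1, 1, 1, 0, 0, 0, 0]
wave_2 = [1, 1, 1, 0, 0, 0, 0, 1]
kappa = cohen_kappa(wave_1, wave_2)
print(kappa)
```

In this toy case six of eight respondents agree (0.75) while chance agreement is 0.5, so only half of the possible beyond-chance agreement is achieved.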
Confirmatory Factor Analysis of the System for Evaluation of Teaching Qualities (SETQ) in Graduate Medical Training.

PubMed

Boerebach, Benjamin C M; Lombarts, Kiki M J M H; Arah, Onyebuchi A

2016-03-01

The System for Evaluation of Teaching Qualities (SETQ) was developed as a formative system for the continuous evaluation and development of physicians' teaching performance in graduate medical training. It has been seven years since the introduction and initial exploratory psychometric analysis of the SETQ questionnaires. This study investigates the validity and reliability of the SETQ questionnaires across hospitals and medical specialties using confirmatory factor analyses (CFAs), reliability analysis, and generalizability analysis. The SETQ questionnaires were tested in a sample of 3,025 physicians and 2,848 trainees in 46 hospitals. The CFA revealed acceptable fit of the data to the previously identified five-factor model. The high internal consistency estimates suggest satisfactory reliability of the subscales. These results provide robust evidence for the validity and reliability of the SETQ questionnaires for evaluating physicians' teaching performance. © The Author(s) 2014.
Validity and Reliability of a New Instrument to Measure Cancer-Related Fatigue in Adolescents

PubMed Central

Hinds, Pamela S.; Hockenberry, Marilyn; Tong, Xin; Rai, Shesh N.; Gattuso, Jamie S.; McCarthy, Kathleen; Pui, Ching-Hon; Srivastava, Deo Kumar

2008-01-01

Adolescents undergoing treatment for cancer rate fatigue as their most prevalent and intense cancer- and treatment-related effect. Parents and staff rate it similarly. Despite its reported prevalence, intensity, and distressing effects, cancer-related fatigue in adolescents is not routinely assessed during or after cancer treatment. We contend that the insufficient clinical attention is primarily due to the lack of a reliable and valid self-report instrument with which adolescent cancer-related fatigue can be measured. Our aim was to determine the reliability and construct validity of a new instrument and its ability to measure change in fatigue over time. Initial testing involved 64 adolescents undergoing curative treatment of cancer who completed the Fatigue Scale-Adolescent (FS-A) at two to four key points in treatment in one of four studies. Internal consistency estimates ranged from 0.67 to 0.95. Validity estimates involving the FS-A with the parent version ranged from 0.13 to 0.76; estimates involving the staff version and the Reynolds Depression Scale were 0.27 and 0.87 respectively. Additional validity findings included significant fatigue differences between anemic and non-anemic patients (P = 0.042) and the emergence of four factors in an exploratory factor analysis. Findings further indicate that the FS-A can be used to measure change over time (t = 2.55, P <0.01). In summary, the FS-A has moderate to strong reliability and impressive validity coefficients for a new research instrument. PMID:17629669
eHealth literacy in chronic disease patients: An item response theory analysis of the eHealth literacy scale (eHEALS).

PubMed

Paige, Samantha R; Krieger, Janice L; Stellefson, Michael; Alber, Julia M

2017-02-01

Chronic disease patients are affected by low computer and health literacy, which negatively affects their ability to benefit from access to online health information. To estimate reliability and confirm model specifications for eHealth Literacy Scale (eHEALS) scores among chronic disease patients using Classical Test (CTT) and Item Response Theory techniques. A stratified sample of Black/African American (N=341) and Caucasian (N=343) adults with chronic disease completed an online survey including the eHEALS. Item discrimination was explored using bi-variate correlations and Cronbach's alpha for internal consistency. A categorical confirmatory factor analysis tested a one-factor structure of eHEALS scores. Item characteristic curves, in-fit/outfit statistics, omega coefficient, and item reliability and separation estimates were computed. A 1-factor structure of eHEALS was confirmed by statistically significant standardized item loadings, acceptable model fit indices (CFI/TLI>0.90), and 70% variance explained by the model. Item response categories increased with higher theta levels, and there was evidence of acceptable reliability (ω=0.94; item reliability=89; item separation=8.54). eHEALS scores are a valid and reliable measure of self-reported eHealth literacy among Internet-using chronic disease patients. Providers can use eHEALS to help identify patients' eHealth literacy skills. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The Swedish translation and cross-cultural adaptation of the Functional Assessment of Chronic Illness Therapy - Cervical Dysplasia (FACIT-CD): linguistic validity and reliability of the Swedish version.

PubMed

Rask, Marie; Oscarsson, Marie; Ludwig, Neil; Swahnberg, Katarina

2017-04-04

Cervical dysplasia is a precancerous condition, which has been shown to create anxiety in women. To be able to investigate these women's health-related quality of life, a disease-specific instrument is required. There does not seem to be a Swedish version of an instrument to screen for this specific disease. Therefore, this study aims to translate and cross-culturally adapt the Functional Assessment of Chronic Illness Therapy - Cervical Dysplasia (FACIT-CD) into a Swedish context and evaluate its linguistic validity and reliability. The Functional Assessment of Chronic Illness Therapy (FACIT) translation methodology was used, which consists of several steps including pilot testing of the FACIT-CD instrument through cognitive debriefing interviews. Ten women diagnosed with cervical dysplasia participated in the cognitive debriefing interviews. The internal consistency reliability of the Swedish FACIT-CD was estimated by Cronbach's alpha coefficient. Homogeneity of the items was evaluated by corrected item-total correlations. The sample consists of 34 women who were diagnosed with cervical dysplasia. The translation and cross-cultural adaptation went smoothly without any problems for the majority of the items. The cognitive debriefing interviews indicated that the Swedish FACIT-CD consists of relevant items, is easy to understand and complete, and has unambiguous and comprehensive response categories. The translation and cross-cultural adaptation resulted in a Swedish FACIT-CD, which is conceptually and semantically equivalent to the English version and linguistically valid. The total scale of the Swedish FACIT-CD exhibited good internal consistency reliability with a Cronbach's alpha coefficient of 0.84, and all of the subscales exhibited acceptable value between 0.71 and 0.81 except the Relationships subscale, which had a value of 0.67. Finally, all but four items exceeded the acceptable level for the corrected item-total correlations of ≥ 0.20. 
The Swedish FACIT-CD is conceptually and semantically equivalent to the English version and linguistically valid; further, it exhibits good internal consistency reliability.
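The corrected item-total correlation used in this record correlates each item with the sum of the remaining items, so that an item is not correlated with itself. The following is a small sketch under that standard definition; the helper names and the toy responses are assumptions for illustration, not the study's data or software.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(scores):
    """For each item, correlate it with the total of all other items."""
    k = len(scores[0])
    result = []
    for i in range(k):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]  # total without item i
        result.append(pearson(item, rest))
    return result


# Hypothetical responses: five respondents, three items
responses = [
    [1, 2, 2],
    [2, 3, 1],
    [3, 4, 4],
    [4, 5, 3],
    [5, 6, 5],
]
correlations = corrected_item_total(responses)
# Items below the 0.20 threshold mentioned in the abstract would be flagged
flagged = [i for i, r in enumerate(correlations) if r < 0.20]
print(correlations, flagged)
```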
Practical no-gold-standard evaluation framework for quantitative imaging methods: application to lesion segmentation in positron emission tomography

PubMed Central

Jha, Abhinav K.; Mena, Esther; Caffo, Brian; Ashrafinia, Saeed; Rahmim, Arman; Frey, Eric; Subramaniam, Rathan M.

2017-01-01

Recently, a class of no-gold-standard (NGS) techniques has been proposed to evaluate quantitative imaging methods using patient data. These techniques provide figures of merit (FoMs) quantifying the precision of the estimated quantitative value without requiring repeated measurements and without requiring a gold standard. However, applying these techniques to patient data presents several practical difficulties including assessing the underlying assumptions, accounting for patient-sampling-related uncertainty, and assessing the reliability of the estimated FoMs. To address these issues, we propose statistical tests that provide confidence in the underlying assumptions and in the reliability of the estimated FoMs. Furthermore, the NGS technique is integrated within a bootstrap-based methodology to account for patient-sampling-related uncertainty. The developed NGS framework was applied to evaluate four methods for segmenting lesions from 18F-Fluoro-2-deoxyglucose positron emission tomography images of patients with head-and-neck cancer on the task of precisely measuring the metabolic tumor volume. The NGS technique consistently predicted the same segmentation method as the most precise method. The proposed framework provided confidence in these results, even when gold-standard data were not available. The bootstrap-based methodology indicated improved performance of the NGS technique with larger numbers of patient studies, as was expected, and yielded consistent results as long as data from more than 80 lesions were available for the analysis. PMID:28331883
Validation of the Brazilian Portuguese Version of Geriatric Anxiety Inventory--GAI-BR.

PubMed

Massena, Patrícia Nitschke; de Araújo, Narahyana Bom; Pachana, Nancy; Laks, Jerson; de Pádua, Analuiza Camozzato

2015-07-01

The Geriatric Anxiety Inventory (GAI) is a recently developed scale aiming to evaluate symptoms of anxiety in later life. This 20-item scale uses dichotomous answers highlighting non-somatic anxiety complaints of elderly people. The present study aimed to evaluate the psychometric properties of the Brazilian Portuguese version GAI (GAI-BR) in a sample from community and outpatient psychogeriatric clinic. A mixed convenience sample of 72 subjects was recruited for answering the research protocol. The interview procedures were structured with questionnaires about sociodemographic data, clinical health status, anxiety, and depression previously validated instruments, Mini-Mental State Examination, Mini International Neuropsychiatric Interview, and GAI-BR. Twenty-two percent of the sample were interviewed twice for test-retest reliability. For internal consistency analyses, the Cronbach's α test was applied. The Spearman correlation test was applied to evaluate the test-retest GAI-BR reliability. A ROC (receiver operating characteristic) curve study was made to estimate the GAI-BR area under curve, cut-off points, sensitivity, and specificity for the Generalized Anxiety Disorder diagnosis. The GAI-BR version showed high internal consistency (Cronbach's α = 0.91) and strong and significant test-retest reliability (ρ = 0.85, p < 0.001). It also showed moderate and significant correlation with the Beck Anxiety Inventory (ρ = 0.68, p < 0.001) and the State-Trait Anxiety Inventory (ρ = 0.61, p < 0.001) showing evidence of concurrent validation. The cut-off point of 13 estimated by ROC curve analyses showed sensitivity of 83.3% and specificity of 84.6% to detect Generalized Anxiety Disorder (DSM-IV). GAI-BR has demonstrated very good psychometric properties and can be a reliable instrument to measure anxiety in Brazilian elderly people.
Antisocial Personality Disorder Subscale (Chinese Version) of the Structured Clinical Interview for the DSM-IV Axis II disorders: validation study in Cantonese-speaking Hong Kong Chinese.

PubMed

Tang, D Y Y; Liu, A C Y; Leung, M H T; Siu, B W M

2013-06-01

OBJECTIVE. Antisocial personality disorder (ASPD) is a risk factor for violence and is associated with poor treatment response when it is a co-morbid condition with substance abuse. It is an under-recognised clinical entity in the local Hong Kong setting, for which there are only a few available Chinese-language diagnostic instruments. None has been tested for its psychometric properties in the Cantonese-speaking population in Hong Kong. This study therefore aimed to assess the reliability and validity of the Chinese version of the ASPD subscale of the Structured Clinical Interview for the DSM-IV Axis II Disorders (SCID-II) in Hong Kong Chinese. METHODS. This assessment tool was modified according to dialectal differences between Mainland China and Hong Kong. Inpatients in Castle Peak Hospital, Hong Kong, who were designated for priority follow-up based on their assessed propensity for violence and who fulfilled the inclusion criteria for the study, were recruited. To assess the level of agreement, best-estimate diagnosis made by a multidisciplinary team was compared with diagnostic status determined by the SCID-II ASPD subscale. The internal consistency, sensitivity, and specificity of the subscale were also calculated. RESULTS. The internal consistency of the subscale was acceptable at 0.79, whereas the test-retest reliability and inter-rater reliability showed an excellent and good agreement of 0.90 and 0.86, respectively. Best-estimate clinical diagnosis-SCID diagnosis agreement was acceptable at 0.76. The sensitivity, specificity, positive and negative predictive values were 0.91, 0.86, 0.83, and 0.93, respectively. CONCLUSION. The Chinese version of the SCID-II ASPD subscale is reliable and valid for diagnosing ASPD in a Cantonese-speaking clinical population.
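The four accuracy figures reported in this record all derive from a 2x2 table that crosses the instrument's result with the reference diagnosis. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the counts are hypothetical, since the abstract does not give the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table of test result
    versus reference diagnosis (hypothetical helper)."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases the test detects
        "specificity": tn / (tn + fp),  # non-cases the test clears
        "ppv": tp / (tp + fp),          # positives that are true cases
        "npv": tn / (tn + fn),          # negatives that are non-cases
    }


# Hypothetical counts, not the study's data
metrics = diagnostic_metrics(tp=45, fp=20, fn=5, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the instrument, whereas the predictive values also depend on how common the diagnosis is in the sample.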
Methods Used to Streamline the CAHPS® Hospital Survey

PubMed Central

Keller, San; O'Malley, A James; Hays, Ron D; Matthew, Rebecca A; Zaslavsky, Alan M; Hepner, Kimberly A; Cleary, Paul D

2005-01-01

Objective To identify a parsimonious subset of reliable, valid, and consumer-salient items from 33 questions asking for patient reports about hospital care quality. Data Source CAHPS® Hospital Survey pilot data were collected during the summer of 2003 using mail and telephone from 19,720 patients who had been treated in 132 hospitals in three states and discharged from November 2002 to January 2003. Methods Standard psychometric methods were used to assess the reliability (internal consistency reliability and hospital-level reliability) and construct validity (exploratory and confirmatory factor analyses, strength of relationship to overall rating of hospital) of the 33 report items. The best subset of items from among the 33 was selected based on their statistical properties in conjunction with the importance assigned to each item by participants in 14 focus groups. Principal Findings Confirmatory factor analysis (CFA) indicated that a subset of 16 questions proposed to measure seven aspects of hospital care (communication with nurses, communication with doctors, responsiveness to patient needs, physical environment, pain control, communication about medication, and discharge information) demonstrated excellent fit to the data. Scales in each of these areas had acceptable levels of reliability to discriminate among hospitals and internal consistency reliability estimates comparable with previously developed CAHPS instruments. Conclusion Although half the length of the original, the shorter CAHPS hospital survey demonstrates promising measurement properties, identifies variations in care among hospitals, and deals with aspects of the hospital stay that are important to patients' evaluations of care quality. PMID:16316438

Test-Retest Analyses of the Test of English as a Foreign Language. TOEFL Research Reports Report 45.

ERIC Educational Resources Information Center

Henning, Grant

This study provides information about the total and component scores of the Test of English as a Foreign Language (TOEFL). First, the study provides comparative global and component estimates of test-retest, alternate-form, and internal-consistency reliability, controlling for sources of measurement error inherent in the examinees and the testing…
Estimation of Enthalpy of Formation of Liquid Transition Metal Alloys: A Modified Prescription Based on Macroscopic Atom Model of Cohesion

NASA Astrophysics Data System (ADS)

Raju, Subramanian; Saibaba, Saroja

2016-09-01

The enthalpy of formation $\Delta^{o}H_{f}$ is an important thermodynamic quantity, which sheds significant light on fundamental cohesive and structural characteristics of an alloy. However, being a difficult one to determine accurately through experiments, simple estimation procedures are often desirable. In the present study, a modified prescription for estimating $\Delta^{o}H_{f}^{L}$ of liquid transition metal alloys is outlined, based on the Macroscopic Atom Model of cohesion. This prescription relies on self-consistent estimation of liquid-specific model parameters, namely electronegativity ($\phi^{L}$) and bonding electron density ($n_{b}^{L}$). Such unique identification is made through the use of well-established relationships connecting surface tension, compressibility, and molar volume of a metallic liquid with bonding charge density. The electronegativity is obtained through a consistent linear scaling procedure. The preliminary set of values for $\phi^{L}$ and $n_{b}^{L}$, together with other auxiliary model parameters, is subsequently optimized to obtain a good numerical agreement between calculated and experimental values of $\Delta^{o}H_{f}^{L}$ for sixty liquid transition metal alloys. It is found that, with few exceptions, the use of liquid-specific model parameters in Macroscopic Atom Model yields a physically consistent methodology for reliable estimation of mixing enthalpies of liquid alloys.
Psychometric performance of the National Eye Institute visual function questionnaire in Latinos and non-Latinos.

PubMed

Baker, Richard S; Bazargan, Mohsen; Calderón, José L; Hays, Ron D

2006-08-01

To compare the psychometric performance of Spanish versions of the 25-item National Eye Institute Visual Function Questionnaire (NEI VFQ-25) and the NEI VFQ-39 administered to Latino patients with the psychometric performance of the standard English NEI VFQ-25 and NEI VFQ-39 administered to non-Latino patients. Clinic-based cross-sectional survey. Four hundred three patients (160 Latinos and 243 non-Latinos) recruited from general ophthalmology clinics of an urban public hospital over a 6-month period. Structured face-to-face interviews were conducted in Spanish and English to collect data for the NEI VFQ-25 and NEI VFQ-39. We calculated the mean, standard deviation, and percentage of participants having the minimum (floor) and maximum (ceiling) possible score for each item and scale. Internal consistency reliability of the NEI VFQ-25 and NEI VFQ-39 was estimated using the Cronbach alpha and average inter-item correlation. Construct validity for the instruments was assessed by comparing scores for participants classified as having normal versus impaired visual acuity. Instrument scales for general health; general vision; ocular pain; near activities; distance activities; vision-specific social functioning, mental health, role difficulties, and dependency; driving; color vision; and peripheral vision. Internal consistency reliability was significantly lower in the Spanish version than in the English version for 3 scales of the NEI VFQ-25. More importantly, 3 scales in the Spanish version manifested inadequate reliability (alpha ≤ 0.70), compared with only 1 inadequately reliable subscale in the English version. Reliability coefficients associated with the Spanish NEI VFQ-39 scales exceeded commonly accepted minimum standards. Comparison of reliability coefficients between Latino and non-Latino subgroups demonstrated statistically significant differences for 4 scales: Ocular Pain, Mental Health, Role Difficulties, and Dependency. 
In each case, the Latino group had the lower internal consistency reliability. However, only for the Ocular Pain subscale was reliability both significantly lower and inadequate (alpha<0.70). Overall performance of the NEI VFQ in Latino populations is adequate. However, in the absence of modifications to improve the reliability of specific Spanish version subscales, comparisons between Latino and non-Latino subgroups using the NEI VFQ must be interpreted with appropriate caution.
NDE reliability and probability of detection (POD) evolution and paradigm shift

DOE Office of Scientific and Technical Information (OSTI.GOV)

Singh, Surendra

2014-02-18

The subject of NDE Reliability and POD has gone through multiple phases since its humble beginning in the late 1960s. This was followed by several programs including the important one nicknamed “Have Cracks – Will Travel” or in short “Have Cracks” by Lockheed Georgia Company for US Air Force during 1974–1978. This and other studies ultimately led to a series of developments in the field of reliability and POD starting from the introduction of fracture mechanics and Damage Tolerant Design (DTD) to statistical framework by Berens and Hovey in 1981 for POD estimation to MIL-STD HDBK 1823 (1999) and 1823A (2009). During the last decade, various groups and researchers have further studied the reliability and POD using Model Assisted POD (MAPOD), Simulation Assisted POD (SAPOD), and applying Bayesian Statistics. All and each of these developments had one objective, i.e., improving accuracy of life prediction in components that to a large extent depends on the reliability and capability of NDE methods. Therefore, it is essential to have a reliable detection and sizing of large flaws in components. Currently, POD is used for studying reliability and capability of NDE methods, though POD data offers no absolute truth regarding NDE reliability, i.e., system capability, effects of flaw morphology, and quantifying the human factors. Furthermore, reliability and POD have been reported alike in meaning but POD is not NDE reliability. POD is a subset of the reliability that consists of six phases: 1) samples selection using DOE, 2) NDE equipment setup and calibration, 3) System Measurement Evaluation (SME) including Gage Repeatability and Reproducibility (Gage R and R) and Analysis Of Variance (ANOVA), 4) NDE system capability and electronic and physical saturation, 5) acquiring and fitting data to a model, and data analysis, and 6) POD estimation. 
This paper provides an overview of all major POD milestones for the last several decades and discusses the rationale for using Integrated Computational Materials Engineering (ICME), MAPOD, SAPOD, and Bayesian statistics for studying controllable and non-controllable variables including human factors for estimating POD. Another objective is to list gaps between “hoped for” versus validated or fielded failed hardware.
Automated comprehensive Adolescent Idiopathic Scoliosis assessment using MVC-Net.

PubMed

Wu, Hongbo; Bailey, Chris; Rasoulinejad, Parham; Li, Shuo

2018-05-18

Automated quantitative estimation of spinal curvature is an important task for the ongoing evaluation and treatment planning of Adolescent Idiopathic Scoliosis (AIS). It solves the widely accepted disadvantage of manual Cobb angle measurement (time-consuming and unreliable) which is currently the gold standard for AIS assessment. Attempts have been made to improve the reliability of automated Cobb angle estimation. However, it is very challenging to achieve accurate and robust estimation of Cobb angles due to the need for correctly identifying all the required vertebrae in both Anterior-posterior (AP) and Lateral (LAT) view x-rays. The challenge is especially evident in LAT x-ray where occlusion of vertebrae by the ribcage occurs. We therefore propose a novel Multi-View Correlation Network (MVC-Net) architecture that can provide a fully automated end-to-end framework for spinal curvature estimation in multi-view (both AP and LAT) x-rays. The proposed MVC-Net uses our newly designed multi-view convolution layers to incorporate joint features of multi-view x-rays, which allows the network to mitigate the occlusion problem by utilizing the structural dependencies of the two views. The MVC-Net consists of three closely-linked components: (1) a series of X-modules for joint representation of spinal structure (2) a Spinal Landmark Estimator network for robust spinal landmark estimation, and (3) a Cobb Angle Estimator network for accurate Cobb Angles estimation. By utilizing an iterative multi-task training algorithm to train the Spinal Landmark Estimator and Cobb Angle Estimator in tandem, the MVC-Net leverages the multi-task relationship between landmark and angle estimation to reliably detect all the required vertebrae for accurate Cobb angles estimation. 
Experimental results on 526 x-ray images from 154 patients show an impressive 4.04° Circular Mean Absolute Error (CMAE) in AP Cobb angle and 4.07° CMAE in LAT Cobb angle estimation, which demonstrates the MVC-Net's capability of robust and accurate estimation of Cobb angles in multi-view x-rays. Our method therefore provides clinicians with a framework for efficient, accurate, and reliable estimation of spinal curvature for comprehensive AIS assessment. Copyright © 2018. Published by Elsevier B.V.
Psychometric evaluation of the Nursing Stress Scale (NSS) among Chinese nurses in Taiwan.

PubMed

Lee, Mei-Hua; Holzemer, William L; Faucett, Julia

2007-01-01

The purpose of this study was to translate the Nursing Stress Scale (NSS) into Chinese and test its reliability and validity among Chinese nurses in Taiwan. Potential participants were asked to self-administer a Chinese version of the NSS. The agreement estimation was used to determine the equivalence of the meaning between the Chinese and original English versions and was rated by five bilingual nurses as 92% accurate for the 34 items. The test-retest reliability for the NSS at 2 weeks was .71 (p = .022, n=10). Internal consistency reliability and factor analysis were tested with 770 nurses from 65 inpatient units at a medical center in Taiwan. The internal consistency of the Chinese version of the NSS for an overall coefficient alpha is .91 for the total scale, and ranges from .67 to .79 for the subscales. The Chinese version of the NSS explains 53.77% of the variance in work stressors among Chinese nurses in Taiwan. Overall, the Chinese version of the NSS is internally consistent but may not be stable over 2 weeks. There was adequate evidence of the reliability and validity of the NSS-Chinese as an instrument appropriate to measure work stress among Chinese nurses. The translated NSS could be a useful tool for examining the frequency and major sources of stress experienced by Chinese nurses in hospital settings, and for the development of appropriate interventions for stress reduction.
Measuring teacher self-report on classroom practices: Construct validity and reliability of the Classroom Strategies Scale-Teacher Form.

PubMed

Reddy, Linda A; Dudek, Christopher M; Fabiano, Gregory A; Peters, Stephanie

2015-12-01

This article presents information about the construct validity and reliability of a new teacher self-report measure of classroom instructional and behavioral practices (the Classroom Strategies Scales-Teacher Form; CSS-T). The theoretical underpinnings and empirical basis for the instructional and behavioral management scales are presented. Information is provided about the construct validity, internal consistency, test-retest reliability, and freedom from item-bias of the scales. Given previous investigations with the CSS Observer Form, it was hypothesized that internal consistency would be adequate and that confirmatory factor analyses (CFA) of CSS-T data from 293 classrooms would offer empirical support for the CSS-T's Total, Composite and subscales, and yield a similar factor structure to that of the CSS Observer Form. Goodness-of-fit indices of χ2/df, Root Mean Square Error of Approximation, Goodness of Fit Index, and Adjusted Goodness of Fit Index suggested satisfactory fit of proposed CFA models whereas the Comparative Fit Index did not. Internal consistency estimates of .93 and .94 were obtained for the Instructional Strategies and Behavioral Strategies Total scales respectively. Adequate test-retest reliability was found for instructional and behavioral total scales (r = .79, r = .84, percent agreement 93% and 93%). The CSS-T evidences freedom from item bias on important teacher demographics (age, educational degree, and years of teaching experience). Implications of results are discussed. (c) 2015 APA, all rights reserved.
Development of a self-report questionnaire designed for population-based surveillance of gingivitis in adolescents: assessment of content validity and reliability

PubMed Central

QUIROZ, Viviana; REINERO, Daniela; HERNÁNDEZ, Patricia; CONTRERAS, Johanna; VERNAL, Rolando; CARVAJAL, Paola

2017-01-01

Abstract The major infectious diseases in Chile encompass the periodontal diseases, with a combined prevalence that rises up to 90% of the population. Thus, the population-based surveillance of periodontal diseases plays a central role for assessing their prevalence and for planning, implementing, and evaluating preventive and control programs. Self-report questionnaires have been proposed for the surveillance of periodontal diseases in adult populations world-wide. Objective This study aimed to develop and assess the content validity and reliability of a cognitively adapted self-report questionnaire designed for surveillance of gingivitis in adolescents. Material and Methods Ten predetermined self-report questions evaluating early signs and symptoms of gingivitis were preliminary assessed by a panel of clinical experts. Eight questions were selected and cognitively tested in 20 adolescents aged 12 to 18 years from Santiago de Chile. The questionnaire was then conducted and answered by 178 Chilean adolescents. Internal consistency was measured using the Cronbach’s alpha and temporal stability was calculated using the Kappa-index. Results A reliable final self-report questionnaire consisting of 5 questions was obtained, with a total Cronbach’s alpha of 0.73 and a Kappa-index ranging from 0.41 to 0.77 between the different questions. Conclusions The proposed questionnaire is reliable, with an acceptable internal consistency and a temporal stability from moderate to substantial, and it is promising for estimating the prevalence of gingivitis in adolescents. PMID:28877279
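The internal consistency statistic named in this record, Cronbach's alpha, can be illustrated with a minimal sketch. The response matrix below is hypothetical (it is not the study's data), and `cronbach_alpha` is an illustrative helper name; the calculation uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical yes/no (1/0) answers from six respondents to five questions.
responses = [
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of temporal stability.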
Reliability, validity, and utility of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) in assessments of bariatric surgery candidates.

PubMed

Tarescavage, Anthony M; Wygant, Dustin B; Boutacoff, Lana I; Ben-Porath, Yossef S

2013-12-01

In the current study, we examined the reliability, validity, and clinical utility of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2011) scores in a sample of 759 bariatric surgery candidates. We provide descriptives for all scales, internal consistency and standard error of measurement estimates for all substantive scales, external correlates of substantive scales using chart review and self-report criteria, and relative risk ratios to assess the clinical utility of the instrument. Results generally support the reliability, validity, and clinical utility of MMPI-2-RF scale scores in the psychological evaluation of bariatric surgery candidates. Limitations, future directions, and practical application of these results are discussed. (c) 2013 APA, all rights reserved.
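The standard error of measurement reported alongside internal consistency in this record follows from the classical relation SEM = SD * sqrt(1 - reliability). The sketch below assumes a T-score metric with SD = 10 and a hypothetical reliability of .84; neither value is taken from the study.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


# Assumed values for illustration only: T-score SD of 10, reliability of .84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)

# Approximate 95% band around an observed (hypothetical) score of 65.
observed = 65.0
band = (observed - 1.96 * sem, observed + 1.96 * sem)
print(f"SEM = {sem:.2f}, 95% band = {band[0]:.1f} to {band[1]:.1f}")
```

A lower reliability widens the band, which is the practical reason SEM is reported next to alpha for scales used in individual assessment.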
Test-retest of self-reported exposure to artificial tanning devices, self-tanning creams, and sun sensitivity showed consistency.

PubMed

Beane Freeman, Laura E; Dennis, Leslie K; Lynch, Charles F; Lowe, John B; Clarke, William R

2005-04-01

Exposure to ultraviolet radiation has consistently been linked to an increased risk of melanoma. Epidemiologic studies are susceptible to measurement error, which can distort the magnitude of observed effects. Although the reliability of self-report of many sun exposure factors has been previously described in several studies, self-report of use of artificial tanning devices and self-tanning creams has been less well characterized. A mailed survey was re-administered 2-4 weeks after completion of the initial survey to 76 randomly selected participants in a case-control study of melanoma. Cases and controls were individuals diagnosed in 1999 and 2000 who were ascertained from the Iowa Cancer Registry in 2002. We assessed the consistency of self-reported use of sunlamps and self-tanning creams, sun sensitivity, and history of sunburns. There was substantial reliability in reporting the use of sunlamps or self-tanning creams (cases: Kappa (kappa)=1.0 for both exposures; controls: kappa=0.71 and 0.87, respectively). kappa estimates of 0.62-0.78 were found for overall reliability of several sun sensitivity factors. Overall, the survey instrument demonstrated substantial reproducibility for factors related to the use of sunlamps or tanning beds, self-tanning creams, and sun sensitivity factors.
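The kappa statistic used here compares observed test-retest agreement with the agreement expected by chance. A minimal sketch of Cohen's kappa for two administrations of one categorical question follows; the answers are hypothetical and `cohen_kappa` is an illustrative helper, not code from the study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for paired categorical answers from two administrations."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal proportions of each category.
    expected = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical answers to "ever used a sunlamp?" at survey and re-survey.
survey = ["yes", "yes", "no", "no"]
resurvey = ["yes", "no", "no", "no"]
print(cohen_kappa(survey, resurvey))
```

In this toy case three of four answers match (observed agreement 0.75) and chance agreement is 0.5, so kappa is 0.5.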
Validity and Reliability of Malay Version of the Job Content Questionnaire among Public Hospital Female Nurses in Malaysia.

PubMed

Amin, N A; Quek, K F; Oxley, J A; Noah, R M; Nordin, R

2015-10-01

The Job Content Questionnaire (M-JCQ) is an established self-reported instrument used across the world to measure the work dimensions based on the Karasek's demand-control-support model. The objective was to evaluate the psychometric properties of the Malay version of M-JCQ among nurses in Malaysia. This cross-sectional study was carried out on nurses working in 4 public hospitals in Klang Valley area, Malaysia. M-JCQ was used to assess the perceived psychosocial stressors and physical demands of nurses at their workplaces. Construct validity of the questionnaire was examined using exploratory factor analysis (EFA). Cronbach's α values were used to estimate the reliability (internal consistency) of the M-JCQ. EFA showed that 34 selected items loaded on 4 factors. Except for psychological job demand (Cronbach's α 0.51), the remaining 3 α values for 3 subscales (job control, social support, and physical demand) were greater than 0.70, indicating acceptable internal consistency. However, an item was excluded due to poor item-total correlation (r<0.3). The final M-JCQ consisted of 33 items. The M-JCQ is a reliable and valid instrument to measure psychosocial and physical stressors in the workplace of public hospital nurses in Malaysia.
Reliability, Validity, and Clinical Utility of the Dominic Interactive for Adolescents-Revised: A DSM-5-Based Self-Report Screen for Mental Disorders, Borderline Personality Traits, and Suicidality.

PubMed

Bergeron, Lise; Smolla, Nicole; Berthiaume, Claude; Renaud, Johanne; Breton, Jean-Jacques; St-Georges, Marie; Morin, Pauline; Zavaglia, Elissa; Labelle, Réal

2017-03-01

The Dominic Interactive for Adolescents-Revised (DIA-R) is a multimedia self-report screen for 9 mental disorders, borderline personality traits, and suicidality defined by the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). This study aimed to examine the reliability and the validity of this instrument. French- and English-speaking adolescents aged 12 to 15 years (N = 447) were recruited from schools and clinical settings in Montreal and were evaluated twice. The internal consistency was estimated by Cronbach alpha coefficients and the test-retest reliability by intraclass correlation coefficients. Cutoff points on the DIA-R scales were determined by using clinically relevant measures for defining external validation criteria: the Schedule for Affective Disorders and Schizophrenia for School-Aged Children, the Beck Hopelessness Scale, and the Abbreviated-Diagnostic Interview for Borderlines. Receiver operating characteristic (ROC) analyses provided accuracy estimates (area under the ROC curve, sensitivity, specificity, likelihood ratio) to evaluate the ability of the DIA-R scales to predict external criteria. For most of the DIA-R scales, reliability coefficients were excellent or moderate. High or moderate accuracy estimates from ROC analyses demonstrated the ability of the DIA-R thresholds to predict psychopathological conditions. These thresholds were generally able to discriminate between clinical and school subsamples. However, the validity of the obsessions/compulsions scale was too low. Findings clearly support the reliability and the validity of the DIA-R. This instrument may be useful to assess a wide range of adolescents' mental health problems in the continuum of services. This conclusion applies to all scales, except the obsessions/compulsions one.
Validation of the Adolescent Concerns Measure (ACM): evidence from exploratory and confirmatory factor analysis.

PubMed

Ang, Rebecca P; Chong, Wan Har; Huan, Vivien S; Yeo, Lay See

2007-01-01

This article reports the development and initial validation of scores obtained from the Adolescent Concerns Measure (ACM), a scale which assesses concerns of Asian adolescent students. In Study 1, findings from exploratory factor analysis using 619 adolescents suggested a 24-item scale with four correlated factors--Family Concerns (9 items), Peer Concerns (5 items), Personal Concerns (6 items), and School Concerns (4 items). Initial estimates of convergent validity for ACM scores were also reported. The four-factor structure of ACM scores derived from Study 1 was confirmed via confirmatory factor analysis in Study 2 using a two-fold cross-validation procedure with a separate sample of 811 adolescents. Support was found for both the multidimensional and hierarchical models of adolescent concerns using the ACM. Internal consistency and test-retest reliability estimates were adequate for research purposes. ACM scores show promise as a reliable and potentially valid measure of Asian adolescents' concerns.
A method for vibrational assessment of cortical bone

NASA Astrophysics Data System (ADS)

Song, Yan; Gunaratne, Gemunu H.

2006-09-01

Large bones from many anatomical locations of the human skeleton consist of an outer shaft (cortex) surrounding a highly porous internal region (trabecular bone) whose structure is reminiscent of a disordered cubic network. Age related degradation of cortical and trabecular bone takes different forms. Trabecular bone weakens primarily by loss of connectivity of the porous network, and recent studies have shown that vibrational response can be used to obtain reliable estimates for loss of its strength. In contrast, cortical bone degrades via the accumulation of long fractures and changes in the level of mineralization of the bone tissue. In this paper, we model cortical bone by an initially solid specimen with uniform density to which long fractures are introduced; we find that, as in the case of trabecular bone, vibrational assessment provides more reliable estimates of residual strength in cortical bone than is possible using measurements of density or porosity.
Training and Maintaining System-Wide Reliability in Outcome Management.

PubMed

Barwick, Melanie A; Urajnik, Diana J; Moore, Julia E

2014-01-01

The Child and Adolescent Functional Assessment Scale (CAFAS) is widely used for outcome management, for providing real time client and program level data, and the monitoring of evidence-based practices. Methods of reliability training and the assessment of rater drift are critical for service decision-making within organizations and systems of care. We assessed two approaches for CAFAS training: external technical assistance and internal technical assistance. To this end, we sampled 315 practitioners trained by external technical assistance approach from 2,344 Ontario practitioners who had achieved reliability on the CAFAS. To assess the internal technical assistance approach as a reliable alternative training method, 140 practitioners trained internally were selected from the same pool of certified raters. Reliabilities were high for both practitioners trained by external technical assistance and internal technical assistance approaches (.909-.995, .915-.997, respectively). 1 and 3-year estimates showed some drift on several scales. High and consistent reliabilities over time and training method has implications for CAFAS training of behavioral health care practitioners, and the maintenance of CAFAS as a global outcome management tool in systems of care.
Validity and reliability of the Self-Reported Physical Fitness (SRFit) survey.

PubMed

Keith, NiCole R; Clark, Daniel O; Stump, Timothy E; Miller, Douglas K; Callahan, Christopher M

2014-05-01

An accurate physical fitness survey could be useful in research and clinical care. To estimate the validity and reliability of a Self-Reported Fitness (SRFit) survey; an instrument that estimates muscular fitness, flexibility, cardiovascular endurance, BMI, and body composition (BC) in adults ≥ 40 years of age. 201 participants completed the SF-36 Physical Function Subscale, International Physical Activity Questionnaire (IPAQ), Older Adults' Desire for Physical Competence Scale (Rejeski), the SRFit survey, and the Rikli and Jones Senior Fitness Test. BC, height and weight were measured. SRFit survey items described BC, BMI, and Senior Fitness Test movements. Correlations between the Senior Fitness Test and the SRFit survey assessed concurrent validity. Cronbach's Alpha measured internal consistency within each SRFit domain. SRFit domain scores were compared with SF-36, IPAQ, and Rejeski survey scores to assess construct validity. Intraclass correlations evaluated test-retest reliability. Correlations between SRFit and the Senior Fitness Test domains ranged from 0.35 to 0.79. Cronbach's Alpha scores were .75 to .85. Correlations between SRFit and other survey scores were -0.23 to 0.72 and in the expected direction. Intraclass correlation coefficients were 0.79 to 0.93. All P-values were 0.001. Initial evaluation supports the SRFit survey's validity and reliability.
A Bayesian Framework for Reliability Analysis of Spacecraft Deployments

NASA Technical Reports Server (NTRS)

Evans, John W.; Gallo, Luis; Kaminsky, Mark

2012-01-01

Deployable subsystems are essential to mission success of most spacecraft. These subsystems enable critical functions including power, communications and thermal control. The loss of any of these functions will generally result in loss of the mission. These subsystems and their components often consist of unique designs and applications for which various standardized data sources are not applicable for estimating reliability and for assessing risks. In this study, a two stage sequential Bayesian framework for reliability estimation of spacecraft deployment was developed for this purpose. This process was then applied to the James Webb Space Telescope (JWST) Sunshield subsystem, a unique design intended for thermal control of the Optical Telescope Element. Initially, detailed studies of NASA deployment history, "heritage information", were conducted, extending over 45 years of spacecraft launches. This information was then coupled to a non-informative prior and a binomial likelihood function to create a posterior distribution for deployments of various subsystems using Monte Carlo Markov Chain sampling. Select distributions were then coupled to a subsequent analysis, using test data and anomaly occurrences on successive ground test deployments of scale model test articles of JWST hardware, to update the NASA heritage data. This allowed for a realistic prediction for the reliability of the complex Sunshield deployment, with credibility limits, within this two stage Bayesian framework.
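The two-stage updating idea in this record can be sketched with a conjugate Beta-binomial model. This is a simplification under stated assumptions, not the authors' procedure: the record describes Markov chain sampling, whereas the sketch uses the closed-form Beta update, a uniform Beta(1, 1) prior as one possible non-informative choice, and entirely hypothetical deployment counts.

```python
import random


def update_beta(alpha, beta, successes, failures):
    """Conjugate update of a Beta(alpha, beta) prior with binomial data."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Posterior mean of the deployment success probability."""
    return alpha / (alpha + beta)


def lower_credible_limit(alpha, beta, level=0.95, draws=20000, seed=0):
    """One-sided lower credible limit, estimated by simple Monte Carlo draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    return samples[int((1 - level) * draws)]


# Stage 1: uniform prior updated with hypothetical heritage deployment counts.
prior = (1.0, 1.0)
heritage = update_beta(*prior, successes=98, failures=2)

# Stage 2: the stage-1 posterior becomes the prior for hypothetical ground tests.
posterior = update_beta(*heritage, successes=10, failures=0)

print(beta_mean(*posterior), lower_credible_limit(*posterior))
```

The design point is that the stage-1 posterior carries the heritage evidence forward, so the ground-test data refine rather than replace it.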
Coefficient alpha and interculture test selection.

PubMed

Thurber, Steven; Kishi, Yasuhiro

2014-04-01

The internal consistency reliability of a measure can be a focal point in an evaluation of the potential adequacy of an instrument for adaptation to another cultural setting. Cronbach's alpha (α) coefficient is often used as the statistical index for such a determination. However, alpha presumes a tau-equivalent test and may constitute an inaccurate population estimate for multidimensional tests. These notions are expanded and examined with a Japanese version of a questionnaire on nursing attitudes toward suicidal patients, originally constructed in Sweden using the English language. The English measure was reported to have acceptable internal consistency (α) albeit the dimensionality of the questionnaire was not addressed. The Japanese scale was found to lack tau-equivalence. An alternative to alpha, "composite reliability," was computed and found to be below acceptable standards in magnitude and precision. Implications for research application of the Japanese instrument are discussed. © The Author(s) 2012.
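The contrast drawn here between alpha and composite reliability can be made concrete with a small sketch. It assumes a single-factor model with standardized loadings and uncorrelated errors, so each error variance is 1 minus the squared loading; the loadings are hypothetical, and the function names are illustrative.

```python
from itertools import combinations


def composite_reliability(loadings):
    """Composite reliability from standardized single-factor loadings."""
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)  # error variance per item
    return s * s / (s * s + error)


def model_implied_alpha(loadings):
    """Standardized alpha from the correlations the same model implies."""
    k = len(loadings)
    pairs = list(combinations(loadings, 2))
    r_bar = sum(a * b for a, b in pairs) / len(pairs)  # mean inter-item r
    return k * r_bar / (1 + (k - 1) * r_bar)


equal = [0.7, 0.7, 0.7, 0.7]     # tau-equivalent case: the two indices agree
unequal = [0.9, 0.8, 0.4, 0.3]   # loadings differ: alpha falls below CR

for name, lam in (("equal", equal), ("unequal", unequal)):
    print(name, round(model_implied_alpha(lam), 3),
          round(composite_reliability(lam), 3))
```

When loadings are equal the two coefficients coincide; when they differ, alpha understates the model-based reliability, which is the tau-equivalence issue the record raises.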
The Revised School Culture Elements Questionnaire: Gender and Grade Level Invariant?

ERIC Educational Resources Information Center

DeVaney, Thomas A.; Adams, Nan B.; Hill-Winstead, Flo; Trahan, Mitzi P.

2012-01-01

The purpose of this research was to examine the psychometric properties of the RSCEQ with respect to invariance across gender and grade level, using a sample of 901 teachers from 44 schools in southeast Louisiana. Reliability estimates were consistent with previous research and ranged from 0.81 to 0.90 on the actual and 0.83 to 0.92 on the…
Learned perceptual associations influence visuomotor programming under limited conditions: kinematic consistency.

PubMed

Haffenden, Angela M; Goodale, Melvyn A

2002-12-01

Previous findings have suggested that visuomotor programming can make use of learned size information in experimental paradigms where movement kinematics are quite consistent from trial to trial. The present experiment was designed to test whether or not this conclusion could be generalized to a different manipulation of kinematic variability. As in previous work, an association was established between the size and colour of square blocks (e.g. red = large; yellow = small, or vice versa). Associating size and colour in this fashion has been shown to reliably alter the perceived size of two test blocks halfway in size between the large and small blocks: estimations of the test block matched in colour to the group of large blocks are smaller than estimations of the test block matched to the group of small blocks. Subjects grasped the blocks, and on other trials estimated the size of the blocks. These changes in perceived block size were incorporated into grip scaling only when movement kinematics were highly consistent from trial to trial; that is, when the blocks were presented in the same location on each trial. When the blocks were presented in different locations grip scaling remained true to the metrics of the test blocks despite the changes in perceptual estimates of block size. These results support previous findings suggesting that kinematic consistency facilitates the incorporation of learned perceptual information into grip scaling.

Evaluation of the Consumer Assessment of Healthcare Providers and Systems In-Center Hemodialysis Survey

PubMed Central

Paoli, Carly J.; Hays, Ron D.; Taylor-Stokes, Gavin; Piercy, James; Gitlin, Matthew

2014-01-01

Background and objectives The US Centers for Medicare and Medicaid Services (CMS) End Stage Renal Disease Prospective Payment System and Quality Incentive Program requires that dialysis centers meet predefined criteria for quality of patient care to ensure future funding. The CMS selected the Consumer Assessment of Healthcare Providers and Systems In-Center Hemodialysis (CAHPS-ICH) survey for the assessment of patient experience of care. This analysis evaluated the psychometric properties of the CAHPS-ICH survey in a sample of hemodialysis patients. Design, setting, participants, & measurements Data were drawn from the Adelphi CKD Disease Specific Program (a retrospective, cross-sectional survey of nephrologists and patients). Selected United States–based nephrologists treating patients receiving hemodialysis completed patient record forms and provided information on their dialysis center. Patients (n=404) completed the CAHPS-ICH survey (comprising 58 questions) providing six scores for the assessment of patient experience of care. CAHPS-ICH item-scale convergence, discrimination, and reliability were evaluated for multi-item scales. Floor and ceiling effects were estimated for all six scores. Patient (demographics, dialysis history, vascular access method) and facility characteristics (size, ratio of patients-to-physicians, nurses, and technicians) associated with the CAHPS-ICH scores were also evaluated. Results Item-scale correlations and internal consistency reliability estimates provided support for the nephrologists’ communication (range, 0.16–0.71; α=0.81) and quality of care (range, 0.16–0.76; α=0.90) composites. However, the patient information composite had low internal consistency reliability (α=0.55). 
Provider-to-patient ratios (range, 2.37 for facilities with >36 patients per physician to 2.8 for those with <8 patients per physician) and time spent in the waiting room (3.44 for >15 minutes of waiting time to 3.75 for 5 to <10 minutes) were characteristics most consistently related to patients’ perceptions of dialysis care. Conclusions CAHPS-ICH is a potentially valuable and informative tool for the evaluation of patients’ experiences with dialysis care. Additional studies are needed to estimate clinically meaningful differences between care providers. PMID:24832092
A Psychometric Analysis of Quality of Life Tools in Lung Cancer Patients Who Smoke

PubMed Central

Browning, Kristine K.; Ferketich, Amy K.; Otterson, Gregory A.; Reynolds, Nancy R.; Wewers, Mary Ellen

2009-01-01

Lung cancer is the leading cause of cancer death for both men and women in the United States. Patient quality of life (QOL) prior to cancer treatment is known to be a strong predictor of survival and toleration of treatment toxicities. A lung cancer patient’s self-assessment of QOL is highly valued among clinicians as it guides treatment-related decisions and impacts clinical outcomes. Smokers are known to report a lower QOL. Limited research has been conducted on QOL outcomes in lung cancer patients who continue to smoke. To assess QOL, a reliable and valid QOL measure specific to lung cancer is required. The Functional Assessment of Cancer Therapy-Lung Cancer (FACT-L) and Lung Cancer Symptom Scale (LCSS) are instruments that specifically examine QOL among lung cancer patients. The LCSS is a focused QOL instrument that includes physical and functional domains of QOL and disease symptomatology. The FACT-L is a broader QOL instrument that includes physical, functional, social and emotional domains and disease symptomatology. Both are psychometrically valid and are widely used in the literature, but have not been exclusively evaluated in smokers. Furthermore, there is no ‘gold standard’ instrument since there has never been a correlation study to compare estimates of reliability and validity between these instruments. The purpose of this study is to report the internal consistency and convergence validity of the FACT-L and the LCSS among newly diagnosed lung cancer patients who smoke. These data were collected and analyzed from a larger study examining smoking behavior among newly diagnosed lung cancer patients (n=51). Descriptive statistics were calculated on the FACT-L and LCSS scores, internal consistency was assessed by estimating Cronbach’s alpha coefficients, and Pearson correlation coefficients were estimated between the two scales. Internal consistency coefficients demonstrated good reliability for both scales, and the two instruments demonstrated a strong correlation, suggesting good convergence validity. Either of these instruments is an appropriate measure of QOL in lung cancer patients who smoke. Given the conceptual difference between the two instruments, it is important to carefully consider the research aims when selecting the appropriate QOL measurement instrument. PMID:19181418
The reliability of vertical jump tests between the Vertec and My Jump phone application.

PubMed

Yingling, Vanessa R; Castro, Dimitri A; Duong, Justin T; Malpartida, Fiorella J; Usher, Justin R; O, Jenny

2018-01-01

The vertical jump is used to estimate sports performance capabilities and physical fitness in children, elderly, non-athletic and injured individuals. Different jump techniques and measurement tools are available to assess vertical jump height and peak power; however, their use is limited by access to laboratory settings, excessive cost and/or time constraints thus making these tools oftentimes unsuitable for field assessment. A popular field test uses the Vertec and the Sargent vertical jump with countermovement; however, new low cost, easy to use tools are becoming available, including the My Jump iOS mobile application (app). The purpose of this study was to assess the reliability of the My Jump relative to values obtained by the Vertec for the Sargent stand and reach vertical jump (VJ) test. One hundred and thirty-five healthy participants aged 18-39 years (94 males, 41 females) completed three maximal Sargent VJ with countermovement that were simultaneously measured using the Vertec and the My Jump. Jump heights were quantified for each jump and peak power was calculated using the Sayers equation. Four separate ICC estimates and their 95% confidence intervals were used to assess reliability. Two analyses (with jump height and calculated peak power as the dependent variables, respectively) were based on a single rater, consistency, two-way mixed-effects model, while two others (with jump height and calculated peak power as the dependent variables, respectively) were based on a single rater, absolute agreement, two-way mixed-effects model. Moderate to excellent reliability relative to the degree of consistency between the Vertec and My Jump values was found for jump height (ICC = 0.813; 95% CI [0.747-0.863]) and calculated peak power (ICC = 0.926; 95% CI [0.897-0.947]). However, poor to good reliability relative to absolute agreement for VJ height (ICC = 0.665; 95% CI [0.050-0.859]) and poor to excellent reliability relative to absolute agreement for peak power (ICC = 0.851; 95% CI [0.272-0.946]) between the Vertec and My Jump values were found; Vertec VJ height, and thus, Vertec calculated peak power values, were significantly higher than those calculated from My Jump values (p < 0.0001). The My Jump app may provide a reliable measure of vertical jump height and calculated peak power in multiple field and laboratory settings without the need of costly equipment such as force plates or Vertec. The reliability relative to degree of consistency between the Vertec and My Jump app was moderate to excellent. However, the reliability relative to absolute agreement between Vertec and My Jump values contained significant variation (based on CI values), thus, it is recommended that either the My Jump or the Vertec be used to assess VJ height in repeated measures within subjects' designs; these measurement tools should not be considered interchangeable within subjects or in group measurement designs.
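The gap between the consistency and absolute-agreement ICCs in this record can be reproduced in miniature. The sketch below computes both single-measure coefficients from two-way ANOVA mean squares; the jump heights are hypothetical, with the second device reading exactly 5 cm higher than the first, and `icc_single_measure` is an illustrative helper rather than the authors' analysis code.

```python
def icc_single_measure(table):
    """Return (consistency, absolute agreement) single-measure ICCs.

    table: n subjects (rows) by k devices (columns), two-way model.
    """
    n, k = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n * k)
    row_means = [sum(r) / k for r in table]
    col_means = [sum(c) / n for c in zip(*table)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in table for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-device mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return consistency, agreement


# Hypothetical jump heights in cm: device B always reads 5 cm above device A.
heights = [[40, 45], [45, 50], [50, 55], [55, 60], [60, 65]]
consistency, agreement = icc_single_measure(heights)
print(round(consistency, 3), round(agreement, 3))
```

A constant offset leaves the rank ordering of subjects intact, so consistency stays at 1 while absolute agreement drops; this is the pattern that argues against treating two devices as interchangeable.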
The reliability of vertical jump tests between the Vertec and My Jump phone application

PubMed Central

Castro, Dimitri A.; Duong, Justin T.; Malpartida, Fiorella J.; Usher, Justin R.; O, Jenny

2018-01-01

Background The vertical jump is used to estimate sports performance capabilities and physical fitness in children, elderly, non-athletic and injured individuals. Different jump techniques and measurement tools are available to assess vertical jump height and peak power; however, their use is limited by access to laboratory settings, excessive cost and/or time constraints thus making these tools oftentimes unsuitable for field assessment. A popular field test uses the Vertec and the Sargent vertical jump with countermovement; however, new low cost, easy to use tools are becoming available, including the My Jump iOS mobile application (app). The purpose of this study was to assess the reliability of the My Jump relative to values obtained by the Vertec for the Sargent stand and reach vertical jump (VJ) test. Methods One hundred and thirty-five healthy participants aged 18–39 years (94 males, 41 females) completed three maximal Sargent VJ with countermovement that were simultaneously measured using the Vertec and the My Jump. Jump heights were quantified for each jump and peak power was calculated using the Sayers equation. Four separate ICC estimates and their 95% confidence intervals were used to assess reliability. Two analyses (with jump height and calculated peak power as the dependent variables, respectively) were based on a single rater, consistency, two-way mixed-effects model, while two others (with jump height and calculated peak power as the dependent variables, respectively) were based on a single rater, absolute agreement, two-way mixed-effects model. Results Moderate to excellent reliability relative to the degree of consistency between the Vertec and My Jump values was found for jump height (ICC = 0.813; 95% CI [0.747–0.863]) and calculated peak power (ICC = 0.926; 95% CI [0.897–0.947]). 
However, poor to good reliability relative to absolute agreement for VJ height (ICC = 0.665; 95% CI [0.050–0.859]) and poor to excellent reliability relative to absolute agreement for peak power (ICC = 0.851; 95% CI [0.272–0.946]) between the Vertec and My Jump values were found; Vertec VJ height, and thus, Vertec calculated peak power values, were significantly higher than those calculated from My Jump values (p < 0.0001). Discussion The My Jump app may provide a reliable measure of vertical jump height and calculated peak power in multiple field and laboratory settings without the need of costly equipment such as force plates or Vertec. The reliability relative to degree of consistency between the Vertec and My Jump app was moderate to excellent. However, the reliability relative to absolute agreement between Vertec and My Jump values contained significant variation (based on CI values), thus, it is recommended that either the My Jump or the Vertec be used to assess VJ height in repeated measures within subjects’ designs; these measurement tools should not be considered interchangeable within subjects or in group measurement designs. PMID:29692955
Visual-haptic integration with pliers and tongs: signal “weights” take account of changes in haptic sensitivity caused by different tools

PubMed Central

Takahashi, Chie; Watt, Simon J.

2014-01-01

When we hold an object while looking at it, estimates from visual and haptic cues to size are combined in a statistically optimal fashion, whereby the “weight” given to each signal reflects their relative reliabilities. This allows object properties to be estimated more precisely than would otherwise be possible. Tools such as pliers and tongs systematically perturb the mapping between object size and the hand opening. This could complicate visual-haptic integration because it may alter the reliability of the haptic signal, thereby disrupting the determination of appropriate signal weights. To investigate this we first measured the reliability of haptic size estimates made with virtual pliers-like tools (created using a stereoscopic display and force-feedback robots) with different “gains” between hand opening and object size. Haptic reliability in tool use was straightforwardly determined by a combination of sensitivity to changes in hand opening and the effects of tool geometry. The precise pattern of sensitivity to hand opening, which violated Weber's law, meant that haptic reliability changed with tool gain. We then examined whether the visuo-motor system accounts for these reliability changes. We measured the weight given to visual and haptic stimuli when both were available, again with different tool gains, by measuring the perceived size of stimuli in which visual and haptic sizes were varied independently. The weight given to each sensory cue changed with tool gain in a manner that closely resembled the predictions of optimal sensory integration. The results are consistent with the idea that different tool geometries are modeled by the brain, allowing it to calculate not only the distal properties of objects felt with tools, but also the certainty with which those properties are known. These findings highlight the flexibility of human sensory integration and tool-use, and potentially provide an approach for optimizing the design of visual-haptic devices. 
PMID:24592245
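The "statistically optimal" combination described in this abstract is usually modelled as inverse-variance weighting. The sketch below assumes independent Gaussian noise on each cue; the function names and the example numbers are illustrative and are not taken from the study.

```python
def optimal_cue_weights(sigma_visual, sigma_haptic):
    """Return (w_visual, w_haptic) for minimum-variance cue combination."""
    # Reliability of a cue is taken as the inverse of its noise variance.
    r_visual = 1.0 / sigma_visual ** 2
    r_haptic = 1.0 / sigma_haptic ** 2
    total = r_visual + r_haptic
    return r_visual / total, r_haptic / total


def combined_estimate(size_visual, size_haptic, sigma_visual, sigma_haptic):
    """Weighted size estimate and its standard deviation."""
    w_v, w_h = optimal_cue_weights(sigma_visual, sigma_haptic)
    estimate = w_v * size_visual + w_h * size_haptic
    # The combined variance is the inverse of the summed reliabilities.
    sigma_combined = (1.0 / (1.0 / sigma_visual ** 2 + 1.0 / sigma_haptic ** 2)) ** 0.5
    return estimate, sigma_combined


# Hypothetical values: vision twice as precise as touch.
weights = optimal_cue_weights(1.0, 2.0)
estimate, sigma = combined_estimate(50.0, 60.0, 1.0, 2.0)
```

Under this model the combined standard deviation is never larger than that of the better single cue, which is the sense in which integration allows size to be estimated more precisely.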
Reliability and Responsiveness of NutriQoL® Questionnaire.

PubMed

Cuerda, Maria Cristina; Apezetxea, Antonio; Carrillo, Lourdes; Casanueva, Felipe; Cuesta, Federico; Irles, Jose Antonio; Virgili, Maria Nuria; Layola, Miquel; Lizán, Luis

2016-10-01

NutriQoL® (Nestlé Health Science, Vevey, Switzerland) is a questionnaire developed to assess the health-related quality-of-life (HRQoL) of patients with home enteral nutrition (HEN) irrespective of their underlying condition and route of administration. The aim of this work is to assess the questionnaire's reliability and responsiveness to change. Two cohorts of patients with HEN and their primary caregivers were enrolled to assess reliability and responsiveness, respectively. All participants had to be 18 years of age or older, without mental deterioration (≤3 or 4 errors in the Pfeiffer's test) and with sufficient functional status (>40 points on Karnofsky's performance status scale). When the patients' ability to respond to the questionnaire was impaired due to underlying disease, their caregivers answered on their behalf. NutriQoL was administered in two and three visits to reliability and responsiveness cohorts, respectively. Test-retest reliability and internal consistency were assessed by the intra-class correlation coefficient (ICC) and the Cronbach's α, respectively. Responsiveness was evaluated by standardized effect size and standardized response mean between basal visit and third visit. Finally, the minimal clinically important difference (MCID) was estimated. A total of 54 and 86 participants were recruited to the reliability and responsiveness cohort, respectively. Thirty-five caregivers were selected to assess the inter-observer reliability. ICC values confirmed the good reproducibility level (ICC >0.75) of the questionnaire in both "physical functioning and activities of daily living" and "social life" domains and total score. The assessment of internal consistency in both domains of the questionnaire showed good internal consistency in visit 2. ICC showed the excellent agreement level between caregiver and patient in the global NutriQoL score. Finally, patients classified as having a minimal change in their health reported a mean (standard deviation) MCID in NutriQoL score of 0.63 (11.51). NutriQoL is a reliable and unique instrument to measure the HRQoL in HEN patients. NutriQoL detects changes in the health status of the patient. Nevertheless, further research is needed to determine the full extent of the questionnaire responsiveness.
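The two responsiveness indices named in this abstract have simple conventional definitions: the standardized effect size divides the mean change by the standard deviation of baseline scores, and the standardized response mean divides it by the standard deviation of the change scores. The sketch below uses those textbook definitions with made-up scores; it is not the authors' analysis code.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical questionnaire totals at the basal and third visits.
basal = [50, 60, 70, 80]
third = [55, 68, 74, 91]
ses = standardized_effect_size(basal, third)
srm = standardized_response_mean(basal, third)
```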
Thermal dye double indicator dilution measurement of lung water in man: comparison with gravimetric measurements.

PubMed Central

Mihm, F G; Feeley, T W; Jamieson, S W

1987-01-01

The thermal dye double indicator dilution technique for estimating lung water was compared with gravimetric analyses in nine human subjects who were organ donors. As observed in animal studies, the thermal dye measurement of extravascular thermal volume (EVTV) consistently overestimated gravimetric extravascular lung water (EVLW), the mean (SEM) difference being 3.43 (0.59) ml/kg. In eight of the nine subjects the EVTV -3.43 ml/kg would yield an estimate of EVLW that would be from 3.23 ml/kg under to 3.37 ml/kg over the actual value EVLW at the 95% confidence limits. Reproducibility, assessed with the standard error of the mean percentage, suggested that a 15% change in EVTV can be reliably detected with repeated measurements. One subject was excluded from analysis because the EVTV measurement grossly underestimated its actual EVLW. This error was associated with regional injury observed on gross examination of the lung. Experimental and clinical evidence suggest that the thermal dye measurement provides a reliable estimate of lung water in diffuse pulmonary oedema states. PMID:3616974
Reliability Problems of the Datum: Solutions for Questionnaire Responses.

ERIC Educational Resources Information Center

Bastick, Tony

Questionnaires often ask for estimates, and these estimates are given with different reliabilities. It is difficult to know the different reliabilities of single estimates and to take these into account in subsequent analyses. This paper contains a practical example to show that not taking the reliability of different responses into account can…
The relative impact of baryons and cluster shape on weak lensing mass estimates of galaxy clusters

NASA Astrophysics Data System (ADS)

Lee, B. E.; Le Brun, A. M. C.; Haq, M. E.; Deering, N. J.; King, L. J.; Applegate, D.; McCarthy, I. G.

2018-05-01

Weak gravitational lensing depends on the integrated mass along the line of sight. Baryons contribute to the mass distribution of galaxy clusters and the resulting mass estimates from lensing analysis. We use the cosmo-OWLS suite of hydrodynamic simulations to investigate the impact of baryonic processes on the bias and scatter of weak lensing mass estimates of clusters. These estimates are obtained by fitting NFW profiles to mock data using MCMC techniques. In particular, we examine the difference in estimates between dark matter-only runs and those including various prescriptions for baryonic physics. We find no significant difference in the mass bias when baryonic physics is included, though the overall mass estimates are suppressed when feedback from AGN is included. For lowest-mass systems for which a reliable mass can be obtained (M200 ≈ 2 × 10^14 M⊙), we find a bias of ≈ -10 per cent. The magnitude of the bias tends to decrease for higher mass clusters, consistent with no bias for the most massive clusters which have masses comparable to those found in the CLASH and HFF samples. For the lowest mass clusters, the mass bias is particularly sensitive to the fit radii and the limits placed on the concentration prior, rendering reliable mass estimates difficult. The scatter in mass estimates between the dark matter-only and the various baryonic runs is less than between different projections of individual clusters, highlighting the importance of triaxiality.
Towards a fully self-consistent inversion combining historical and paleomagnetic data for geomagnetic field reconstructions

NASA Astrophysics Data System (ADS)

Arneitz, P.; Leonhardt, R.; Fabian, K.; Egli, R.

2017-12-01

Historical and paleomagnetic data are the two main sources of information about the long-term geomagnetic field evolution. Historical observations extend to the late Middle Ages, and prior to the 19th century, they consisted mainly of pure declination measurements from navigation and orientation logs. Field reconstructions going back further in time rely solely on magnetization acquired by rocks, sediments, and archaeological artefacts. The combined dataset is characterized by a strongly inhomogeneous spatio-temporal distribution and highly variable data reliability and quality. Therefore, an adequate weighting of the data that correctly accounts for data density, type, and realistic error estimates represents the major challenge for an inversion approach. Until now, there has not been a fully self-consistent geomagnetic model that correctly recovers the variation of the geomagnetic dipole together with the higher-order spherical harmonics. Here we present a new geomagnetic field model for the last 4 kyrs based on historical, archeomagnetic and volcanic records. The iterative Bayesian inversion approach targets the implementation of reliable error treatment, which allows different record types to be combined in a fully self-consistent way. Modelling results will be presented along with a thorough analysis of model limitations, validity and sensitivity.
Psychometric properties of the Interpersonal Relationship Inventory-Short Form for active duty female service members.

PubMed

Nayback-Beebe, Ann M; Yoder, Linda H

2011-06-01

The Interpersonal Relationship Inventory-Short Form (IPRI-SF) has demonstrated psychometric consistency across several demographic and clinical populations; however, it has not been psychometrically tested in a military population. The purpose of this study was to psychometrically evaluate the reliability and component structure of the IPRI-SF in active duty United States Army female service members (FSMs). The reliability estimates were .93 for the social support subscale and .91 for the conflict subscale. Principal component analysis demonstrated an obliquely rotated three-component solution that accounted for 58.9% of the variance. The results of this study support the reliability and validity of the IPRI-SF for use in FSMs; however, a three-factor structure emerged in this sample of FSMs post-deployment that represents "cultural context." Copyright © 2011 Wiley Periodicals, Inc.
Estimating sediment discharge: Appendix D

USGS Publications Warehouse

Gray, John R.; Simões, Francisco J. M.

2008-01-01

Sediment-discharge measurements usually are available on a discrete or periodic basis. However, estimates of sediment transport often are needed for unmeasured periods, such as when daily or annual sediment-discharge values are sought, or when estimates of transport rates for unmeasured or hypothetical flows are required. Selected methods for estimating suspended-sediment, bed-load, bed-material-load, and total-load discharges have been presented in some detail elsewhere in this volume. The purposes of this contribution are to present some limitations and potential pitfalls associated with obtaining and using the requisite data and equations to estimate sediment discharges and to provide guidance for selecting appropriate estimating equations. Records of sediment discharge are derived from data collected with sufficient frequency to obtain reliable estimates for the computational interval and period. Most sediment-discharge records are computed at daily or annual intervals based on periodically collected data, although some partial records represent discrete or seasonal intervals such as those for flood periods. The method used to calculate sediment-discharge records is dependent on the types and frequency of available data. Records for suspended-sediment discharge computed by methods described by Porterfield (1972) are most prevalent, in part because measurement protocols and computational techniques are well established and because suspended sediment composes the bulk of sediment discharges for many rivers. Discharge records for bed load, total load, or in some cases bed-material load plus wash load are less common. Reliable estimation of sediment discharges presupposes that the data on which the estimates are based are comparable and reliable. Unfortunately, data describing a selected characteristic of sediment were not necessarily derived (collected, processed, analyzed, or interpreted) in a consistent manner. For example, bed-load data collected with different types of bed-load samplers may not be comparable (Gray et al. 1991; Childers 1999; Edwards and Glysson 1999). The total suspended solids (TSS) analytical method tends to produce concentration data from open-channel flows that are biased low with respect to their paired suspended-sediment concentration values, particularly when sand-size material composes more than about a quarter of the material in suspension. Instantaneous sediment-discharge values based on TSS data may differ from the more reliable product of suspended-sediment concentration values and the same water-discharge data by an order of magnitude (Gray et al. 2000; Bent et al. 2001; Glysson et al. 2000; 2001). An assessment of data comparability and reliability is an important first step in the estimation of sediment discharges. There are two approaches to obtaining values describing sediment loads in streams. One is based on direct measurement of the quantities of interest, and the other on relations developed between hydraulic parameters and sediment-transport potential. In the next sections, the most common techniques for both approaches are briefly addressed.
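The "product of suspended-sediment concentration values and water-discharge data" mentioned in this abstract can be sketched as a unit conversion. The example below assumes SI inputs (discharge in m³/s, concentration in mg/L) and uses hypothetical values; the function name is illustrative and the source's own unit conventions are not given in the abstract.

```python
SECONDS_PER_DAY = 86_400
GRAMS_PER_TONNE = 1_000_000


def sediment_discharge_tonnes_per_day(water_discharge_m3s, concentration_mg_per_l):
    """Instantaneous suspended-sediment discharge expressed in tonnes per day."""
    # 1 mg/L equals 1 g/m^3, so discharge times concentration is in g/s.
    grams_per_second = water_discharge_m3s * concentration_mg_per_l
    return grams_per_second * SECONDS_PER_DAY / GRAMS_PER_TONNE


# Hypothetical flow of 10 m^3/s carrying 100 mg/L of suspended sediment.
load = sediment_discharge_tonnes_per_day(10.0, 100.0)
```

Because the load is linear in concentration, any low bias in the concentration data carries straight through to the computed discharge.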
Computer-Aided Reliability Estimation

NASA Technical Reports Server (NTRS)

Bavuso, S. J.; Stiffler, J. J.; Bryant, L. A.; Petersen, P. L.

1986-01-01

CARE III (Computer-Aided Reliability Estimation, Third Generation) helps estimate reliability of complex, redundant, fault-tolerant systems. Program specifically designed for evaluation of fault-tolerant avionics systems. However, CARE III general enough for use in evaluation of other systems as well.
The Challenge Posed by Geomagnetic Activity to Electric Power Reliability: Evidence From England and Wales

NASA Astrophysics Data System (ADS)

Forbes, Kevin F.; St. Cyr, O. C.

2017-10-01

This paper addresses whether geomagnetic activity challenged the reliability of the electric power system during part of the declining phase of solar cycle 23. Operations by National Grid in England and Wales are examined over the period of 11 March 2003 through 31 March 2005. This paper examines the relationship between measures of geomagnetic activity and a metric of challenged electric power reliability known as the net imbalance volume (NIV). Measured in megawatt hours, NIV represents the sum of all energy deployments initiated by the system operator to balance the electric power system. The relationship between geomagnetic activity and NIV is assessed using a multivariate econometric model. The model was estimated using half-hour settlement data over the period of 11 March 2003 through 31 December 2004. The results indicate that geomagnetic activity had a demonstrable effect on NIV over the sample period. Based on the parameter estimates, out-of-sample predictions of NIV were generated for each half hour over the period of 1 January to 31 March 2005. Consistent with the existence of a causal relationship between geomagnetic activity and the electricity market imbalance, the root-mean-square error of the out-of-sample predictions of NIV is smaller; that is, the predictions are more accurate, when the statistically significant estimated effects of geomagnetic activity are included as drivers in the predictions.
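The out-of-sample comparison described in this abstract rests on the root-mean-square error of two sets of predictions. The sketch below shows that comparison with invented numbers; the variable names and values are hypothetical and do not come from the National Grid data.

```python
def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical half-hourly NIV values (MWh) and two sets of predictions.
actual_niv = [120.0, -80.0, 40.0, 10.0]
predicted_without_geomagnetic = [100.0, -50.0, 70.0, -10.0]
predicted_with_geomagnetic = [115.0, -70.0, 50.0, 5.0]

rmse_without = rmse(actual_niv, predicted_without_geomagnetic)
rmse_with = rmse(actual_niv, predicted_with_geomagnetic)
```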
Detection of the lunar body tide by the Lunar Orbiter Laser Altimeter.

PubMed

Mazarico, Erwan; Barker, Michael K; Neumann, Gregory A; Zuber, Maria T; Smith, David E

2014-04-16

The Lunar Orbiter Laser Altimeter instrument onboard the Lunar Reconnaissance Orbiter spacecraft collected more than 5 billion measurements in the nominal 50 km orbit over ∼10,000 orbits. The data precision, geodetic accuracy, and spatial distribution enable two-dimensional crossovers to be used to infer relative radial position corrections between tracks to better than ∼1 m. We use nearly 500,000 altimetric crossovers to separate remaining high-frequency spacecraft trajectory errors from the periodic radial surface tidal deformation. The unusual sampling of the lunar body tide from polar lunar orbit limits the size of the typical differential signal expected at ground track intersections to ∼10 cm. Nevertheless, we reliably detect the topographic tidal signal and estimate the associated Love number h2 to be 0.0371 ± 0.0033, which is consistent with but lower than recent results from lunar laser ranging. Altimetric data are used to create radial constraints on the tidal deformation. The body tide amplitude is estimated from the crossover data. The estimated Love number is consistent with previous estimates but more precise.
Objectivity and validity of EMG method in estimating anaerobic threshold.

PubMed

Kang, S-K; Kim, J; Kwon, M; Eom, H

2014-08-01

The purposes of this study were to verify and compare the performances of anaerobic threshold (AT) point estimates among different filtering intervals (9, 15, 20, 25, 30 s) and to investigate the interrelationships of AT point estimates obtained by ventilatory threshold (VT) and muscle fatigue thresholds using electromyographic (EMG) activity during incremental exercise on a cycle ergometer. Sixty-nine male university students, untrained yet pursuing regular exercise, voluntarily participated in this study. The incremental exercise protocol was applied with a consistent stepwise increase in power output of 20 watts per minute until exhaustion. AT point was also estimated in the same manner using V-slope program with gas exchange parameters. In general, the estimated values of AT point-time computed by EMG method were more consistent across 5 filtering intervals and demonstrated higher correlations among themselves when compared with those values obtained by VT method. The results found in the present study suggest that the EMG signals could be used as an alternative or a new option in estimating AT point. Also, the proposed computing procedure implemented in Matlab for the analysis of EMG signals appeared to be valid and reliable as it produced nearly identical values and high correlations with VT estimates. © Georg Thieme Verlag KG Stuttgart · New York.
ERP Reliability Analysis (ERA) Toolbox: An open-source toolbox for analyzing the reliability of event-related brain potentials.

PubMed

Clayson, Peter E; Miller, Gregory A

2017-01-01

Generalizability theory (G theory) provides a flexible, multifaceted approach to estimating score reliability. G theory's approach to estimating score reliability has important advantages over classical test theory that are relevant for research using event-related brain potentials (ERPs). For example, G theory does not require parallel forms (i.e., equal means, variances, and covariances), can handle unbalanced designs, and provides a single reliability estimate for designs with multiple sources of error. This monograph provides a detailed description of the conceptual framework of G theory using examples relevant to ERP researchers, presents the algorithms needed to estimate ERP score reliability, and provides a detailed walkthrough of newly-developed software, the ERP Reliability Analysis (ERA) Toolbox, that calculates score reliability using G theory. The ERA Toolbox is open-source, Matlab software that uses G theory to estimate the contribution of the number of trials retained for averaging, group, and/or event types on ERP score reliability. The toolbox facilitates the rigorous evaluation of psychometric properties of ERP scores recommended elsewhere in this special issue. Copyright © 2016 Elsevier B.V. All rights reserved.
Cross-cultural adaptation and validation of the Korean version of the neck disability index.

PubMed

Song, Kyung-Jin; Choi, Byung-Wan; Choi, Byung-Ryeul; Seo, Gyeu-Beom

2010-09-15

Validation of a translated, culturally adapted questionnaire. The purpose of this study is to translate and culturally adapt the Neck Disability Index (NDI) and to validate the use of the derived version in Korean patients. Although several valid measures exist for measurement of neck pain and functional impairment, these measures have not yet been validated in a Korean version. The NDI was linguistically translated into Korean, and a prefinal version was assessed and modified by a pilot study. The reliability and validity of the derived Korean version were examined in 78 patients with degenerative cervical spine disease. Test-retest reliability, internal consistency, and construct validity were investigated by comparing Visual Analogue Scale (VAS) and Short Form Health Survey (SF-36) scores. Factor analysis of Korean NDI extracted 2 factors with eigenvalues >1. The intraclass-correlation coefficient of test-retest reliability was 0.93. Reliability, estimated by internal consistency, had a Cronbach alpha value of 0.82. The correlation between NDI and VAS scores was r = 0.49, and the correlation between NDI and SF-36 scores was r = -0.44. The physical health component score of SF-36 was highly correlated with NDI, and the correlation between VAS scores and the mental health component scores of SF-36 was high. The derived Korean version of the NDI was found to be a reliable and valid instrument for measuring disability in Korean patients with cervical problems. The authors recommend its use in future Korean clinical studies.
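Cronbach's alpha, the internal-consistency coefficient reported in this abstract, is computed from the item variances and the variance of the total score. The sketch below uses the standard formula on a small invented response matrix; it is not the study's data or code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*scores))  # transpose to one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: four respondents, two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3], [4, 5]])
```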
A Note on Structural Equation Modeling Estimates of Reliability

ERIC Educational Resources Information Center

Yang, Yanyun; Green, Samuel B.

2010-01-01

Reliability can be estimated using structural equation modeling (SEM). Two potential problems with this approach are that estimates may be unstable with small sample sizes and biased with misspecified models. A Monte Carlo study was conducted to investigate the quality of SEM estimates of reliability by themselves and relative to coefficient…
Estimating Measures of Pass-Fail Reliability from Parallel Half-Tests.

ERIC Educational Resources Information Center

Woodruff, David J.; Sawyer, Richard L.

Two methods for estimating measures of pass-fail reliability are derived, by which both theta and kappa may be estimated from a single test administration. The methods require only a single test administration and are computationally simple. Both are based on the Spearman-Brown formula for estimating stepped-up reliability. The non-distributional…
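The Spearman-Brown formula that both methods build on predicts the reliability of a test lengthened by a given factor. The sketch below shows only that general formula; the paper's specific estimators of theta and kappa are not reproduced here because the abstract is truncated.

```python
def spearman_brown(reliability, length_factor=2.0):
    """Predicted reliability when a test is lengthened by length_factor."""
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical half-test correlation of 0.60 stepped up to full length.
full_length = spearman_brown(0.60)
```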

Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

ERIC Educational Resources Information Center

Andersson, Björn; Xin, Tao

2018-01-01

In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
[Construction of the Time Management Scale and examination of the influence of time management on psychological stress response].

PubMed

Imura, Tomoya; Takamura, Masahiro; Okazaki, Yoshihiro; Tokunaga, Satoko

2016-10-01

We developed a scale to measure time management and assessed its reliability and validity. We then used this scale to examine the impact of time management on psychological stress response. In Study 1-1, we developed the scale and assessed its internal consistency and criterion-related validity. Findings from a factor analysis revealed three elements of time management, “time estimation,” “time utilization,” and “taking each moment as it comes.” In Study 1-2, we assessed the scale’s test-retest reliability. In Study 1-3, we assessed the validity of the constructed scale. The results indicate that the time management scale has good reliability and validity. In Study 2, we performed a covariance structural analysis to verify our model that hypothesized that time management influences perceived control of time and psychological stress response, and perceived control of time influences psychological stress response. The results showed that time estimation increases the perceived control of time, which in turn decreases stress response. However, we also found that taking each moment as it comes reduces perceived control of time, which in turn increases stress response.
Adaptation to the Spanish population of the Substance Use Risk Profile Scale (SURPS) and psychometric properties.

PubMed

Fernández-Calderón, Fermín; Díaz-Batanero, Carmen; Rojas-Tejada, Antonio J; Castellanos-Ryan, Natalie; Lozano-Rojas, Óscar M

2017-07-14

The identification of different personality risk profiles for substance misuse is useful in preventing substance-related problems. This study aims to test the psychometric properties of a new version of the Substance Use Risk Profile Scale (SURPS) for Spanish college students. Cross-sectional study with 455 undergraduate students from four Spanish universities. A new version of the SURPS, adapted to the Spanish population, was administered with the Beck Hopelessness Scale, the UPPS-P Impulsive Behavior Scale, the State-Trait Anxiety Inventory (STAI) and the Alcohol Use Disorders Identification Test (AUDIT). Internal consistency reliability ranged between 0.652 and 0.806 for the four SURPS subscales, while reliability estimated by split-half coefficients varied from 0.686 to 0.829. The estimated test-retest reliability ranged between 0.733 and 0.868. The expected four-factor structure of the original scale was replicated. As evidence of convergent validity, we found that the SURPS subscales were significantly associated with other conceptually-relevant personality scales and significantly associated with alcohol use measures in theoretically-expected ways. This SURPS version may be a useful instrument for measuring personality traits related to vulnerability to substance use and misuse when targeting personality with preventive interventions.
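A split-half coefficient of the kind reported in this abstract can be sketched by correlating two half-scale scores and stepping the result up to full length. The example below assumes an odd-even item split and uses invented responses; the abstract does not say which split the authors used.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sum((a - mx) ** 2 for a in x) ** 0.5
    spread_y = sum((b - my) ** 2 for b in y) ** 0.5
    return cross / (spread_x * spread_y)


def split_half_reliability(scores):
    """Odd-even split-half reliability with the Spearman-Brown step-up."""
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    r_half = pearson(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


# Hypothetical responses: four respondents, two items.
coefficient = split_half_reliability([[1, 2], [2, 1], [3, 3], [4, 5]])
```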
Reliability Correction for Functional Connectivity: Theory and Implementation

PubMed Central

Mueller, Sophia; Wang, Danhong; Fox, Michael D.; Pan, Ruiqi; Lu, Jie; Li, Kuncheng; Sun, Wei; Buckner, Randy L.; Liu, Hesheng

2016-01-01

Network properties can be estimated using functional connectivity MRI (fcMRI). However, regional variation of the fMRI signal causes systematic biases in network estimates including correlation attenuation in regions of low measurement reliability. Here we computed the spatial distribution of fcMRI reliability using longitudinal fcMRI datasets and demonstrated how pre-estimated reliability maps can correct for correlation attenuation. As a test case of reliability-based attenuation correction we estimated properties of the default network, where reliability was significantly lower than average in the medial temporal lobe and higher in the posterior medial cortex, heterogeneity that impacts estimation of the network. Accounting for this bias using attenuation correction revealed that the medial temporal lobe’s contribution to the default network is typically underestimated. To render this approach useful to a greater number of datasets, we demonstrate that test-retest reliability maps derived from repeated runs within a single scanning session can be used as a surrogate for multi-session reliability mapping. Using data segments with different scan lengths between 1 and 30 min, we found that test-retest reliability of connectivity estimates increases with scan length while the spatial distribution of reliability is relatively stable even at short scan lengths. Finally, analyses of tertiary data revealed that reliability distribution is influenced by age, neuropsychiatric status and scanner type, suggesting that reliability correction may be especially important when studying between-group differences. Collectively, these results illustrate that reliability-based attenuation correction is an easily implemented strategy that mitigates certain features of fMRI signal nonuniformity. PMID:26493163
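Correlation attenuation is classically corrected with Spearman's formula, which divides the observed correlation by the geometric mean of the two reliabilities. The sketch below assumes that classical form; the abstract does not spell out the exact implementation, and the numbers are hypothetical.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction for attenuation of an observed correlation."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical connectivity of 0.30 between a low- and a higher-reliability region.
corrected = disattenuate(0.30, 0.50, 0.72)
```

The lower the reliability of either region, the larger the upward correction, which matches the abstract's point that contributions from low-reliability regions are otherwise underestimated.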
Skeletal age estimation for forensic purposes: A comparison of GP, TW2 and TW3 methods on an Italian sample.

PubMed

Pinchi, Vilma; De Luca, Federica; Ricciardi, Federico; Focardi, Martina; Piredda, Valentina; Mazzeo, Elena; Norelli, Gian-Aristide

2014-05-01

Paediatricians, radiologists, anthropologists and medico-legal specialists are often called as experts in order to provide age estimation (AE) for forensic purposes. The literature recommends performing the X-rays of the left hand and wrist (HW-XR) for skeletal age estimation. The method most frequently employed is the Greulich and Pyle (GP) method. In addition, the so-called bone-specific techniques are also applied including the method of Tanner Whitehouse (TW) in the latest versions TW2 and TW3. To compare skeletal age and chronological age in a large sample of children and adolescents using GP, TW2 and TW3 methods in order to establish which of these is the most reliable for forensic purposes. The sample consisted of 307 HW-XRs of Italian children or adolescents, 145 females and 162 males aged between 6 and 20 years. The radiographies were scored according to the GP, TW2RUS and TW3RUS methods by one investigator. The results' reliability was assessed using intraclass correlation coefficient. Wilcoxon signed-rank test and Student t-test were performed to search for significant differences between skeletal and chronological ages. The distributions of the differences between estimated and chronological age, by means of boxplots, show how median differences for TW3 and GP methods are generally very close to 0. Hypothesis tests' results were obtained, with respect to the sex, both for the entire group of individuals and people grouped by age. Results show no significant differences among estimated and chronological age for TW3 and, to a lesser extent, GP. The TW2 proved to be the worst of the three methods. Our results support the conclusion that the TW2 method is not reliable for AE for forensic purpose. The GP and TW3 methods have proved to be reliable in males. For females, the best method was found to be TW3. When performing forensic age estimation in subjects around 14 years of age, it could be advisable to use and associate the TW3 and GP methods. 
Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Reliability-based design optimization of reinforced concrete structures including soil-structure interaction using a discrete gravitational search algorithm and a proposed metamodel

NASA Astrophysics Data System (ADS)

Khatibinia, M.; Salajegheh, E.; Salajegheh, J.; Fadaee, M. J.

2013-10-01

A new discrete gravitational search algorithm (DGSA) and a metamodelling framework are introduced for reliability-based design optimization (RBDO) of reinforced concrete structures. The RBDO of structures with soil-structure interaction (SSI) effects is investigated in accordance with performance-based design. The proposed DGSA is based on the standard gravitational search algorithm (GSA) to optimize the structural cost under deterministic and probabilistic constraints. The Monte-Carlo simulation (MCS) method is considered as the most reliable method for estimating the probabilities of reliability. In order to reduce the computational time of MCS, the proposed metamodelling framework is employed to predict the responses of the SSI system in the RBDO procedure. The metamodel consists of a weighted least squares support vector machine (WLS-SVM) and a wavelet kernel function, which is called WWLS-SVM. Numerical results demonstrate the efficiency and computational advantages of DGSA and the proposed metamodel for RBDO of reinforced concrete structures.
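Monte Carlo simulation estimates a failure probability by sampling the random inputs and counting how often a limit-state function falls below zero. The sketch below assumes a simple resistance-minus-load limit state with normal inputs; the distributions and function names are hypothetical, and neither the DGSA nor the WWLS-SVM metamodel is reproduced.

```python
import random


def failure_probability(limit_state, sampler, n_samples=20_000, seed=0):
    """Fraction of sampled designs for which the limit state is negative."""
    rng = random.Random(seed)  # fixed seed keeps the estimate repeatable
    failures = sum(1 for _ in range(n_samples) if limit_state(sampler(rng)) < 0)
    return failures / n_samples


def sample_resistance_and_load(rng):
    """Draw one (resistance, load) pair from assumed normal distributions."""
    return rng.gauss(6.0, 1.0), rng.gauss(5.0, 1.0)


def margin(sample):
    """Limit state: positive when resistance exceeds load."""
    resistance, load = sample
    return resistance - load


pf = failure_probability(margin, sample_resistance_and_load)
reliability = 1.0 - pf
```

Each sample here costs one cheap function call; when each call is a full soil-structure analysis, a fast surrogate for the limit state is what makes this loop affordable.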
Reinforcing flood-risk estimation.

PubMed

Reed, Duncan W

2002-07-15

Flood-frequency estimation is inherently uncertain. The practitioner applies a combination of gauged data, scientific method and hydrological judgement to derive a flood-frequency curve for a particular site. The resulting estimate can be thought fully satisfactory only if it is broadly consistent with all that is reliably known about the flood-frequency behaviour of the river. The paper takes as its main theme the search for information to strengthen a flood-risk estimate made from peak flows alone. Extra information comes in many forms, including documentary and monumental records of historical floods, and palaeological markers. Meteorological information is also useful, although rainfall rarity is difficult to assess objectively and can be a notoriously unreliable indicator of flood rarity. On highly permeable catchments, groundwater levels present additional data. Other types of information are relevant to judging hydrological similarity when the flood-frequency estimate derives from data pooled across several catchments. After highlighting information sources, the paper explores a second theme: that of consistency in flood-risk estimates. Following publication of the Flood estimation handbook, studies of flood risk are now using digital catchment data. Automated calculation methods allow estimates by standard methods to be mapped basin-wide, revealing anomalies at special sites such as river confluences. Such mapping presents collateral information of a new character. Can this be used to achieve flood-risk estimates that are coherent throughout a river basin?
Appearance motives to tan and not tan: evidence for validity and reliability of a new scale.

PubMed

Cafri, Guy; Thompson, J Kevin; Roehrig, Megan; Rojas, Ariz; Sperry, Steffanie; Jacobsen, Paul B; Hillhouse, Joel

2008-04-01

Risk for skin cancer is increased by UV exposure and decreased by sun protection. Appearance reasons to tan and not tan have consistently been shown to be related to intentions and behaviors to UV exposure and protection. This study was designed to determine the factor structure of appearance motives to tan and not tan, evaluate the extent to which this factor structure is gender invariant, test for mean differences in the identified factors, and evaluate internal consistency, temporal stability, and criterion-related validity. Five-hundred eighty-nine females and 335 male college students were used to test confirmatory factor analysis models within and across gender groups, estimate latent mean differences, and use the correlation coefficient and Cronbach's alpha to further evaluate the reliability and validity of the identified factors. A measurement invariant (i.e., factor-loading invariant) model was identified with three higher-order factors: sociocultural influences to tan (lower order factors: media, friends, family, significant others), appearance reasons to tan (general, acne, body shape), and appearance reasons not to tan (skin aging, immediate skin damage). Females had significantly higher means than males on all higher-order factors. All subscales had evidence of internal consistency, temporal stability, and criterion-related validity. This study offers a framework and measurement instrument that has evidence of validity and reliability for evaluating appearance-based motives to tan and not tan.
Portuguese version of the EUROPEP questionnaire: contributions to the psychometric validation

PubMed Central

Roque, Hugo; Veloso, Ana; Ferreira, Pedro L

2016-01-01

ABSTRACT OBJECTIVE To assess the construct validity and reliability of the Portuguese version of the European Task Force on Patient Evaluation of General Practice Care questionnaire. METHODS We applied the Portuguese version of the European Task Force on Patient Evaluation of General Practice Care to 392 users of 20 Family Health Units from the North of Portugal. The validity of the construct was evaluated by exploratory factor analysis, with the Principal Axis Factoring method, by orthogonal rotation (varimax procedure), by the Kaiser normalization criteria (eigenvalue ≥ 1). The factorability of the data matrix was verified by the Kaiser-Meyer-Olkin and Bartlett’s sphericity test. We estimated the reliability by the indicator of internal consistency Cronbach’s alpha. To analyze the correlations between satisfaction and loyalty, we used the Pearson correlations. The predictor effect of satisfaction on loyalty was analyzed by simple linear regression. RESULTS Satisfaction presented five robust and well individualized dimensions – medical care, nursing care, clinical secretariat services, accessibility, and organization of services – with alpha values between 0.86 and 0.97, good levels of internal consistency. The loyalty showed alpha value of 0.72, considered a reasonable internal consistency. The satisfaction was predictive of loyalty. CONCLUSIONS The Portuguese European Task Force on Patient Evaluation of General Practice Care questionnaire is a robust and reliable instrument to measure the satisfaction and loyalty of users of the Family Health Units. PMID:27706374
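As an illustration of the internal consistency indicator used here, Cronbach's alpha can be computed from a respondents-by-items score matrix as k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. The sketch below assumes complete data (no missing answers); the function name and the sample scores are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the total (summed) score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five respondents to four Likert items.
data = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]
print(f"alpha = {cronbach_alpha(data):.2f}")
```

The same variance estimator must be used for the items and for the total score; the population and sample versions then give the same alpha because the correction factor cancels in the ratio.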
Evaluation of the CEAS trend and monthly weather data models for soybean yields in Iowa, Illinois, and Indiana

NASA Technical Reports Server (NTRS)

French, V. (Principal Investigator)

1982-01-01

The CEAS models evaluated use historic trend and meteorological and agroclimatic variables to forecast soybean yields in Iowa, Illinois, and Indiana. Indicators of yield reliability and current measures of modeled yield reliability were obtained from bootstrap tests on the end of season models. Indicators of yield reliability show that the state models are consistently better than the crop reporting district (CRD) models. One CRD model is especially poor. At the state level, the bias of each model is less than one half quintal/hectare. The standard deviation is between one and two quintals/hectare. The models are adequate in terms of coverage and are to a certain extent consistent with scientific knowledge. Timely yield estimates can be made during the growing season using truncated models. The models are easy to understand and use and are not costly to operate. Other than the specification of values used to determine evapotranspiration, the models are objective. Because the method of variable selection used in the model development is not adequately documented, no evaluation can be made of the objectivity and cost of redevelopment of the model.
Developing a measure of cultural-, maturity-, or esteem-driven modesty among Jewish women.

PubMed

Andrews, Caryn Scheinberg

2014-01-01

Modesty and how it relates to religiosity among Jewish women were relatively unexplained, and a measure was needed as part of a larger study. The purpose of this article is to report on three studies which represent the three stages of instrument development of a measure of modesty among Jewish women, "Your Views of Modesty": (a) content/concept definition; (b) instrument development; and (c) evaluation of the psychometric properties of the instrument: reliability and validity. In Study I, Q methodology was used to define the domain, and the results suggested that modesty is multidimensional. In Study II, an instrument was developed based on the distinctive perspectives of each group, that is, what each considered important and not so important. This formed a 25-item Likert scale. In Study III, a survey of 300 Jewish women yielded a Cronbach's alpha of 0.92, indicating a high degree of internal consistency reliability for "Your Views of Modesty." For construct validity, four factors were identified that explained 55% of the variance of modesty: (a) religion-driven, (b) maturity-driven, (c) esteem-driven, and (d) public-based modesty. "Your Views of Modesty" shows good evidence for reliability and validity in this Jewish population.
Testing of the SEE and OEE post-hip fracture.

PubMed

Resnick, Barbara; Orwig, Denise; Zimmerman, Sheryl; Hawkes, William; Golden, Justine; Werner-Bronzert, Michelle; Magaziner, Jay

2006-08-01

The purpose of this study was to test the reliability and validity of the Self-Efficacy for Exercise (SEE) and the Outcome Expectations for Exercise (OEE) scales in a sample of 166 older women post-hip fracture. There was some evidence of validity of the SEE and OEE based on confirmatory factor analysis and Rasch model testing, criterion based and convergent validity, and evidence of internal consistency based on alpha coefficients and separation indices and reliability based on R2 estimates. Rasch model testing demonstrated that some items had high variability. Based on these findings suggestions are made for how items could be revised and the scales improved for future use.
Test-retest reliability of the scale of participation in organized activities among adolescents in the Czech Republic and Slovakia.

PubMed

Bosakova, Lucia; Kolarcik, Peter; Bobakova, Daniela; Sulcova, Martina; Van Dijk, Jitse P; Reijneveld, Sijmen A; Geckova, Andrea Madarasova

2016-04-01

Participation in organized activities is related to a range of positive outcomes, but the way such participation is measured has not been scrutinized. Test-retest reliability, an important indicator of a scale's reliability, has rarely been assessed and, for "The scale of participation in organized activities", is lacking completely. This test-retest study is based on the Health Behaviour in School-aged Children study and is consistent with its methodology. We obtained data from 353 Czech (51.9 % boys) and 227 Slovak (52.9 % boys) primary school pupils, grades five and nine, who participated in this study in 2013. We used Cohen's kappa statistic and single measures of the intraclass correlation coefficient to estimate the test-retest reliability of all selected items in the sample, stratified by gender, age and country. We mostly observed a large correlation between the test and retest in all of the examined variables (κ ranged from 0.46 to 0.68). Test-retest reliability of the sum score of individual items showed substantial agreement (ICC = 0.64). The scale of participation in organized activities has an acceptable level of agreement, indicating good reliability.
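For readers unfamiliar with the kappa statistic used above, the following sketch computes unweighted Cohen's kappa for one categorical item answered twice by the same respondents: observed agreement corrected for the agreement expected by chance from the marginal frequencies. The function name and the sample answers are hypothetical, and the study's own computation may differ in detail.

```python
from collections import Counter


def cohens_kappa(test, retest):
    """Unweighted Cohen's kappa between two paired lists of categorical answers."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty lists of equal length")
    n = len(test)
    # Proportion of respondents giving the same answer on both occasions.
    observed = sum(a == b for a, b in zip(test, retest)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    c1, c2 = Counter(test), Counter(retest)
    expected = sum(c1[c] * c2[c] for c in c1.keys() | c2.keys()) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical answers of four pupils at test and retest.
first = ["yes", "yes", "no", "no"]
second = ["yes", "no", "no", "no"]
print(cohens_kappa(first, second))
```

In this example three of four answers agree (0.75), chance agreement is 0.5, and kappa is therefore 0.5.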
The McCanse Readiness for Death Instrument (MRDI): a reliable and valid measure for hospice care.

PubMed

McCanse, R P

1995-01-01

The purpose of this study was to establish whether or not readiness for death, as an indicator of healthy dying, is a measurable concept. Review of relevant literature revealed consensus regarding the universality of a human need for healthy dying. A theory of healthy dying was derived from the Rogerian paradigm. The McCanse Readiness for Death Instrument (MRDI) was constructed, which included indicators of physiological, psychological, sociological, and spiritual aspects of "healthy" field pattern as death is developmentally approached. The MRDI was a 26-item structured interview questionnaire which generated interval-ratio data through a visual analog scale. A pretest was conducted with a sample of 9 volunteer patients drawn from a small suburban outpatient hospice. The MRDI was concurrently administered to dying individuals, their primary caregivers, and their primary hospice nurses. Correlations between dying individuals' scores and their primary caregivers' estimates of patient death readiness as well as between patients and their primary hospice nurses were very encouraging. Cronbach's coefficient alpha for internal consistency reliability was .59. Content validity was supported by consensus of an expert panel of practicing hospice nurses. Construct validity was demonstrated through legitimate placement of the concept, healthy death readiness, within the theoretical web which supported it. The MRDI was then administered to a sample of 31 terminally-ill individuals, their primary caregivers, and their primary hospice nurses drawn from larger, urban hospice populations in three geographic areas of the United States. The MRDI was also administered to a contrast group of 39 cardiac-impaired individuals who were not terminally-ill. Overall internal consistency of the MRDI was found to be quite favorable (alpha = .76). Debilitating illness and actual mortality in the study sample precluded and/or confounded estimates of test-retest reliability. 
Convergent validity of the MRDI was indicated by significant correlations between patients' scores and primary caregivers' estimates (r = .35, p < .05) and between patients' scores and primary hospice nurses' estimates (r = .53, p < .01). Discriminant validity of the MRDI was demonstrated by a significant mean difference between the group of terminally-ill patients and the group of non-terminal, cardiac-impaired patients (t = 1.76, p < .01).
A particle swarm model for estimating reliability and scheduling system maintenance

NASA Astrophysics Data System (ADS)

Puzis, Rami; Shirtz, Dov; Elovici, Yuval

2016-05-01

Modifying data and information system components may introduce new errors and deteriorate the reliability of the system. Reliability can be efficiently regained with reliability centred maintenance, which requires reliability estimation for maintenance scheduling. A variant of the particle swarm model is used to estimate reliability of systems implemented according to the model view controller paradigm. Simulations based on data collected from an online system of a large financial institute are used to compare three component-level maintenance policies. Results show that appropriately scheduled component-level maintenance greatly reduces the cost of upholding an acceptable level of reliability by reducing the need in system-wide maintenance.
Development and Validation of the Body Size Scale for Assessing Body Weight Perception in African Populations

PubMed Central

Cohen, Emmanuel; Bernard, Jonathan Y.; Ponty, Amandine; Ndao, Amadou; Amougou, Norbert; Saïd-Mohamed, Rihlat; Pasquet, Patrick

2015-01-01

Background The social valorisation of overweight in African populations could promote high-risk eating behaviours and therefore become a risk factor of obesity. However, existing scales to assess body image are usually not accurate enough to allow comparative studies of body weight perception in different African populations. This study aimed to develop and validate the Body Size Scale (BSS) to estimate African body weight perception. Methods Anthropometric measures of 80 Cameroonians and 81 Senegalese were used to evaluate three criteria of adiposity: body mass index (BMI), overall percentage of fat, and endomorphy (fat component of the somatotype). To develop the BSS, the participants were photographed in full face and profile positions. Models were selected for their representativeness of the wide variability in adiposity with a progressive increase along the scale. Then, for the validation protocol, participants self-administered the BSS to assess self-perceived current body size (CBS), desired body size (DBS) and provide a “body self-satisfaction index.” This protocol included construct validity, test-retest reliability and convergent validity and was carried out with three independent samples of respectively 201, 103 and 1115 Cameroonians. Results The BSS comprises two sex-specific scales of photos of 9 models each, and ordered by increasing adiposity. Most participants were able to correctly order the BSS by increasing adiposity, using three different words to define body size. Test-retest reliability was consistent in estimating CBS, DBS and the “body self-satisfaction index.” The CBS was highly correlated to the objective BMI, and two different indexes assessed with the BSS were consistent with declarations obtained in interviews. Conclusion The BSS is the first scale with photos of real African models taken in both full face and profile and representing a wide and representative variability in adiposity. 
The validation protocol proved its reliability for estimating body weight perception in Africans. PMID:26536030
A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

ERIC Educational Resources Information Center

Lee, Guemin; Park, In-Yong

2012-01-01

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Self-Management and Transition Readiness Assessment: Development, Reliability, and Factor Structure of the STARx Questionnaire.

PubMed

Ferris, M; Cohen, S; Haberman, C; Javalkar, K; Massengill, S; Mahan, J D; Kim, S; Bickford, K; Cantu, G; Medeiros, M; Phillips, A; Ferris, M T; Hooper, S R

2015-01-01

The Self-Management and Transition to Adulthood with Rx=Treatment (STARx) Questionnaire was developed to collect information on self-management and health care transition (HCT) skills, via self-report, in a broad population of adolescents and young adults (AYAs) with chronic conditions. Over several iterations, the STARx questionnaire was created with AYA, family, and health provider input. The development and pilot testing of the STARx Questionnaire took place with the assistance of 1219 AYAs with different chronic health conditions, in multiple institutions and settings over three phases: item development, pilot testing, reliability and factor structuring. The three development phases resulted in a final version of the STARx Questionnaire. The exploratory factor analysis of the third version of the 18-item STARx identified six factors that accounted for about 65% of the variance: Medication management, Provider communication, Engagement during appointments, Disease knowledge, Adult health responsibilities, and Resource utilization. Reliability estimates revealed good internal consistency and temporal stability, with the alpha coefficient for the overall scale being .80. The STARx was developmentally sensitive, with older patients scoring significantly higher on nearly every factor than younger patients. The STARx Questionnaire is a reliable, self-report tool with adequate internal consistency, temporal stability, and a strong, multidimensional factor structure. It provides another assessment strategy to measure self-management and transition skills in AYAs with chronic conditions. Copyright © 2015 Elsevier Inc. All rights reserved.
Psychometric properties and cross-cultural adaptation of the Brazilian Quebec back pain disability scale questionnaire.

PubMed

Rodrigues, Marcelo F; Michel-Crosato, Edgard; Cardoso, Jefferson R; Traebert, Jefferson

2009-06-01

Cross-cultural translation and psychometric testing. To translate and cross-culturally adapt the Quebec Back Pain Disability Scale (QDS) to Brazilian Portuguese and to examine its validity and reliability. Current literature shows the need to adopt reliable and internationally standardized methods for the analysis of low back pain. To our knowledge, this specific questionnaire has not been translated and validated for Portuguese-speaking patients. The translation and cross-cultural adaptation of the QDS were developed in agreement with internationally recommended methodology, and the resulting product was evaluated in this study with 54 consecutive patients. Internal consistency was obtained through Cronbach's alpha; reliability was estimated through the intraclass correlation coefficient and the Bland and Altman agreement (d = mean difference). Validity was determined by correlating the scores of the Brazil-QDS with the Brazilian version of the Roland-Morris Questionnaire and Visual Analogue Pain Scale by means of the Spearman rank correlation coefficient. The internal consistency obtained was excellent (Cronbach's alpha = 0.97). Intraobserver and interobserver reliability were considered strong (ICC = 0.93, d = 0.68 and ICC = 0.96, d = 0.57, respectively). The correlation with the Brazilian Roland-Morris Questionnaire and with the Visual Analogue Scale was high (r = 0.857 and r = 0.758, respectively). The data showed that the process of translation and cross-cultural adaptation was successful and that the adapted instrument demonstrated excellent psychometric properties.
[Reproducibility, internal consistency, and construct validity of KIDSCREEN-27 in Brazilian adolescents].

PubMed

Farias, José Cazuza de; Loch, Mathias Roberto; Lima, Antônio José de; Sales, Joana Marcela; Ferreira, Flávia Emília Leite de Lima

2017-09-28

The objective of this two-part study was to estimate the reproducibility, internal consistency, and construct validity of KIDSCREEN-27, a questionnaire to measure health-related quality of life, in Brazilian adolescents. One study component estimated reproducibility (176 adolescents, 59.7% females, 64.7% 10 to 12 years of age), and another estimated internal consistency and validity (1,321 adolescents, 53.7% females, 56.9% 10 to 12 years of age). The studies were conducted with adolescents of both sexes in public schools in the municipality of João Pessoa, Paraíba State, Brazil. KIDSCREEN-27 consists of 27 items distributed across five domains (physical well-being, 5 items; psychological well-being, 7 items; parents and social support, 7 items; autonomy and relationship with parents, 4 items; school environment, 4 items). Reproducibility was estimated by intra-class correlation coefficient (ICC). Confirmatory factor analysis was used to assess construct validity, and composite reliability index (CRI) was used to verify the questionnaire's internal consistency. ICCs were greater than or equal to 0.70 (0.70 to 0.96). Factor loadings were greater than 0.40, except for five items (0.28 to 0.39). The model's goodness-of-fit indices were adequate (χ2/df = 2.79; RMR = 0.035; RMSEA = 0.037; GFI = 0.951; AGFI = 0.941; CFI = 0.908; TLI = 0.901). CRI varied from 0.65 to 0.70 in the domains and was 0.90 for the questionnaire. KIDSCREEN-27 reached satisfactory levels of reproducibility, internal consistency, and construct validity and can be used to assess health-related quality of life in Brazilian adolescents 10 to 15 years of age.
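A composite reliability index of the kind reported above is commonly computed from standardized factor loadings as the squared sum of loadings divided by that quantity plus the summed error variances. The sketch below assumes standardized loadings and uncorrelated errors, so each error variance is 1 minus the squared loading; whether the study used exactly this variant is an assumption, and the loadings shown are hypothetical.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings (uncorrelated errors assumed)."""
    if not loadings:
        raise ValueError("need at least one loading")
    if any(l < 0 or l > 1 for l in loadings):
        raise ValueError("standardized loadings are expected in [0, 1]")
    squared_sum = sum(loadings) ** 2
    # Error variance of each standardized indicator is 1 - loading^2.
    error_sum = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_sum)


# Hypothetical loadings for a four-item domain.
print(round(composite_reliability([0.7, 0.7, 0.7, 0.7]), 3))
```

Unlike Cronbach's alpha, this index does not require the items to load equally on the factor, which is one reason it is often reported alongside confirmatory factor analysis.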

Estimating the production, consumption and export of cannabis: The Dutch case.

PubMed

van der Giessen, Mark; van Ooyen-Houben, Marianne M J; Moolenaar, Debora E G

2016-05-01

Quantifying an illegal phenomenon like a drug market is inherently complex due to its hidden nature and the limited availability of reliable information. This article presents findings from a recent estimate of the production, consumption and export of Dutch cannabis and discusses the opportunities provided by, and limitations of, mathematical models for estimating the illegal cannabis market. The data collection consisted of a comprehensive literature study, secondary analyses on data from available registrations (2012-2014) and previous studies, and expert opinion. The cannabis market was quantified with several mathematical models. The data analysis included a Monte Carlo simulation to come to a 95% interval estimate (IE) and a sensitivity analysis to identify the most influential indicators. The annual production of Dutch cannabis was estimated to be between 171 and 965 tons (95% IE of 271-613 tons). The consumption was estimated to be between 28 and 119 tons, depending on the inclusion or exclusion of non-residents (95% IE of 51-78 tons or 32-49 tons, respectively). The export was estimated to be between 53 and 937 tons (95% IE of 206-549 tons or 231-573 tons, respectively). Mathematical models are valuable tools for the systematic assessment of the size of illegal markets and determining the uncertainty inherent in the estimates. The estimates required the use of many assumptions and the availability of reliable indicators was limited. This uncertainty is reflected in the wide ranges of the estimates. The estimates are sensitive to 10 of the 45 indicators. These 10 account for 86-93% of the variation found. Further research should focus on improving the variables and the independence of the mathematical models. Copyright © 2016 Elsevier B.V. All rights reserved.
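The idea of deriving a 95% interval estimate by Monte Carlo simulation can be sketched as below: uncertain indicators are drawn repeatedly from assumed distributions, the model is evaluated for each draw, and the 2.5th and 97.5th percentiles of the results bound the interval. The toy model (a product of two indicators), the uniform ranges, and the function name are hypothetical; they do not reproduce the article's models or data.

```python
import random


def monte_carlo_interval(model, samplers, n_draws=10000, level=0.95, seed=0):
    """Return (lower, upper) percentile bounds of model output under sampled inputs."""
    rng = random.Random(seed)
    draws = sorted(model(*[s(rng) for s in samplers]) for _ in range(n_draws))
    tail = (1 - level) / 2
    lower = draws[int(tail * (n_draws - 1))]
    upper = draws[int((1 - tail) * (n_draws - 1))]
    return lower, upper


# Hypothetical indicators: number of units and yield per unit, both uncertain.
units = lambda rng: rng.uniform(1.0, 2.0)
yield_per_unit = lambda rng: rng.uniform(10.0, 20.0)
low, high = monte_carlo_interval(lambda u, y: u * y, [units, yield_per_unit])
print(f"95% interval estimate: {low:.1f} to {high:.1f}")
```

The interval is narrower than the full range between the smallest and largest possible products, which mirrors the difference between the overall ranges and the 95% interval estimates reported in the abstract.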
Precipitation and Latent Heating Distributions from Satellite Passive Microwave Radiometry. Part II: Evaluation of Estimates Using Independent Data

NASA Technical Reports Server (NTRS)

Yang, Song; Olson, William S.; Wang, Jian-Jian; Bell, Thomas L.; Smith, Eric A.; Kummerow, Christian D.

2006-01-01

Rainfall rate estimates from spaceborne microwave radiometers are generally accepted as reliable by a majority of the atmospheric science community. One of the Tropical Rainfall Measuring Mission (TRMM) facility rain-rate algorithms is based upon passive microwave observations from the TRMM Microwave Imager (TMI). In Part I of this series, improvements of the TMI algorithm that are required to introduce latent heating as an additional algorithm product are described. Here, estimates of surface rain rate, convective proportion, and latent heating are evaluated using independent ground-based estimates and satellite products. Instantaneous, 0.5°-resolution estimates of surface rain rate over ocean from the improved TMI algorithm are well correlated with independent radar estimates (r ≈ 0.88 over the Tropics), but bias reduction is the most significant improvement over earlier algorithms. The bias reduction is attributed to the greater breadth of cloud-resolving model simulations that support the improved algorithm and the more consistent and specific convective/stratiform rain separation method utilized. The bias of monthly 2.5°-resolution estimates is similarly reduced, with comparable correlations to radar estimates. Although the amount of independent latent heating data is limited, TMI-estimated latent heating profiles compare favorably with instantaneous estimates based upon dual-Doppler radar observations, and time series of surface rain-rate and heating profiles are generally consistent with those derived from rawinsonde analyses. Still, some biases in profile shape are evident, and these may be resolved with (a) additional contextual information brought to the estimation problem and/or (b) physically consistent and representative databases supporting the algorithm. A model of the random error in instantaneous 0.5°-resolution rain-rate estimates appears to be consistent with the levels of error determined from TMI comparisons with collocated radar. Error model modifications for nonraining situations will be required, however. Sampling error represents only a portion of the total error in monthly 2.5°-resolution TMI estimates; the remaining error is attributed to random and systematic algorithm errors arising from the physical inconsistency and/or nonrepresentativeness of cloud-resolving-model-simulated profiles that support the algorithm.
Precipitation and Latent Heating Distributions from Satellite Passive Microwave Radiometry. Part 2; Evaluation of Estimates Using Independent Data

NASA Technical Reports Server (NTRS)

Yang, Song; Olson, William S.; Wang, Jian-Jian; Bell, Thomas L.; Smith, Eric A.; Kummerow, Christian D.

2004-01-01

Rainfall rate estimates from space-borne instruments are generally accepted as reliable by a majority of the atmospheric science community. One of the Tropical Rainfall Measuring Mission (TRMM) facility rain rate algorithms is based upon passive microwave observations from the TRMM Microwave Imager (TMI). Part I of this study describes improvements in the TMI algorithm that are required to introduce cloud latent heating and drying as additional algorithm products. Here, estimates of surface rain rate, convective proportion, and latent heating are evaluated using independent ground-based estimates and satellite products. Instantaneous, 0.5°-resolution estimates of surface rain rate over ocean from the improved TMI algorithm are well correlated with independent radar estimates (r ≈ 0.88 over the Tropics), but bias reduction is the most significant improvement over forerunning algorithms. The bias reduction is attributed to the greater breadth of cloud-resolving model simulations that support the improved algorithm, and the more consistent and specific convective/stratiform rain separation method utilized. The bias of monthly, 2.5°-resolution estimates is similarly reduced, with comparable correlations to radar estimates. Although the amount of independent latent heating data is limited, TMI estimated latent heating profiles compare favorably with instantaneous estimates based upon dual-Doppler radar observations, and time series of surface rain rate and heating profiles are generally consistent with those derived from rawinsonde analyses. Still, some biases in profile shape are evident, and these may be resolved with (a) additional contextual information brought to the estimation problem and/or (b) physically consistent and representative databases supporting the algorithm. A model of the random error in instantaneous, 0.5°-resolution rain rate estimates appears to be consistent with the levels of error determined from TMI comparisons to collocated radar. Error model modifications for non-raining situations will be required, however. Sampling error appears to represent only a fraction of the total error in monthly, 2.5°-resolution TMI estimates; the remaining error is attributed to physical inconsistency or non-representativeness of cloud-resolving model simulated profiles supporting the algorithm.
Performance of Blind Source Separation Algorithms for FMRI Analysis using a Group ICA Method

PubMed Central

Correa, Nicolle; Adali, Tülay; Calhoun, Vince D.

2007-01-01

Independent component analysis (ICA) is a popular blind source separation (BSS) technique that has proven to be promising for the analysis of functional magnetic resonance imaging (fMRI) data. A number of ICA approaches have been used for fMRI data analysis, and even more ICA algorithms exist, however the impact of using different algorithms on the results is largely unexplored. In this paper, we study the performance of four major classes of algorithms for spatial ICA, namely information maximization, maximization of non-gaussianity, joint diagonalization of cross-cumulant matrices, and second-order correlation based methods when they are applied to fMRI data from subjects performing a visuo-motor task. We use a group ICA method to study the variability among different ICA algorithms and propose several analysis techniques to evaluate their performance. We compare how different ICA algorithms estimate activations in expected neuronal areas. The results demonstrate that the ICA algorithms using higher-order statistical information prove to be quite consistent for fMRI data analysis. Infomax, FastICA, and JADE all yield reliable results; each having their strengths in specific areas. EVD, an algorithm using second-order statistics, does not perform reliably for fMRI data. Additionally, for the iterative ICA algorithms, it is important to investigate the variability of the estimates from different runs. We test the consistency of the iterative algorithms, Infomax and FastICA, by running the algorithm a number of times with different initializations and note that they yield consistent results over these multiple runs. Our results greatly improve our confidence in the consistency of ICA for fMRI data analysis. PMID:17540281
Performance of blind source separation algorithms for fMRI analysis using a group ICA method.

PubMed

Correa, Nicolle; Adali, Tülay; Calhoun, Vince D

2007-06-01

Independent component analysis (ICA) is a popular blind source separation technique that has proven to be promising for the analysis of functional magnetic resonance imaging (fMRI) data. A number of ICA approaches have been used for fMRI data analysis, and even more ICA algorithms exist; however, the impact of using different algorithms on the results is largely unexplored. In this paper, we study the performance of four major classes of algorithms for spatial ICA, namely, information maximization, maximization of non-Gaussianity, joint diagonalization of cross-cumulant matrices and second-order correlation-based methods, when they are applied to fMRI data from subjects performing a visuo-motor task. We use a group ICA method to study variability among different ICA algorithms, and we propose several analysis techniques to evaluate their performance. We compare how different ICA algorithms estimate activations in expected neuronal areas. The results demonstrate that the ICA algorithms using higher-order statistical information prove to be quite consistent for fMRI data analysis. Infomax, FastICA and joint approximate diagonalization of eigenmatrices (JADE) all yield reliable results, with each having its strengths in specific areas. Eigenvalue decomposition (EVD), an algorithm using second-order statistics, does not perform reliably for fMRI data. Additionally, for iterative ICA algorithms, it is important to investigate the variability of estimates from different runs. We test the consistency of the iterative algorithms Infomax and FastICA by running the algorithm a number of times with different initializations, and we note that they yield consistent results over these multiple runs. Our results greatly improve our confidence in the consistency of ICA for fMRI data analysis.
Annual estimates of recharge, quick-flow runoff, and ET for the contiguous U.S. using empirical regression equations

USGS Publications Warehouse

Reitz, Meredith; Sanford, Ward E.; Senay, Gabriel; Cazenas, J.

2017-01-01

This study presents new data-driven, annual estimates of the division of precipitation into the recharge, quick-flow runoff, and evapotranspiration (ET) water budget components for 2000-2013 for the contiguous United States (CONUS). The algorithms used to produce these maps ensure water budget consistency over this broad spatial scale, with contributions from precipitation influx attributed to each component at 800 m resolution. The quick-flow runoff estimates for the contribution to the rapidly varying portion of the hydrograph are produced using data from 1,434 gaged watersheds, and depend on precipitation, soil saturated hydraulic conductivity, and surficial geology type. Evapotranspiration estimates are produced from a regression using water balance data from 679 gaged watersheds and depend on land cover, temperature, and precipitation. The quick-flow and ET estimates are combined to calculate recharge as the remainder of precipitation. The ET and recharge estimates are checked against independent field data, and the results show good agreement. Comparisons of recharge estimates with groundwater extraction data show that in 15% of the country, groundwater is being extracted at rates higher than the local recharge. These maps of the internally consistent water budget components of recharge, quick-flow runoff, and ET, being derived from and tested against data, are expected to provide reliable first-order estimates of these quantities across the CONUS, even where field measurements are sparse.
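The water-budget closure described above, with recharge taken as the remainder of precipitation once quick-flow runoff and ET are accounted for, can be sketched as follows. This is a minimal illustration only: the function name and the sample values are hypothetical, and the regression equations that produce the quick-flow and ET estimates are not reproduced here.

```python
def recharge_from_budget(precip_mm, quickflow_mm, et_mm):
    """Recharge as the remainder of precipitation (all values in mm/yr)."""
    return precip_mm - quickflow_mm - et_mm


# Hypothetical annual values for a single grid cell (not from the study).
precip, quickflow, et = 1000.0, 150.0, 600.0
recharge = recharge_from_budget(precip, quickflow, et)

# By construction the three components sum back to precipitation.
assert abs((quickflow + et + recharge) - precip) < 1e-9
print(recharge)
```

Computing recharge as a remainder is what makes the three components sum exactly to precipitation in every cell.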
Perceptual attraction in tool use: evidence for a reliability-based weighting mechanism.

PubMed

Debats, Nienke B; Ernst, Marc O; Heuer, Herbert

2017-04-01

Humans are well able to operate tools whereby their hand movement is linked, via a kinematic transformation, to a spatially distant object moving in a separate plane of motion. An everyday example is controlling a cursor on a computer monitor. Despite these separate reference frames, the perceived positions of the hand and the object were found to be biased toward each other. We propose that this perceptual attraction is based on the principles by which the brain integrates redundant sensory information of single objects or events, known as optimal multisensory integration. That is, 1) sensory information about the hand and the tool is weighted according to their relative reliability (i.e., inverse variances), and 2) the unisensory reliabilities sum up in the integrated estimate. We assessed whether perceptual attraction is consistent with optimal multisensory integration model predictions. We used a cursor-control tool-use task in which we manipulated the relative reliability of the unisensory hand and cursor position estimates. The perceptual biases shifted according to these relative reliabilities, with an additional bias due to contextual factors that were present in experiment 1 but not in experiment 2. The biased position judgments' variances were, however, systematically larger than the predicted optimal variances. Our findings suggest that the perceptual attraction in tool use results from a reliability-based weighting mechanism similar to optimal multisensory integration, but that certain boundary conditions for optimality might not be satisfied. NEW & NOTEWORTHY Kinematic tool use is associated with a perceptual attraction between the spatially separated hand and the effective part of the tool. We provide a formal account for this phenomenon, thereby showing that the process behind it is similar to optimal integration of sensory information relating to single objects. Copyright © 2017 the American Physiological Society.
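The two principles named in this abstract (weights proportional to reliability, defined as inverse variance, and reliabilities that add up in the integrated estimate) can be illustrated with the textbook full-fusion case. This is a sketch under stated assumptions, not the authors' fitted model: the function name and the numbers are hypothetical, and the abstract itself reports that observed variances were larger than this optimal prediction.

```python
def integrate(x_hand, var_hand, x_cursor, var_cursor):
    """Reliability-weighted combination of two position estimates."""
    r_hand = 1.0 / var_hand        # reliability = inverse variance
    r_cursor = 1.0 / var_cursor
    w_hand = r_hand / (r_hand + r_cursor)
    estimate = w_hand * x_hand + (1.0 - w_hand) * x_cursor
    var_integrated = 1.0 / (r_hand + r_cursor)  # reliabilities sum
    return estimate, var_integrated, w_hand


# Hypothetical example: noisy hand estimate, more precise cursor estimate.
est, var, w = integrate(x_hand=0.0, var_hand=4.0, x_cursor=10.0, var_cursor=1.0)
print(est, var, w)

# The integrated variance is never larger than the smaller unisensory variance.
assert var <= min(4.0, 1.0)
```

Because the cursor estimate is the more reliable one here, the combined estimate lies closer to the cursor position, which is the direction of bias the weighting account predicts.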
Comparing Fit and Reliability Estimates of a Psychological Instrument Using Second-Order CFA, Bifactor, and Essentially Tau-Equivalent (Coefficient Alpha) Models via AMOS 22

ERIC Educational Resources Information Center

Black, Ryan A.; Yang, Yanyun; Beitra, Danette; McCaffrey, Stacey

2015-01-01

Estimation of composite reliability within a hierarchical modeling framework has recently become of particular interest given the growing recognition that the underlying assumptions of coefficient alpha are often untenable. Unfortunately, coefficient alpha remains the prominent estimate of reliability when estimating total scores from a scale with…
Summative assessment of undergraduates' communication competence in challenging doctor-patient encounters. Evaluation of the Düsseldorf CoMeD-OSCE.

PubMed

Mortsiefer, Achim; Immecke, Janine; Rotthoff, Thomas; Karger, André; Schmelzer, Regine; Raski, Bianca; Schmitten, Jürgen In der; Altiner, Attila; Pentzek, Michael

2014-06-01

To evaluate the summative assessment (OSCE) of a communication training programme for dealing with challenging doctor-patient encounters in the 4th study year. Our OSCE consists of 4 stations (breaking bad news, guilt and shame, aggressive patients, shared decision making), using a 4-item global rating (GR) instrument. We calculated reliability coefficients for different levels, discriminability of single items and interrater reliability. Validity was estimated by gender differences and accordance between GR and a checklist. In a pooled sample of 456 students in 3 OSCEs over 3 terms, total reliability was α=0.64, reliability coefficients for single stations were >0.80, and discriminability in 3 of 4 stations was within the range of 0.4-0.7. Except for one station, interrater reliability was moderate to strong. Reliability on item level was poor and pointed to some problems with the use of the GR. The application of the GR on regular undergraduate medical education shows moderate reliability in need of improvement and some traits of validity. Ongoing development and evaluation is needed with particular regard to the training of the examiners. Our CoMeD-OSCE proved suitable for the summative assessment of communication skills in challenging doctor-patient encounters. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
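Reliability coefficients of the kind reported here (α) are commonly computed as Cronbach's coefficient alpha. The following is a minimal sketch using only the Python standard library; the function name and the ratings are hypothetical and unrelated to the CoMeD-OSCE data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for `items`, a list of k score lists (one per item),
    each holding one score per respondent in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]    # total score per respondent
    item_var_sum = sum(variance(col) for col in items)  # sum of item variances
    return k / (k - 1) * (1.0 - item_var_sum / variance(totals))


# Hypothetical ratings: 4 items (rows) scored for 5 students (columns).
ratings = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
    [2, 3, 3, 5, 4],
]
print(round(cronbach_alpha(ratings), 3))
```

Alpha rises with the number of items and with their intercorrelation, so a coefficient for a total score and coefficients for its parts need not agree.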
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.

ERIC Educational Resources Information Center

Wood, Terry M.; Safrit, Margaret J.

1987-01-01

A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
NIED seismic moment tensor catalogue for regional earthquakes around Japan: quality test and application

NASA Astrophysics Data System (ADS)

Kubo, Atsuki; Fukuyama, Eiichi; Kawai, Hiroyuki; Nonomura, Ken'ichi

2002-10-01

We have examined the quality of the National Research Institute for Earth Science and Disaster Prevention (NIED) seismic moment tensor (MT) catalogue obtained using a regional broadband seismic network (FREESIA). First, we examined using synthetic waveforms the robustness of the solutions with regard to data noise as well as to errors in the velocity structure and focal location. Then, to estimate the reliability, robustness and validity of the catalogue, we compared it with the Harvard centroid moment tensor (CMT) catalogue as well as the Japan Meteorological Agency (JMA) focal mechanism catalogue. We found out that the NIED catalogue is consistent with Harvard and JMA catalogues within the uncertainty of 0.1 in moment magnitude, 10 km in depth, and 15° in direction of the stress axes. The NIED MT catalogue succeeded in reducing to 3.5 the lower limit of moment magnitude above which the moment tensor could be reliably estimated. Finally, we estimated the stress tensors in several different regions by using the NIED MT catalogue. This enables us to elucidate the stress/deformation field in and around the Japanese islands to understand the mode of deformation and applied stress. Moreover, we identified a region of abnormal stress in a swarm area from stress tensor estimates.
Test battery for measuring the perception and recognition of facial expressions of emotion

PubMed Central

Wilhelm, Oliver; Hildebrandt, Andrea; Manske, Karsten; Schacht, Annekathrin; Sommer, Werner

2014-01-01

Despite the importance of perceiving and recognizing facial expressions in everyday life, there is no comprehensive test battery for the multivariate assessment of these abilities. As a first step toward such a compilation, we present 16 tasks that measure the perception and recognition of facial emotion expressions, and data illustrating each task's difficulty and reliability. The scoring of these tasks focuses on either the speed or accuracy of performance. A sample of 269 healthy young adults completed all tasks. In general, accuracy and reaction time measures for emotion-general scores showed acceptable and high estimates of internal consistency and factor reliability. Emotion-specific scores yielded lower reliabilities, yet high enough to encourage further studies with such measures. Analyses of task difficulty revealed that all tasks are suitable for measuring emotion perception and emotion recognition related abilities in normal populations. PMID:24860528
Measurement of Postmortem Pupil Size: A New Method with Excellent Reliability and Its Application to Pupil Changes in the Early Postmortem Period.

PubMed

Fleischer, Luise; Sehner, Susanne; Gehl, Axel; Riemer, Martin; Raupach, Tobias; Anders, Sven

2017-05-01

Measurement of postmortem pupil width is a potential component of death time estimation. However, no standardized measurement method has been described. We analyzed a total of 71 digital images for pupil-iris ratio using the software ImageJ. Images were analyzed three times by four different examiners. In addition, serial images from 10 cases were taken between 2 and 50 h postmortem to detect spontaneous pupil changes. Intra- and inter-rater reliability of the method was excellent (ICC > 0.95). The method is observer independent and yields consistent results, and images can be digitally stored and re-evaluated. The method seems highly eligible for forensic and scientific purposes. While statistical analysis of spontaneous pupil changes revealed a significant polynomial of quartic degree for postmortem time (p = 0.001), an obvious pattern was not detected. These results do not indicate suitability of spontaneous pupil changes for forensic death time estimation, as formerly suggested. © 2016 American Academy of Forensic Sciences.
Reliability of School Surveys in Estimating Geographic Variation in Malaria Transmission in the Western Kenyan Highlands

PubMed Central

Gitonga, Caroline W.; Gillig, Jonathan; Owaga, Chrispin; Marube, Elizabeth; Odongo, Wycliffe; Okoth, Albert; China, Pauline; Oriango, Robin; Brooker, Simon J.; Bousema, Teun; Drakeley, Chris; Cox, Jonathan

2013-01-01

Background: School surveys provide an operational approach to assess malaria transmission through parasite prevalence. There is limited evidence on the comparability of prevalence estimates obtained from school and community surveys carried out at the same locality. Methods: Concurrent school and community cross-sectional surveys were conducted in 46 school/community clusters in the western Kenyan highlands and households of school children were geolocated. Malaria was assessed by rapid diagnostic test (RDT) and combined seroprevalence of antibodies to bloodstage Plasmodium falciparum antigens. Results: RDT prevalence in school and community populations was 25.7% (95% CI: 24.4-26.8) and 15.5% (95% CI: 14.4-16.7), respectively. Seroprevalence in the school and community populations was 51.9% (95% CI: 50.5-53.3) and 51.5% (95% CI: 49.5-52.9), respectively. RDT prevalence in schools could differentiate between low (<7%, 95% CI: 0-19%) and high (>39%, 95% CI: 25-49%) transmission areas in the community and, after a simple adjustment, were concordant with the community estimates. Conclusions: Estimates of malaria prevalence from school surveys were consistently higher than those from community surveys and were strongly correlated. School-based estimates can be used as a reliable indicator of malaria transmission intensity in the wider community and may provide a basis for identifying priority areas for malaria control. PMID:24143250
Reliability and validity of the Turkish version of the Rapid Estimate of Adult Literacy in Dentistry (TREALD-30).

PubMed

Peker, Kadriye; Köse, Taha Emre; Güray, Beliz; Uysal, Ömer; Erdem, Tamer Lütfi

2017-04-01

To culturally adapt the Turkish version of Rapid Estimate of Adult Literacy in Dentistry (TREALD-30) for Turkish-speaking adult dental patients and to evaluate its psychometric properties. After translation and cross-cultural adaptation, TREALD-30 was tested in a sample of 127 adult patients who attended a dental school clinic in Istanbul. Data were collected through clinical examinations and self-completed questionnaires, including TREALD-30, the Oral Health Impact Profile (OHIP), the Rapid Estimate of Adult Literacy in Medicine (REALM), two health literacy screening questions, and socio-behavioral characteristics. Psychometric properties were examined using Classical Test Theory (CTT) and Rasch analysis. Internal consistency (Cronbach's Alpha = 0.91) and test-retest reliability (Intraclass correlation coefficient = 0.99) were satisfactory for TREALD-30. It exhibited good convergent and predictive validity. Monthly family income, years of education, dental flossing, health literacy, and health literacy skills were found to be stronger predictors of patients' oral health literacy (OHL). Confirmatory factor analysis (CFA) confirmed a two-factor model. The Rasch model explained 37.9% of the total variance in this dataset. In addition, TREALD-30 had eleven misfitting items, which indicated evidence of multidimensionality. The reliability indices provided in Rasch analysis (person separation reliability = 0.91 and expected-a-posteriori/plausible reliability = 0.94) indicated that TREALD-30 had acceptable reliability. TREALD-30 showed satisfactory psychometric properties. It may be used to identify patients with low OHL. Socio-demographic factors, oral health behaviors and health literacy skills should be taken into account when planning future studies to assess the OHL in both clinical and community settings.
Accuracy and reliability of pulp/tooth area ratio in upper canines by peri-apical X-rays.

PubMed

Azevedo, A C; Michel-Crosato, E; Biazevic, M G H; Galić, I; Merelli, V; De Luca, S; Cameriere, R

2014-11-01

Due to the real need for careful staff training in age assessment, in order to improve capacity, consistency and competence, new research on the reliability and repeatability of methods frequently used in age assessment is required. The aim of this study was twofold: first, to test the accuracy of this method for age estimation; second, to obtain data on the reliability of this technique. A sample of 81 peri-apical radiographs of upper canines (44 men and 37 women), aged between 19 and 74 years, was used; the teeth were taken from the osteological collection of Sassari (Sardinia, Italy). Three blinded observers used the technique in order to perform the age estimation. The mean real age of the 81 observations was 37.21 (CI95% 34.37; 40.05), and estimated ages ranged from 36.65 to 38.99 (CI95%-Ex1 35.42; 41.28; CI95%-Ex2 33.89; 39.41; CI95%-Ex3 35.92; 42.06). The module differences found by the three observers were 3.43, 4.24 and 4.45, respectively for Ex1×Ex2, Ex1×Ex3 and Ex2×Ex3. The module differences observed among real and observed ages were 2.55 (CI95% 1.90; 3.20), 2.22 (CI95% 1.65; 2.78) and 4.39 (CI95% 3.80; 5.75), respectively for Ex1, Ex2 and Ex3. No differences were observed among measurements. This technique can be reproduced and repeated after proper training, since high reliability and accuracy were found. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Reliability and Validity of Instruments for Assessing Perinatal Depression in African Settings: Systematic Review and Meta-Analysis

PubMed Central

Tsai, Alexander C.; Scott, Jennifer A.; Hung, Kristin J.; Zhu, Jennifer Q.; Matthews, Lynn T.; Psaros, Christina; Tomlinson, Mark

2013-01-01

Background: A major barrier to improving perinatal mental health in Africa is the lack of locally validated tools for identifying probable cases of perinatal depression or for measuring changes in depression symptom severity. We systematically reviewed the evidence on the reliability and validity of instruments to assess perinatal depression in African settings. Methods and Findings: Of 1,027 records identified through searching 7 electronic databases, we reviewed 126 full-text reports. We included 25 unique studies, which were disseminated in 26 journal articles and 1 doctoral dissertation. These enrolled 12,544 women living in nine different North and sub-Saharan African countries. Only three studies (12%) used instruments developed specifically for use in a given cultural setting. Most studies provided evidence of criterion-related validity (20 [80%]) or reliability (15 [60%]), while fewer studies provided evidence of construct validity, content validity, or internal structure. The Edinburgh postnatal depression scale (EPDS), assessed in 16 studies (64%), was the most frequently used instrument in our sample. Ten studies estimated the internal consistency of the EPDS (median estimated coefficient alpha, 0.84; interquartile range, 0.71-0.87). For the 14 studies that estimated sensitivity and specificity for the EPDS, we constructed 2 x 2 tables for each cut-off score. Using a bivariate random-effects model, we estimated a pooled sensitivity of 0.94 (95% confidence interval [CI], 0.68-0.99) and a pooled specificity of 0.77 (95% CI, 0.59-0.88) at a cut-off score of ≥9, with higher cut-off scores yielding greater specificity at the cost of lower sensitivity. Conclusions: The EPDS can reliably and validly measure perinatal depression symptom severity or screen for probable postnatal depression in African countries, but more validation studies on other instruments are needed.
In addition, more qualitative research is needed to adequately characterize local understandings of perinatal depression-like syndromes in different African contexts. PMID:24340036
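Sensitivity and specificity at a given cut-off follow directly from a 2 x 2 table of the kind the review constructed for each study. The sketch below shows only that per-study step. The function name and the counts are hypothetical, and the bivariate random-effects pooling across studies is not reproduced.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Per-study sensitivity and specificity from 2 x 2 table counts.

    tp/fn: cases scoring at or above / below the cut-off
    fp/tn: non-cases scoring at or above / below the cut-off
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for one study at one cut-off score.
sens, spec = sensitivity_specificity(tp=45, fp=30, fn=5, tn=120)
print(sens, spec)
```

Raising the cut-off moves some cases from tp to fn and some non-cases from fp to tn, which is why specificity increases at the cost of sensitivity.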
Resimulation of noise: a precision estimator for least square error curve-fitting tested for axial strain time constant imaging

NASA Astrophysics Data System (ADS)

Nair, S. P.; Righetti, R.

2015-05-01

Recent elastography techniques focus on imaging information on properties of materials which can be modeled as viscoelastic or poroelastic. These techniques often require the fitting of temporal strain data, acquired from either a creep or stress-relaxation experiment, to a mathematical model using least square error (LSE) parameter estimation. It is known that the strain versus time relationships for tissues undergoing creep compression are non-linear. In non-linear cases, devising a measure of estimate reliability can be challenging. In this article, we have developed and tested a method that provides a measure of reliability for non-linear LSE parameter estimates, which we call Resimulation of Noise (RoN). RoN provides a measure of reliability by estimating the spread of parameter estimates from a single experiment realization. We have tested RoN specifically for the case of axial strain time constant parameter estimation in poroelastic media. Our tests show that the RoN estimated precision has a linear relationship to the actual precision of the LSE estimator. We have also compared results from the RoN derived measure of reliability against a commonly used reliability measure: the correlation coefficient (CorrCoeff). Our results show that CorrCoeff is a poor measure of estimate reliability for non-linear LSE parameter estimation. While the RoN is specifically tested only for axial strain time constant imaging, a general algorithm is provided for use in all LSE parameter estimation.
Inter-Observer Reliability of DSM-5 Substance Use Disorders

PubMed Central

Denis, Cécile M.; Gelernter, Joel; Hart, Amy B.; Kranzler, Henry R.

2015-01-01

Aims: Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence of the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Methods: Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Results: Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. Conclusions: For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. PMID:26048641
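Percent agreement and the kappa (κ) coefficient between two interviewers can be computed as in the following sketch, which uses the standard unweighted Cohen's kappa. The function names and the diagnosis labels are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Share of subjects given the same label by both raters."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(r1)
    p_observed = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal label frequencies.
    p_expected = sum(c1[label] * c2[label] for label in set(r1) | set(r2)) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical lifetime diagnoses from two interviewers (1 = present, 0 = absent).
interviewer_a = [1, 1, 0, 0, 1, 0, 1, 0]
interviewer_b = [1, 0, 0, 0, 1, 0, 1, 1]
print(percent_agreement(interviewer_a, interviewer_b))
print(cohens_kappa(interviewer_a, interviewer_b))
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance alone, which matters most when one label is much more common than the other.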
The reliability of the pass/fail decision for assessments comprised of multiple components.

PubMed

Möltner, Andreas; Tımbıl, Sevgi; Jünger, Jana

2015-01-01

The decision having the most serious consequences for a student taking an assessment is the one to pass or fail that student. For this reason, the reliability of the pass/fail decision must be determined for high quality assessments, just as the measurement reliability of the point values. Assessments in a particular subject (graded course credit) are often composed of multiple components that must be passed independently of each other. When "conjunctively" combining separate pass/fail decisions, as with other complex decision rules for passing, adequate methods of analysis are necessary for estimating the accuracy and consistency of these classifications. To date, very few papers have addressed this issue; a generally applicable procedure was published by Douglas and Mislevy in 2010. Using the example of an assessment comprised of several parts that must be passed separately, this study analyzes the reliability underlying the decision to pass or fail students and discusses the impact of an improved method for identifying those who do not fulfill the minimum requirements. The accuracy and consistency of the decision to pass or fail an examinee in the subject cluster Internal Medicine/General Medicine/Clinical Chemistry at the University of Heidelberg's Faculty of Medicine was investigated. This cluster requires students to separately pass three components (two written exams and an OSCE), whereby students may reattempt to pass each component twice. Our analysis was carried out using the method described by Douglas and Mislevy. Frequently, when complex logical connections exist between the individual pass/fail decisions in the case of low failure rates, only a very low reliability for the overall decision to grant graded course credit can be achieved, even if high reliabilities exist for the various components. 
For the example analyzed here, the classification accuracy and consistency when conjunctively combining the three individual parts is relatively low with κ=0.49 or κ=0.47, despite the good reliability of over 0.75 for each of the three components. The option to repeat each component twice leads to a situation in which only about half of the candidates who do not satisfy the minimum requirements would fail the overall assessment, while the other half is able to continue their studies despite having deficient knowledge and skills. The method put forth by Douglas and Mislevy allows the analysis of the decision accuracy and consistency for complex combinations of scores from different components. Even in the case of highly reliable components, it is not necessarily so that a reliable pass/fail decision has been reached - for instance in the case of low failure rates. Assessments must be administered with the explicit goal of identifying examinees that do not fulfill the minimum requirements.

The reliability of the pass/fail decision for assessments comprised of multiple components

PubMed Central

Möltner, Andreas; Tımbıl, Sevgi; Jünger, Jana

2015-01-01

Objective: The decision having the most serious consequences for a student taking an assessment is the one to pass or fail that student. For this reason, the reliability of the pass/fail decision must be determined for high quality assessments, just as the measurement reliability of the point values. Assessments in a particular subject (graded course credit) are often composed of multiple components that must be passed independently of each other. When “conjunctively” combining separate pass/fail decisions, as with other complex decision rules for passing, adequate methods of analysis are necessary for estimating the accuracy and consistency of these classifications. To date, very few papers have addressed this issue; a generally applicable procedure was published by Douglas and Mislevy in 2010. Using the example of an assessment comprised of several parts that must be passed separately, this study analyzes the reliability underlying the decision to pass or fail students and discusses the impact of an improved method for identifying those who do not fulfill the minimum requirements. Method: The accuracy and consistency of the decision to pass or fail an examinee in the subject cluster Internal Medicine/General Medicine/Clinical Chemistry at the University of Heidelberg’s Faculty of Medicine was investigated. This cluster requires students to separately pass three components (two written exams and an OSCE), whereby students may reattempt to pass each component twice. Our analysis was carried out using the method described by Douglas and Mislevy. Results: Frequently, when complex logical connections exist between the individual pass/fail decisions in the case of low failure rates, only a very low reliability for the overall decision to grant graded course credit can be achieved, even if high reliabilities exist for the various components. 
For the example analyzed here, the classification accuracy and consistency when conjunctively combining the three individual parts is relatively low with κ=0.49 or κ=0.47, despite the good reliability of over 0.75 for each of the three components. The option to repeat each component twice leads to a situation in which only about half of the candidates who do not satisfy the minimum requirements would fail the overall assessment, while the other half is able to continue their studies despite having deficient knowledge and skills. Conclusion: The method put forth by Douglas and Mislevy allows the analysis of the decision accuracy and consistency for complex combinations of scores from different components. Even in the case of highly reliable components, it is not necessarily so that a reliable pass/fail decision has been reached – for instance in the case of low failure rates. Assessments must be administered with the explicit goal of identifying examinees that do not fulfill the minimum requirements. PMID:26483855
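The effect of repeat attempts on a conjunctive pass rule can be illustrated with simple probability arithmetic. This is not the Douglas and Mislevy procedure. It is a sketch that assumes, purely for illustration, independent attempts with a fixed per-attempt pass probability; the function names and probabilities are hypothetical.

```python
def pass_within_attempts(p_single, attempts=3):
    """Chance of passing one component at least once in `attempts`
    independent tries (first attempt plus two repeats by default)."""
    return 1.0 - (1.0 - p_single) ** attempts


def overall_pass_probability(p_components, attempts=3):
    """Conjunctive rule: every component must eventually be passed."""
    prob = 1.0
    for p in p_components:
        prob *= pass_within_attempts(p, attempts)
    return prob


# Hypothetical candidate with a 50% chance per attempt on each of three components.
print(pass_within_attempts(0.5))
print(overall_pass_probability([0.5, 0.5, 0.5]))
```

Under these assumptions a candidate who would fail any single sitting half the time still has a better-than-even chance of passing the whole cluster, which is the kind of leakage the abstract describes.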
The Reliability Estimation for the Open Function of Cabin Door Affected by the Imprecise Judgment Corresponding to Distribution Hypothesis

NASA Astrophysics Data System (ADS)

Yu, Z. P.; Yue, Z. F.; Liu, W.

2018-05-01

With the development of artificial intelligence, more and more reliability experts have noticed the role of subjective information in the reliability design of complex systems. Therefore, based on a certain amount of experimental data and expert judgments, we have divided reliability estimation based on a distribution hypothesis into a cognition process and a reliability calculation. As an illustration of this modification, we have taken information fusion based on intuitional fuzzy belief functions as the diagnosis model of the cognition process, and completed the reliability estimation for the open function of a cabin door affected by the imprecise judgment corresponding to the distribution hypothesis.
Gait assessment using the Microsoft Xbox One Kinect: Concurrent validity and inter-day reliability of spatiotemporal and kinematic variables.

PubMed

Mentiplay, Benjamin F; Perraton, Luke G; Bower, Kelly J; Pua, Yong-Hao; McGaw, Rebekah; Heywood, Sophie; Clark, Ross A

2015-07-16

The revised Xbox One Kinect, also known as the Microsoft Kinect V2 for Windows, includes enhanced hardware which may improve its utility as a gait assessment tool. This study examined the concurrent validity and inter-day reliability of spatiotemporal and kinematic gait parameters estimated using the Kinect V2 automated body tracking system and a criterion reference three-dimensional motion analysis (3DMA) marker-based camera system. Thirty healthy adults performed two testing sessions consisting of comfortable and fast paced walking trials. Spatiotemporal outcome measures related to gait speed, speed variability, step length, width and time, foot swing velocity and medial-lateral and vertical pelvis displacement were examined. Kinematic outcome measures including ankle flexion, knee flexion and adduction and hip flexion were examined. To assess the agreement between Kinect and 3DMA systems, Bland-Altman plots, relative agreement (Pearson's correlation) and overall agreement (concordance correlation coefficients) were determined. Reliability was assessed using intraclass correlation coefficients, Cronbach's alpha and standard error of measurement. The spatiotemporal measurements had consistently excellent (r≥0.75) concurrent validity, with the exception of modest validity for medial-lateral pelvis sway (r=0.45-0.46) and fast paced gait speed variability (r=0.73). In contrast kinematic validity was consistently poor to modest, with all associations between the systems weak (r<0.50). In those measures with acceptable validity, the inter-day reliability was similar between systems. In conclusion, while the Kinect V2 body tracking may not accurately obtain lower body kinematic data, it shows great potential as a tool for measuring spatiotemporal aspects of gait. Copyright © 2015 Elsevier Ltd. All rights reserved.
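Two of the agreement statistics named above, Bland-Altman bias with 95% limits of agreement and the concordance correlation coefficient, can be sketched with the standard library alone. The function names and the step-length values are hypothetical, and the concordance coefficient is computed here in Lin's form with population (1/n) moments, which is an assumption about the variant used.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Mean difference (bias) and 95% limits of agreement between methods."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def concordance_ccc(a, b):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(a)
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    va = sum((x - ma) ** 2 for x in a) / n
    vb = sum((y - mb) ** 2 for y in b) / n
    return 2.0 * cov / (va + vb + (ma - mb) ** 2)


# Hypothetical step lengths (cm) from two measurement systems.
system_1 = [62.0, 65.5, 70.1, 58.3, 66.0]
system_2 = [61.2, 66.0, 69.0, 59.1, 64.8]
print(bland_altman(system_1, system_2))
print(concordance_ccc(system_1, system_2))
```

Unlike Pearson's r, the concordance coefficient is reduced by a constant offset between systems, so the two can disagree when one system reads systematically high.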
Cross-cultural adaptation, reliability, internal consistency and validation of the Hand Function Sort (HFS©) for French speaking patients with upper limb complaints.

PubMed

Konzelmann, M; Burrus, C; Hilfiker, R; Rivier, G; Deriaz, O; Luthi, F

2015-03-01

Functional evaluation of the upper limb is not only based on clinical findings but requires self-administered questionnaires to address patients' perspective. The Hand Function Sort (HFS©) was only validated in English. The aim of this study was the French cross-cultural adaptation and validation of the HFS© (HFS-F). 150 patients with various upper limb impairments were recruited in a rehabilitation center. Translation and cross-cultural adaptation were made according to international guidelines. Construct validity was estimated through correlations with the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire, SF-36 mental component summary (MCS), SF-36 physical component summary (PCS) and pain intensity. Internal consistency was assessed by Cronbach's α and test-retest reliability by intraclass correlation. Cronbach's α was 0.98; test-retest reliability was excellent at 0.921 (95 % CI 0.871-0.971), the same as for the original HFS©. Correlations with DASH were -0.779 (95 % CI -0.847 to -0.685); with SF-36 PCS 0.452 (95 % CI 0.276-0.599); with pain -0.247 (95 % CI -0.429 to -0.041); with SF-36 MCS 0.242 (95 % CI 0.042-0.422). There were no floor or ceiling effects. The HFS-F has the same good psychometric properties as the original HFS© (internal consistency, test-retest reliability, convergent validity with DASH, divergent validity with SF-36 MCS, and no floor or ceiling effects). The convergent validity with SF-36 PCS was poor; we found no correlation with pain. The HFS-F could be used with confidence in a population of working patients. Other studies are necessary to study its psychometric properties in other populations.
How reliable is apparent age at death on cadavers?

PubMed

Amadasi, Alberto; Merusi, Nicolò; Cattaneo, Cristina

2015-07-01

The assessment of age at death for identification purposes is a frequent and tough challenge for forensic pathologists and anthropologists. Too frequently, visual assessment of age is performed on well-preserved corpses, a method considered subjective and full of pitfalls, but whose level of inadequacy no one has yet tested or proven. This study consisted in the visual estimation of the age of 100 cadavers performed by a total of 37 observers among those usually attending the dissection room. Cadavers were of Caucasian ethnicity, well preserved, belonging to individuals who died of natural death. All the evaluations were performed prior to autopsy. Observers assessed the age with ranges of 5 and 10 years, indicating also the body part they mainly observed for each case. Globally, the 5-year range had an accuracy of 35%, increasing to 69% with the 10-year range. The highest accuracy was in the 31-60 age category (74.7% with the 10-year range), and the skin seemed to be the most reliable age parameter (71.5% of accuracy when observed), while the face was considered most frequently, in 92.4% of cases. A simple formula with the general "mean of averages" in the range given by the observers and related standard deviations was then developed; the average values with standard deviations of 4.62 lead to age estimation with ranges of some 20 years that seem to be fairly reliable and suitable, sometimes in alignment with classic anthropological methods, in the age estimation of well-preserved corpses.
The live donor assessment tool: a psychosocial assessment tool for live organ donors.

PubMed

Iacoviello, Brian M; Shenoy, Akhil; Braoude, Jenna; Jennings, Tiane; Vaidya, Swapna; Brouwer, Julianna; Haydel, Brandy; Arroyo, Hansel; Thakur, Devendra; Leinwand, Joseph; Rudow, Dianne LaPointe

2015-01-01

Psychosocial evaluation is an important part of the live organ donor evaluation process, yet it is not standardized across institutions, and although tools exist for the psychosocial evaluation of organ recipients, none exist to assess donors. We set out to develop a semistructured psychosocial evaluation tool (the Live Donor Assessment Tool, LDAT) to assess potential live organ donors and to conduct preliminary analyses of the tool's reliability and validity. Review of the literature on the psychosocial variables associated with treatment adherence, quality of life, live organ donation outcome, and resilience, as well as review of the procedures for psychosocial evaluation at our center and other centers around the country, identified 9 domains to address; these domains were distilled into several items each, in collaboration with colleagues at transplant centers across the country, for a total of 29 items. Four raters were trained to use the LDAT, and they retrospectively scored 99 psychosocial evaluations conducted on live organ donor candidates. Reliability of the LDAT was assessed by calculating the internal consistency of the items in the scale and interrater reliability between raters; validity was estimated by comparing LDAT scores between those with a "positive" evaluation outcome and "negative" outcome. The LDAT was found to have good internal consistency, inter-rater reliability, and showed signs of validity: LDAT scores differentiated the positive vs. negative outcome groups. The LDAT demonstrated good reliability and validity, but future research on the LDAT and the ability to implement the LDAT prospectively is warranted. Copyright © 2015 The Academy of Psychosomatic Medicine. Published by Elsevier Inc. All rights reserved.
Adaptation and Assessment of Reliability and Validity of the Greek Version of the Ohkuma Questionnaire for Dysphagia Screening

PubMed Central

Papadopoulou, Soultana L.; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam

2016-01-01

Introduction: The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective: The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods: Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way ANOVA test. Results: According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way ANOVA test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion: The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients. PMID:28050209
Adaptation and Assessment of Reliability and Validity of the Greek Version of the Ohkuma Questionnaire for Dysphagia Screening.

PubMed

Papadopoulou, Soultana L; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam

2017-01-01

Introduction: The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective: The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods: Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way ANOVA test. Results: According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way ANOVA test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion: The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients.
Are Validity and Reliability "Relevant" in Qualitative Evaluation Research?

ERIC Educational Resources Information Center

Goodwin, Laura D.; Goodwin, William L.

1984-01-01

The views of prominent qualitative methodologists on the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations are summarized. A case is made for the relevance of validity and reliability estimation. Definitions of validity and reliability for qualitative measurement are presented…
A General Approach for Estimating Scale Score Reliability for Panel Survey Data

ERIC Educational Resources Information Center

Biemer, Paul P.; Christ, Sharon L.; Wiesen, Christopher A.

2009-01-01

Scale score measures are ubiquitous in the psychological literature and can be used as both dependent and independent variables in data analysis. Poor reliability of scale score measures leads to inflated standard errors and/or biased estimates, particularly in multivariate analysis. Reliability estimation is usually an integral step to assess…
Bi-Factor Multidimensional Item Response Theory Modeling for Subscores Estimation, Reliability, and Classification

ERIC Educational Resources Information Center

Md Desa, Zairul Nor Deana

2012-01-01

In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores, to obtain subscores reliability, and subscores classification. Both the compensatory and partially compensatory MIRT…
Reliability and precision of pellet-group counts for estimating landscape-level deer density

Treesearch

David S. deCalesta

2013-01-01

This study provides hitherto unavailable methodology for reliably and precisely estimating deer density within forested landscapes, enabling quantitative rather than qualitative deer management. Reliability and precision of the deer pellet-group technique were evaluated in 1 small and 2 large forested landscapes. Density estimates, adjusted to reflect deer harvest and...
Method matters: Understanding diagnostic reliability in DSM-IV and DSM-5.

PubMed

Chmielewski, Michael; Clark, Lee Anna; Bagby, R Michael; Watson, David

2015-08-01

Diagnostic reliability is essential for the science and practice of psychology, in part because reliability is necessary for validity. Recently, the DSM-5 field trials documented lower diagnostic reliability than past field trials and the general research literature, resulting in substantial criticism of the DSM-5 diagnostic criteria. Rather than indicating specific problems with DSM-5, however, the field trials may have revealed long-standing diagnostic issues that have been hidden due to a reliance on audio/video recordings for estimating reliability. We estimated the reliability of DSM-IV diagnoses using both the standard audio-recording method and the test-retest method used in the DSM-5 field trials, in which different clinicians conduct separate interviews. Psychiatric patients (N = 339) were diagnosed using the SCID-I/P; 218 were diagnosed a second time by an independent interviewer. Diagnostic reliability using the audio-recording method (N = 49) was "good" to "excellent" (M κ = .80) and comparable to the DSM-IV field trials estimates. Reliability using the test-retest method (N = 218) was "poor" to "fair" (M κ = .47) and similar to DSM-5 field-trials' estimates. Despite low test-retest diagnostic reliability, self-reported symptoms were highly stable. Moreover, there was no association between change in self-report and change in diagnostic status. These results demonstrate the influence of method on estimates of diagnostic reliability. (c) 2015 APA, all rights reserved.
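The κ values quoted in this abstract are chance-corrected agreement coefficients. A minimal sketch of Cohen's kappa for two raters assigning categorical labels is given below; the function name and input format are illustrative assumptions, and the study's own computation may differ in detail.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. The function is undefined when chance agreement is already 1, for example when both raters use a single category throughout.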
Modelling heterogeneity variances in multiple treatment comparison meta-analysis--are informative priors the better solution?

PubMed

Thorlund, Kristian; Thabane, Lehana; Mills, Edward J

2013-01-11

Multiple treatment comparison (MTC) meta-analyses are commonly modeled in a Bayesian framework, and weakly informative priors are typically preferred to mirror familiar data driven frequentist approaches. Random-effects MTCs have commonly modeled heterogeneity under the assumption that the between-trial variances for all involved treatment comparisons are equal (i.e., the 'common variance' assumption). This approach 'borrows strength' for heterogeneity estimation across treatment comparisons, and thus, adds valuable precision when data is sparse. The homogeneous variance assumption, however, is unrealistic and can severely bias variance estimates. Consequently 95% credible intervals may not retain nominal coverage, and treatment rank probabilities may become distorted. Relaxing the homogeneous variance assumption may be equally problematic due to reduced precision. To regain good precision, moderately informative variance priors or additional mathematical assumptions may be necessary. In this paper we describe four novel approaches to modeling heterogeneity variance - two novel model structures, and two approaches for use of moderately informative variance priors. We examine the relative performance of all approaches in two illustrative MTC data sets. We particularly compare between-study heterogeneity estimates and model fits, treatment effect estimates and 95% credible intervals, and treatment rank probabilities. In both data sets, use of moderately informative variance priors constructed from the pairwise meta-analysis data yielded the best model fit and narrower credible intervals. Imposing consistency equations on variance estimates, assuming variances to be exchangeable, or using empirically informed variance priors also yielded good model fits and narrow credible intervals. The homogeneous variance model yielded high precision at all times, but overall inadequate estimates of between-trial variances. Lastly, treatment rankings were similar among the novel approaches, but considerably different when compared with the homogeneous variance approach. MTC models using a homogeneous variance structure appear to perform sub-optimally when between-trial variances vary between comparisons. Using informative variance priors, assuming exchangeability or imposing consistency between heterogeneity variances can all ensure sufficiently reliable and realistic heterogeneity estimation, and thus more reliable MTC inferences. All four approaches should be viable candidates for replacing or supplementing the conventional homogeneous variance MTC model, which is currently the most widely used in practice.
Sample size planning for composite reliability coefficients: accuracy in parameter estimation via narrow confidence intervals.

PubMed

Terry, Leann; Kelley, Ken

2012-11-01

Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.
A Measure for the Reliability of a Rating Scale Based on Longitudinal Clinical Trial Data

ERIC Educational Resources Information Center

Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert

2007-01-01

A new measure for reliability of a rating scale is introduced, based on the classical definition of reliability, as the ratio of the true score variance and the total variance. Clinical trial data can be employed to estimate the reliability of the scale in use, whenever repeated measurements are taken. The reliability is estimated from the…
The protonation of N2O reexamined - A case study on the reliability of various electron correlation methods for minima and transition states

NASA Technical Reports Server (NTRS)

Martin, J. M. L.; Lee, Timothy J.

1993-01-01

The protonation of N2O and the intramolecular proton transfer in N2OH+ are studied using various basis sets and a variety of methods, including second-order many-body perturbation theory (MP2), singles and doubles coupled cluster (CCSD), the augmented coupled cluster (CCSD(T)), and complete active space self-consistent field (CASSCF) methods. For geometries, MP2 leads to serious errors even for HNNO+; for the transition state, only CCSD(T) produces a reliable geometry due to serious nondynamical correlation effects. The proton affinity at 298.15 K is estimated at 137.6 kcal/mol, in close agreement with recent experimental determinations of 137.3 +/- 1 kcal/mol.
Batch settling curve registration via image data modeling.

PubMed

Derlon, Nicolas; Thürlimann, Christian; Dürrenmatt, David; Villez, Kris

2017-05-01

To this day, obtaining reliable characterization of sludge settling properties remains a challenging and time-consuming task. Without such assessments however, optimal design and operation of secondary settling tanks is challenging and conservative approaches will remain necessary. With this study, we show that automated sludge blanket height registration and zone settling velocity estimation is possible thanks to analysis of images taken during batch settling experiments. The experimental setup is particularly interesting for practical applications as it consists of off-the-shelf components only, no moving parts are required, and the software is released publicly. Furthermore, the proposed multivariate shape constrained spline model for image analysis appears to be a promising method for reliable sludge blanket height profile registration. Copyright © 2017 Elsevier Ltd. All rights reserved.
The importance of children's illness beliefs: the Children's Illness Perception Questionnaire (CIPQ) as a reliable assessment tool for eczema and asthma.

PubMed

Walker, C; Papadopoulos, L; Lipton, M; Hussein, M

2006-02-01

A lack of information about disease in children can lead to erroneous views such as children believing that hospital admittance or the presence of a disease is a punishment for a perceived wrong. There has thus far been no standard tool available to measure children's illness conceptualizations from a Leventhalian framework. Three groups of children aged 7 to 12 years, with eczema, with asthma, or with both eczema and asthma, were recruited. Children were given the Children's Illness Perception Questionnaire (CIPQ), a 26-item instrument adapted from the Illness Perception Questionnaire for adults. A Kuder-Richardson 20 test of reliability for dichotomous data was performed, allowing an estimate of the internal consistency of the measurement scales. For all three illness groups, internal consistency was acceptable for the timeline and consequences scales. The cure/control scale, however, was not internally consistent for any illness group. As health professionals, we need to develop the means to further understand how paediatric illness beliefs relate to specific disease types, age and psychosocial factors, and the utility of this instrument is discussed within this context.
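The Kuder-Richardson 20 coefficient mentioned in this abstract is the dichotomous-item analogue of Cronbach's α. The sketch below shows one common way to compute it from 0/1 responses. The function name and data layout are hypothetical, and the use of the population variance of total scores is an assumed convention (some texts use the sample variance instead).

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of respondents, each a list of k items scored 0 or 1."""
    n = len(responses)
    k = len(responses[0])
    # Sum over items of p * q, where p is the proportion scoring 1 and q = 1 - p.
    pq_sum = 0.0
    for col in zip(*responses):
        p = sum(col) / n
        pq_sum += p * (1 - p)
    # Population variance of the respondents' total scores (assumed convention).
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - pq_sum / total_var)
```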
Detection of the lunar body tide by the Lunar Orbiter Laser Altimeter

PubMed Central

Mazarico, Erwan; Barker, Michael K; Neumann, Gregory A; Zuber, Maria T; Smith, David E

2014-01-01

The Lunar Orbiter Laser Altimeter instrument onboard the Lunar Reconnaissance Orbiter spacecraft collected more than 5 billion measurements in the nominal 50 km orbit over ∼10,000 orbits. The data precision, geodetic accuracy, and spatial distribution enable two-dimensional crossovers to be used to infer relative radial position corrections between tracks to better than ∼1 m. We use nearly 500,000 altimetric crossovers to separate remaining high-frequency spacecraft trajectory errors from the periodic radial surface tidal deformation. The unusual sampling of the lunar body tide from polar lunar orbit limits the size of the typical differential signal expected at ground track intersections to ∼10 cm. Nevertheless, we reliably detect the topographic tidal signal and estimate the associated Love number h2 to be 0.0371 ± 0.0033, which is consistent with but lower than recent results from lunar laser ranging. Key Points: Altimetric data are used to create radial constraints on the tidal deformation. The body tide amplitude is estimated from the crossover data. The estimated Love number is consistent with previous estimates but more precise. PMID:26074646

Generalizability and decision studies to inform observational and experimental research in classroom settings.

PubMed

Bottema-Beutel, Kristen; Lloyd, Blair; Carter, Erik W; Asmus, Jennifer M

2014-11-01

Attaining reliable estimates of observational measures can be challenging in school and classroom settings, as behavior can be influenced by multiple contextual factors. Generalizability (G) studies can enable researchers to estimate the reliability of observational data, and decision (D) studies can inform how many observation sessions are necessary to achieve a criterion level of reliability. We conducted G and D studies using observational data from a randomized control trial focusing on social and academic participation of students with severe disabilities in inclusive secondary classrooms. Results highlight the importance of anchoring observational decisions to reliability estimates from existing or pilot data sets. We outline steps for conducting G and D studies and address options when reliability estimates are lower than desired.
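The decision-study logic described in this abstract, projecting how reliability changes with the number of observation sessions, can be illustrated as follows. This is a sketch under strong assumptions: it uses a one-facet persons-by-sessions design with hypothetical variance components, whereas the study's actual design and estimates are not given here. Function and parameter names are illustrative.

```python
def g_coefficient(var_person, var_residual, n_sessions):
    """Relative generalizability coefficient when averaging over n_sessions."""
    # Error variance shrinks in proportion to the number of sessions averaged.
    return var_person / (var_person + var_residual / n_sessions)


def sessions_needed(var_person, var_residual, criterion, max_sessions=1000):
    """Smallest number of sessions reaching the criterion, or None if not found."""
    for n in range(1, max_sessions + 1):
        if g_coefficient(var_person, var_residual, n) >= criterion:
            return n
    return None
```

For example, with a person variance of 1 and a residual variance of 2, a criterion of 0.70 is first reached at five sessions under these assumptions.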
Assessing practice-based influences on adolescent psychosocial development in sport: the activity context in youth sport questionnaire.

PubMed

García Bengoechea, Enrique; Sabiston, Catherine M; Wilson, Philip M

2017-01-01

The aim of this study was to provide initial evidence of validity and reliability of scores derived from the Activity Context in Youth Sport Questionnaire (ACYSQ), an instrument designed to offer a comprehensive assessment of the activities adolescents take part in during sport practices. Two studies were designed for the purposes of item development and selection, and to provide evidence of structural and criterion validity of ACYSQ scores, respectively (N = 334; M age = 14.93, SD = 1.76 years). Confirmatory factor analysis (CFA) supported the adequacy of a 20-item ACYSQ measurement model, which was invariant across gender, and comprised the following dimensions: (1) stimulation; (2) usefulness-value; (3) authenticity; (4) repetition-boredom; and (5) ineffectiveness. Internal consistency reliability estimates and composite reliability estimates for ACYSQ subscale scores ranged from 0.72 to 0.91. In regression analyses, stimulation predicted enjoyment and perceived competence, ineffectiveness was significantly associated with perceived competence and authenticity emerged as a predictor of commitment in sport. These findings indicate that the ACYSQ displays adequate psychometric properties and the use of the instrument may be useful for studying selected activity-based features of the practice environment and their motivational consequences in youth sport.
A study of fault prediction and reliability assessment in the SEL environment

NASA Technical Reports Server (NTRS)

Basili, Victor R.; Patnaik, Debabrata

1986-01-01

An empirical study on estimation and prediction of faults, prediction of fault detection and correction effort, and reliability assessment in the Software Engineering Laboratory environment (SEL) is presented. Fault estimation using empirical relationships and fault prediction using curve fitting method are investigated. Relationships between debugging efforts (fault detection and correction effort) in different test phases are provided, in order to make an early estimate of future debugging effort. This study concludes with the fault analysis, application of a reliability model, and analysis of a normalized metric for reliability assessment and reliability monitoring during development of software.
Spatio-temporal Granger causality: a new framework

PubMed Central

Luo, Qiang; Lu, Wenlian; Cheng, Wei; Valdes-Sosa, Pedro A.; Wen, Xiaotong; Ding, Mingzhou; Feng, Jianfeng

2015-01-01

That physiological oscillations of various frequencies are present in fMRI signals is the rule, not the exception. Herein, we propose a novel theoretical framework, spatio-temporal Granger causality, which allows us to more reliably and precisely estimate the Granger causality from experimental datasets possessing time-varying properties caused by physiological oscillations. Within this framework, Granger causality is redefined as a global index measuring the directed information flow between two time series with time-varying properties. Both theoretical analyses and numerical examples demonstrate that Granger causality is a monotonically increasing function of the temporal resolution used in the estimation. This is consistent with the general principle of coarse graining, which causes information loss by smoothing out very fine-scale details in time and space. Our results confirm that the Granger causality at the finer spatio-temporal scales considerably outperforms the traditional approach in terms of an improved consistency between two resting-state scans of the same subject. To optimally estimate the Granger causality, the proposed theoretical framework is implemented through a combination of several approaches, such as dividing the optimal time window and estimating the parameters at the fine temporal and spatial scales. Taken together, our approach provides a novel and robust framework for estimating the Granger causality from fMRI, EEG, and other related data. PMID:23643924
Fusion of Kinect depth data with trifocal disparity estimation for near real-time high quality depth maps generation

NASA Astrophysics Data System (ADS)

Boisson, Guillaume; Kerbiriou, Paul; Drazic, Valter; Bureller, Olivier; Sabater, Neus; Schubert, Arno

2014-03-01

Generating depth maps along with video streams is valuable for Cinema and Television production. Thanks to the improvements of depth acquisition systems, the challenge of fusion between depth sensing and disparity estimation is widely investigated in computer vision. This paper presents a new framework for generating depth maps from a rig made of a professional camera with two satellite cameras and a Kinect device. A new disparity-based calibration method is proposed so that registered Kinect depth samples become perfectly consistent with disparities estimated between rectified views. Also, a new hierarchical fusion approach is proposed for combining on the flow depth sensing and disparity estimation in order to circumvent their respective weaknesses. Depth is determined by minimizing a global energy criterion that takes into account the matching reliability and the consistency with the Kinect input. Thus generated depth maps are relevant both in uniform and textured areas, without holes due to occlusions or structured light shadows. Our GPU implementation reaches 20fps for generating quarter-pel accurate HD720p depth maps along with main view, which is close to real-time performances for video applications. The estimated depth is high quality and suitable for 3D reconstruction or virtual view synthesis.
Tracking reliability for space cabin-borne equipment in development by Crow model.

PubMed

Chen, J D; Jiao, S J; Sun, H L

2001-12-01

Objective. To study and track the reliability growth of manned spaceflight cabin-borne equipment in the course of its development. Method. A new technique of reliability growth estimation and prediction, which is composed of the Crow model and test data conversion (TDC) method was used. Result. The estimation and prediction value of the reliability growth conformed to its expectations. Conclusion. The method could dynamically estimate and predict the reliability of the equipment by making full use of various test information in the course of its development. It offered not only a possibility of tracking the equipment reliability growth, but also the reference for quality control in manned spaceflight cabin-borne equipment design and development process.
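To make the Crow model concrete, the sketch below fits the standard Crow-AMSAA power-law process to cumulative failure times from a time-truncated test, using the usual maximum likelihood estimators. It does not implement the test data conversion (TDC) step, which the abstract does not describe. Function names are hypothetical.

```python
import math


def crow_amsaa_fit(failure_times, total_time):
    """Maximum likelihood estimates (lam, beta) for a time-truncated test.

    Expected cumulative failures by time t are modelled as lam * t ** beta.
    """
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    return lam, beta


def instantaneous_mtbf(lam, beta, t):
    """Instantaneous mean time between failures at cumulative test time t."""
    return 1.0 / (lam * beta * t ** (beta - 1))
```

An estimated beta below 1 means the failure intensity is decreasing over the test, which is the signature of reliability growth.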
Dictionary-based fiber orientation estimation with improved spatial consistency.

PubMed

Ye, Chuyang; Prince, Jerry L

2018-02-01

Diffusion magnetic resonance imaging (dMRI) has enabled in vivo investigation of white matter tracts. Fiber orientation (FO) estimation is a key step in tract reconstruction and has been a popular research topic in dMRI analysis. In particular, the sparsity assumption has been used in conjunction with a dictionary-based framework to achieve reliable FO estimation with a reduced number of gradient directions. Because image noise can have a deleterious effect on the accuracy of FO estimation, previous works have incorporated spatial consistency of FOs in the dictionary-based framework to improve the estimation. However, because FOs are only indirectly determined from the mixture fractions of dictionary atoms and not modeled as variables in the objective function, these methods do not incorporate FO smoothness directly, and their ability to produce smooth FOs could be limited. In this work, we propose an improvement to Fiber Orientation Reconstruction using Neighborhood Information (FORNI), which we call FORNI+; this method estimates FOs in a dictionary-based framework where FO smoothness is better enforced than in FORNI alone. We describe an objective function that explicitly models the actual FOs and the mixture fractions of dictionary atoms. Specifically, it consists of data fidelity between the observed signals and the signals represented by the dictionary, pairwise FO dissimilarity that encourages FO smoothness, and weighted ℓ1-norm terms that ensure the consistency between the actual FOs and the FO configuration suggested by the dictionary representation. The FOs and mixture fractions are then jointly estimated by minimizing the objective function using an iterative alternating optimization strategy. FORNI+ was evaluated on a simulation phantom, a physical phantom, and real brain dMRI data. In particular, in the real brain dMRI experiment, we have qualitatively and quantitatively evaluated the reproducibility of the proposed method. Results demonstrate that FORNI+ produces FOs with better quality compared with competing methods. Copyright © 2017 Elsevier B.V. All rights reserved.
Robust Methods for Moderation Analysis with a Two-Level Regression Model.

PubMed

Yang, Miao; Yuan, Ke-Hai

2016-01-01

Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
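The idea behind Huber-type weights can be shown on a much simpler problem than the two-level regression model treated in the article. The sketch below estimates a robust location (a robust mean) by iteratively reweighting observations. It is not the authors' algorithm or their R program: the tuning constant 1.345, the MAD-based scale and the function names are all illustrative assumptions.

```python
from statistics import median


def huber_weight(r, c=1.345):
    """Huber weight for a standardized residual r: 1 inside [-c, c], c/|r| outside."""
    a = abs(r)
    return 1.0 if a <= c else c / a


def huber_location(data, c=1.345, iters=50):
    """Robust location estimate via iteratively reweighted averaging."""
    mu = median(data)
    # Scale from the median absolute deviation, rescaled for normal data.
    scale = median(abs(x - mu) for x in data) / 0.6745
    if scale == 0:
        return mu
    for _ in range(iters):
        weights = [huber_weight((x - mu) / scale, c) for x in data]
        mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
    return mu
```

Observations far from the current estimate receive weights below 1, so a single extreme value pulls the estimate far less than it would pull an ordinary mean.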
Estimates of population change in selected species of tropical birds using mark-recapture data

USGS Publications Warehouse

Brawn, J.; Nichols, J.D.; Hines, J.E.; Nesbitt, J.

2000-01-01

The population biology of tropical birds is known for only a small sample of species, especially in the Neotropics. Robust estimates of parameters such as survival rate and finite rate of population change (λ) are crucial for conservation purposes and useful for studies of avian life histories. We used methods developed by Pradel (1996, Biometrics 52:703-709) to estimate λ for 10 species of tropical forest lowland birds using data from a long-term (> 20 yr) banding study in Panama. These species constitute an ecologically and phylogenetically diverse sample. We present these estimates and explore if they are consistent with what we know from selected studies of banded birds and from 5 yr of estimating nesting success (i.e., an important component of λ). A major goal of these analyses is to assess if the mark-recapture methods generate reliable and reasonably precise estimates of population change than traditional methods that require more sampling effort.
The Yale-Brown Obsessive Compulsive Scale: A Reliability Generalization Meta-Analysis.

PubMed

López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Maria; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa

2015-10-01

The Yale-Brown Obsessive Compulsive Scale (Y-BOCS) is the most frequently applied test to assess obsessive compulsive symptoms. We conducted a reliability generalization meta-analysis on the Y-BOCS to estimate the average reliability, examine the variability among the reliability estimates, search for moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the Y-BOCS. We included studies where the Y-BOCS was applied to a sample of adults and reliability estimate was reported. Out of the 11,490 references located, 144 studies met the selection criteria. For the total scale, the mean reliability was 0.866 for coefficients alpha, 0.848 for test-retest correlations, and 0.922 for intraclass correlations. The moderator analyses led to a predictive model where the standard deviation of the total test and the target population (clinical vs. nonclinical) explained 38.6% of the total variability among coefficients alpha. Finally, clinical implications of the results are discussed. © The Author(s) 2014.
Reliability of a questionnaire on substance use among adolescent students, Brazil.

PubMed

Machado Neto, Adelmo de Souza; Andrade, Tarcisio Matos; Fernandes, Gilênio Borges; Zacharias, Helder Paulo; Carvalho, Fernando Martins; Machado, Ana Paula Souza; Dias, Ana Carmen Costa; Garcia, Ana Carolina Rocha; Santana, Lauro Reis; Rolin, Carlos Eduardo; Sampaio, Cyntia; Ghiraldi, Gisele; Bastos, Francisco Inácio

2010-10-01

To analyze the reliability of a self-applied questionnaire on substance use and misuse among adolescent students. Two cross-sectional studies were carried out for the instrument test-retest. The sample comprised male and female students aged 11-19 years from public and private schools (elementary, middle, and high school students) in the city of Salvador, Northeastern Brazil, in 2006. A total of 591 questionnaires were applied in the test and 467 in the retest. Descriptive statistics, the Kappa index, Cronbach's alpha and intraclass correlation were estimated. The prevalence of substance use/misuse was similar in both test and retest. Sociodemographic variables showed a "moderate" to "almost perfect" agreement for the Kappa index, and a "satisfactory" (>0.75) consistency for Cronbach's alpha and intraclass correlation. The age at which psychoactive substances (tobacco, alcohol, and cannabis) were first used and chronological age were similar in both studies. Test-retest reliability was found to be a good indicator of students' age of initiation and their patterns of substance use. The questionnaire reliability was found to be satisfactory in the population studied.
A direct observation method for auditing large urban centers using stratified sampling, mobile GIS technology and virtual environments.

PubMed

Lafontaine, Sean J V; Sawada, M; Kristjansson, Elizabeth

2017-02-16

With the expansion and growth of research on neighbourhood characteristics, there is an increased need for direct observational field audits. Herein, we introduce a novel direct observational audit method and systematic social observation instrument (SSOI) for efficiently assessing neighbourhood aesthetics over large urban areas. Our audit method uses spatial random sampling stratified by residential zoning and incorporates both mobile geographic information systems technology and virtual environments. The reliability of our method was tested in two ways: first, in 15 Ottawa neighbourhoods, we compared results at audited locations over two subsequent years; and second, we audited every residential block (167 blocks) in one neighbourhood and compared the distribution of SSOI aesthetics index scores with results from the randomly audited locations. Finally, we present interrater reliability and consistency results on all observed items. The observed neighbourhood average aesthetics index score estimated from four or five stratified random audit locations is sufficient to characterize the average neighbourhood aesthetics. The SSOI was internally consistent and demonstrated good to excellent interrater reliability. At the neighbourhood level, aesthetics is positively related to SES and physical activity and negatively correlated with BMI. The proposed approach to direct neighbourhood auditing performs sufficiently and has the advantage of financial and temporal efficiency when auditing a large city.
Estimating hydraulic parameters of a heterogeneous aquitard using long-term multi-extensometer and groundwater level data

NASA Astrophysics Data System (ADS)

Zhuang, Chao; Zhou, Zhifang; Illman, Walter A.; Guo, Qiaona; Wang, Jinguo

2017-09-01

The classical aquitard-drainage model COMPAC has been modified to simulate the compaction process of a heterogeneous aquitard consisting of multiple sub-units (Multi-COMPAC). By coupling Multi-COMPAC with the parameter estimation code PEST++, the vertical hydraulic conductivity (K_v) and elastic (S_ske) and inelastic (S_skp) skeletal specific-storage values of each sub-unit can be estimated using observed long-term multi-extensometer and groundwater level data. The approach was first tested through a synthetic case with known parameters. Results of the synthetic case revealed that it was possible to accurately estimate the three parameters for each sub-unit. Next, the methodology was applied to a field site located in Changzhou city, China. Based on the detailed stratigraphic information and extensometer data, the aquitard of interest was subdivided into three sub-units. Parameters K_v, S_ske and S_skp of each sub-unit were estimated simultaneously and then were compared with laboratory results and with bulk values and geologic data from previous studies, demonstrating the reliability of parameter estimates. Estimated S_skp values were on the order of 10^-4 m^-1, while K_v ranged over 10^-10 to 10^-8 m/s, suggesting moderately high heterogeneity of the aquitard. However, the elastic deformation of the third sub-unit, consisting of soft plastic silty clay, is masked by delayed drainage, and the inverse procedure leads to large uncertainty in the S_ske estimate for this sub-unit.
Objective classification of historical tropical cyclone intensity

NASA Astrophysics Data System (ADS)

Chenoweth, Michael

2007-03-01

Preinstrumental records of historical tropical cyclone activity require objective methods for accurately categorizing tropical cyclone intensity. Here wind force terms and damage reports from newspaper accounts in the Lesser Antilles and Jamaica for the period 1795-1879 are compared with wind speed estimates calculated from barometric pressure data. A total of 95 separate barometric pressure readings and colocated simultaneous wind force descriptors and wind-induced damage reports are compared. The wind speed estimates from barometric pressure data are taken as the most reliable and serve as a standard to compare against other data. Wind-induced damage reports are used to produce an estimated wind speed range using a modified Fujita scale. Wind force terms are compared with the barometric pressure data to determine if a gale, as used in the contemporary newspapers, is consistent with the modern definition of a gale. Results indicate that the modern definition of a gale (the threshold point separating the classification of a tropical depression from a tropical storm) is equivalent to that in contemporary newspaper accounts. Barometric pressure values are consistent with both reported wind force terms and wind damage on land when the location, speed and direction of movement of the tropical cyclone are determined. Damage reports and derived wind force estimates are consistent with other published results. Biases in ships' logbooks are confirmed and wind force terms of gale strength or greater are identified. These results offer a bridge between the earlier noninstrumental records of tropical cyclones and modern records thereby offering a method of consistently classifying storms in the Caribbean region into tropical depressions, tropical storms, nonmajor and major hurricanes.
Assessing disease severity: accuracy and reliability of rater estimates in relation to number of diagrams in a standard area diagram set

USDA-ARS's Scientific Manuscript database

Errors in rater estimates of plant disease severity occur, and standard area diagrams (SADs) help improve accuracy and reliability. The effect of the number of diagrams in a SAD set on accuracy and reliability is unknown. The objective of this study was to compare estimates of pecan scab severity made witho...
Estimating effective data density in a satellite retrieval or an objective analysis

NASA Technical Reports Server (NTRS)

Purser, R. J.; Huang, H.-L.

1993-01-01

An attempt is made to formulate consistent objective definitions of the concept of 'effective data density' applicable both in the context of satellite soundings and more generally in objective data analysis. The definitions based upon various forms of Backus-Gilbert 'spread' functions are found to be seriously misleading in satellite soundings where the model resolution function (expressing the sensitivity of retrieval or analysis to changes in the background error) features sidelobes. Instead, estimates derived by smoothing the trace components of the model resolution function are proposed. The new estimates are found to be more reliable and informative in simulated satellite retrieval problems and, for the special case of uniformly spaced perfect observations, agree exactly with their actual density. The new estimates integrate to the 'degrees of freedom for signal', a diagnostic that is invariant to changes of units or coordinates used.
Estimation of the caesium-137 source term from the Fukushima Daiichi nuclear power plant using a consistent joint assimilation of air concentration and deposition observations

NASA Astrophysics Data System (ADS)

Winiarek, Victor; Bocquet, Marc; Duhanyan, Nora; Roustan, Yelva; Saunier, Olivier; Mathieu, Anne

2014-01-01

Inverse modelling techniques can be used to estimate the amount of radionuclides and the temporal profile of the source term released in the atmosphere during the accident of the Fukushima Daiichi nuclear power plant in March 2011. In Winiarek et al. (2012b), the lower bounds of the caesium-137 and iodine-131 source terms were estimated with such techniques, using activity concentration measurements. The importance of an objective assessment of prior errors (the observation errors and the background errors) was emphasised for a reliable inversion. In such critical context where the meteorological conditions can make the source term partly unobservable and where only a few observations are available, such prior estimation techniques are mandatory, the retrieved source term being very sensitive to this estimation. We propose to extend the use of these techniques to the estimation of prior errors when assimilating observations from several data sets. The aim is to compute an estimate of the caesium-137 source term jointly using all available data about this radionuclide, such as activity concentrations in the air, but also daily fallout measurements and total cumulated fallout measurements. It is crucial to properly and simultaneously estimate the background errors and the prior errors relative to each data set. A proper estimation of prior errors is also a necessary condition to reliably estimate the a posteriori uncertainty of the estimated source term. Using such techniques, we retrieve a total released quantity of caesium-137 in the interval 11.6-19.3 PBq with an estimated standard deviation range of 15-20% depending on the method and the data sets. The “blind” time intervals of the source term have also been strongly mitigated compared to the first estimations with only activity concentration data.
Cross-Cultural Adaptation of the Profile Fitness Mapping Neck Questionnaire to Brazilian Portuguese: Internal Consistency, Reliability, and Construct and Structural Validity.

PubMed

Ferreira, Mariana Cândido; Björklund, Martin; Dach, Fabiola; Chaves, Thais Cristina

The purpose of this study was to adapt and evaluate the psychometric properties of the ProFitMap-neck to Brazilian Portuguese. The cross-cultural adaptation consisted of 5 stages, and 180 female patients with chronic neck pain participated in the study. A subsample (n = 30) answered the pretest, and another subsample (n = 100) answered the questionnaire a second time. Internal consistency, test-retest reliability, and construct validity (hypothesis testing and structural validity) were estimated. For construct validity, the scores of the questionnaire were correlated with the Neck Disability Index (NDI), and the Hospital Anxiety and Depression Scale (HADS), the Tampa Scale of Kinesiophobia (TSK), and the 36-item Short-Form Health Survey (SF-36). Internal consistency was determined by adequate Cronbach's α values (α > 0.70). Strong reliability was identified by high intraclass correlation coefficients (ICC > 0.75). Construct validity was identified by moderate and strong correlations of the Br-ProFitMap-neck with total NDI score (-0.56 50%, Kaiser-Meyer-Olkin index > 0.50, eigenvalue > 1, and factor loadings > 0.2. Br-ProFitMap-neck had adequate psychometric properties and can be used in clinical settings, as well as research, in patients with chronic neck pain. Copyright © 2017. Published by Elsevier Inc.
A robust design mark-resight abundance estimator allowing heterogeneity in resighting probabilities

USGS Publications Warehouse

McClintock, B.T.; White, Gary C.; Burnham, K.P.

2006-01-01

This article introduces the beta-binomial estimator (BBE), a closed-population abundance mark-resight model combining the favorable qualities of maximum likelihood theory and the allowance of individual heterogeneity in sighting probability (p). The model may be parameterized for a robust sampling design consisting of multiple primary sampling occasions where closure need not be met between primary occasions. We applied the model to brown bear data from three study areas in Alaska and compared its performance to the joint hypergeometric estimator (JHE) and Bowden's estimator (BOWE). BBE estimates suggest heterogeneity levels were non-negligible and discourage the use of JHE for these data. Compared to JHE and BOWE, confidence intervals were considerably shorter for the AICc model-averaged BBE. To evaluate the properties of BBE relative to JHE and BOWE when sample sizes are small, simulations were performed with data from three primary occasions generated under both individual heterogeneity and temporal variation in p. All models remained consistent regardless of levels of variation in p. In terms of precision, the AICc model-averaged BBE showed advantages over JHE and BOWE when heterogeneity was present and mean sighting probabilities were similar between primary occasions. Based on the conditions examined, BBE is a reliable alternative to JHE or BOWE and provides a framework for further advances in mark-resight abundance estimation. © 2006 American Statistical Association and the International Biometric Society.
Pupillary transient responses to within-task cognitive load variation.

PubMed

Wong, Hoe Kin; Epps, Julien

2016-12-01

Changes in physiological signals due to task evoked cognitive load have been reported extensively. However, pupil size based approaches for estimating cognitive load on a moment-to-moment basis are not as well understood as estimating cognitive load on a task-to-task basis, despite the appeal these approaches have for continuous load estimation. In particular, the pupillary transient response to instantaneous changes in induced load has not been experimentally quantified, and the within-task changes in pupil dilation have not been investigated in a manner that allows their consistency to be quantified with a view to biomedical system design. In this paper, a variation of the digit span task is developed which reliably induces rapid changes of cognitive load to generate task-evoked pupillary responses (TEPRs) associated with large, within-task load changes. Linear modelling and one-way ANOVA reveal that increasing the rate of cognitive loading, while keeping task demands constant, results in a steeper pupillary response. Instantaneous drops in cognitive load are shown to produce statistically significantly different transient pupillary responses relative to sustained load, and when characterised using an exponential decay response, the task-evoked pupillary response time constant is on the order of 1-5 s. Within-task test-retest analysis confirms the reliability of the moment-to-moment measurements. Based on these results, estimates of pupil diameter can be employed with considerably more confidence in moment-to-moment cognitive load estimation systems. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
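To make the "exponential decay response" and its time constant concrete, the sketch below fits a single-exponential model d(t) = baseline + A·exp(-t/τ) by log-linear regression. The model form, the assumption that the post-drop baseline is known, and the function name `fit_time_constant` are all illustrative assumptions; the record does not state how the authors performed their fit.

```python
import numpy as np

def fit_time_constant(t, d, baseline):
    """Estimate (tau, amplitude) for d(t) = baseline + A * exp(-t / tau).

    Assumes the baseline is known and every sample lies above it, so that
    log(d - baseline) is linear in t with slope -1 / tau.
    """
    t = np.asarray(t, dtype=float)
    excess = np.asarray(d, dtype=float) - baseline
    if np.any(excess <= 0):
        raise ValueError("all samples must exceed the baseline")
    slope, intercept = np.polyfit(t, np.log(excess), 1)  # highest degree first
    return -1.0 / slope, float(np.exp(intercept))
```

With noisy recordings, a nonlinear least-squares fit that also estimates the baseline would usually be preferable; the log-linear form is used here only because it keeps the relationship between slope and time constant explicit.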

General Aviation Aircraft Reliability Study

NASA Technical Reports Server (NTRS)

Pettit, Duane; Turnbull, Andrew; Roelant, Henk A. (Technical Monitor)

2001-01-01

This reliability study was performed in order to provide the aviation community with an estimate of Complex General Aviation (GA) Aircraft System reliability. To successfully improve the safety and reliability for the next generation of GA aircraft, a study of current GA aircraft attributes was prudent. This was accomplished by benchmarking the reliability of operational Complex GA Aircraft Systems. Specifically, Complex GA Aircraft System reliability was estimated using data obtained from the logbooks of a random sample of the Complex GA Aircraft population.
Expanding Reliability Generalization Methods with KR-21 Estimates: An RG Study of the Coopersmith Self-Esteem Inventory.

ERIC Educational Resources Information Center

Lane, Ginny G.; White, Amy E.; Henson, Robin K.

2002-01-01

Conducted a reliability generalizability study on the Coopersmith Self-Esteem Inventory (CSEI; S. Coopersmith, 1967) to examine the variability of reliability estimates across studies and to identify study characteristics that may predict this variability. Results show that reliability for CSEI scores can vary considerably, especially at the…
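The record above names KR-21 without stating it. As a reminder, the standard Kuder-Richardson formula 21 needs only the number of dichotomous items k, the mean total score M, and the total-score variance s²: KR-21 = k/(k-1) · (1 - M(k-M)/(k·s²)). The helper below is a minimal sketch of that formula; the function name and argument names are hypothetical and not taken from the study.

```python
def kr21(n_items, mean_total, var_total):
    """Kuder-Richardson formula 21 from summary statistics.

    n_items    -- number of dichotomously scored items (k)
    mean_total -- mean of the total scores (M)
    var_total  -- variance of the total scores (s^2)
    """
    if n_items < 2:
        raise ValueError("KR-21 needs at least two items")
    if var_total <= 0:
        raise ValueError("total-score variance must be positive")
    k = n_items
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))
```

Because it relies only on summary statistics, KR-21 can be computed for studies that report a mean and standard deviation but no reliability coefficient. It assumes items of equal difficulty and otherwise tends to underestimate KR-20.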
Inter-observer reliability of DSM-5 substance use disorders.

PubMed

Denis, Cécile M; Gelernter, Joel; Hart, Amy B; Kranzler, Henry R

2015-08-01

Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence concerning the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
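The two agreement statistics used in the record above, percent agreement and Cohen's kappa, can be sketched for two raters as follows. The function names and the toy ratings are illustrative only; no data from the study are reproduced.

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of subjects given the same label by both raters."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the product of each rater's marginal proportions
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / n ** 2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)

# Toy example: two interviewers diagnosing ten subjects (1 = disorder present)
rater_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
```

For these toy ratings the raters agree on 8 of 10 subjects while chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6. This shows why kappa is typically lower than raw percent agreement.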
Factors Influencing the Reliability of the Glasgow Coma Scale: A Systematic Review.

PubMed

Reith, Florence Cm; Synnot, Anneliese; van den Brande, Ruben; Gruen, Russell L; Maas, Andrew Ir

2017-06-01

The Glasgow Coma Scale (GCS) characterizes patients with diminished consciousness. In a recent systematic review, we found overall adequate reliability across different clinical settings, but reliability estimates varied considerably between studies, and methodological quality of studies was overall poor. Identifying and understanding factors that can affect its reliability is important, in order to promote high standards for clinical use of the GCS. The aim of this systematic review was to identify factors that influence reliability and to provide an evidence base for promoting consistent and reliable application of the GCS. A comprehensive literature search was undertaken in MEDLINE, EMBASE, and CINAHL from 1974 to July 2016. Studies assessing the reliability of the GCS in adults or describing any factor that influences reliability were included. Two reviewers independently screened citations, selected full texts, and undertook data extraction and critical appraisal. Methodological quality of studies was evaluated with the consensus-based standards for the selection of health measurement instruments checklist. Data were synthesized narratively and presented in tables. Forty-one studies were included for analysis. Factors identified that may influence reliability are education and training, the level of consciousness, and type of stimuli used. Conflicting results were found for experience of the observer, the pathology causing the reduced consciousness, and intubation/sedation. No clear influence was found for the professional background of observers. Reliability of the GCS is influenced by multiple factors and as such is context dependent. This review points to the potential for improvement from training and education and standardization of assessment methods, for which recommendations are presented. Copyright © 2017 by the Congress of Neurological Surgeons.
Reliability and validity of the multimedia activity recall in children and adults (MARCA) in people with chronic obstructive pulmonary disease.

PubMed

Hunt, Toby; Williams, Marie T; Olds, Tim S

2013-01-01

To determine the reliability and validity of the Multimedia Activity Recall for Children and Adults (MARCA) in people with chronic obstructive pulmonary disease (COPD). People with COPD and their carers completed the Multimedia Activity Recall for Children and Adults (MARCA) for four, 24-hour periods (including test-retest of 2 days) while wearing a triaxial accelerometer (Actigraph GT3X+®), a multi-sensor armband (Sensewear Pro3®) and a pedometer (New Lifestyles 1000®). Self reported activity recalls (MARCA) and objective activity monitoring (Accelerometry) were recorded under free-living conditions. 24 couples were included in the analysis (COPD; age 74.4 ± 7.9 yrs, FEV1 54 ± 13% Carer; age 69.6 ± 10.9 yrs, FEV1 99 ± 24%). Not applicable. Test-retest reliability was compared for MARCA activity domains and different energy expenditure zones. Validity was assessed between MARCA-derived physical activity level (in metabolic equivalent of task (MET) per minute), duration of moderate to vigorous physical activity (min) and related data from the objective measurement devices. Analysis included intra-class correlation coefficients (ICC), Bland-Altman analyses, paired t-tests (p) and Spearman's rank correlation coefficients (rs). Reliability between occasions of recall for all activity domains was uniformly high, with test-retest correlations consistently >0.9. Validity correlations were moderate to strong (rs = 0.43-0.80) across all comparisons. The MARCA yields comparable PAL estimates and slightly higher moderate to vigorous physical activity (MVPA) estimates. In older adults with chronic illness, the MARCA is a valid and reliable tool for capturing not only the time and energy expenditure associated with physical and sedentary activities but also information on the types of activities.
Testing comparison models of DASS-12 and its reliability among adolescents in Malaysia.

PubMed

Osman, Zubaidah Jamil; Mukhtar, Firdaus; Hashim, Hairul Anuar; Abdul Latiff, Latiffah; Mohd Sidik, Sherina; Awang, Hamidin; Ibrahim, Normala; Abdul Rahman, Hejar; Ismail, Siti Irma Fadhilah; Ibrahim, Faisal; Tajik, Esra; Othman, Norlijah

2014-10-01

The 21-item Depression, Anxiety and Stress Scale (DASS-21) is frequently used in non-clinical research to measure mental health factors among adults. However, previous studies have concluded that the 21 items are not stable for utilization among the adolescent population. Thus, the aims of this study are to examine the structure of the factors and to report on the reliability of the refined version of the DASS that consists of 12 items. A total of 2850 students (aged 13 to 17 years old) from three major ethnic groups in Malaysia completed the DASS-21. The study was conducted at 10 randomly selected secondary schools in the northern state of Peninsular Malaysia. The study population comprised secondary school students (Forms 1, 2 and 4) from the selected schools. Based on the results of the EFA stage, 12 items were included in a final CFA to test the fit of the model. Using maximum likelihood procedures to estimate the model, the selected fit indices indicated a close model fit (χ²=132.94, df=57, p=.000; CFI=.96; RMR=.02; RMSEA=.04). Moreover, significant loadings of all the unstandardized regression weights implied an acceptable convergent validity. Besides the convergent validity of the items, discriminant validity of the subscales was also evident from the moderate latent factor inter-correlations, which ranged from .62 to .75. The subscale reliability was further estimated using Cronbach's alpha and the adequate reliability of the subscales was obtained (Total=.76; Depression=.68; Anxiety=.53; Stress=.52). The new version of the 12-item DASS for adolescents in Malaysia (DASS-12) is reliable and has a stable factor structure, and thus it is a useful instrument for distinguishing between depression, anxiety and stress. Copyright © 2014 Elsevier Inc. All rights reserved.
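Since Cronbach's alpha recurs throughout these records, a short sketch of how it is computed from a respondents-by-items score matrix may help. It uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name `cronbach_alpha` and the use of sample variances (ddof=1) are assumptions made for illustration.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a 2-D array: rows are respondents, columns are items."""
    x = np.asarray(scores, dtype=float)
    if x.ndim != 2 or x.shape[1] < 2:
        raise ValueError("need a respondents-by-items matrix with at least 2 items")
    k = x.shape[1]
    item_var_sum = x.var(axis=0, ddof=1).sum()   # sum of per-item sample variances
    total_var = x.sum(axis=1).var(ddof=1)        # sample variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

Alpha rises with the number of items as well as with inter-item covariance, so short subscales (for example, four items each in a 12-item instrument with three subscales) can be expected to yield lower values than the total scale.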
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data

PubMed Central

Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu

2012-01-01

SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Benchmarking passive seismic methods of estimating the depth of velocity interfaces down to ~300 m

NASA Astrophysics Data System (ADS)

Czarnota, Karol; Gorbatov, Alexei

2016-04-01

In shallow passive seismology it is generally accepted that the spatial autocorrelation (SPAC) method is more robust than the horizontal-over-vertical spectral ratio (HVSR) method at resolving the depth to surface-wave velocity (Vs) interfaces. Here we present results of a field test of these two methods over ten drill sites in western Victoria, Australia. The target interface is the base of Cenozoic unconsolidated to semi-consolidated clastic and/or carbonate sediments of the Murray Basin, which overlie Paleozoic crystalline rocks. Depths of this interface intersected in drill holes are between ~27 m and ~300 m. Seismometers were deployed in a three-arm spiral array, with a radius of 250 m, consisting of 13 Trillium Compact 120 s broadband instruments. Data were acquired at each site for 7-21 hours. The Vs architecture beneath each site was determined through nonlinear inversion of HVSR and SPAC data using the neighbourhood algorithm, implemented in the geopsy modelling package (Wathelet, 2005, GRL v35). The HVSR technique yielded depth estimates of the target interface (Vs > 1000 m/s) generally within ±20% error. Successful estimates were even obtained at a site with an inverted velocity profile, where Quaternary basalts overlie Neogene sediments which in turn overlie the target basement. Half of the SPAC estimates showed significantly higher errors than were obtained using HVSR. Joint inversion provided the most reliable estimates but was unstable at three sites. We attribute the surprising success of HVSR over SPAC to a low content of transient signals within the seismic record caused by low levels of anthropogenic noise at the benchmark sites. At a few sites SPAC waveform curves showed clear overtones suggesting that more reliable SPAC estimates may be obtained utilizing a multi-modal inversion. 
Nevertheless, our study indicates that reliable basin thickness estimates in the Australian conditions tested can be obtained utilizing HVSR data from a single seismometer, without a priori knowledge of the surface-wave velocity of the basin material, thereby negating the need to deploy cumbersome arrays.
Evaluating direct medical expenditures estimation methods of adults using the medical expenditure panel survey: an example focusing on head and neck cancer.

PubMed

Coughlan, Diarmuid; Yeh, Susan T; O'Neill, Ciaran; Frick, Kevin D

2014-01-01

To inform policymakers of the importance of evaluating various methods for estimating the direct medical expenditures for a low-incidence condition, head and neck cancer (HNC). Four methods of estimation have been identified: 1) summing all health care expenditures, 2) estimating disease-specific expenditures consistent with an attribution approach, 3) estimating disease-specific expenditures by matching, and 4) estimating disease-specific expenditures by using a regression-based approach. A literature review of studies (2005-2012) that used the Medical Expenditure Panel Survey (MEPS) was undertaken to establish the most popular expenditure estimation methods. These methods were then applied to a sample of 120 respondents with HNC, derived from pooled data (2003-2008). The literature review shows that varying expenditure estimation methods have been used with MEPS but no study compared and contrasted all four methods. Our estimates are reflective of the national treated prevalence of HNC. The upper-bound estimate of annual direct medical expenditures of adult respondents with HNC between 2003 and 2008 was $3.18 billion (in 2008 dollars). Comparable estimates arising from methods focusing on disease-specific and incremental expenditures were all lower in magnitude. Attribution yielded annual expenditures of $1.41 billion, matching method of $1.56 billion, and regression method of $1.09 billion. This research demonstrates that variation exists across and within expenditure estimation methods applied to MEPS data. Despite concerns regarding aspects of reliability and consistency, reporting a combination of the four methods offers a degree of transparency and validity to estimating the likely range of annual direct medical expenditures of a condition. © 2013 International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Published by International Society for Pharmacoeconomics and Outcomes Research (ISPOR) All rights reserved.
Impact of Alzheimer's Disease on Caregiver Questionnaire: internal consistency, convergent validity, and test-retest reliability of a new measure for assessing caregiver burden.

PubMed

Cole, Jason C; Ito, Diane; Chen, Yaozhu J; Cheng, Rebecca; Bolognese, Jennifer; Li-McLeod, Josephine

2014-09-04

There is a lack of validated instruments to measure the level of burden of Alzheimer's disease (AD) on caregivers. The Impact of Alzheimer's Disease on Caregiver Questionnaire (IADCQ) is a 12-item instrument with a seven-day recall period that measures AD caregiver's burden across emotional, physical, social, financial, sleep, and time aspects. Primary objectives of this study were to evaluate psychometric properties of IADCQ administered on the Web and to determine most appropriate scoring algorithm. A national sample of 200 unpaid AD caregivers participated in this study by completing the Web-based version of IADCQ and Short Form-12 Health Survey Version 2 (SF-12v2™). The SF-12v2 was used to measure convergent validity of IADCQ scores and to provide an understanding of the overall health-related quality of life of sampled AD caregivers. The IADCQ survey was also completed four weeks later by a randomly selected subgroup of 50 participants to assess test-retest reliability. Confirmatory factor analysis (CFA) was implemented to test the dimensionality of the IADCQ items. Classical item-level and scale-level psychometric analyses were conducted to estimate psychometric characteristics of the instrument. Test-retest reliability was performed to evaluate the instrument's stability and consistency over time. Virtually none (2%) of the respondents had either floor or ceiling effects, indicating the IADCQ covers an ideal range of burden. A single-factor model obtained appropriate goodness of fit and provided evidence that a simple sum score of the 12 items of IADCQ can be used to measure AD caregiver's burden. Scales-level reliability was supported with a coefficient alpha of 0.93 and an intra-class correlation coefficient (for test-retest reliability) of 0.68 (95% CI: 0.50-0.80). Low-moderate negative correlations were observed between the IADCQ and scales of the SF-12v2. 
The study findings suggest the IADCQ has appropriate psychometric characteristics as a unidimensional, Web-based measure of AD caregiver burden and is supported by strong model fit statistics from CFA, high degree of item-level reliability, good internal consistency, moderate test-retest reliability, and moderate convergent validity. Additional validation of the IADCQ is warranted to ensure invariance between the paper-based and Web-based administration and to determine an appropriate responder definition.
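As a rough illustration of the coefficient alpha reported above, the following Python sketch computes alpha from a respondents-by-items score matrix. The function name `cronbach_alpha` and the toy scores are hypothetical; they are not the IADCQ data or the authors' code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondent rows (one score per item)."""
    k = len(scores[0])                       # number of items
    item_columns = list(zip(*scores))        # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])  # variance of sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical toy data: 4 respondents, 3 items scored 1-5.
toy_scores = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 4, 5],
    [5, 5, 4],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha approaches 1 when items rise and fall together across respondents and approaches 0 when the items are unrelated, which is why it is read as an internal consistency estimate.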
Reliability and validity of a short form household food security scale in a Caribbean community.

PubMed

Gulliford, Martin C; Mahabir, Deepak; Rocke, Brian

2004-06-16

We evaluated the reliability and validity of the short form household food security scale in a different setting from the one in which it was developed. The scale was interview administered to 531 subjects from 286 households in north central Trinidad in Trinidad and Tobago, West Indies. We evaluated the six items by fitting item response theory models to estimate item thresholds, estimating agreement among respondents in the same households and estimating the slope index of income-related inequality (SII) after adjusting for age, sex and ethnicity. Item-score correlations ranged from 0.52 to 0.79 and Cronbach's alpha was 0.87. Item responses gave within-household correlation coefficients ranging from 0.70 to 0.78. Estimated item thresholds (standard errors) from the Rasch model ranged from -2.027 (0.063) for the 'balanced meal' item to 2.251 (0.116) for the 'hungry' item. The 'balanced meal' item had the lowest threshold in each ethnic group even though there was evidence of differential functioning for this item by ethnicity. Relative thresholds of other items were generally consistent with US data. Estimation of the SII, comparing those at the bottom with those at the top of the income scale, gave relative odds for an affirmative response of 3.77 (95% confidence interval 1.40 to 10.2) for the lowest severity item, and 20.8 (2.67 to 162.5) for highest severity item. Food insecurity was associated with reduced consumption of green vegetables after additionally adjusting for income and education (0.52, 0.28 to 0.96). The household food security scale gives reliable and valid responses in this setting. Differing relative item thresholds compared with US data do not require alteration to the cut-points for classification of 'food insecurity without hunger' or 'food insecurity with hunger'. The data provide further evidence that re-evaluation of the 'balanced meal' item is required.
Interrater reliability of schizoaffective disorder compared with schizophrenia, bipolar disorder, and unipolar depression - A systematic review and meta-analysis.

PubMed

Santelmann, Hanno; Franklin, Jeremy; Bußhoff, Jana; Baethge, Christopher

2016-10-01

Schizoaffective disorder is a common diagnosis in clinical practice but its nosological status has been subject to debate ever since it was conceptualized. Although it is key that diagnostic reliability is sufficient, schizoaffective disorder has been reported to have low interrater reliability. Evidence based on systematic review and meta-analysis methods, however, is lacking. Using a highly sensitive literature search in Medline, Embase, and PsycInfo we identified studies measuring the interrater reliability of schizoaffective disorder in comparison to schizophrenia, bipolar disorder, and unipolar disorder. Out of 4126 records screened we included 25 studies reporting on 7912 patients diagnosed by different raters. The interrater reliability of schizoaffective disorder was moderate (meta-analytic estimate of Cohen's kappa 0.57 [95% CI: 0.41-0.73]), and substantially lower than that of its main differential diagnoses (difference in kappa between 0.22 and 0.19). Although there was considerable heterogeneity, analyses revealed that the interrater reliability of schizoaffective disorder was consistently lower in the overwhelming majority of studies. The results remained robust in subgroup and sensitivity analyses (e.g., diagnostic manual used) as well as in meta-regressions (e.g., publication year) and analyses of publication bias. Clinically, the results highlight the particular importance of diagnostic re-evaluation in patients diagnosed with schizoaffective disorder. They also quantify a widely held clinical impression of lower interrater reliability and agree with earlier meta-analysis reporting low test-retest reliability. Copyright © 2016. Published by Elsevier B.V.
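For readers unfamiliar with the statistic pooled in this meta-analysis, the sketch below computes Cohen's kappa for two raters from first principles. The function name and the diagnostic labels are hypothetical examples, not data from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    n = len(rater_a)
    # Observed agreement: share of cases where both raters give the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labelled independently at their own base rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in labels) / n**2
    # Undefined (division by zero) when both raters use a single identical label.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels: SAD = schizoaffective disorder, SCZ = schizophrenia.
a = ["SAD", "SAD", "SCZ", "SCZ"]
b = ["SAD", "SCZ", "SCZ", "SCZ"]
print(cohens_kappa(a, b))  # 0.5
```

In this toy case the raters agree on 3 of 4 cases (0.75), chance agreement is 0.5, and kappa is therefore (0.75 - 0.5) / (1 - 0.5) = 0.5.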
Reliability of self-aligned, ledge passivated 7.5 GHz GaAs/AlGaAs HBT power amplifiers under RF bias stress at elevated temperatures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Henderson, T.S.; Ikalainen, P.K.

1995-12-31

The authors report a two-temperature RF bias stress test on nominal 1.2 W 7.5 GHz GaAs/AlGaAs HBT unit cell amplifiers. MTTFs of 2020 and 1340 hours were obtained at Tj = 218°C and 245°C, respectively, under nominal input bias. An activation energy of 0.42 eV is estimated, consistent with published results for similar devices under DC bias stress.
Multiple exposure photographic (MEP) technique: an objective assessment of sperm motility in infertility management.

PubMed

Adetoro, O O

1988-06-01

Multiple exposure photography (MEP), an objective technique, was used in determining the percentage of motile sperms in the semen samples from 41 males being investigated for infertility. This technique was compared with the conventional subjective ordinary microscopy method of spermatozoal motility assessment. A satisfactory correlation was observed in percentage sperm motility assessment using the two methods but the MEP estimation was more consistent and reliable. The value of this technique of sperm motility study in the developing world is discussed.
Validation of the Maslach Burnout Inventory-Human Services Survey for Estimating Burnout in Dental Students.

PubMed

Montiel-Company, José María; Subirats-Roig, Cristian; Flores-Martí, Pau; Bellot-Arcís, Carlos; Almerich-Silla, José Manuel

2016-11-01

The aim of this study was to examine the validity and reliability of the Maslach Burnout Inventory-Human Services Survey (MBI-HSS) as a tool for assessing the prevalence and level of burnout in dental students in Spanish universities. The survey was adapted from English to Spanish. A sample of 533 dental students from 15 Spanish universities and a control group of 188 medical students self-administered the survey online, using the Google Drive service. The test-retest reliability or reproducibility showed an Intraclass Correlation Coefficient of 0.95. The internal consistency of the survey was 0.922. Testing the construct validity showed two components with an eigenvalue greater than 1.5, which explained 51.2% of the total variance. Factor I (36.6% of the variance) comprised the items that estimated emotional exhaustion and depersonalization. Factor II (14.6% of the variance) contained the items that estimated personal accomplishment. The cut-off point for the existence of burnout achieved a sensitivity of 92.2%, a specificity of 92.1%, and an area under the curve of 0.96. Comparison of the total dental students sample and the control group of medical students showed significantly higher burnout levels for the dental students (50.3% vs. 40.4%). In this study, the MBI-HSS was found to be viable, valid, and reliable for measuring burnout in dental students. Since the study also found that the dental students suffered from high levels of this syndrome, these results suggest the need for preventive burnout control programs.
Integrated GNSS Attitude Determination and Positioning for Direct Geo-Referencing

PubMed Central

Nadarajah, Nandakumaran; Paffenholz, Jens-André; Teunissen, Peter J. G.

2014-01-01

Direct geo-referencing is an efficient methodology for the fast acquisition of 3D spatial data. It requires the fusion of spatial data acquisition sensors with navigation sensors, such as Global Navigation Satellite System (GNSS) receivers. In this contribution, we consider an integrated GNSS navigation system to provide estimates of the position and attitude (orientation) of a 3D laser scanner. The proposed multi-sensor system (MSS) consists of multiple GNSS antennas rigidly mounted on the frame of a rotating laser scanner and a reference GNSS station with known coordinates. Precise GNSS navigation requires the resolution of the carrier phase ambiguities. The proposed method uses the multivariate constrained integer least-squares (MC-LAMBDA) method for the estimation of rotating frame ambiguities and attitude angles. MC-LAMBDA makes use of the known antenna geometry to strengthen the underlying attitude model and, hence, to enhance the reliability of rotating frame ambiguity resolution and attitude determination. The reliable estimation of rotating frame ambiguities is consequently utilized to enhance the relative positioning of the rotating frame with respect to the reference station. This integrated (array-aided) method improves ambiguity resolution, as well as positioning accuracy between the rotating frame and the reference station. Numerical analyses of GNSS data from a real-data campaign confirm the improved performance of the proposed method over the existing method. In particular, the integrated method yields reliable ambiguity resolution and reduces position standard deviation by a factor of about 0.8, matching the theoretical gain of √(3/4) for two antennas on the rotating frame and a single antenna at the reference station. PMID:25036330
Integrated GNSS attitude determination and positioning for direct geo-referencing.

PubMed

Nadarajah, Nandakumaran; Paffenholz, Jens-André; Teunissen, Peter J G

2014-07-17

Direct geo-referencing is an efficient methodology for the fast acquisition of 3D spatial data. It requires the fusion of spatial data acquisition sensors with navigation sensors, such as Global Navigation Satellite System (GNSS) receivers. In this contribution, we consider an integrated GNSS navigation system to provide estimates of the position and attitude (orientation) of a 3D laser scanner. The proposed multi-sensor system (MSS) consists of multiple GNSS antennas rigidly mounted on the frame of a rotating laser scanner and a reference GNSS station with known coordinates. Precise GNSS navigation requires the resolution of the carrier phase ambiguities. The proposed method uses the multivariate constrained integer least-squares (MC-LAMBDA) method for the estimation of rotating frame ambiguities and attitude angles. MC-LAMBDA makes use of the known antenna geometry to strengthen the underlying attitude model and, hence, to enhance the reliability of rotating frame ambiguity resolution and attitude determination. The reliable estimation of rotating frame ambiguities is consequently utilized to enhance the relative positioning of the rotating frame with respect to the reference station. This integrated (array-aided) method improves ambiguity resolution, as well as positioning accuracy between the rotating frame and the reference station. Numerical analyses of GNSS data from a real-data campaign confirm the improved performance of the proposed method over the existing method. In particular, the integrated method yields reliable ambiguity resolution and reduces position standard deviation by a factor of about 0.8, matching the theoretical gain of √(3/4) for two antennas on the rotating frame and a single antenna at the reference station.
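As a quick arithmetic check, taking the theoretical gain to be the square root of three quarters, the value is close to the empirical factor of about 0.8 reported in the abstract:

```latex
\sqrt{\tfrac{3}{4}} = \frac{\sqrt{3}}{2} \approx 0.866
```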
Estimation of snow in extratropical cyclones from multiple frequency airborne radar observations. An Expectation-Maximization approach

NASA Astrophysics Data System (ADS)

Grecu, M.; Tian, L.; Heymsfield, G. M.

2017-12-01

A major challenge in deriving accurate estimates of physical properties of falling snow particles from single frequency space- or airborne radar observations is that snow particles exhibit a large variety of shapes and their electromagnetic scattering characteristics are highly dependent on these shapes. Triple frequency (Ku-Ka-W) radar observations are expected to facilitate the derivation of more accurate snow estimates because specific snow particle shapes tend to have specific signatures in the associated two-dimensional dual-reflectivity-ratio (DFR) space. However, the derivation of accurate snow estimates from triple frequency radar observations is by no means a trivial task. This is because the radar observations can be subject to non-negligible attenuation (especially at W-band when super-cooled water is present), which may significantly impact the interpretation of the information in the DFR space. Moreover, the electromagnetic scattering properties of snow particles are computationally expensive to derive, which makes the derivation of reliable parameterizations usable in estimation methodologies challenging. In this study, we formulate a two-step Expectation Maximization (EM) methodology to derive accurate snow estimates in Extratropical Cyclones (ETCs) from triple frequency airborne radar observations. The Expectation (E) step consists of a least-squares triple frequency estimation procedure applied with given assumptions regarding the relationships between the density of snow particles and their sizes, while the Maximization (M) step consists of the optimization of the assumptions used in step E. The electromagnetic scattering properties of snow particles are derived using the Rayleigh-Gans approximation. The methodology is applied to triple frequency radar observations collected during the Olympic Mountains Experiment (OLYMPEX).
Results show that snowfall estimates above the freezing level in ETCs consistent with the triple frequency radar observations as well as with independent rainfall estimates below the freezing level may be derived using the EM methodology formulated in the study.
Clinical validation of three short forms of the Dutch Wechsler Memory Scale-Fourth Edition (WMS-IV-NL) in a mixed clinical sample.

PubMed

Bouman, Zita; Hendriks, Marc P H; Van Der Veld, William M; Aldenkamp, Albert P; Kessels, Roy P C

2016-06-01

The reliability and validity of three short forms of the Dutch version of the Wechsler Memory Scale-Fourth Edition (WMS-IV-NL) were evaluated in a mixed clinical sample of 235 patients. The short forms were based on the WMS-IV Flexible Approach, that is, a 3-subtest combination (Older Adult Battery for Adults) and two 2-subtest combinations (Logical Memory and Visual Reproduction and Logical Memory and Designs), which can be used to estimate the Immediate, Delayed, Auditory and Visual Memory Indices. All short forms showed good reliability coefficients. As expected, for adults (16-69 years old) the 3-subtest short form was consistently more accurate (predictive accuracy ranged from 73% to 100%) than both 2-subtest short forms (range = 61%-80%). Furthermore, for older adults (65-90 years old), the predictive accuracy of the 2-subtest short form ranged from 75% to 100%. These results suggest that caution is warranted when using the WMS-IV-NL Flexible Approach short forms to estimate all four indices. © The Author(s) 2015.
A Height Estimation Approach for Terrain Following Flights from Monocular Vision.

PubMed

Campos, Igor S G; Nascimento, Erickson R; Freitas, Gustavo M; Chaimowicz, Luiz

2016-12-06

In this paper, we present a monocular vision-based height estimation algorithm for terrain following flights. The impressive growth of Unmanned Aerial Vehicle (UAV) usage, notably in mapping applications, will soon require the creation of new technologies to enable these systems to better perceive their surroundings. Specifically, we chose to tackle the terrain following problem, as it is still unresolved for consumer available systems. Virtually every mapping aircraft carries a camera; therefore, we chose to exploit this in order to use presently available hardware to extract the height information toward performing terrain following flights. The proposed methodology consists of using optical flow to track features from videos obtained by the UAV, as well as its motion information to estimate the flying height. To determine if the height estimation is reliable, we trained a decision tree that takes the optical flow information as input and classifies whether the output is trustworthy or not. The classifier achieved accuracies of 80 % for positives and 90 % for negatives, while the height estimation algorithm presented good accuracy.

The reliability of the Glasgow Coma Scale: a systematic review.

PubMed

Reith, Florence C M; Van den Brande, Ruben; Synnot, Anneliese; Gruen, Russell; Maas, Andrew I R

2016-01-01

The Glasgow Coma Scale (GCS) provides a structured method for assessment of the level of consciousness. Its derived sum score is applied in research and adopted in intensive care unit scoring systems. Controversy exists on the reliability of the GCS. The aim of this systematic review was to summarize evidence on the reliability of the GCS. A literature search was undertaken in MEDLINE, EMBASE and CINAHL. Observational studies that assessed the reliability of the GCS, expressed by a statistical measure, were included. Methodological quality was evaluated with the consensus-based standards for the selection of health measurement instruments checklist and its influence on results considered. Reliability estimates were synthesized narratively. We identified 52 relevant studies that showed significant heterogeneity in the type of reliability estimates used, patients studied, setting and characteristics of observers. Methodological quality was good (n = 7), fair (n = 18) or poor (n = 27). In good quality studies, kappa values were ≥0.6 in 85%, and all intraclass correlation coefficients indicated excellent reliability. Poor quality studies showed lower reliability estimates. Reliability for the GCS components was higher than for the sum score. Factors that may influence reliability include education and training, the level of consciousness and type of stimuli used. Only 13% of studies were of good quality and inconsistency in reported reliability estimates was found. Although the reliability was adequate in good quality studies, further improvement is desirable. From a methodological perspective, the quality of reliability studies needs to be improved. From a clinical perspective, a renewed focus on training/education and standardization of assessment is required.
Field reliability of competency and sanity opinions: A systematic review and meta-analysis.

PubMed

Guarnera, Lucy A; Murrie, Daniel C

2017-06-01

We know surprisingly little about the interrater reliability of forensic psychological opinions, even though courts and other authorities have long called for known error rates for scientific procedures admitted as courtroom testimony. This is particularly true for opinions produced during routine practice in the field, even for some of the most common types of forensic evaluations-evaluations of adjudicative competency and legal sanity. To address this gap, we used meta-analytic procedures and study space methodology to systematically review studies that examined the interrater reliability-particularly the field reliability-of competency and sanity opinions. Of 59 identified studies, 9 addressed the field reliability of competency opinions and 8 addressed the field reliability of sanity opinions. These studies presented a wide range of reliability estimates; pairwise percentage agreements ranged from 57% to 100% and kappas ranged from .28 to 1.0. Meta-analytic combinations of reliability estimates obtained by independent evaluators returned estimates of κ = .49 (95% CI: .40-.58) for competency opinions and κ = .41 (95% CI: .29-.53) for sanity opinions. This wide range of reliability estimates underscores the extent to which different evaluation contexts tend to produce different reliability rates. Unfortunately, our study space analysis illustrates that available field reliability studies typically provide little information about contextual variables crucial to understanding their findings. Given these concerns, we offer suggestions for improving research on the field reliability of competency and sanity opinions, as well as suggestions for improving reliability rates themselves. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Test Assembly Implications for Providing Reliable and Valid Subscores

ERIC Educational Resources Information Center

Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J.

2017-01-01

This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Test Reliability at the Individual Level

PubMed Central

Hu, Yueqin; Nesselroade, John R.; Erbacher, Monica K.; Boker, Steven M.; Burt, S. Alexandra; Keel, Pamela K.; Neale, Michael C.; Sisk, Cheryl L.; Klump, Kelly

2016-01-01

Reliability has a long history as one of the key psychometric properties of a test. However, a given test might not measure people equally reliably. Test scores from some individuals may have considerably greater error than others. This study proposed two approaches using intraindividual variation to estimate test reliability for each person. A simulation study suggested that the parallel tests approach and the structural equation modeling approach recovered the simulated reliability coefficients. Then in an empirical study, where forty-five females were measured daily on the Positive and Negative Affect Schedule (PANAS) for 45 consecutive days, separate estimates of reliability were generated for each person. Results showed that reliability estimates of the PANAS varied substantially from person to person. The methods provided in this article apply to tests measuring changeable attributes and require repeated measures across time on each individual. This article also provides a set of parallel forms of PANAS. PMID:28936107
Measuring leader perceptions of school readiness for reforms: use of an iterative model combining classical and Rasch methods.

PubMed

Chatterji, Madhabi

2002-01-01

This study examines validity of data generated by the School Readiness for Reforms: Leader Questionnaire (SRR-LQ) using an iterative procedure that combines classical and Rasch rating scale analysis. Following content-validation and pilot-testing, principal axis factor extraction and promax rotation of factors yielded a five factor structure consistent with the content-validated subscales of the original instrument. Factors were identified based on inspection of pattern and structure coefficients. The rotated factor pattern, inter-factor correlations, convergent validity coefficients, and Cronbach's alpha reliability estimates supported the hypothesized construct properties. To further examine unidimensionality and efficacy of the rating scale structures, item-level data from each factor-defined subscale were subjected to analysis with the Rasch rating scale model. Data-to-model fit statistics and separation reliability for items and persons met acceptable criteria. Rating scale results suggested consistency of expected and observed step difficulties in rating categories, and correspondence of step calibrations with increases in the underlying variables. The combined approach yielded more comprehensive diagnostic information on the quality of the five SRR-LQ subscales; further research is continuing.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Vanderwiel, Scott A; Wilson, Alyson G; Graves, Todd L

Both the U.S. Department of Defense (DoD) and Department of Energy (DOE) maintain weapons stockpiles: items like bullets, missiles and bombs that have already been produced and are being stored until needed. Ideally, these stockpiles maintain high reliability over time. To assess reliability, a surveillance program is implemented, where units are periodically removed from the stockpile and tested. The most definitive tests typically destroy the weapons so a given unit is tested only once. Surveillance managers need to decide how many units should be tested, how often they should be tested, what tests should be done, and how the resulting data are used to estimate the stockpile's current and future reliability. These issues are particularly critical from a planning perspective: given what has already been observed and our understanding of the mechanisms of stockpile aging, what is an appropriate and cost-effective surveillance program? Surveillance programs are costly, broad, and deep, especially in the DOE, where the US nuclear weapons surveillance program must 'ensure, through various tests, that the reliability of nuclear weapons is maintained' in the absence of full-system testing (General Accounting Office, 1996). The DOE program consists primarily of three types of tests: nonnuclear flight tests, that involve the actual dropping or launching of a weapon from which the nuclear components have been removed; and nonnuclear and nuclear systems laboratory tests, which detect defects due to aging, manufacturing, and design of the nonnuclear and nuclear portions of the weapons. Fully integrated analysis of the suite of nuclear weapons surveillance data is an ongoing area of research (Wilson et al., 2007). This paper introduces a simple model that captures high-level features of stockpile reliability over time and can be used to answer broad policy questions about surveillance programs.
Our intention is to provide a framework that generates tractable answers that integrate expert knowledge and high-level summaries of surveillance data to allow decision-making about appropriate trade-offs between the cost of data and the precision of stockpile reliability estimates.
A longitudinal examination of event-related potentials sensitive to monetary reward and loss feedback from late childhood to middle adolescence.

PubMed

Kujawa, Autumn; Carroll, Ashley; Mumper, Emma; Mukherjee, Dahlia; Kessel, Ellen M; Olino, Thomas; Hajcak, Greg; Klein, Daniel N

2017-11-04

Brain regions involved in reward processing undergo developmental changes from childhood to adolescence, and alterations in reward-related brain function are thought to contribute to the development of psychopathology. Event-related potentials (ERPs), such as the reward positivity (RewP) component, are valid measures of reward responsiveness that are easily assessed across development and provide insight into temporal dynamics of reward processing. Little work has systematically examined developmental changes in ERPs sensitive to reward. In this longitudinal study of 75 youth assessed 3 times across 6 years, we used principal components analyses (PCA) to differentiate ERPs sensitive to monetary reward and loss feedback in late childhood, early adolescence, and middle adolescence. We then tested reliability of, and developmental changes in, ERPs. A greater number of ERP components differentiated reward and loss feedback in late childhood compared to adolescence, but components in childhood accounted for only a small proportion of variance. A component consistent with RewP was the only one to consistently emerge at each of the 3 assessments. RewP demonstrated acceptable reliability, particularly from early to middle adolescence, though reliability estimates varied depending on scoring approach and developmental period. The magnitude of the RewP component did not significantly change across time. Results provide insight into developmental changes in the structure of ERPs sensitive to reward, and indicate that RewP is a consistently observed and relatively stable measure of reward responsiveness, particularly across adolescence. Copyright © 2017. Published by Elsevier B.V.
Accuracy of genomic predictions in Gyr (Bos indicus) dairy cattle.

PubMed

Boison, S A; Utsunomiya, A T H; Santos, D J A; Neves, H H R; Carvalheiro, R; Mészáros, G; Utsunomiya, Y T; do Carmo, A S; Verneque, R S; Machado, M A; Panetto, J C C; Garcia, J F; Sölkner, J; da Silva, M V G B

2017-07-01

Genomic selection may accelerate genetic progress in breeding programs of indicine breeds when compared with traditional selection methods. We present results of genomic predictions in Gyr (Bos indicus) dairy cattle of Brazil for milk yield (MY), fat yield (FY), protein yield (PY), and age at first calving using information from bulls and cows. Four different single nucleotide polymorphism (SNP) chips were studied. Additionally, the effect of the use of imputed data on genomic prediction accuracy was studied. A total of 474 bulls and 1,688 cows were genotyped with the Illumina BovineHD (HD; San Diego, CA) and BovineSNP50 (50K) chip, respectively. Genotypes of cows were imputed to HD using FImpute v2.2. After quality check of data, 496,606 markers remained. The HD markers present on the GeneSeek SGGP-20Ki (15,727; Lincoln, NE), 50K (22,152), and GeneSeek GGP-75Ki (65,018) were subset and used to assess the effect of lower SNP density on accuracy of prediction. Deregressed breeding values were used as pseudophenotypes for model training. Data were split into reference and validation to mimic a forward prediction scheme. The reference population consisted of animals whose birth year was ≤2004 and consisted of either only bulls (TR1) or a combination of bulls and dams (TR2), whereas the validation set consisted of younger bulls (born after 2004). Genomic BLUP was used to estimate genomic breeding values (GEBV) and reliability of GEBV (R 2 PEV ) was based on the prediction error variance approach. Reliability of GEBV ranged from ∼0.46 (FY and PY) to 0.56 (MY) with TR1 and from 0.51 (PY) to 0.65 (MY) with TR2. When averaged across all traits, R 2 PEV were substantially higher (R 2 PEV of TR1 = 0.50 and TR2 = 0.57) compared with reliabilities of parent averages (0.35) computed from pedigree data and based on diagonals of the coefficient matrix (prediction error variance approach). 
Reliability was similar for all the 4 marker panels using either TR1 or TR2, except that the imputed HD cow data set led to an inflation of reliability. Reliability of GEBV could be increased by enlarging the limited bull reference population with cow information. A reduced panel of ∼15K markers resulted in reliabilities similar to using HD markers. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
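The abstract above bases reliability on the prediction error variance (PEV) but does not state the formula used. A minimal sketch, assuming the commonly used definition reliability = 1 - PEV / genetic variance (and ignoring inbreeding), might look like this. The function name, argument names, and example numbers are hypothetical and are not taken from the study.

```python
def reliability_from_pev(pev: float, genetic_variance: float) -> float:
    """Reliability of an estimated breeding value from its prediction error variance.

    Assumes reliability = 1 - PEV / sigma_g^2, a common textbook definition;
    the study's exact computation is not given in the abstract.
    """
    if genetic_variance <= 0:
        raise ValueError("genetic_variance must be positive")
    if pev < 0:
        raise ValueError("pev must be non-negative")
    return 1.0 - pev / genetic_variance


# Hypothetical values chosen only to illustrate the call.
example = reliability_from_pev(pev=0.44, genetic_variance=1.0)
print(f"reliability = {example:.2f}")
```

A smaller PEV relative to the genetic variance gives a reliability closer to 1, which is the direction of the gain reported when cows were added to the bull reference population.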
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?

PubMed Central

Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A.

2017-01-01

Study Objectives: To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test–retest reliability of sleep diary estimates of school night sleep across 12 weeks. Methods: Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test–retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Results: Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights was sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test–retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. Conclusion: We recommend at least five weekday nights of sleep diary entries to be made when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. PMID:28199718
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?

PubMed

Short, Michelle A; Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A

2017-03-01

To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test-retest reliability of sleep diary estimates of school night sleep across 12 weeks. Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test-retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights was sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test-retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. We recommend at least five weekday nights of sleep diary entries to be made when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved.
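The study above answers the "how many nights" question with intraclass correlation coefficients computed on subsets of nights. A related back-of-envelope calculation, which is not the authors' method, is the Spearman-Brown prophecy formula: given the reliability of a single night, it projects the reliability of an average over k nights, assuming nights behave as parallel measurements. The function names and the example values below are hypothetical.

```python
import math


def spearman_brown(single_reliability: float, k: float) -> float:
    """Projected reliability of the mean of k parallel measurements."""
    return k * single_reliability / (1.0 + (k - 1.0) * single_reliability)


def nights_needed(single_reliability: float, target: float) -> int:
    """Smallest whole number of nights whose average reaches the target reliability."""
    if not (0.0 < single_reliability < 1.0 and 0.0 < target < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    k = target * (1.0 - single_reliability) / (single_reliability * (1.0 - target))
    # Subtract a tiny tolerance so floating-point noise does not round 4.0000001 up to 5.
    return max(1, math.ceil(k - 1e-9))


# Hypothetical single-night reliability of 0.50 and a target of 0.75.
print(nights_needed(0.50, 0.75), spearman_brown(0.50, 3))
```

Because the formula treats every night as interchangeable, it cannot capture the weekday versus weekend differences that the study reports.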
Inter-Observer Agreement on Subjects' Race and Race-Informative Characteristics

PubMed Central

Edgar, Heather J. H.; Daneshvari, Shamsi; Harris, Edward F.; Kroth, Philip J.

2011-01-01

Health and socioeconomic disparities tend to be experienced along racial and ethnic lines, but investigators are not sure how individuals are assigned to groups, or how consistent this process is. To address these issues, 1,919 orthodontic patient records were examined by at least two observers who estimated each individual's race and the characteristics that influenced each estimate. Agreement regarding race is high for African and European Americans, but not as high for Asian, Hispanic, and Native Americans. The indicator observers most often agreed upon as important in estimating group membership is name, especially for Asian and Hispanic Americans. The observers, who were almost all European American, most often agreed that skin color is an important indicator of race only when they also agreed the subject was European American. This suggests that in a diverse community, light skin color is associated with a particular group, while a range of darker shades can be associated with members of any other group. This research supports comparable studies showing that race estimations in medical records are likely reliable for African and European Americans, but are less so for other groups. Further, these results show that skin color is not consistently the primary indicator of an individual's race, but that other characteristics such as facial features add significant information. PMID:21897865
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sun, Li-Min, E-mail: limin.sun@yahoo.com; Huang, Chih-Jen; Faculty of Medicine, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung City, Taiwan

Acute skin reaction during adjuvant radiotherapy for breast cancer is an inevitable process, and its severity is related to the skin dose. A high–skin dose area can be speculated based on the isodose distribution shown on a treatment planning. To determine whether treatment planning can reflect high–skin dose location, 80 patients were collected and their skin doses in different areas were measured using a thermoluminescent dosimeter to locate the highest–skin dose area in each patient. We determined whether the skin dose is consistent with the highest-dose area estimated by the treatment planning of the same patient. The χ² and Fisher exact tests revealed that these 2 methods yielded more consistent results when the highest-dose spots were located in the axillary and breast areas but not in the inframammary area. We suggest that skin doses shown on the treatment planning might be a reliable and simple alternative method for estimating the highest skin doses in some areas.
Validation of the Weight Concerns Scale Applied to Brazilian University Students.

PubMed

Dias, Juliana Chioda Ribeiro; da Silva, Wanderson Roberto; Maroco, João; Campos, Juliana Alvares Duarte Bonini

2015-06-01

The aim of this study was to evaluate the validity and reliability of the Portuguese version of the Weight Concerns Scale (WCS) when applied to Brazilian university students. The scale was completed by 1084 university students from Brazilian public education institutions. A confirmatory factor analysis was conducted. The stability of the model in independent samples was assessed through multigroup analysis, and the invariance was estimated. Convergent, concurrent, divergent, and criterion validities as well as internal consistency were estimated. Results indicated that the one-factor model presented an adequate fit to the sample and values of convergent validity. The concurrent validity with the Body Shape Questionnaire and divergent validity with the Maslach Burnout Inventory for Students were adequate. Internal consistency was adequate, and the factorial structure was invariant in independent subsamples. The results present a simple and short instrument capable of precisely and accurately assessing concerns with weight among Brazilian university students. Copyright © 2015 Elsevier Ltd. All rights reserved.
Individual Differences in Base Rate Neglect: A Fuzzy Processing Preference Index

PubMed Central

Wolfe, Christopher R.; Fisher, Christopher R.

2013-01-01

Little is known about individual differences in integrating numeric base-rates and qualitative text in making probability judgments. Fuzzy-Trace Theory predicts a preference for fuzzy processing. We conducted six studies to develop the FPPI, a reliable and valid instrument assessing individual differences in this fuzzy processing preference. It consists of 19 probability estimation items plus 4 "M-Scale" items that distinguish simple pattern matching from "base rate respect." Cronbach's Alpha was consistently above 0.90. Validity is suggested by significant correlations between FPPI scores and three other measures: "Rule Based" Process Dissociation Procedure scores; the number of conjunction fallacies in joint probability estimation; and logic index scores on syllogistic reasoning. Replicating norms collected in a university study with a web-based study produced negligible differences in FPPI scores, indicating robustness. The predicted relationships between individual differences in base rate respect and both conjunction fallacies and syllogistic reasoning were partially replicated in two web-based studies. PMID:23935255
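Several records in this listing report Cronbach's alpha as the internal consistency estimate. As a reminder of what that number summarizes, here is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy data are hypothetical and unrelated to any study listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Toy data: three respondents, two items that agree perfectly.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

Alpha rises when items covary strongly relative to their individual variances, and it also tends to rise with the number of items, which is one reason benchmark values should be read in light of scale length.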
The reliability of the Adelaide in-shoe foot model.

PubMed

Bishop, Chris; Hillier, Susan; Thewlis, Dominic

2017-07-01

Understanding the biomechanics of the foot is essential for many areas of research and clinical practice such as orthotic interventions and footwear development. Despite the widespread attention paid to the biomechanics of the foot during gait, what largely remains unknown is how the foot moves inside the shoe. This study investigated the reliability of the Adelaide In-Shoe Foot Model, which was designed to quantify in-shoe foot kinematics and kinetics during walking. Intra-rater reliability was assessed in 30 participants over five walking trials whilst wearing shoes during two data collection sessions, separated by one week. Sufficient reliability for use was interpreted as a coefficient of multiple correlation and intra-class correlation coefficient of >0.61. Inter-rater reliability was investigated separately in a second sample of 10 adults by two researchers with experience in applying markers for the purpose of motion analysis. The results indicated good consistency in waveform estimation for most kinematic and kinetic data, as well as good inter- and intra-rater reliability. The exceptions were the peak medial ground reaction force, the minimum abduction angle, and the peak abduction/adduction external hindfoot joint moments, which resulted in less than acceptable repeatability. Based on our results, the Adelaide in-shoe foot model can be used with confidence for 24 commonly measured biomechanical variables during shod walking. Copyright © 2017 Elsevier B.V. All rights reserved.
A Meta-Analysis of Reliability Coefficients in Second Language Research

ERIC Educational Resources Information Center

Plonsky, Luke; Derrick, Deirdre J.

2016-01-01

Ensuring internal validity in quantitative research requires, among other conditions, reliable instrumentation. Unfortunately, however, second language (L2) researchers often fail to report and even more often fail to interpret reliability estimates beyond generic benchmarks for acceptability. As a means to guide interpretations of such estimates,…
Reliability Estimates for Undergraduate Grade Point Average

ERIC Educational Resources Information Center

Westrick, Paul A.

2017-01-01

Undergraduate grade point average (GPA) is a commonly employed measure in educational research, serving as a criterion or as a predictor depending on the research question. Over the decades, researchers have used a variety of reliability coefficients to estimate the reliability of undergraduate GPA, which suggests that there has been no consensus…
Reliability of Test Scores in Nonparametric Item Response Theory.

ERIC Educational Resources Information Center

Sijtsma, Klaas; Molenaar, Ivo W.

1987-01-01

Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)
IRT-Estimated Reliability for Tests Containing Mixed Item Formats

ERIC Educational Resources Information Center

Shu, Lianghua; Schwarz, Richard D.

2014-01-01

As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling

ERIC Educational Resources Information Center

Raykov, Tenko; Marcoulides, George A.

2012-01-01

A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…

Psychometric Properties of the Obsessive-Compulsive Inventory-Child Version (OCI-CV) in Chilean Children and Adolescents

PubMed Central

Martínez-González, Agustín E.; Rodríguez-Jiménez, Tíscar; Piqueras, José A.; Vera-Villarroel, Pablo; Godoy, Antonio

2015-01-01

In recent years, there has been a considerable increase in the development of assessment tools for obsessive-compulsive symptomatology in children and adolescents. The Obsessive Compulsive Inventory-Child Version (OCI-CV) is a well-established self-report assessment, of particular interest for assessing dimensions of Obsessive Compulsive Disorder (OCD). This instrument has been shown to be useful for clinical and non-clinical populations in two languages (English and European Spanish). Thus, the aim of this study was to analyze the psychometric properties of the OCI-CV in a Chilean community sample. The sample consisted of 816 children and adolescents with a mean age of 14.54 years (SD = 2.21; range = 10–18 years). Factor structure, internal consistency, test-retest reliability, convergent/divergent validity, and gender/age differences were examined. Confirmatory factor analysis showed a 6-factor structure (Doubting/Checking, Obsessing, Hoarding, Washing, Ordering, and Neutralizing) with one second-order factor. Good estimates of reliability (including internal consistency and test-retest), evidence supporting the validity, and small age and gender differences (higher levels of OCD symptomatology among older participants and women, respectively) were found. The OCI-CV is also an adequate scale for the assessment of obsessions and compulsions in a general population of Chilean children and adolescents. PMID:26317404
REVERBERATION AND PHOTOIONIZATION ESTIMATES OF THE BROAD-LINE REGION RADIUS IN LOW-z QUASARS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Negrete, C. Alenka; Dultzin, Deborah; Marziani, Paola

2013-07-01

Black hole mass estimation in quasars, especially at high redshift, involves the use of single-epoch spectra with signal-to-noise ratio and resolution that permit accurate measurement of the width of a broad line assumed to be a reliable virial estimator. Coupled with an estimate of the radius of the broad-line region (BLR) this yields the black hole mass M_BH. The radius of the BLR may be inferred from an extrapolation of the correlation between source luminosity and reverberation-derived r_BLR measures (the so-called Kaspi relation involving about 60 low-z sources). We are exploring a different method for estimating r_BLR directly from inferred physical conditions in the BLR of each source. We report here on a comparison of r_BLR estimates that come from our method and from reverberation mapping. Our "photoionization" method employs diagnostic line intensity ratios in the rest-frame range 1400-2000 Å (Al III λ1860/Si III] λ1892, C IV λ1549/Al III λ1860) that enable derivation of the product of density and ionization parameter with the BLR distance derived from the definition of the ionization parameter. We find good agreement between our estimates of the density, ionization parameter, and r_BLR and those from reverberation mapping. We suggest empirical corrections to improve the agreement between individual photoionization-derived r_BLR values and those obtained from reverberation mapping. The results in this paper can be exploited to estimate M_BH for large samples of high-z quasars using an appropriate virial broadening estimator. We show that the widths of the UV intermediate emission lines are consistent with the width of Hβ, thereby providing a reliable virial broadening estimator that can be measured in large samples of high-z quasars.
A Computer-Adaptive Disability Instrument for Lower Extremity Osteoarthritis Research Demonstrated Promising Breadth, Precision and Reliability

PubMed Central

Jette, Alan M.; McDonough, Christine M.; Haley, Stephen M.; Ni, Pengsheng; Olarsch, Sippy; Latham, Nancy; Hambleton, Ronald K.; Felson, David; Kim, Young-jo; Hunter, David

2012-01-01

Objective: To develop and evaluate a prototype measure (OA-DISABILITY-CAT) for osteoarthritis research using Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies. Study Design and Setting: We constructed an item bank consisting of 33 activities commonly affected by lower extremity (LE) osteoarthritis. A sample of 323 adults with LE osteoarthritis reported their degree of limitation in performing everyday activities and completed the Health Assessment Questionnaire-II (HAQ-II). We used confirmatory factor analyses to assess scale unidimensionality and IRT methods to calibrate the items and examine the fit of the data. Using CAT simulation analyses, we examined the performance of OA-DISABILITY-CATs of different lengths compared to the full item bank and the HAQ-II. Results: One distinct disability domain was identified. The 10-item OA-DISABILITY-CAT demonstrated a high degree of accuracy compared with the full item bank (r=0.99). The item bank and the HAQ-II scales covered a similar estimated scoring range. In terms of reliability, 95% of OA-DISABILITY reliability estimates were over 0.83 versus 0.60 for the HAQ-II. Except at the highest scores, the 10-item OA-DISABILITY-CAT demonstrated superior precision to the HAQ-II. Conclusion: The prototype OA-DISABILITY-CAT demonstrated promising measurement properties compared to the HAQ-II, and is recommended for use in LE osteoarthritis research. PMID:19216052
Utility of the Neuropsychiatric Inventory Questionnaire (NPI-Q) in the assessment of a sample of patients with Alzheimer's disease in Chile.

PubMed

Musa, Gada; Henríquez, Fernando; Muñoz-Neira, Carlos; Delgado, Carolina; Lillo, Patricia; Slachevsky, Andrea

2017-01-01

The Neuropsychiatric Inventory Questionnaire (NPI-Q) is an informant-based instrument that measures the presence and severity of 12 Neuropsychiatric Symptoms (NPS) in patients with dementia, as well as informant distress. To measure the psychometric properties of the NPI-Q and the prevalence of NPS in patients with Alzheimer's disease (AD) in Chile. 53 patients with AD were assessed. Subjects were divided into two different groups: mild AD (n=26) and moderate AD (n=27). Convergent validity was estimated by correlating the outcomes of the NPI-Q with Neuropsychiatric Inventory (NPI) scores and with a global cognitive efficiency test (Addenbrooke's Cognitive Examination - Revised - ACE-R). Reliability of the NPI-Q was analysed by calculating its internal consistency. Prevalence of NPS was estimated with both the NPI and NPI-Q. Positive and significant correlations were observed between the NPI-Q, the NPI, and the ACE-R (r=0.730; p<0.01 and 0.315; p<0.05 respectively). The instrument displayed an adequate level of reliability (Cronbach's alpha=0.783). The most prevalent NPS were apathy/indifference (62.3%) and dysphoria/depression (58.5%). The NPI-Q exhibited acceptable validity and reliability indicators for patients with AD in Chile, indicating that it is a suitable instrument for the routine assessment of NPS in clinical practice.
Utility of the Neuropsychiatric Inventory Questionnaire (NPI-Q) in the assessment of a sample of patients with Alzheimer's disease in Chile

PubMed Central

Musa, Gada; Henríquez, Fernando; Muñoz-Neira, Carlos; Delgado, Carolina; Lillo, Patricia; Slachevsky, Andrea

2017-01-01

The Neuropsychiatric Inventory Questionnaire (NPI-Q) is an informant-based instrument that measures the presence and severity of 12 Neuropsychiatric Symptoms (NPS) in patients with dementia, as well as informant distress. Objective: To measure the psychometric properties of the NPI-Q and the prevalence of NPS in patients with Alzheimer's disease (AD) in Chile. Methods: 53 patients with AD were assessed. Subjects were divided into two different groups: mild AD (n=26) and moderate AD (n=27). Convergent validity was estimated by correlating the outcomes of the NPI-Q with Neuropsychiatric Inventory (NPI) scores and with a global cognitive efficiency test (Addenbrooke's Cognitive Examination - Revised - ACE-R). Reliability of the NPI-Q was analysed by calculating its internal consistency. Prevalence of NPS was estimated with both the NPI and NPI-Q. Results: Positive and significant correlations were observed between the NPI-Q, the NPI, and the ACE-R (r=0.730; p<0.01 and 0.315; p<0.05 respectively). The instrument displayed an adequate level of reliability (Cronbach's alpha=0.783). The most prevalent NPS were apathy/indifference (62.3%) and dysphoria/depression (58.5%). Conclusion: The NPI-Q exhibited acceptable validity and reliability indicators for patients with AD in Chile, indicating that it is a suitable instrument for the routine assessment of NPS in clinical practice. PMID:29213504
Reliability generalization study of the Yale-Brown Obsessive-Compulsive Scale for children and adolescents.

PubMed

López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Ma; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa

2015-01-01

The Yale-Brown Obsessive-Compulsive Scale for children and adolescents (CY-BOCS) is a frequently applied test to assess obsessive-compulsive symptoms. We conducted a reliability generalization meta-analysis on the CY-BOCS to estimate the average reliability, search for reliability moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the CY-BOCS scores. A total of 47 studies reporting a reliability coefficient with the data at hand were included in the meta-analysis. The results showed good reliability and a large variability associated to the standard deviation of total scores and sample size.
Bronchiolitis Score of Sant Joan de Déu: BROSJOD Score, validation and usefulness.

PubMed

Balaguer, Mònica; Alejandre, Carme; Vila, David; Esteban, Elisabeth; Carrasco, Josep L; Cambra, Francisco José; Jordan, Iolanda

2017-04-01

To validate the bronchiolitis score of Sant Joan de Déu (BROSJOD) and to examine the previously defined scoring cutoff. Prospective, observational study. BROSJOD scoring was done by two independent physicians (at admission, 24 and 48 hr). Internal consistency of the score was assessed using Cronbach's α. To determine inter-rater reliability, the concordance correlation coefficient estimated as an intraclass correlation coefficient (CCC) and limits of agreement estimated as the 90% total deviation index (TDI) were estimated. An expert opinion was used to classify patients according to clinical severity. A validity analysis was conducted comparing the 3-level classification score to that expert opinion. Volume under the surface (VUS), predictive values, and probability of correct classification (PCC) were measured to assess discriminant validity. About 112 patients were recruited, 62 of them (55.4%) males. Median age: 52.5 days (IQR: 32.75-115.25). The admission Cronbach's α was 0.77 (CI95%: 0.71; 0.82) and at 24 hr it was 0.65 (CI95%: 0.48; 0.7). The inter-rater reliability analysis was: CCC at admission 0.96 (95%CI 0.94-0.97), at 24 hr 0.77 (95%CI 0.65-0.86), and at 48 hr 0.94 (95%CI 0.94-0.97); TDI 90%: 1.6, 2.9, and 1.57, respectively. The discriminant validity at admission: VUS of 0.8 (95%CI 0.70-0.90), at 24 hr 0.92 (95%CI 0.85-0.99), and at 48 hr 0.93 (95%CI 0.87-0.99). The predictive values and PCC values were within 38-100% depending on the level of clinical severity. There is a high inter-rater reliability, showing the BROSJOD score to be reliable and valid, even when different observers apply it. Pediatr Pulmonol. 2017;52:533-539. © 2016 Wiley Periodicals, Inc.
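The abstract above reports a concordance correlation coefficient (CCC) that the authors estimated as an intraclass correlation coefficient. For orientation only, the sketch below uses a different and simpler estimator, Lin's moment-based CCC for two raters, which penalizes both poor correlation and systematic offset between raters. The function name and the toy scores are hypothetical.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two paired score lists.

    Uses 1/n (population) moments, as in Lin's original moment estimator.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("both raters gave one identical constant score")
    return 2.0 * cov_xy / denom


# Toy example: rater B always scores one point higher than rater A.
rater_a = [1, 2, 3]
rater_b = [2, 3, 4]
print(concordance_correlation(rater_a, rater_b))
```

In the toy example the two raters are perfectly correlated, yet the constant one-point offset pulls the coefficient below 1, which is the property that distinguishes concordance from plain correlation.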
Improved estimation of subject-level functional connectivity using full and partial correlation with empirical Bayes shrinkage.

PubMed

Mejia, Amanda F; Nebel, Mary Beth; Barber, Anita D; Choe, Ann S; Pekar, James J; Caffo, Brian S; Lindquist, Martin A

2018-05-15

Reliability of subject-level resting-state functional connectivity (FC) is determined in part by the statistical techniques employed in its estimation. Methods that pool information across subjects to inform estimation of subject-level effects (e.g., Bayesian approaches) have been shown to enhance reliability of subject-level FC. However, fully Bayesian approaches are computationally demanding, while empirical Bayesian approaches typically rely on using repeated measures to estimate the variance components in the model. Here, we avoid the need for repeated measures by proposing a novel measurement error model for FC describing the different sources of variance and error, which we use to perform empirical Bayes shrinkage of subject-level FC towards the group average. In addition, since the traditional intra-class correlation coefficient (ICC) is inappropriate for biased estimates, we propose a new reliability measure denoted the mean squared error intra-class correlation coefficient (ICC_MSE) to properly assess the reliability of the resulting (biased) estimates. We apply the proposed techniques to test-retest resting-state fMRI data on 461 subjects from the Human Connectome Project to estimate connectivity between 100 regions identified through independent components analysis (ICA). We consider both correlation and partial correlation as the measure of FC and assess the benefit of shrinkage for each measure, as well as the effects of scan duration. We find that shrinkage estimates of subject-level FC exhibit substantially greater reliability than traditional estimates across various scan durations, even for the most reliable connections and regardless of connectivity measure. Additionally, we find partial correlation reliability to be highly sensitive to the choice of penalty term, and to be generally worse than that of full correlations except for certain connections and a narrow range of penalty values. 
This suggests that the penalty needs to be chosen carefully when using partial correlations. Copyright © 2018. Published by Elsevier Inc.
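The shrinkage idea in the abstract above can be illustrated with a generic empirical Bayes estimator for a single connection. This is a simplified sketch, not the authors' measurement error model: it assumes the within-subject noise variance is already known, whereas the paper's contribution is estimating the variance components without repeated measures. All names and numbers are hypothetical.

```python
from statistics import mean, variance


def shrink_toward_group_mean(estimates, noise_variance):
    """Shrink noisy subject-level estimates toward the group average.

    weight = noise / (noise + signal), where signal is the between-subject
    variance estimated as max(0, observed variance - noise variance).
    """
    if len(estimates) < 2:
        raise ValueError("need at least two subjects")
    if noise_variance < 0:
        raise ValueError("noise_variance must be non-negative")
    group_mean = mean(estimates)
    signal_variance = max(0.0, variance(estimates) - noise_variance)
    denom = noise_variance + signal_variance
    if denom == 0.0:
        return list(estimates)  # no noise and no spread: nothing to shrink
    weight = noise_variance / denom
    return [weight * group_mean + (1.0 - weight) * value for value in estimates]


# Hypothetical connectivity estimates for three subjects.
print(shrink_toward_group_mean([0.2, 0.4, 0.6], noise_variance=0.01))
```

The noisier the subject-level estimates are relative to true between-subject differences, the closer the weight gets to 1 and the more each estimate is pulled toward the group mean.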
Predictors of validity and reliability of a physical activity record in adolescents

PubMed Central

2013-01-01

Background: Poor to moderate validity of self-reported physical activity instruments is commonly observed in young people in low- and middle-income countries. However, the reasons for such low validity have not been examined in detail. We tested the validity of a self-administered daily physical activity record in adolescents and assessed if personal characteristics or the convenience level of reporting physical activity modified the validity estimates. Methods: The study comprised a total of 302 adolescents from an urban and rural area in Ecuador. Validity was evaluated by comparing the record with accelerometer recordings for seven consecutive days. Test-retest reliability was examined by comparing registrations from two records administered three weeks apart. Time spent on sedentary (SED), low (LPA), moderate (MPA) and vigorous (VPA) intensity physical activity was estimated. Bland Altman plots were used to evaluate measurement agreement. We assessed if age, sex, urban or rural setting, anthropometry and convenience of completing the record explained differences in validity estimates using a linear mixed model. Results: Although the record provided higher estimates for SED and VPA and lower estimates for LPA and MPA compared to the accelerometer, it showed an overall fair measurement agreement for validity. There was modest reliability for assessing physical activity in each intensity level. Validity was associated with adolescents' personal characteristics: sex (SED: P = 0.007; LPA: P = 0.001; VPA: P = 0.009) and setting (LPA: P = 0.000; MPA: P = 0.047). Reliability was associated with the convenience of completing the physical activity record for LPA (low convenience: P = 0.014; high convenience: P = 0.045). Conclusions: The physical activity record provided acceptable estimates for reliability and validity on a group level. 
Sex and setting were associated with validity estimates, whereas convenience to fill out the record was associated with better reliability estimates for LPA. This tendency of improved reliability estimates for adolescents reporting higher convenience merits further consideration. PMID:24289296
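The study above uses Bland Altman plots to judge agreement between the activity record and the accelerometer. The numbers behind such a plot are the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. A minimal sketch follows; the function name and the minutes-per-day values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("inputs must be paired and contain at least two values")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical minutes per day from a self-report record and an accelerometer.
record = [10, 12, 14, 16]
accelerometer = [9, 12, 13, 18]
bias, lower, upper = bland_altman_limits(record, accelerometer)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

A bias near zero with wide limits, as in this toy data, would indicate acceptable agreement at the group level but poor agreement for individuals.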
The Relative Contribution of Interaural Time and Magnitude Cues to Dynamic Sound Localization

NASA Technical Reports Server (NTRS)

Wenzel, Elizabeth M.; Null, Cynthia H. (Technical Monitor)

1995-01-01

This paper presents preliminary data from a study examining the relative contribution of interaural time differences (ITDs) and interaural level differences (ILDs) to the localization of virtual sound sources both with and without head motion. The listeners' task was to estimate the apparent direction and distance of virtual sources (broadband noise) presented over headphones. Stimuli were synthesized from minimum phase representations of nonindividualized directional transfer functions; binaural magnitude spectra were derived from the minimum phase estimates and ITDs were represented as a pure delay. During dynamic conditions, listeners were encouraged to move their heads; the position of the listener's head was tracked and the stimuli were synthesized in real time using a Convolvotron to simulate a stationary external sound source. ILDs and ITDs were either correctly or incorrectly correlated with head motion: (1) both ILDs and ITDs correctly correlated, (2) ILDs correct, ITD fixed at 0 deg azimuth and 0 deg elevation, (3) ITDs correct, ILDs fixed at 0 deg, 0 deg. Similar conditions were run for static conditions except that none of the cues changed with head motion. The data indicated that, compared to static conditions, head movements helped listeners to resolve confusions primarily when ILDs were correctly correlated, although a smaller effect was also seen for correct ITDs. Together with the results for static conditions, the data suggest that localization tends to be dominated by the cue that is most reliable or consistent, when reliability is defined by consistency over time as well as across frequency bands.
The size of the irregular migrant population in the European Union – counting the uncountable?

PubMed

Vogel, Dita; Kovacheva, Vesela; Prescott, Hannah

2011-01-01

It is difficult to estimate the size of the irregular migrant population in a specific city or country, and even more difficult to arrive at estimates at the European level. A review of past attempts at European-level estimates reveals that they rely on rough and outdated rules-of-thumb. In this paper, we present our own European level estimates for 2002, 2005, and 2008. We aggregate country-specific information, aiming at approximate comparability by consistent use of minimum and maximum estimates and by adjusting for obvious differences in definition and timescale. While the aggregated estimates are not considered highly reliable, they do -- for the first time -- provide transparency. The provision of more systematic medium quality estimates is shown to be the most promising way for improvement. The presented estimate indicates a minimum of 1.9 million and a maximum of 3.8 million irregular foreign residents in the 27 member states of the European Union (2008). Unlike rules-of-thumb, the aggregated EU estimates indicate a decline in the number of irregular foreign residents between 2002 and 2008. This decline has been influenced by the EU enlargement and legalisation programmes.
Development and testing of a scale to assess physician attitudes about handheld computers with decision support.

PubMed

Ray, Midge N; Houston, Thomas K; Yu, Feliciano B; Menachemi, Nir; Maisiak, Richard S; Allison, Jeroan J; Berner, Eta S

2006-01-01

The authors developed and evaluated a rating scale, the Attitudes toward Handheld Decision Support Software Scale (H-DSS), to assess physician attitudes about handheld decision support systems. The authors conducted a prospective assessment of psychometric characteristics of the H-DSS including reliability, validity, and responsiveness. Participants were 82 Internal Medicine residents. A higher score on each of the 14 five-point Likert scale items reflected a more positive attitude about handheld DSS. The H-DSS score is the mean across the fourteen items. Attitudes toward the use of the handheld DSS were assessed prior to and six months after receiving the handheld device. Cronbach's Alpha was used to assess internal consistency reliability. Pearson correlations were used to estimate and detect significant associations between scale scores and other measures (validity). Paired sample t-tests were used to test for changes in the mean attitude scale score (responsiveness) and for differences between groups. Internal consistency reliability for the scale was alpha = 0.73. In testing validity, moderate correlations were noted between the attitude scale scores and self-reported Personal Digital Assistant (PDA) usage in the hospital (correlation coefficient = 0.55) and clinic (0.48), p < 0.05 for both. The scale was responsive, in that it detected the expected increase in scores between the two administrations (3.99 (s.d. = 0.35) vs. 4.08, (s.d. = 0.34), p < 0.005). The authors' evaluation showed that the H-DSS scale was reliable, valid, and responsive. The scale can be used to guide future handheld DSS development and implementation.
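As a minimal sketch of how an internal consistency coefficient such as the Cronbach's alpha reported above can be computed from an item-response matrix, consider the following. The function name and the toy responses are illustrative assumptions, not the authors' code or data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed scale score across respondents
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 3 respondents x 3 items that agree perfectly
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(toy))
```

Because alpha depends only on the ratio of item variances to total-score variance, scoring the scale as the item mean (as the H-DSS does) rather than the item sum gives the same coefficient.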
Accuracy and reliability testing of two methods to measure internal rotation of the glenohumeral joint.

PubMed

Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W

2014-09-01

We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
Reliability and Validity of Food Frequency Questionnaire and Nutrient Biomarkers in Elders With and Without Mild Cognitive Impairment

PubMed Central

Bowman, Gene L.; Shannon, Jackilen; Ho, Emily; Traber, Maret G.; Frei, Balz; Oken, Barry S.; Kaye, Jeffery A.; Quinn, Joseph F.

2010-01-01

Introduction There is great interest in nutritional strategies for the prevention of age-related cognitive decline, yet the best methods for nutritional assessment in populations at risk for dementia are still evolving. Our study objective was to test the reliability and validity of two common nutritional assessments (plasma nutrient biomarkers and Food Frequency Questionnaire) in people at risk for dementia. Methods Thirty-eight elders, half with amnestic Mild Cognitive Impairment and half with intact cognition, were recruited. Nutritional assessments were collected together at baseline and again at 1 month. Intraclass and Pearson correlation coefficients quantified reliability and validity. Results Twenty-six nutrients were examined and reliability was very good or better for 77% (20/26, ICC ≥ .75) of the plasma nutrient biomarkers and for 88% of the FFQ estimates. Twelve of the plasma nutrient estimates were as reliable as the commonly measured plasma cholesterol (ICC = .92). FFQ and plasma long-chain fatty acids (docosahexaenoic acid, r = .39; eicosapentaenoic acid, r = .39) and carotenoids (α-carotene, r = .49; lutein + zeaxanthin, r = .48; β-carotene, r = .43; β-cryptoxanthin, r = .41) were correlated, but no other FFQ estimates correlated with respective nutrient biomarkers. Correlations between FFQ and plasma fatty acids and carotenoids were significantly stronger after removing subjects with MCI. Conclusion The reliability and validity of plasma and FFQ nutrient estimates vary according to the nutrient of interest. Memory deficit attenuates FFQ estimate validity and inflates FFQ estimate reliability. Many plasma nutrient biomarkers have very good reliability over 1 month regardless of memory state. This method can circumvent sources of error seen in other less direct methods of nutritional assessment. PMID:20856100
Modelling heterogeneity variances in multiple treatment comparison meta-analysis – Are informative priors the better solution?

PubMed Central

2013-01-01

Background Multiple treatment comparison (MTC) meta-analyses are commonly modeled in a Bayesian framework, and weakly informative priors are typically preferred to mirror familiar data-driven frequentist approaches. Random-effects MTCs have commonly modeled heterogeneity under the assumption that the between-trial variances for all involved treatment comparisons are equal (i.e., the ‘common variance’ assumption). This approach ‘borrows strength’ for heterogeneity estimation across treatment comparisons, and thus adds valuable precision when data are sparse. The homogeneous variance assumption, however, is unrealistic and can severely bias variance estimates. Consequently, 95% credible intervals may not retain nominal coverage, and treatment rank probabilities may become distorted. Relaxing the homogeneous variance assumption may be equally problematic due to reduced precision. To regain good precision, moderately informative variance priors or additional mathematical assumptions may be necessary. Methods In this paper we describe four novel approaches to modeling heterogeneity variance - two novel model structures, and two approaches for use of moderately informative variance priors. We examine the relative performance of all approaches in two illustrative MTC data sets. We particularly compare between-study heterogeneity estimates and model fits, treatment effect estimates and 95% credible intervals, and treatment rank probabilities. Results In both data sets, use of moderately informative variance priors constructed from the pairwise meta-analysis data yielded the best model fit and narrower credible intervals. Imposing consistency equations on variance estimates, assuming variances to be exchangeable, or using empirically informed variance priors also yielded good model fits and narrow credible intervals. The homogeneous variance model yielded high precision at all times, but overall inadequate estimates of between-trial variances. Lastly, treatment rankings were similar among the novel approaches, but considerably different when compared with the homogeneous variance approach. Conclusions MTC models using a homogeneous variance structure appear to perform sub-optimally when between-trial variances vary between comparisons. Using informative variance priors, assuming exchangeability or imposing consistency between heterogeneity variances can all ensure sufficiently reliable and realistic heterogeneity estimation, and thus more reliable MTC inferences. All four approaches should be viable candidates for replacing or supplementing the conventional homogeneous variance MTC model, which is currently the most widely used in practice. PMID:23311298
Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chassin, David P.; Posse, Christian

2005-09-15

The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabasi-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using standard power engineering methods, and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.
Development of an instrument based on the protection motivation theory to measure factors influencing women's intention to first pap test practice.

PubMed

Hassani, Lale; Dehdari, Tahereh; Hajizadeh, Ebrahim; Shojaeizadeh, Davoud; Abedini, Mehrandokht; Nedjat, Saharnaz

2014-01-01

Given that there are many Iranian women who have never had a Pap smear, this study was designed to develop and validate a measurement tool based on the Protection Motivation Theory to assess factors influencing Iranian women's intention to undergo a first Pap test. In this psychometric research, to determine the Content Validity Index (CVI) and the Content Validity Ratio (CVR), a panel of experts (n=10) reviewed scale items. Reliability was estimated through the Intraclass Correlation Coefficient (n=30) and internal consistency (n=240). Also, factor analysis (exploratory and confirmatory) was performed on the data of the sample of women who had never had a Pap smear test (n=240). A 26-item questionnaire was developed. The CVI and CVR scores of the scale were 0.89 and 0.90, respectively. Exploratory factor analysis yielded a 26-item, seven-factor questionnaire (perceived vulnerability and severity, fear, response costs, response efficacy, self-efficacy, and protection motivation (or intention)) whose factors jointly accounted for 72.76% of the observed variance. Confirmatory factor analysis indicated a good fit for the data. Internal consistency (range 0.70-0.93) and test-retest reliability (range 0.72-0.96) of sub-scales were acceptable. This study showed that the designed instrument was a valid and reliable tool for measuring the factors influencing women's intention to undergo their first Pap test.
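The abstract above does not say how the Content Validity Ratio was computed. A common formulation is Lawshe's, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item essential and N is the panel size. The sketch below assumes that formulation; the function name and the example counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item; ranges from -1 (no expert) to +1 (all experts)."""
    if not 0 <= n_essential <= n_experts or n_experts == 0:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical item: 9 of 10 panelists rate it essential
print(content_validity_ratio(9, 10))
```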
Reliability of TMS phosphene threshold estimation: Toward a standardized protocol.

PubMed

Mazzi, Chiara; Savazzi, Silvia; Abrahamyan, Arman; Ruzzoli, Manuela

Phosphenes induced by transcranial magnetic stimulation (TMS) are a subjectively described visual phenomenon employed in basic and clinical research as an index of the excitability of retinotopically organized areas in the brain. Phosphene threshold estimation is a preliminary step in many TMS experiments in visual cognition for setting the appropriate level of TMS doses; however, the lack of a direct comparison of the available methods for phosphene threshold estimation leaves unresolved the reliability of those methods in setting TMS doses. The present work aims to fill this gap. We compared the most common methods for phosphene threshold calculation, namely the Method of Constant Stimuli (MOCS), the Modified Binary Search (MOBS) and the Rapid Estimation of Phosphene Threshold (REPT). In two experiments we tested the reliability of phosphene threshold (PT) estimation under each of the three methods, considering the day of administration, participants' expertise in phosphene perception and the sensitivity of each method to the initial values used for the threshold calculation. We found that MOCS and REPT have comparable reliability when estimating phosphene thresholds, while MOBS estimations appear less stable. Based on our results, researchers and clinicians can estimate phosphene threshold according to MOCS or REPT equally reliably, depending on their specific investigation goals. We suggest several important factors for consideration when calculating phosphene thresholds and describe strategies to adopt in experimental procedures. Copyright © 2017 Elsevier Inc. All rights reserved.
Heritability estimates on resting state fMRI data using ENIGMA analysis pipeline.

PubMed

Adhikari, Bhim M; Jahanshad, Neda; Shukla, Dinesh; Glahn, David C; Blangero, John; Reynolds, Richard C; Cox, Robert W; Fieremans, Els; Veraart, Jelle; Novikov, Dmitry S; Nichols, Thomas E; Hong, L Elliot; Thompson, Paul M; Kochunov, Peter

2018-01-01

Big data initiatives such as the Enhancing NeuroImaging Genetics through Meta-Analysis consortium (ENIGMA) combine data collected by independent studies worldwide to achieve more generalizable estimates of effect sizes and more reliable and reproducible outcomes. Such efforts require harmonized image analysis protocols to extract phenotypes consistently. This harmonization is particularly challenging for resting state fMRI due to the wide variability of acquisition protocols and scanner platforms; this leads to site-to-site variance in quality, resolution and temporal signal-to-noise ratio (tSNR). An effective harmonization should provide optimal measures for data of different qualities. We developed a multi-site rsfMRI analysis pipeline to allow research groups around the world to process rsfMRI scans in a harmonized way, to extract consistent and quantitative measurements of connectivity and to perform coordinated statistical tests. We used the single-modality ENIGMA rsfMRI preprocessing pipeline, based on model-free Marchenko-Pastur PCA-based denoising, to verify and replicate resting state network heritability estimates. We analyzed two independent cohorts, GOBS (Genetics of Brain Structure) and HCP (the Human Connectome Project), which collected data using conventional and connectomics-oriented fMRI protocols, respectively. We used seed-based connectivity and dual-regression approaches to show that the rsfMRI signal is consistently heritable across twenty major functional network measures. Heritability values of 20-40% were observed across both cohorts.

Estimation of signal-dependent noise level function in transform domain via a sparse recovery model.

PubMed

Yang, Jingyu; Gan, Ziqiao; Wu, Zhaoyang; Hou, Chunping

2015-05-01

This paper proposes a novel algorithm to estimate the noise level function (NLF) of signal-dependent noise (SDN) from a single image based on the sparse representation of NLFs. Noise level samples are estimated from the high-frequency discrete cosine transform (DCT) coefficients of nonlocal-grouped low-variation image patches. Then, an NLF recovery model based on the sparse representation of NLFs under a trained basis is constructed to recover NLF from the incomplete noise level samples. Confidence levels of the NLF samples are incorporated into the proposed model to promote reliable samples and weaken unreliable ones. We investigate the behavior of the estimation performance with respect to the block size, sampling rate, and confidence weighting. Simulation results on synthetic noisy images show that our method outperforms existing state-of-the-art schemes. The proposed method is evaluated on real noisy images captured by three types of commodity imaging devices, and shows consistently excellent SDN estimation performance. The estimated NLFs are incorporated into two well-known denoising schemes, nonlocal means and BM3D, and show significant improvements in denoising SDN-polluted images.
[Effect of speech estimation on social anxiety].

PubMed

Shirotsuki, Kentaro; Sasagawa, Satoko; Nomura, Shinobu

2009-02-01

This study investigates the effect of speech estimation on social anxiety to further understanding of this characteristic of Social Anxiety Disorder (SAD). In the first study, we developed the Speech Estimation Scale (SES) to assess negative estimation before giving a speech which has been reported to be the most fearful social situation in SAD. Undergraduate students (n = 306) completed a set of questionnaires, which consisted of the Short Fear of Negative Evaluation Scale (SFNE), the Social Interaction Anxiety Scale (SIAS), the Social Phobia Scale (SPS), and the SES. Exploratory factor analysis showed an adequate one-factor structure with eight items. Further analysis indicated that the SES had good reliability and validity. In the second study, undergraduate students (n = 315) completed the SFNE, SIAS, SPS, SES, and the Self-reported Depression Scale (SDS). The results of path analysis showed that fear of negative evaluation from others (FNE) predicted social anxiety, and speech estimation mediated the relationship between FNE and social anxiety. These results suggest that speech estimation might maintain SAD symptoms, and could be used as a specific target for cognitive intervention in SAD.
A novel technique for fetal heart rate estimation from Doppler ultrasound signal

PubMed Central

2011-01-01

Background The currently used fetal monitoring instrumentation that is based on the Doppler ultrasound technique provides the fetal heart rate (FHR) signal with limited accuracy. This is particularly noticeable as a significant decrease in a clinically important feature: the variability of the FHR signal. The aim of our work was to develop a novel, efficient technique for processing the ultrasound signal, which could estimate the cardiac cycle duration with accuracy comparable to direct electrocardiography. Methods We have proposed a new technique which provides the true beat-to-beat values of the FHR signal through multiple measurements of a given cardiac cycle in the ultrasound signal. The method consists of three steps: the dynamic adjustment of the autocorrelation window, the adaptive autocorrelation peak detection and the determination of beat-to-beat intervals. The estimated fetal heart rate values and the calculated indices describing the variability of FHR were compared to the reference data obtained from the direct fetal electrocardiogram, as well as to another method for FHR estimation. Results The results revealed that our method increases the accuracy in comparison to currently used fetal monitoring instrumentation, and thus enables the calculation of reliable parameters describing the variability of FHR. Relating these results to the other method for FHR estimation, we showed that in our approach a much lower number of measured cardiac cycles was rejected as being invalid. Conclusions The proposed method for fetal heart rate determination on a beat-to-beat basis offers a high accuracy of the heart interval measurement, enabling reliable quantitative assessment of the FHR variability while at the same time reducing the number of invalid cardiac cycle measurements. PMID:21999764
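The general idea behind autocorrelation-based period estimation, which underlies the method summarized above, can be sketched as follows. This is a simplified illustration with a fixed window and a plain maximum search; it does not reproduce the authors' dynamic window adjustment or adaptive peak detection. All function names, the lag range and the synthetic pulse train are assumptions.

```python
def autocorr_period(signal, min_lag, max_lag):
    """Return (lag, r): the lag in [min_lag, max_lag] with the highest
    normalized autocorrelation r of the mean-removed signal."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]
    energy = sum(v * v for v in x)
    if energy == 0:
        raise ValueError("constant signal has no periodicity to estimate")
    best_lag, best_r = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


def bpm_from_lag(lag, fs):
    """Convert a period in samples to beats per minute at sampling rate fs (Hz)."""
    return 60.0 * fs / lag


# Synthetic pulse train: one pulse every 50 samples, sampled at 100 Hz
fs = 100
pulses = [1.0 if i % 50 == 0 else 0.0 for i in range(500)]
lag, r = autocorr_period(pulses, 30, 70)
print(lag, bpm_from_lag(lag, fs))
```

Restricting the lag range to physiologically plausible periods keeps the search from locking onto a multiple of the true period.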
The Chinese version of the Outcome Expectations for Exercise scale: validation study.

PubMed

Lee, Ling-Ling; Chiu, Yu-Yun; Ho, Chin-Chih; Wu, Shu-Chen; Watson, Roger

2011-06-01

Estimates of the reliability and validity of the English nine-item Outcome Expectations for Exercise (OEE) scale have been tested and found to be valid for use in various settings, particularly among older people, with good internal consistency and validity. Data on the use of the OEE scale among older Chinese people living in the community and how cultural differences might affect the administration of the OEE scale are limited. To test the validity and reliability of the Chinese version of the Outcome Expectations for Exercise scale among older people. A cross-sectional validation study was designed to test the Chinese version of the OEE scale (OEE-C). Reliability was examined by testing both the internal consistency for the overall scale and the squared multiple correlation coefficient for the single item measure. The validity of the scale was tested on the basis of both a traditional psychometric test and a confirmatory factor analysis using structural equation modelling. The Mokken Scaling Procedure (MSP) was used to investigate if there were any hierarchical, cumulative sets of items in the measure. The OEE-C scale was tested in a group of older people in Taiwan (n=108, mean age=77.1). There was acceptable internal consistency (alpha=.85) and model fit in the scale. Evidence of the validity of the measure was demonstrated by the tests for criterion-related validity and construct validity. There was a statistically significant correlation between exercise outcome expectations and exercise self-efficacy (r=.34, p<.01). An analysis of the Mokken Scaling Procedure found that nine items of the scale were all retained in the analysis and the resulting scale was reliable and statistically significant (p=.0008). The results obtained in the present study provided acceptable levels of reliability and validity evidence for the Chinese Outcome Expectations for Exercise scale when used with older people in Taiwan. 
Future testing of the OEE-C scale needs to be carried out to see whether these results are generalisable to older Chinese people living in urban areas. Copyright © 2010 Elsevier Ltd. All rights reserved.
Motion Estimation and Compensation Strategies in Dynamic Computerized Tomography

NASA Astrophysics Data System (ADS)

Hahn, Bernadette N.

2017-12-01

A main challenge in computerized tomography consists in imaging moving objects. Temporal changes during the measuring process lead to inconsistent data sets, and applying standard reconstruction techniques causes motion artefacts which can severely impair reliable diagnostics. Therefore, novel reconstruction techniques are required which compensate for the dynamic behavior. This article builds on recent results from a microlocal analysis of the dynamic setting, which enable us to formulate efficient analytic motion compensation algorithms for contour extraction. Since these methods require information about the dynamic behavior, we further introduce a motion estimation approach which determines parameters of affine and certain non-affine deformations directly from measured motion-corrupted Radon data. Our methods are illustrated with numerical examples for both types of motion.
Single point estimation of phenytoin dosing: a reappraisal.

PubMed

Koup, J R; Gibaldi, M; Godolphin, W

1981-11-01

A previously proposed method for estimation of phenytoin dosing requirement using a single serum sample obtained 24 hours after intravenous loading dose (18 mg/Kg) has been re-evaluated. Using more realistic values for the volume of distribution of phenytoin (0.4 to 1.2 L/Kg), simulations indicate that the proposed method will fail to consistently predict dosage requirements. Additional simulations indicate that two samples obtained during the 24 hour interval following the iv loading dose could be used to more reliably predict phenytoin dose requirement. Because of the nonlinear relationship which exists between phenytoin dose administration rate (RO) and the mean steady state serum concentration (CSS), small errors in prediction of the required RO result in much larger errors in CSS.
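The last point can be illustrated numerically. The sketch below assumes the standard Michaelis-Menten description of phenytoin's saturable elimination, under which the steady-state concentration is CSS = Km * RO / (Vmax - RO); the abstract itself does not spell out the model. The parameter values are hypothetical and chosen only to show the amplification.

```python
def steady_state_conc(r0, vmax, km):
    """Steady-state concentration under Michaelis-Menten elimination.

    r0   : dose administration rate (same units as vmax, e.g. mg/day)
    vmax : maximum elimination rate
    km   : concentration at which elimination is half-maximal (e.g. mg/L)
    """
    if not 0 <= r0 < vmax:
        raise ValueError("steady state exists only for 0 <= r0 < vmax")
    return km * r0 / (vmax - r0)


# Hypothetical patient parameters (illustrative only)
vmax, km = 500.0, 4.0
base = steady_state_conc(300.0, vmax, km)    # intended dose rate
high = steady_state_conc(330.0, vmax, km)    # dose rate overestimated by 10%
print(base, high, (high - base) / base)
```

With these values a 10% error in RO produces a relative error in CSS of almost 30%, and the amplification grows as RO approaches Vmax.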
A new global 1-km dataset of percentage tree cover derived from remote sensing

USGS Publications Warehouse

DeFries, R.S.; Hansen, M.C.; Townshend, J.R.G.; Janetos, A.C.; Loveland, Thomas R.

2000-01-01

Accurate assessment of the spatial extent of forest cover is a crucial requirement for quantifying the sources and sinks of carbon from the terrestrial biosphere. In the more immediate context of the United Nations Framework Convention on Climate Change, implementation of the Kyoto Protocol calls for estimates of carbon stocks for a baseline year as well as for subsequent years. Data sources from country level statistics and other ground-based information are based on varying definitions of 'forest' and are consequently problematic for obtaining spatially and temporally consistent carbon stock estimates. By combining two datasets previously derived from the Advanced Very High Resolution Radiometer (AVHRR) at 1 km spatial resolution, we have generated a prototype global map depicting percentage tree cover and associated proportions of trees with different leaf longevity (evergreen and deciduous) and leaf type (broadleaf and needleleaf). The product is intended for use in terrestrial carbon cycle models, in conjunction with other spatial datasets such as climate and soil type, to obtain more consistent and reliable estimates of carbon stocks. The percentage tree cover dataset is available through the Global Land Cover Facility at the University of Maryland at http://glcf.umiacs.umd.edu.
A practical comparison of algorithms for the measurement of multiscale entropy in neural time series data.

PubMed

Kuntzelman, Karl; Jack Rhodes, L; Harrington, Lillian N; Miskovic, Vladimir

2018-06-01

There is a broad family of statistical methods for capturing time series regularity, with increasingly widespread adoption by the neuroscientific community. A common feature of these methods is that they permit investigators to quantify the entropy of brain signals - an index of unpredictability/complexity. Despite the proliferation of algorithms for computing entropy from neural time series data there is scant evidence concerning their relative stability and efficiency. Here we evaluated several different algorithmic implementations (sample, fuzzy, dispersion and permutation) of multiscale entropy in terms of their stability across sessions, internal consistency and computational speed, accuracy and precision using a combination of electroencephalogram (EEG) and synthetic 1/ƒ noise signals. Overall, we report fair to excellent internal consistency and longitudinal stability over a one-week period for the majority of entropy estimates, with several caveats. Computational timing estimates suggest distinct advantages for dispersion and permutation entropy over other entropy estimates. Considered alongside the psychometric evidence, we suggest several ways in which researchers can maximize computational resources (without sacrificing reliability), especially when working with high-density M/EEG data or multivoxel BOLD time series signals. Copyright © 2018 Elsevier Inc. All rights reserved.
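Of the algorithms named above, permutation entropy is the simplest to state, and a minimal sketch of it together with the usual coarse-graining step for multiscale analysis is given below. It is an illustration under common textbook definitions (ordinal patterns of a given order, Shannon entropy normalized by log(order!)), not the implementation evaluated in the paper. Function names and defaults are assumptions, and ties are broken by position.

```python
import math
from collections import Counter


def coarse_grain(series, scale):
    """Average non-overlapping windows of length `scale` (multiscale step)."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy in [0, 1]."""
    n_patterns = len(series) - (order - 1) * delay
    if n_patterns < 1:
        raise ValueError("series too short for this order and delay")
    counts = Counter()
    for i in range(n_patterns):
        window = [series[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices of the window sorted by value
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    probs = [c / n_patterns for c in counts.values()]
    h = sum(p * math.log(1.0 / p) for p in probs)
    return h / math.log(math.factorial(order))


# Entropy of a short example series at scales 1 and 2
x = [4, 7, 9, 10, 6, 11, 3]
print(permutation_entropy(x, order=2))
print(permutation_entropy(coarse_grain(x, 2), order=2))
```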
A Latent Class Approach to Estimating Test-Score Reliability

ERIC Educational Resources Information Center

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

2011-01-01

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
Estimating Ordinal Reliability for Likert-Type and Ordinal Item Response Data: A Conceptual, Empirical, and Practical Guide

ERIC Educational Resources Information Center

Gadermann, Anne M.; Guhn, Martin; Zumbo, Bruno D.

2012-01-01

This paper provides a conceptual, empirical, and practical guide for estimating ordinal reliability coefficients for ordinal item response data (also referred to as Likert, Likert-type, ordered categorical, or rating scale item responses). Conventionally, reliability coefficients, such as Cronbach's alpha, are calculated using a Pearson…
The VLT-FLAMES Tarantula Survey. XXVII. Physical parameters of B-type main-sequence binary systems in the Tarantula nebula

NASA Astrophysics Data System (ADS)

Garland, R.; Dufton, P. L.; Evans, C. J.; Crowther, P. A.; Howarth, I. D.; de Koter, A.; de Mink, S. E.; Grin, N. J.; Langer, N.; Lennon, D. J.; McEvoy, C. M.; Sana, H.; Schneider, F. R. N.; Símon Díaz, S.; Taylor, W. D.; Thompson, A.; Vink, J. S.

2017-07-01

A spectroscopic analysis has been undertaken for the B-type multiple systems (excluding those with supergiant primaries) in the VLT-FLAMES Tarantula Survey (VFTS). Projected rotational velocities, v_e sin i, for the primaries have been estimated using a Fourier Transform technique and confirmed by fitting rotationally broadened profiles. A subset of 33 systems with v_e sin i ≤ 80 km s⁻¹ have been analysed using a TLUSTY grid of model atmospheres to estimate stellar parameters and surface abundances for the primaries. The effects of a potential flux contribution from an unseen secondary have also been considered. For 20 targets it was possible to reliably estimate their effective temperatures (Teff) but for the other 13 objects it was only possible to provide a constraint of 20 000 ≤ Teff ≤ 26 000 K - the other parameters estimated for these targets will be consequently less reliable. The estimated stellar properties are compared with evolutionary models and are generally consistent with their membership of 30 Doradus, while the nature of the secondaries of 3 SB2 systems is discussed. A comparison with a sample of single stars with v_e sin i ≤ 80 km s⁻¹ obtained from the VFTS and analysed with the same techniques implies that the atmospheric parameters and nitrogen abundances of the two samples are similar. However, the binary sample may have a lack of primaries with significant nitrogen enhancements, which would be consistent with them having low rotational velocities and having effectively evolved as single stars without significant rotational mixing. This result, which may be actually a consequence of the limitations of the pathfinder investigation presented in this paper, should be considered as a motivation for spectroscopic abundance analysis of large samples of binary stars, with high quality observational data.
Based on observations collected at the European Organisation for Astronomical Research in the Southern Hemisphere under ESO programme 182.D-0222. Tables 6 and 7 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/603/A91
Measurement errors when estimating the vertical jump height with flight time using photocell devices: the example of Optojump.

PubMed

Attia, A; Dhahbi, W; Chaouachi, A; Padulo, J; Wong, D P; Chamari, K

2017-03-01

Common methods to estimate vertical jump height (VJH) are based on the measurements of flight time (FT) or vertical reaction force. This study aimed to assess the measurement errors when estimating the VJH with flight time using photocell devices in comparison with the gold standard jump height measured by a force plate (FP). The second purpose was to determine the intrinsic reliability of the Optojump photoelectric cells in estimating VJH. For this aim, 20 subjects (age: 22.50±1.24 years) performed maximal vertical jumps in three modalities in randomized order: the squat jump (SJ), counter-movement jump (CMJ), and CMJ with arm swing (CMJarm). Each trial was simultaneously recorded by the FP and Optojump devices. High intra-class correlation coefficients (ICCs) for validity (0.98-0.99) and low limits of agreement (less than 1.4 cm) were found; however, a systematic difference in jump height was consistently observed between the FT and double integration of force methods (-31% to -27%; p<0.001), with a large effect size (Cohen's d >1.2). Intra-session reliability of Optojump was excellent, with ICCs ranging from 0.98 to 0.99, low coefficients of variation (3.98%), and low standard errors of measurement (0.8 cm). It was concluded that there was a high correlation between the two methods to estimate the vertical jump height, but the FT method cannot replace the gold standard, due to the large systematic bias. Based on these results, equations for each of the three jump modalities are presented in order to obtain a better estimation of the jump height.
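The flight-time method described in this abstract rests on a standard projectile-motion relation: if take-off and landing occur at the same centre-of-mass height, the rise lasts half the flight time, so h = g·t²/8. The minimal Python sketch below illustrates that relation only. The function name and the value used for g are assumptions made for illustration, and the modality-specific correction equations mentioned in the abstract are not reproduced because the abstract does not give them.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed standard value)


def jump_height_from_flight_time(flight_time_s: float) -> float:
    """Return jump height in metres from flight time, using h = g * t^2 / 8.

    Assumes the body leaves and returns to the ground in the same posture,
    so that the ascent takes exactly half of the measured flight time.
    """
    if flight_time_s < 0:
        raise ValueError("flight time must be non-negative")
    return G * flight_time_s ** 2 / 8.0


if __name__ == "__main__":
    # Hypothetical flight times, not data from the study.
    for t in (0.40, 0.50, 0.60):
        height_cm = 100.0 * jump_height_from_flight_time(t)
        print(f"flight time {t:.2f} s -> estimated height {height_cm:.1f} cm")
```

Because the height depends on the square of the flight time, a small timing error at landing (for example from bent knees) translates into a proportionally larger height error, which is one plausible reason a flight-time estimate can differ systematically from a force-plate measurement.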
Measurement errors when estimating the vertical jump height with flight time using photocell devices: the example of Optojump

PubMed Central

Attia, A; Chaouachi, A; Padulo, J; Wong, DP; Chamari, K

2016-01-01

Common methods to estimate vertical jump height (VJH) are based on the measurements of flight time (FT) or vertical reaction force. This study aimed to assess the measurement errors when estimating the VJH with flight time using photocell devices in comparison with the gold standard jump height measured by a force plate (FP). The second purpose was to determine the intrinsic reliability of the Optojump photoelectric cells in estimating VJH. For this aim, 20 subjects (age: 22.50±1.24 years) performed maximal vertical jumps in three modalities in randomized order: the squat jump (SJ), counter-movement jump (CMJ), and CMJ with arm swing (CMJarm). Each trial was simultaneously recorded by the FP and Optojump devices. High intra-class correlation coefficients (ICCs) for validity (0.98-0.99) and low limits of agreement (less than 1.4 cm) were found; however, a systematic difference in jump height was consistently observed between the FT and double integration of force methods (-31% to -27%; p<0.001), with a large effect size (Cohen’s d>1.2). Intra-session reliability of Optojump was excellent, with ICCs ranging from 0.98 to 0.99, low coefficients of variation (3.98%), and low standard errors of measurement (0.8 cm). It was concluded that there was a high correlation between the two methods to estimate the vertical jump height, but the FT method cannot replace the gold standard, due to the large systematic bias. Based on these results, equations for each of the three jump modalities are presented in order to obtain a better estimation of the jump height. PMID:28416900
Estimating secular velocities from GPS data contaminated by postseismic motion at sites with limited pre-earthquake data

NASA Astrophysics Data System (ADS)

Murray, J. R.; Svarc, J. L.

2016-12-01

Constant secular velocities estimated from Global Positioning System (GPS)-derived position time series are a central input for modeling interseismic deformation in seismically active regions. Both postseismic motion and temporally correlated noise produce long-period signals that are difficult to separate from secular motion and can bias velocity estimates. For GPS sites installed post-earthquake it is especially challenging to uniquely estimate velocities and postseismic signals and to determine when the postseismic transient has decayed sufficiently to enable use of subsequent data for estimating secular rates. Within 60 km of the 2003 M6.5 San Simeon and 2004 M6 Parkfield earthquakes in California, 16 continuous GPS sites (group 1) were established prior to mid-2001, and 52 stations (group 2) were installed following the events. We use group 1 data to investigate how early in the post-earthquake time period one may reliably begin using group 2 data to estimate velocities. For each group 1 time series, we obtain eight velocity estimates using observation time windows with successively later start dates (2006 - 2013) and a parameterization that includes constant velocity, annual, and semi-annual terms but no postseismic decay. We compare these to velocities estimated using only pre-San Simeon data to find when the pre- and post-earthquake velocities match within uncertainties. To obtain realistic velocity uncertainties, for each time series we optimize a temporally correlated noise model consisting of white, flicker, random walk, and, in some cases, band-pass filtered noise contributions. Preliminary results suggest velocities can be reliably estimated using data from 2011 to the present. Ongoing work will assess velocity bias as a function of epicentral distance and length of post-earthquake time series as well as explore spatio-temporal filtering of detrended group 1 time series to provide empirical corrections for postseismic motion in group 2 time series.
Uncertainties in obtaining high reliability from stress-strength models

NASA Technical Reports Server (NTRS)

Neal, Donald M.; Matthews, William T.; Vangel, Mark G.

1992-01-01

There has been recent interest in determining high statistical reliability in risk assessment of aircraft components. The potential consequences of incorrectly assuming a particular statistical distribution for the stress or strength data used in obtaining the high reliability values are identified. The reliability is defined as the probability of the strength being greater than the stress over the range of stress values. This method is often referred to as the stress-strength model. A sensitivity analysis was performed involving a comparison of reliability results in order to evaluate the effects of assuming specific statistical distributions. Both known population distributions, and those that differed slightly from the known, were considered. Results showed substantial differences in reliability estimates even for almost nondetectable differences in the assumed distributions. These differences represent a potential problem in using the stress-strength model for high reliability computations, since in practice it is impossible to ever know the exact (population) distribution. An alternative reliability computation procedure is examined involving determination of a lower bound on the reliability values using extreme value distributions. This procedure reduces the possibility of obtaining nonconservative reliability estimates. Results indicated the method can provide conservative bounds when computing high reliability.
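The stress-strength definition in this abstract, reliability as the probability that strength exceeds stress, has a closed form when both quantities are assumed independent and normally distributed: R = Φ((μ_strength − μ_stress) / √(σ²_strength + σ²_stress)). The Python sketch below uses that assumption purely for illustration; the normal model, the function names, and the numbers are not taken from the report, which in fact warns about exactly this kind of distributional assumption.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def stress_strength_reliability(mu_strength: float, sd_strength: float,
                                mu_stress: float, sd_stress: float) -> float:
    """P(strength > stress) for independent normal strength and stress."""
    # The difference strength - stress is normal with this mean and spread.
    z = (mu_strength - mu_stress) / math.hypot(sd_strength, sd_stress)
    return normal_cdf(z)


if __name__ == "__main__":
    # Hypothetical values: a small change in the assumed strength spread
    # changes the failure probability (1 - R) by a large relative amount.
    for sd_strength in (6.0, 6.6):
        r = stress_strength_reliability(130.0, sd_strength, 100.0, 8.0)
        print(f"sd_strength={sd_strength}: R={r:.6f}, failure prob={1.0 - r:.2e}")
```

Comparing the failure probabilities printed for the two spreads gives a feel for the sensitivity the abstract describes: at high reliability the result is governed by the tails, where small modelling differences matter most.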
Predicting Cost/Reliability/Maintainability of Advanced General Aviation Avionics Equipment

NASA Technical Reports Server (NTRS)

Davis, M. R.; Kamins, M.; Mooz, W. E.

1978-01-01

A methodology is provided for assisting NASA in estimating the cost, reliability, and maintenance (CRM) requirements for general avionics equipment operating in the 1980s. Practical problems of predicting these factors are examined. The usefulness and shortcomings of different approaches for modeling cost and reliability estimates are discussed, together with special problems caused by the lack of historical data on the cost of maintaining general aviation avionics. Suggestions are offered on how NASA might proceed in assessing cost reliability CRM implications in the absence of reliable generalized predictive models.
Constraining uncertainties in water supply reliability in a tropical data scarce basin

NASA Astrophysics Data System (ADS)

Kaune, Alexander; Werner, Micha; Rodriguez, Erasmo; de Fraiture, Charlotte

2015-04-01

Assessing the water supply reliability in river basins is essential for adequate planning and development of irrigated agriculture and urban water systems. In many cases hydrological models are applied to determine the surface water availability in river basins. However, surface water availability and variability are often not appropriately quantified due to epistemic uncertainties, leading to water supply insecurity. The objective of this research is to determine the water supply reliability in order to support planning and development of irrigated agriculture in a tropical, data scarce environment. The approach proposed uses a simple hydrological model, but explicitly includes model parameter uncertainty. A transboundary river basin in the tropical region of Colombia and Venezuela with an area of approximately 2100 km² was selected as a case study. The Budyko hydrological framework was extended to consider climatological input variability and model parameter uncertainty, and through this the surface water reliability to satisfy the irrigation and urban demand was estimated. This provides a spatial estimate of the water supply reliability across the basin. For the middle basin the reliability was found to be less than 30% for most of the months when the water is extracted from an upstream source. Conversely, the monthly water supply reliability was high (r>98%) in the lower basin irrigation areas when water was withdrawn from a source located further downstream. Including model parameter uncertainty provides a complete estimate of the water supply reliability, but that estimate is influenced by the uncertainty in the model. Reducing the uncertainty in the model through improved data and perhaps improved model structure will improve the estimate of the water supply reliability, allowing better planning of irrigated agriculture and dependable water allocation decisions.
Cost Estimation of Software Development and the Implications for the Program Manager

DTIC Science & Technology

1992-06-01

Software Lifecycle Model (SLIM), the Jensen System-4 model, the Software Productivity, Quality, and Reliability Estimator (SPQR/20), the Constructive...function models in current use are the Software Productivity, Quality, and Reliability Estimator (SPQR/20) and the Software Architecture Sizing and...Estimator (SPQR/20) was developed by T. Capers Jones of Software Productivity Research, Inc., in 1985. The model is intended to estimate the outcome
Introducing English and German versions of the Adolescent Time Attitude Scale.

PubMed

Worrell, Frank C; Mello, Zena R; Buhl, Monika

2013-08-01

In this study, the authors report on the development of English and German versions of the Adolescent Time Attitude Scale (ATAS). The ATAS consists of six subscales assessing Past Positive, Past Negative, Present Positive, Present Negative, Future Positive, and Future Negative time attitudes. The authors describe the development of the scales and present data on the reliability and structural validity of ATAS scores in samples of American (N = 300) and German (N = 316) adolescents. Internal consistency estimates for scores on the English and German versions of the ATAS were in the .70 to .80 range. Confirmatory factor analyses indicated that a six-factor structure yielded the best fit for scores and that the scores were invariant across samples.
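Internal consistency estimates like those reported in this abstract are very often coefficient (Cronbach's) alpha, although the abstract does not say which coefficient was computed, so treating it as alpha here is an assumption. As a rough illustration, alpha for k items is k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and the data in the sketch are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical responses of four people to a two-item subscale.
    responses = [[1, 2], [2, 1], [3, 3], [4, 4]]
    print(f"alpha = {cronbach_alpha(responses):.3f}")
```

Alpha rises with the number of items and with the average inter-item covariance, which is why short subscales often show lower values than long ones even when the items are of similar quality.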
Subsonic flight test evaluation of a propulsion system parameter estimation process for the F100 engine

NASA Technical Reports Server (NTRS)

Orme, John S.; Gilyard, Glenn B.

1992-01-01

Integrated engine-airframe optimal control technology may significantly improve aircraft performance. This technology requires a reliable and accurate parameter estimator to predict unmeasured variables. To develop this technology base, NASA Dryden Flight Research Facility (Edwards, CA), McDonnell Aircraft Company (St. Louis, MO), and Pratt & Whitney (West Palm Beach, FL) have developed and flight-tested an adaptive performance seeking control system which optimizes the quasi-steady-state performance of the F-15 propulsion system. This paper presents flight and ground test evaluations of the propulsion system parameter estimation process used by the performance seeking control system. The estimator consists of a compact propulsion system model and an extended Kalman filter. The extended Kalman filter estimates five engine component deviation parameters from measured inputs. The compact model uses measurements and Kalman-filter estimates as inputs to predict unmeasured propulsion parameters such as net propulsive force and fan stall margin. The ability to track trends and estimate absolute values of propulsion system parameters was demonstrated. For example, thrust stand results show a good correlation, especially in trends, between the performance seeking control estimated and measured thrust.
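To give a feel for how a Kalman filter turns noisy measurements into a parameter estimate, the sketch below implements the simplest possible case: a linear, scalar filter tracking one nearly constant parameter. It is not the extended Kalman filter of the paper, which estimates five deviation parameters through a linearized engine model; all names, noise values, and measurements here are assumptions for illustration.

```python
def kalman_update(x, p, z, r, q=0.0):
    """One predict/update cycle of a scalar Kalman filter.

    x, p: current estimate and its variance
    z, r: new measurement and its noise variance
    q:    process noise variance (0 means the parameter is constant)
    """
    p_pred = p + q                 # predict: uncertainty grows by q
    gain = p_pred / (p_pred + r)   # Kalman gain, between 0 and 1
    x_new = x + gain * (z - x)     # correct the estimate with the innovation
    p_new = (1.0 - gain) * p_pred  # updated (reduced) variance
    return x_new, p_new


def estimate_constant(measurements, x0=0.0, p0=1e6, r=1.0, q=0.0):
    """Run the filter over a sequence of measurements of one parameter."""
    x, p = x0, p0
    for z in measurements:
        x, p = kalman_update(x, p, z, r, q)
    return x, p


if __name__ == "__main__":
    # Hypothetical noisy readings of a single deviation parameter.
    readings = [0.9, 1.2, 1.0, 1.1, 0.8]
    estimate, var = estimate_constant(readings)
    print(f"estimate={estimate:.3f}, variance={var:.3f}")
```

With a very large initial variance and no process noise, the filter output approaches the running mean of the measurements, which is a convenient sanity check before moving to multi-parameter or nonlinear versions.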

APPLICATION OF TRAVEL TIME RELIABILITY FOR PERFORMANCE ORIENTED OPERATIONAL PLANNING OF EXPRESSWAYS

NASA Astrophysics Data System (ADS)

Mehran, Babak; Nakamura, Hideki

Evaluation of impacts of congestion improvement schemes on travel time reliability is very significant for road authorities since travel time reliability represents operational performance of expressway segments. In this paper, a methodology is presented to estimate travel time reliability prior to implementation of congestion relief schemes based on travel time variation modeling as a function of demand, capacity, weather conditions and road accidents. For subject expressway segments, traffic conditions are modeled over a whole year considering demand and capacity as random variables. Patterns of demand and capacity are generated for each five minute interval by applying Monte-Carlo simulation technique, and accidents are randomly generated based on a model that links accident rate to traffic conditions. A whole year analysis is performed by comparing demand and available capacity for each scenario and queue length is estimated through shockwave analysis for each time interval. Travel times are estimated from refined speed-flow relationships developed for intercity expressways and buffer time index is estimated consequently as a measure of travel time reliability. For validation, estimated reliability indices are compared with measured values from empirical data, and it is shown that the proposed method is suitable for operational evaluation and planning purposes.
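The abstract names the buffer time index as its reliability measure without defining it. A commonly used definition, assumed here, is the extra time a traveller must budget beyond the average trip to arrive on time 95% of the time: (95th-percentile travel time − mean travel time) / mean travel time. The Python sketch below computes it with a nearest-rank percentile; the function names and sample data are hypothetical, and the paper's own computation may differ.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer form of ceil(pct * n / 100), avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def buffer_time_index(travel_times, pct=95):
    """(pct-th percentile travel time - mean travel time) / mean travel time."""
    average = sum(travel_times) / len(travel_times)
    return (percentile_nearest_rank(travel_times, pct) - average) / average


if __name__ == "__main__":
    # Hypothetical travel times in minutes for one segment and time of day.
    times = [12, 12, 13, 13, 13, 14, 14, 15, 18, 25]
    print(f"buffer time index = {buffer_time_index(times):.2f}")
```

In a simulation setting such as the one described, the list of travel times would come from the generated five-minute intervals rather than from observations.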
Consistency assessment of rating curve data in various locations using Bidirectional Reach (BReach)

NASA Astrophysics Data System (ADS)

Van Eerdenbrugh, Katrien; Van Hoey, Stijn; Coxon, Gemma; Freer, Jim; Verhoest, Niko E. C.

2017-10-01

When estimating discharges through rating curves, temporal data consistency is a critical issue. In this research, consistency in stage-discharge data is investigated using a methodology called Bidirectional Reach (BReach), which departs from a (in operational hydrology) commonly used definition of consistency. A period is considered to be consistent if no consecutive and systematic deviations from a current situation occur that exceed observational uncertainty. Therefore, the capability of a rating curve model to describe a subset of the (chronologically sorted) data is assessed in each observation by indicating the outermost data points for which the rating curve model behaves satisfactorily. These points are called the maximum left or right reach, depending on the direction of the investigation. This temporal reach should not be confused with a spatial reach (indicating a part of a river). Changes in these reaches throughout the data series indicate possible changes in data consistency and if not resolved could introduce additional errors and biases. In this research, various measurement stations in the UK, New Zealand and Belgium are selected based on their significant historical ratings information and their specific characteristics related to data consistency. For each country, regional information is maximally used to estimate observational uncertainty. Based on this uncertainty, a BReach analysis is performed and, subsequently, results are validated against available knowledge about the history and behavior of the site. For all investigated cases, the methodology provides results that appear to be consistent with this knowledge of historical changes and thus facilitates a reliable assessment of (in)consistent periods in stage-discharge measurements. 
This assessment is not only useful for the analysis and determination of discharge time series, but also to enhance applications based on these data (e.g., by informing hydrological and hydraulic model evaluation design about consistent time periods to analyze).
Approximation Model Building for Reliability & Maintainability Characteristics of Reusable Launch Vehicles

NASA Technical Reports Server (NTRS)

Unal, Resit; Morris, W. Douglas; White, Nancy H.; Lepsch, Roger A.; Brown, Richard W.

2000-01-01

This paper describes the development of parametric models for estimating operational reliability and maintainability (R&M) characteristics for reusable vehicle concepts, based on vehicle size and technology support level. An R&M analysis tool (RMAT) and response surface methods are utilized to build parametric approximation models for rapidly estimating operational R&M characteristics such as mission completion reliability. These models, which approximate RMAT, can then be utilized for fast analysis of operational requirements, for lifecycle cost estimating and for multidisciplinary design optimization.
Bayesian Estimation of Reliability Burr Type XII Under Al-Bayyatis’ Suggest Loss Function with Numerical Solution

NASA Astrophysics Data System (ADS)

Mohammed, Amal A.; Abraheem, Sudad K.; Fezaa Al-Obedy, Nadia J.

2018-05-01

This paper is concerned with the Burr type XII distribution. Maximum likelihood and Bayes methods of estimation are used to estimate the unknown scale parameter (α). Al-Bayyatis’ loss function and a suggested loss function are used to find the reliability with the least loss. The reliability function is then expanded in terms of a set of power functions. Matlab (ver. 9) is used for the computations, and some examples are given.
Competing risk models in reliability systems, a weibull distribution model with bayesian analysis approach

NASA Astrophysics Data System (ADS)

Iskandar, Ismed; Satria Gondokaryono, Yudi

2016-02-01

In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, failures may arise from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is that of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation allows us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we have investigated the application of Bayesian estimation to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed using Bayesian and maximum likelihood analyses. The simulation results show that changing the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. Given perfect information on the prior distribution, the Bayesian estimation methods are better than those of maximum likelihood. The sensitivity analyses show some amount of sensitivity to shifts of the prior locations. 
They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
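For competing risks with independent Weibull causes of failure, as considered in this abstract, the system survives to time t only if no cause has yet produced a failure, so the system reliability is the product of the per-cause reliabilities: R(t) = Π exp(−(t/η_i)^β_i). The short Python sketch below evaluates that product. The function names and parameter values are hypothetical, and it does not attempt the Bayesian or maximum likelihood estimation studied in the paper.

```python
import math


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """Survival probability at time t for one Weibull failure cause."""
    return math.exp(-((t / scale) ** shape))


def competing_risk_reliability(t: float, causes) -> float:
    """System reliability at time t under independent competing causes.

    causes: iterable of (shape, scale) pairs, one per failure cause.
    """
    reliability = 1.0
    for shape, scale in causes:
        reliability *= weibull_reliability(t, shape, scale)
    return reliability


if __name__ == "__main__":
    # Hypothetical causes: early-life failures (shape < 1) and wear-out (shape > 1).
    causes = [(0.8, 5000.0), (2.5, 1200.0)]
    for t in (100.0, 500.0, 1000.0):
        print(f"t={t:>6.0f}: R={competing_risk_reliability(t, causes):.4f}")
```

Because the per-cause hazards add under independence, the cause with the largest hazard at a given age dominates the system reliability at that age.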
A Consistent Definition of Phase Resetting Using Hilbert Transform.

PubMed

Oprisan, Sorinel A

2017-01-01

A phase resetting curve (PRC) measures the transient change in the phase of a neural oscillator subject to an external perturbation. The PRC encapsulates the dynamical response of a neural oscillator and, as a result, it is often used for predicting phase-locked modes in neural networks. While phase is a fundamental concept, it has multiple definitions that may lead to contradictory results. We used the Hilbert Transform (HT) to define the phase of the membrane potential oscillations and HT amplitude to estimate the PRC of a single neural oscillator. We found that HT's amplitude and its corresponding instantaneous frequency are very sensitive to membrane potential perturbations. We also found that the phase shift of HT amplitude between the pre- and poststimulus cycles gives an accurate estimate of the PRC. Moreover, HT phase does not suffer from the shortcomings of voltage threshold or isochrone methods and, as a result, gives accurate and reliable estimations of phase resetting.
A novel method to remotely measure food intake of free-living people in real-time

PubMed Central

Martin, Corby K.; Han, Hongmei; Coulon, Sandra M.; Allen, H. Raymond; Champagne, Catherine M.; Anton, Stephen D.

2008-01-01

The aim of this study was to report the first reliability and validity tests of the Remote Food Photography Method (RFPM), which consists of camera-enabled cell phones with data transfer capability. Participants take and transmit photographs of food selection and plate waste to researchers/clinicians for analysis. Following two pilot studies, adult participants (N=52, 20≤BMI≤35) were randomly assigned to the dine-in or take-out group. Energy intake (EI) was measured for three days. The dine-in group ate lunch and dinner in the laboratory. The take-out group ate lunch in the laboratory and dinner in free-living conditions (participants received a cooler with pre-weighed food that they returned the following morning). Energy intake was measured with the RFPM and by directly weighing foods. The RFPM was tested in laboratory and free-living conditions. Reliability was tested over three days and validity was tested by comparing directly weighed EI to EI estimated with the RFPM using Bland-Altman analysis. The RFPM produced reliable EI estimates over three days in laboratory (r=.62, p<.0001) and free-living (r=.68, p<.0001) conditions. Weighed EI correlated highly with EI estimated with the RFPM in laboratory and free-living conditions (r’s>.93, p<.0001). In two laboratory-based validity tests, the RFPM underestimated EI by -4.7% (p=.046) and -5.5% (p=.076). In free-living conditions, the RFPM underestimated EI by -6.6% (p=.017). Bias did not differ by body weight or age. The RFPM is a promising new method for accurately measuring the EI of free-living people. Error associated with the method is small compared to self-report methods. PMID:18616837
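The Bland-Altman analysis used in this study for validity compares two measurement methods through their paired differences: the mean difference estimates the bias, and the mean ± 1.96 standard deviations gives the 95% limits of agreement. A minimal Python sketch of that calculation follows; the function name and the energy intake values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (a - b); the limits are bias -/+ 1.96 sample
    standard deviations of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods must have the same number of measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Hypothetical energy intake (kcal): photo-based estimate vs. weighed food.
    estimated = [610, 540, 720, 480, 655]
    weighed = [640, 560, 750, 500, 700]
    bias, lower, upper = bland_altman(estimated, weighed)
    print(f"bias={bias:.1f} kcal, limits of agreement=({lower:.1f}, {upper:.1f})")
```

A negative bias in this layout means the first method underestimates relative to the second, which is the direction of error the abstract reports.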
A novel method to remotely measure food intake of free-living individuals in real time: the remote food photography method.

PubMed

Martin, Corby K; Han, Hongmei; Coulon, Sandra M; Allen, H Raymond; Champagne, Catherine M; Anton, Stephen D

2009-02-01

The aim of the present study was to report the first reliability and validity tests of the remote food photography method (RFPM), which consists of camera-enabled cell phones with data transfer capability. Participants take and transmit photographs of food selection and plate waste to researchers/clinicians for analysis. Following two pilot studies, adult participants (n 52; BMI 20-35 kg/m2 inclusive) were randomly assigned to the dine-in or take-out group. Energy intake (EI) was measured for 3 d. The dine-in group ate lunch and dinner in the laboratory. The take-out group ate lunch in the laboratory and dinner in free-living conditions (participants received a cooler with pre-weighed food that they returned the following morning). EI was measured with the RFPM and by directly weighing foods. The RFPM was tested in laboratory and free-living conditions. Reliability was tested over 3 d and validity was tested by comparing directly weighed EI to EI estimated with the RFPM using Bland-Altman analysis. The RFPM produced reliable EI estimates over 3 d in laboratory (r 0.62; P < 0.0001) and free-living (r 0.68; P < 0.0001) conditions. Weighed EI correlated highly with EI estimated with the RFPM in laboratory and free-living conditions (r>0.93; P < 0.0001). In two laboratory-based validity tests, the RFPM underestimated EI by - 4.7 % (P = 0.046) and - 5.5 % (P = 0.076). In free-living conditions, the RFPM underestimated EI by - 6.6 % (P = 0.017). Bias did not differ by body weight or age. The RFPM is a promising new method for accurately measuring the EI of free-living individuals. Error associated with the method is small compared with self-report methods.
Six-minute-walk test in idiopathic pulmonary fibrosis: test validation and minimal clinically important difference.

PubMed

du Bois, Roland M; Weycker, Derek; Albera, Carlo; Bradford, Williamson Z; Costabel, Ulrich; Kartashov, Alex; Lancaster, Lisa; Noble, Paul W; Sahn, Steven A; Szwarcberg, Javier; Thomeer, Michiel; Valeyre, Dominique; King, Talmadge E

2011-05-01

The 6-minute-walk test (6MWT) is a practical and clinically meaningful measure of exercise tolerance with favorable performance characteristics in various cardiac and pulmonary diseases. Performance characteristics in patients with idiopathic pulmonary fibrosis (IPF) have not been systematically evaluated. To assess the reliability, validity, and responsiveness of the 6MWT and estimate the minimal clinically important difference (MCID) in patients with IPF. The study population included all subjects completing a 6MWT in a clinical trial evaluating interferon gamma-1b (n = 822). Six-minute walk distance (6MWD) and other parameters were measured at baseline and at 24-week intervals using a standardized protocol. Parametric and distribution-independent correlation coefficients were used to assess the strength of the relationships between 6MWD and measures of pulmonary function, dyspnea, and health-related quality of life. Both distribution-based and anchor-based methods were used to estimate the MCID. Comparison of two proximal measures of 6MWD (mean interval, 24 d) demonstrated good reliability (coefficient = 0.83; P < 0.001). 6MWD was weakly correlated with measures of physiologic function and health-related quality of life; however, values were consistently and significantly lower for patients with the poorest functional status, suggesting good construct validity. Importantly, change in 6MWD was highly predictive of mortality; a 24-week decline of greater than 50 m was associated with a fourfold increase in risk of death at 1 year (hazard ratio, 4.27; 95% confidence interval, 2.57- 7.10; P < 0.001). The estimated MCID was 24-45 m. The 6MWT is a reliable, valid, and responsive measure of disease status and a valid endpoint for clinical trials in IPF.
Stature in archeological samples from central Italy: methodological issues and diachronic changes.

PubMed

Giannecchini, Monica; Moggi-Cecchi, Jacopo

2008-03-01

Stature reconstructions from skeletal remains are usually obtained through regression equations based on the relationship between height and limb bone length. Different equations have been employed to reconstruct stature in skeletal samples, but this is the first study to provide a systematic analysis of the reliability of the different methods for Italian historical samples. Aims of this article are: 1) to analyze the reliability of different regression methods to estimate stature for populations living in Central Italy from the Iron Age to Medieval times; 2) to search for trends in stature over this time period by applying the most reliable regression method. Long bone measurements were collected from 1,021 individuals (560 males, 461 females), from 66 archeological sites for males and 54 for females. Three time periods were identified: Iron Age, Roman period, and Medieval period. To determine the most appropriate equation to reconstruct stature the Delta parameter of Gini (Memorie di metodologia statistica. Milano: Giuffre A. 1939), in which stature estimates derived from different limb bones are compared, was employed. The equations proposed by Pearson (Philos Trans R Soc London 192 (1899) 169-244) and Trotter and Gleser for Afro-Americans (Am J Phys Anthropol 10 (1952) 463-514; Am J Phys Anthropol 47 (1977) 355-356) provided the most consistent estimates when applied to our sample. We then used the equation by Pearson for further analyses. Results indicate a reduction in stature in the transition from the Iron Age to the Roman period, and a subsequent increase in the transition from the Roman period to the Medieval period. Changes of limb lengths over time were more pronounced in the distal than in the proximal elements in both limbs. 2007 Wiley-Liss, Inc.
A Bayesian consistent dual ensemble Kalman filter for state-parameter estimation in subsurface hydrology

NASA Astrophysics Data System (ADS)

Ait-El-Fquih, Boujemaa; El Gharamti, Mohamad; Hoteit, Ibrahim

2016-08-01

Ensemble Kalman filtering (EnKF) is an efficient approach to addressing uncertainties in subsurface groundwater models. The EnKF sequentially integrates field data into simulation models to obtain a better characterization of the model's state and parameters. These are generally estimated following joint and dual filtering strategies, in which, at each assimilation cycle, a forecast step by the model is followed by an update step with incoming observations. The joint EnKF directly updates the augmented state-parameter vector, whereas the dual EnKF empirically employs two separate filters, first estimating the parameters and then estimating the state based on the updated parameters. To develop a Bayesian consistent dual approach and improve the state-parameter estimates and their consistency, we propose in this paper a one-step-ahead (OSA) smoothing formulation of the state-parameter Bayesian filtering problem from which we derive a new dual-type EnKF, the dual EnKFOSA. Compared with the standard dual EnKF, it imposes a new update step to the state, which is shown to enhance the performance of the dual approach with almost no increase in the computational cost. Numerical experiments are conducted with a two-dimensional (2-D) synthetic groundwater aquifer model to investigate the performance and robustness of the proposed dual EnKFOSA, and to evaluate its results against those of the joint and dual EnKFs. The proposed scheme is able to successfully recover both the hydraulic head and the aquifer conductivity, providing further reliable estimates of their uncertainties. Furthermore, it is found to be more robust to different assimilation settings, such as the spatial and temporal distribution of the observations, and the level of noise in the data. Based on our experimental setups, it yields up to 25 % more accurate state and parameter estimations than the joint and dual approaches.
Interval Estimation of Revision Effect on Scale Reliability via Covariance Structure Modeling

ERIC Educational Resources Information Center

Raykov, Tenko

2009-01-01

A didactic discussion of a procedure for interval estimation of change in scale reliability due to revision is provided, which is developed within the framework of covariance structure modeling. The method yields ranges of plausible values for the population gain or loss in reliability of unidimensional composites, which results from deletion or…
An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions

NASA Technical Reports Server (NTRS)

Peters, B. C., Jr.; Walker, H. F.

1978-01-01

This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
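One way to read the step-size idea in this abstract is as a relaxed fixed-point iteration: if G denotes the usual successive-approximations map, the update is theta + step_size * (G(theta) - theta), which reduces to the familiar procedure at step size 1. The sketch below assumes that reading and, for brevity, a two-component mixture with unit variances held fixed (the paper treats the general case). All names and data are hypothetical.

```python
import math
import random


def em_map(data, weight, mean_a, mean_b):
    """One successive-approximations step for a two-component normal mixture.

    Simplifying assumption: both variances are fixed at 1.
    """
    resp_a = []
    for x in data:
        dens_a = weight * math.exp(-0.5 * (x - mean_a) ** 2)
        dens_b = (1.0 - weight) * math.exp(-0.5 * (x - mean_b) ** 2)
        resp_a.append(dens_a / (dens_a + dens_b))
    total_a = sum(resp_a)
    total_b = len(data) - total_a
    new_weight = total_a / len(data)
    new_mean_a = sum(r * x for r, x in zip(resp_a, data)) / total_a
    new_mean_b = sum((1.0 - r) * x for r, x in zip(resp_a, data)) / total_b
    return new_weight, new_mean_a, new_mean_b


def relaxed_iteration(data, start, step_size, n_iter=60):
    """Iterate theta <- theta + step_size * (G(theta) - theta)."""
    theta = start
    for _ in range(n_iter):
        target = em_map(data, *theta)
        theta = tuple(t + step_size * (g - t) for t, g in zip(theta, target))
    return theta


rng = random.Random(1)
# Synthetic, well-separated sample: 200 draws from each component.
sample = [rng.gauss(-2.0, 1.0) for _ in range(200)]
sample += [rng.gauss(2.0, 1.0) for _ in range(200)]
fit_plain = relaxed_iteration(sample, (0.5, -1.0, 1.0), step_size=1.0)
fit_relaxed = relaxed_iteration(sample, (0.5, -1.0, 1.0), step_size=1.5)
```

For this well-separated sample both step sizes should settle on the same estimates; the abstract's claim concerns local convergence for step sizes between 0 and 2, so poorly separated components or distant starting values may behave differently.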
An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions, 2

NASA Technical Reports Server (NTRS)

Peters, B. C., Jr.; Walker, H. F.

1976-01-01

The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
An Investigation of the Verbal Description of Trombone Tone Quality with Respect to Selected Attributes of Sound

NASA Astrophysics Data System (ADS)

Stroeher, Michael Steven

The purpose of this study was to determine the physical elements which experienced trombonists associate with selected descriptors in characterizing the tone quality of that instrument. Stimuli sampled from live trombone tones and synthesized into musical phrases represented 17 variations in (1) presence/absence of attack transient, (2) rise time, (3) duration, (4) number of harmonics, (5) upper limit of harmonicity, (6) spectral envelope shape, and (7) frequency. A vocabulary of 20 adjectives appropriate for describing trombone tone quality was established by a postal survey of 49 college trombone instructors. A pretest was presented to 28 University of North Texas trombone students, who rated the 17 stimuli in terms of the 20 adjectives on a Likert-type scale with reliability estimates of .8997 to .9439 (coefficient alpha). The pretest established the lack of significance of the attack transient in subjects' descriptions. Factor analysis of responses determined similarities in word usages and allowed the number of descriptors to be reduced to eight and the number of sound stimuli to nine for the final investigation. Subjects of the final study consisted of 161 trombonists at universities in Texas, Oklahoma, Colorado, Michigan and North Carolina, who rated the nine stimuli in terms of the eight chosen adjectives on Likert-type scales, with an overall reliability estimate of .7121 (coefficient alpha). Although the low reliability indicated some lack of agreement in descriptor usage, multiple regression analysis established relationships among subjects' use of descriptors and physical attributes of the stimuli. Subjects' judgements of bright were associated with tones of higher frequency; centered, good, dark and warm with faster rise time, longer duration and relatively fewer harmonics; pinched with a high number of harmonics and slower rise time; fuzzy with slow rise time and the presence of high-frequency inharmonic noise. Subjects did not use the word full with any degree of consistency.
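Coefficient alpha, the reliability estimate reported in this abstract, can be computed from a respondents-by-items table with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below applies that formula to made-up ratings; the data are hypothetical and unrelated to the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a respondents-by-items table of scores."""
    n_items = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical ratings: four respondents (rows), two items (columns).
ratings = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(ratings)
```

Alpha rises as the items covary more strongly relative to their individual variances.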
Estimates of monthly streamflow characteristics at selected sites in the upper Missouri River basin, Montana, base period water years 1937-86

USGS Publications Warehouse

Parrett, Charles; Johnson, D.R.; Hull, J.A.

1989-01-01

Estimates of streamflow characteristics (monthly mean flow that is exceeded 90, 80, 50, and 20 percent of the time for all years of record and mean monthly flow) were made and are presented in tabular form for 312 sites in the Missouri River basin in Montana. Short-term gaged records were extended to the base period of water years 1937-86, and were used to estimate monthly streamflow characteristics at 100 sites. Data from 47 gaged sites were used in regression analysis relating the streamflow characteristics to basin characteristics and to active-channel width. The basin-characteristics equations, with standard errors of 35% to 97%, were used to estimate streamflow characteristics at 179 ungaged sites. The channel-width equations, with standard errors of 36% to 103%, were used to estimate characteristics at 138 ungaged sites. Streamflow measurements were correlated with concurrent streamflows at nearby gaged sites to estimate streamflow characteristics at 139 ungaged sites. In a test using 20 pairs of gages, the standard errors ranged from 31% to 111%. At 139 ungaged sites, the estimates from two or more of the methods were weighted and combined in accordance with the variance of individual methods. When estimates from three methods were combined, the standard errors ranged from 24% to 63%. A drainage-area-ratio adjustment method was used to estimate monthly streamflow characteristics at seven ungaged sites. The reliability of the drainage-area-ratio adjustment method was estimated to be about equal to that of the basin-characteristics method. The estimates were checked for reliability. Estimates of monthly streamflow characteristics from gaged records were considered to be most reliable, and estimates at sites with actual flow record from 1937-86 were considered to be completely reliable (zero error). Weighted-average estimates were considered to be the most reliable estimates made at ungaged sites. (USGS)
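The abstract says estimates from several methods were weighted according to each method's variance but does not give the weighting formula. A common choice, assumed here purely for illustration, is inverse-variance weighting, which also gives the variance of the combined estimate when the methods' errors are independent. The function name and numbers are hypothetical.

```python
def combine_estimates(estimates, variances):
    """Inverse-variance weighted average of independent estimates.

    Returns the combined estimate and its variance (valid only if the
    individual errors are independent, an assumption made for this sketch).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    return combined, 1.0 / total


# Hypothetical flows from two methods; the second is four times as uncertain.
flow, flow_var = combine_estimates([10.0, 20.0], [1.0, 4.0])
```

Under this weighting the combined variance is never larger than the smallest individual variance, consistent with the combined estimates in the abstract having lower standard errors than the individual methods.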
GeneCount: genome-wide calculation of absolute tumor DNA copy numbers from array comparative genomic hybridization data

PubMed Central

Lyng, Heidi; Lando, Malin; Brøvig, Runar S; Svendsrud, Debbie H; Johansen, Morten; Galteland, Eivind; Brustugun, Odd T; Meza-Zepeda, Leonardo A; Myklebost, Ola; Kristensen, Gunnar B; Hovig, Eivind; Stokke, Trond

2008-01-01

Absolute tumor DNA copy numbers can currently be achieved only on a single gene basis by using fluorescence in situ hybridization (FISH). We present GeneCount, a method for genome-wide calculation of absolute copy numbers from clinical array comparative genomic hybridization data. The tumor cell fraction is reliably estimated in the model. Data consistent with FISH results are achieved. We demonstrate significant improvements over existing methods for exploring gene dosages and intratumor copy number heterogeneity in cancers. PMID:18500990
SPIPS: Spectro-Photo-Interferometry of Pulsating Stars

NASA Astrophysics Data System (ADS)

Mérand, Antoine

2017-10-01

SPIPS (Spectro-Photo-Interferometry of Pulsating Stars) combines radial velocimetry, interferometry, and photometry to estimate physical parameters of pulsating stars, including presence of infrared excess, color excess, Teff, and ratio distance/p-factor. The global model-based parallax-of-pulsation method is implemented in Python. Derived parameters have a high level of confidence; statistical precision is improved (compared to other methods) due to the large number of data taken into account, accuracy is improved by using consistent physical modeling and reliability of the derived parameters is strengthened by redundancy in the data.
Precise Relative Earthquake Magnitudes from Cross Correlation

DOE PAGES

Cleveland, K. Michael; Ammon, Charles J.

2015-04-21

We present a method to estimate precise relative magnitudes using cross correlation of seismic waveforms. Our method incorporates the intercorrelation of all events in a group of earthquakes, as opposed to individual event pairings relative to a reference event. This method works well when a reliable reference event does not exist. We illustrate the method using vertical strike-slip earthquakes located in the northeast Pacific and Panama fracture zone regions. Our results are generally consistent with the Global Centroid Moment Tensor catalog, which we use to establish a baseline for the relative event sizes.
Comparison of breast tissue measurements using magnetic resonance imaging, digital mammography and a mathematical algorithm

NASA Astrophysics Data System (ADS)

Lu, Lee-Jane W.; Nishino, Thomas K.; Johnson, Raleigh F.; Nayeem, Fatima; Brunder, Donald G.; Ju, Hyunsu; Leonard, Morton H., Jr.; Grady, James J.; Khamapirad, Tuenchit

2012-11-01

Women with mostly mammographically dense fibroglandular tissue (breast density, BD) have a four- to six-fold increased risk for breast cancer compared to women with little BD. BD is most frequently estimated from two-dimensional (2D) views of mammograms by a histogram segmentation approach (HSM) and more recently by a mathematical algorithm consisting of mammographic imaging parameters (MATH). Two non-invasive clinical magnetic resonance imaging (MRI) protocols: 3D gradient-echo (3DGRE) and short tau inversion recovery (STIR) were modified for 3D volumetric reconstruction of the breast for measuring fatty and fibroglandular tissue volumes by a Gaussian-distribution curve-fitting algorithm. Replicate breast exams (N = 2 to 7 replicates in six women) by 3DGRE and STIR were highly reproducible for all tissue-volume estimates (coefficients of variation <5%). Reliability studies compared measurements from four methods, 3DGRE, STIR, HSM, and MATH (N = 95 women) by linear regression and intra-class correlation (ICC) analyses. Rsqr, regression slopes, and ICC, respectively, were (1) 0.76-0.86, 0.8-1.1, and 0.87-0.92 for %-gland tissue, (2) 0.72-0.82, 0.64-0.96, and 0.77-0.91, for glandular volume, (3) 0.87-0.98, 0.94-1.07, and 0.89-0.99, for fat volume, and (4) 0.89-0.98, 0.94-1.00, and 0.89-0.98, for total breast volume. For all values estimated, the correlation was stronger for comparisons between the two MRI than between each MRI versus mammography, and between each MRI versus MATH data than between each MRI versus HSM data. All ICC values were >0.75 indicating that all four methods were reliable for measuring BD and that the mathematical algorithm and the two complementary non-invasive MRI protocols could objectively and reliably estimate different types of breast tissues.

Comparison of breast tissue measurements using magnetic resonance imaging, digital mammography and a mathematical algorithm

PubMed Central

Lu, Lee-Jane W.; Nishino, Thomas K.; Johnson, Raleigh F.; Nayeem, Fatima; Brunder, Donald G.; Ju, Hyunsu; Leonard, Morton H.; Grady, James J.; Khamapirad, Tuenchit

2012-01-01

Women with mostly mammographically dense fibroglandular tissue (breast density, BD) have a 4- to 6-fold increased risk for breast cancer compared to women with little BD. BD is most frequently estimated from 2-dimensional (2-D) views of mammograms by a histogram segmentation approach (HSM) and more recently by a mathematical algorithm consisting of mammographic imaging parameters (MATH). Two non-invasive clinical magnetic resonance imaging (MRI) protocols: 3-D gradient-echo (3DGRE) and short tau inversion recovery (STIR) were modified for 3-D volumetric reconstruction of the breast for measuring fatty and fibroglandular tissue volumes by a Gaussian-distribution curve-fitting algorithm. Replicate breast exams (N = 2 to 7 replicates in 6 women) by 3DGRE and STIR were highly reproducible for all tissue-volume estimates (coefficients of variation <5%). Reliability studies compared measurements from four methods, 3DGRE, STIR, HSM, and MATH (N = 95 women) by linear regression and intra-class correlation (ICC) analyses. Rsqr, regression slopes, and ICC, respectively, were (I) 0.76–0.86, 0.8–1.1, and 0.87–0.92 for %-gland tissue, (II) 0.72–0.82, 0.64–0.96, and 0.77–0.91, for glandular volume, (III) 0.87–0.98, 0.94–1.07, and 0.89–0.99, for fat volume, and (IV) 0.89–0.98, 0.94–1.00, and 0.89–0.98, for total breast volume. For all values estimated, the correlation was stronger for comparisons between the two MRI than between each MRI vs. mammography, and between each MRI vs. MATH data than between each MRI vs. HSM data. All ICC values were >0.75 indicating that all four methods were reliable for measuring BD and that the mathematical algorithm and the two complementary non-invasive MRI protocols could objectively and reliably estimate different types of breast tissues. PMID:23044556
Development and validation of a Markov microsimulation model for the economic evaluation of treatments in osteoporosis.

PubMed

Hiligsmann, Mickaël; Ethgen, Olivier; Bruyère, Olivier; Richy, Florent; Gathon, Henry-Jean; Reginster, Jean-Yves

2009-01-01

Markov models are increasingly used in economic evaluations of treatments for osteoporosis. Most of the existing evaluations are cohort-based Markov models missing comprehensive memory management and versatility. In this article, we describe and validate an original Markov microsimulation model to accurately assess the cost-effectiveness of prevention and treatment of osteoporosis. We developed a Markov microsimulation model with a lifetime horizon and a direct health-care cost perspective. The patient history was recorded and was used in calculations of transition probabilities, utilities, and costs. To test the internal consistency of the model, we carried out an example calculation for alendronate therapy. Then, external consistency was investigated by comparing absolute lifetime risk of fracture estimates with epidemiologic data. For women at age 70 years, with a twofold increase in the fracture risk of the average population, the costs per quality-adjusted life-year gained for alendronate therapy versus no treatment were estimated at €9105 and €15,325, respectively, under full and realistic adherence assumptions. All the sensitivity analyses in terms of model parameters and modeling assumptions were coherent with expected conclusions and absolute lifetime risk of fracture estimates were within the range of previous estimates, which confirmed both internal and external consistency of the model. Microsimulation models present some major advantages over cohort-based models, increasing the reliability of the results and being largely compatible with the existing state of the art, evidence-based literature. The developed model appears to be a valid model for use in economic evaluations in osteoporosis.
Uncertainty analysis of practical structural health monitoring systems currently employed for tall buildings consisting of small number of sensors

NASA Astrophysics Data System (ADS)

Hirai, Kenta; Mita, Akira

2016-04-01

Against a social background of repeated large earthquakes and cheating in design and construction, structural health monitoring (SHM) systems are attracting strong attention. SHM systems are now in a practical phase. An SHM system consisting of a small number of sensors has been introduced to 6 tall buildings in the Shinjuku area. Including these, there are 2 major issues with SHM systems consisting of a small number of sensors. First, the optimal number of sensors and their locations are not well defined. In practice, sensor placement is determined based on rough prediction and experience. Second, there are some uncertainties in the estimation results of the SHM systems. Thus, the purpose of this research is to provide useful information for increasing the reliability of SHM systems and to improve estimation results based on uncertainty analysis of the SHM systems. The important damage index used here is the inter-story drift angle. The uncertainties considered here are the number of sensors, earthquake motion characteristics, noise in data, error between the numerical model and the real building, and nonlinearity of parameters. The influence of each factor on estimation accuracy was then analyzed. The analysis conducted here will help to decide sensor system design considering the balance of cost and accuracy. Because of the constraint on the number of sensors, estimation results from the SHM system tend to provide smaller values. To overcome this problem, a compensation algorithm was discussed and presented. The usefulness of this compensation method was demonstrated for 40-story S and RC building models with nonlinear response.
Validity and Reliability of Assessing Body Composition Using a Mobile Application.

PubMed

Macdonald, Elizabeth Z; Vehrs, Pat R; Fellingham, Gilbert W; Eggett, Dennis; George, James D; Hager, Ronald

2017-12-01

The purpose of this study was to determine the validity and reliability of the LeanScreen (LS) mobile application that estimates percent body fat (%BF) using estimates of circumferences from photographs. The %BF of 148 weight-stable adults was estimated once using dual-energy x-ray absorptiometry (DXA). Each of two administrators assessed the %BF of each subject twice using the LS app and manually measured circumferences. A mixed-model ANOVA and Bland-Altman analyses were used to compare the estimates of %BF obtained from each method. Interrater and intrarater reliabilities values were determined using multiple measurements taken by each of the two administrators. The LS app and manually measured circumferences significantly underestimated (P < 0.05) the %BF determined using DXA by an average of -3.26 and -4.82 %BF, respectively. The LS app (6.99 %BF) and manually measured circumferences (6.76 %BF) had large limits of agreement. All interrater and intrarater reliability coefficients of estimates of %BF using the LS app and manually measured circumferences exceeded 0.99. The estimates of %BF from manually measured circumferences and the LS app were highly reliable. However, these field measures are not currently recommended for the assessment of body composition because of significant bias and large limits of agreements.
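The Bland-Altman analysis mentioned in this abstract summarizes agreement between two methods by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. The sketch below uses that conventional definition with hypothetical %BF values that are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical percent body fat: a field method versus a reference method.
field_bf = [20.0, 25.0, 30.0, 35.0]
reference_bf = [24.0, 27.0, 34.0, 37.0]
bias, lower, upper = bland_altman(field_bf, reference_bf)
```

A negative bias indicates that the field method underestimates the reference on average, and wide limits signal poor individual-level agreement even when the reliability coefficients are high, which is the pattern the abstract describes.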
Preoperative planning of calcium deposit removal in calcifying tendinitis of the rotator cuff - possible contribution of computed tomography, ultrasound and conventional X-Ray.

PubMed

Izadpanah, Kaywan; Jaeger, Martin; Maier, Dirk; Südkamp, Norbert P; Ogon, Peter

2014-11-20

The purpose of the present study was to investigate the accuracy of Ultrasound (US), conventional X-Ray (CX) and Computed Tomography (CT) to estimate the total count, localization, morphology and consistency of Calcium deposits (CDs) in the rotator cuff. US, CX and CT imaging was performed pre-operatively in 151 patients who underwent arthroscopic removal of CDs in the rotator cuff. In all procedures: (1) total CD counts were determined, (2) the CDs' appearance in each imaging modality was correlated to the intraoperative consistency and (3) CDs were localized in their relation to the acromion using US, CX and CT. Using US, 158 CDs were identified; using CT, 188 CDs; and using CX, 164 CDs. Reliable localization of the CDs was possible with all diagnostic modalities used. CT revealed 49% of the CDs to be septated, of which 85% were uni- and 15% multiseptated. CX was not suitable for prediction of CD consistency. US reliably predicted viscous-solid CD consistency only when presenting with full sound extinction (PPV 84.6%). CT had high positive and negative predictive values for detection of liquid-soft (PPV 92.9%) and viscous-solid (PPV 87.8%) CDs. US and CX are sufficient for preoperative planning of CD removal with regard to localization and prediction of consistency if the deposits present with full sound extinction. This is the case in the majority of the patients. However, in patients with missing sound extinction, CT can be recommended if the consistency of the deposits should be determined. Satellite deposits or septations are regularly present, which is of importance if complete CD removal is aspired.
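The positive and negative predictive values quoted in this abstract follow from a 2x2 table of imaging prediction against intraoperative finding: PPV = TP / (TP + FP) and NPV = TN / (TN + FN). The sketch below shows the arithmetic; the counts are hypothetical, since the abstract reports percentages only.

```python
def predictive_values(tp, fp, tn, fn):
    """Positive and negative predictive value from 2x2 table counts."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical counts, not the study's data.
ppv, npv = predictive_values(tp=17, fp=3, tn=40, fn=10)
```

Both values depend on how common each deposit consistency is in the sample, so they do not transfer directly to populations with a different mix of deposit types.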
Development and psychometric evaluation of the Primary Health Care Engagement (PHCE) Scale: a pilot survey of rural and remote nurses.

PubMed

Kosteniuk, Julie G; Wilson, Erin C; Penz, Kelly L; MacLeod, Martha L P; Stewart, Norma J; Kulig, Judith C; Karunanayake, Chandima P; Kilpatrick, Kelley

2016-01-01

To report the development and psychometric evaluation of a scale to measure rural and remote (rural/remote) nurses' perceptions of the engagement of their workplaces in key dimensions of primary health care (PHC). Amidst ongoing PHC reforms, a comprehensive instrument is needed to evaluate the degree to which rural/remote health care settings are involved in the key dimensions that characterize PHC delivery, particularly from the perspective of professionals delivering care. This study followed a three-phase process of instrument development and psychometric evaluation. A literature review and expert consultation informed instrument development in the first phase, followed by an iterative process of content evaluation in the second phase. In the final phase, a pilot survey was undertaken and item discrimination analysis employed to evaluate the internal consistency reliability of each subscale in the preliminary 60-item Primary Health Care Engagement (PHCE) Scale. The 60-item scale was subsequently refined to a 40-item instrument. The pilot survey sample included 89 nurses in current practice who had experience in rural/remote practice settings. Participants completed either a web-based or paper survey from September to December, 2013. Following item discrimination analysis, the 60-item instrument was refined to a 40-item PHCE Scale consisting of 10 subscales, each including three to five items. Alpha estimates of the 10 refined subscales ranged from 0.61 to 0.83, with seven of the subscales demonstrating acceptable reliability (α ⩾ 0.70). The refined 40-item instrument exhibited good internal consistency reliability (α=0.91). The 40-item PHCE Scale may be considered for use in future studies regardless of locale, to measure the extent to which health care professionals perceive their workplaces to be engaged in key dimensions of PHC.
A Laboratory Study on the Reliability Estimations of the Mini-CEX

ERIC Educational Resources Information Center

de Lima, Alberto Alves; Conde, Diego; Costabel, Juan; Corso, Juan; Van der Vleuten, Cees

2013-01-01

Reliability estimations of workplace-based assessments with the mini-CEX are typically based on real-life data. Estimations are based on the assumption of local independence: the object of the measurement should not be influenced by the measurement itself and samples should be completely independent. This is difficult to achieve. Furthermore, the…
Comparability and Reliability Considerations of Adequate Yearly Progress

ERIC Educational Resources Information Center

Maier, Kimberly S.; Maiti, Tapabrata; Dass, Sarat C.; Lim, Chae Young

2012-01-01

The purpose of this study is to develop an estimate of Adequate Yearly Progress (AYP) that will allow for reliable and valid comparisons among student subgroups, schools, and districts. A shrinkage-type estimator of AYP using the Bayesian framework is described. Using simulated data, the performance of the Bayes estimator will be compared to…
Sample Size for Estimation of G and Phi Coefficients in Generalizability Theory

ERIC Educational Resources Information Center

Atilgan, Hakan

2013-01-01

Problem Statement: Reliability, which refers to the degree to which measurement results are free from measurement errors, as well as its estimation, is an important issue in psychometrics. Several methods for estimating reliability have been suggested by various theories in the field of psychometrics. One of these theories is the generalizability…
Multi-model ensembles for assessment of flood losses and associated uncertainty

NASA Astrophysics Data System (ADS)

Figueiredo, Rui; Schröter, Kai; Weiss-Motz, Alexander; Martina, Mario L. V.; Kreibich, Heidi

2018-05-01

Flood loss modelling is a crucial part of risk assessments. However, it is subject to large uncertainty that is often neglected. Most models available in the literature are deterministic, providing only single point estimates of flood loss, and large disparities tend to exist among them. Adopting any one such model in a risk assessment context is likely to lead to inaccurate loss estimates and sub-optimal decision-making. In this paper, we propose the use of multi-model ensembles to address these issues. This approach, which has been applied successfully in other scientific fields, is based on the combination of different model outputs with the aim of improving the skill and usefulness of predictions. We first propose a model rating framework to support ensemble construction, based on a probability tree of model properties, which establishes relative degrees of belief between candidate models. Using 20 flood loss models in two test cases, we then construct numerous multi-model ensembles, based both on the rating framework and on a stochastic method, differing in terms of participating members, ensemble size and model weights. We evaluate the performance of ensemble means, as well as their probabilistic skill and reliability. Our results demonstrate that well-designed multi-model ensembles represent a pragmatic approach to consistently obtain more accurate flood loss estimates and reliable probability distributions of model uncertainty.
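The ensemble mean and its spread, the basic quantities evaluated in this abstract, can be sketched as a weighted average of the individual model outputs. The weights below stand in for the relative degrees of belief produced by a rating framework; the function name, losses, and weights are hypothetical, and the paper's probability-tree rating is not reproduced here.

```python
def ensemble_summary(model_losses, weights):
    """Weighted ensemble mean and weighted standard deviation of losses."""
    total = sum(weights)
    mean_loss = sum(w * x for w, x in zip(weights, model_losses)) / total
    var_loss = sum(
        w * (x - mean_loss) ** 2 for w, x in zip(weights, model_losses)
    ) / total
    return mean_loss, var_loss ** 0.5


# Hypothetical loss estimates from three models; the third is trusted twice as much.
mean_loss, spread = ensemble_summary([100.0, 200.0, 300.0], [1.0, 1.0, 2.0])
```

Reporting the spread alongside the mean is what separates an ensemble from the single point estimate of a deterministic model.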
Software For Computing Reliability Of Other Software

NASA Technical Reports Server (NTRS)

Nikora, Allen; Antczak, Thomas M.; Lyu, Michael

1995-01-01

Computer Aided Software Reliability Estimation (CASRE) computer program developed for use in measuring reliability of other software. Easier for non-specialists in reliability to use than many other currently available programs developed for same purpose. CASRE incorporates mathematical modeling capabilities of public-domain Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) computer program and runs in Windows software environment. Provides menu-driven command interface; enabling and disabling of menu options guides user through (1) selection of set of failure data, (2) execution of mathematical model, and (3) analysis of results from model. Written in C language.
Rapid estimation of high-parameter auditory-filter shapes

PubMed Central

Shen, Yi; Sivakumar, Rajeswari; Richards, Virginia M.

2014-01-01

A Bayesian adaptive procedure, the quick-auditory-filter (qAF) procedure, was used to estimate auditory-filter shapes that were asymmetric about their peaks. In three experiments, listeners who were naive to psychoacoustic experiments detected a fixed-level, pure-tone target presented with a spectrally notched noise masker. The qAF procedure adaptively manipulated the masker spectrum level and the position of the masker notch, which was optimized for the efficient estimation of the five parameters of an auditory-filter model. Experiment I demonstrated that the qAF procedure provided a convergent estimate of the auditory-filter shape at 2 kHz within 150 to 200 trials (approximately 15 min to complete) and, for a majority of listeners, excellent test-retest reliability. In experiment II, asymmetric auditory filters were estimated for target frequencies of 1 and 4 kHz and target levels of 30 and 50 dB sound pressure level. The estimated filter shapes were generally consistent with published norms, especially at the low target level. It is known that the auditory-filter estimates are narrower for forward masking than simultaneous masking due to peripheral suppression, a result replicated in experiment III using fewer than 200 qAF trials. PMID:25324086
Regularized estimation of Euler pole parameters

NASA Astrophysics Data System (ADS)

Aktuğ, Bahadir; Yildirim, Ömer

2013-07-01

Euler vectors provide a unified framework to quantify the relative or absolute motions of tectonic plates through various geodetic and geophysical observations. With the advent of space geodesy, Euler parameters of several relatively small plates have been determined through the velocities derived from the space geodesy observations. However, the available data are usually insufficient in number and quality to estimate both the Euler vector components and the Euler pole parameters reliably. Since Euler vectors are defined globally in an Earth-centered Cartesian frame, estimation with the limited geographic coverage of the local/regional geodetic networks usually results in highly correlated vector components. In the case of estimating the Euler pole parameters directly, the situation is even worse, and the position of the Euler pole is nearly collinear with the magnitude of the rotation rate. In this study, a new method, which consists of an analytical derivation of the covariance matrix of the Euler vector in an ideal network configuration, is introduced and a regularized estimation method specifically tailored for estimating the Euler vector is presented. The results show that the proposed method outperforms the least squares estimation in terms of the mean squared error.
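The abstract does not spell out its tailored regularization, so the sketch below uses ordinary Tikhonov (ridge) regularization as a stand-in to show how a penalty term stabilizes a least-squares solution. It solves (AᵀA + λI)x = Aᵀb for a hypothetical two-parameter problem; the design matrix, data, and function name are illustrative only.

```python
def ridge_two_params(rows, targets, lam):
    """Solve (A^T A + lam * I) x = A^T b for a two-column design matrix A."""
    s00 = sum(a0 * a0 for a0, _ in rows) + lam
    s01 = sum(a0 * a1 for a0, a1 in rows)
    s11 = sum(a1 * a1 for _, a1 in rows) + lam
    t0 = sum(a0 * b for (a0, _), b in zip(rows, targets))
    t1 = sum(a1 * b for (_, a1), b in zip(rows, targets))
    det = s00 * s11 - s01 * s01
    return (s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det


# Hypothetical observations generated exactly by x = (1, 2).
design = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed = [1.0, 2.0, 3.0]
least_squares = ridge_two_params(design, observed, lam=0.0)
regularized = ridge_two_params(design, observed, lam=1.0)
```

With λ = 0 the function reduces to ordinary least squares; a positive λ shrinks the solution and trades some bias for lower variance, which is how a regularized estimator can achieve a smaller mean squared error when the unknowns are highly correlated.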
Age, year‐class strength variability, and partial age validation of Kiyis from Lake Superior

USGS Publications Warehouse

Lepak, Taylor A.; Ogle, Derek H.; Vinson, Mark

2017-01-01

Age estimates of Lake Superior Kiyis Coregonus kiyi from scales and otoliths were compared and 12 years (2003–2014) of length frequency data were examined to assess year‐class strength and validate age estimates. Ages estimated from otoliths were precise and were consistently older than ages estimated from scales. Maximum otolith‐derived ages were 20 years for females and 12 years for males. Age estimates showed high numbers of fish of ages 5, 6, and 11 in 2014, corresponding to the 2009, 2008, and 2003 year‐classes, respectively. Strong 2003 and 2009 year‐classes, along with the 2005 year‐class, were also evident based on distinct modes of age‐1 fish (<110 mm) in the length frequency distributions from 2004, 2010, and 2006, respectively. Modes from these year‐classes were present as progressively larger fish in subsequent years. Few to no age‐1 fish (<110 mm) were present in all other years. Ages estimated from otoliths were generally within 1 year of the ages corresponding to strong year‐classes, at least for age‐5 and older fish, suggesting that Kiyi age may be reliably estimated to within 1 year by careful examination of thin‐sectioned otoliths.
The Ability of Atmospheric Data to Reduce Disagreements in Wetland Methane Flux Estimates over North America

NASA Astrophysics Data System (ADS)

Miller, S. M.; Andrews, A. E.; Benmergui, J. S.; Commane, R.; Dlugokencky, E. J.; Janssens-Maenhout, G.; Melton, J. R.; Michalak, A. M.; Sweeney, C.; Worthy, D. E. J.

2015-12-01

Existing estimates of methane fluxes from wetlands differ in both magnitude and distribution across North America. We discuss seven different bottom-up methane estimates in the context of atmospheric methane data collected across the US and Canada. In the first component of this study, we explore whether the observation network can even detect a methane pattern from wetlands. We find that the observation network can identify a methane pattern from Canadian wetlands but not reliably from US wetlands. Over Canada, the network can even identify spatial patterns at multi-province scales. Over the US, by contrast, anthropogenic emissions and modeling errors obscure atmospheric patterns from wetland fluxes. In the second component of the study, we then use these observations to reconcile disagreements in the magnitude, seasonal cycle, and spatial distribution of existing estimates. Most existing estimates predict fluxes that are too large with a seasonal cycle that is too narrow. A model known as LPJ-Bern has a spatial distribution most consistent with atmospheric observations. By contrast, a spatially-constant model outperforms the distribution of most existing flux estimates across Canada. The results presented here provide several pathways to reduce disagreements among existing wetland flux estimates across North America.
Iranian Health Literacy Questionnaire (IHLQ): An Instrument for Measuring Health Literacy in Iran.

PubMed

Haghdoost, Ali Akbar; Rakhshani, Fatemeh; Aarabi, Mohsen; Montazeri, Ali; Tavousi, Mahmoud; Solimanian, Atoosa; Sarbandi, Fatemeh; Namdar, Hosein; Iranpour, Abedin

2015-06-01

Promoting Health Literacy (HL) is considered an important goal in the strategic plans of many countries. In spite of the necessity for access to valid, reliable and native HL instruments, such instruments are scarce in the Persian language. Moreover, there is no good estimation of HL status in Iran. The aim of this study was to provide a valid, reliable and native instrument to measure and monitor community HL in Iran and also to provide an estimation of HL status in two Iranian provinces. By applying multistage cluster sampling, 1080 respondents (540 from each gender) were recruited from Kerman and Mazandaran provinces of Iran, from February to June 2014, to participate in this cross-sectional study. The development of the Iranian Health Literacy Questionnaire (IHLQ) was initiated with a comprehensive review of the literature. Then, face, content and construct validity as well as reliability were determined. Internal consistency and test-retest reliability (ICC) of the factors were in the ranges of 0.71 to 0.96 and 0.73 to 0.86, respectively. To assess construct validity, Exploratory Factor Analysis (EFA) (Kaiser-Meyer-Olkin (KMO) = 0.95 and Bartlett's test result of 3.017 with P < 0.001) with varimax rotation was used. An optimal reduced solution, including 36 items and seven factors, was found in EFA. Five of the factors identified were reading/comprehension skills, individual empowerment, communication/decision-making skills, social empowerment and health knowledge. It was concluded that IHLQ might be a practical and useful tool for investigating HL for Persian language speakers around the world. Since HL is dynamic and its instruments should be regularly revised, further studies are recommended to assess HL with application of IHLQ to detect its potential imperfections.
[Analysis of the reliability and validity of three self-report questionnaires to assess physical activity among Spanish adolescents].

PubMed

Cancela Carral, José María; Lago Ballesteros, Joaquín; Ayán Pérez, Carlos; Mosquera Morono, María Belén

2016-01-01

To analyse the reliability and validity of the Weekly Activity Checklist (WAC), the One Week Recall (OWR), and the Godin-Shephard Leisure Time Exercise Questionnaire (GLTEQ) in Spanish adolescents. A total of 78 adolescents wore a pedometer for one week, filled out the questionnaires at the end of this period and underwent a test to estimate their maximal oxygen consumption (VO2max). The reliability of the questionnaires was determined by means of a factor analysis. Convergent validity was obtained by comparing the questionnaires' scores against the amount of physical activity quantified by the pedometer and the VO2max reported. The questionnaires showed a weak internal consistency (WAC: α=0.59-0.78; OWR: α=0.53-0.73; GLTEQ: α=0.60). Moderate statistically significant correlations were found between the pedometer and the WAC (r=0.69; p <0.01) and the OWR (r=0.42; p <0.01), while a low statistically significant correlation was found for the GLTEQ (r=0.36; p=0.01). The estimated VO2max showed a low level of association with the WAC results (r=0.30; p <0.05), and the OWR results (r=0.29; p <0.05). When classifying the participants as active or inactive, the level of agreement with the pedometer was moderate for the WAC (k=0.46) and the OWR (r=0.44), and slight for the GLTEQ (r=0.20). Of the three questionnaires analysed, the WAC showed the best psychometric performance as it was the only one with respectable convergent validity, while sharing low reliability with the OWR and the GLTEQ. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Turbulent stresses in the surf-zone: Which way is up?

USGS Publications Warehouse

Haines, John W.; Gelfenbaum, Guy; Edge, B.L

1997-01-01

Velocity observations from a vertical stack of three-component Acoustic Doppler Velocimeters (ADVs) within the energetic surf-zone are presented. Rapid temporal sampling and small sampling volume provide observations suitable for investigation of the role of turbulent fluctuations in surf-zone dynamics. While sensor performance was good, failure to recover reliable measures of tilt from the vertical compromises the data value. We will present some cursory observations supporting the ADV performance, and examine the sensitivity of stress estimates to uncertainty in the sensor orientation. It is well known that turbulent stress estimates are highly sensitive to orientation relative to vertical when wave motions are dominant. Analyses presented examine the potential to use observed flow-field characteristics to constrain sensor orientation. Results show that such an approach may provide a consistent orientation to a fraction of a degree, but the inherent sensitivity of stress estimates requires a still more restrictive constraint. Regardless, the observations indicate the degree to which stress estimates are dependent on orientation, and provide some indication of the temporal variability in time-averaged stress estimates.
Methods to assess geological CO2 storage capacity: Status and best practice

USGS Publications Warehouse

Heidug, Wolf; Brennan, Sean T.; Holloway, Sam; Warwick, Peter D.; McCoy, Sean; Yoshimura, Tsukasa

2013-01-01

To understand the emission reduction potential of carbon capture and storage (CCS), decision makers need to understand the amount of CO2 that can be safely stored in the subsurface and the geographical distribution of storage resources. Estimates of storage resources need to be made using reliable and consistent methods. Previous estimates of CO2 storage potential for a range of countries and regions have been based on a variety of methodologies resulting in a correspondingly wide range of estimates. Consequently, there has been uncertainty about which of the methodologies were most appropriate in given settings, and whether the estimates produced by these methods were useful to policy makers trying to determine the appropriate role of CCS. In 2011, the IEA convened two workshops which brought together experts from six national survey organisations to review CO2 storage assessment methodologies and make recommendations on how to harmonise CO2 storage estimates worldwide. This report presents the findings of these workshops and an internationally shared guideline for quantifying CO2 storage resources.
Improved estimation of hydraulic conductivity by combining stochastically simulated hydrofacies with geophysical data.

PubMed

Zhu, Lin; Gong, Huili; Chen, Yun; Li, Xiaojuan; Chang, Xiang; Cui, Yijiao

2016-03-01

Hydraulic conductivity is a major parameter affecting the output accuracy of groundwater flow and transport models. The most commonly used semi-empirical formula for estimating conductivity is the Kozeny-Carman equation. However, this method alone does not work well with heterogeneous strata. Two important parameters, grain size and porosity, often show spatial variations at different scales. This study proposes a method for estimating conductivity distributions by combining a stochastic hydrofacies model with geophysical methods. The Markov chain model with a transition probability matrix was adopted to reconstruct structures of hydrofacies for deriving spatial deposit information. The geophysical and hydro-chemical data were used to estimate the porosity distribution through Archie's law. Results show that the stochastically simulated hydrofacies model reflects the sedimentary features with an average model accuracy of 78% in comparison with borehole log data in the Chaobai alluvial fan. The estimated conductivity is reasonable and of the same order of magnitude as the outcomes of the pumping tests. The conductivity distribution is consistent with the sedimentary distributions. This study provides more reliable spatial distributions of the hydraulic parameters for further numerical modeling.
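The chain from resistivity to porosity to conductivity named in this abstract can be sketched with textbook forms of the two relations. The abstract gives neither the exact forms nor any parameter values, so the Archie constants (a, m), the fluid properties, the grain diameter and the resistivities below are all illustrative assumptions.

```python
def archie_porosity(bulk_resistivity, water_resistivity, a=1.0, m=2.0):
    """Porosity from Archie's law for a fully saturated formation.

    Assumes bulk_resistivity = a * water_resistivity * porosity**(-m);
    a and m are assumed empirical constants, not values from the study.
    """
    return (a * water_resistivity / bulk_resistivity) ** (1.0 / m)


def kozeny_carman_conductivity(porosity, grain_diameter_m,
                               rho=1000.0, g=9.81, mu=1.0e-3):
    """Hydraulic conductivity in m/s from a common Kozeny-Carman form:

    K = (rho * g / mu) * n**3 / (1 - n)**2 * d**2 / 180
    with water density rho (kg/m^3), gravity g (m/s^2), viscosity mu (Pa s).
    """
    n = porosity
    return (rho * g / mu) * n**3 / (1.0 - n) ** 2 * grain_diameter_m**2 / 180.0


# Hypothetical inputs: 100 ohm-m bulk, 9 ohm-m pore water, 0.5 mm grains.
phi = archie_porosity(bulk_resistivity=100.0, water_resistivity=9.0)
K = kozeny_carman_conductivity(phi, grain_diameter_m=5.0e-4)
print(f"porosity = {phi:.2f}, K = {K:.2e} m/s")
```

Since K depends on the cube of porosity and the square of grain size, modest spatial variation in either input changes K strongly, which fits the abstract's remark that the formula alone handles heterogeneous strata poorly.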

Online estimation of internal stack temperatures in solid oxide fuel cell power generating units

NASA Astrophysics Data System (ADS)

Dolenc, B.; Vrečko, D.; Juričić, Ɖ.; Pohjoranta, A.; Pianese, C.

2016-12-01

Thermal stress is one of the main factors affecting the degradation rate of solid oxide fuel cell (SOFC) stacks. In order to mitigate the possibility of fatal thermal stress, stack temperatures and the corresponding thermal gradients need to be continuously controlled during operation. Due to the fact that in future commercial applications the use of temperature sensors embedded within the stack is impractical, the use of estimators appears to be a viable option. In this paper we present an efficient and consistent approach to data-driven design of the estimator for maximum and minimum stack temperatures intended (i) to be of high precision, (ii) to be simple to implement on conventional platforms like programmable logic controllers, and (iii) to maintain reliability in spite of degradation processes. By careful application of subspace identification, supported by physical arguments, we derive a simple estimator structure capable of producing estimates with 3% error irrespective of the evolving stack degradation. The degradation drift is handled without any explicit modelling. The approach is experimentally validated on a 10 kW SOFC system.
A Height Estimation Approach for Terrain Following Flights from Monocular Vision

PubMed Central

Campos, Igor S. G.; Nascimento, Erickson R.; Freitas, Gustavo M.; Chaimowicz, Luiz

2016-01-01

In this paper, we present a monocular vision-based height estimation algorithm for terrain following flights. The impressive growth of Unmanned Aerial Vehicle (UAV) usage, notably in mapping applications, will soon require the creation of new technologies to enable these systems to better perceive their surroundings. Specifically, we chose to tackle the terrain following problem, as it is still unresolved for consumer available systems. Virtually every mapping aircraft carries a camera; therefore, we chose to exploit this in order to use presently available hardware to extract the height information toward performing terrain following flights. The proposed methodology consists of using optical flow to track features from videos obtained by the UAV, as well as its motion information to estimate the flying height. To determine if the height estimation is reliable, we trained a decision tree that takes the optical flow information as input and classifies whether the output is trustworthy or not. The classifier achieved accuracies of 80% for positives and 90% for negatives, while the height estimation algorithm presented good accuracy. PMID:27929424
[A new method for evaluating psychomotor development based on information from parents. The Spanish version of the Kent Infant Development Scale].

PubMed

García-Tornel Florensa, S; García García, J J; Reuter, J; Clow, C; Reuter, L

1996-05-01

The purpose of this dissertation research was to design, standardize and validate the Spanish version of the Kent Infant Development Scale (KIDS). This questionnaire is based on information obtained from the parents. It was translated into Spanish and named "Escala de Desarrollo Infantil de Kent" (EDIK). The EDIK normative data were collected from the parents of 662 healthy infants (ages 1 to 15 months) in pediatric clinics in Catalonia (Spain). Test-retest reliability (r = 0.99; p < 0.001), interjudge reliability (r = 0.98; p < 0.001) and internal consistency (Cronbach alpha = 0.9947) were determined. An r of 0.96 was obtained when EDIK scores were compared to their estimated developmental ages obtained from the Denver Developmental Scale. The correlation of the infants' chronological age and their EDIK was 0.96 (p < 0.001). The high reliability and validity correlation coefficients demonstrate the sound psychometric properties of the EDIK. It appears to be a useful and acceptable instrument in measuring the developmental status of infants by using the reports of their parents.
Polish Adult Reading Test (PART) - construction of Polish test for estimating the level of premorbid intelligence in schizophrenia.

PubMed

Karakuła-Juchnowicz, Hanna; Stecka, Mariola

2017-08-29

In view of the unavailability in Poland of standardized methods to measure PIQ, the aim of the work was to develop a Polish test to assess the premorbid level of intelligence, PART (Polish Adult Reading Test), and to measure its psychometric properties, such as validity, reliability as well as standardization in the group of schizophrenia patients. The principles of PART construction were based on the idea of the National Adult Reading Test by Hazel Nelson, which is popular worldwide. The research comprised a group of 122 subjects (65 schizophrenia patients and 57 healthy people), aged 18-60 years, matched for age and gender. PART appears to be a method with high internal consistency and reliability measured by test-retest, inter-rater reliability, and a method with acceptable diagnostic and prognostic validity. The standardized procedures of PART have been investigated and described. Considering the psychometric values of PART and the short time needed to perform it, the test may be a useful diagnostic instrument in the assessment of premorbid level of intelligence in a group of schizophrenic patients.
Evaluation of high-temperature and short-time sterilization of injection ampules by microwave heating.

PubMed

Sasaki, K; Honda, W; Miyake, Y

1998-01-01

The high-temperature and short-time sterilization by microwave heating with a continuous microwave sterilizer (MWS) was evaluated. The evaluation was performed with respect to: [1] lethal effect against microorganisms corresponding to F-value, and [2] reliability of MWS sterilization process. Bacillus stearothermophilus ATCC 7953 spores were used as the biological indicator and the heat-resistance of spores was evaluated with conventional heating method (121-129 degrees C). In MWS sterilization (125-135 degrees C), the actual lethal effect against B. stearothermophilus spores was almost in agreement with the F-value and the survival curve against the F-value was quite consistent with that for the autoclave. These results suggest that the actual lethal effect could be estimated by the F-value with heat-resistance parameters of spores from lower than actual temperatures and that there was no nonthermal effect of the microwave on B. stearothermophilus spores. The reliability of sterilization with the MWS was confirmed using more than 25,000 test ampules containing biological indicators. All biological indicators were killed, thus the present study shows that the MWS was completely reliable for all ampules.
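The F-value used in this abstract expresses a temperature history as equivalent minutes at a reference temperature. A minimal sketch follows, assuming the conventional reference of 121.1 degrees C and z = 10 degrees C; the abstract does not state which reference temperature or z-value the authors used, and the temperature trace is hypothetical.

```python
def f_value(temps_c, dt_min, t_ref_c=121.1, z_c=10.0):
    """Accumulated lethality in equivalent minutes at t_ref_c.

    temps_c : product temperatures (degrees C) sampled every dt_min minutes
    z_c     : temperature rise giving a tenfold increase in lethal rate
    Both t_ref_c and z_c are assumed conventional defaults.
    """
    return sum(10.0 ** ((t - t_ref_c) / z_c) * dt_min for t in temps_c)


# Hypothetical short high-temperature exposure sampled every 0.1 min.
trace = [125.0, 130.0, 135.0, 135.0, 130.0, 125.0]
print(f"F = {f_value(trace, dt_min=0.1):.2f} equivalent minutes")
```

Under these assumptions one minute at 131.1 degrees C counts as ten minutes at 121.1 degrees C, which is why a short high-temperature process can match a longer autoclave cycle in lethal effect.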
Choosing a reliability inspection plan for interval censored data

DOE PAGES

Lu, Lu; Anderson-Cook, Christine Michaela

2017-04-19

Reliability test plans are important for producing precise and accurate assessment of reliability characteristics. This paper explores different strategies for choosing between possible inspection plans for interval censored data given a fixed testing timeframe and budget. A new general cost structure is proposed for guiding precise quantification of total cost in inspection test plan. Multiple summaries of reliability are considered and compared as the criteria for choosing the best plans using an easily adapted method. Different cost structures and representative true underlying reliability curves demonstrate how to assess different strategies given the logistical constraints and nature of the problem. Results show several general patterns exist across a wide variety of scenarios. Given the fixed total cost, plans that inspect more units with less frequency based on equally spaced time points are favored due to the ease of implementation and consistent good performance across a large number of case study scenarios. Plans with inspection times chosen based on equally spaced probabilities offer improved reliability estimates for the shape of the distribution, mean lifetime, and failure time for a small fraction of population only for applications with high infant mortality rates. The paper uses a Monte Carlo simulation based approach in addition to the common evaluation based on the asymptotic variance and offers comparison and recommendation for different applications with different objectives. Additionally, the paper outlines a variety of different reliability metrics to use as criteria for optimization, presents a general method for evaluating different alternatives, as well as provides case study results for different common scenarios.
Modeling heterogeneous (co)variances from adjacent-SNP groups improves genomic prediction for milk protein composition traits.

PubMed

Gebreyesus, Grum; Lund, Mogens S; Buitenhuis, Bart; Bovenhuis, Henk; Poulsen, Nina A; Janss, Luc G

2017-12-05

Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using deterministic formula. BayesAS models improved prediction reliability for most of the traits compared to GBLUP models and this gain depended on segment size and genetic architecture of the traits. The gain in prediction reliability was especially marked for the protein composition traits β-CN, κ-CN and β-LG, for which prediction reliabilities were improved by 49 percentage points on average using the MT-BayesAS model with a 100-SNP segment size compared to the bivariate GBLUP. Prediction reliabilities were highest with the BayesAS model that uses a 100-SNP segment size. The bivariate versions of our BayesAS models resulted in extra gains of up to 6% in prediction reliability compared to the univariate versions. Substantial improvement in prediction reliability was possible for most of the traits related to milk protein composition using our novel BayesAS models. Grouping adjacent SNPs into segments provided enhanced information to estimate parameters and allowing the segments to have different (co)variances helped disentangle heterogeneous (co)variances across the genome.
Reducing random measurement error in assessing postural load on the back in epidemiologic surveys.

PubMed

Burdorf, A

1995-02-01

The goal of this study was to design strategies to assess postural load on the back in occupational epidemiology by taking into account the reliability of measurement methods and the variability of exposure among the workers under study. Intermethod reliability studies were evaluated to estimate the systematic bias (accuracy) and random measurement error (precision) of various methods to assess postural load on the back. Intramethod reliability studies were reviewed to estimate random variability of back load over time. Intermethod surveys have shown that questionnaires have a moderate reliability for gross activities such as sitting, whereas duration of trunk flexion and rotation should be assessed by observation methods or inclinometers. Intramethod surveys indicate that exposure variability can markedly affect the reliability of estimates of back load if the estimates are based upon a single measurement over a certain time period. Equations have been presented to evaluate various study designs according to the reliability of the measurement method, the optimum allocation of the number of repeated measurements per subject, and the number of subjects in the study. Prior to a large epidemiologic study, an exposure-oriented survey should be conducted to evaluate the performance of measurement instruments and to estimate sources of variability for back load. The strategy for assessing back load can be optimized by balancing the number of workers under study and the number of repeated measurements per worker.
The relationship between cost estimates reliability and BIM adoption: SEM analysis

NASA Astrophysics Data System (ADS)

Ismail, N. A. A.; Idris, N. H.; Ramli, H.; Rooshdi, R. R. Raja Muhammad; Sahamir, S. R.

2018-02-01

This paper presents the use of a Structural Equation Modelling (SEM) approach to analyse how Building Information Modelling (BIM) technology adoption improves the reliability of cost estimates. Based on the questionnaire survey results, SEM analysis using the SPSS-AMOS application examined the relationships between BIM-improved information and cost estimates reliability factors, leading to BIM technology adoption. Six hypotheses were established prior to SEM analysis employing two types of SEM models, namely the Confirmatory Factor Analysis (CFA) model and the full structural model. The SEM models were then validated through the assessment of their uni-dimensionality, validity, reliability, and fitness index, in line with the hypotheses tested. The final SEM model fit measures are: P-value=0.000, RMSEA=0.079<0.08, GFI=0.824, CFI=0.962>0.90, TLI=0.956>0.90, NFI=0.935>0.90 and ChiSq/df=2.259, indicating that the overall index values achieved the required level of model fitness. The model supports all the hypotheses evaluated, confirming that all relationships among the constructs are positive and significant. Ultimately, the analysis verified that most of the respondents foresee a better understanding of project input information through BIM visualization, its reliable database and coordinated data, in developing more reliable cost estimates. They also expect BIM adoption to accelerate their cost estimating tasks.
Reliability of Space-Shuttle Pressure Vessels with Random Batch Effects

NASA Technical Reports Server (NTRS)

Feiveson, Alan H.; Kulkarni, Pandurang M.

2000-01-01

In this article we revisit the problem of estimating the joint reliability against failure by stress rupture of a group of fiber-wrapped pressure vessels used on Space-Shuttle missions. The available test data were obtained from an experiment conducted at the U.S. Department of Energy Lawrence Livermore Laboratory (LLL) in which scaled-down vessels were subjected to life testing at four accelerated levels of pressure. We estimate the reliability assuming that both the Shuttle and LLL vessels were chosen at random in a two-stage process from an infinite population with spools of fiber as the primary sampling unit. Two main objectives of this work are: (1) to obtain practical estimates of reliability taking into account random spool effects and (2) to obtain a realistic assessment of estimation accuracy under the random model. Here, reliability is calculated in terms of a 'system' of 22 fiber-wrapped pressure vessels, taking into account typical pressures and exposure times experienced by Shuttle vessels. Comparisons are made with previous studies. The main conclusion of this study is that, although point estimates of reliability are still in the 'comfort zone,' it is advisable to plan for replacement of the pressure vessels well before the expected lifetime of 100 missions per Shuttle Orbiter. Under a random-spool model, there is simply not enough information in the LLL data to provide reasonable assurance that such replacement would not be necessary.
Towards a sampling strategy for the assessment of forest condition at European level: combining country estimates.

PubMed

Travaglini, Davide; Fattorini, Lorenzo; Barbati, Anna; Bottalico, Francesca; Corona, Piermaria; Ferretti, Marco; Chirici, Gherardo

2013-04-01

A correct characterization of the status and trend of forest condition is essential to support reporting processes at national and international level. An international forest condition monitoring has been implemented in Europe since 1987 under the auspices of the International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests). The monitoring is based on harmonized methodologies, with individual countries being responsible for its implementation. Due to inconsistencies and problems in sampling design, however, the ICP Forests network is not able to produce reliable quantitative estimates of forest condition at European and sometimes at country level. This paper proposes (1) a set of requirements for status and change assessment and (2) a harmonized sampling strategy able to provide unbiased and consistent estimators of forest condition parameters and of their changes at both country and European level. Under the assumption that a common definition of forest holds among European countries, monitoring objectives, parameters of concern and accuracy indexes are stated. On the basis of fixed-area plot sampling performed independently in each country, an unbiased and consistent estimator of forest defoliation indexes is obtained at both country and European level, together with conservative estimators of their sampling variance and power in the detection of changes. The strategy adopts a probabilistic sampling scheme based on fixed-area plots selected by means of systematic or stratified schemes. Operative guidelines for its application are provided.
The Development of the Cleft Aesthetic Rating Scale: A New Rating Scale for the Assessment of Nasolabial Appearance in Complete Unilateral Cleft Lip and Palate Patients.

PubMed

Mosmuller, David G M; Mennes, Lisette M; Prahl, Charlotte; Kramer, Gem J C; Disse, Melissa A; van Couwelaar, Gijs M; Niessen, Frank B; Griot, J P W Don

2017-09-01

The development of the Cleft Aesthetic Rating Scale, a simple and reliable photographic reference scale for the assessment of nasolabial appearance in complete unilateral cleft lip and palate patients. A blind retrospective analysis of photographs of cleft lip and palate patients was performed with this new rating scale. VU Medical Center Amsterdam and the Academic Center for Dentistry of Amsterdam. Complete unilateral cleft lip and palate patients at the age of 6 years. Photographs that showed the highest interobserver agreement in earlier assessments were selected for the photographic reference scale. Rules were attached to the rating scale to provide a guideline for the assessment and improve interobserver reliability. Cropped photographs revealing only the nasolabial area were assessed by six observers using this new Cleft Aesthetic Rating Scale in two different sessions. Photographs of 62 children (6 years of age, 44 boys and 18 girls) were assessed. The interobserver reliability for the nose and lip together was 0.62, obtained with the intraclass correlation coefficient. To measure the internal consistency, a Cronbach alpha of .91 was calculated. The estimated reliability for three observers was .84, obtained with the Spearman Brown formula. A new, easy to use, and reliable scoring system with a photographic reference scale is presented in this study.
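The Spearman-Brown step mentioned in this abstract projects the reliability of a single observer to the reliability of the mean of k observers. A minimal sketch, assuming the single-observer value is the reported intraclass correlation of 0.62 (the function name is illustrative):

```python
def spearman_brown(single_reliability, k):
    """Predicted reliability of the average of k parallel raters or test parts."""
    r = single_reliability
    return k * r / (1.0 + (k - 1) * r)


# Single-observer reliability as reported in the abstract; k = 3 observers.
print(f"{spearman_brown(0.62, 3):.3f}")
```

With the rounded input of 0.62 this gives roughly 0.83, close to the .84 reported; the small gap may reflect rounding of the published coefficient or inputs that the abstract does not state.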
The psychometric properties, sensitivity and specificity of the geriatric anxiety inventory, hospital anxiety and depression scale, and rating anxiety in dementia scale in aged care residents.

PubMed

Creighton, Alexandra S; Davison, Tanya E; Kissane, David W

2018-02-22

Limited research has been conducted into the identification of a valid and reliable screening measure for anxiety in aged care settings, despite it being one of the most common psychological conditions. This study aimed to determine an appropriate anxiety screening tool for aged care by comparing the reliability and validity of three commonly used measures and identifying specific cut-offs for the identification of generalized anxiety disorder (GAD). One-hundred and eighty nursing home residents (M age = 85.39 years) completed the GAI, HADS-A, and RAID, along with a structured diagnostic interview. Twenty participants (11.1%) met DSM-5 criteria for GAD. All measures had good psychometric properties, although reliability estimates for the HADS-A were sub-optimal. Privileging sensitivity, the GAI cut-off score of 9 gave sensitivity of 90.0% and specificity of 86.3%; HADS-A cut-off of 6 gave sensitivity of 90.0% and specificity of 80.6%; and RAID cut-off of 11 gave sensitivity of 85.0% and specificity of 72.5%. While all three measures had adequate reliability, validity, and cut-scores with high levels of sensitivity and specificity to detect anxiety within aged care, the GAI was the most consistently reliable and valid measure for screening for GAD.
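Sensitivity and specificity at a chosen cut-off follow directly from the four cells of a screening table. The sketch below uses cell counts that are an assumption: they are back-calculated from the abstract's 180 residents, 20 GAD cases and the reported GAI percentages, and are not given in the source.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of true cases that the screen flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of non-cases that the screen correctly clears."""
    return true_neg / (true_neg + false_pos)


# Assumed counts for the GAI at cut-off 9: 20 cases, 160 non-cases.
gai = {"true_pos": 18, "false_neg": 2, "true_neg": 138, "false_pos": 22}

sens = sensitivity(gai["true_pos"], gai["false_neg"])
spec = specificity(gai["true_neg"], gai["false_pos"])
print(f"sensitivity = {sens:.4f}, specificity = {spec:.4f}")
```

Under these assumed counts the sensitivity is 0.90 and the specificity is 0.8625, consistent with the reported 90.0% and 86.3%. With only 20 cases, each additional missed case moves sensitivity by five percentage points.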
Measurement Properties of the Persian Translated Version of Graves Orbitopathy Quality of Life Questionnaire: A Validation Study.

PubMed

Kashkouli, Mohsen Bahmani; Karimi, Nasser; Aghamirsalim, Mohamadreza; Abtahi, Mohammad Bagher; Nojomi, Marzieh; Shahrad-Bejestani, Hadi; Salehi, Masoud

2017-02-01

To determine the measurement properties of the Persian language version of the Graves orbitopathy quality of life questionnaire (GO-QOL). Following a systematic translation and cultural adaptation process, 141 consecutive unselected thyroid eye disease (TED) patients answered the Persian GO-QOL and underwent complete ophthalmic examination. The questionnaire was again completed by 60 patients on the second visit, 2-4 weeks later. Construct validity (cross-cultural validity, structural validity and hypotheses testing), reliability (internal consistency and test-retest reliability), and floor and ceiling effects of the Persian version of the GO-QOL were evaluated. Furthermore, Rasch analysis was used to assess its psychometric properties. Cross-cultural validity was established by back-translation techniques, committee review and pretesting techniques. Bi-dimensionality of the questionnaire was confirmed by factor analysis. Construct validity was also supported through confirmation of 6 out of 8 predefined hypotheses. Cronbach's α and intraclass correlation coefficient (ICC) were 0.650 and 0.859 for visual functioning and 0.875 and 0.896 for appearance subscale, respectively. Mean quality of life (QOL) scores for visual functioning and appearance were 78.18 (standard deviation, SD, 21.57) and 56.25 (SD 26.87), respectively. Person reliabilities from the Rasch rating scale model for both visual functioning and appearance revealed an acceptable internal consistency for the Persian GO-QOL. The Persian GO-QOL questionnaire is a valid and reliable tool with good psychometric properties in evaluation of Persian-speaking patients with TED. Applying Rasch analysis to future versions of the GO-QOL is recommended in order to perform tests for linearity between the estimated item measures in different versions.
[Validity and Reliability of the KIDSCREEN-27 Life Quality Questionnaire, Parents' Version, in Medellin, Colombia].

PubMed

Vélez, Claudia Marcela; Lugo, Luz Helena; García, Héctor Iván

2012-09-01

To validate the KIDSCREEN-27 for parents in the metropolitan area of Medellín, Colombia, including the Social Acceptance (SA) subscale of KIDSCREEN-52, as it evaluates the effect of bullying on the quality of life of children. The study population was made up of parents of children aged between 8 and 18, from Medellín and its metropolitan area. A sample of 1,150 parents was estimated according to the different psychometric properties to be measured. Construct validation was made by comparing the mean scores between groups of high and low socioeconomic conditions. The content validity and the measurement of reliability were verified by internal consistency and test-retest stability. The parent-child agreement was also measured. The internal consistency was adequate (Cronbach alpha 0.76-0.83). Parents of children with better socio-economic status had higher scores in all dimensions (p<0.05). Scores were higher among healthy children. Women had lower scores than men, while children registered higher scores than adolescents. The intraclass correlation coefficient for the reliability assessment was above 0.7 in all dimensions, except in School Environment (SE) (ICC 0.6-0.92). The parent-child agreement reached moderate and good levels (ICC 0.49-0.69). The exploratory factor analysis, including the social acceptance subscale, registered eight dimensions, four of which agreed with the original questionnaire: Physical activity, SE, Social Support, and the SA subscale. KIDSCREEN-27 for parents is a valid and reliable instrument to be used in the Colombian context. Copyright © 2012 Asociación Colombiana de Psiquiatría. Published by Elsevier España. All rights reserved.
Translation, adaptation and validation of a Portuguese version of the Moorehead-Ardelt Quality of Life Questionnaire II.

PubMed

Maciel, João; Infante, Paulo; Ribeiro, Susana; Ferreira, André; Silva, Artur C; Caravana, Jorge; Carvalho, Manuel G

2014-11-01

The prevalence of obesity has increased worldwide. An assessment of the impact of obesity on health-related quality of life (HRQoL) requires specific instruments. The Moorehead-Ardelt Quality of Life Questionnaire II (MA-II) is a widely used instrument to assess HRQoL in morbidly obese patients. The objective of this study was to translate and validate a Portuguese version of the MA-II. The study included forward and backward translations of the original MA-II. The reliability of the Portuguese MA-II was estimated using the internal consistency and test-retest methods. For validation purposes, the Spearman's rank correlation coefficient was used to evaluate the correlation between the Portuguese MA-II and the Portuguese versions of two other questionnaires, the 36-item Short Form Health Survey (SF-36) and the Impact of Weight on Quality of Life-Lite (IWQOL-Lite). One hundred and fifty morbidly obese patients were randomly assigned to test the reliability and validity of the Portuguese MA-II. Good internal consistency was demonstrated by a Cronbach's alpha coefficient of 0.80, and a very good agreement in terms of test-retest reliability was recorded, with an overall intraclass correlation coefficient (ICC) of 0.88. The total sums of MA-II scores and each item of MA-II were significantly correlated with all domains of SF-36 and IWQOL-Lite. A statistically significant negative correlation was found between the MA-II total score and BMI. Moreover, age, gender and surgical status were independent predictors of MA-II total score. A reliable and valid Portuguese version of the MA-II was produced, thus enabling the routine use of MA-II in the morbidly obese Portuguese population.
Study on the Validity and Reliability of Melbourne Decision Making Scale in Turkey

ERIC Educational Resources Information Center

Çolakkadioglu, Oguzhan; Deniz, M. Engin

2015-01-01

This study is to analyze the validity and reliability of Melbourne Decision Making Questionnaire (MDMQ). The sample consisted of 650 university students. The structural validity of the MDMQ, as well as correlations among its sub-scales, measure-bound validity, internal consistency, item total correlations and test-retest reliability coefficients…
Reliability, Dimensionality, and Internal Consistency as Defined by Cronbach: Distinct Albeit Related Concepts

ERIC Educational Resources Information Center

Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U.

2015-01-01

This article uses definitions provided by Cronbach in his seminal paper for coefficient α to show the concepts of reliability, dimensionality, and internal consistency are distinct but interrelated. The article begins with a critique of the definition of reliability and then explores mathematical properties of Cronbach's α. Internal consistency…
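For readers unfamiliar with the coefficient discussed here, the standard formula for Cronbach's α can be sketched in a few lines. This is a minimal illustration on made-up data, using only the Python standard library; the function name `cronbach_alpha` and the response matrix are hypothetical and do not come from the article.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Coefficient alpha for a persons-by-items score matrix."""
    k = len(scores[0])                                   # number of items
    items = list(zip(*scores))                           # items-by-persons
    item_var_sum = sum(variance(col) for col in items)   # sum of item variances
    total_var = variance([sum(row) for row in scores])   # variance of sum scores
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


# Made-up responses: 5 persons (rows) by 3 items (columns).
data = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(f"alpha = {cronbach_alpha(data):.3f}")
```

A high α from such a computation says the items covary strongly relative to their individual variances; as the article argues, it does not by itself establish unidimensionality.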
Reliability analysis of structural ceramic components using a three-parameter Weibull distribution

NASA Technical Reports Server (NTRS)

Duffy, Stephen F.; Powers, Lynn M.; Starlinger, Alois

1992-01-01

Described here are nonlinear regression estimators for the three-parameter Weibull distribution. Issues relating to the bias and invariance associated with these estimators are examined numerically using Monte Carlo simulation methods. The estimators were used to extract parameters from sintered silicon nitride failure data. A reliability analysis was performed on a turbopump blade utilizing the three-parameter Weibull distribution and the estimates from the sintered silicon nitride data.
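As background for this record, the survival (reliability) function of a three-parameter Weibull model can be written down directly. The sketch below assumes the common shape/scale/threshold parameterization; the parameter values are hypothetical and chosen only for illustration, and the report's nonlinear regression estimators are not reproduced here.

```python
import math


def weibull3_reliability(stress: float, shape: float, scale: float, threshold: float) -> float:
    """Survival probability under a three-parameter Weibull model."""
    if stress <= threshold:
        return 1.0  # no failures are predicted at or below the threshold
    return math.exp(-(((stress - threshold) / scale) ** shape))


# Hypothetical parameters (not from the report): shape 2, scale 300, threshold 250.
for s in (200.0, 400.0, 600.0):
    r = weibull3_reliability(s, shape=2.0, scale=300.0, threshold=250.0)
    print(f"stress = {s:5.0f}  reliability = {r:.4f}")
```

Setting the threshold to zero recovers the more familiar two-parameter form.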

Current status, uncertainty and future needs in soil organic carbon monitoring.

PubMed

Jandl, Robert; Rodeghiero, Mirco; Martinez, Cristina; Cotrufo, M Francesca; Bampa, Francesca; van Wesemael, Bas; Harrison, Robert B; Guerrini, Iraê Amaral; Richter, Daniel Deb; Rustad, Lindsey; Lorenz, Klaus; Chabbi, Abad; Miglietta, Franco

2014-01-15

Increasing human demands on soil-derived ecosystem services require reliable data on global soil resources for sustainable development. The soil organic carbon (SOC) pool is a key indicator of soil quality as it affects essential biological, chemical and physical soil functions such as nutrient cycling, pesticide and water retention, and soil structure maintenance. However, information on the SOC pool, and its temporal and spatial dynamics is unbalanced. Even in well-studied regions with a pronounced interest in environmental issues information on soil carbon (C) is inconsistent. Several activities for the compilation of global soil C data are under way. However, different approaches for soil sampling and chemical analyses make even regional comparisons highly uncertain. Often, the procedures used so far have not allowed the reliable estimation of the total SOC pool, partly because the available knowledge is focused on not clearly defined upper soil horizons and the contribution of subsoil to SOC stocks has been less considered. Even more difficult is quantifying SOC pool changes over time. SOC consists of variable amounts of labile and recalcitrant molecules of plant, and microbial and animal origin that are often operationally defined. A comprehensively active soil expert community needs to agree on protocols of soil surveying and lab procedures towards reliable SOC pool estimates. Already established long-term ecological research sites, where SOC changes are quantified and the underlying mechanisms are investigated, are potentially the backbones for regional, national, and international SOC monitoring programs. © 2013.
Psychometric properties of the Alcohol Use Disorders Identification Test (AUDIT) and prevalence of alcohol use among Iranian psychiatric outpatients.

PubMed

Noorbakhsh, Simasadat; Shams, Jamal; Faghihimohamadi, Mohamadmahdi; Zahiroddin, Hanieh; Hallgren, Mats; Kallmen, Hakan

2018-01-30

Iran is a developing and Islamic country where the consumption of alcoholic beverages is banned. However, psychiatric disorders and alcohol use disorders are often co-occurring. We used the Alcohol Use Disorders Identification Test (AUDIT) to estimate the prevalence of alcohol use and examined the psychometric properties of the test among psychiatric outpatients in Teheran, Iran. AUDIT was completed by 846 consecutive (sequential) patients. Descriptive statistics, internal consistency (Cronbach alpha), confirmatory and exploratory factor analyses were used to analyze the prevalence of alcohol use, reliability and construct validity. 12% of men and 1% of women were hazardous alcohol consumers. Internal reliability of the Iranian version of AUDIT was excellent. Confirmatory factor analyses showed that the construct validity and the fit of previous factor structures (1, 2 and 3 factors) to data were not good and seemingly contradicted results from the explorative principal axis factoring, which showed that a 1-factor solution explained 77% of the co-variances. We could not reproduce the suggested factor structure of AUDIT, probably due to the skewed distribution of alcohol consumption. Only 19% of men and 3% of women scored above 0 on AUDIT. This could be explained by the fact that alcohol is illegal in Iran. In conclusion the AUDIT exhibited good internal reliability when used as a single scale. The prevalence estimates according to AUDIT were somewhat higher among psychiatric patients compared to what was reported by WHO regarding the general population.
Proposed Reliability/Cost Model

NASA Technical Reports Server (NTRS)

Delionback, L. M.

1982-01-01

New technique estimates cost of improvement in reliability for complex system. Model format/approach is dependent upon use of subsystem cost-estimating relationships (CER's) in devising cost-effective policy. Proposed methodology should have application in broad range of engineering management decisions.
Final Report: Studies in Structural, Stochastic and Statistical Reliability for Communication Networks and Engineered Systems

DTIC Science & Technology

to do so, and (5) three distinct versions of the problem of estimating component reliability from system failure-time data are treated, each resulting in consistent estimators with asymptotically normal distributions.
Research Review: Test-retest reliability of standardized diagnostic interviews to assess child and adolescent psychiatric disorders: a systematic review and meta-analysis.

PubMed

Duncan, Laura; Comeau, Jinette; Wang, Li; Vitoroulis, Irene; Boyle, Michael H; Bennett, Kathryn

2018-02-19

A better understanding of factors contributing to the observed variability in estimates of test-retest reliability in published studies on standardized diagnostic interviews (SDI) is needed. The objectives of this systematic review and meta-analysis were to estimate the pooled test-retest reliability for parent and youth assessments of seven common disorders, and to examine sources of between-study heterogeneity in reliability. Following a systematic review of the literature, multilevel random effects meta-analyses were used to analyse 202 reliability estimates (Cohen's kappa = κ) from 31 eligible studies and 5,369 assessments of 3,344 children and youth. Pooled reliability was moderate at κ = .58 (CI 95% 0.53-0.63) and between-study heterogeneity was substantial (Q = 2,063 (df = 201), p < .001 and I² = 79%). In subgroup analysis, reliability varied across informants for specific types of psychiatric disorder (κ = .53-.69 for parent vs. κ = .39-.68 for youth) with estimates significantly higher for parents on attention deficit hyperactivity disorder, oppositional defiant disorder and the broad groupings of externalizing and any disorder. Reliability was also significantly higher in studies with indicators of poor or fair study methodology quality (sample size <50, retest interval <7 days). Our findings raise important questions about the meaningfulness of published evidence on the test-retest reliability of SDIs and the usefulness of these tools in both clinical and research contexts. Potential remedies include the introduction of standardized study and reporting requirements for reliability studies, and exploration of other approaches to assessing and classifying child and adolescent psychiatric disorder. © 2018 Association for Child and Adolescent Mental Health.
Tutorial: Asteroseismic Stellar Modelling with AIMS

NASA Astrophysics Data System (ADS)

Lund, Mikkel N.; Reese, Daniel R.

The goal of AIMS (Asteroseismic Inference on a Massive Scale) is to estimate stellar parameters and credible intervals/error bars in a Bayesian manner from a set of asteroseismic frequency data and so-called classical constraints. To achieve reliable parameter estimates and computational efficiency, it searches through a grid of pre-computed models using an MCMC algorithm; interpolation within the grid of models is performed by first tessellating the grid using a Delaunay triangulation and then doing a linear barycentric interpolation on matching simplexes. Inputs for the modelling consist of individual frequencies from peak-bagging, which can be complemented with classical spectroscopic constraints. AIMS is mostly written in Python with a modular structure to facilitate contributions from the community. Only a few computationally intensive parts have been rewritten in Fortran in order to speed up calculations.
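The interpolation step described in this abstract can be illustrated on a single simplex. The sketch below works in two dimensions and assumes the enclosing triangle is already known; it does not perform the Delaunay tessellation and is not the AIMS code. The function names and the example values are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b  # the three weights sum to one


def interpolate(p, vertices, values):
    """Linearly interpolate vertex values at p inside one triangle."""
    weights = barycentric_weights(p, *vertices)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical grid cell: three model points and a quantity known at each.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
quantity = [1.0, 2.0, 3.0]
print(interpolate((0.25, 0.25), triangle, quantity))
```

In higher dimensions the same idea applies with one weight per simplex vertex; a point lies inside the simplex when all weights are non-negative.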
The test-retest reliability of the latent construct of executive function depends on whether tasks are represented as formative or reflective indicators.

PubMed

Willoughby, Michael T; Kuhn, Laura J; Blair, Clancy B; Samek, Anya; List, John A

2017-10-01

This study investigates the test-retest reliability of a battery of executive function (EF) tasks with a specific interest in testing whether the method that is used to create a battery-wide score would result in differences in the apparent test-retest reliability of children's performance. A total of 188 4-year-olds completed a battery of computerized EF tasks twice across a period of approximately two weeks. Two different approaches were used to create a score that indexed children's overall performance on the battery, i.e., (1) the mean score of all completed tasks and (2) a factor score estimate which used confirmatory factor analysis (CFA). Pearson and intra-class correlations were used to investigate the test-retest reliability of individual EF tasks, as well as an overall battery score. Consistent with previous studies, the test-retest reliability of individual tasks was modest (rs ≈ .60). The test-retest reliability of the overall battery scores differed depending on the scoring approach (r_mean = .72; r_factor_score = .99). It is concluded that the children's performance on individual EF tasks exhibits modest levels of test-retest reliability. This underscores the importance of administering multiple tasks and aggregating performance across these tasks in order to improve precision of measurement. However, the specific strategy that is used has a large impact on the apparent test-retest reliability of the overall score. These results replicate our earlier findings and provide additional cautionary evidence against the routine use of factor analytic approaches for representing individual performance across a battery of EF tasks.
General inattentiveness is a long-term reliable trait independently predictive of psychological health: Danish validation studies of the Mindful Attention Awareness Scale.

PubMed

Jensen, Christian Gaden; Niclasen, Janni; Vangkilde, Signe Allerup; Petersen, Anders; Hasselbalch, Steen Gregers

2016-05-01

The Mindful Attention Awareness Scale (MAAS) measures perceived degree of inattentiveness in different contexts and is often used as a reversed indicator of mindfulness. MAAS is hypothesized to reflect a psychological trait or disposition when used outside attentional training contexts, but the long-term test-retest reliability of MAAS scores is virtually untested. It is unknown whether MAAS predicts psychological health after controlling for standardized socioeconomic status classifications. First, MAAS translated to Danish was validated psychometrically within a randomly invited healthy adult community sample (N = 490). Factor analysis confirmed that MAAS scores quantified a unifactorial construct of excellent composite reliability and consistent convergent validity. Structural equation modeling revealed that MAAS scores contributed independently to predicting psychological distress and mental health, after controlling for age, gender, income, socioeconomic occupational class, stressful life events, and social desirability (β = .32-.42, ps < .001). Second, MAAS scores showed satisfactory short-term test-retest reliability in 100 retested healthy university students. Finally, MAAS sample mean scores as well as individuals' scores demonstrated satisfactory test-retest reliability across a 6-month interval in the adult community (retested N = 407), intraclass correlations ≥ .74. MAAS scores displayed significantly stronger long-term test-retest reliability than scores measuring psychological distress (z = 2.78, p = .005). Test-retest reliability estimates did not differ within demographic and socioeconomic strata. Scores on the Danish MAAS were psychometrically validated in healthy adults. MAAS's inattentiveness scores reflected a unidimensional construct, long-term reliable disposition, and a factor of independent significance for predicting psychological health. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Examining the reliability of ADAS-Cog change scores.

PubMed

Grochowalski, Joseph H; Liu, Ying; Siedlecki, Karen L

2016-09-01

The purpose of this study was to estimate and examine ways to improve the reliability of change scores on the Alzheimer's Disease Assessment Scale, Cognitive Subtest (ADAS-Cog). The sample, provided by the Alzheimer's Disease Neuroimaging Initiative, included individuals with Alzheimer's disease (AD) (n = 153) and individuals with mild cognitive impairment (MCI) (n = 352). All participants were administered the ADAS-Cog at baseline and 1 year, and change scores were calculated as the difference in scores over the 1-year period. Three types of change score reliabilities were estimated using multivariate generalizability. Two methods to increase change score reliability were evaluated: reweighting the subtests of the scale and adding more subtests. Reliability of ADAS-Cog change scores over 1 year was low for both the AD sample (ranging from .53 to .64) and the MCI sample (.39 to .61). Reweighting the change scores from the AD sample improved reliability (.68 to .76), but lengthening provided no useful improvement for either sample. The MCI change scores had low reliability, even with reweighting and adding additional subtests. The ADAS-Cog scores had low reliability for measuring change. Researchers using the ADAS-Cog should estimate and report reliability for their use of the change scores. The ADAS-Cog change scores are not recommended for assessment of meaningful clinical change.
On robust parameter estimation in brain-computer interfacing

NASA Astrophysics Data System (ADS)

Samek, Wojciech; Nakajima, Shinichi; Kawanabe, Motoaki; Müller, Klaus-Robert

2017-12-01

Objective. The reliable estimation of parameters such as mean or covariance matrix from noisy and high-dimensional observations is a prerequisite for successful application of signal processing and machine learning algorithms in brain-computer interfacing (BCI). This challenging task becomes significantly more difficult if the data set contains outliers, e.g. due to subject movements, eye blinks or loose electrodes, as they may heavily bias the estimation and the subsequent statistical analysis. Although various robust estimators have been developed to tackle the outlier problem, they ignore important structural information in the data and thus may not be optimal. Typical structural elements in BCI data are the trials consisting of a few hundred EEG samples and indicating the start and end of a task. Approach. This work discusses the parameter estimation problem in BCI and introduces a novel hierarchical view on robustness which naturally comprises different types of outlierness occurring in structured data. Furthermore, the class of minimum divergence estimators is reviewed and a robust mean and covariance estimator for structured data is derived and evaluated with simulations and on a benchmark data set. Main results. The results show that state-of-the-art BCI algorithms benefit from robustly estimated parameters. Significance. Since parameter estimation is an integral part of various machine learning algorithms, the presented techniques are applicable to many problems beyond BCI.
Stratifying empiric risk of schizophrenia among first degree relatives using multiple predictors in two independent Indian samples.

PubMed

Bhatia, Triptish; Gettig, Elizabeth A; Gottesman, Irving I; Berliner, Jonathan; Mishra, N N; Nimgaonkar, Vishwajit L; Deshpande, Smita N

2016-12-01

Schizophrenia (SZ) has an estimated heritability of 64-88%, with the higher values based on twin studies. Conventionally, family history of psychosis is the best individual-level predictor of risk, but reliable risk estimates are unavailable for Indian populations. Genetic, environmental, and epigenetic factors are equally important and should be considered when predicting risk in 'at risk' individuals. To estimate risk based on an Indian schizophrenia participant's family history combined with selected demographic factors. To incorporate variables in addition to family history, and to stratify risk, we constructed a regression equation that included demographic variables in addition to family history. The equation was tested in two independent Indian samples: (i) an initial sample of SZ participants (N=128) with one sibling or offspring; (ii) a second, independent sample consisting of multiply affected families (N=138 families, with two or more sibs/offspring affected with SZ). The overall estimated risk was 4.31±0.27 (mean±standard deviation). There were 19 (14.8%) individuals in the high risk group, 75 (58.6%) in the moderate risk and 34 (26.6%) in the above average risk (in Sample A). In the validation sample, risks were distributed as: high (45%), moderate (38%) and above average (17%). Consistent risk estimates were obtained from both samples using the regression equation. Familial risk can be combined with demographic factors to estimate risk for SZ in India. If replicated, the proposed stratification of risk may be easier and more realistic for family members. Copyright © 2016. Published by Elsevier B.V.
Psychometric Properties of the Multidimensional Pain Inventory Applied to Brazilian Patients with Orofacial Pain.

PubMed

Zucoloto, Miriane Lucindo; Maroco, João; Duarte Bonini Campos, Juliana Alvares

2015-01-01

To evaluate the psychometric properties of the Multidimensional Pain Inventory (MPI) in a Brazilian sample of patients with orofacial pain. A total of 1,925 adult patients, who sought dental care in the School of Dentistry of São Paulo State University's Araraquara campus, were invited to participate; 62.5% (n=1,203) agreed to participate. Of these, 436 presented with orofacial pain and were included. The mean age was 39.9 (SD=13.6) years and 74.5% were female. Confirmatory factor analysis was conducted using χ²/df, comparative fit index, goodness of fit index, and root mean square error of approximation as indices of goodness of fit. Convergent validity was estimated by the average variance extracted and composite reliability, and internal consistency by Cronbach's alpha standardized coefficient (α). The stability of the models was tested in independent samples (test and validation; dental pain and orofacial pain). The factorial invariance was estimated by multigroup analysis (Δχ²). Factorial, convergent validity, and internal consistency were adequate in all three parts of the MPI. To achieve this adequate fit for Part 1, item 15 needed to be deleted (λ=0.13). Discriminant validity was compromised between the factors "activities outside the home" and "social activities" of Part 3 of the MPI in the total sample, validation sample, and in patients with dental pain and with orofacial pain. A strong invariance between different subsamples from the three parts of the MPI was detected. The MPI produced valid, reliable, and stable data for pain assessment among Brazilian patients with orofacial pain.
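The two convergent-validity indices named in this abstract are commonly computed from standardized factor loadings. The sketch below uses the usual formulas under the assumption of uncorrelated measurement errors; the loadings are hypothetical and are not values from the study.

```python
def average_variance_extracted(loadings):
    """Mean squared standardized loading of one factor's items."""
    return sum(x ** 2 for x in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings (uncorrelated errors assumed)."""
    total = sum(loadings)
    error = sum(1.0 - x ** 2 for x in loadings)  # residual variance per item
    return total ** 2 / (total ** 2 + error)


# Hypothetical standardized loadings for a single factor.
lam = [0.7, 0.8, 0.6]
print(f"AVE = {average_variance_extracted(lam):.3f}")
print(f"CR  = {composite_reliability(lam):.3f}")
```

A very low loading, such as the λ=0.13 reported for item 15, pulls both indices down, which is consistent with the authors' decision to delete that item.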
An Evaluation of Available Models for Estimating the Reliability and Validity of Criterion Referenced Measures.

ERIC Educational Resources Information Center

Oakland, Thomas

New strategies for evaluating criterion-referenced measures (CRM) are discussed. These strategies examine the following issues: (1) the use of norm-referenced measures (NRM) as CRM and then estimating the reliability and validity of such measures in terms of variance from an arbitrarily specified criterion score, (2) estimation of the…
A Note on the Reliability Coefficients for Item Response Model-Based Ability Estimates

ERIC Educational Resources Information Center

Kim, Seonghoon

2012-01-01

Assuming item parameters on a test are known constants, the reliability coefficient for item response theory (IRT) ability estimates is defined for a population of examinees in two different ways: as (a) the product-moment correlation between ability estimates on two parallel forms of a test and (b) the squared correlation between the true…
Metallicities of Galaxies in the Local Universe

NASA Astrophysics Data System (ADS)

Hirschauer, Alec Seth

2018-01-01

The degree of heavy-element enrichment for star-forming galaxies in the universe is a fundamental astrophysical characteristic which traces the amount of stellar nucleosynthesis undertaken by the constituent population of stars. Estimating this quantity via the so-called "direct-method" is observationally challenging and requires measurement of intrinsically weak temperature-sensitive nebular emission lines; however, these are typically not found for galaxies unless their emission lines are exceptionally bright. Metal abundances ("metallicities") must therefore be estimated by empirical means utilizing ratios of strong emission lines, calibrated to sources of known abundance and/or theoretical models, which are measurable in essentially any nebular spectrum of a star-forming system. Relationships concerning metallicities in galaxies such as the luminosity-metallicity and mass-metallicity are critically dependent upon reliable estimations of abundances. Therefore, having a reliable observational constraint is paramount to developing models which accurately reflect the universe. This dissertation presentation explores metallicities for galaxies in the local universe through a variety of means. First, an attempt is made to improve calibrations of empirical relationships for estimating abundances for star-forming galaxies at high-metallicities, finding some intrinsic shortcomings but also revealing some interesting new findings regarding the computation of the electron gas of star-forming systems, as well as detecting some anomalously under-abundant, overly-luminous galaxies. Second, the development of a self-consistent scale for estimating metallicities allows for the creation of luminosity-metallicity and mass-metallicity relations for a statistically representative sample of star-forming galaxies in the local universe. Finally, a discovery is made of an extremely metal-poor star-forming galaxy, which opens the possibility to find more similar systems and to better understand star-formation in exceptionally low-abundance environments.
Extensively Parameterized Mutation-Selection Models Reliably Capture Site-Specific Selective Constraint.

PubMed

Spielman, Stephanie J; Wilke, Claus O

2016-11-01

The mutation-selection model of coding sequence evolution has received renewed attention for its use in estimating site-specific amino acid propensities and selection coefficient distributions. Two computationally tractable mutation-selection inference frameworks have been introduced: One framework employs a fixed-effects, highly parameterized maximum likelihood approach, whereas the other employs a random-effects Bayesian Dirichlet Process approach. While both implementations follow the same model, they appear to make distinct predictions about the distribution of selection coefficients. The fixed-effects framework estimates a large proportion of highly deleterious substitutions, whereas the random-effects framework estimates that all substitutions are either nearly neutral or weakly deleterious. It remains unknown, however, how accurately each method infers evolutionary constraints at individual sites. Indeed, selection coefficient distributions pool all site-specific inferences, thereby obscuring a precise assessment of site-specific estimates. Therefore, in this study, we use a simulation-based strategy to determine how accurately each approach recapitulates the selective constraint at individual sites. We find that the fixed-effects approach, despite its extensive parameterization, consistently and accurately estimates site-specific evolutionary constraint. By contrast, the random-effects Bayesian approach systematically underestimates the strength of natural selection, particularly for slowly evolving sites. We also find that, despite the strong differences between their inferred selection coefficient distributions, the fixed- and random-effects approaches yield surprisingly similar inferences of site-specific selective constraint. We conclude that the fixed-effects mutation-selection framework provides the more reliable software platform for model application and future development. © The Author 2016. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Between-User Reliability of Tier 1 Exposure Assessment Tools Used Under REACH.

PubMed

Lamb, Judith; Galea, Karen S; Miller, Brian G; Hesse, Susanne; Van Tongeren, Martie

2017-10-01

When applying simple screening (Tier 1) tools to estimate exposure to chemicals in a given exposure situation under the Registration, Evaluation, Authorisation and restriction of CHemicals Regulation 2006 (REACH), users must select from several possible input parameters. Previous studies have suggested that results from exposure assessments using expert judgement and from the use of modelling tools can vary considerably between assessors. This study aimed to investigate the between-user reliability of Tier 1 tools. A remote-completion exercise and in-person workshop were used to identify and evaluate tool parameters and factors such as user demographics that may be associated with between-user variability. Participants (N = 146) generated dermal and inhalation exposure estimates (N = 4066) from specified workplace descriptions ('exposure situations') and Tier 1 tool combinations (N = 20). Interactions between users, tools, and situations were investigated and described. Systematic variation associated with individual users was minor compared with random between-user variation. Although variation was observed between choices made for the majority of input parameters, differing choices of Process Category ('PROC') code/activity descriptor and dustiness level had the greatest impact on the resultant exposure estimates. Exposure estimates ranging over several orders of magnitude were generated for the same exposure situation by different tool users. Such unpredictable between-user variation will reduce consistency within REACH processes and could result in underestimation or overestimation of exposure, risking worker ill-health or the implementation of unnecessary risk controls, respectively. Implementation of additional support and quality control systems for all tool users is needed to reduce between-assessor variation and so ensure both the protection of worker health and avoidance of unnecessary business risk management expenditure. © The Author 2017.
Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
Estimating Infection Attack Rates and Severity in Real Time during an Influenza Pandemic: Analysis of Serial Cross-Sectional Serologic Surveillance Data

PubMed Central

Wu, Joseph T.; Ho, Andrew; Ma, Edward S. K.; Lee, Cheuk Kwong; Chu, Daniel K. W.; Ho, Po-Lai; Hung, Ivan F. N.; Ho, Lai Ming; Lin, Che Kit; Tsang, Thomas; Lo, Su-Vui; Lau, Yu-Lung; Leung, Gabriel M.

2011-01-01

Background In an emerging influenza pandemic, estimating severity (the probability of a severe outcome, such as hospitalization, if infected) is a public health priority. As many influenza infections are subclinical, sero-surveillance is needed to allow reliable real-time estimates of infection attack rate (IAR) and severity. Methods and Findings We tested 14,766 sera collected during the first wave of the 2009 pandemic in Hong Kong using viral microneutralization. We estimated IAR and infection-hospitalization probability (IHP) from the serial cross-sectional serologic data and hospitalization data. Had our serologic data been available weekly in real time, we would have obtained reliable IHP estimates 1 wk after, 1–2 wk before, and 3 wk after epidemic peak for individuals aged 5–14 y, 15–29 y, and 30–59 y. The ratio of IAR to pre-existing seroprevalence, which decreased with age, was a major determinant for the timeliness of reliable estimates. If we began sero-surveillance 3 wk after community transmission was confirmed, with 150, 350, and 500 specimens per week for individuals aged 5–14 y, 15–19 y, and 20–29 y, respectively, we would have obtained reliable IHP estimates for these age groups 4 wk before the peak. For 30–59 y olds, even 800 specimens per week would not have generated reliable estimates until the peak because the ratio of IAR to pre-existing seroprevalence for this age group was low. The performance of serial cross-sectional sero-surveillance substantially deteriorates if test specificity is not near 100% or pre-existing seroprevalence is not near zero. These potential limitations could be mitigated by choosing a higher titer cutoff for seropositivity. If the epidemic doubling time is longer than 6 d, then serial cross-sectional sero-surveillance with 300 specimens per week would yield reliable estimates when IAR reaches around 6%–10%. 
Conclusions Serial cross-sectional serologic data together with clinical surveillance data can allow reliable real-time estimates of IAR and severity in an emerging pandemic. Sero-surveillance for pandemics should be considered. PMID:21990967
Smile line assessment comparing quantitative measurement and visual estimation.

PubMed

Van der Geld, Pieter; Oosterveld, Paul; Schols, Jan; Kuijpers-Jagtman, Anne Marie

2011-02-01

Esthetic analysis of dynamic functions such as spontaneous smiling is feasible by using digital videography and computer measurement for lip line height and tooth display. Because quantitative measurements are time-consuming, digital videography and semiquantitative (visual) estimation according to a standard categorization are more practical for regular diagnostics. Our objective in this study was to compare 2 semiquantitative methods with quantitative measurements for reliability and agreement. The faces of 122 male participants were individually registered by using digital videography. Spontaneous and posed smiles were captured. On the records, maxillary lip line heights and tooth display were digitally measured on each tooth and also visually estimated according to 3-grade and 4-grade scales. Two raters were involved. An error analysis was performed. Reliability was established with kappa statistics. Interexaminer and intraexaminer reliability values were high, with median kappa values from 0.79 to 0.88. Agreement of the 3-grade scale estimation with quantitative measurement showed higher median kappa values (0.76) than the 4-grade scale estimation (0.66). Differentiating high and gummy smile lines (4-grade scale) resulted in greater inaccuracies. The estimation of a high, average, or low smile line for each tooth showed high reliability close to quantitative measurements. Smile line analysis can be performed reliably with a 3-grade scale (visual) semiquantitative estimation. For a more comprehensive diagnosis, additional measuring is proposed, especially in patients with disproportional gingival display. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
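The agreement statistic in the abstract above can be illustrated with a minimal sketch of unweighted Cohen's kappa for two raters. The abstract does not say which kappa variant was used, so the unweighted form, the function name `cohen_kappa`, and the example grades are assumptions chosen for illustration only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items.

    Illustrative helper (not from the cited study): kappa compares the
    observed agreement with the agreement expected by chance from each
    rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty item set")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of marginal proportions, summed over grades.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one grade only")
    return (observed - expected) / (1.0 - expected)


# Hypothetical 3-grade smile line ratings for six teeth.
first = ["low", "low", "low", "average", "average", "high"]
second = ["low", "low", "average", "average", "average", "high"]
kappa = cohen_kappa(first, second)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance. For ordered grades such as these, a weighted kappa may be more appropriate, since it penalizes near-misses less than distant disagreements.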
Reliability of stellar inclination estimated from asteroseismology: analytical criteria, mock simulations and Kepler data analysis

NASA Astrophysics Data System (ADS)

Kamiaka, Shoya; Benomar, Othman; Suto, Yasushi

2018-05-01

Advances in asteroseismology of solar-like stars now provide a unique method to estimate the stellar inclination i⋆. This enables evaluation of the spin-orbit angle of transiting planetary systems, in a complementary fashion to the Rossiter-McLaughlin effect, a well-established method to estimate the projected spin-orbit angle λ. Although the asteroseismic method has been broadly applied to the Kepler data, its reliability has yet to be assessed intensively. In this work, we evaluate the accuracy of i⋆ from asteroseismology of solar-like stars using 3000 simulated power spectra. We find that the low signal-to-noise ratio of the power spectra induces a systematic under-estimate (over-estimate) bias for stars with high (low) inclinations. We derive analytical criteria for the reliable asteroseismic estimate, which indicate that reliable measurements are possible in the range of 20° ≲ i⋆ ≲ 80° only for stars with high signal-to-noise ratio. We also analyse and measure the stellar inclination of 94 Kepler main-sequence solar-like stars, among which 33 are planetary hosts. According to our reliability criteria, a third of them (9 with planets, 22 without) have accurate stellar inclination. Comparison of our asteroseismic estimate of v sin i⋆ against spectroscopic measurements indicates that the latter suffers from a large uncertainty possibly due to the modeling of macro-turbulence, especially for stars with projected rotation speed v sin i⋆ ≲ 5 km/s. This reinforces earlier claims, and the stellar inclination estimated from the combination of measurements from spectroscopy and photometric variation for slowly rotating stars needs to be interpreted with caution.

A method of bias correction for maximal reliability with dichotomous measures.

PubMed

Penev, Spiridon; Raykov, Tenko

2010-02-01

This paper is concerned with the reliability of weighted combinations of a given set of dichotomous measures. Maximal reliability for such measures has been discussed in the past, but the pertinent estimator exhibits a considerable bias and mean squared error for moderate sample sizes. We examine this bias, propose a procedure for bias correction, and develop a more accurate asymptotic confidence interval for the resulting estimator. In most empirically relevant cases, the bias correction and mean squared error correction can be performed simultaneously. We propose an approximate (asymptotic) confidence interval for the maximal reliability coefficient, discuss the implementation of this estimator, and investigate the mean squared error of the associated asymptotic approximation. We illustrate the proposed methods using a numerical example.
The Reliability and Validity of the Persian Version of Three-Factor Eating Questionnaire-R18 (TFEQ-R18) in Overweight and Obese Females

PubMed Central

Mostafavi, Seyed-Ali; Akhondzadeh, Shahin; Mohammadi, Mohammad Reza; Eshraghian, Mohammad Reza; Hosseini, Saeed; Chamari, Maryam; Keshavarz, Seyed Ali

2017-01-01

Objective: The Three-Factor Eating Questionnaire Reduced (TFEQ-R18) is one of the most widely used instruments for assessing eating behavior worldwide. The present study aimed at confirming the reliability and validity of the Persian version of TFEQ-R18 among overweight and obese females in Iran. Method: In the present study, 168 overweight and obese females consented to participate. We estimated the anthropometric indices and asked the participants to complete the TFEQ-R18. Beck Depression Inventory (BDI), Spielberger Anxiety Scale, Appetite Visual Analogue Rating Scale, Food Craving Questionnaire (FCQ), Compulsive Eating Scale (CES), and Restraint Eating Visual Analogue Rating Scale were performed simultaneously to assess concurrent validity. Two weeks later, TFEQ-R18 was repeated for 126 participants to assess test-retest reliability. Moreover, we reported the internal consistency and factor analysis of this questionnaire. Results: Using the results of the reliability analysis and exploratory factor analysis of the principal component by varimax rotation, we extracted 3 factors: hunger, cognitive restraint, and emotional eating. After removing Items 16 and 18, Cronbach's alpha increased to 0.73 (Cronbach's alpha of the factors was 0.84, 0.64, and 0.7, respectively). The results of the Pearson correlation revealed a consistency of 0.87 between the test and retest administrations (p = 0.001). Significant positive correlations were observed between TFEQ-R18 and BDI, Spielberger Anxiety Scale, FCQ, CES, appetite, body weight, fat percentage, and calorie intake. Moreover, a negative correlation was observed with Restraint Eating Visual Analogue Rating Scale and muscle percentage. Conclusion: This study aimed at presenting preliminary support for the reliability and validity of the Persian version of TFEQ-R18 and its psychometric characteristics.
This instrument may be helpful in clinical practice and research studies of obesity, appetite, and eating behavior. PMID:28659982
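Several records in this listing report internal consistency as Cronbach's alpha. As a minimal sketch of how that coefficient is computed from a respondents-by-items score matrix, the standard formula alpha = k/(k-1) × (1 − Σ item variances / variance of total scores) can be written as follows. The function name `cronbach_alpha` and the toy scores are illustrative assumptions and are not data from any study listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Illustrative helper: uses sample variances (n - 1 denominator) for both
    the individual items and the summed total score.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 respondents and 2 items per respondent")
    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*scores))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical two-item scale answered by three respondents.
toy_scores = [[2, 3], [4, 4], [6, 5]]
alpha = cronbach_alpha(toy_scores)
```

Alpha rises when items covary strongly relative to their individual variances, which is why dropping weakly related items (as with Items 16 and 18 above) can increase it.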
Development of a patient safety climate survey for Chinese hospitals: cross-national adaptation and psychometric evaluation.

PubMed

Zhu, Junya; Li, Liping; Zhao, Hailei; Han, Guangshu; Wu, Albert W; Weingart, Saul N

2014-10-01

Existing patient safety climate instruments, most of which have been developed in the USA, may not accurately reflect the conditions in the healthcare systems of other countries. To develop and evaluate a patient safety climate instrument for healthcare workers in Chinese hospitals. Based on a review of existing instruments, expert panel review, focus groups and cognitive interviews, we developed items relevant to patient safety climate in Chinese hospitals. The draft instrument was distributed to 1700 hospital workers from 54 units in six hospitals in five Chinese cities between July and October 2011, and 1464 completed surveys were received. We performed exploratory and confirmatory factor analyses and estimated internal consistency reliability, within-unit agreement, between-unit variation, unit-mean reliability, correlation between multi-item composites, and association between the composites and two single items of perceived safety. The final instrument included 34 items organised into nine composites: institutional commitment to safety, unit management support for safety, organisational learning, safety system, adequacy of safety arrangements, error reporting, communication and peer support, teamwork and staffing. All composites had acceptable unit-mean reliabilities (≥0.74) and within-unit agreement (Rwg ≥0.71), and exhibited significant between-unit variation with intraclass correlation coefficients ranging from 9% to 21%. Internal consistency reliabilities ranged from 0.59 to 0.88 and were ≥0.70 for eight of the nine composites. Correlations between composites ranged from 0.27 to 0.73. All composites were positively and significantly associated with the two perceived safety items. The Chinese Hospital Survey on Patient Safety Climate demonstrates adequate dimensionality, reliability and validity. The integration of qualitative and quantitative methods is essential to produce an instrument that is culturally appropriate for Chinese hospitals. 
Published by the BMJ Publishing Group Limited.
Software reliability through fault-avoidance and fault-tolerance

NASA Technical Reports Server (NTRS)

Vouk, Mladen A.; Mcallister, David F.

1993-01-01

Strategies and tools for the testing, risk assessment and risk control of dependable software-based systems were developed. Part of this project consists of studies to enable the transfer of technology to industry, for example the risk management techniques for safety-conscious systems. Theoretical investigations of Boolean and Relational Operator (BRO) testing strategy were conducted for condition-based testing. The Basic Graph Generation and Analysis tool (BGG) was extended to fully incorporate several variants of the BRO metric. Single- and multi-phase risk, coverage and time-based models are being developed to provide additional theoretical and empirical basis for estimation of the reliability and availability of large, highly dependable software. A model for software process and risk management was developed. The use of cause-effect graphing for software specification and validation was investigated. Lastly, advanced software fault-tolerance models were studied to provide alternatives and improvements in situations where simple software fault-tolerance strategies break down.
Psychometric properties of the Thai Spiritual Well-Being Scale.

PubMed

Chaiviboontham, Suchira; Phinitkhajorndech, Noppawan; Hanucharurnkul, Somchit; Noipiang, Thaniya

2016-04-01

The purpose of this study was to investigate the psychometric properties of the modified Thai Spiritual Well-Being Scale in patients with advanced cancer. A cross-sectional design was used. Some 196 participants from three tertiary hospitals in Bangkok and suburban Thailand were asked to complete a Personal Information Questionnaire (PIQ), the Memorial Symptom Assessment Scale (MSAS), and the Spiritual Well-Being Scale (SWBS). Validity was determined by known-group, concurrent, and construct validity. Reliability was estimated using internal consistency by Cronbach's α coefficients. Three factors were extracted (existential well-being, religious well-being, and peacefulness), which together accounted for 71.44% of the total variance. The Cronbach's α coefficients for total SWB, EWB, RWB, and peacefulness were 0.96, 0.94, and 0.93, respectively. These findings indicate that the Thai SWBS is a valid and reliable instrument, and it presented one more factor than the original version.
Measuring awareness of financial skills: reliability and validity of a new measure.

PubMed

Cramer, K; Tuokko, H A; Mateer, C A; Hultsch, D F

2004-03-01

This paper examines the psychometric properties of a three-part (participant, informant, and performance) Measure for assessing Awareness of Financial Skills (MAFS). The MAFS was administered to 10 seniors with dementia and 25 well-functioning seniors, and their informants. Measures of cognitive functioning, social desirability, neuroticism, and perceived control were administered to each participant to allow for an assessment of validity. Internal consistency estimates for the participant and informant questionnaires were found to be 0.92 and 0.97, respectively. Convergent validity analysis indicated that performance on this measure was related to level of cognitive functioning, with higher level of unawareness associated with decreased cognitive ability. Discriminant validity analysis showed that performance on this measure was not related to social desirability or neuroticism. This study provides evidence that the MAFS is a reliable and valid tool for assessing awareness of financial skills in older adults.
On the reliability of self-reported health: evidence from Albanian data.

PubMed

Vaillant, Nicolas; Wolff, François-Charles

2012-06-01

This paper investigates the reliability of self-assessed measures of health using panel data collected in Albania by the World Bank in 2002, 2003 and 2004 through the Living Standard Measurement Study project. As the survey includes questions on a self-assessed measure of health and on more objective health problems, both types of information are combined with a view to understanding how respondents change their answers to the self-reported measures over time. Estimates from random effects ordered Probit models show that differences in self-reported subjective health between individuals are much more marked than those over time, suggesting a strong state dependence in subjective health status. The empirical analysis also reveals respondent consistency, from both a subjective and an objective viewpoint. Self-reported health is much more influenced by permanent shocks than by more transitory illness or injury. Copyright © 2012 Ministry of Health, Saudi Arabia. Published by Elsevier Ltd. All rights reserved.
Measuring self-concept among African-Americans: validating the factor structure of the self-perception profile for adolescents.

PubMed

Powell-Young, Yolanda M; Spruill, Ida J

2011-12-01

The purpose of this investigation was to examine the reliability and factor structure of the Harter Self-Perception Profile for Adolescents (SPPA) with African-Americans. While the SPPA has demonstrated strong psychometric properties with European-Americans, limited information exists with African-Americans. Three hundred and ten (N = 310) female adolescents, from 14 through 18 years of age, completed the SPPA. Estimations of internal consistency reliability with Cronbach's alpha (alpha), item suitability with Pearson (gamma) correlations, and evaluation of factor structure fit utilizing principal axis extraction with oblimin (oblique) rotation were conducted. When compared with Harter's normative data, psychometric properties of the SPPA varied significantly with the current sample. Findings suggested cautious interpretation of data generated with demographically similar cohorts. Further study is warranted to ascertain the factor structure that is most relevant for use with African-American adolescents.
Do hand-held calorimeters provide reliable and accurate estimates of resting metabolic rate?

PubMed

Van Loan, Marta D

2007-12-01

This paper provides an overview of a new technique for indirect calorimetry and the assessment of resting metabolic rate. Information from the research literature includes findings on the reliability and validity of a new hand-held indirect calorimeter as well as use in clinical and field settings. Research findings to date are mixed. The MedGem instrument has provided more consistent results when compared to the Douglas bag method of measuring metabolic rate. The BodyGem instrument has been shown to be less accurate when compared to standard metabolic carts. Furthermore, when the BodyGem has been used with clinical patients or with undernourished individuals, the results have not been acceptable. Overall, there is not a large enough body of evidence to definitively support the use of these hand-held devices for assessment of metabolic rate in a wide variety of clinical or research environments.
Practical Issues in Implementing Software Reliability Measurement

NASA Technical Reports Server (NTRS)

Nikora, Allen P.; Schneidewind, Norman F.; Everett, William W.; Munson, John C.; Vouk, Mladen A.; Musa, John D.

1999-01-01

Many ways of estimating software systems' reliability, or reliability-related quantities, have been developed over the past several years. Of particular interest are methods that can be used to estimate a software system's fault content prior to test, or to discriminate between components that are fault-prone and those that are not. The results of these methods can be used to: 1) More accurately focus scarce fault identification resources on those portions of a software system most in need of it. 2) Estimate and forecast the risk of exposure to residual faults in a software system during operation, and develop risk and safety criteria to guide the release of a software system to fielded use. 3) Estimate the efficiency of test suites in detecting residual faults. 4) Estimate the stability of the software maintenance process.
Texture and haptic cues in slant discrimination: reliability-based cue weighting without statistically optimal cue combination

NASA Astrophysics Data System (ADS)

Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.

2005-05-01

A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination - we find reliability-based reweighting but not statistically optimal cue combination.
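The statistically optimal benchmark described in the abstract above can be sketched as an inverse-variance weighted average: each cue's weight is proportional to its reliability (1/variance), and the combined estimate has a variance no larger than that of the best single cue. The function name `combine_cues` and the example slant values are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the study.

```python
def combine_cues(estimates, variances):
    """Minimum-variance unbiased combination of independent cue estimates.

    Illustrative helper: returns the combined estimate, the normalized
    weight given to each cue, and the variance of the combined estimate.
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Reliability of a cue is the inverse of its variance.
    reliabilities = [1.0 / v for v in variances]
    total_reliability = sum(reliabilities)
    weights = [r / total_reliability for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_variance = 1.0 / total_reliability
    return combined, weights, combined_variance


# Hypothetical slant estimates in degrees: texture (variance 4), haptic (variance 12).
slant, cue_weights, slant_variance = combine_cues([30.0, 40.0], [4.0, 12.0])
```

In this toy case the more reliable texture cue receives a weight of 0.75, and the combined variance (3.0) is lower than that of either cue alone. The study's finding is that observed weights tracked reliability but did not reach this optimal benchmark.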
Crystal field effect in light actinide dioxides and oxychalcogenides-a unified phenomenological description

NASA Astrophysics Data System (ADS)

Gajek, Z.

2004-05-01

The electronic properties of the actinide ions in the series of semi-conducting, antiferromagnetic compounds: dioxides, AnO2 and oxychalcogenides, AnOY, where An=U, Np and Y=S, Se, are re-examined from the point of view of the consistency of the crystal field (CF) model. The discussion is based on the supposition that the effective metal-ligand interaction solely determines the net CF effect in non-metallic compounds. The main question we address here is whether a reliable, consistent description of the CF effect in terms of the intrinsic parameters can be achieved for this particular family of compounds. Encouraging calculations reported previously for the AnO2 and UOY series serve as reference data in the present estimation of electronic structure parameters for neptunium oxychalcogenides.
Judging the Probability of Hypotheses Versus the Impact of Evidence: Which Form of Inductive Inference Is More Accurate and Time-Consistent?

PubMed

Tentori, Katya; Chater, Nick; Crupi, Vincenzo

2016-04-01

Inductive reasoning requires exploiting links between evidence and hypotheses. This can be done focusing either on the posterior probability of the hypothesis when updated on the new evidence or on the impact of the new evidence on the credibility of the hypothesis. But are these two cognitive representations equally reliable? This study investigates this question by comparing probability and impact judgments on the same experimental materials. The results indicate that impact judgments are more consistent in time and more accurate than probability judgments. Impact judgments also predict the direction of errors in probability judgments. These findings suggest that human inductive reasoning relies more on estimating evidential impact than on posterior probability. Copyright © 2015 Cognitive Science Society, Inc.
20 CFR 220.14 - Weighing of evidence.

Code of Federal Regulations, 2010 CFR

2010-04-01

... capacity evaluation is based upon functional objective tests with high validity and reliability; (2) The... consists of objective findings of exams that have poor reliability or validity; (7) The evidence consists...
Internal and temporal reliability estimates for informant ratings of personality using the NEO PI-R and IAS. NEO Personality Inventory. Interpersonal Adjective Scales.

PubMed

Kurtz, J E; Lee, P A; Sherker, J L

1999-06-01

This study examines the internal consistency and temporal stability of informant ratings from two widely used instruments for normal personality assessment, the revised NEO Personality Inventory (NEO PI-R) and the Interpersonal Adjective Scales (IAS). Well-known adult targets were selected by 109 undergraduate students and rated on two occasions separated by a 6-month interval. With few exceptions, estimates of internal consistency are adequate to good for both instruments. NEO PI-R domain scores yield coefficient alphas ranging from .89 to .96, with a median of .80 for the 30 facet scales. IAS octant scales show coefficient alphas ranging from .83 to .92. Retest Pearson correlations are above .70 for each of the NEO PI-R domain scores and both IAS axis coordinates, and intraclass correlations are above .60 for all scales from both instruments. Score changes were small but statistically significant for three of the five NEO PI-R domains at retest. The retest stability of IAS type classifications varies as a function of the extremity of the associated octant scores.
Depth and thermal sensor fusion to enhance 3D thermographic reconstruction.

PubMed

Cao, Yanpeng; Xu, Baobei; Ye, Zhangyu; Yang, Jiangxin; Cao, Yanlong; Tisse, Christel-Loic; Li, Xin

2018-04-02

Three-dimensional geometrical models with incorporated surface temperature data provide important information for various applications such as medical imaging, energy auditing, and intelligent robots. In this paper we present a robust method for mobile and real-time 3D thermographic reconstruction through depth and thermal sensor fusion. A multimodal imaging device consisting of a thermal camera and an RGB-D sensor is calibrated geometrically and used for data capturing. Based on the underlying principle that temperature information remains robust against illumination and viewpoint changes, we present a Thermal-guided Iterative Closest Point (T-ICP) methodology to facilitate reliable 3D thermal scanning applications. The pose of the sensing device is initially estimated using correspondences found through maximizing the thermal consistency between consecutive infrared images. The coarse pose estimate is further refined by finding the motion parameters that minimize a combined geometric and thermographic loss function. Experimental results demonstrate that complementary information captured by multimodal sensors can be utilized to improve performance of 3D thermographic reconstruction. Through effective fusion of thermal and depth data, the proposed approach generates more accurate 3D thermal models using significantly less scanning data.
Psychometric properties of the Plutchik's Violence Risk Scale on adolescent sample of Spanish-speaking population.

PubMed

Alcázar-Córcoles, Miguel Á; Verdejo-García, Antonio; Bouso-Sáiz, José C

2016-01-01

The objective of the present study was the validation and scaling of the Plutchik's Violence Risk Scale (EV) in the adolescent Spanish-speaking population. For this purpose, a sample of adolescents from El Salvador, Mexico and Spain was obtained. The sample consisted of 1035 participants with a mean age of 16.2. There were 450 adolescents from the forensic population (those who had committed a crime) and 585 adolescents from the normal population (no crime committed). The internal consistency of the EV was estimated by Cronbach's alpha coefficient, with a value of 0.782. As for validity, the factorial structures found explain a large proportion of the variance (53.385%); the convergent validity was estimated by the correlation between the dimensions found, the EV and sociodemographic, criminological and personality variables. The developed scales are presented, for the first time in a cross-cultural sample, differentiating between gender and continent. Consequently, the obtained results suggest that the EV is a valid and reliable instrument within the adolescent Spanish-speaking population. Furthermore, it is a quick scale that is easy to apply, which is valuable in forensic assessment.
[The reliability of a questionnaire regarding Colombian children's physical activity].

PubMed

Herazo-Beltrán, Aliz Y; Domínguez-Anaya, Regina

2012-10-01

Reporting the Physical Activity Questionnaire for school children's (PAQ-C) test-retest reliability and internal consistency. This was a descriptive study of 100 school-aged children aged 9 to 11 years old attending a school in Cartagena, Colombia. The sample was randomly selected. The PAQ-C was given twice, one week apart, after the informed consent forms had been signed by the children's parents and school officials. Cronbach's alpha coefficient of reliability was used for assessing internal consistency and an intra-class correlation coefficient for test-retest reliability. SPSS (version 17.0) was used for statistical analysis. The questionnaire's internal consistency was 0.73 at the first measurement and 0.78 at the second; the intra-class correlation coefficient was 0.60. There were differences between boys and girls regarding both measurements. The PAQ-C had acceptable internal consistency and test-retest reliability, thereby making it useful for measuring children's self-reported physical activity and a valuable tool for population studies in Colombia.
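The test-retest statistic above can be illustrated with a minimal sketch of a one-way random-effects intraclass correlation, often written ICC(1,1). The abstract does not state which ICC form was computed in SPSS, so the one-way form, the function name `icc_oneway`, and the toy scores are assumptions made for illustration.

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for n subjects each measured k times.

    Illustrative helper: ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB
    and MSW are the between-subject and within-subject mean squares.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects, each with the same k >= 2 scores")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    # Between-subject mean square, with n - 1 degrees of freedom.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square, with n * (k - 1) degrees of freedom.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical questionnaire totals for three children on two occasions.
retest_scores = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
icc = icc_oneway(retest_scores)
```

Unlike a Pearson correlation, this coefficient is lowered by a systematic shift between occasions, because within-subject differences count as error. Two-way ICC forms treat such shifts differently and would give other values for the same data.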
The Validity and Reliability of the Mobbing Scale (MS)

ERIC Educational Resources Information Center

Yaman, Erkan

2009-01-01

The aim of this research is to develop the Mobbing Scale and examine its validity and reliability. The sample of the study consisted of 515 persons from Sakarya and Bursa. In this study, construct validity, internal consistency, test-retest reliability, and item analysis of the scale were examined. As a result of factor analysis for construct…
Testing inter-observer reliability of the Transition Analysis aging method on the William M. Bass forensic skeletal collection.

PubMed

Fojas, Christina L; Kim, Jieun; Minsky-Rowland, Jocelyn D; Algee-Hewitt, Bridget F B

2018-01-01

Skeletal age estimation is an integral part of the biological profile. Recent work shows how multiple-trait approaches better capture senescence as it occurs at different rates among individuals. Furthermore, a Bayesian statistical framework of analysis provides more useful age estimates. The component-scoring method of Transition Analysis (TA) may resolve many of the functional and statistical limitations of traditional phase-aging methods and is applicable to both paleodemography and forensic casework. The present study contributes to TA-research by validating TA for multiple, differently experienced observers using a collection of modern forensic skeletal cases. Five researchers independently applied TA to a random sample of 58 documented individuals from the William M. Bass Forensic Skeletal Collection, for whom knowledge of chronological age was withheld. Resulting scores were input into the ADBOU software and maximum likelihood estimates (MLEs) and 95% confidence intervals (CIs) were produced using the forensic prior. Krippendorff's alpha was used to evaluate interrater reliability and agreement. Inaccuracy and bias were measured to gauge the magnitude and direction of difference between estimated ages and chronological ages among the five observers. The majority of traits had moderate to excellent agreement among observers (≥0.6). The superior surface morphology had the least congruence (0.4), while the ventral symphyseal margin had the most (0.9) among scores. Inaccuracy was the lowest for individuals younger than 30 and the greatest for individuals over 60. Consistent over-estimation of individuals younger than 30 and under-estimation of individuals over 40 years old occurred. Individuals in their 30s showed a mixed pattern of under- and over-estimation among observers. These results support the use of the TA method by researchers of varying experience levels. Further, they validate its use on forensic cases, given the low error overall. 
© 2017 Wiley Periodicals, Inc.

Estimation of motion fields by non-linear registration for local lung motion analysis in 4D CT image data.

PubMed

Werner, René; Ehrhardt, Jan; Schmidt-Richberg, Alexander; Heiss, Anabell; Handels, Heinz

2010-11-01

Motivated by radiotherapy of lung cancer, non-linear registration is applied to estimate 3D motion fields for local lung motion analysis in thoracic 4D CT images. Reliability of analysis results depends on the registration accuracy. Therefore, our study consists of two parts: optimization and evaluation of a non-linear registration scheme for motion field estimation, followed by a registration-based analysis of lung motion patterns. The study is based on 4D CT data of 17 patients. Different distance measures and force terms for thoracic CT registration are implemented and compared: sum of squared differences versus a force term related to Thirion's demons registration; masked versus unmasked force computation. The most accurate approach is applied to local lung motion analysis. Masked Thirion forces outperform the other force terms. The mean target registration error is 1.3 ± 0.2 mm, which is on the order of the voxel size. Based on resulting motion fields and inter-patient normalization of inner lung coordinates and breathing depths, a non-linear dependency between inner lung position and corresponding strength of motion is identified. The dependency is observed for all patients without or with only small tumors. Quantitative evaluation of the estimated motion fields indicates high spatial registration accuracy. It allows for reliable registration-based local lung motion analysis. The large amount of information encoded in the motion fields makes it possible to draw detailed conclusions, e.g., to identify the dependency of inner lung localization and motion. Our examinations illustrate the potential of registration-based motion analysis.
Predictive ability of genomic selection models for breeding value estimation on growth traits of Pacific white shrimp Litopenaeus vannamei

NASA Astrophysics Data System (ADS)

Wang, Quanchao; Yu, Yang; Li, Fuhua; Zhang, Xiaojun; Xiang, Jianhai

2017-09-01

Genomic selection (GS) can be used to accelerate genetic improvement by shortening the selection interval. The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value (GEBV). This study is a first attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits. The performance of GS models in L. vannamei was evaluated in a population consisting of 205 individuals, which were genotyped for 6 359 single nucleotide polymorphism (SNP) markers by specific length amplified fragment sequencing (SLAF-seq) and phenotyped for body length and body weight. Three GS models (RR-BLUP, BayesA, and Bayesian LASSO) were used to obtain the GEBV, and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes. The mean reliability of the GEBVs for body length and body weight predicted by the different models was 0.296 and 0.411, respectively. For each trait, the performances of the three models were very similar to each other with respect to predictability. The regression coefficients estimated by the three models were close to one, suggesting near to zero bias for the predictions. Therefore, when GS was applied in a L. vannamei population for the studied scenarios, all three models appeared practicable. Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.
Cross-cultural adaptation and validation of the Korean version of the Roland-Morris Disability Questionnaire for use in low back pain.

PubMed

Kim, Kyoung-Eun; Lim, Jae-Young

2011-01-01

The Roland-Morris Disability Questionnaire (RMDQ) is a reliable tool for evaluating disability in patients with back pain, but no Korean version has been published and validated. We developed a cross-culturally adapted Korean version of the RMDQ (RMDQ-K) and validated its use for assessing disability in Korean patients with low back pain. Two hundred thirty-one patients with low back pain were assessed using the RMDQ-K, visual analog scale (VAS) during rest and activity, and the Oswestry Disability Index (ODI). The results of 40 patients were used to evaluate the test-retest reliability. The correlations of the RMDQ-K with the VAS and ODI were used to assess validity. The reliability of the RMDQ-K estimated using the internal consistency reached a Cronbach's alpha of 0.893. Test-retest trials showed a high intraclass correlation coefficient of 0.837 (95% CI 0.833-0.953). The RMDQ-K was significantly correlated with the ODI (r=0.738) and VAS during rest (r=0.450) and activity (r=0.412). This study demonstrates that the RMDQ-K is a reliable, valid instrument for measuring disability in Korean patients with low back pain.
Estimation of pelvis kinematics in level walking based on a single inertial sensor positioned close to the sacrum: validation on healthy subjects with stereophotogrammetric system.

PubMed

Buganè, Francesca; Benedetti, Maria Grazia; D'Angeli, Valentina; Leardini, Alberto

2014-10-21

Kinematics measures from inertial sensors have value in the clinical assessment of pathological gait, for tracking quantitatively the outcome of interventions and rehabilitation programs. For these sensors to become a standard tool for clinicians, it is necessary to evaluate their capability to provide reliable and comprehensible information, possibly by comparing this with that provided by traditional gait analysis. The aim of this study was to assess by state-of-the-art gait analysis the reliability of a single inertial device attached to the sacrum to measure pelvis kinematics during level walking. The output signals of the three-axis gyroscope were processed to estimate the spatial orientation of the pelvis in the sagittal (tilt angle), frontal (obliquity) and transverse (rotation) anatomical planes. These estimated angles were compared with those provided by an eight-TV-camera stereophotogrammetric system utilizing a standard experimental protocol, with four markers on the pelvis. This was observed in a group of sixteen healthy subjects while performing three repetitions of level walking along a 10 meter walkway at slow, normal and fast speeds. The determination coefficient, the scale factor and the bias of a linear regression model were calculated to represent the differences between the angular patterns from the two measurement systems. For the intra-subject variability, one volunteer was asked to repeat walking at normal speed 10 times. A good match was observed for obliquity and rotation angles. For the tilt angle, the pattern and range of motion were similar, but a bias was observed, due to the different initial inclination angle in the sagittal plane of the inertial sensor with respect to the pelvis anatomical frame. Good intra-subject consistency was also shown by the small variability of the pelvic angles as estimated by the new system, confirmed by very small values of standard deviation for all three angles.
These results suggest that this inertial device is a reliable alternative to stereophotogrammetric systems for pelvis kinematics measurements, in addition to being easier to use and cheaper. The device can provide to the patient and to the examiner reliable feedback in real-time during routine clinical tests.
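The abstract above summarizes agreement between two measurement systems through the determination coefficient, scale factor and bias of a linear regression. The following is a minimal sketch of that kind of comparison, assuming an ordinary least-squares fit of the sensor angle on the reference angle, with the slope read as the scale factor and the intercept as the bias. The function name `compare_systems` and the sample angles are hypothetical, and the authors' exact processing is not described in the abstract.

```python
from statistics import mean


def compare_systems(reference, estimate):
    """Least-squares fit estimate = scale * reference + bias.

    Returns (scale, bias, r_squared). Illustrative helper only.
    """
    mx, my = mean(reference), mean(estimate)
    sxx = sum((x - mx) ** 2 for x in reference)
    syy = sum((y - my) ** 2 for y in estimate)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, estimate))
    scale = sxy / sxx                 # slope of the regression line
    bias = my - scale * mx            # intercept: constant offset
    r_squared = sxy ** 2 / (sxx * syy)  # determination coefficient
    return scale, bias, r_squared


if __name__ == "__main__":
    # Made-up tilt angles in degrees: the sensor tracks the reference
    # pattern but carries a constant 2-degree offset.
    ref = [0.0, 1.0, 2.0, 3.0]
    sensor = [2.0, 3.0, 4.0, 5.0]
    print(compare_systems(ref, sensor))
```

In this toy case the pattern matches perfectly (scale and determination coefficient of 1) while the bias is non-zero, which mirrors the situation the abstract describes for the tilt angle.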
Delimiting Coefficient α from Internal Consistency and Unidimensionality

ERIC Educational Resources Information Center

Sijtsma, Klaas

2015-01-01

I discuss the contribution by Davenport, Davison, Liou, & Love (2015) in which they relate reliability represented by coefficient α to formal definitions of internal consistency and unidimensionality, both proposed by Cronbach (1951). I argue that coefficient α is a lower bound to reliability and that concepts of internal consistency and…
Bayesian methods in reliability

NASA Astrophysics Data System (ADS)

Sander, P.; Badoux, R.

1991-11-01

The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Estimating the Term Structure With a Semiparametric Bayesian Hierarchical Model: An Application to Corporate Bonds.

PubMed

Cruz-Marcelo, Alejandro; Ensor, Katherine B; Rosner, Gary L

2011-06-01

The term structure of interest rates is used to price defaultable bonds and credit derivatives, as well as to infer the quality of bonds for risk management purposes. We introduce a model that jointly estimates term structures by means of a Bayesian hierarchical model with a prior probability model based on Dirichlet process mixtures. The modeling methodology borrows strength across term structures for purposes of estimation. The main advantage of our framework is its ability to produce reliable estimators at the company level even when there are only a few bonds per company. After describing the proposed model, we discuss an empirical application in which the term structure of 197 individual companies is estimated. The sample of 197 consists of 143 companies with only one or two bonds. In-sample and out-of-sample tests are used to quantify the improvement in accuracy that results from approximating the term structure of corporate bonds with estimators by company rather than by credit rating, the latter being a popular choice in the financial literature. A complete description of a Markov chain Monte Carlo (MCMC) scheme for the proposed model is available as Supplementary Material.
Estimating the Term Structure With a Semiparametric Bayesian Hierarchical Model: An Application to Corporate Bonds

PubMed Central

Cruz-Marcelo, Alejandro; Ensor, Katherine B.; Rosner, Gary L.

2011-01-01

The term structure of interest rates is used to price defaultable bonds and credit derivatives, as well as to infer the quality of bonds for risk management purposes. We introduce a model that jointly estimates term structures by means of a Bayesian hierarchical model with a prior probability model based on Dirichlet process mixtures. The modeling methodology borrows strength across term structures for purposes of estimation. The main advantage of our framework is its ability to produce reliable estimators at the company level even when there are only a few bonds per company. After describing the proposed model, we discuss an empirical application in which the term structure of 197 individual companies is estimated. The sample of 197 consists of 143 companies with only one or two bonds. In-sample and out-of-sample tests are used to quantify the improvement in accuracy that results from approximating the term structure of corporate bonds with estimators by company rather than by credit rating, the latter being a popular choice in the financial literature. A complete description of a Markov chain Monte Carlo (MCMC) scheme for the proposed model is available as Supplementary Material. PMID:21765566
Empirical estimation of present-day Antarctic glacial isostatic adjustment and ice mass change

NASA Astrophysics Data System (ADS)

Gunter, B. C.; Didova, O.; Riva, R. E. M.; Ligtenberg, S. R. M.; Lenaerts, J. T. M.; King, M. A.; van den Broeke, M. R.; Urban, T.

2014-04-01

This study explores an approach that simultaneously estimates Antarctic mass balance and glacial isostatic adjustment (GIA) through the combination of satellite gravity and altimetry data sets. The results improve upon previous efforts by incorporating a firn densification model to account for firn compaction and surface processes as well as reprocessed data sets over a slightly longer period of time. A range of different Gravity Recovery and Climate Experiment (GRACE) gravity models were evaluated and a new Ice, Cloud, and Land Elevation Satellite (ICESat) surface height trend map computed using an overlapping footprint approach. When the GIA models created from the combination approach were compared to in situ GPS ground station displacements, the vertical rates estimated showed consistently better agreement than recent conventional GIA models. The new empirically derived GIA rates suggest the presence of strong uplift in the Amundsen Sea sector in West Antarctica (WA) and the Philippi/Denman sectors, as well as subsidence in large parts of East Antarctica (EA). The total GIA-related mass change estimates for the entire Antarctic ice sheet ranged from 53 to 103 Gt yr⁻¹, depending on the GRACE solution used, with an estimated uncertainty of ±40 Gt yr⁻¹. Over the time frame February 2003-October 2009, the corresponding ice mass change showed an average value of -100 ± 44 Gt yr⁻¹ (EA: 5 ± 38, WA: -105 ± 22), consistent with other recent estimates in the literature, with regional mass loss mostly concentrated in WA. The refined approach presented in this study shows the contribution that such data combinations can make towards improving estimates of present-day GIA and ice mass change, particularly with respect to determining more reliable uncertainties.
The Thai version of the PSS-10: An Investigation of its psychometric properties.

PubMed

Wongpakaran, Nahathai; Wongpakaran, Tinakon

2010-06-12

Among the stress instruments that measure the degree to which life events are perceived as stressful, the Perceived Stress Scale (PSS) is widely used. The goal of this study was to examine the psychometric properties of a Thai version of the PSS-10 (T-PSS-10) with a clinical and non-clinical sample. Internal consistency, test-retest reliability, concurrent validity, and the factorial structure of the scale were tested. A total sample of 479 adult participants was recruited for the study: 368 medical students and 111 patients from two hospitals in Northern Thailand. The T-PSS-10 was used along with the Thai version of State Trait Anxiety Inventory (STAI), the Thai Version of the Rosenberg Self-Esteem Scale (RSES), and the Thai Depression Inventory (TDI). Exploratory Factor Analysis (EFA) yielded 2 factors with eigenvalues of 5.05 and 1.60, accounting for 66 percent of variance. Factor 1 consisted of 6 items representing "stress", whereas Factor 2 consisted of 4 items representing "control". The item loadings ranged from 0.547 to 0.881. Investigation of the fit indices associated with Maximum Likelihood (ML) estimation revealed that the two-factor solution was adequate [χ² = 35.035 (df = 26, N = 368, p < 0.111)]; Goodness-of-Fit Index (GFI) = 0.981; Root Mean Square Residual (RMR) = 0.022; Standardized Root Mean Square Residual (SRMR) = 0.037; Comparative Fit Index (CFI) = 0.989; Normed Fit Index (NFI) = 0.96; Non-Normed Fit Index (NNFI) = 0.981; Root Mean Square Error of Approximation (RMSEA) = 0.031. It was found that the T-PSS-10 had a significant positive correlation with the STAI (r = 0.60, p < 0.0001), and the TDI (r = 0.55, p < 0.0001); and was significantly negatively correlated with the RSES (r = -0.46, p < 0.0001, N = 368). The overall Cronbach's alpha was 0.85. The ICC for test-retest reliability at 4 weeks was 0.82 (95% CI, 0.72 and 0.88).
The Thai version of the PSS-10 demonstrated excellent goodness-of-fit for the two factor solution model, as well as good reliability and validity for estimating the level of stress perception with a Thai population. Limitations of the study are discussed.
Methods and Costs to Achieve Ultra Reliable Life Support

NASA Technical Reports Server (NTRS)

Jones, Harry W.

2012-01-01

A published Mars mission is used to explore the methods and costs to achieve ultra reliable life support. The Mars mission and its recycling life support design are described. The life support systems were made triply redundant, implying that each individual system will have fairly good reliability. Ultra reliable life support is needed for Mars and other long, distant missions. Current systems apparently have insufficient reliability. The life cycle cost of the Mars life support system is estimated. Reliability can be increased by improving the intrinsic system reliability, adding spare parts, or by providing technically diverse redundant systems. The costs of these approaches are estimated. Adding spares is least costly but may be defeated by common cause failures. Using two technically diverse systems is effective but doubles the life cycle cost. Achieving ultra reliability is worth its high cost because the penalty for failure is very high.
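The trade-off noted above, that redundancy raises reliability but can be defeated by common cause failures, can be made concrete with a small calculation. The following is a rough sketch under stated assumptions: units are identical, at least one of n must work, and common cause failures are treated with a simplified beta-factor split in which a fraction `beta` of each unit's failure probability is shared by all units. The function name and all numbers are illustrative and do not come from the report.

```python
def redundant_reliability(unit_reliability, n, beta=0.0):
    """Probability that at least one of n identical redundant units survives.

    Simplified beta-factor assumption: a fraction `beta` of the unit failure
    probability is a common cause that fails every unit at once; the rest
    fails units independently.
    """
    q = 1.0 - unit_reliability
    q_common = beta * q               # fails all n units together
    q_independent = (1.0 - beta) * q  # fails one unit at a time
    # The system survives if no common cause event occurs and not all
    # n units fail independently.
    return (1.0 - q_common) * (1.0 - q_independent ** n)


if __name__ == "__main__":
    # Made-up unit reliability of 0.9, triply redundant.
    print(redundant_reliability(0.9, 3))            # independent failures only
    print(redundant_reliability(0.9, 3, beta=0.1))  # 10% common cause share
```

Under these assumptions, even a small common cause share caps the benefit of adding identical spares, which is the motivation the abstract gives for technically diverse redundant systems.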
Reliability of digital reactor protection system based on extenics.

PubMed

Zhao, Jing; He, Ya-Nan; Gu, Peng-Fei; Chen, Wei-Hua; Gao, Feng

2016-01-01

After the Fukushima nuclear accident, the safety of nuclear power plants (NPPs) has become a widespread concern. The reliability of the reactor protection system (RPS) is directly related to the safety of NPPs; however, it is difficult to accurately evaluate the reliability of a digital RPS. Methods based on probability estimation carry some uncertainty and can neither reflect the reliability status of the RPS dynamically nor support maintenance and troubleshooting. In this paper, a quantitative reliability analysis method based on extenics is proposed for the safety-critical digital RPS, by which the relationship between the reliability and response time of the RPS is constructed. The reliability of the RPS for a CPR1000 NPP is modeled and analyzed by the proposed method as an example. The results show that the proposed method is able to estimate the RPS reliability effectively and to support maintenance and troubleshooting of the digital RPS.
Advancing methods for reliably assessing motivational interviewing fidelity using the Motivational Interviewing Skills Code

PubMed Central

Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W.; Imel, Zac E.; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C.

2014-01-01

The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. PMID:25242192
Advancing methods for reliably assessing motivational interviewing fidelity using the motivational interviewing skills code.

PubMed

Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W; Imel, Zac E; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C

2015-02-01

The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. Copyright © 2015 Elsevier Inc. All rights reserved.
Low-flow characteristics of streams in the lower Wisconsin River basin

USGS Publications Warehouse

Gebert, W.A.

1978-01-01

Low-flow characteristics estimated for the lower Wisconsin River basin have a high degree of reliability when compared with other basins in Wisconsin. Reliable estimates appear to be related to the relatively uniform geologic features in the basin.
Autonomous navigation system based on GPS and magnetometer data

NASA Technical Reports Server (NTRS)

Julie, Thienel K. (Inventor); Richard, Harman R. (Inventor); Bar-Itzhack, Itzhack Y. (Inventor)

2004-01-01

This invention is drawn to an autonomous navigation system using Global Positioning System (GPS) and magnetometers for low Earth orbit satellites. As a magnetometer is reliable and always provides information on spacecraft attitude, rate, and orbit, the magnetometer-GPS configuration solves the GPS initialization problem, decreasing the convergence time for the navigation estimate and improving the overall accuracy. Eventually the magnetometer-GPS configuration enables the system to avoid a costly and inherently less reliable gyro for rate estimation. Being autonomous, this invention would provide for black-box spacecraft navigation, producing attitude, orbit, and rate estimates without any ground input with high accuracy and reliability.
Service quality, satisfaction, and behavioral intention in home delivered meals program

PubMed Central

Joung, Hyun-Woo; Yuan, Jingxue Jessica; Huffman, Lynn

2011-01-01

This study was conducted to evaluate recipients' perception of service quality, satisfaction, and behavioral intention in home delivered meals program in the US. Out of 398 questionnaires, 265 (66.6%) were collected, and 209 questionnaires (52.5%) were used for the statistical analysis. A Confirmatory Factor Analysis (CFA) with a maximum likelihood was first conducted to estimate the measurement model by verifying the underlying structure of constructs. The level of internal consistency in each construct was acceptable, with Cronbach's alpha estimates ranging from 0.7 to 0.94. All of the composite reliabilities of the constructs were over the cutoff value of 0.50, ensuring adequate internal consistency of multiple items for each construct. As a second step, a Meals-On-Wheels (MOW) recipient perception model was estimated. The model's fit as indicated by these indexes was satisfactory and path coefficients were analyzed. Two paths between (1) volunteer issues and behavioral intention and (2) responsiveness and behavioral intention were not significant. The path for predicting a positive relationship between food quality and satisfaction was supported. The results show that having high food quality may create recipient satisfaction. The findings suggest that food quality and responsiveness are significant predictors of positive satisfaction. Moreover, satisfied recipients have positive behavioral intention toward MOW programs. PMID:21556231
Service quality, satisfaction, and behavioral intention in home delivered meals program.

PubMed

Joung, Hyun-Woo; Kim, Hak-Seon; Yuan, Jingxue Jessica; Huffman, Lynn

2011-04-01

This study was conducted to evaluate recipients' perception of service quality, satisfaction, and behavioral intention in home delivered meals program in the US. Out of 398 questionnaires, 265 (66.6%) were collected, and 209 questionnaires (52.5%) were used for the statistical analysis. A Confirmatory Factor Analysis (CFA) with a maximum likelihood was first conducted to estimate the measurement model by verifying the underlying structure of constructs. The level of internal consistency in each construct was acceptable, with Cronbach's alpha estimates ranging from 0.7 to 0.94. All of the composite reliabilities of the constructs were over the cutoff value of 0.50, ensuring adequate internal consistency of multiple items for each construct. As a second step, a Meals-On-Wheels (MOW) recipient perception model was estimated. The model's fit as indicated by these indexes was satisfactory and path coefficients were analyzed. Two paths between (1) volunteer issues and behavioral intention and (2) responsiveness and behavioral intention were not significant. The path for predicting a positive relationship between food quality and satisfaction was supported. The results show that having high food quality may create recipient satisfaction. The findings suggest that food quality and responsiveness are significant predictors of positive satisfaction. Moreover, satisfied recipients have positive behavioral intention toward MOW programs.
Methods for estimating comparable prevalence rates of food insecurity experienced by adults in 147 countries and areas

NASA Astrophysics Data System (ADS)

Nord, Mark; Cafiero, Carlo; Viviani, Sara

2016-11-01

Statistical methods based on item response theory are applied to experiential food insecurity survey data from 147 countries, areas, and territories to assess data quality and develop methods to estimate national prevalence rates of moderate and severe food insecurity at equal levels of severity across countries. Data were collected from nationally representative samples of 1,000 adults in each country. A Rasch-model-based scale was estimated for each country, and data were assessed for consistency with model assumptions. A global reference scale was calculated based on item parameters from all countries. Each country's scale was adjusted to the global standard, allowing for up to 3 of the 8 scale items to be considered unique in that country if their deviance from the global standard exceeded a set tolerance. With very few exceptions, data from all countries were sufficiently consistent with model assumptions to constitute reasonably reliable measures of food insecurity and were adjustable to the global standard with fair confidence. National prevalence rates of moderate-or-severe food insecurity assessed over a 12-month recall period ranged from 3 percent to 92 percent. The correlations of national prevalence rates with national income, health, and well-being indicators provide external validation of the food security measure.
Application of wavefield imaging to characterize scattering from artificial and impact damage in composite laminate panels

NASA Astrophysics Data System (ADS)

Williams, Westin B.; Michaels, Thomas E.; Michaels, Jennifer E.

2018-04-01

Composite materials used for aerospace applications are highly susceptible to impacts, which can result in barely visible delaminations. Reliable and fast detection of such damage is needed before structural failures occur. One approach is to use ultrasonic guided waves generated from a sparse array consisting of permanently mounted or embedded transducers for performing structural health monitoring. This array can detect introduction of damage after baseline subtraction, and also provide localization and characterization information via the minimum variance imaging algorithm. Imaging performance can vary considerably depending upon where damage is located with respect to the array; however, prior work has shown that knowledge of expected scattering can improve imaging consistency for artificial damage at various locations. In this study, anisotropic material attenuation and wave speed are estimated as a function of propagation angle using wavefield data recorded along radial lines at multiple angles with respect to an omnidirectional guided wave source. Additionally, full wavefield data are recorded before and after the introduction of artificial and impact damage so that wavefield baseline subtraction may be applied. 3-D filtering techniques are then used to reduce noise and isolate scattered waves. A model for estimating scattering of a circular defect is developed and scattering estimates for both artificial and impact damage are presented and compared.

More reliable estimates of divergence times in Pan using complete mtDNA sequences and accounting for population structure.

PubMed

Stone, Anne C; Battistuzzi, Fabia U; Kubatko, Laura S; Perry, George H; Trudeau, Evan; Lin, Hsiuman; Kumar, Sudhir

2010-10-27

Here, we report the sequencing and analysis of eight complete mitochondrial genomes of chimpanzees (Pan troglodytes) from each of the three established subspecies (P. t. troglodytes, P. t. schweinfurthii and P. t. verus) and the proposed fourth subspecies (P. t. ellioti). Our population genetic analyses are consistent with neutral patterns of evolution that have been shaped by demography. The high levels of mtDNA diversity in western chimpanzees are unlike those seen at nuclear loci, which may reflect a demographic history of greater female to male effective population sizes possibly owing to the characteristics of the founding population. By using relaxed-clock methods, we have inferred a timetree of chimpanzee species and subspecies. The absolute divergence times vary based on the methods and calibration used, but relative divergence times show extensive uniformity. Overall, mtDNA produces consistently older times than those known from nuclear markers, a discrepancy that is reduced significantly by explicitly accounting for chimpanzee population structures in time estimation. Assuming the human-chimpanzee split to be between 7 and 5 Ma, chimpanzee time estimates are 2.1-1.5, 1.1-0.76 and 0.25-0.18 Ma for the chimpanzee/bonobo, western/(eastern + central) and eastern/central chimpanzee divergences, respectively.
The Challenges of Credible Thermal Protection System Reliability Quantification

NASA Technical Reports Server (NTRS)

Green, Lawrence L.

2013-01-01

The paper discusses several of the challenges associated with developing a credible reliability estimate for a human-rated crew capsule thermal protection system. The process of developing such a credible estimate is subject to the quantification, modeling and propagation of numerous uncertainties within a probabilistic analysis. The development of specific investment recommendations, to improve the reliability prediction, among various potential testing and programmatic options is then accomplished through Bayesian analysis.
Reliability of semiautomated computational methods for estimating tibiofemoral contact stress in the Multicenter Osteoarthritis Study.

PubMed

Anderson, Donald D; Segal, Neil A; Kern, Andrew M; Nevitt, Michael C; Torner, James C; Lynch, John A

2012-01-01

Recent findings suggest that contact stress is a potent predictor of subsequent symptomatic osteoarthritis development in the knee. However, much larger numbers of knees (likely on the order of hundreds, if not thousands) need to be reliably analyzed to achieve the statistical power necessary to clarify this relationship. This study assessed the reliability of new semiautomated computational methods for estimating contact stress in knees from large population-based cohorts. Ten knees of subjects from the Multicenter Osteoarthritis Study were included. Bone surfaces were manually segmented from sequential 1.0 Tesla magnetic resonance imaging slices by three individuals on two nonconsecutive days. Four individuals then registered the resulting bone surfaces to corresponding bone edges on weight-bearing radiographs, using a semi-automated algorithm. Discrete element analysis methods were used to estimate contact stress distributions for each knee. Segmentation and registration reliabilities (day-to-day and interrater) for peak and mean medial and lateral tibiofemoral contact stress were assessed with Shrout-Fleiss intraclass correlation coefficients (ICCs). The segmentation and registration steps of the modeling approach were found to have excellent day-to-day (ICC 0.93-0.99) and good inter-rater reliability (0.84-0.97). This approach for estimating compartment-specific tibiofemoral contact stress appears to be sufficiently reliable for use in large population-based cohorts.
Impact of the time scale of model sensitivity response on coupled model parameter estimation

NASA Astrophysics Data System (ADS)

Liu, Chang; Zhang, Shaoqing; Li, Shan; Liu, Zhengyu

2017-11-01

That a model has sensitivity responses to parameter uncertainties is a key concept in implementing model parameter estimation using filtering theory and methodology. Depending on the nature of associated physics and characteristic variability of the fluid in a coupled system, the response time scales of a model to parameters can be different, from hourly to decadal. Unlike state estimation, where the update frequency is usually linked with observational frequency, the update frequency for parameter estimation must be associated with the time scale of the model sensitivity response to the parameter being estimated. Here, with a simple coupled model, the impact of model sensitivity response time scales on coupled model parameter estimation is studied. The model includes characteristic synoptic to decadal scales by coupling a long-term varying deep ocean with a slow-varying upper ocean forced by a chaotic atmosphere. Results show that, using the update frequency determined by the model sensitivity response time scale, both the reliability and quality of parameter estimation can be improved significantly, and thus the estimated parameters make the model more consistent with the observation. These simple model results provide a guideline for when real observations are used to optimize the parameters in a coupled general circulation model for improving climate analysis and prediction initialization.
An MCMC determination of the primordial helium abundance

NASA Astrophysics Data System (ADS)

Aver, Erik; Olive, Keith A.; Skillman, Evan D.

2012-04-01

Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these parameter determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and continue to better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, & Stasińska (2007). To improve the reliability of the determination, a high quality dataset is needed. In pursuit of this, a variety of cuts are explored. The efficacy of the He I λ4026 emission line as a constraint on the solutions is first examined, revealing the introduction of systematic bias through its absence. As a clear measure of the quality of the physical solution, a χ2 analysis proves instrumental in the selection of data compatible with the theoretical model. Nearly two-thirds of the observations fall outside a standard 95% confidence level cut, which highlights the care necessary in selecting systems and warrants further investigation into potential deficiencies of the model or data. In addition, the method also allows us to exclude systems for which parameter estimations are statistical outliers. As a result, the final selected dataset gains in reliability and exhibits improved consistency. 
Regression to zero metallicity yields Yp = 0.2534 ± 0.0083, in broad agreement with the WMAP result. The inclusion of more observations shows promise for further reducing the uncertainty, but more high quality spectra are required.
The panchromatic Hubble Andromeda Treasury. VI. The reliability of far-ultraviolet flux as a star formation tracer on subkiloparsec scales

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simones, Jacob E.; Skillman, Evan D.; Weisz, Daniel R.

We have used optical observations of resolved stars from the Panchromatic Hubble Andromeda Treasury to measure the recent (<500 Myr) star formation histories (SFHs) of 33 far-UV (FUV)-bright regions in M31. The region areas ranged from ∼10^4 to 10^6 pc^2, which allowed us to test the reliability of FUV flux as a tracer of recent star formation on subkiloparsec scales. The star formation rates (SFRs) derived from the extinction-corrected observed FUV fluxes were, on average, consistent with the 100 Myr mean SFRs of the SFHs to within the 1σ scatter. Overall, the scatter was larger than the uncertainties in the SFRs and particularly evident among the smallest regions. The scatter was consistent with an even combination of discrete sampling of the initial mass function and high variability in the SFHs. This result demonstrates the importance of satisfying both the full-IMF and the constant-SFR assumptions for obtaining precise SFR estimates from FUV flux. Assuming a robust FUV extinction correction, we estimate that a factor of 2.5 uncertainty can be expected in FUV-based SFRs for regions smaller than 10^5 pc^2 or a few hundred parsecs. We also examined ages and masses derived from UV flux under the common assumption that the regions are simple stellar populations (SSPs). The SFHs showed that most of the regions are not SSPs, and the age and mass estimates were correspondingly discrepant from the SFHs. For those regions with SSP-like SFHs, we found mean discrepancies of 10 Myr in age and a factor of 3-4 in mass. It was not possible to distinguish the SSP-like regions from the others based on integrated FUV flux.
The psychometric characteristics of the revised depression attitude questionnaire (R-DAQ) in Pakistani medical practitioners: a cross-sectional study of doctors in Lahore.

PubMed

Haddad, Mark; Waqas, Ahmed; Sukhera, Ahmed Bashir; Tarar, Asad Zaman

2017-07-27

Depression is a common mental health problem and a leading contributor to the global burden of disease. The attitudes and beliefs of the public and of health professionals influence social acceptance and affect the esteem and help-seeking of people experiencing mental health problems. The attitudes of clinicians are particularly relevant to their role in accurately recognising and providing appropriate support and management of depression. This study examines the characteristics of the revised depression attitude questionnaire (R-DAQ) with doctors working in healthcare settings in Lahore, Pakistan. A cross-sectional survey was conducted in 2015 using the revised depression attitude questionnaire (R-DAQ). A convenience sample of 700 medical practitioners based in six hospitals in Lahore was approached to participate in the survey. The R-DAQ structure was examined using Parallel Analysis from polychoric correlations. Unweighted least squares analysis (ULSA) was used for factor extraction. Model fit was estimated using goodness-of-fit indices and the root mean square of standardized residuals (RMSR), and internal consistency reliability for the overall scale and subscales was assessed using reliability estimates based on Mislevy and Bock (BILOG 3 Item analysis and test scoring with binary logistic models. Mooresville: Scientific Software, 55) and the McDonald's Omega statistic. Findings using this approach were compared with principal axis factor analysis based on Pearson correlation matrix. 601 (86%) of the doctors approached consented to participate in the study. Exploratory factor analysis of R-DAQ scale responses demonstrated the same 3-factor structure as in the UK development study, though analyses indicated removal of 7 of the 22 items because of weak loading or poor model fit. The 3 factor solution accounted for 49.8% of the common variance.
Scale reliability and internal consistency were adequate: total scale standardised alpha was 0.694; subscale reliability for professional confidence was 0.732, therapeutic optimism/pessimism was 0.638, and generalist perspective was 0.769. The R-DAQ was developed with a predominantly UK-based sample of health professionals. This study indicates that this scale functions adequately and provides a valid measure of depression attitudes for medical practitioners in Pakistan, with the same factor structure as in the scale development sample. However, optimal scale function necessitated removal of several items, with a 15-item scale enabling the most parsimonious factor solution for this population.
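As a rough illustration of the standardised alpha reported in this record, the sketch below computes it from the mean inter-item correlation, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r the mean pairwise correlation. It assumes Pearson correlations for simplicity (the study itself worked from polychoric correlations), and the function names and toy scores are hypothetical, not taken from the study.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def standardized_alpha(items):
    """Standardised alpha from k items.

    items: list of k lists, each holding one item's scores across
    the same respondents (hypothetical layout chosen for this sketch).
    """
    k = len(items)
    # Mean of all pairwise inter-item correlations.
    r_bar = mean(pearson(a, b) for a, b in combinations(items, 2))
    return k * r_bar / (1 + (k - 1) * r_bar)


if __name__ == "__main__":
    # Toy data: 3 items answered by 5 respondents (made up).
    toy_items = [
        [1, 2, 3, 4, 5],
        [2, 2, 3, 5, 4],
        [1, 3, 3, 4, 4],
    ]
    print(round(standardized_alpha(toy_items), 3))
```

Because the formula depends only on k and the mean correlation, shortening a scale (here from 22 to 15 items) can lower alpha even when the retained items correlate about as strongly as before.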
Psychometric instrumentation: reliability and validity of instruments used for clinical practice, evidence-based practice projects and research studies.

PubMed

Mayo, Ann M

2015-01-01

It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
Experimental evidences of a large extrinsic spin Hall effect in AuW alloy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laczkowski, P.; Rojas-Sánchez, J.-C.; INAC/SP2M, CEA-Université Joseph Fourier, F-38054 Grenoble

2014-04-07

We report an experimental study of a gold-tungsten alloy (7 at. % W concentration in Au host) displaying remarkable properties for spintronics applications using both magneto-transport in lateral spin valve devices and spin-pumping with inverse spin Hall effect experiments. A very large spin Hall angle of about 10% is consistently found using both techniques with the reliable spin diffusion length of 2 nm estimated by the spin sink experiments in the lateral spin valves. With its chemical stability, high resistivity, and small induced damping, this AuW alloy may find applications in the near future.
Federal investment in health information technology: how to motivate it?

PubMed

Bower, Anthony G

2005-01-01

Health care market failures include inefficient standard making, problems with coordination among local providers to optimize care, and inability to measure quality accurately, inexpensively, or reliably. Study of other industries suggests policy directions for health information technology and the magnitude of gains from improving market functioning, which are very large. A perspective drawn from U.S. industrial history--in particular railroads and the interstate highway system--suggests an investment level roughly consistent with recent estimates drawn from the medical literature. The benefits of quick action probably outweigh the benefits of delaying and choosing the perfect funding mechanism.
Sobol Sensitivity Analysis: A Tool to Guide the Development and Evaluation of Systems Pharmacology Models

PubMed Central

Trame, MN; Lesko, LJ

2015-01-01

A systems pharmacology model typically integrates pharmacokinetic, biochemical network, and systems biology concepts into a unifying approach. It typically consists of a large number of parameters and reaction species that are interlinked based upon the underlying (patho)physiology and the mechanism of drug action. The more complex these models are, the greater the challenge of reliably identifying and estimating respective model parameters. Global sensitivity analysis provides an innovative tool that can meet this challenge. CPT Pharmacometrics Syst. Pharmacol. (2015) 4, 69–79; doi:10.1002/psp4.6; published online 25 February 2015 PMID:27548289
User's guide to the Reliability Estimation System Testbed (REST)

NASA Technical Reports Server (NTRS)

Nicol, David M.; Palumbo, Daniel L.; Rifkin, Adam

1992-01-01

The Reliability Estimation System Testbed is an X-window based reliability modeling tool that was created to explore the use of the Reliability Modeling Language (RML). RML was defined to support several reliability analysis techniques including modularization, graphical representation, Failure Mode Effects Simulation (FMES), and parallel processing. These techniques are most useful in modeling large systems. Using modularization, an analyst can create reliability models for individual system components. The modules can be tested separately and then combined to compute the total system reliability. Because a one-to-one relationship can be established between system components and the reliability modules, a graphical user interface may be used to describe the system model. RML was designed to permit message passing between modules. This feature enables reliability modeling based on a run time simulation of the system wide effects of a component's failure modes. The use of failure modes effects simulation enhances the analyst's ability to correctly express system behavior when using the modularization approach to reliability modeling. To alleviate the computation bottleneck often found in large reliability models, REST was designed to take advantage of parallel processing on hypercube processors.
Assessing the Reliability of Regional Depth-Duration-Frequency Equations for Gauged and Ungauged Sites

NASA Astrophysics Data System (ADS)

Castellarin, A.; Montanari, A.; Brath, A.

2002-12-01

The study derives Regional Depth-Duration-Frequency (RDDF) equations for a wide region of northern-central Italy (37,200 km²) by following an adaptation of the approach originally proposed by Alila [WRR, 36(7), 2000]. The proposed RDDF equations have a rather simple structure and allow an estimation of the design storm, defined as the rainfall depth expected for a given storm duration and recurrence interval, in any location of the study area for storm durations from 1 to 24 hours and for recurrence intervals up to 100 years. The reliability of the proposed RDDF equations represents the main concern of the study and it is assessed at two different levels. The first level considers the gauged sites and compares estimates of the design storm obtained with the RDDF equations with at-site estimates based upon the observed annual maximum series of rainfall depth and with design storm estimates resulting from a regional estimator recently developed for the study area through a Hierarchical Regional Approach (HRA) [Gabriele and Arnell, WRR, 27(6), 1991]. The second level performs a reliability assessment of the RDDF equations for ungauged sites by means of a jack-knife procedure. Using the HRA estimator as a reference term, the jack-knife procedure assesses the reliability of design storm estimates provided by the RDDF equations for a given location when dealing with the complete absence of pluviometric information. The results of the analysis show that the proposed RDDF equations represent practical and effective computational means for producing a first guess of the design storm at the available raingauges and reliable design storm estimates for ungauged locations. The first author gratefully acknowledges D.H. Burn for sponsoring the submission of the present abstract.
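The jack-knife idea described in this record can be sketched as a leave-one-site-out loop: each gauged site is treated in turn as if it were ungauged, the regional estimator is rebuilt from the remaining sites, and its prediction is compared with a reference value for the withheld site. The code below is only a minimal sketch under strong assumptions: `regional_mean` is a hypothetical stand-in for the RDDF equations, and the site names and rainfall depths are invented for illustration.

```python
def jackknife_relative_errors(reference_by_site, fit_and_predict):
    """Leave-one-site-out check of a regional estimator.

    reference_by_site: dict mapping site name -> reference design-storm
        depth (in the study this role is played by the HRA estimator).
    fit_and_predict: callable taking the dict of remaining sites and
        returning a prediction for the withheld site (hypothetical API).
    Returns a dict of relative errors, one per withheld site.
    """
    errors = {}
    for site, reference in reference_by_site.items():
        # Pretend this site is ungauged: drop it before fitting.
        others = {s: v for s, v in reference_by_site.items() if s != site}
        predicted = fit_and_predict(others)
        errors[site] = (predicted - reference) / reference
    return errors


def regional_mean(values_by_site):
    """Toy regional estimator: the mean depth over the available sites."""
    return sum(values_by_site.values()) / len(values_by_site)


if __name__ == "__main__":
    # Invented design-storm depths in mm, for illustration only.
    depths = {"A": 50.0, "B": 60.0, "C": 70.0}
    for site, err in jackknife_relative_errors(depths, regional_mean).items():
        print(site, round(err, 3))
```

Withholding the site before fitting is the essential step; evaluating a model at a site whose data helped calibrate it would overstate reliability for truly ungauged locations.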
The Berg Balance Scale has high intra- and inter-rater reliability but absolute reliability varies across the scale: a systematic review.

PubMed

Downs, Stephen; Marquez, Jodie; Chiarelli, Pauline

2013-06-01

What is the intra-rater and inter-rater relative reliability of the Berg Balance Scale? What is the absolute reliability of the Berg Balance Scale? Does the absolute reliability of the Berg Balance Scale vary across the scale? Systematic review with meta-analysis of reliability studies. Any clinical population that has undergone assessment with the Berg Balance Scale. Relative intra-rater reliability, relative inter-rater reliability, and absolute reliability. Eleven studies involving 668 participants were included in the review. The relative intra-rater reliability of the Berg Balance Scale was high, with a pooled estimate of 0.98 (95% CI 0.97 to 0.99). Relative inter-rater reliability was also high, with a pooled estimate of 0.97 (95% CI 0.96 to 0.98). A ceiling effect of the Berg Balance Scale was evident for some participants. In the analysis of absolute reliability, all of the relevant studies had an average score of 20 or above on the 0 to 56 point Berg Balance Scale. The absolute reliability across this part of the scale, as measured by the minimal detectable change with 95% confidence, varied between 2.8 points and 6.6 points. The Berg Balance Scale has a higher absolute reliability when close to 56 points due to the ceiling effect. We identified no data that estimated the absolute reliability of the Berg Balance Scale among participants with a mean score below 20 out of 56. The Berg Balance Scale has acceptable reliability, although it might not detect modest, clinically important changes in balance in individual subjects. The review was only able to comment on the absolute reliability of the Berg Balance Scale among people with moderately poor to normal balance. Copyright © 2013 Australian Physiotherapy Association. Published by .. All rights reserved.
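For readers unfamiliar with the minimal detectable change with 95% confidence (MDC95) mentioned in this record, the sketch below uses the conventional definition: the standard error of measurement is SEM = SD * sqrt(1 - ICC), and MDC95 = 1.96 * sqrt(2) * SEM. Whether the reviewed studies used exactly this formula is an assumption, and the standard deviation in the example is a hypothetical value, not one reported in the review.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence.

    The sqrt(2) factor reflects that a change score is the difference
    of two measurements, each carrying its own measurement error.
    """
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


if __name__ == "__main__":
    # Hypothetical SD of 10 points combined with the pooled
    # intra-rater reliability of 0.98 quoted in the abstract.
    print(round(mdc95(sd=10.0, icc=0.98), 2))
```

The example shows why a very high relative reliability can still leave a detectable-change threshold of several points: MDC95 scales with the spread of scores in the sample as well as with the ICC.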
Improving statistical inference on pathogen densities estimated by quantitative molecular methods: malaria gametocytaemia as a case study.

PubMed

Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S

2015-01-16

Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.
Evaluation of Dimensionality in the Assessment of Internal Consistency Reliability: Coefficient Alpha and Omega Coefficients

ERIC Educational Resources Information Center

Green, Samuel B.; Yang, Yanyun

2015-01-01

In the lead article, Davenport, Davison, Liou, & Love demonstrate the relationship among homogeneity, internal consistency, and coefficient alpha, and also distinguish among them. These distinctions are important because too often coefficient alpha--a reliability coefficient--is interpreted as an index of homogeneity or internal consistency.…
A statistical methodology for estimating transport parameters: Theory and applications to one-dimensional advective-dispersive systems

USGS Publications Warehouse

Wagner, Brian J.; Gorelick, Steven M.

1986-01-01

A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for dispersion coefficient and zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
An empirical Bayes approach for the Poisson life distribution.

NASA Technical Reports Server (NTRS)

Canavos, G. C.

1973-01-01

A smooth empirical Bayes estimator is derived for the intensity parameter (hazard rate) in the Poisson distribution as used in life testing. The reliability function is also estimated either by using the empirical Bayes estimate of the parameter, or by obtaining the expectation of the reliability function. The behavior of the empirical Bayes procedure is studied through Monte Carlo simulation in which estimates of mean-squared errors of the empirical Bayes estimators are compared with those of conventional estimators such as minimum variance unbiased or maximum likelihood. Results indicate a significant reduction in mean-squared error of the empirical Bayes estimators over the conventional variety.
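To make the two routes to a reliability estimate in this record concrete, the sketch below uses a deliberately simplified parametric stand-in: a gamma prior on the Poisson intensity, fitted to past failure counts by the method of moments. This is not the smooth empirical Bayes estimator derived in the paper. Taking reliability as R(t) = exp(-lambda * t), the probability of no failures by time t, is also an assumption, and all function names and counts are hypothetical.

```python
import math
from statistics import mean, variance


def fit_gamma_prior(counts):
    """Method-of-moments gamma prior (shape, rate) from past Poisson counts.

    Uses marginal mean m = shape/rate and marginal variance
    v = m + shape/rate**2, which requires overdispersion (v > m).
    """
    m, v = mean(counts), variance(counts)
    if v <= m:
        raise ValueError("counts are not overdispersed; moment fit undefined")
    rate = m / (v - m)
    shape = m * rate
    return shape, rate


def posterior(shape, rate, new_count, exposure=1.0):
    """Gamma posterior after observing new_count failures over exposure."""
    return shape + new_count, rate + exposure


def plug_in_reliability(shape, rate, t):
    """Route 1: insert the posterior-mean intensity into exp(-lambda * t)."""
    return math.exp(-(shape / rate) * t)


def expected_reliability(shape, rate, t):
    """Route 2: posterior expectation of exp(-lambda * t) under the gamma."""
    return (rate / (rate + t)) ** shape


if __name__ == "__main__":
    past_counts = [0, 2, 4, 6, 8]  # invented failure counts per unit time
    a, b = fit_gamma_prior(past_counts)
    a_post, b_post = posterior(a, b, new_count=1)
    print("posterior mean intensity:", round(a_post / b_post, 3))
    print("plug-in R(1):", round(plug_in_reliability(a_post, b_post, 1.0), 3))
    print("expected R(1):", round(expected_reliability(a_post, b_post, 1.0), 3))
```

The two routes generally differ: because exp(-lambda * t) is convex in lambda, the expected reliability is never smaller than the plug-in value computed at the posterior mean.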
The reliability of eyetracking to assess attentional bias to threatening words in healthy individuals.

PubMed

Skinner, Ian W; Hübscher, Markus; Moseley, G Lorimer; Lee, Hopin; Wand, Benedict M; Traeger, Adrian C; Gustin, Sylvia M; McAuley, James H

2017-08-15

Eyetracking is commonly used to investigate attentional bias. Although some studies have investigated the internal consistency of eyetracking, data are scarce on the test-retest reliability and agreement of eyetracking to investigate attentional bias. This study reports the test-retest reliability, measurement error, and internal consistency of 12 commonly used outcome measures thought to reflect the different components of attentional bias: overall attention, early attention, and late attention. Healthy participants completed a preferential-looking eyetracking task that involved the presentation of threatening (sensory words, general threat words, and affective words) and nonthreatening words. We used intraclass correlation coefficients (ICCs) to measure test-retest reliability (ICC > .70 indicates adequate reliability). The ICCs(2, 1) ranged from -.31 to .71. Reliability varied according to the outcome measure and threat word category. Sensory words had a lower mean ICC (.08) than either affective words (.32) or general threat words (.29). A longer exposure time was associated with higher test-retest reliability. All of the outcome measures, except second-run dwell time, demonstrated low measurement error (<6%). Most of the outcome measures reported high internal consistency (α > .93). Recommendations are discussed for improving the reliability of eyetracking tasks in future research.
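The ICC(2, 1) used in this record is the Shrout-Fleiss two-way random-effects, absolute-agreement, single-measurement coefficient. A minimal sketch of its computation from ANOVA mean squares follows, assuming a complete participants-by-sessions table with no missing cells; the function name and the toy scores are hypothetical.

```python
def icc_2_1(scores):
    """Shrout-Fleiss ICC(2,1) for an n x k table.

    scores: list of n rows (participants), each a list of k values
    (sessions or raters). Assumes no missing data.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-participants mean square
    jms = ss_cols / (k - 1)               # between-sessions mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


if __name__ == "__main__":
    # Invented test-retest scores: 5 participants, 2 sessions.
    toy = [[10, 12], [14, 13], [9, 11], [16, 18], [12, 12]]
    print(round(icc_2_1(toy), 3))
```

Because the between-sessions mean square appears in the denominator, a systematic shift between sessions lowers ICC(2, 1), and the estimate can be negative when residual variation exceeds between-participant variation, as in the lower end of the range reported above.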
A review of sex estimation techniques during examination of skeletal remains in forensic anthropology casework.

PubMed

Krishan, Kewal; Chatterjee, Preetika M; Kanchan, Tanuj; Kaur, Sandeep; Baryah, Neha; Singh, R K

2016-04-01

Sex estimation is considered one of the essential parameters in forensic anthropology casework and requires foremost consideration in the examination of skeletal remains. Forensic anthropologists frequently employ morphologic and metric methods for sex estimation of human remains. These methods remain essential in the identification process despite the advent and success of molecular techniques. The steadily growing use of imaging techniques in forensic anthropology research has made it easier to derive and revise the available population data. These methods, however, are less reliable owing to high variance and indistinct landmark details. The present review discusses the reliability and reproducibility of various analytical approaches (morphological, metric, molecular and radiographic methods) in sex estimation of skeletal remains. Numerous studies have shown a higher reliability and reproducibility of measurements taken directly on the bones; hence, such direct methods of sex estimation are considered more reliable than the other methods. The geometric morphometric (GM) method and the Diagnose Sexuelle Probabiliste (DSP) method are emerging as valid and widely used techniques in forensic anthropology in terms of accuracy and reliability. In addition, the newer 3D methods have been shown to exhibit specific sexual dimorphism patterns not readily revealed by traditional methods. Development of newer and better methodologies for sex estimation, as well as re-evaluation of the existing ones, will continue as forensic researchers pursue more accurate results. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
