Integrated performance and reliability specification for digital avionics systems
NASA Technical Reports Server (NTRS)
Brehm, Eric W.; Goettge, Robert T.
1995-01-01
This paper describes an automated tool for performance and reliability assessment of digital avionics systems, called the Automated Design Tool Set (ADTS). ADTS is based on an integrated approach to design assessment that unifies traditional performance and reliability views of system designs, and that addresses interdependencies between performance and reliability behavior via exchange of parameters and results between mathematical models of each type. A multi-layer tool set architecture has been developed for ADTS that separates the concerns of system specification, model generation, and model solution. Performance and reliability models are generated automatically as a function of candidate system designs, and model results are expressed within the system specification. The layered approach helps deal with the inherent complexity of the design assessment process, and preserves long-term flexibility to accommodate a wide range of models and solution techniques within the tool set structure. ADTS research and development to date has focused on development of a language for specification of system designs as a basis for performance and reliability evaluation. A model generation and solution framework has also been developed for ADTS that will ultimately encompass an integrated set of analytic and simulation-based techniques for performance, reliability, and combined design assessment.
Cai, Gaigai; Chen, Xuefeng; Li, Bing; Chen, Baojia; He, Zhengjia
2012-01-01
The reliability of cutting tools is critical to machining precision and production efficiency. The conventional statistic-based reliability assessment method aims at providing a general and overall estimation of reliability for a large population of identical units under given and fixed conditions. However, it has limited effectiveness in depicting the operational characteristics of a cutting tool. To overcome this limitation, this paper proposes an approach to assess the operation reliability of cutting tools. A proportional covariate model is introduced to construct the relationship between operation reliability and condition monitoring information. The wavelet packet transform and an improved distance evaluation technique are used to extract sensitive features from vibration signals, and a covariate function is constructed based on the proportional covariate model. Ultimately, the failure rate function of the cutting tool being assessed is calculated using the baseline covariate function obtained from a small sample of historical data. Experimental results and a comparative study show that the proposed method is effective for assessing the operation reliability of cutting tools. PMID:23201980
A Model for Assessing the Liability of Seemingly Correct Software
NASA Technical Reports Server (NTRS)
Voas, Jeffrey M.; Voas, Larry K.; Miller, Keith W.
1991-01-01
Current research on software reliability does not lend itself to quantitatively assessing the risk posed by a piece of life-critical software. Black-box software reliability models are too general and make too many assumptions to be applied confidently to assessing the risk of life-critical software. We present a model for assessing the risk caused by a piece of software; this model combines software testing results and Hamlet's probable correctness model. We show how this model can assess software risk for those who insure against a loss that can occur if life-critical software fails.
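For readers unfamiliar with Hamlet's probable correctness model referenced above, the way black-box test results translate into a confidence statement can be illustrated with the standard bound relating failure-free tests to confidence that the failure probability lies below a target. This is a minimal sketch of that textbook relationship only, not the authors' combined risk model; the function names and example numbers are illustrative.

```python
import math

def confidence_failure_rate_below(theta: float, successful_tests: int) -> float:
    """Probable-correctness bound: if T independent tests drawn from the
    operational profile all succeed, confidence that the true failure
    probability is below theta is 1 - (1 - theta)**T."""
    return 1.0 - (1.0 - theta) ** successful_tests

def tests_needed(theta: float, confidence: float) -> int:
    """Failure-free tests required to claim, with the given confidence,
    that the failure probability is below theta."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - theta))

# Example: about 2995 failure-free tests support 95% confidence
# that the failure probability is below 1e-3.
print(tests_needed(1e-3, 0.95))                   # -> 2995
print(confidence_failure_rate_below(1e-3, 2995))  # -> ~0.95
```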
Chen, J D; Sun, H L
1999-04-01
Objective. To assess and predict the reliability of equipment dynamically by making full use of the various test information generated during product development. Method. A new reliability growth assessment method based on the Army Materiel Systems Analysis Activity (AMSAA) model was developed. The method is composed of the AMSAA model and test data conversion technology. Result. The assessment and prediction results for a piece of space-borne equipment conform to expectations. Conclusion. It is suggested that this method be further researched and popularized.
Using LISREL to Evaluate Measurement Models and Scale Reliability.
ERIC Educational Resources Information Center
Fleishman, John; Benson, Jeri
1987-01-01
The LISREL program was used to examine measurement model assumptions and to assess the reliability of the Coopersmith Self-Esteem Inventory for Children, Form B. Data on 722 third- through sixth-graders from over 70 schools in a large urban school district were used. The LISREL program assessed (1) the nature of the basic measurement model for the scale, (2) scale invariance across…
ERIC Educational Resources Information Center
Brady, Michael P.; Heiser, Lawrence A.; McCormick, Jazarae K.; Forgan, James
2016-01-01
High-stakes standardized student assessments are increasingly used in value-added evaluation models to connect teacher performance to P-12 student learning. These assessments are also being used to evaluate teacher preparation programs, despite validity and reliability threats. A more rational model linking student performance to candidates who…
Scaled CMOS Technology Reliability Users Guide
NASA Technical Reports Server (NTRS)
White, Mark
2010-01-01
The desire to assess the reliability of emerging scaled microelectronics technologies through faster reliability trials and more accurate acceleration models is the precursor for further research and experimentation in this relevant field. The effect of semiconductor scaling on microelectronics product reliability is an important aspect to the high reliability application user. From the perspective of a customer or user, who in many cases must deal with very limited, if any, manufacturer's reliability data to assess the product for a highly-reliable application, product-level testing is critical in the characterization and reliability assessment of advanced nanometer semiconductor scaling effects on microelectronics reliability. A methodology on how to accomplish this and techniques for deriving the expected product-level reliability on commercial memory products are provided. Competing mechanism theory and the multiple failure mechanism model are applied to the experimental results of scaled SDRAM products. Accelerated stress testing at multiple conditions is applied at the product level of several scaled memory products to assess the performance degradation and product reliability. Acceleration models are derived for each case. For several scaled SDRAM products, retention time degradation is studied and two distinct soft error populations are observed with each technology generation: early breakdown, characterized by randomly distributed weak bits with Weibull slope (beta)=1, and a main population breakdown with an increasing failure rate. Retention time soft error rates are calculated and a multiple failure mechanism acceleration model with parameters is derived for each technology. Defect densities are calculated and reflect a decreasing trend in the percentage of random defective bits for each successive product generation. A normalized soft error failure rate of the memory data retention time in FIT/Gb and FIT/cm2 for several scaled SDRAM generations is presented, revealing a power relationship. General models describing the soft error rates across scaled product generations are presented. The analysis methodology may be applied to other scaled microelectronic products and their key parameters.
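For context on the competing-mechanism and multiple-failure-mechanism modelling mentioned above, a common formulation sums per-mechanism failure rates, each de-rated from stress to use conditions by its own Arrhenius acceleration factor. The sketch below shows that generic calculation only; the activation energies and FIT values are hypothetical, not the fitted SDRAM parameters from the paper.

```python
import math

K_B = 8.617e-5  # Boltzmann constant [eV/K]

def arrhenius_af(ea_ev: float, t_use_k: float, t_stress_k: float) -> float:
    """Thermal acceleration factor between stress and use temperatures."""
    return math.exp(ea_ev / K_B * (1.0 / t_use_k - 1.0 / t_stress_k))

# Hypothetical mechanisms: (activation energy [eV], failure rate measured at stress [FIT]).
mechanisms = {
    "retention-time weak bits": (0.6, 400.0),
    "main-population wearout":  (1.0, 150.0),
}

t_use, t_stress = 273.15 + 55.0, 273.15 + 125.0
fit_use = 0.0
for name, (ea, fit_stress) in mechanisms.items():
    af = arrhenius_af(ea, t_use, t_stress)
    fit_use += fit_stress / af   # de-rate each mechanism by its own acceleration factor
    print(f"{name}: acceleration factor = {af:.1f}")
print(f"total failure rate at use conditions ~ {fit_use:.1f} FIT")
```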
Human Reliability Analysis in Support of Risk Assessment for Positive Train Control
DOT National Transportation Integrated Search
2003-06-01
This report describes an approach to evaluating the reliability of human actions that are modeled in a probabilistic risk assessment : (PRA) of train control operations. This approach to human reliability analysis (HRA) has been applied in the case o...
Advanced reliability modeling of fault-tolerant computer-based systems
NASA Technical Reports Server (NTRS)
Bavuso, S. J.
1982-01-01
Two methodologies for the reliability assessment of fault-tolerant digital computer-based systems are discussed. The computer-aided reliability estimation 3 (CARE 3) and gate logic software simulation (GLOSS) are assessment technologies that were developed to mitigate a serious weakness in the design and evaluation process of ultrareliable digital systems. The weak link is the unavailability of a sufficiently powerful modeling technique for comparing the stochastic attributes of one system against those of others. Some of the more interesting attributes are reliability, system survival, safety, and mission success.
Towards early software reliability prediction for computer forensic tools (case study).
Abu Talib, Manar
2016-01-01
Versatility, flexibility and robustness are essential requirements for software forensic tools. Researchers and practitioners need to put more effort into assessing this type of tool. A Markov model is a robust means for analyzing and anticipating the functioning of an advanced component-based system. It is used, for instance, to analyze the reliability of the state machines of real-time reactive systems. This research extends the architecture-based software reliability prediction model for computer forensic tools, which is based on Markov chains and COSMIC-FFP. Basically, every part of the computer forensic tool is linked to a discrete-time Markov chain. If this can be done, then a probabilistic analysis by Markov chains can be performed to analyze the reliability of the components and of the whole tool. The purposes of the proposed reliability assessment method are to evaluate the tool's reliability in the early phases of its development, to improve the reliability assessment process for large computer forensic tools over time, and to compare alternative tool designs. The reliability analysis can assist designers in choosing the most reliable topology for the components, which can maximize the reliability of the tool and meet the expected reliability level specified by the end-user. The approach of assessing component-based tool reliability in the COSMIC-FFP context is illustrated with the Forensic Toolkit Imager case study.
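The architecture-based prediction described above couples a discrete-time Markov chain of control transfer between components with per-component reliabilities. As an orientation aid, the sketch below shows the classic Cheung-style computation for such a model; the components, transition probabilities, and reliability values are hypothetical and are not taken from the Forensic Toolkit Imager case study.

```python
import numpy as np

# Hypothetical three-component tool: control starts in component 0 and
# terminates after component 2. P[i, j] is the probability that control
# transfers from component i to component j; R[i] is the probability
# that component i executes correctly.
P = np.array([[0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])      # component 2 is the exit component
R = np.array([0.999, 0.995, 0.998])

# Cheung-style model: M[i, j] = R[i] * P[i, j]. The entry [(I - M)^-1][0, k]
# is the expected number of failure-free visits to component k starting from
# component 0; for the terminal component it equals the probability of
# reaching it without failure.
M = np.diag(R) @ P
N = np.linalg.inv(np.eye(len(R)) - M)

# System reliability = probability of reaching the exit component failure-free,
# times the probability that the exit component itself executes correctly.
system_reliability = N[0, -1] * R[-1]
print(f"estimated system reliability: {system_reliability:.4f}")
```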
Apples to Apples: Equivalent-Reliability Power Systems Across Diverse Resource Mix Scenarios
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stephen, Gordon W; Frew, Bethany A; Sigler, Devon
Electricity market research is highly price sensitive, and prices are strongly influenced by balance of supply and demand. This work looks at how to combine capacity expansion models and reliability assessment tools to assess equivalent-reliability power systems across diverse resource mix scenarios.
Structural reliability assessment capability in NESSUS
NASA Technical Reports Server (NTRS)
Millwater, H.; Wu, Y.-T.
1992-01-01
The principal capabilities of NESSUS (Numerical Evaluation of Stochastic Structures Under Stress), an advanced computer code developed for probabilistic structural response analysis, are reviewed, and its structural reliability assessed. The code combines flexible structural modeling tools with advanced probabilistic algorithms in order to compute probabilistic structural response and resistance, component reliability and risk, and system reliability and risk. An illustrative numerical example is presented.
Structural reliability assessment capability in NESSUS
NASA Astrophysics Data System (ADS)
Millwater, H.; Wu, Y.-T.
1992-07-01
The principal capabilities of NESSUS (Numerical Evaluation of Stochastic Structures Under Stress), an advanced computer code developed for probabilistic structural response analysis, are reviewed, and its structural reliability assessed. The code combines flexible structural modeling tools with advanced probabilistic algorithms in order to compute probabilistic structural response and resistance, component reliability and risk, and system reliability and risk. An illustrative numerical example is presented.
Rollover risk prediction of heavy vehicles by reliability index and empirical modelling
NASA Astrophysics Data System (ADS)
Sellami, Yamine; Imine, Hocine; Boubezoul, Abderrahmane; Cadiou, Jean-Charles
2018-03-01
This paper focuses on a combination of a reliability-based approach and an empirical modelling approach for rollover risk assessment of heavy vehicles. A reliability-based warning system is developed to alert the driver to a potential rollover before entering a bend. The idea behind the proposed methodology is to estimate the rollover risk by the probability that the vehicle load transfer ratio (LTR) exceeds a critical threshold. Accordingly, a so-called reliability index may be used as a measure to assess the vehicle's safe functioning. In the reliability method, computing the maximum of the LTR requires predicting the vehicle dynamics over the bend, which can in some cases be intractable or time-consuming. With the aim of improving the reliability computation time, an empirical model is developed to substitute for the vehicle dynamics and rollover models. This is done by using the SVM (Support Vector Machines) algorithm. The preliminary results demonstrate the effectiveness of the proposed approach.
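As a rough illustration of the probabilistic formulation described above (not the authors' SVM surrogate or vehicle dynamics model), the sketch below estimates the probability that the maximum load transfer ratio exceeds its critical value from sampled predictions and converts it into a reliability index; the distribution and its parameters are hypothetical.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical uncertainty on the predicted maximum LTR over the upcoming bend,
# modelled here as a normal distribution around a nominal prediction.
max_ltr = rng.normal(loc=0.72, scale=0.10, size=100_000)

ltr_critical = 1.0                        # |LTR| = 1 corresponds to wheel lift-off
p_rollover = np.mean(max_ltr >= ltr_critical)

# Reliability index in the FORM sense: beta = -Phi^{-1}(failure probability).
beta = -norm.ppf(p_rollover)
print(f"P(max LTR >= {ltr_critical}) = {p_rollover:.4f}, reliability index beta = {beta:.2f}")
```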
2011-11-01
assessment to quality of localization/characterization estimates. This protocol includes four critical components: (1) a procedure to identify the...critical factors impacting SHM system performance; (2) a multistage or hierarchical approach to SHM system validation; (3) a model-assisted evaluation...Lindgren, E. A., Buynak, C. F., Steffes, G., Derriso, M., "Model-assisted Probabilistic Reliability Assessment for Structural Health Monitoring"
Reevaluating Interrater Reliability in Offender Risk Assessment
ERIC Educational Resources Information Center
van der Knaap, Leontien M.; Leenarts, Laura E. W.; Born, Marise Ph.; Oosterveld, Paul
2012-01-01
Offender risk and needs assessment, one of the pillars of the risk-need-responsivity model of offender rehabilitation, usually depends on raters assessing offender risk and needs. The few available studies of interrater reliability in offender risk assessment are, however, limited in the generalizability of their results. The present study…
NASA Astrophysics Data System (ADS)
Chen, Fan; Huang, Shaoxiong; Ding, Jinjin; Ding, Jinjin; Gao, Bo; Xie, Yuguang; Wang, Xiaoming
2018-01-01
This paper proposes a fast reliability assessment method for a distribution grid with distributed renewable energy generation. First, the Weibull distribution and the Beta distribution are used to describe the probability distribution characteristics of wind speed and solar irradiance, respectively, and models of the wind farm, solar park, and local load are built for reliability assessment. Then, based on power system production cost simulation, probability discretization, and linearized power flow, an optimal power flow problem that minimizes the cost of conventional power generation is solved. A reliability assessment for the distribution grid is thus implemented quickly and accurately. The Loss Of Load Probability (LOLP) and Expected Energy Not Supplied (EENS) are selected as the reliability indices; a MATLAB simulation of the IEEE RBTS Bus 6 system indicates that the fast reliability assessment method calculates the reliability indices much faster than the Monte Carlo method while maintaining accuracy.
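For orientation, the probability models and indices named in the abstract can be reproduced with a plain sequential Monte Carlo sketch; the paper's contribution is precisely to avoid this brute-force approach, so the code below is only a reference implementation of LOLP and EENS, and every parameter (Weibull shape and scale, Beta parameters, capacities, load) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n_hours = 8760

# Hypothetical resource and load models.
wind_speed = rng.weibull(a=2.0, size=n_hours) * 8.0          # Weibull wind speed [m/s]
irradiance = rng.beta(a=2.0, b=2.0, size=n_hours) * 1000.0   # Beta solar irradiance [W/m^2]
load = rng.normal(loc=8.0, scale=1.5, size=n_hours)          # hourly load [MW]

# Very simple conversion of resources to available generation [MW].
wind_power = np.clip((wind_speed - 3.0) / (12.0 - 3.0), 0.0, 1.0) * 6.0   # 6 MW wind farm
solar_power = irradiance / 1000.0 * 4.0                                   # 4 MW solar park
conventional = 4.0                                                        # firm capacity
available = wind_power + solar_power + conventional

shortfall = np.maximum(load - available, 0.0)
lolp = np.mean(shortfall > 0.0)    # Loss Of Load Probability
eens = shortfall.sum()             # Expected Energy Not Supplied [MWh/yr]
print(f"LOLP = {lolp:.3f}, EENS = {eens:.1f} MWh/yr")
```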
Boser, Quinn A; Valevicius, Aïda M; Lavoie, Ewen B; Chapman, Craig S; Pilarski, Patrick M; Hebert, Jacqueline S; Vette, Albert H
2018-04-27
Quantifying angular joint kinematics of the upper body is a useful method for assessing upper limb function. Joint angles are commonly obtained via motion capture, tracking markers placed on anatomical landmarks. This method is associated with limitations including administrative burden, soft tissue artifacts, and intra- and inter-tester variability. An alternative method involves the tracking of rigid marker clusters affixed to body segments, calibrated relative to anatomical landmarks or known joint angles. The accuracy and reliability of applying this cluster method to the upper body has, however, not been comprehensively explored. Our objective was to compare three different upper body cluster models with an anatomical model, with respect to joint angles and reliability. Non-disabled participants performed two standardized functional upper limb tasks with anatomical and cluster markers applied concurrently. Joint angle curves obtained via the marker clusters with three different calibration methods were compared to those from an anatomical model, and between-session reliability was assessed for all models. The cluster models produced joint angle curves which were comparable to and highly correlated with those from the anatomical model, but exhibited notable offsets and differences in sensitivity for some degrees of freedom. Between-session reliability was comparable between all models, and good for most degrees of freedom. Overall, the cluster models produced reliable joint angles that, however, cannot be used interchangeably with anatomical model outputs to calculate kinematic metrics. Cluster models appear to be an adequate, and possibly advantageous alternative to anatomical models when the objective is to assess trends in movement behavior. Copyright © 2018 Elsevier Ltd. All rights reserved.
A study of fault prediction and reliability assessment in the SEL environment
NASA Technical Reports Server (NTRS)
Basili, Victor R.; Patnaik, Debabrata
1986-01-01
An empirical study on estimation and prediction of faults, prediction of fault detection and correction effort, and reliability assessment in the Software Engineering Laboratory environment (SEL) is presented. Fault estimation using empirical relationships and fault prediction using curve fitting method are investigated. Relationships between debugging efforts (fault detection and correction effort) in different test phases are provided, in order to make an early estimate of future debugging effort. This study concludes with the fault analysis, application of a reliability model, and analysis of a normalized metric for reliability assessment and reliability monitoring during development of software.
Validation of urban freeway models. [supporting datasets
DOT National Transportation Integrated Search
2015-01-01
The goal of the SHRP 2 Project L33 Validation of Urban Freeway Models was to assess and enhance the predictive travel time reliability models developed in the SHRP 2 Project L03, Analytic Procedures for Determining the Impacts of Reliability Mitigati...
NASA Astrophysics Data System (ADS)
Hancock, G. R.; Webb, A. A.; Turner, L.
2017-11-01
Sediment transport and soil erosion can be determined by a variety of field and modelling approaches. Computer based soil erosion and landscape evolution models (LEMs) offer the potential to be reliable assessment and prediction tools. An advantage of such models is that they provide both erosion and deposition patterns as well as total catchment sediment output. However, before use, like all models they require calibration and validation. In recent years LEMs have been used for a variety of both natural and disturbed landscape assessment. However, these models have not been evaluated for their reliability in steep forested catchments. Here, the SIBERIA LEM is calibrated and evaluated for its reliability for two steep forested catchments in south-eastern Australia. The model is independently calibrated using two methods. Firstly, hydrology and sediment transport parameters are inferred from catchment geomorphology and soil properties and secondly from catchment sediment transport and discharge data. The results demonstrate that both calibration methods provide similar parameters and reliable modelled sediment transport output. A sensitivity study of the input parameters demonstrates the model's sensitivity to correct parameterisation and also how the model could be used to assess potential timber harvesting as well as the removal of vegetation by fire.
Spaceflight tracking and data network operational reliability assessment for Skylab
NASA Technical Reports Server (NTRS)
Seneca, V. I.; Mlynarczyk, R. H.
1974-01-01
Data on the spaceflight communications equipment status during the Skylab mission were subjected to an operational reliability assessment. Reliability models were revised to reflect pertinent equipment changes accomplished prior to the beginning of the Skylab missions. Appropriate adjustments were made to fit the data to the models. The availabilities are based on the failure events resulting in a station's inability to support a function or functions, and the MTBFs are based on all events, including 'can support' and 'cannot support'. Data were received from eleven land-based stations and one ship.
Aarons, Gregory A; McDonald, Elizabeth J; Connelly, Cynthia D; Newton, Rae R
2007-12-01
The purpose of this study was to examine the factor structure, reliability, and validity of the Family Assessment Device (FAD) among a national sample of Caucasian and Hispanic American families receiving public sector mental health services. A confirmatory factor analysis conducted to test model fit yielded equivocal findings. With few exceptions, indices of model fit, reliability, and validity were poorer for Hispanic Americans compared with Caucasian Americans. Contrary to our expectation, an exploratory factor analysis did not result in a better fitting model of family functioning. Without stronger evidence supporting a reformulation of the FAD, we recommend against such a course of action. Findings highlight the need for additional research on the role of culture in measurement of family functioning.
Silva, Wanderson Roberto; Costa, David; Pimenta, Filipa; Maroco, João; Campos, Juliana Alvares Duarte Bonini
2016-07-21
The objectives of this study were to develop a unified Portuguese-language version, for use in Brazil and Portugal, of the Body Shape Questionnaire (BSQ) and to estimate its validity, reliability, and internal consistency in Brazilian and Portuguese female university students. Confirmatory factor analysis was performed using both original (34-item) and shortened (8-item) versions. The model's fit was assessed with χ²/df, CFI, NFI, and RMSEA. Concurrent and convergent validity were assessed. Reliability was estimated through internal consistency and composite reliability (α). Transnational invariance of the BSQ was tested using multi-group analysis. The original 32-item model was refined to present a better fit and adequate validity and reliability. The shortened model was stable in both independent samples and in transnational samples (Brazil and Portugal). The use of this unified version is recommended for the assessment of body shape concerns in both Brazilian and Portuguese college students.
NASA Astrophysics Data System (ADS)
Ha, Taesung
A probabilistic risk assessment (PRA) was conducted for a loss of coolant accident (LOCA) in the McMaster Nuclear Reactor (MNR). A level 1 PRA was completed including event sequence modeling, system modeling, and quantification. To support the quantification of the accident sequences identified, data analysis using the Bayesian method and human reliability analysis (HRA) using the accident sequence evaluation procedure (ASEP) approach were performed. Since human performance in research reactors is significantly different from that in power reactors, a time-oriented HRA model (reliability physics model) was applied for the human error probability (HEP) estimation of the core relocation. This model is based on two competing random variables: phenomenological time and performance time. The response surface and direct Monte Carlo simulation with Latin Hypercube sampling were applied for estimating the phenomenological time, whereas the performance time was obtained from interviews with operators. An appropriate probability distribution for the phenomenological time was assigned by statistical goodness-of-fit tests. The human error probability (HEP) for the core relocation was estimated from these two competing quantities: phenomenological time and operators' performance time. The sensitivity of each probability distribution in human reliability estimation was investigated. In order to quantify the uncertainty in the predicted HEPs, a Bayesian approach was selected due to its capability of incorporating uncertainties in the model itself and in the parameters of that model. The HEP from the current time-oriented model was compared with that from the ASEP approach. Both results were used to evaluate the sensitivity of alternative human reliability modeling for the manual core relocation in the LOCA risk model. This exercise demonstrated the applicability of a reliability physics model supplemented with a Bayesian approach for modeling human reliability and its potential usefulness for quantifying model uncertainty as sensitivity analysis in the PRA model.
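The time-oriented (reliability physics) formulation described above treats the human error probability as the chance that the operators' performance time exceeds the phenomenological time available. The sketch below shows that competing-random-variables calculation by simple Monte Carlo; the lognormal distributions and their parameters are hypothetical stand-ins, not the MNR-specific results of the dissertation.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000

# Hypothetical distributions (illustrative only): time available before core
# damage phenomena occur, and time the operators need for the manual action.
t_phenomenological = rng.lognormal(mean=np.log(45.0), sigma=0.3, size=n)  # minutes
t_performance = rng.lognormal(mean=np.log(25.0), sigma=0.5, size=n)       # minutes

# Human error probability: the action takes longer than the time available.
hep = np.mean(t_performance > t_phenomenological)
print(f"HEP = P(performance time > phenomenological time) = {hep:.3e}")
```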
A Survey of Software Reliability Modeling and Estimation
1983-09-01
considered include: the Jelinski-Moranda Model, the Geometric Model, and Musa's Model. A Monte Carlo study of the behavior of the least squares...ceedings Number 261, 1979, pp. 34-1, 34-11. ...Sukert, Alan and Goel, Amrit, "A Guidebook for Software Reliability Assessment," 1980
Relating design and environmental variables to reliability
NASA Astrophysics Data System (ADS)
Kolarik, William J.; Landers, Thomas L.
The combination of space application and nuclear power source demands high reliability hardware. The possibilities of failure, either an inability to provide power or a catastrophic accident, must be minimized. Nuclear power experience on the ground has led to highly sophisticated probabilistic risk assessment procedures, most of which require quantitative information to adequately assess such risks. In the area of hardware risk analysis, reliability information plays a key role. One of the lessons learned from the Three Mile Island experience is that thorough analyses of critical components are essential. Nuclear-grade equipment shows some reliability advantages over commercial equipment; however, no statistically significant difference has been found. A recent study pertaining to spacecraft electronics reliability examined some 2500 malfunctions on more than 300 aircraft. The study classified the equipment failures into seven general categories. Design deficiencies and lack of environmental protection accounted for about half of all failures. Within each class, limited reliability modeling was performed using a Weibull failure model.
System and Software Reliability (C103)
NASA Technical Reports Server (NTRS)
Wallace, Dolores
2003-01-01
Within the last decade, better reliability models (hardware, software, system) than those currently used have been theorized and developed but not implemented in practice. Previous research on software reliability has shown that while some existing software reliability models are practical, they are not accurate enough. New paradigms of development (e.g., OO) have appeared and associated reliability models have been proposed but not investigated. Hardware models have been extensively investigated but not integrated into a system framework. System reliability modeling is the weakest of the three. NASA engineers need better methods and tools to demonstrate that the products meet NASA requirements for reliability measurement. For the new models for the software component of the last decade, there is a great need to bring them into a form in which they can be used on software-intensive systems. The Statistical Modeling and Estimation of Reliability Functions for Systems (SMERFS'3) tool is an existing vehicle that may be used to incorporate these new modeling advances. Adapting some existing software reliability modeling changes to accommodate major changes in software development technology may also show substantial improvement in prediction accuracy. With some additional research, the next step is to identify and investigate system reliability. System reliability models could then be incorporated in a tool such as SMERFS'3. This tool with better models would greatly add value in assessing GSFC projects.
Hou, Xianlong; Hodges, Ben R; Feng, Dongyu; Liu, Qixiao
2017-03-15
As oil transport increases in the Texas bays, greater risk of ship collisions becomes a challenge, yielding oil spill accidents as a consequence. To minimize the ecological damage and optimize rapid response, emergency managers need to be informed of how fast and where oil will spread as soon as possible after a spill. The state-of-the-art operational oil spill forecast modeling system improves the oil spill response into a new stage. However, uncertainty in the predicted data inputs often compromises the reliability of the forecast result, leading to misdirection in contingency planning. Thus, understanding the forecast uncertainty and reliability becomes significant. In this paper, Monte Carlo simulation is implemented to provide parameters to generate forecast probability maps. The oil spill forecast uncertainty is thus quantified by comparing the forecast probability map and the associated hindcast simulation. A HyosPy-based simple statistical model is developed to assess the reliability of an oil spill forecast in terms of belief degree. The technologies developed in this study create a prototype for uncertainty and reliability analysis in a numerical oil spill forecast modeling system, helping emergency managers improve the capability of real-time operational oil spill response and impact assessment. Copyright © 2017 Elsevier Ltd. All rights reserved.
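A forecast probability map of the kind described above can be assembled by running an ensemble of perturbed simulations and counting, for each grid cell, the fraction of ensemble members whose particles reach it. The minimal sketch below illustrates only that aggregation step, using synthetic particle positions rather than HyosPy output; the grid and ensemble sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n_members, n_particles = 50, 200

# Synthetic final particle positions per ensemble member (stand-in for
# oil-transport model output under perturbed wind/current forcing).
x = rng.normal(loc=10.0, scale=2.0, size=(n_members, n_particles))
y = rng.normal(loc=5.0, scale=1.0, size=(n_members, n_particles))

# Grid over the bay; a cell counts as "oiled" by a member if any of that
# member's particles land in it.
x_edges = np.linspace(0.0, 20.0, 41)
y_edges = np.linspace(0.0, 10.0, 21)
hit_counts = np.zeros((len(x_edges) - 1, len(y_edges) - 1))
for m in range(n_members):
    h, _, _ = np.histogram2d(x[m], y[m], bins=[x_edges, y_edges])
    hit_counts += (h > 0)

probability_map = hit_counts / n_members   # fraction of members oiling each cell
print(f"peak cell probability: {probability_map.max():.2f}")
```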
Skog, Alexander; Peyre, Sarah E; Pozner, Charles N; Thorndike, Mary; Hicks, Gloria; Dellaripa, Paul F
2012-01-01
The situational leadership model suggests that an effective leader adapts leadership style depending on the followers' level of competency. We assessed the applicability and reliability of the situational leadership model when observing residents in simulated hospital floor-based scenarios. Resident teams engaged in clinical simulated scenarios. Video recordings were divided into clips based on Emergency Severity Index v4 acuity scores. Situational leadership styles were identified in clips by two physicians. Interrater reliability was determined through descriptive statistical data analysis. There were 114 participants recorded in 20 sessions, and 109 clips were reviewed and scored. There was a high level of interrater reliability (weighted kappa r = .81) supporting situational leadership model's applicability to medical teams. A suggestive correlation was found between frequency of changes in leadership style and the ability to effectively lead a medical team. The situational leadership model represents a unique tool to assess medical leadership performance in the context of acuity changes.
Predicting Cost/Reliability/Maintainability of Advanced General Aviation Avionics Equipment
NASA Technical Reports Server (NTRS)
Davis, M. R.; Kamins, M.; Mooz, W. E.
1978-01-01
A methodology is provided for assisting NASA in estimating the cost, reliability, and maintenance (CRM) requirements for general aviation avionics equipment operating in the 1980's. Practical problems of predicting these factors are examined. The usefulness and shortcomings of different approaches for modeling cost and reliability estimates are discussed, together with special problems caused by the lack of historical data on the cost of maintaining general aviation avionics. Suggestions are offered on how NASA might proceed in assessing CRM implications in the absence of reliable generalized predictive models.
Reliability and maintainability assessment factors for reliable fault-tolerant systems
NASA Technical Reports Server (NTRS)
Bavuso, S. J.
1984-01-01
A long-term goal of the NASA Langley Research Center is the development of a reliability assessment methodology of sufficient power to enable the credible comparison of the stochastic attributes of one ultrareliable system design against others. This methodology, developed over a 10-year period, is a combined analytic and simulative technique. An analytic component is the Computer Aided Reliability Estimation capability, third generation, or simply CARE III. A simulative component is the Gate Logic Software Simulator capability, or GLOSS. The numerous factors that potentially have a degrading effect on system reliability, and the ways in which these factors peculiar to highly reliable fault-tolerant systems are accounted for in credible reliability assessments, are presented. Also presented are the modeling difficulties that result from their inclusion and the ways in which CARE III and GLOSS mitigate the intractability of the heretofore unworkable mathematics.
ERIC Educational Resources Information Center
Schumacker, Randall E.; Smith, Everett V., Jr.
2007-01-01
Measurement error is a common theme in classical measurement models used in testing and assessment. In classical measurement models, the definition of measurement error and the subsequent reliability coefficients differ on the basis of the test administration design. Internal consistency reliability specifies error due primarily to poor item…
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompass Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
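One of the worked topics listed above, estimating a leak or failure rate with Bayesian methods, reduces in its simplest form to a conjugate Gamma-Poisson update. The sketch below shows that standard calculation; the prior parameters and the observed event counts are hypothetical and are not drawn from the course material.

```python
from scipy.stats import gamma

# Conjugate Gamma prior on a constant leak/failure rate lambda [events/year]:
# lambda ~ Gamma(alpha0, rate=beta0). Hypothetical prior with mean 0.1/yr.
alpha0, beta0 = 0.5, 5.0

# Hypothetical observations: 2 events over 30 pipeline-years of exposure.
events, exposure = 2, 30.0

# Posterior is again Gamma with updated shape and rate.
alpha_post, beta_post = alpha0 + events, beta0 + exposure
posterior = gamma(a=alpha_post, scale=1.0 / beta_post)

print(f"posterior mean rate = {posterior.mean():.4f} events/year")
print(f"95% credible interval = ({posterior.ppf(0.025):.4f}, {posterior.ppf(0.975):.4f})")
```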
Validation and Improvement of Reliability Methods for Air Force Building Systems
focusing primarily on HVAC systems. This research used contingency analysis to assess the performance of each model for HVAC systems at six Air Force...probabilistic model produced inflated reliability calculations for HVAC systems. In light of these findings, this research employed a stochastic method, a...Nonhomogeneous Poisson Process (NHPP), in an attempt to produce accurate HVAC system reliability calculations. This effort ultimately concluded that
Li, Yingshuang; Ding, Chunge
2017-01-01
The Adult Carer Quality of Life questionnaire (AC-QoL) is a reliable and valid instrument used to assess the quality of life (QoL) of adult family caregivers. We explored the psychometric properties of a Chinese version of the AC-QoL and tested its reliability and validity in 409 Chinese stroke caregivers. We used item-total correlation and extreme group comparison for item analysis. To evaluate its reliability, we used a test-retest reliability approach, the intraclass correlation coefficient (ICC), together with Cronbach's alpha and a model-based internal consistency index; to evaluate its validity, we used scale content validity, confirmatory factor analysis (CFA), and exploratory factor analysis (EFA) via principal component analysis with varimax rotation. We found that the CFA did not confirm the original factor model and our EFA yielded a 31-item measure with a five-factor model. In conclusion, although some items performed differently in our analyses of the original English-language version and our Chinese-language version, the translated AC-QoL is a reliable and valid tool which can be used to assess the quality of life of stroke caregivers in mainland China. The Chinese version of the AC-QoL is a comprehensive and sound measure for understanding caregivers and has the potential to be a screening tool to assess the QoL of caregivers. PMID:29131845
Reliability Analysis of the Adult Mentoring Assessment for Extension Professionals
ERIC Educational Resources Information Center
Denny, Marina D'Abreau
2017-01-01
The Adult Mentoring Assessment for Extension Professionals will help mentors develop an accurate profile of their mentoring style with adult learners and identify areas of proficiency and deficiency based on six constructs--relationship, information, facilitation, confrontation, modeling, and vision. This article reports on the reliability of this…
The Role of Assessment in a Response to Intervention Model
ERIC Educational Resources Information Center
Crawford, Lindy
2014-01-01
This article discusses the role of assessment in a response-to-intervention model. Although assessment represents only 1 component in a response-to-intervention model, a well-articulated assessment system is critical in providing teachers with reliable data that are easily interpreted and used to make instructional decisions. Three components of…
NASA Astrophysics Data System (ADS)
McPhee, J.; William, Y. W.
2005-12-01
This work presents a methodology for pumping test design based on the reliability requirements of a groundwater model. Reliability requirements take into consideration the application of the model results in groundwater management, expressed in this case as a multiobjective management model. The pumping test design is formulated as a mixed-integer nonlinear programming (MINLP) problem and solved using a combination of genetic algorithm (GA) and gradient-based optimization. Bayesian decision theory provides a formal framework for assessing the influence of parameter uncertainty over the reliability of the proposed pumping test. The proposed methodology is useful for selecting a robust design that will outperform all other candidate designs under most potential 'true' states of the system
Modeling and experimental characterization of electromigration in interconnect trees
NASA Astrophysics Data System (ADS)
Thompson, C. V.; Hau-Riege, S. P.; Andleigh, V. K.
1999-11-01
Most modeling and experimental characterization of interconnect reliability is focussed on simple straight lines terminating at pads or vias. However, laid-out integrated circuits often have interconnects with junctions and wide-to-narrow transitions. In carrying out circuit-level reliability assessments it is important to be able to assess the reliability of these more complex shapes, generally referred to as 'trees.' An interconnect tree consists of continuously connected high-conductivity metal within one layer of metallization. Trees terminate at diffusion barriers at vias and contacts, and, in the general case, can have more than one terminating branch when they include junctions. We have extended the understanding of 'immortality,' demonstrated and analyzed for straight stud-to-stud lines, to trees of arbitrary complexity. This leads to a hierarchical approach for identifying immortal trees for specific circuit layouts and models for operation. To complete a circuit-level reliability analysis, it is also necessary to estimate the lifetimes of the mortal trees. We have developed simulation tools that allow modeling of stress evolution and failure in arbitrarily complex trees. We are testing our models and simulations through comparisons with experiments on simple trees, such as lines broken into two segments with different currents in each segment. Models, simulations and early experimental results on the reliability of interconnect trees are shown to be consistent.
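The 'immortality' screen mentioned above generalizes the Blech short-length effect, under which a segment cannot fail by electromigration if its current-density-length product stays below a critical value. The sketch below applies that elementary per-branch check to a hypothetical tree as an orientation aid; it is not the hierarchical tree criterion developed in the paper, and the critical jL value is illustrative.

```python
# Hypothetical branches of an interconnect tree: (current density [A/cm^2], length [cm]).
branches = {
    "trunk":   (2.0e5, 50e-4),    # 50 um long
    "branch1": (1.0e5, 20e-4),
    "branch2": (3.0e5, 100e-4),
}

JL_CRITICAL = 3.0e3   # illustrative critical (j * L) product for this metallization [A/cm]

for name, (j, length) in branches.items():
    jl = j * length
    verdict = "immortal (below critical jL)" if jl < JL_CRITICAL else "mortal (needs a lifetime model)"
    print(f"{name}: jL = {jl:.0f} A/cm -> {verdict}")
```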
NASA Technical Reports Server (NTRS)
Wallace, Dolores R.
2003-01-01
In FY01 we learned that hardware reliability models need substantial changes to account for differences in software, thus making software reliability measurements more effective, accurate, and easier to apply. These reliability models are generally based on familiar distributions or parametric methods. An obvious question is 'What new statistical and probability models can be developed using non-parametric and distribution-free methods instead of the traditional parametric methods?' Two approaches to software reliability engineering appear somewhat promising. The first study, begun in FY01, is based in hardware reliability, a very well established science that has many aspects that can be applied to software. This research effort has investigated mathematical aspects of hardware reliability and has identified those applicable to software. Currently the research effort is applying and testing these approaches to software reliability measurement. These parametric models require much project data that may be difficult to apply and interpret. Projects at GSFC are often complex in both technology and schedules. Assessing and estimating reliability of the final system is extremely difficult when various subsystems are tested and completed long before others. Parametric and distribution-free techniques may offer a new and accurate way of modeling failure time and other project data to provide earlier and more accurate estimates of system reliability.
Reliability Assessment for Low-cost Unmanned Aerial Vehicles
NASA Astrophysics Data System (ADS)
Freeman, Paul Michael
Existing low-cost unmanned aerospace systems are unreliable, and engineers must blend reliability analysis with fault-tolerant control in novel ways. This dissertation introduces the University of Minnesota unmanned aerial vehicle flight research platform, a comprehensive simulation and flight test facility for reliability and fault-tolerance research. An industry-standard reliability assessment technique, the failure modes and effects analysis, is performed for an unmanned aircraft. Particular attention is afforded to the control surface and servo-actuation subsystem. Maintaining effector health is essential for safe flight; failures may lead to loss of control incidents. Failure likelihood, severity, and risk are qualitatively assessed for several effector failure modes. Design changes are recommended to improve aircraft reliability based on this analysis. Most notably, the control surfaces are split, providing independent actuation and dual-redundancy. The simulation models for control surface aerodynamic effects are updated to reflect the split surfaces using a first-principles geometric analysis. The failure modes and effects analysis is extended by using a high-fidelity nonlinear aircraft simulation. A trim state discovery is performed to identify the achievable steady, wings-level flight envelope of the healthy and damaged vehicle. Tolerance of elevator actuator failures is studied using familiar tools from linear systems analysis. This analysis reveals significant inherent performance limitations for candidate adaptive/reconfigurable control algorithms used for the vehicle. Moreover, it demonstrates how these tools can be applied in a design feedback loop to make safety-critical unmanned systems more reliable. Control surface impairments that do occur must be quickly and accurately detected. This dissertation also considers fault detection and identification for an unmanned aerial vehicle using model-based and model-free approaches and applies those algorithms to experimental faulted and unfaulted flight test data. Flight tests are conducted with actuator faults that affect the plant input and sensor faults that affect the vehicle state measurements. A model-based detection strategy is designed and uses robust linear filtering methods to reject exogenous disturbances, e.g. wind, while providing robustness to model variation. A data-driven algorithm is developed to operate exclusively on raw flight test data without physical model knowledge. The fault detection and identification performance of these complementary but different methods is compared. Together, enhanced reliability assessment and multi-pronged fault detection and identification techniques can help to bring about the next generation of reliable low-cost unmanned aircraft.
A Dialogue about MCQs, Reliability, and Item Response Modelling
ERIC Educational Resources Information Center
Wright, Daniel B.; Skagerberg, Elin M.
2006-01-01
Multiple choice questions (MCQs) are becoming more common in UK psychology departments and the need to assess their reliability is apparent. Having examined the reliability of MCQs in our department we faced many questions from colleagues about why we were examining reliability, what it was that we were doing, and what should be reported when…
Structural reliability assessment of the Oman India Pipeline
DOE Office of Scientific and Technical Information (OSTI.GOV)
Al-Sharif, A.M.; Preston, R.
1996-12-31
Reliability techniques are increasingly finding application in design. The special design conditions for the deep water sections of the Oman India Pipeline dictate their use since the experience basis for application of standard deterministic techniques is inadequate. The paper discusses the reliability analysis as applied to the Oman India Pipeline, including selection of a collapse model, characterization of the variability in the parameters that affect pipe resistance to collapse, and implementation of first and second order reliability analyses to assess the probability of pipe failure. The reliability analysis results are used as the basis for establishing the pipe wall thickness requirements for the pipeline.
Arunasalam, Mark; Paulson, Albert; Wallace, William
2003-01-01
Preferred provider organizations (PPOs) provide healthcare services to an expanding proportion of the U.S. population. This paper presents a programmatic assessment of service quality in the workers' compensation environment using two different models: the PPO program model and the fee-for-service (FFS) payor model. The methodology used here will augment currently available research in workers' compensation, which has been lacking in measuring service quality determinants and assessing programmatic success/failure of managed care type programs. Results indicated that the SERVQUAL tool provided a reliable and valid clinical quality assessment tool that ascertained that PPO marketers should focus on promoting physician outreach (to show empathy) and accessibility (to show reliability) for injured workers.
A simulation model for risk assessment of turbine wheels
NASA Technical Reports Server (NTRS)
Safie, Fayssal M.; Hage, Richard T.
1991-01-01
A simulation model has been successfully developed to evaluate the risk of the Space Shuttle auxiliary power unit (APU) turbine wheels for a specific inspection policy. Besides being an effective tool for risk/reliability evaluation, the simulation model also allows the analyst to study the trade-offs between wheel reliability, wheel life, inspection interval, and rejection crack size. For example, in the APU application, sensitivity analysis results showed that the wheel life limit has the least effect on wheel reliability when compared to the effect of the inspection interval and the rejection crack size. In summary, the simulation model developed represents a flexible tool to predict turbine wheel reliability and study the risk under different inspection policies.
A simulation model for risk assessment of turbine wheels
NASA Astrophysics Data System (ADS)
Safie, Fayssal M.; Hage, Richard T.
A simulation model has been successfully developed to evaluate the risk of the Space Shuttle auxiliary power unit (APU) turbine wheels for a specific inspection policy. Besides being an effective tool for risk/reliability evaluation, the simulation model also allows the analyst to study the trade-offs between wheel reliability, wheel life, inspection interval, and rejection crack size. For example, in the APU application, sensitivity analysis results showed that the wheel life limit has the least effect on wheel reliability when compared to the effect of the inspection interval and the rejection crack size. In summary, the simulation model developed represents a flexible tool to predict turbine wheel reliability and study the risk under different inspection policies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bucknor, Matthew; Grabaskas, David; Brunett, Acacia
2015-04-26
Advanced small modular reactor designs include many advantageous design features such as passively driven safety systems that are arguably more reliable and cost effective relative to conventional active systems. Despite their attractiveness, a reliability assessment of passive systems can be difficult using conventional reliability methods due to the nature of passive systems. Simple deviations in boundary conditions can induce functional failures in a passive system, and intermediate or unexpected operating modes can also occur. As part of an ongoing project, Argonne National Laboratory is investigating various methodologies to address passive system reliability. The Reliability Method for Passive Systems (RMPS), a systematic approach for examining reliability, is one technique chosen for this analysis. This methodology is combined with the Risk-Informed Safety Margin Characterization (RISMC) approach to assess the reliability of a passive system and the impact of its associated uncertainties. For this demonstration problem, an integrated plant model of an advanced small modular pool-type sodium fast reactor with a passive reactor cavity cooling system is subjected to a station blackout using RELAP5-3D. This paper discusses important aspects of the reliability assessment, including deployment of the methodology, the uncertainty identification and quantification process, and identification of key risk metrics.
Sazonovas, A; Japertas, P; Didziapetris, R
2010-01-01
This study presents a new type of acute toxicity (LD(50)) prediction that enables automated assessment of the reliability of predictions (which is synonymous with the assessment of the Model Applicability Domain as defined by the Organization for Economic Cooperation and Development). Analysis involved nearly 75,000 compounds from six animal systems (acute rat toxicity after oral and intraperitoneal administration; acute mouse toxicity after oral, intraperitoneal, intravenous, and subcutaneous administration). Fragmental Partial Least Squares (PLS) with 100 bootstraps yielded baseline predictions that were automatically corrected for non-linear effects in local chemical spaces--a combination called Global, Adjusted Locally According to Similarity (GALAS) modelling methodology. Each prediction obtained in this manner is provided with a reliability index value that depends on both compound's similarity to the training set (that accounts for similar trends in LD(50) variations within multiple bootstraps) and consistency of experimental results with regard to the baseline model in the local chemical environment. The actual performance of the Reliability Index (RI) was proven by its good (and uniform) correlations with Root Mean Square Error (RMSE) in all validation sets, thus providing quantitative assessment of the Model Applicability Domain. The obtained models can be used for compound screening in the early stages of drug development and prioritization for experimental in vitro testing or later in vivo animal acute toxicity studies.
Reliability of the ECHOWS Tool for Assessment of Patient Interviewing Skills.
Boissonnault, Jill S; Evans, Kerrie; Tuttle, Neil; Hetzel, Scott J; Boissonnault, William G
2016-04-01
History taking is an important component of patient/client management. Assessment of student history-taking competency can be achieved via a standardized tool. The ECHOWS tool has been shown to be valid, with modest intrarater reliability, in a previous study but did not demonstrate sufficient power to definitively prove its stability. The purposes of this study were: (1) to assess the reliability of the ECHOWS tool for student assessment of patient interviewing skills and (2) to determine whether the tool discerns between novice and experienced skill levels. A reliability and construct validity assessment was conducted. Three faculty members from the United States and Australia scored videotaped histories from standardized patients taken by students and experienced clinicians from each of these countries. The tapes were scored twice, 3 to 6 weeks apart. Reliability was assessed using intraclass correlation coefficients (ICCs), and repeated-measures analysis of variance models assessed the ability of the tool to discern between novice and experienced skill levels. The ECHOWS tool showed excellent intrarater reliability (ICC [3,1]=.74-.89) and good interrater reliability (ICC [2,1]=.55) as a whole. The summary of performance (S) section showed poor interrater reliability (ICC [2,1]=.27). There was no statistical difference in performance on the tool between novice and experienced clinicians. A possible ceiling effect may occur when standardized patients are not coached to provide complex and obtuse responses to interviewer questions. Variation in familiarity with the ECHOWS tool and in use of the online training may have influenced scoring of the S section. The ECHOWS tool demonstrates excellent intrarater reliability and moderate interrater reliability. Sufficient training with the tool prior to student assessment is recommended. The S section must evolve in order to provide a more discerning measure of interviewing skills. © 2016 American Physical Therapy Association.
ERIC Educational Resources Information Center
Goclowski, John C.; And Others
The Reliability, Maintainability, and Cost Model (RMCM) described in this report is an interactive mathematical model with a built-in sensitivity analysis capability. It is a major component of the Life Cycle Cost Impact Model (LCCIM), which was developed as part of the DAIS advanced development program to be used to assess the potential impacts…
Kumar, A; Bridgham, R; Potts, M; Gushurst, C; Hamp, M; Passal, D
2001-01-01
To determine consistency of assessment in a new paper case-based structured oral examination in a multi-community pediatrics clerkship, and to identify correctable problems in the administration of examination and assessment process. Nine paper case-based oral examinations were audio-taped. From audio-tapes five community coordinators scored examiner behaviors and graded student performance. Correlations among examiner behaviors scores were examined. Graphs identified grading patterns of evaluators. The effect of exam-giving on evaluators was assessed by t-test. Reliability of grades was calculated and the effect of reducing assessment problems was modeled. Exam-givers differed most in their "teaching-guiding" behavior, and this negatively correlated with student grades. Exam reliability was lowered mainly by evaluator differences in leniency and grading pattern; less important was absence of standardization in cases. While grade reliability was low in early use of the paper case-based oral examination, modeling of plausible effects of training and monitoring for greater uniformity in administration of the examination and assigning scores suggests that more adequate reliabilities can be attained.
Janssen, Ellen M; Marshall, Deborah A; Hauber, A Brett; Bridges, John F P
2017-12-01
The recent endorsement of discrete-choice experiments (DCEs) and other stated-preference methods by regulatory and health technology assessment (HTA) agencies has placed a greater focus on demonstrating the validity and reliability of preference results. Areas covered: We present a practical overview of tests of validity and reliability that have been applied in the health DCE literature and explore other study qualities of DCEs. From the published literature, we identify a variety of methods to assess the validity and reliability of DCEs. We conceptualize these methods to create a conceptual model with four domains: measurement validity, measurement reliability, choice validity, and choice reliability. Each domain consists of three categories that can be assessed using one to four procedures (for a total of 24 tests). We present how these tests have been applied in the literature and direct readers to applications of these tests in the health DCE literature. Based on a stakeholder engagement exercise, we consider the importance of study characteristics beyond traditional concepts of validity and reliability. Expert commentary: We discuss study design considerations to assess the validity and reliability of a DCE, consider limitations to the current application of tests, and discuss future work to consider the quality of DCEs in healthcare.
Development of confidence limits by pivotal functions for estimating software reliability
NASA Technical Reports Server (NTRS)
Dotson, Kelly J.
1987-01-01
The utility of pivotal functions is established for assessing software reliability. Based on the Moranda geometric de-eutrophication model of reliability growth, confidence limits for attained reliability and prediction limits for the time to the next failure are derived using a pivotal function approach. Asymptotic approximations to the confidence and prediction limits are considered and are shown to be inadequate in cases where only a few bugs are found in the software. Departures from the assumed exponentially distributed interfailure times in the model are also investigated. The effect of these departures is discussed relative to restricting the use of the Moranda model.
NASA Astrophysics Data System (ADS)
Li, Lin; Zeng, Li; Lin, Zi-Jing; Cazzell, Mary; Liu, Hanli
2015-05-01
Test-retest reliability of neuroimaging measurements is an important concern in the investigation of cognitive functions in the human brain. To date, intraclass correlation coefficients (ICCs), originally used in inter-rater reliability studies in behavioral sciences, have become commonly used metrics in reliability studies on neuroimaging and functional near-infrared spectroscopy (fNIRS). However, as there are six popular forms of ICC, how comprehensively ICCs are understood affects how appropriately one selects, uses, and interprets them in a reliability study. We first offer a brief review and tutorial on the statistical rationale of ICCs, including their underlying analysis of variance models and technical definitions, in the context of assessing intertest reliability. Second, we provide general guidelines on the selection and interpretation of ICCs. Third, we illustrate the proposed approach by using an actual research study to assess intertest reliability of fNIRS-based, volumetric diffuse optical tomography of brain activities stimulated by a risk decision-making protocol. Last, special issues that may arise in reliability assessment using ICCs are discussed and solutions are suggested.
Reliability evaluation of microgrid considering incentive-based demand response
NASA Astrophysics Data System (ADS)
Huang, Ting-Cheng; Zhang, Yong-Jun
2017-07-01
Incentive-based demand response (IBDR) can guide customers to adjust their electricity consumption behaviour and actively curtail load. Meanwhile, distributed generation (DG) and energy storage systems (ESS) can provide time for the implementation of IBDR. The paper focuses on the reliability evaluation of a microgrid considering IBDR. Firstly, the mechanism of IBDR and its impact on power supply reliability are analysed. Secondly, the IBDR dispatch model considering customers' comprehensive assessment and the customer response model are developed. Thirdly, a reliability evaluation method considering IBDR based on Monte Carlo simulation is proposed. Finally, the validity of the above models and method is studied through numerical tests on the modified RBTS Bus 6 test system. Simulation results demonstrate that IBDR can improve the reliability of the microgrid.
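The abstract does not give the authors' dispatch or response models; the following is only a minimal Monte Carlo sketch of how incentive-based load curtailment can enter microgrid reliability indices such as LOLE and EENS. All capacities, availabilities, and the DR_LIMIT parameter are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

HOURS = 8760
N_YEARS = 200                                  # Monte Carlo sample years (assumed)
GEN_CAP = np.array([2.0, 2.0, 1.5])            # MW per unit (hypothetical)
GEN_AVAIL = 0.97                               # per-hour availability of each unit
LOAD = 4.0 + 0.8 * np.sin(np.linspace(0, 2 * np.pi, HOURS))  # MW load profile
DR_LIMIT = 0.5                                 # MW customers will curtail for incentives

lole, eens = 0.0, 0.0
for _ in range(N_YEARS):
    up = rng.random((HOURS, GEN_CAP.size)) < GEN_AVAIL    # sampled unit states
    capacity = up @ GEN_CAP                                # available MW per hour
    shortfall = np.maximum(LOAD - capacity, 0.0)
    # Incentive-based demand response covers part of the shortfall
    residual = np.maximum(shortfall - DR_LIMIT, 0.0)
    lole += np.count_nonzero(residual > 0) / N_YEARS       # h/yr with unserved load
    eens += residual.sum() / N_YEARS                        # MWh/yr not supplied

print(f"LOLE ~ {lole:.1f} h/yr, EENS ~ {eens:.1f} MWh/yr")
```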
Lyon, Aaron R; Pullmann, Michael D; Dorsey, Shannon; Martin, Prerna; Grigore, Alexandra A; Becker, Emily M; Jensen-Doss, Amanda
2018-05-11
Measurement-based care (MBC) is an increasingly popular, evidence-based practice, but there are no tools with established psychometrics to evaluate clinician use of MBC practices in mental health service delivery. The current study evaluated the reliability, validity, and factor structure of scores generated from a brief, standardized tool to measure MBC practices, the Current Assessment Practice Evaluation-Revised (CAPER). Survey data from a national sample of 479 mental health clinicians were used to conduct exploratory and confirmatory factor analyses, as well as reliability and validity analyses (e.g., relationships between CAPER subscales and clinician MBC attitudes). Analyses revealed competing two- and three-factor models. Regardless of the model used, scores from CAPER subscales demonstrated good reliability and convergent and divergent validity with MBC attitudes in the expected directions. The CAPER appears to be a psychometrically sound tool for assessing clinician MBC practices. Future directions for development and application of the tool are discussed.
Reliability Evaluation of Machine Center Components Based on Cascading Failure Analysis
NASA Astrophysics Data System (ADS)
Zhang, Ying-Zhi; Liu, Jin-Tong; Shen, Gui-Xiang; Long, Zhe; Sun, Shu-Guang
2017-07-01
In order to rectify the problems that the component reliability model deviates and the evaluation result is underestimated because failure propagation is overlooked in traditional reliability evaluation of machine center components, a new reliability evaluation method based on cascading failure analysis and failure-influenced-degree assessment is proposed. A directed graph model of cascading failure among components is established according to cascading failure mechanism analysis and graph theory. The failure-influenced degrees of the system components are assessed using the adjacency matrix and its transpose, combined with the PageRank algorithm. Based on the comprehensive failure probability function and the total probability formula, the inherent failure probability function is determined to realize the reliability evaluation of the system components. Finally, the method is applied to a machine center, and the results show the following: 1) the reliability evaluation values of the proposed method are at least 2.5% higher than those of the traditional method; 2) the difference between the comprehensive and inherent reliability of a system component is positively correlated with its failure-influenced degree, which provides a theoretical basis for reliability allocation in the machine center system.
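A minimal sketch of the PageRank step described above, assuming a hypothetical five-component cascading-failure adjacency matrix; the paper's exact combination of the adjacency matrix and its transpose, and its failure probability functions, are not reproduced here.

```python
import numpy as np

def pagerank(adjacency, damping=0.85, tol=1e-9, max_iter=200):
    """Power-iteration PageRank on a directed cascading-failure graph.
    adjacency[i, j] = 1 if a failure of component i can propagate to j."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_deg = A.sum(axis=1, keepdims=True)
    # Rows with no outgoing edges distribute their weight uniformly
    P = np.where(out_deg > 0, A / np.where(out_deg == 0, 1, out_deg), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = (1 - damping) / n + damping * P.T @ rank
        if np.abs(new - rank).sum() < tol:
            break
        rank = new
    return rank

# Hypothetical 5-component machine center propagation graph
A = np.array([[0, 1, 1, 0, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0],
              [1, 0, 0, 0, 0]])
print(pagerank(A))       # higher rank ~ larger failure-influenced degree
```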
Coefficient Alpha: A Reliability Coefficient for the 21st Century?
ERIC Educational Resources Information Center
Yang, Yanyun; Green, Samuel B.
2011-01-01
Coefficient alpha is almost universally applied to assess reliability of scales in psychology. We argue that researchers should consider alternatives to coefficient alpha. Our preference is for structural equation modeling (SEM) estimates of reliability because they are informative and allow for an empirical evaluation of the assumptions…
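For reference, coefficient alpha itself is straightforward to compute from an item-score matrix; the sketch below shows the conventional formula the authors critique (it does not implement their preferred SEM-based estimates). The data are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha from an n_respondents x k_items score matrix."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item scale answered by 6 respondents
scores = np.array([[4, 5, 4, 4, 5],
                   [2, 3, 2, 3, 2],
                   [5, 5, 4, 5, 5],
                   [3, 3, 3, 2, 3],
                   [4, 4, 5, 4, 4],
                   [1, 2, 2, 1, 2]])
print(round(cronbach_alpha(scores), 3))
```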
Item Response Theory for Peer Assessment
ERIC Educational Resources Information Center
Uto, Masaki; Ueno, Maomi
2016-01-01
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…
Otis, Colombe; Gervais, Julie; Guillot, Martin; Gervais, Julie-Anne; Gauvin, Dominique; Péthel, Catherine; Authier, Simon; Dansereau, Marc-André; Sarret, Philippe; Martel-Pelletier, Johanne; Pelletier, Jean-Pierre; Beaudry, Francis; Troncy, Eric
2016-06-23
Lack of validity in osteoarthritis pain models and assessment methods is suspected. Our goal was to 1) assess the repeatability and reproducibility of measurement and the influence of environment and acclimatization on different pain assessment outcomes in normal rats, and 2) test the concurrent validity of the most reliable methods in relation to the expression of different spinal neuropeptides in a chemical model of osteoarthritic pain. Repeatability and inter-rater reliability of reflexive nociceptive mechanical thresholds, spontaneous static weight-bearing, treadmill, rotarod, and operant place escape/avoidance paradigm (PEAP) were assessed by the intraclass correlation coefficient (ICC). The most reliable acclimatization protocol was determined by comparing coefficients of variation. In a pilot comparative study, the sensitivity and responsiveness to treatment of the most reliable methods were tested in the monosodium iodoacetate (MIA) model over 21 days. Two MIA (2 mg) groups (including one lidocaine treatment group) and one sham group (0.9 % saline) received an intra-articular (50 μL) injection. No effect of environment (observer, inverted circadian cycle, or exercise) was observed; all tested methods except mechanical sensitivity (ICC <0.3) offered good repeatability (ICC ≥0.7). The most reliable acclimatization protocol included five assessments over two weeks. MIA-related osteoarthritic change in pain was demonstrated with static weight-bearing, punctate tactile allodynia evaluation, treadmill exercise and operant PEAP, the latter being the most responsive to analgesic intra-articular lidocaine. Substance P and calcitonin gene-related peptide were higher in MIA groups compared to naive (adjusted P (adj-P) = 0.016) or sham-treated (adj-P = 0.029) rats. Repeated post-MIA lidocaine injection resulted in 34 times lower downregulation for spinal substance P compared to MIA alone (adj-P = 0.029), with a concomitant increase of 17 % in time spent on the PEAP dark side (indicative of increased comfort). This study of normal rats and rats with pain established the most reliable and sensitive pain assessment methods and an optimized acclimatization protocol. Operant PEAP testing was more responsive to lidocaine analgesia than other tests used, while neuropeptide spinal concentration is an objective quantification method attractive to support and validate different centralized pain functional assessment methods.
Developing an oropharyngeal cancer (OPC) knowledge and behaviors survey.
Dodd, Virginia J; Riley Iii, Joseph L; Logan, Henrietta L
2012-09-01
To use the community participation research model to (1) develop a survey assessing knowledge about mouth and throat cancer and (2) field test and establish test-retest reliability of the newly developed instrument. Cognitive interviews were conducted with primarily rural African American adults to assess their perception and interpretation of survey items. Test-retest reliability was established with a racially diverse rural population. Test-retest reliabilities ranged from .79 to .40 for screening awareness and .74 to .19 for knowledge. Coefficients increased for composite scores. Community participation methodology provided a culturally appropriate survey instrument that demonstrated acceptable levels of reliability.
Arheart, Kristopher L; Sly, David F; Trapido, Edward J; Rodriguez, Richard D; Ellestad, Amy J
2004-11-01
To identify multi-item attitude/belief scales associated with the theoretical foundations of an anti-tobacco counter-marketing campaign and assess their reliability and validity. The data analyzed are from two state-wide, random, cross-sectional telephone surveys [n(S1)=1,079, n(S2)=1,150]. Items forming attitude/belief scales are identified using factor analysis. Reliability is assessed with Cronbach's alpha. Relationships among scales are explored using Pearson correlation. Validity is assessed by testing associations derived from the Centers for Disease Control and Prevention's (CDC) logic model for tobacco control program development and evaluation linking media exposure to attitudes/beliefs, and attitudes/beliefs to smoking-related behaviors. Adjusted odds ratios are employed for these analyses. Three factors emerged: traditional attitudes/beliefs about tobacco and tobacco use, tobacco industry manipulation and anti-tobacco empowerment. Reliability coefficients are in the range of 0.70 and vary little between age groups. The factors are correlated with one another as hypothesized. Associations between media exposure and the attitude/belief scales and between these scales and behaviors are consistent with the CDC logic model. Using reliable, valid multi-item scales is theoretically and methodologically more sound than employing single-item measures of attitudes/beliefs. Methodological, theoretical and practical implications are discussed.
Integrating Machine Learning into a Crowdsourced Model for Earthquake-Induced Damage Assessment
NASA Technical Reports Server (NTRS)
Rebbapragada, Umaa; Oommen, Thomas
2011-01-01
On January 12th, 2010, a catastrophic 7.0M earthquake devastated the country of Haiti. In the aftermath of an earthquake, it is important to rapidly assess damaged areas in order to mobilize the appropriate resources. The Haiti damage assessment effort introduced a promising model that uses crowdsourcing to map damaged areas in freely available remotely-sensed data. This paper proposes the application of machine learning methods to improve this model. Specifically, we apply work on learning from multiple, imperfect experts to the assessment of volunteer reliability, and propose the use of image segmentation to automate the detection of damaged areas. We wrap both tasks in an active learning framework in order to shift volunteer effort from mapping a full catalog of images to the generation of high-quality training data. We hypothesize that the integration of machine learning into this model improves its reliability, maintains the speed of damage assessment, and allows the model to scale to higher data volumes.
Rossini, Gabriele; Parrini, Simone; Castroflorio, Tommaso; Deregibus, Andrea; Debernardi, Cesare L
2016-02-01
Our objective was to assess the accuracy, validity, and reliability of measurements obtained from virtual dental study models compared with those obtained from plaster models. PubMed, PubMed Central, National Library of Medicine Medline, Embase, Cochrane Central Register of Controlled Clinical trials, Web of Knowledge, Scopus, Google Scholar, and LILACs were searched from January 2000 to November 2014. A grading system described by the Swedish Council on Technology Assessment in Health Care and the Cochrane tool for risk of bias assessment were used to rate the methodologic quality of the articles. Thirty-five relevant articles were selected. The methodologic quality was high. No significant differences were observed for most of the studies in all the measured parameters, with the exception of the American Board of Orthodontics Objective Grading System. Digital models are as reliable as traditional plaster models, with high accuracy, reliability, and reproducibility. Landmark identification, rather than the measuring device or the software, appears to be the greatest limitation. Furthermore, with their advantages in terms of cost, time, and space required, digital models could be considered the new gold standard in current practice. Copyright © 2016 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Yusmaita, E.; Nasra, Edi
2018-04-01
This research aims to produce an instrument for measuring chemical literacy in basic chemistry courses on the topic of solubility. The construction of this measuring instrument is adapted to the characteristics of PISA (Programme for International Student Assessment) items and the syllabus of basic chemistry in the KKNI (Indonesian National Qualification Framework). PISA is a cross-country study conducted periodically to monitor the outcomes of learners' achievement in each participating country. So far, studies conducted by PISA include reading literacy, mathematical literacy, and scientific literacy. Based on the scientific competencies of the PISA study on science literacy, an assessment was designed to measure the chemical literacy of chemistry department students at UNP. The research model used is MER (Model of Educational Reconstruction). The validity and reliability of the discourse questions were measured using the ANATES software. Based on these values, a valid and reliable set of chemical literacy questions was obtained. There are seven limited-response question items on the topic of solubility in the valid category; the test reliability is 0.86, with good difficulty and discrimination indices.
ERIC Educational Resources Information Center
Young, Meredith E.; Cruess, Sylvia R.; Cruess, Richard L.; Steinert, Yvonne
2014-01-01
Physicians function as clinicians, teachers, and role models within the clinical environment. Negative learning environments have been shown to be due to many factors, including the presence of unprofessional behaviors among clinical teachers. Reliable and valid assessments of clinical teacher performance, including professional behaviors, may…
NASA Astrophysics Data System (ADS)
Bogachkov, I. V.; Lutchenko, S. S.
2018-05-01
The article deals with a method for assessing the reliability of fiber optic communication lines (FOCL), taking into account the effect of optical fiber tension, the influence of temperature, and errors of the first kind (type I errors) of the built-in diagnostic equipment. The reliability is assessed in terms of the availability factor using the theory of Markov chains and probabilistic mathematical modeling. To obtain a mathematical model, the following steps are performed: the FOCL states are defined and validated; the state graph and system transitions are described; the system transitions of states that occur at a certain point are specified; and the real and the observed time of system presence in the considered states are identified. According to the permissible value of the availability factor, it is possible to determine the limiting frequency of FOCL maintenance.
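A minimal sketch of the availability-factor calculation via a continuous-time Markov chain, assuming a hypothetical three-state FOCL model and invented transition rates rather than the authors' actual state graph.

```python
import numpy as np

# Hypothetical states: 0 = operating, 1 = degraded (excess fiber tension or
# a false alarm from the built-in diagnostics), 2 = failed (fiber break).
# Q[i, j] is the transition rate i -> j in 1/h; each row sums to zero.
Q = np.array([[-0.0012,  0.0010,  0.0002],
              [ 0.0500, -0.0510,  0.0010],
              [ 0.1000,  0.0000, -0.1000]])

# Steady-state distribution pi solves pi @ Q = 0 with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(Q.shape[0])])
b = np.append(np.zeros(Q.shape[0]), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

availability = pi[0]            # fraction of time in the fully operating state
print(f"availability factor ~ {availability:.5f}")
```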
Valid and Reliable Science Content Assessments for Science Teachers
NASA Astrophysics Data System (ADS)
Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn
2013-03-01
Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper describes multiple sources of validity and reliability (Cronbach's alpha greater than 0.8) evidence for physical, life, and earth/space science assessments—part of the Diagnostic Teacher Assessments of Mathematics and Science (DTAMS) project. Validity was strengthened by systematic synthesis of relevant documents, extensive use of external reviewers, and field tests with 900 teachers during assessment development process. Subsequent results from 4,400 teachers, analyzed with Rasch IRT modeling techniques, offer construct and concurrent validity evidence.
Reliability modeling of fault-tolerant computer based systems
NASA Technical Reports Server (NTRS)
Bavuso, Salvatore J.
1987-01-01
Digital fault-tolerant computer-based systems have become commonplace in military and commercial avionics. These systems hold the promise of increased availability, reliability, and maintainability over conventional analog-based systems through the application of replicated digital computers arranged in fault-tolerant configurations. Three tightly coupled factors of paramount importance, ultimately determining the viability of these systems, are reliability, safety, and profitability. Reliability, the major driver, affects virtually every aspect of design, packaging, and field operations, and eventually produces profit for commercial applications or increased national security. However, the utilization of digital computer systems makes the task of producing a credible reliability assessment a formidable one for the reliability engineer. The root of the problem lies in the digital computer's unique adaptability to changing requirements, computational power, and ability to test itself efficiently. Addressed here are the nuances of modeling the reliability of systems with large state sizes, in the Markov sense, which result from replicated redundant hardware, and the modeling of factors which can reduce reliability without a concomitant depletion of hardware. Advanced fault-handling models are described and methods of acquiring and measuring parameters for these models are delineated.
Culture Representation in Human Reliability Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
David Gertman; Julie Marble; Steven Novack
Understanding human-system response is critical to being able to plan and predict mission success in the modern battlespace. Commonly, human reliability analysis has been used to predict failures of human performance in complex, critical systems. However, most human reliability methods fail to take culture into account. This paper takes an easily understood state of the art human reliability analysis method and extends that method to account for the influence of culture, including acceptance of new technology, upon performance. The cultural parameters used to modify the human reliability analysis were determined from two standard industry approaches to cultural assessment: Hofstede's (1991) cultural factors and Davis' (1989) technology acceptance model (TAM). The result is called the Culture Adjustment Method (CAM). An example is presented that (1) reviews human reliability assessment with and without cultural attributes for a Supervisory Control and Data Acquisition (SCADA) system attack, (2) demonstrates how country specific information can be used to increase the realism of HRA modeling, and (3) discusses the differences in human error probability estimates arising from cultural differences.
Integrated Model to Assess Cloud Deployment Effectiveness When Developing an IT-strategy
NASA Astrophysics Data System (ADS)
Razumnikov, S.; Prankevich, D.
2016-04-01
Developing an IT-strategy of cloud deployment is a complex issue, since even the stage of its formation necessitates revealing which applications will best meet the requirements of a company's business strategy, evaluating the reliability and safety of cloud providers, and analyzing staff satisfaction. A system of criteria, as well as an integrated model to assess cloud deployment effectiveness, is offered. The model makes it possible to identify which applications already at the disposal of a company, as well as new tools to be deployed, are reliable and safe enough for implementation in the cloud environment. Data on the practical use of the procedure to assess cloud deployment effectiveness by a provider of telecommunication services are presented. The model was used to calculate values of integral indexes of the services to be assessed; then those meeting the criteria and consistent with the company's business strategy were selected.
NASA Astrophysics Data System (ADS)
Wallace, Jon Michael
2003-10-01
Reliability prediction of components operating in complex systems has historically been conducted in a statistically isolated manner. Current physics-based, i.e. mechanistic, component reliability approaches focus more on component-specific attributes and mathematical algorithms and not enough on the influence of the system. The result is that significant error can be introduced into the component reliability assessment process. The objective of this study is the development of a framework that infuses the needs and influence of the system into the process of conducting mechanistic-based component reliability assessments. The formulated framework consists of six primary steps. The first three steps, identification, decomposition, and synthesis, are primarily qualitative in nature and employ system reliability and safety engineering principles to construct an appropriate starting point for the component reliability assessment. The following two steps are the most unique. They involve a step to efficiently characterize and quantify the system-driven local parameter space and a subsequent step using this information to guide the reduction of the component parameter space. The local statistical space quantification step is accomplished using two proposed multivariate probability models: Multi-Response First Order Second Moment and Taylor-Based Inverse Transformation. Where existing joint probability models require preliminary distribution and correlation information of the responses, these models combine statistical information of the input parameters with an efficient sampling of the response analyses to produce the multi-response joint probability distribution. Parameter space reduction is accomplished using Approximate Canonical Correlation Analysis (ACCA) employed as a multi-response screening technique. The novelty of this approach is that each individual local parameter and even subsets of parameters representing entire contributing analyses can now be rank ordered with respect to their contribution to not just one response, but the entire vector of component responses simultaneously. The final step of the framework is the actual probabilistic assessment of the component. Although the same multivariate probability tools employed in the characterization step can be used for the component probability assessment, variations of this final step are given to allow for the utilization of existing probabilistic methods such as response surface Monte Carlo and Fast Probability Integration. The overall framework developed in this study is implemented to assess the finite-element based reliability prediction of a gas turbine airfoil involving several failure responses. Results of this implementation are compared to results generated using the conventional 'isolated' approach as well as a validation approach conducted through large sample Monte Carlo simulations. The framework resulted in a considerable improvement to the accuracy of the part reliability assessment and an improved understanding of the component failure behavior. Considerable statistical complexity in the form of joint non-normal behavior was found and accounted for using the framework. Future applications of the framework elements are discussed.
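A generic first-order second-moment sketch in the spirit of the multi-response propagation described above: input means and covariance are pushed through a finite-difference Jacobian to approximate the joint response moments. The response functions and statistics are placeholders, not the dissertation's models.

```python
import numpy as np

def responses(x):
    """Placeholder component analyses: two responses of three inputs."""
    load, temp, thickness = x
    stress = load / thickness + 0.02 * temp
    deflection = 0.5 * load / thickness**3
    return np.array([stress, deflection])

def fosm(func, mean, cov, h=1e-4):
    """First-order second-moment approximation of response mean/covariance."""
    mean = np.asarray(mean, dtype=float)
    y0 = func(mean)
    J = np.zeros((y0.size, mean.size))
    for j in range(mean.size):                  # forward-difference Jacobian
        dx = np.zeros_like(mean)
        dx[j] = h * max(abs(mean[j]), 1.0)
        J[:, j] = (func(mean + dx) - y0) / dx[j]
    return y0, J @ np.asarray(cov) @ J.T

mu = [100.0, 60.0, 2.0]                         # hypothetical input means
cov = np.diag([25.0, 16.0, 0.01])               # hypothetical input covariance
y_mean, y_cov = fosm(responses, mu, cov)
print(y_mean, np.sqrt(np.diag(y_cov)))
```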
Reliability-based trajectory optimization using nonintrusive polynomial chaos for Mars entry mission
NASA Astrophysics Data System (ADS)
Huang, Yuechen; Li, Haiyang
2018-06-01
This paper presents the reliability-based sequential optimization (RBSO) method to settle the trajectory optimization problem with parametric uncertainties in entry dynamics for Mars entry mission. First, the deterministic entry trajectory optimization model is reviewed, and then the reliability-based optimization model is formulated. In addition, the modified sequential optimization method, in which the nonintrusive polynomial chaos expansion (PCE) method and the most probable point (MPP) searching method are employed, is proposed to solve the reliability-based optimization problem efficiently. The nonintrusive PCE method contributes to the transformation between the stochastic optimization (SO) and the deterministic optimization (DO) and to the approximation of trajectory solution efficiently. The MPP method, which is used for assessing the reliability of constraints satisfaction only up to the necessary level, is employed to further improve the computational efficiency. The cycle including SO, reliability assessment and constraints update is repeated in the RBSO until the reliability requirements of constraints satisfaction are satisfied. Finally, the RBSO is compared with the traditional DO and the traditional sequential optimization based on Monte Carlo (MC) simulation in a specific Mars entry mission to demonstrate the effectiveness and the efficiency of the proposed method.
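A minimal regression-based (nonintrusive) polynomial chaos sketch for a single standard-normal uncertain parameter, using probabilists' Hermite polynomials; the entry-dynamics model is replaced by a toy function, and the MPP search and sequential optimization loop are not shown.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from math import factorial

def toy_response(xi):
    """Stand-in for the entry-dynamics quantity of interest."""
    return np.exp(0.3 * xi) + 0.1 * xi**2

ORDER = 4
rng = np.random.default_rng(1)
xi = rng.standard_normal(200)                   # samples of the standard-normal germ
y = toy_response(xi)

# Design matrix of probabilists' Hermite polynomials He_0..He_ORDER
Phi = np.column_stack([hermeval(xi, [0.0] * k + [1.0]) for k in range(ORDER + 1)])
coeff, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # nonintrusive (regression) PCE

mean = coeff[0]
var = sum(coeff[k] ** 2 * factorial(k) for k in range(1, ORDER + 1))
print(f"PCE mean ~ {mean:.4f}, PCE std ~ {np.sqrt(var):.4f}")
```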
NASA Astrophysics Data System (ADS)
Asfaroh, Jati Aurum; Rosana, Dadan; Supahar
2017-08-01
This research aims to develop a valid and reliable CIPP evaluation instrument model and to determine its feasibility and practicality. The CIPP evaluation instrument model is used to evaluate the implementation of project assessment on the topic of optics, in order to measure the problem-solving skills of junior high school grade VIII students in the Yogyakarta region. This research follows a 4-D development model. The subjects of the product trials were students in grade VIII of SMP N 1 Galur and SMP N 1 Sleman. Data were collected using non-test techniques, including interviews, questionnaires, and observations. Validity was analyzed using Aiken's V, and reliability was analyzed using the ICC. Seven raters were involved: two expert lecturers (expert judgment), two practitioners (science teachers), and three colleagues. The result of this research is a CIPP evaluation instrument model that is used to evaluate the implementation of the project assessment instruments. The instrument's Aiken's V values range from 0.86 to 1, indicating validity, and its reliability value of 0.836 falls into the good category, so it is suitable for use as an evaluation instrument.
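Aiken's V for a single item is a short calculation; the sketch below uses the usual formula V = Σs / (n(c − 1)), with s the rating minus the lowest category, and invented ratings from seven raters mirroring the rater mix described above.

```python
def aiken_v(ratings, lo=1, hi=5):
    """Aiken's V for one item: ratings on a lo..hi scale from n raters."""
    n = len(ratings)
    c = hi - lo + 1                       # number of rating categories
    s = sum(r - lo for r in ratings)
    return s / (n * (c - 1))

# Hypothetical scores from 7 raters (2 lecturers, 2 teachers, 3 colleagues)
print(aiken_v([5, 4, 4, 5, 4, 5, 4]))     # -> ~0.86, within the reported 0.86-1 range
```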
Decision-theoretic methodology for reliability and risk allocation in nuclear power plants
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cho, N.Z.; Papazoglou, I.A.; Bari, R.A.
1985-01-01
This paper describes a methodology for allocating reliability and risk to various reactor systems, subsystems, components, operations, and structures in a consistent manner, based on a set of global safety criteria which are not rigid. The problem is formulated as a multiattribute decision analysis paradigm; the multiobjective optimization, which is performed on a PRA model and reliability cost functions, serves as the guiding principle for reliability and risk allocation. The concept of noninferiority is used in the multiobjective optimization problem. Finding the noninferior solution set is the main theme of the current approach. The assessment of the decision maker's preferences could then be performed more easily on the noninferior solution set. Some results of the methodology applications to a nontrivial risk model are provided and several outstanding issues such as generic allocation and preference assessment are discussed.
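A minimal sketch of extracting the noninferior (Pareto-optimal) set from candidate allocations with two minimized objectives; the cost and risk values are invented and the PRA model is not represented.

```python
import numpy as np

def noninferior(points):
    """Return indices of Pareto-noninferior points (all objectives minimized)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical allocations: (reliability-program cost, residual risk)
candidates = [(1.0, 9.0), (2.0, 6.0), (3.0, 6.5), (4.0, 3.0), (5.0, 2.9), (2.5, 8.0)]
print(noninferior(candidates))   # -> [0, 1, 3, 4]
```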
Niedermann, K; Forster, A; Hammond, A; Uebelhart, D; de Bie, R
2007-03-15
Joint protection (JP) is an important part of the treatment concept for patients with rheumatoid arthritis (RA). The Joint Protection Behavior Assessment short form (JPBA-S) assesses the use of hand JP methods by patients with RA while preparing a hot drink. The purpose of this study was to develop a German version of the JPBA-S (D-JPBA-S) and to test its validity and reliability. A manual was developed through consensus with 8 occupational therapist (OT) experts as the reference for assessing patients' JP behavior. Twenty-four patients with RA and 10 healthy individuals were videotaped while performing 10 tasks reflecting the activity of preparing instant coffee. Recordings were repeated after 3 months for test-retest analysis. One rater assessed all available patient recordings (n = 23, recorded twice) for test-retest reliability. The video recordings of 10 randomly selected patients and all healthy individuals were independently assessed for interrater reliability by 6 OTs who were explicitly asked to follow the manual. Rasch analysis was performed to test construct validity and transform ordinal raw data into interval data for reliability calculations. Nine of the 10 tasks fit the Rasch model. The D-JPBA-S, consisting of 9 valid tasks, had an intraclass correlation coefficient of 0.77 for interrater reliability and 0.71 for test-retest reliability. The D-JPBA-S provides a valid and reliable instrument for assessing JP behavior of patients with RA and can be used in German-speaking countries.
ERIC Educational Resources Information Center
Sun, Anji; Valiga, Michael J.
In this study, the reliability of the American College Testing (ACT) Program's "Survey of Academic Advising" (SAA) was examined using both univariate and multivariate generalizability theory approaches. The primary purpose of the study was to compare the results of three generalizability theory models (a random univariate model, a mixed…
Assessing the Reliability of Curriculum-Based Measurement: An Application of Latent Growth Modeling
ERIC Educational Resources Information Center
Yeo, Seungsoo; Kim, Dong-Il; Branum-Martin, Lee; Wayman, Miya Miura; Espin, Christine A.
2012-01-01
The purpose of this study was to demonstrate the use of Latent Growth Modeling (LGM) as a method for estimating reliability of Curriculum-Based Measurement (CBM) progress-monitoring data. The LGM approach permits the error associated with each measure to differ at each time point, thus providing an alternative method for examining of the…
Muehrer, Rebecca J; Lanuza, Dorothy M; Brown, Roger L; Djamali, Arjang
2015-01-01
This study describes the development and psychometric testing of the Sexual Concerns Questionnaire (SCQ) in kidney transplant (KTx) recipients. Construct validity was assessed using the Kroonenberg and Lewis exploratory/confirmatory procedure and testing hypothesized relationships with established questionnaires. Configural and weak invariance were examined across gender, dialysis history, relationship status, and transplant type. Reliability was assessed with Cronbach's alpha, composite reliability, and test-retest reliability. Factor analysis resulted in a 7-factor solution and suggests good model fit. Construct validity was also supported by the tests of hypothesized relationships. Configural and weak invariance were supported for all subgroups. Reliability of the SCQ was also supported. Findings indicate the SCQ is a valid and reliable measure of KTx recipients' sexual concerns.
Anderson-Cook, Christine M.; Morzinski, Jerome; Blecker, Kenneth D.
2015-08-19
Understanding the impact of production, environmental exposure and age characteristics on the reliability of a population is frequently based on underlying science and empirical assessment. When there is incomplete science to prescribe which inputs should be included in a model of reliability to predict future trends, statistical model/variable selection techniques can be leveraged on a stockpile or population of units to improve reliability predictions as well as suggest new mechanisms affecting reliability to explore. We describe a five-step process for exploring relationships between available summaries of age, usage and environmental exposure and reliability. The process involves first identifying potential candidate inputs, then second organizing data for the analysis. Third, a variety of models with different combinations of the inputs are estimated, and fourth, flexible metrics are used to compare them. As a result, plots of the predicted relationships are examined to distill leading model contenders into a prioritized list for subject matter experts to understand and compare. The complexity of the model, quality of prediction and cost of future data collection are all factors to be considered by the subject matter experts when selecting a final model.
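A minimal sketch of steps three and four as described above: linear models over all combinations of a few candidate inputs are estimated and ranked by AIC on synthetic data. The inputs, data, and metric are placeholders rather than the authors' stockpile analysis.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 120
# Hypothetical stockpile summaries: age, usage cycles, humidity exposure
X = rng.normal(size=(n, 3))
names = ["age", "usage", "humidity"]
y = 0.8 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(scale=0.5, size=n)  # reliability proxy

def aic_ols(Xs, y):
    """AIC of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), Xs])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    k = A.shape[1] + 1                        # coefficients + error variance
    return len(y) * np.log(rss / len(y)) + 2 * k

results = []
for r in range(1, len(names) + 1):
    for combo in itertools.combinations(range(len(names)), r):
        results.append((aic_ols(X[:, combo], y), [names[i] for i in combo]))

for aic, combo in sorted(results)[:3]:        # prioritized list for experts
    print(f"AIC={aic:7.1f}  inputs={combo}")
```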
A sensitive and reliable test instrument to assess swimming in rats with spinal cord injury.
Xu, Ning; Åkesson, Elisabet; Holmberg, Lena; Sundström, Erik
2015-09-15
For clinical translation of experimental spinal cord injury (SCI) research, evaluation of animal SCI models should include several sensorimotor functions. Validated and reliable assessment tools should be applicable to a wide range of injury severity. The BBB scale is the most widely used test instrument, but similar to most others it is used to assess open field ambulation. We have developed an assessment tool for swimming in rats with SCI, with high discriminative power and sensitivity to functional recovery after mild and severe injuries, without need for advanced test equipment. We studied various parameters of swimming in four groups of rats with thoracic SCI of different severity and a control group, for 8 weeks after surgery. Six parameters were combined in a multiple item scale, the Karolinska Institutet Swim Assessment Tool (KSAT). KSAT scores for all SCI groups showed consistent functional improvement after injury, and significant differences between the five experimental groups. The internal consistency, the inter-rater and the test-retest reliability were very high. The KSAT score was highly correlated to the cross-section area of white matter spared at the injury epicenter. Importantly, even after 8 weeks of recovery the KSAT score reliably discriminated normal animals from those inflicted by the mildest injury, and also displayed the recovery of the most severely injured rats. We conclude that this swim scale is an efficient and reliable tool to assess motor activity during swimming, and an important addition to the methods available for evaluating rat models of SCI. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Tamura, Yoshinobu; Yamada, Shigeru
OSS (open source software) systems, which serve as key components of critical infrastructures in our social lives, are still expanding. In particular, embedded OSS systems have been gaining a lot of attention in the embedded system area, e.g., Android, BusyBox, TRON, etc. However, poor handling of quality problems and customer support hinders the progress of embedded OSS. Also, it is difficult for developers to assess the reliability and portability of embedded OSS on a single-board computer. In this paper, we propose a method of software reliability assessment based on flexible hazard rates for embedded OSS. We also analyze actual data of software failure-occurrence time intervals to show numerical examples of software reliability assessment for embedded OSS. Moreover, we compare the proposed hazard rate model for embedded OSS with typical conventional hazard rate models using goodness-of-fit comparison criteria. Furthermore, we discuss the optimal software release problem for the porting phase based on the total expected software maintenance cost.
Reliability of Fault Tolerant Control Systems. Part 1
NASA Technical Reports Server (NTRS)
Wu, N. Eva
2001-01-01
This paper reports Part I of a two-part effort that is intended to delineate the relationship between reliability and fault tolerant control in a quantitative manner. Reliability analysis of fault-tolerant control systems is performed using Markov models. Reliability properties peculiar to fault-tolerant control systems are emphasized. As a consequence, coverage of failures through redundancy management can be severely limited. It is shown that in the early life of a system composed of highly reliable subsystems, the reliability of the overall system is affine with respect to coverage, and inadequate coverage induces dominant single point failures. The utility of some existing software tools for assessing the reliability of fault tolerant control systems is also discussed. Coverage modeling is attempted in Part II in a way that captures its dependence on the control performance and on the diagnostic resolution.
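A minimal sketch of the coverage effect, assuming a hypothetical two-unit standby Markov model with no repair: a covered failure (probability c) degrades the system to simplex operation, an uncovered one fails it outright, and mission reliability is obtained from the matrix exponential. The rates are invented and this is not the paper's model.

```python
import numpy as np
from scipy.linalg import expm

def duplex_reliability(lmbda, coverage, t):
    """Reliability of a two-unit fault-tolerant system with imperfect coverage.
    States: 0 = duplex, 1 = simplex after a covered failure, 2 = failed."""
    Q = np.array([
        [-2 * lmbda, 2 * lmbda * coverage, 2 * lmbda * (1 - coverage)],
        [0.0,        -lmbda,               lmbda],
        [0.0,         0.0,                 0.0]])
    p = np.array([1.0, 0.0, 0.0]) @ expm(Q * t)
    return p[0] + p[1]                      # probability of not being failed

lam, t = 1e-4, 10.0                         # hypothetical failure rate (1/h), mission (h)
for c in (0.90, 0.99, 0.999):
    print(f"coverage={c:.3f}  R(t)={duplex_reliability(lam, c, t):.8f}")
```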
JUPITER PROJECT - JOINT UNIVERSAL PARAMETER IDENTIFICATION AND EVALUATION OF RELIABILITY
The JUPITER (Joint Universal Parameter IdenTification and Evaluation of Reliability) project builds on the technology of two widely used codes for sensitivity analysis, data assessment, calibration, and uncertainty analysis of environmental models: PEST and UCODE.
Development and Exemplification of a Model for Teacher Assessment in Primary Science
ERIC Educational Resources Information Center
Davies, D. J.; Earle, S.; McMahon, K.; Howe, A.; Collier, C.
2017-01-01
The Teacher Assessment in Primary Science project is funded by the Primary Science Teaching Trust and based at Bath Spa University. The study aims to develop a whole-school model of valid, reliable and manageable teacher assessment to inform practice and make a positive impact on primary-aged children's learning in science. The model is based on a…
A CONSISTENT APPROACH FOR THE APPLICATION OF PHARMACOKINETIC MODELING IN CANCER RISK ASSESSMENT
Physiologically based pharmacokinetic (PBPK) modeling provides important capabilities for improving the reliability of the extrapolations across dose, species, and exposure route that are generally required in chemical risk assessment regardless of the toxic endpoint being consid...
NASA Technical Reports Server (NTRS)
Kleinhammer, Roger K.; Graber, Robert R.; DeMott, D. L.
2016-01-01
Reliability practitioners advocate getting reliability involved early in a product development process. However, when assigned to estimate or assess the (potential) reliability of a product or system early in the design and development phase, they are faced with a lack of reasonable models or methods for useful reliability estimation. Developing specific data is costly and time consuming. Instead, analysts rely on available data to assess reliability. Finding data relevant to the specific use and environment for any project is difficult, if not impossible. Instead, analysts attempt to develop the "best" or composite analog data to support the assessments. Industries, consortia and vendors across many areas have spent decades collecting, analyzing and tabulating fielded item and component reliability performance in terms of observed failures and operational use. This data resource provides a huge compendium of information for potential use, but can also be compartmented by industry, difficult to find out about, access, or manipulate. One method used incorporates processes for reviewing these existing data sources and identifying the available information based on similar equipment, then using that generic data to derive an analog composite. Dissimilarities in equipment descriptions, environment of intended use, quality and even failure modes impact the "best" data incorporated in an analog composite. Once developed, this composite analog data provides a "better" representation of the reliability of the equipment or component. It can be used to support early risk or reliability trade studies, or analytical models to establish the predicted reliability data points. It also establishes a baseline prior that may be updated based on test data or observed operational constraints and failures, i.e., using Bayesian techniques. This tutorial presents a descriptive compilation of historical data sources across numerous industries and disciplines, along with examples of contents and data characteristics. It then presents methods for combining failure information from different sources and mathematical use of this data in early reliability estimation and analyses.
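A minimal sketch of the Bayesian updating mentioned above, assuming a conjugate gamma prior on a constant failure rate built from composite analog data and updated with (invented) program test results.

```python
# Conjugate gamma prior on a constant failure rate (failures per hour),
# updated with observed failures over accumulated test time.
# Prior built from composite analog data (hypothetical numbers).
analog_rate = 2.0e-5          # failures/h suggested by analog sources
analog_equiv_hours = 5.0e4    # "weight" given to the analog evidence

alpha0 = analog_rate * analog_equiv_hours     # prior shape
beta0 = analog_equiv_hours                    # prior rate parameter (hours)

# New program test evidence
test_failures = 1
test_hours = 2.0e4

alpha1 = alpha0 + test_failures
beta1 = beta0 + test_hours

print(f"prior mean rate    : {alpha0 / beta0:.2e} /h")
print(f"posterior mean rate: {alpha1 / beta1:.2e} /h")
```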
A simulated training model for laparoscopic pyloromyotomy: Is 3D printing the way of the future?
Williams, Andrew; McWilliam, Morgan; Ahlin, James; Davidson, Jacob; Quantz, Mackenzie A; Bütter, Andreana
2018-05-01
Hypertrophic pyloric stenosis (HPS) is a common neonatal condition treated with open or laparoscopic pyloromyotomy. 3D-printed organs offer realistic simulations to practice surgical techniques. The purpose of this study was to validate a 3D HPS stomach model and assess model reliability and surgical realism. Medical students, general surgery residents, and adult and pediatric general surgeons were recruited from a single center. Participants were videotaped three times performing a laparoscopic pyloromyotomy using box trainers and 3D-printed stomachs. Attempts were graded independently by three reviewers using GOALS and Task Specific Assessments (TSA). Participants were surveyed using the Index of Agreement of Assertions on Model Accuracy (IAAMA). Participants reported their experience levels as novice (22%), inexperienced (26%), intermediate (19%), and experienced (33%). Interrater reliability was similar for overall average GOALS and TSA scores. There was a significant improvement in GOALS (p<0.0001) and TSA scores (p=0.03) between attempts and overall. Participants felt the model accurately simulated a laparoscopic pyloromyotomy (82%) and would be a useful tool for beginners (100%). A 3D-printed stomach model for simulated laparoscopic pyloromyotomy is a useful training tool for learners to improve laparoscopic skills. The GOALS and TSA provide reliable technical skills assessments. II. Copyright © 2018 Elsevier Inc. All rights reserved.
Estévez, Natalia; Yu, Ningbo; Brügger, Mike; Villiger, Michael; Hepp-Reymond, Marie-Claude; Riener, Robert; Kollias, Spyros
2014-11-01
In neurorehabilitation, longitudinal assessment of arm movement related brain function in patients with motor disability is challenging due to variability in task performance. MRI-compatible robots monitor and control task performance, yielding more reliable evaluation of brain function over time. The main goals of the present study were first to define the brain network activated while performing active and passive elbow movements with an MRI-compatible arm robot (MaRIA) in healthy subjects, and second to test the reproducibility of this activation over time. For the fMRI analysis two models were compared. In model 1 movement onset and duration were included, whereas in model 2 force and range of motion were added to the analysis. Reliability of brain activation was tested with several statistical approaches applied on individual and group activation maps and on summary statistics. The activated network included mainly the primary motor cortex, primary and secondary somatosensory cortex, superior and inferior parietal cortex, medial and lateral premotor regions, and subcortical structures. Reliability analyses revealed robust activation for active movements with both fMRI models and all the statistical methods used. Imposed passive movements also elicited mainly robust brain activation for individual and group activation maps, and reliability was improved by including additional force and range of motion using model 2. These findings demonstrate that the use of robotic devices, such as MaRIA, can be useful to reliably assess arm movement related brain activation in longitudinal studies and may contribute in studies evaluating therapies and brain plasticity following injury in the nervous system.
Wagner, Flávia; Martel, Michelle M; Cogo-Moreira, Hugo; Maia, Carlos Renato Moreira; Pan, Pedro Mario; Rohde, Luis Augusto; Salum, Giovanni Abrahão
2016-01-01
The best structural model for attention-deficit/hyperactivity disorder (ADHD) symptoms remains a matter of debate. The objective of this study is to test the fit and factor reliability of competing models of the dimensional structure of ADHD symptoms in a sample of randomly selected and high-risk children and pre-adolescents from Brazil. Our sample comprised 2512 children aged 6-12 years from 57 schools in Brazil. The ADHD symptoms were assessed using parent report on the development and well-being assessment (DAWBA). Fit indexes from confirmatory factor analysis were used to test unidimensional, correlated, and bifactor models of ADHD, the latter including "g" ADHD and "s" symptom domain factors. Reliability of all models was measured with omega coefficients. A bifactor model with one general factor and three specific factors (inattention, hyperactivity, impulsivity) exhibited the best fit to the data, according to fit indices, as well as the most consistent factor loadings. However, based on omega reliability statistics, the specific inattention, hyperactivity, and impulsivity dimensions provided very little reliable information after accounting for the reliable general ADHD factor. Our study presents some psychometric evidence that ADHD specific ("s") factors might be unreliable after taking common ("g" factor) variance into account. These results are in accordance with the lack of longitudinal stability among subtypes, the absence of dimension-specific molecular genetic findings and non-specific effects of treatment strategies. Therefore, researchers and clinicians might most effectively rely on the "g" ADHD to characterize ADHD dimensional phenotype, based on currently available symptom items.
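For reference, omega total and omega hierarchical can be computed directly from standardized bifactor loadings under the usual assumptions of uncorrelated factors and unit item variances; the loadings below are invented, not the study's estimates.

```python
import numpy as np

def bifactor_omegas(lam_g, lam_s, groups):
    """Omega total and omega hierarchical for a standardized bifactor model.
    lam_g[i]: item i loading on the general factor.
    lam_s[i]: item i loading on its specific factor.
    groups:   list of item-index lists, one per specific factor."""
    lam_g = np.asarray(lam_g, float)
    lam_s = np.asarray(lam_s, float)
    uniq = 1.0 - lam_g**2 - lam_s**2                     # item uniquenesses
    gen_var = lam_g.sum() ** 2
    spec_var = sum(lam_s[idx].sum() ** 2 for idx in groups)
    total = gen_var + spec_var + uniq.sum()
    return (gen_var + spec_var) / total, gen_var / total  # omega_t, omega_h

# Hypothetical loadings for 9 ADHD-like items in 3 symptom domains
lam_g = [.6, .6, .6, .6, .6, .6, .6, .6, .6]
lam_s = [.3, .3, .3, .2, .2, .2, .25, .25, .25]
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
omega_t, omega_h = bifactor_omegas(lam_g, lam_s, groups)
print(f"omega_total={omega_t:.2f}, omega_hierarchical={omega_h:.2f}")
```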
NASA Technical Reports Server (NTRS)
1991-01-01
The technical effort and computer code enhancements performed during the sixth year of the Probabilistic Structural Analysis Methods program are summarized. Various capabilities are described to probabilistically combine structural response and structural resistance to compute component reliability. A library of structural resistance models is implemented in the Numerical Evaluations of Stochastic Structures Under Stress (NESSUS) code that included fatigue, fracture, creep, multi-factor interaction, and other important effects. In addition, a user interface was developed for user-defined resistance models. An accurate and efficient reliability method was developed and was successfully implemented in the NESSUS code to compute component reliability based on user-selected response and resistance models. A risk module was developed to compute component risk with respect to cost, performance, or user-defined criteria. The new component risk assessment capabilities were validated and demonstrated using several examples. Various supporting methodologies were also developed in support of component risk assessment.
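A minimal stress-strength sketch of combining a structural response distribution with a structural resistance distribution to estimate component reliability by Monte Carlo; the distributions are assumed for illustration and do not represent the NESSUS resistance models.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1_000_000

# Hypothetical distributions (not NESSUS resistance models)
stress = rng.lognormal(mean=np.log(300.0), sigma=0.10, size=N)    # MPa response
strength = rng.weibull(a=12.0, size=N) * 480.0                     # MPa resistance

pf = np.mean(strength <= stress)              # probability of failure
print(f"P_f ~ {pf:.2e}, reliability ~ {1 - pf:.6f}")
```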
A low-cost, tablet-based option for prehospital neurologic assessment: The iTREAT Study.
Chapman Smith, Sherita N; Govindarajan, Prasanthi; Padrick, Matthew M; Lippman, Jason M; McMurry, Timothy L; Resler, Brian L; Keenan, Kevin; Gunnell, Brian S; Mehndiratta, Prachi; Chee, Christina Y; Cahill, Elizabeth A; Dietiker, Cameron; Cattell-Gordon, David C; Smith, Wade S; Perina, Debra G; Solenski, Nina J; Worrall, Bradford B; Southerland, Andrew M
2016-07-05
In this 2-center study, we assessed the technical feasibility and reliability of a low cost, tablet-based mobile telestroke option for ambulance transport and hypothesized that the NIH Stroke Scale (NIHSS) could be performed with similar reliability between remote and bedside examinations. We piloted our mobile telemedicine system in 2 geographic regions, central Virginia and the San Francisco Bay Area, utilizing commercial cellular networks for videoconferencing transmission. Standardized patients portrayed scripted stroke scenarios during ambulance transport and were evaluated by independent raters comparing bedside to remote mobile telestroke assessments. We used a mixed-effects regression model to determine intraclass correlation of the NIHSS between bedside and remote examinations (95% confidence interval). We conducted 27 ambulance runs at both sites and successfully completed the NIHSS for all prehospital assessments without prohibitive technical interruption. The mean difference between bedside (face-to-face) and remote (video) NIHSS scores was 0.25 (1.00 to -0.50). Overall, correlation of the NIHSS between bedside and mobile telestroke assessments was 0.96 (0.92-0.98). In the mixed-effects regression model, there were no statistically significant differences accounting for method of evaluation or differences between sites. Utilizing a low-cost, tablet-based platform and commercial cellular networks, we can reliably perform prehospital neurologic assessments in both rural and urban settings. Further research is needed to establish the reliability and validity of prehospital mobile telestroke assessment in live patients presenting with acute neurologic symptoms. © 2016 American Academy of Neurology.
Ishman, Stacey L; Benke, James R; Johnson, Kaalan Erik; Zur, Karen B; Jacobs, Ian N; Thorne, Marc C; Brown, David J; Lin, Sandra Y; Bhatti, Nasir; Deutsch, Ellen S
2012-10-01
OBJECTIVES To confirm interrater reliability using blinded evaluation of a skills-assessment instrument to assess the surgical performance of resident and fellow trainees performing pediatric direct laryngoscopy and rigid bronchoscopy in simulated models. DESIGN Prospective, paired, blinded observational validation study. SUBJECTS Paired observers from multiple institutions simultaneously evaluated residents and fellows who were performing surgery in an animal laboratory or using high-fidelity manikins. The evaluators had no previous affiliation with the residents and fellows and did not know their year of training. INTERVENTIONS One- and 2-page versions of an objective structured assessment of technical skills (OSATS) assessment instrument composed of global and a task-specific surgical items were used to evaluate surgical performance. RESULTS Fifty-two evaluations were completed by 17 attending evaluators. The instrument agreement for the 2-page assessment was 71.4% when measured as a binary variable (ie, competent vs not competent) (κ = 0.38; P = .08). Evaluation as a continuous variable revealed a 42.9% percentage agreement (κ = 0.18; P = .14). The intraclass correlation was 0.53, considered substantial/good interrater reliability (69% reliable). For the 1-page instrument, agreement was 77.4% when measured as a binary variable (κ = 0.53, P = .0015). Agreement when evaluated as a continuous measure was 71.0% (κ = 0.54, P < .001). The intraclass correlation was 0.73, considered high interrater reliability (85% reliable). CONCLUSIONS The OSATS assessment instrument is an effective tool for evaluating surgical performance among trainees with acceptable interrater reliability in a simulator setting. Reliability was good for both the 1- and 2-page OSATS checklists, and both serve as excellent tools to provide immediate formative feedback on operational competency.
Reliability Quantification of Advanced Stirling Convertor (ASC) Components
NASA Technical Reports Server (NTRS)
Shah, Ashwin R.; Korovaichuk, Igor; Zampino, Edward
2010-01-01
The Advanced Stirling Convertor (ASC) is intended to provide power for an unmanned planetary spacecraft and has an operational life requirement of 17 years. Over this 17-year mission, the ASC must provide power with desired performance and efficiency and require no corrective maintenance. Reliability demonstration testing for the ASC was found to be very limited due to schedule and resource constraints. Reliability demonstration must involve the application of analysis, system and component level testing, and simulation models, taken collectively. Therefore, computer simulation with limited test data verification is a viable approach to assess the reliability of ASC components. This approach is based on physics-of-failure mechanisms and involves the relationship among the design variables based on physics, mechanics, material behavior models, interaction of different components and their respective disciplines such as structures, materials, fluid, thermal, mechanical, electrical, etc. In addition, these models are based on the available test data, which can be updated, and analysis refined as more data and information becomes available. The failure mechanisms and causes of failure are included in the analysis, especially in light of the new information, in order to develop guidelines to improve design reliability and better operating controls to reduce the probability of failure. Quantified reliability assessment based on fundamental physical behavior of components and their relationship with other components has demonstrated itself to be a superior technique to conventional reliability approaches based on utilizing failure rates derived from similar equipment or simply expert judgment.
Ceramic component reliability with the restructured NASA/CARES computer program
NASA Technical Reports Server (NTRS)
Powers, Lynn M.; Starlinger, Alois; Gyekenyesi, John P.
1992-01-01
The Ceramics Analysis and Reliability Evaluation of Structures (CARES) integrated design program for statistical fast-fracture reliability of monolithic ceramic components is enhanced to include the use of a neutral data base, two-dimensional modeling, and variable problem size. The data base allows for the efficient transfer of element stresses, temperatures, and volumes/areas from the finite element output to the reliability analysis program. Elements are divided to ensure a direct correspondence between the subelements and the Gaussian integration points. Two-dimensional modeling is accomplished by assessing the volume flaw reliability with shell elements. To demonstrate the improvements in the algorithm, example problems are selected from a round-robin conducted by WELFEP (WEakest Link failure probability prediction by Finite Element Postprocessors).
Hadi, Azlihanis Abdul; Naing, Nyi Nyi; Daud, Aziah; Nordin, Rusli
2006-11-01
This study was conducted to assess the reliability and construct validity of the Malay version of the Job Content Questionnaire (JCQ) among secondary school teachers in Kota Bharu, Kelantan. A total of 68 teachers consented to participate in the study and were administered the Malay version of the JCQ. Reliability was determined using Cronbach's alpha for internal consistency, whilst construct validity was assessed using factor analysis. Cronbach's alpha coefficients were 0.75 for decision latitude, 0.50 for psychological job demand, and 0.84 for social support. Factor analysis showed three meaningful common factors that could explain the construct of Karasek's demand-control-social support model. The study suggests the JCQ scales are reliable and valid tools for assessing job stress in school teachers.
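For readers unfamiliar with the statistic, the sketch below computes Cronbach's alpha for one subscale from a small matrix of hypothetical Likert-type responses; the item values are illustrative only and are not data from the study.

```python
import numpy as np

# Hypothetical item responses (rows = respondents, columns = items of one subscale).
X = np.array([
    [3, 4, 3, 4, 3],
    [2, 2, 3, 2, 2],
    [4, 4, 4, 3, 4],
    [1, 2, 1, 2, 2],
    [3, 3, 4, 4, 3],
    [2, 3, 2, 2, 3],
], dtype=float)

def cronbach_alpha(items):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

print(f"Cronbach's alpha = {cronbach_alpha(X):.2f}")
```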
Noninteractive macroscopic reliability model for whisker-reinforced ceramic composites
NASA Technical Reports Server (NTRS)
Duffy, Stephen F.; Arnold, Steven M.
1990-01-01
Considerable research is underway in the field of material science focusing on incorporating silicon carbide whiskers into silicon nitride and alumina matrices. These composites show the requisite thermal stability and thermal shock resistance necessary for use as components in advanced gas turbines and heat exchangers. This paper presents a macroscopic noninteractive reliability model for whisker-reinforced ceramic composites. The theory is multiaxial and is applicable to composites that can be characterized as transversely isotropic. Enough processing data exists to suggest this idealization encompasses a significantly large class of fabricated components. A qualitative assessment of the model is made by presenting reliability surfaces in several different stress spaces and for different values of model parameters.
Kainz, Hans; Hoang, Hoa X; Stockton, Chris; Boyd, Roslyn R; Lloyd, David G; Carty, Christopher P
2017-10-01
Gait analysis together with musculoskeletal modeling is widely used for research. In the absence of medical images, surface marker locations are used to scale a generic model to the individual's anthropometry. Studies evaluating the accuracy and reliability of different scaling approaches in a pediatric and/or clinical population have not yet been conducted and, therefore, formed the aim of this study. Magnetic resonance images (MRI) and motion capture data were collected from 12 participants with cerebral palsy and 6 typically developed participants. Accuracy was assessed by comparing the scaled model's segment measures to the corresponding MRI measures, whereas reliability was assessed by comparing the model's segments scaled with the experimental marker locations from the first and second motion capture session. The inclusion of joint centers into the scaling process significantly increased the accuracy of thigh and shank segment length estimates compared to scaling with markers alone. Pelvis scaling approaches which included the pelvis depth measure led to the highest errors compared to the MRI measures. Reliability was similar between scaling approaches with mean ICC of 0.97. The pelvis should be scaled using pelvic width and height and the thigh and shank segment should be scaled using the proximal and distal joint centers.
Madison, Matthew J; Bradshaw, Laine P
2015-06-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other multidimensional measurement models. A priori specifications of which latent characteristics or attributes are measured by each item are a core element of the diagnostic assessment design. This item-attribute alignment, expressed in a Q-matrix, precedes and supports any inference resulting from the application of the diagnostic classification model. This study investigates the effects of Q-matrix design on classification accuracy for the log-linear cognitive diagnosis model. Results indicate that classification accuracy, reliability, and convergence rates improve when the Q-matrix contains isolated information from each measured attribute.
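As a rough illustration of how a Q-matrix links items to attributes and drives classification accuracy, the following sketch simulates a DINA-type model, a simpler relative of the log-linear cognitive diagnosis model examined in the study; the Q-matrix, slip/guess rates, and sample size are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Q-matrix: 6 items measuring 2 attributes (1 = item requires the attribute).
Q = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]])
n_items, n_attr = Q.shape

# Simulate examinee attribute profiles and DINA-style responses.
n_examinees = 2000
alpha = rng.integers(0, 2, size=(n_examinees, n_attr))          # true mastery profiles
eta = (alpha @ Q.T == Q.sum(axis=1)).astype(int)                 # 1 if all required attributes mastered
slip, guess = 0.1, 0.2
p_correct = eta * (1 - slip) + (1 - eta) * guess
X = rng.binomial(1, p_correct)

# Classify each examinee by maximum likelihood over all 2^K profiles and measure accuracy.
profiles = np.array(list(itertools.product([0, 1], repeat=n_attr)))
eta_prof = (profiles @ Q.T == Q.sum(axis=1)).astype(int)
p_prof = eta_prof * (1 - slip) + (1 - eta_prof) * guess
loglik = X @ np.log(p_prof).T + (1 - X) @ np.log(1 - p_prof).T
est = profiles[loglik.argmax(axis=1)]
print("profile classification accuracy:", (est == alpha).all(axis=1).mean())
```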
Multiple Subtypes among Vocationally Undecided College Students: A Model and Assessment Instrument.
ERIC Educational Resources Information Center
Jones, Lawrence K.; Chenery, Mary Faeth
1980-01-01
A model of vocational decision status was developed, and an instrument was constructed and used to assess its three dimensions. Results demonstrated the utility of the model, supported the reliability and validity of the instrument, and illustrated the value of viewing vocationally undecided students as multiple subtypes. (Author)
Bond, Mary Lou; Cason, Carolyn L
2014-01-01
To assess the content validity and internal consistency reliability of the Healthcare Professions Education Program Self-Assessment (PSA) and the Institutional Self-Assessment for Factors Supporting Hispanic Student Retention (ISA). Health disparities among vulnerable populations are among the top priorities demanding attention in the United States. Efforts to recruit and retain Hispanic nursing students are essential. Based on a sample of provosts, deans/directors, and an author of the Model of Institutional Support, participants commented on the perceived validity and usefulness of each item of the PSA and ISA. Internal consistency reliability was calculated by Cronbach's alpha using responses from nursing schools in states with large Hispanic populations. The ISA and PSA were found to be reliable and valid tools for assessing institutional friendliness. The instruments highlight strengths and identify potential areas of improvement at institutional and program levels.
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Ma; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-01-01
The Yale-Brown Obsessive-Compulsive Scale for children and adolescents (CY-BOCS) is a frequently applied test to assess obsessive-compulsive symptoms. We conducted a reliability generalization meta-analysis on the CY-BOCS to estimate the average reliability, search for reliability moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the CY-BOCS scores. A total of 47 studies reporting a reliability coefficient with the data at hand were included in the meta-analysis. The results showed good reliability and a large variability associated with the standard deviation of total scores and sample size.
Quality assessment of a new surgical simulator for neuroendoscopic training.
Filho, Francisco Vaz Guimarães; Coelho, Giselle; Cavalheiro, Sergio; Lyra, Marcos; Zymberg, Samuel T
2011-04-01
Ideal surgical training models should be entirely reliable, atoxic, easy to handle, and, if possible, low cost. All available models have their advantages and disadvantages. The choice of one or another will depend on the type of surgery to be performed. The authors created an anatomical model called the S.I.M.O.N.T. (Sinus Model Oto-Rhino Neuro Trainer) Neurosurgical Endotrainer, which can provide reliable neuroendoscopic training. The aim in the present study was to assess both the quality of the model and the development of surgical skills by trainees. The S.I.M.O.N.T. is built of a synthetic thermoretractable, thermosensible rubber called Neoderma, which, combined with different polymers, produces more than 30 different formulas. Quality assessment of the model was based on qualitative and quantitative data obtained from training sessions with 9 experienced and 13 inexperienced neurosurgeons. The techniques used for evaluation were face validation, retest and interrater reliability, and construct validation. The experts considered the S.I.M.O.N.T. capable of reproducing surgical situations as if they were real and presenting great similarity with the human brain. Surgical results of serial training showed that the model could be considered precise. Finally, development and improvement in surgical skills by the trainees were observed and considered relevant to further training. It was also observed that the probability of any single error was dramatically decreased after each training session, with a mean reduction of 41.65% (range 38.7%-45.6%). Neuroendoscopic training has some specific requirements. A unique set of instruments is required, as is a model that can resemble real-life situations. The S.I.M.O.N.T. is a new alternative model specially designed for this purpose. Validation techniques followed by precision assessments attested to the model's feasibility.
Aragón, Mônica L C; Pontes, Luana F; Bichara, Lívia M; Flores-Mir, Carlos; Normando, David
2016-08-01
The development of 3D technology and the increasing use of intraoral scanners in dental office routine lead to the need for comparisons with conventional techniques. To determine if intra- and inter-arch measurements from digital dental models acquired by an intraoral scanner are as reliable and valid as the similar measurements achieved from dental models obtained through conventional intraoral impressions. An unrestricted electronic search of seven databases until February 2015. Studies that focused on the accuracy and reliability of images obtained from intraoral scanners compared to images obtained from conventional impressions. After study selection the QUADAS risk of bias assessment tool for diagnostic studies was used to assess the risk of bias (RoB) among the included studies. Four articles were included in the qualitative synthesis. The scanners evaluated were OrthoProof, Lava, iOC intraoral, Lava COS, iTero and D250. These studies evaluated the reliability of tooth widths, Bolton ratio measurements, and image superimposition. Two studies were classified as having low RoB; one had moderate RoB and the remaining one had high RoB. Only one study evaluated the time required to complete clinical procedures and patients' opinions about the procedure. Patients reported feeling more comfortable with the conventional dental impression method. Associated costs were not considered in any of the included studies. Inter- and intra-arch measurements from digital models produced from intraoral scans appeared to be reliable and accurate in comparison to those from conventional impressions. This assessment only applies to the intraoral scanner models considered in the finally included studies. Digital models produced by intraoral scan eliminate the need for impression materials; however, more time is currently needed to acquire the digital images. PROSPERO (CRD42014009702). None.
2011-09-01
a quality evaluation with limited data, a model-based assessment must be... that affect system performance, a multistage approach to system validation, a modeling and experimental methodology for efficiently addressing a wide range
Developing Confidence Limits For Reliability Of Software
NASA Technical Reports Server (NTRS)
Hayhurst, Kelly J.
1991-01-01
Technique developed for estimating reliability of software by use of Moranda geometric de-eutrophication model. Pivotal method enables straightforward construction of exact bounds with associated degree of statistical confidence about reliability of software. Confidence limits thus derived provide precise means of assessing quality of software. Limits take into account number of bugs found while testing and effects of sampling variation associated with random order of discovering bugs.
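The pivotal confidence-bound construction itself is not reproduced here, but a minimal sketch of point estimation for the geometric de-eutrophication model is shown below, assuming inter-failure times are exponential with rate D·k^i after the i-th repair; the failure-time data are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Invented inter-failure times (hours) -- illustrative only.
t = np.array([9., 12., 11., 4., 7., 2., 5., 8., 5., 7., 1., 6., 1., 9., 4., 1., 3., 3., 6., 1.])
i = np.arange(len(t))                          # failure index, starting at 0

def neg_loglik(params):
    # Geometric de-eutrophication model: T_i ~ Exponential(rate = D * k**i), 0 < k < 1.
    logD, logitk = params
    D, k = np.exp(logD), 1.0 / (1.0 + np.exp(-logitk))
    rate = D * k**i
    return -(np.log(rate) - rate * t).sum()

res = minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
D_hat, k_hat = np.exp(res.x[0]), 1.0 / (1.0 + np.exp(-res.x[1]))
print(f"D = {D_hat:.3f}, k = {k_hat:.3f}, current failure rate = {D_hat * k_hat**len(t):.4f}")
```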
ERIC Educational Resources Information Center
Davison, Mark L.; Semmes, Robert; Huang, Lan; Close, Catherine N.
2012-01-01
Data from 181 college students were used to assess whether math reasoning item response times in computerized testing can provide valid and reliable measures of a speed dimension. The alternate forms reliability of the speed dimension was .85. A two-dimensional structural equation model suggests that the speed dimension is related to the accuracy…
The relationship between cost estimates reliability and BIM adoption: SEM analysis
NASA Astrophysics Data System (ADS)
Ismail, N. A. A.; Idris, N. H.; Ramli, H.; Rooshdi, R. R. Raja Muhammad; Sahamir, S. R.
2018-02-01
This paper presents the use of a Structural Equation Modelling (SEM) approach in analysing the effects of Building Information Modelling (BIM) technology adoption on improving the reliability of cost estimates. Based on the questionnaire survey results, SEM analysis using the SPSS-AMOS application examined the relationships between BIM-improved information and cost estimate reliability factors, leading to BIM technology adoption. Six hypotheses were established prior to SEM analysis employing two types of SEM models, namely the Confirmatory Factor Analysis (CFA) model and the full structural model. The SEM models were then validated through assessment of their uni-dimensionality, validity, reliability, and fitness indices, in line with the hypotheses tested. The final SEM model fit measures are: P-value=0.000, RMSEA=0.079<0.08, GFI=0.824, CFI=0.962>0.90, TLI=0.956>0.90, NFI=0.935>0.90 and ChiSq/df=2.259, indicating that the overall index values achieved the required level of model fit. The model supports all the hypotheses evaluated, confirming that all relationships among the constructs are positive and significant. Ultimately, the analysis verified that most respondents foresee a better understanding of project input information through BIM visualization, its reliable database and coordinated data, in developing more reliable cost estimates. They also perceive that BIM adoption accelerates their cost estimating tasks.
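As a rough, hypothetical illustration of fitting a structural model and reading off fit indices of the kind reported above, the sketch below uses the Python semopy package (assumed available); the construct names, indicators, and data file are invented, and the exact column names returned by calc_stats may differ between package versions.

```python
import pandas as pd
import semopy

# Hypothetical measurement and structural specification (not the authors' model).
desc = """
BIMInfo =~ vis1 + vis2 + vis3
CostRel =~ rel1 + rel2 + rel3
CostRel ~ BIMInfo
"""

data = pd.read_csv("survey_responses.csv")   # hypothetical Likert-scale survey data
model = semopy.Model(desc)
model.fit(data)

stats = semopy.calc_stats(model)             # fit indices; names may vary by version
print(stats[["DoF", "chi2", "RMSEA", "CFI", "TLI", "NFI", "GFI"]])
```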
Uncertainties in obtaining high reliability from stress-strength models
NASA Technical Reports Server (NTRS)
Neal, Donald M.; Matthews, William T.; Vangel, Mark G.
1992-01-01
There has been a recent interest in determining high statistical reliability in risk assessment of aircraft components. The potential consequences are identified of incorrectly assuming a particular statistical distribution for stress or strength data used in obtaining the high reliability values. The computation of the reliability is defined as the probability of the strength being greater than the stress over the range of stress values. This method is often referred to as the stress-strength model. A sensitivity analysis was performed involving a comparison of reliability results in order to evaluate the effects of assuming specific statistical distributions. Both known population distributions, and those that differed slightly from the known, were considered. Results showed substantial differences in reliability estimates even for almost nondetectable differences in the assumed distributions. These differences represent a potential problem in using the stress-strength model for high reliability computations, since in practice it is impossible to ever know the exact (population) distribution. An alternative reliability computation procedure is examined involving determination of a lower bound on the reliability values using extreme value distributions. This procedure reduces the possibility of obtaining nonconservative reliability estimates. Results indicated the method can provide conservative bounds when computing high reliability.
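A minimal sketch of the sensitivity the abstract describes: fit two plausible distributions to the same hypothetical strength sample and compare the resulting stress-strength reliability R = P(strength > stress). All distributions and parameters are invented for illustration.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

rng = np.random.default_rng(1)

# Hypothetical strength sample: drawn from a Weibull population, but an analyst who only
# sees the sample might just as plausibly fit a normal distribution to it.
true_strength = stats.weibull_min(c=20.0, scale=100.0)
sample = true_strength.rvs(size=100, random_state=rng)
stress = stats.norm(loc=60.0, scale=6.0)        # assumed stress distribution

def reliability(strength_dist, stress_dist):
    # R = P(strength > stress) = integral of F_stress(x) * f_strength(x) dx
    integrand = lambda x: stress_dist.cdf(x) * strength_dist.pdf(x)
    lo, hi = strength_dist.ppf(1e-10), strength_dist.ppf(1 - 1e-10)
    return quad(integrand, lo, hi)[0]

fits = {
    "normal fit ": stats.norm(*stats.norm.fit(sample)),
    "Weibull fit": stats.weibull_min(*stats.weibull_min.fit(sample, floc=0)),
}
for name, dist in fits.items():
    print(name, f"R = {reliability(dist, stress):.6f}")
```

Even though both fitted curves look nearly identical over the bulk of the sample, their lower tails differ, which is exactly where the unreliability comes from in the high-reliability regime.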
Reliability of Computerized Neurocognitive Tests for Concussion Assessment: A Meta-Analysis.
Farnsworth, James L; Dargo, Lucas; Ragan, Brian G; Kang, Minsoo
2017-09-01
Although widely used, computerized neurocognitive tests (CNTs) have been criticized because of low reliability and poor sensitivity. A systematic review was published summarizing the reliability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores; however, this was limited to a single CNT. Expansion of the previous review to include additional CNTs and a meta-analysis is needed. Therefore, our purpose was to analyze reliability data for CNTs using meta-analysis and examine moderating factors that may influence reliability. A systematic literature search (key terms: reliability, computerized neurocognitive test, concussion) of electronic databases (MEDLINE, PubMed, Google Scholar, and SPORTDiscus) was conducted to identify relevant studies. Studies were included if they met all of the following criteria: used a test-retest design, involved at least 1 CNT, provided sufficient statistical data to allow for effect-size calculation, and were published in English. Two independent reviewers investigated each article to assess inclusion criteria. Eighteen studies involving 2674 participants were retained. Intraclass correlation coefficients were extracted to calculate effect sizes and determine overall reliability. The Fisher Z transformation adjusted for sampling error associated with averaging correlations. Moderator analyses were conducted to evaluate the effects of the length of the test-retest interval, intraclass correlation coefficient model selection, participant demographics, and study design on reliability. Heterogeneity was evaluated using the Cochran Q statistic. The proportion of acceptable outcomes was greatest for the Axon Sports CogState Test (75%) and lowest for the ImPACT (25%). Moderator analyses indicated that the type of intraclass correlation coefficient model used significantly influenced effect-size estimates, accounting for 17% of the variation in reliability. The Axon Sports CogState Test, which has a higher proportion of acceptable outcomes and shorter test duration relative to other CNTs, may be a reliable option; however, future studies are needed to compare the diagnostic accuracy of these instruments.
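A minimal sketch of the Fisher r-to-z pooling step described above, using hypothetical ICCs and sample sizes rather than the coefficients actually retained in the meta-analysis:

```python
import numpy as np

# Hypothetical test-retest ICCs and sample sizes from individual CNT studies.
icc = np.array([0.62, 0.45, 0.71, 0.55, 0.80, 0.39])
n   = np.array([  45,  120,   60,   88,   30,  150])

# Fisher r-to-z transform, weight by (n - 3) as is conventional for correlations,
# then back-transform the weighted mean.
z = np.arctanh(icc)
w = n - 3
z_bar = np.sum(w * z) / np.sum(w)
print(f"pooled reliability estimate: {np.tanh(z_bar):.3f}")
```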
Constraining uncertainties in water supply reliability in a tropical data scarce basin
NASA Astrophysics Data System (ADS)
Kaune, Alexander; Werner, Micha; Rodriguez, Erasmo; de Fraiture, Charlotte
2015-04-01
Assessing the water supply reliability in river basins is essential for adequate planning and development of irrigated agriculture and urban water systems. In many cases hydrological models are applied to determine the surface water availability in river basins. However, surface water availability and variability is often not appropriately quantified due to epistemic uncertainties, leading to water supply insecurity. The objective of this research is to determine the water supply reliability in order to support planning and development of irrigated agriculture in a tropical, data scarce environment. The approach proposed uses a simple hydrological model, but explicitly includes model parameter uncertainty. A transboundary river basin in the tropical region of Colombia and Venezuela with an area of approximately 2100 km² was selected as a case study. The Budyko hydrological framework was extended to consider climatological input variability and model parameter uncertainty, and through this the surface water reliability to satisfy the irrigation and urban demand was estimated. This provides a spatial estimate of the water supply reliability across the basin. For the middle basin the reliability was found to be less than 30% for most of the months when the water is extracted from an upstream source. Conversely, the monthly water supply reliability was high (r>98%) in the lower basin irrigation areas when water was withdrawn from a source located further downstream. Including model parameter uncertainty provides a complete estimate of the water supply reliability, but that estimate is influenced by the uncertainty in the model. Reducing the uncertainty in the model through improved data and perhaps improved model structure will improve the estimate of the water supply reliability allowing better planning of irrigated agriculture and dependable water allocation decisions.
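As an illustrative sketch only (the study basin's data and the authors' exact formulation are not reproduced), the code below uses Fu's form of the Budyko curve to turn monthly precipitation and potential evapotranspiration into runoff, samples the curve parameter to mimic parameter uncertainty, and reports the fraction of samples in which each month's supply meets a fixed demand.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical monthly climatology (mm/month) and demand; not data from the study basin.
P   = np.array([260, 240, 210, 150,  90,  40,  30,  45, 110, 180, 230, 250], float)
PET = np.array([ 90, 100, 110, 120, 130, 140, 145, 140, 125, 110, 100,  95], float)
demand = 60.0                                  # mm/month equivalent irrigation + urban demand

def fu_runoff(P, PET, omega):
    # Fu's form of the Budyko curve: E/P = 1 + PET/P - (1 + (PET/P)**omega)**(1/omega)
    phi = PET / P
    E = P * (1.0 + phi - (1.0 + phi**omega)**(1.0 / omega))
    return np.clip(P - E, 0.0, None)

# Propagate parameter uncertainty by sampling omega instead of fixing a single value.
omegas = rng.normal(2.6, 0.4, size=5000).clip(1.01, None)
supply_ok = np.array([fu_runoff(P, PET, w) >= demand for w in omegas])
print("monthly water supply reliability (fraction of parameter samples meeting demand):")
print(np.round(supply_ok.mean(axis=0), 2))
```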
A Critique of a Phenomenological Fiber Breakage Model for Stress Rupture of Composite Materials
NASA Technical Reports Server (NTRS)
Reeder, James R.
2010-01-01
Stress rupture is not a critical failure mode for most composite structures, but there are a few applications where it can be critical. One application where stress rupture can be a critical design issue is in Composite Overwrapped Pressure Vessels (COPVs), where the composite material is highly and uniformly loaded for long periods of time and where very high reliability is required. COPVs are normally required to be proof loaded before being put into service to ensure strength, but it is feared that the proof load may cause damage that reduces the stress rupture reliability. Recently, a fiber breakage model was proposed specifically to estimate a reduced reliability due to proof loading. The fiber breakage model attempts to model physics believed to occur at the microscopic scale, but validation of the model has not occurred. In this paper, the fiber breakage model is re-derived while highlighting assumptions that were made during the derivation. Some of the assumptions are examined to assess their effect on the final predicted reliability.
Jackson, Brian A; Faith, Kay Sullivan
2013-02-01
Although significant progress has been made in measuring public health emergency preparedness, system-level performance measures are lacking. This report examines a potential approach to such measures for Strategic National Stockpile (SNS) operations. We adapted an engineering analytic technique used to assess the reliability of technological systems-failure mode and effects analysis-to assess preparedness. That technique, which includes systematic mapping of the response system and identification of possible breakdowns that affect performance, provides a path to use data from existing SNS assessment tools to estimate likely future performance of the system overall. Systems models of SNS operations were constructed and failure mode analyses were performed for each component. Linking data from existing assessments, including the technical assistance review and functional drills, to reliability assessment was demonstrated using publicly available information. The use of failure mode and effects estimates to assess overall response system reliability was demonstrated with a simple simulation example. Reliability analysis appears an attractive way to integrate information from the substantial investment in detailed assessments for stockpile delivery and dispensing to provide a view of likely future response performance.
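A minimal sketch of the kind of simple simulation mentioned above: a handful of hypothetical failure modes, each with an occurrence probability and a detection (control) probability, combined by Monte Carlo into an end-to-end response reliability. The modes and probabilities are invented, not taken from SNS assessment data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical failure modes for a stockpile delivery/dispensing chain.
failure_modes = {
    "request/approval delay":       {"p_occur": 0.10, "p_detect": 0.60},
    "transport breakdown":          {"p_occur": 0.05, "p_detect": 0.50},
    "warehouse staffing shortfall": {"p_occur": 0.15, "p_detect": 0.70},
    "dispensing site overload":     {"p_occur": 0.20, "p_detect": 0.40},
}

def simulate(n=100_000):
    ok = np.ones(n, dtype=bool)
    for mode in failure_modes.values():
        occurs = rng.random(n) < mode["p_occur"]
        caught = rng.random(n) < mode["p_detect"]
        ok &= ~(occurs & ~caught)          # an uncaught occurrence fails the response
    return ok.mean()

print(f"estimated end-to-end response reliability: {simulate():.3f}")
```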
The 20 GHz solid state transmitter design, impatt diode development and reliability assessment
NASA Technical Reports Server (NTRS)
Picone, S.; Cho, Y.; Asmus, J. R.
1984-01-01
A single drift gallium arsenide (GaAs) Schottky barrier IMPATT diode and related components were developed. The IMPATT diode reliability was assessed. A proof of concept solid state transmitter design and a technology assessment study were performed. The transmitter design utilizes technology which, upon implementation, will demonstrate readiness for development of a POC model within the 1982 time frame and will provide an information base for flight hardware capable of deployment in a 1985 to 1990 demonstrational 30/20 GHz satellite communication system. Life test data for Schottky barrier GaAs diodes and grown junction GaAs diodes are described. The results demonstrate the viability of GaAs IMPATTs as high performance, reliable RF power sources which, based on the recommendation made herein, will surpass device reliability requirements consistent with a ten year spaceborne solid state power amplifier mission.
Togari, Taisuke; Yamazaki, Yoshihiko; Koide, Syotaro; Miyata, Ayako
2006-01-01
In community and workplace health plans, the Perceived Health Competence Scale (PHCS) is employed as an index of health competency. The purpose of this research was to examine the reliability and validity of a modified Japanese PHCS. Interviews were sought with 3,000 randomly selected Japanese individuals using a two-step stratified method. Valid PHCS responses were obtained from 1,910 individuals, yielding a 63.7% response rate. Reliability was assessed using Cronbach's alpha coefficient (henceforth, alpha) to evaluate internal consistency, and by employing item-total correlation and alpha coefficient analyses to assess the effect of removal of variables from the model. To examine content validity, we assessed the correlation between the PHCS score and four respondent attribute characteristics, that is, sex, age, the presence of chronic disease, and the existence of chronic disease at age 18. The correlation between PHCS score and commonly employed healthy lifestyle indices was examined to assess construct validity. General linear model statistical analysis was employed. The modified Japanese PHCS demonstrated a satisfactory alpha coefficient of 0.869. Moreover, reliability was confirmed by item-total correlation and alpha coefficient analyses after removal of variables from the model. Differences in PHCS scores were seen between individuals 60 years and older, and younger individuals. Those with current chronic disease, or who had had a chronic disease at age 18, tended to have lower PHCS scores. After controlling for the presence of current or age 18 chronic disease, age, and sex, significant correlations were seen between PHCS scores and tobacco use, dietary habits, and exercise, but not alcohol use or frequency of medical consultation. This study supports the reliability and validity, and hence supports the use, of the modified Japanese PHCS. Future longitudinal research is needed to evaluate the predictive power of modified Japanese PHCS scores, to examine factors influencing the development of perceived health competence, and to assess the effects of interventions on perceived health competence.
2007-12-01
there are no reliable alternatives to animal testing in the determination of toxicity. QSARs are only as reliable as the corroborating toxicological... 2) QSAR approaches can also be used to estimate toxicological impact. Toxicity QSAR models can often predict many toxicity parameters without... Toxicology Study No. 87-XE-03N3-05, Assessing the Potential Environmental Consequences of a New Energetic Material: A Phased Approach, September 2005
The reliability of the Hendrich Fall Risk Model in a geriatric hospital.
Heinze, Cornelia; Halfens, Ruud; Dassen, Theo
2008-12-01
Aims and objectives. The purpose of this study was to test the interrater reliability of the Hendrich Fall Risk Model, an instrument to identify patients in a hospital setting with a high risk of falling. Background. Falls are a serious problem in older patients. Valid and reliable fall risk assessment tools are required to identify high-risk patients and to take adequate preventive measures. Methods. Seventy older patients were independently and simultaneously assessed by six pairs of raters made up of nursing staff members. Consensus estimates were calculated using simple percentage agreement, and consistency estimates using Spearman's rho and the intraclass correlation coefficient. Results. Percentage agreement ranged from 0.70 to 0.92 between the six pairs of raters. Spearman's rho coefficients were between 0.54 and 0.80 and the intraclass correlation coefficients were between 0.46 and 0.92. Conclusions. Whereas some pairs of raters obtained considerable interobserver agreement and internal consistency, the others did not. Therefore, it is concluded that the Hendrich Fall Risk Model is not a reliable instrument. The use of more unambiguously operationalized items is preferred. Relevance to clinical practice. In practice, well operationalized fall risk assessment tools are necessary. Observer agreement should always be investigated after introducing a standardized measurement tool.
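For illustration, the sketch below computes the three statistics used in the study, percentage agreement, Spearman's rho, and a two-way random-effects ICC(2,1), for one hypothetical pair of raters; the ratings and the dichotomisation threshold are invented.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical fall-risk scores from one pair of raters on the same patients.
r1 = np.array([5, 7, 2, 9, 4, 6, 3, 8, 5, 7, 1, 6])
r2 = np.array([5, 6, 3, 9, 4, 7, 3, 7, 4, 7, 2, 6])

# Percentage agreement on the dichotomised decision (score >= 5 -> "high risk").
agree = np.mean((r1 >= 5) == (r2 >= 5))

# Consistency: Spearman's rho and a two-way random-effects ICC(2,1), absolute agreement.
rho = spearmanr(r1, r2)[0]
Y = np.column_stack([r1, r2]).astype(float)
n, k = Y.shape
grand = Y.mean()
MSR = k * ((Y.mean(axis=1) - grand) ** 2).sum() / (n - 1)        # between-subjects mean square
MSC = n * ((Y.mean(axis=0) - grand) ** 2).sum() / (k - 1)        # between-raters mean square
SSE = ((Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0) + grand) ** 2).sum()
MSE = SSE / ((n - 1) * (k - 1))
icc21 = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)

print(f"agreement={agree:.2f}  rho={rho:.2f}  ICC(2,1)={icc21:.2f}")
```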
Site-specific to local-scale shallow landslides triggering zones assessment using TRIGRS
NASA Astrophysics Data System (ADS)
Bordoni, M.; Meisina, C.; Valentino, R.; Bittelli, M.; Chersich, S.
2015-05-01
Rainfall-induced shallow landslides are common phenomena in many parts of the world, affecting cultivation and infrastructure and sometimes causing human losses. Assessing the triggering zones of shallow landslides is fundamental for land planning at different scales. This work defines a reliable methodology to extend a slope stability analysis from the site-specific to local scale by using a well-established physically based model (TRIGRS-unsaturated). The model is initially applied to a sample slope and then to the surrounding 13.4 km2 area in Oltrepo Pavese (northern Italy). To obtain more reliable input data for the model, long-term hydro-meteorological monitoring has been carried out at the sample slope, which has been assumed to be representative of the study area. Field measurements identified the triggering mechanism of shallow failures and were used to verify the reliability of the model to obtain pore water pressure trends consistent with those measured during the monitoring activity. In this way, more reliable trends have been modelled for past landslide events, such as the April 2009 event that was assumed as a benchmark. The assessment of shallow landslide triggering zones obtained using TRIGRS-unsaturated for the benchmark event appears good for both the monitored slope and the whole study area, with better results when a pedological instead of geological zoning is considered at the regional scale. The sensitivity analyses of the influence of the soil input data show that the mean values of the soil properties give the best results in terms of the ratio between the true positive and false positive rates. The scheme followed in this work allows us to obtain better results in the assessment of shallow landslide triggering areas in terms of the reduction in the overestimation of unstable zones with respect to other distributed models applied in the past.
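A minimal sketch of the true positive/false positive bookkeeping used to judge the susceptibility maps, on tiny hypothetical rasters rather than the study grids:

```python
import numpy as np

# Hypothetical rasters: 1 = unstable cell, 0 = stable cell.
observed  = np.array([[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
predicted = np.array([[0, 1, 1, 0], [1, 1, 0, 0], [0, 1, 0, 1], [1, 0, 0, 0]])

tp = np.sum((predicted == 1) & (observed == 1))
fp = np.sum((predicted == 1) & (observed == 0))
fn = np.sum((predicted == 0) & (observed == 1))
tn = np.sum((predicted == 0) & (observed == 0))

tpr = tp / (tp + fn)            # fraction of observed source areas captured
fpr = fp / (fp + tn)            # overestimation of unstable zones
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}  TPR/FPR={tpr / fpr:.2f}")
```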
Issues in Benchmarking Human Reliability Analysis Methods: A Literature Review
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ronald L. Boring; Stacey M. L. Hendrickson; John A. Forester
There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessments (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study comparing and evaluating HRA methods in assessing operator performance in simulator experiments is currently underway. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.
Prediction of Software Reliability using Bio Inspired Soft Computing Techniques.
Diwaker, Chander; Tomar, Pradeep; Poonia, Ramesh C; Singh, Vijander
2018-04-10
Many models have been proposed for predicting software reliability, but each is restricted to particular methodologies and a limited number of parameters, even though a wide range of techniques may be used for reliability prediction. Careful parameter selection is essential when estimating reliability: the reliability of a system may increase or decrease depending on the parameters chosen, so the factors that most heavily affect system reliability must be identified. Reusability is now widely used across research areas and is the basis of Component-Based Systems (CBS); Component-Based Software Engineering (CBSE) concepts can save cost, time, and human effort, and CBSE metrics may be used to assess which techniques are most suitable for estimating system reliability. Soft computing is applied to small- and large-scale problems where accurate results are difficult to obtain because of uncertainty or randomness, and it offers several possibilities in medicine: clinical medicine makes significant use of fuzzy logic and neural network methods, while basic medical science most frequently combines neural networks with genetic algorithms, and medical scientists have shown strong interest in applying soft computing methodologies in genetics, physiology, radiology, cardiology, and neurology. CBSE encourages users to reuse past and existing software when building new products, providing quality while saving time, memory space, and money. This paper assesses commonly used soft computing techniques, including the Genetic Algorithm (GA), Neural Network (NN), Fuzzy Logic, Support Vector Machine (SVM), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), and Artificial Bee Colony (ABC). It describes how these techniques work, evaluates their use for reliability prediction, and discusses the parameters considered when estimating and predicting reliability. The study can support estimation and prediction of the reliability of instruments used in medical systems, software engineering, computer engineering, and mechanical engineering, and the concepts can be applied to both software and hardware to predict reliability using CBSE.
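None of the paper's specific models are reproduced here, but as a hedged illustration of using an evolutionary optimizer for reliability prediction, the sketch below fits the parameters of a Goel-Okumoto software reliability growth model to hypothetical failure-count data with SciPy's differential evolution (a stand-in for the GA/PSO/ABC techniques surveyed).

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical cumulative failure counts observed at the end of each test week.
t = np.arange(1, 13, dtype=float)                                  # weeks
m_obs = np.array([8, 15, 21, 26, 30, 33, 36, 38, 39, 41, 42, 43], float)

def goel_okumoto(t, a, b):
    # Expected cumulative failures: m(t) = a * (1 - exp(-b t))
    return a * (1.0 - np.exp(-b * t))

def sse(params):
    a, b = params
    return np.sum((m_obs - goel_okumoto(t, a, b)) ** 2)

result = differential_evolution(sse, bounds=[(1.0, 200.0), (1e-3, 2.0)], seed=3)
a_hat, b_hat = result.x
delta = goel_okumoto(13, a_hat, b_hat) - goel_okumoto(12, a_hat, b_hat)
print(f"a={a_hat:.1f} expected total faults, b={b_hat:.3f}")
print(f"predicted reliability over week 13: R={np.exp(-delta):.3f}")
```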
Lopes, V P; Barnett, L M; Saraiva, L; Gonçalves, C; Bowe, S J; Abbott, G; Rodrigues, L P
2016-09-01
It is important to assess young children's perceived Fundamental Movement Skill (FMS) competence in order to examine the role of perceived FMS competence in motivation toward physical activity. Children's perceptions of motor competence may vary according to the culture/country of origin; therefore, it is also important to measure perceptions in different cultural contexts. The purpose was to assess the face validity, internal consistency, test-retest reliability and construct validity of the 12 FMS items in the Pictorial Scale for Perceived Movement Skill Competence for Young Children (PMSC) in a Portuguese sample. Two hundred one Portuguese children (girls, n = 112), 5 to 10 years of age (7.6 ± 1.4), participated. All children completed the PMSC once. Ordinal alpha assessed internal consistency. A random subsample (n = 47) was reassessed one week later to determine test-retest reliability with the Bland-Altman method. Children were asked questions after the second administration to determine face validity. Construct validity was assessed on the whole sample with a Bayesian Structural Equation Modelling (BSEM) approach. The hypothesized theoretical model used the 12 items and two hypothesized factors: object control and locomotor skills. The majority of children correctly identified the skills and could understand most of the pictures. Test-retest reliability analysis was good, with an agreement ratio between 0.99 and 1.02. Ordinal alpha values ranged from acceptable (object control 0.73, locomotor 0.68) to good (all FMS 0.81). The hypothesized BSEM model had an adequate fit. The PMSC can be used to investigate perceptions of children's FMS competence. This instrument can also be satisfactorily used among Portuguese children.
Carvalho, Teresa; Cunha, Marina; Pinto-Gouveia, José; Duarte, Joana
2015-03-30
The PTSD Checklist-Military Version (PCL-M) is a brief self-report instrument widely used to assess Post-traumatic Stress Disorder (PTSD) symptomatology in war Veterans, according to DSM-IV. This study sought to explore the factor structure and reliability of the Portuguese version of the PCL-M. A sample of 660 Portuguese Colonial War Veterans completed the PCL-M. Several Confirmatory Factor Analyses were conducted to test different structures for PCL-M PTSD symptoms. Although the respecified first-order four-factor model based on King et al.'s model showed the best fit to the data, the respecified first- and second-order models based on the DSM-IV symptom clusters also presented an acceptable fit. In addition, the PCL-M showed adequate reliability. The Portuguese version of the PCL-M is thus a valid and reliable measure to assess the severity of PTSD symptoms as described in DSM-IV. Its use with Portuguese Colonial War Veterans may ease screening of possible PTSD cases, promote more suitable treatment planning, and enable monitoring of therapeutic outcomes.
What do we gain with Probabilistic Flood Loss Models?
NASA Astrophysics Data System (ADS)
Schroeter, K.; Kreibich, H.; Vogel, K.; Merz, B.; Lüdtke, S.
2015-12-01
The reliability of flood loss models is a prerequisite for their practical usefulness. Oftentimes, traditional uni-variate damage models as for instance depth-damage curves fail to reproduce the variability of observed flood damage. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks and traditional stage damage functions which are cast in a probabilistic framework. For model evaluation we use empirical damage data which are available from computer aided telephone interviews that were respectively compiled after the floods in 2002, 2005, 2006 and 2013 in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further we stratify the sample according to catchments which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error) as well as in terms of reliability which is represented by the proportion of the number of observations that fall within the 95-quantile and 5-quantile predictive interval. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves a very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
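As an illustrative sketch only (the survey data and model settings of the study are not reproduced), the code below trains a bagged decision-tree ensemble on synthetic loss records, forms a rough 5-95% predictive interval from the ensemble spread, and reports bias, mean absolute error, and interval coverage, the same three evaluation measures named above.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic loss records: water depth (m), duration (d), building area (m2) -> relative damage.
n = 1500
X = np.column_stack([rng.uniform(0, 3, n), rng.uniform(1, 14, n), rng.uniform(50, 400, n)])
rloss = np.clip(0.15 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0, 0.08, n), 0, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, rloss, test_size=0.3, random_state=1)
model = BaggingRegressor(DecisionTreeRegressor(), n_estimators=200, random_state=1).fit(X_tr, y_tr)

# Ensemble spread as a rough predictive distribution: 5-95% interval and its coverage.
member_preds = np.stack([est.predict(X_te) for est in model.estimators_])
lo, hi = np.percentile(member_preds, [5, 95], axis=0)
coverage = np.mean((y_te >= lo) & (y_te <= hi))
mean_pred = member_preds.mean(axis=0)
print(f"mean bias={np.mean(mean_pred - y_te):.3f}  "
      f"MAE={np.mean(np.abs(mean_pred - y_te)):.3f}  coverage={coverage:.2f}")
```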
Confronting uncertainty in flood damage predictions
NASA Astrophysics Data System (ADS)
Schröter, Kai; Kreibich, Heidi; Vogel, Kristin; Merz, Bruno
2015-04-01
Reliable flood damage models are a prerequisite for the practical usefulness of the model results. Oftentimes, traditional uni-variate damage models as for instance depth-damage curves fail to reproduce the variability of observed flood damage. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks. For model evaluation we use empirical damage data which are available from computer aided telephone interviews that were respectively compiled after the floods in 2002, 2005 and 2006, in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further we stratify the sample according to catchments which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error) as well as in terms of reliability which is represented by the proportion of the number of observations that fall within the 95-quantile and 5-quantile predictive interval. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves a very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
Structural Validation of the Holistic Wellness Assessment
ERIC Educational Resources Information Center
Brown, Charlene; Applegate, E. Brooks; Yildiz, Mustafa
2015-01-01
The Holistic Wellness Assessment (HWA) is a relatively new assessment instrument based on an emergent transdisciplinary model of wellness. This study validated the factor structure identified via exploratory factor analysis (EFA), assessed test-retest reliability, and investigated concurrent validity of the HWA in three separate samples. The…
Infant polysomnography: reliability and validity of infant arousal assessment.
Crowell, David H; Kulp, Thomas D; Kapuniai, Linda E; Hunt, Carl E; Brooks, Lee J; Weese-Mayer, Debra E; Silvestri, Jean; Ward, Sally Davidson; Corwin, Michael; Tinsley, Larry; Peucker, Mark
2002-10-01
Infant arousal scoring based on the Atlas Task Force definition of transient EEG arousal was evaluated to determine (1) whether transient arousals can be identified and assessed reliably in infants and (2) whether arousal and no-arousal epochs scored previously by trained raters can be validated reliably by independent sleep experts. Phase I for inter- and intrarater reliability scoring was based on two datasets of sleep epochs selected randomly from nocturnal polysomnograms of healthy full-term, preterm, idiopathic apparent life-threatening event cases, and siblings of Sudden Infant Death Syndrome infants of 35 to 64 weeks postconceptional age. After training, test set 1 reliability was assessed and discrepancies identified. After retraining, test set 2 was scored by the same raters to determine interrater reliability. Later, three raters from the trained group rescored test set 2 to assess inter- and intrarater reliabilities. Interrater and intrarater reliability kappas, with 95% confidence intervals, ranged from substantial to almost perfect levels of agreement. Interrater reliabilities for spontaneous arousals were initially moderate and then substantial. During the validation phase, 315 previously scored epochs were presented to four sleep experts to rate as containing arousal or no-arousal events. Interrater expert agreements were diverse and considered noninterpretable. Concordance in sleep experts' agreements, based on identification of the previously sampled arousal and no-arousal epochs, was used as a secondary evaluative technique. Results showed agreement by two or more experts on 86% of the Collaborative Home Infant Monitoring Evaluation Study-scored arousal events. Conversely, only 1% of the Collaborative Home Infant Monitoring Evaluation Study-scored no-arousal epochs were rated as an arousal. In summary, this study presents an empirically tested model with procedures and criteria for attaining improved reliability in transient EEG arousal assessments in infants using the modified Atlas Task Force standards. With training based on specific criteria, substantial inter- and intrarater agreement in identifying infant arousals was demonstrated. Corroborative validation results were too disparate for meaningful interpretation. Alternate evaluation based on concordance agreements supports reliance on infant EEG criteria for assessment. Results mandate additional confirmatory validation studies with specific training on infant EEG arousal assessment criteria.
Gamado, Kokouvi; Marion, Glenn; Porphyre, Thibaud
2017-01-01
Livestock epidemics have the potential to give rise to significant economic, welfare, and social costs. Incursions of emerging and re-emerging pathogens may lead to small and repeated outbreaks. Analysis of the resulting data is statistically challenging but can inform disease preparedness reducing potential future losses. We present a framework for spatial risk assessment of disease incursions based on data from small localized historic outbreaks. We focus on between-farm spread of livestock pathogens and illustrate our methods by application to data on the small outbreak of Classical Swine Fever (CSF) that occurred in 2000 in East Anglia, UK. We apply models based on continuous time semi-Markov processes, using data-augmentation Markov Chain Monte Carlo techniques within a Bayesian framework to infer disease dynamics and detection from incompletely observed outbreaks. The spatial transmission kernel describing pathogen spread between farms, and the distribution of times between infection and detection, is estimated alongside unobserved exposure times. Our results demonstrate inference is reliable even for relatively small outbreaks when the data-generating model is known. However, associated risk assessments depend strongly on the form of the fitted transmission kernel. Therefore, for real applications, methods are needed to select the most appropriate model in light of the data. We assess standard Deviance Information Criteria (DIC) model selection tools and recently introduced latent residual methods of model assessment, in selecting the functional form of the spatial transmission kernel. These methods are applied to the CSF data, and tested in simulated scenarios which represent field data, but assume the data generation mechanism is known. Analysis of simulated scenarios shows that latent residual methods enable reliable selection of the transmission kernel even for small outbreaks whereas the DIC is less reliable. Moreover, compared with DIC, model choice based on latent residual assessment correlated better with predicted risk. PMID:28293559
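The kernel forms actually compared for the CSF data are not reproduced here; the sketch below only illustrates the general idea of a distance-based transmission kernel and the per-pair infection probability it implies over a short interval, with invented parameter values.

```python
import numpy as np

# A generic distance-based transmission kernel (illustrative parameters only).
def kernel(d_km, k0=0.02, d0=3.0, alpha=2.5):
    return k0 / (1.0 + (d_km / d0) ** alpha)

# Probability that an infectious farm transmits to a susceptible farm over dt days,
# assuming a constant hazard kernel(d) during that interval.
def p_infect(d_km, dt_days):
    return 1.0 - np.exp(-kernel(d_km) * dt_days)

for d in (1, 5, 10, 25):
    print(f"d={d:>2} km  P(infection within 7 days)={p_infect(d, 7):.3f}")
```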
Modelling utility-scale wind power plants. Part 1: Economics
NASA Astrophysics Data System (ADS)
Milligan, Michael R.
1999-10-01
As the worldwide use of wind turbine generators continues to increase in utility-scale applications, it will become increasingly important to assess the economic and reliability impact of these intermittent resources. Although the utility industry in the United States appears to be moving towards a restructured environment, basic economic and reliability issues will continue to be relevant to companies involved with electricity generation. This article is the first of two which address modelling approaches and results obtained in several case studies and research projects at the National Renewable Energy Laboratory (NREL). This first article addresses the basic economic issues associated with electricity production from several generators that include large-scale wind power plants. An important part of this discussion is the role of unit commitment and economic dispatch in production cost models. This paper includes overviews and comparisons of the prevalent production cost modelling methods, including several case studies applied to a variety of electric utilities. The second article discusses various methods of assessing capacity credit and results from several reliability-based studies performed at NREL.
Figueroa, José; Guarachi, Juan Pablo; Matas, José; Arnander, Magnus; Orrego, Mario
2016-04-01
Computed tomography (CT) is widely used to assess component rotation in patients with poor results after total knee arthroplasty (TKA). The purpose of this study was to simultaneously determine the accuracy and reliability of CT in measuring TKA component rotation. TKA components were implanted in dry-bone models and assigned to two groups. The first group (n = 7) had variable femoral component rotations, and the second group (n = 6) had variable tibial tray rotations. CT images were then used to assess component rotation. Accuracy of CT rotational assessment was determined by mean difference, in degrees, between implanted component rotation and CT-measured rotation. Intraclass correlation coefficient (ICC) was applied to determine intra-observer and inter-observer reliability. Femoral component accuracy showed a mean difference of 2.5° and the tibial tray a mean difference of 3.2°. There was good intra- and inter-observer reliability for both components, with a femoral ICC of 0.8 and 0.76, and tibial ICC of 0.68 and 0.65, respectively. CT rotational assessment accuracy can differ from true component rotation by approximately 3° for each component. It does, however, have good inter- and intra-observer reliability.
Evaluation of a model of violence risk assessment among forensic psychiatric patients.
Douglas, Kevin S; Ogloff, James R P; Hart, Stephen D
2003-10-01
This study tested the interrater reliability and criterion-related validity of structured violence risk judgments made by using one application of the structured professional judgment model of violence risk assessment, the HCR-20 violence risk assessment scheme, which assesses 20 key risk factors in three domains: historical, clinical, and risk management. The HCR-20 was completed for a sample of 100 forensic psychiatric patients who had been found not guilty by reason of a mental disorder and were subsequently released to the community. Violence in the community was determined from multiple file-based sources. Interrater reliability of structured final risk judgments of low, moderate, or high violence risk made on the basis of the structured professional judgment model was acceptable (weighted kappa=.61). Structured final risk judgments were significantly predictive of postrelease community violence, yielding moderate to large effect sizes. Event history analyses showed that final risk judgments made with the structured professional judgment model added incremental validity to the HCR-20 used in an actuarial (numerical) sense. The findings support the structured professional judgment model of risk assessment as well as the HCR-20 specifically and suggest that clinical judgment, if made within a structured context, can contribute in meaningful ways to the assessment of violence risk.
Burt, Jenni; Abel, Gary; Elmore, Natasha; Campbell, John; Roland, Martin; Benson, John; Silverman, Jonathan
2014-03-06
To investigate initial reliability of the Global Consultation Rating Scale (GCRS: an instrument to assess the effectiveness of communication across an entire doctor-patient consultation, based on the Calgary-Cambridge guide to the medical interview), in simulated patient consultations. Multiple ratings of simulated general practitioner (GP)-patient consultations by trained GP evaluators. UK primary care. 21 GPs and six trained GP evaluators. GCRS score. 6 GP raters used GCRS to rate randomly assigned video recordings of GP consultations with simulated patients. Each of the 42 consultations was rated separately by four raters. We considered whether a fixed difference between scores had the same meaning at all levels of performance. We then examined the reliability of GCRS using mixed linear regression models. We augmented our regression model to also examine whether there were systematic biases between the scores given by different raters and to look for possible order effects. Assessing the communication quality of individual consultations, GCRS achieved a reliability of 0.73 (95% CI 0.44 to 0.79) for two raters, 0.80 (0.54 to 0.85) for three and 0.85 (0.61 to 0.88) for four. We found an average difference of 1.65 (on a 0-10 scale) in the scores given by the least and most generous raters: adjusting for this evaluator bias increased reliability to 0.78 (0.53 to 0.83) for two raters; 0.85 (0.63 to 0.88) for three and 0.88 (0.69 to 0.91) for four. There were considerable order effects, with later consultations (after 15-20 ratings) receiving, on average, scores more than one point higher on a 0-10 scale. GCRS shows good reliability with three raters assessing each consultation. We are currently developing the scale further by assessing a large sample of real-world consultations.
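The reported multi-rater reliabilities are roughly consistent with the Spearman-Brown relation applied to a single-rater reliability of about 0.57; the short check below is an assumption-laden back-calculation, not the mixed-model analysis the authors performed.

```python
# Spearman-Brown projection from a single-rater reliability to the reliability of k averaged raters.
def spearman_brown(single_rater_icc, k):
    return k * single_rater_icc / (1 + (k - 1) * single_rater_icc)

icc_1 = 0.57   # hypothetical single-rater reliability consistent with the reported figures
for k in (2, 3, 4):
    print(k, round(spearman_brown(icc_1, k), 2))   # approximately 0.73, 0.80, 0.84
```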
Are inspectors' assessments reliable? Ratings of NHS acute hospital trust services in England.
Boyd, Alan; Addicott, Rachael; Robertson, Ruth; Ross, Shilpa; Walshe, Kieran
2017-01-01
The credibility of a regulator could be threatened if stakeholders perceive that assessments of performance made by its inspectors are unreliable. Yet there is little published research on the reliability of inspectors' assessments of health care organizations' services. Objectives We investigated the inter-rater reliability of assessments made by inspectors inspecting acute hospitals in England during the piloting of a new regulatory model implemented by the Care Quality Commission (CQC) during 2013 and 2014. Multi-professional teams of inspectors rated service provision on a four-point scale for each of five domains: safety; effectiveness; caring; responsiveness; and leadership. Methods In an online survey, we asked individual inspectors to assign a domain and a rating to each of 10 vignettes of service information extracted from CQC inspection reports. We used these data to simulate the ratings that might be produced by teams of inspectors. We also observed inspection teams in action, and interviewed inspectors and staff from hospitals that had been inspected. Results Levels of agreement varied substantially from vignette to vignette. Characteristics such as professional background explained only a very small part of the variation. Overall, agreement was higher on ratings than on domains, and for groups of inspectors compared with individual inspectors. A number of potential causes of disagreement were identified, such as differences regarding the weight that should be given to contextual factors and general uncertainty about interpreting the rating and domain categories. Conclusion Groups of inspectors produced more reliable assessments than individual inspectors, and there is evidence to support the utility of appropriate discussions between inspectors in improving reliability. The reliability of domain allocations was lower than for ratings. It is important to define categories and rating levels clearly, and to train inspectors in their use. Further research is needed to replicate these results now that the model has been fully implemented, and to understand better the impact that inspector uncertainty and disagreement may have on published CQC ratings.
The Yale-Brown Obsessive Compulsive Scale: A Reliability Generalization Meta-Analysis.
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Maria; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-10-01
The Yale-Brown Obsessive Compulsive Scale (Y-BOCS) is the most frequently applied test to assess obsessive compulsive symptoms. We conducted a reliability generalization meta-analysis on the Y-BOCS to estimate the average reliability, examine the variability among the reliability estimates, search for moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the Y-BOCS. We included studies where the Y-BOCS was applied to a sample of adults and reliability estimate was reported. Out of the 11,490 references located, 144 studies met the selection criteria. For the total scale, the mean reliability was 0.866 for coefficients alpha, 0.848 for test-retest correlations, and 0.922 for intraclass correlations. The moderator analyses led to a predictive model where the standard deviation of the total test and the target population (clinical vs. nonclinical) explained 38.6% of the total variability among coefficients alpha. Finally, clinical implications of the results are discussed. © The Author(s) 2014.
Wang, Bowen; Xiong, Haitao; Jiang, Chengrui
2014-01-01
As a hot topic in supply chain management, fuzzy methods have been widely used in logistics center location selection to improve the reliability and suitability of the selection with respect to the impacts of both qualitative and quantitative factors. However, they do not consider the consistency or the historical assessment accuracy of the experts involved in prior decisions. This paper therefore proposes a multicriteria decision-making model based on the credibility of decision makers, introducing a consistency and historical assessment accuracy mechanism into the fuzzy multicriteria decision-making approach. In this way, only decision makers who pass the credibility check are qualified to perform the further assessment. Finally, a practical example is analyzed to illustrate how to use the model. The result shows that the fuzzy multicriteria decision-making model based on the credibility mechanism can improve the reliability and suitability of site selection for the logistics center.
Lubans, David R; Smith, Jordan J; Harries, Simon K; Barnett, Lisa M; Faigenbaum, Avery D
2014-05-01
The aim of this study was to describe the development and assess test-retest reliability and construct validity of the Resistance Training Skills Battery (RTSB) for adolescents. The RTSB provides an assessment of resistance training skill competency and includes 6 exercises (i.e., body weight squat, push-up, lunge, suspended row, standing overhead press, and front support with chest touches). Scoring for each skill is based on the number of performance criteria successfully demonstrated. An overall resistance training skill quotient (RTSQ) is created by adding participants' scores for the 6 skills. Participants (44 boys and 19 girls, mean age = 14.5 ± 1.2 years) completed the RTSB on 2 occasions separated by 7 days. Participants also completed the following fitness tests, which were used to create a muscular fitness score (MFS): handgrip strength, timed push-up, and standing long jump tests. Intraclass correlation (ICC), paired samples t-tests, and typical error were used to assess test-retest reliability. To assess construct validity, gender and RTSQ were entered into a regression model predicting MFS. The rank order repeatability of the RTSQ was high (ICC = 0.88). The model explained 39% of the variance in MFS (p ≤ 0.001) and RTSQ (r = 0.40, p ≤ 0.001) was a significant predictor. This study has demonstrated the construct validity and test-retest reliability of the RTSB in a sample of adolescents. The RTSB can reliably rank participants in regards to their resistance training competency and has the necessary sensitivity to detect small changes in resistance training skill proficiency.
Litzenburger, Friederike; Heck, Katrin; Pitchika, Vinay; Neuhaus, Klaus W; Jost, Fabian N; Hickel, Reinhard; Jablonski-Momeni, Anahita; Welk, Alexander; Lederer, Alexander; Kühnisch, Jan
2018-02-01
The purpose of this in vitro study was to evaluate the inter- and intraexaminer reliability of digital bitewing (DBW) radiography and near-infrared light transillumination (NIRT) for proximal caries detection and assessment in posterior teeth. From a pool of 85 patients, 100 corresponding pairs of DBW and NIRT images (~1/3 healthy, ~1/3 with enamel caries and ~1/3 with dentin caries) were chosen. Twelve dentists with different professional status and clinical experience repeated the evaluation in two blinded cycles. Two experienced dentists provided a reference diagnosis after analysing all images independently. Statistical analysis included the calculation of simple kappa (κ) and weighted kappa (wκ) values as a measure of reliability. Logistic regression with a backward elimination model was used to investigate the influence of the diagnostic method, evaluation cycle, type of tooth, and clinical experience on reliability. Altogether, inter- and intraexaminer reliability exhibited good to excellent κ and wκ values for DBW radiography (Inter: κ = 0.60/0.63; wκ = 0.74/0.76; Intra: κ = 0.64; wκ = 0.77) and NIRT (Inter: κ = 0.74/0.64; wκ = 0.86/0.82; Intra: κ = 0.68; wκ = 0.84). The backward elimination model revealed NIRT to be significantly more reliable than DBW radiography. This study revealed a good to excellent inter- and intraexaminer reliability for proximal caries detection using DBW and NIRT images. The logistic regression analysis revealed significantly better reliability for NIRT. Additionally, the first evaluation cycle was more reliable according to the reference diagnoses.
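A minimal sketch of the simple and weighted kappa statistics reported above, using scikit-learn; the two examiners' ordinal scores (0 = sound, 1 = enamel caries, 2 = dentin caries) are invented for illustration and do not reproduce the study's data.

```python
# Minimal sketch: simple and weighted kappa for two examiners scoring the same
# images on an ordinal scale (0 = sound, 1 = enamel caries, 2 = dentin caries).
# The example ratings are invented; the study used 12 examiners and 100 images.
from sklearn.metrics import cohen_kappa_score

examiner_a = [0, 0, 1, 2, 2, 1, 0, 2, 1, 1]
examiner_b = [0, 1, 1, 2, 2, 1, 0, 1, 1, 2]

kappa          = cohen_kappa_score(examiner_a, examiner_b)
weighted_kappa = cohen_kappa_score(examiner_a, examiner_b, weights="linear")

print(round(kappa, 2), round(weighted_kappa, 2))
```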
Reliability modelling and analysis of thermal MEMS
NASA Astrophysics Data System (ADS)
Muratet, Sylvaine; Lavu, Srikanth; Fourniols, Jean-Yves; Bell, George; Desmulliez, Marc P. Y.
2006-04-01
This paper presents a MEMS reliability study methodology based on the novel concept of 'virtual prototyping'. This methodology can be used for the development of reliable sensors or actuators and also to characterize their behaviour under specific use conditions and applications. The methodology is demonstrated on a U-shaped micro electrothermal actuator used as a test vehicle. To demonstrate this approach, a 'virtual prototype' has been developed with the modeling tools MATLAB and VHDL-AMS. A best-practice FMEA (Failure Mode and Effect Analysis) is applied to the thermal MEMS to investigate and assess the failure mechanisms. The reliability study is performed by injecting the identified faults into the 'virtual prototype'. The reliability characterization methodology predicts the evolution of the behavior of these MEMS as a function of the number of cycles of operation and of specific operational conditions.
Comparing the reliability of related populations with the probability of agreement
Stevens, Nathaniel T.; Anderson-Cook, Christine M.
2016-07-26
Combining information from different populations to improve precision, simplify future predictions, or improve underlying understanding of relationships can be advantageous when considering the reliability of several related sets of systems. Using the probability of agreement to help quantify the similarities of populations can give a realistic assessment of whether the systems have reliabilities that are sufficiently similar for practical purposes to be treated as a homogeneous population. The new method is described and illustrated with an example involving two generations of a complex system where the reliability is modeled using either a logistic or probit regression model. Supplementary materials including code, datasets, and added discussion are available online.
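A minimal sketch of the probability-of-agreement idea under assumed logistic reliability curves for two generations of a system; the parameter values and the agreement margin are invented, and the published method also propagates parameter uncertainty, which this sketch omits.

```python
# Minimal sketch of the probability-of-agreement idea: assume (or fit) a
# logistic reliability curve for each population and ask over what fraction of
# the operating range the two curves differ by less than a practical margin
# delta. Parameter values are invented for illustration.
import numpy as np

def reliability(x, b0, b1):
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

x = np.linspace(0.0, 10.0, 1001)           # operating condition (e.g., age)
gen1 = reliability(x, 4.0, -0.60)          # older generation of the system
gen2 = reliability(x, 4.3, -0.58)          # newer generation of the system

delta = 0.05                               # practically unimportant difference
prob_agreement = np.mean(np.abs(gen1 - gen2) < delta)
print(round(prob_agreement, 3))
```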
Quantitative metal magnetic memory reliability modeling for welded joints
NASA Astrophysics Data System (ADS)
Xing, Haiyan; Dang, Yongbin; Wang, Ben; Leng, Jiancheng
2016-03-01
Metal magnetic memory (MMM) testing has been widely used to inspect welded joints. However, load levels, environmental magnetic fields, and measurement noise make the MMM data dispersive and difficult to evaluate quantitatively. In order to promote the development of quantitative MMM reliability assessment, a new MMM model is presented for welded joints. Steel Q235 welded specimens are tested along longitudinal and horizontal lines with a TSC-2M-8 instrument in tensile fatigue experiments. X-ray testing is carried out synchronously to verify the MMM results. It is found that MMM testing can detect hidden cracks earlier than X-ray testing. Moreover, the MMM gradient vector sum K_vs is sensitive to the damage degree, especially at the early and hidden damage stages. Considering the dispersion of MMM data, the statistical law of K_vs is investigated, which shows that K_vs obeys a Gaussian distribution, so K_vs is a suitable MMM parameter for establishing a reliability model of welded joints. Finally, a quantitative MMM reliability model is presented, based on improved stress-strength interference theory. It is shown that the reliability degree R gradually decreases as the residual life ratio T decreases, and the maximal error between the predicted reliability degree R_1 and the verified reliability degree R_2 is 9.15%. The presented method provides a novel tool for reliability testing and evaluation of welded joints in practical engineering.
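For orientation, a minimal sketch of classical Gaussian stress-strength interference, which the paper improves upon in a way the abstract does not detail; the K_vs means and standard deviations below are invented.

```python
# Minimal sketch of classical stress-strength interference with Gaussian
# variables: R = P(strength > stress) = Phi((mu_S - mu_L) / sqrt(s_S^2 + s_L^2)).
# The abstract's "improved" interference model is not detailed there, and the
# K_vs statistics below are invented for illustration.
from math import sqrt
from scipy.stats import norm

mu_strength, sd_strength = 120.0, 12.0   # hypothetical critical K_vs level
mu_stress,   sd_stress   =  90.0, 15.0   # hypothetical measured K_vs at current damage

beta = (mu_strength - mu_stress) / sqrt(sd_strength**2 + sd_stress**2)
reliability_degree = norm.cdf(beta)
print(round(reliability_degree, 4))
```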
Long-term reliability of ImPACT in professional ice hockey.
Echemendia, Ruben J; Bruce, Jared M; Meeuwisse, Willem; Comper, Paul; Aubry, Mark; Hutchison, Michael
2016-02-01
This study sought to assess the test-retest reliability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) across 2-4 year time intervals and evaluate the utility of a newly proposed two-factor (Speed/Memory) model of ImPACT across multiple language versions. Test-retest data were collected from non-concussed National Hockey League (NHL) players across 2-, 3-, and 4-year time intervals. The two-factor model was examined using different language versions (English, French, Czech, Swedish) of the test using a one-year interval, and across 2-4 year intervals using the English version of the test. The two-factor Speed index improved reliability across multiple language versions of ImPACT. The Memory factor also improved but reliability remained below the traditional cutoff of .70 for use in clinical decision-making. ImPACT reliabilities remained low (below .70) regardless of whether the four-composite or the two-factor model was used across 2-, 3-, and 4-year time intervals. The two-factor approach increased ImPACT's one-year reliability over the traditional four-composite model among NHL players. The increased stability in test scores improves the test's ability to detect cognitive changes following injury, which increases the diagnostic utility of the test and allows for better return to play decision-making by reducing the risk of exposing an athlete to additional trauma while the brain may be at a heightened vulnerability to such trauma. Although the Speed Index increases the clinical utility of the test, the stability of the Memory index remains low. Irrespective of whether the two-factor or traditional four-composite approach is used, these data suggest that new baselines should occur on a yearly basis in order to maximize clinical utility.
NASA Astrophysics Data System (ADS)
Goh, A. T. C.; Kulhawy, F. H.
2005-05-01
In urban environments, one major concern with deep excavations in soft clay is the potentially large ground deformations in and around the excavation. Excessive movements can damage adjacent buildings and utilities. There are many uncertainties associated with the calculation of the ultimate or serviceability performance of a braced excavation system. These include the variabilities of the loadings, geotechnical soil properties, and engineering and geometrical properties of the wall. A risk-based approach to serviceability performance failure is necessary to incorporate systematically the uncertainties associated with the various design parameters. This paper demonstrates the use of an integrated neural network-reliability method to assess the risk of serviceability failure through the calculation of the reliability index. By first performing a series of parametric studies using the finite element method and then approximating the non-linear limit state surface (the boundary separating the safe and failure domains) through a neural network model, the reliability index can be determined with the aid of a spreadsheet. Two illustrative examples are presented to show how the serviceability performance for braced excavation problems can be assessed using the reliability index.
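A minimal sketch of the general workflow (surrogate model of a limit state, then Monte Carlo conversion of the failure probability into a reliability index); the toy limit state function stands in for the finite element results and is not the paper's braced-excavation model.

```python
# Minimal sketch: approximate a limit state function with a neural network
# surrogate trained on sampled "FEM" results, then estimate the failure
# probability by Monte Carlo and convert it to a reliability index
# beta = -Phi^{-1}(Pf). The toy limit state g(x) is invented.
import numpy as np
from scipy.stats import norm
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
mean, sd = np.array([2.0, 1.0]), np.array([1.0, 0.8])   # random design variables

def g(x):                                   # toy limit state: g < 0 means failure
    return 6.0 - x[:, 0] - 0.5 * x[:, 1] ** 2

# "FEM results": limit state evaluated at a modest number of sampled designs
X_train = rng.normal(mean, sd, size=(2000, 2))
surrogate = MLPRegressor(hidden_layer_sizes=(16, 16), solver="lbfgs",
                         max_iter=2000, random_state=0).fit(X_train, g(X_train))

# Monte Carlo on the cheap surrogate, then convert Pf to a reliability index
X_mc = rng.normal(mean, sd, size=(200_000, 2))
pf = np.mean(surrogate.predict(X_mc) < 0.0)
beta = -norm.ppf(pf)
print(f"Pf ~ {pf:.4f}, reliability index beta ~ {beta:.2f}")
```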
PROBABILISTIC AQUATIC EXPOSURE ASSESSMENT FOR PESTICIDES 1: FOUNDATIONS
Models that capture underlying mechanisms and processes are necessary for reliable extrapolation of laboratory chemical data to field conditions. For validation, these models require a major revision of the conventional model testing paradigm to better recognize the conflict betw...
Spatial surplus production modeling of Atlantic tunas and billfish.
Carruthers, Thomas R; McAllister, Murdoch K; Taylor, Nathan G
2011-10-01
We formulate and simulation-test a spatial surplus production model that provides a basis with which to undertake multispecies, multi-area, stock assessment. Movement between areas is parameterized using a simple gravity model that includes a "residency" parameter that determines the degree of stock mixing among areas. The model is deliberately simple in order to (1) accommodate nontarget species that typically have fewer available data and (2) minimize computational demand to enable simulation evaluation of spatial management strategies. Using this model, we demonstrate that careful consideration of spatial catch and effort data can provide the basis for simple yet reliable spatial stock assessments. If simple spatial dynamics can be assumed, tagging data are not required to reliably estimate spatial distribution and movement. When applied to eight stocks of Atlantic tuna and billfish, the model tracks regional catch data relatively well by approximating local depletions and exchange among high-abundance areas. We use these results to investigate and discuss the implications of using spatially aggregated stock assessment for fisheries in which the distribution of both the population and fishing vary over time.
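A minimal sketch of a gravity-type movement matrix with a residency parameter; the attractiveness values and residency weight are invented, and the paper's exact parameterization may differ.

```python
# Minimal sketch of a gravity-type movement matrix: the probability of moving
# to an area is proportional to that area's "attractiveness", while a residency
# parameter inflates the probability of staying put. Values are invented and
# the paper's exact parameterization may differ.
import numpy as np

attractiveness = np.array([1.0, 3.0, 2.0])   # e.g., relative habitat size per area
residency = 4.0                              # > 0 favours staying in the same area

n = len(attractiveness)
gravity = np.tile(attractiveness, (n, 1))    # columns = destination areas
gravity[np.diag_indices(n)] *= residency     # extra weight on staying

movement = gravity / gravity.sum(axis=1, keepdims=True)
print(movement.round(3))                     # rows sum to 1 (origin -> destination)
```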
2011-01-01
Background Variation in assessments is a universal given, and work disability assessments by insurance physicians are no exception. Little is known about the considerations and views of insurance physicians that may partly explain such variation. On the basis of the Attitude - Social norm - self Efficacy (ASE) model, we have developed measurement instruments for assessment behaviour and its determinants. Methods Based on theory and interviews with insurance physicians the questionnaire included blocks of items concerning background variables, intentions, attitudes, social norms, self-efficacy, knowledge, barriers and behaviour of the insurance physicians in relation to work disability assessment issues. The responses of 231 insurance physicians were suitable for further analysis. Factor analysis and reliability analysis were used to form scale variables and homogeneity analysis was used to form dimension variables. Thus, we included 169 of the 177 original items. Results Factor analysis and reliability analysis yielded 29 scales with sufficient reliability. Homogeneity analysis yielded 19 dimensions. Scales and dimensions fitted with the concepts of the ASE model. We slightly modified the ASE model by dividing behaviour into two blocks: behaviour that reflects the assessment process and behaviour that reflects assessment behaviour. The picture that emerged from the descriptive results was of a group of physicians who were motivated in their job and positive about the Dutch social security system in general. However, only half of them had a positive opinion about the Dutch Work and Income (Capacity for Work) Act (WIA). They also reported serious barriers, the most common of which was work pressure. Finally, 73% of the insurance physicians described the majority of their cases as 'difficult'. Conclusions The scales and dimensions developed appear to be valid and offer a promising basis for future research. The results suggest that the underlying ASE model, in modified form, is suitable for describing the assessment behaviour of insurance physicians and the determinants of this behaviour. The next step in this line of research should be to validate the model using structural equation modelling. Finally, the predictive value should be tested in relation to outcome measurements of work disability assessments. PMID:21199570
Mansberger, Steven L; Sheppler, Christina R; McClure, Tina M; Vanalstine, Cory L; Swanson, Ingrid L; Stoumbos, Zoey; Lambert, William E
2013-09-01
To report the psychometrics of the Glaucoma Treatment Compliance Assessment Tool (GTCAT), a new questionnaire designed to assess adherence with glaucoma therapy. We developed the questionnaire according to the constructs of the Health Belief Model. We evaluated the questionnaire using data from a cross-sectional study with focus groups (n = 20) and a prospective observational case series (n = 58). Principal components analysis provided assessment of construct validity. We repeated the questionnaire after 3 months for test-retest reliability. We evaluated predictive validity using an electronic dosing monitor as an objective measure of adherence. Focus group participants provided 931 statements related to adherence, of which 88.7% (826/931) could be categorized into the constructs of the Health Belief Model. Perceived barriers accounted for 31% (288/931) of statements, cues-to-action 14% (131/931), susceptibility 12% (116/931), benefits 12% (115/931), severity 10% (91/931), and self-efficacy 9% (85/931). The principal components analysis explained 77% of the variance with five components representing Health Belief Model constructs. Reliability analyses showed acceptable Cronbach's alphas (>.70) for four of the seven components (severity, susceptibility, barriers [eye drop administration], and barriers [discomfort]). Predictive validity was high, with several Health Belief Model questions significantly associated (P < .05) with adherence and a correlation coefficient (R²) of .40. Test-retest reliability was 90%. The GTCAT shows excellent repeatability, content, construct, and predictive validity for glaucoma adherence. A multisite trial is needed to determine whether the results can be generalized and whether the questionnaire accurately measures the effect of interventions to increase adherence.
Campos, Juliana Alvares Duarte Bonini; Spexoto, Maria Cláudia Bernardes; da Silva, Wanderson Roberto; Serrano, Sergio Vicente; Marôco, João
2018-01-01
Objective To evaluate the psychometric properties of the seven theoretical models proposed in the literature for the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire Core 30 (EORTC QLQ-C30), when applied to a sample of Brazilian cancer patients. Methods Content and construct validity (factorial, convergent, discriminant) were estimated. Confirmatory factor analysis was performed. Convergent validity was analyzed using the average variance extracted. Discriminant validity was analyzed using correlational analysis. Internal consistency and composite reliability were used to assess the reliability of the instrument. Results A total of 1,020 cancer patients participated. The mean age was 53.3±13.0 years, and 62% were female. All models showed adequate factorial validity for the study sample. Convergent and discriminant validities and the reliability were compromised in all of the models for all of the single items referring to symptoms, as well as for the “physical function” and “cognitive function” factors. Conclusion All theoretical models assessed in this study presented adequate factorial validity when applied to Brazilian cancer patients. The choice of the best model for use in research and/or clinical protocols should be centered on the purpose and underlying theory of each model. PMID:29694609
Alonso, Ariel; Laenen, Annouschka
2013-05-01
Laenen, Alonso, and Molenberghs (2007) and Laenen, Alonso, Molenberghs, and Vangeneugden (2009) proposed a method to assess the reliability of rating scales in a longitudinal context. The methodology is based on hierarchical linear models, and reliability coefficients are derived from the corresponding covariance matrices. However, finding a good parsimonious model to describe complex longitudinal data is a challenging task. Frequently, several models fit the data equally well, raising the problem of model selection uncertainty. When model uncertainty is high one may resort to model averaging, where inferences are based not on one but on an entire set of models. We explored the use of different model building strategies, including model averaging, in reliability estimation. We found that the approach introduced by Laenen et al. (2007, 2009) combined with some of these strategies may yield meaningful results in the presence of high model selection uncertainty and when all models are misspecified, in so far as some of them manage to capture the most salient features of the data. Nonetheless, when all models omit prominent regularities in the data, misleading results may be obtained. The main ideas are further illustrated on a case study in which the reliability of the Hamilton Anxiety Rating Scale is estimated. Importantly, the ambit of model selection uncertainty and model averaging transcends the specific setting studied in the paper and may be of interest in other areas of psychometrics. © 2012 The British Psychological Society.
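A minimal sketch of one common model-averaging scheme (Akaike weights) applied to per-model reliability estimates; the AIC values and reliability estimates are invented, and the paper's averaging approach may differ in detail.

```python
# Minimal sketch: average reliability estimates from competing longitudinal
# models using Akaike weights, rather than picking a single "best" model.
# The AIC values and per-model reliability estimates are invented.
import numpy as np

aic = np.array([1012.4, 1013.1, 1015.8, 1020.2])        # candidate models
reliability = np.array([0.78, 0.74, 0.81, 0.69])         # estimate from each model

delta = aic - aic.min()
weights = np.exp(-0.5 * delta)
weights /= weights.sum()

r_averaged = np.sum(weights * reliability)
print(weights.round(3), round(r_averaged, 3))
```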
Electronic Quality of Life Assessment Using Computer-Adaptive Testing
2016-01-01
Background Quality of life (QoL) questionnaires are desirable for clinical practice but can be time-consuming to administer and interpret, making their widespread adoption difficult. Objective Our aim was to assess the performance of the World Health Organization Quality of Life (WHOQOL)-100 questionnaire as four item banks to facilitate adaptive testing using simulated computer adaptive tests (CATs) for physical, psychological, social, and environmental QoL. Methods We used data from the UK WHOQOL-100 questionnaire (N=320) to calibrate item banks using item response theory, which included psychometric assessments of differential item functioning, local dependency, unidimensionality, and reliability. We simulated CATs to assess the number of items administered before prespecified levels of reliability was met. Results The item banks (40 items) all displayed good model fit (P>.01) and were unidimensional (fewer than 5% of t tests significant), reliable (Person Separation Index>.70), and free from differential item functioning (no significant analysis of variance interaction) or local dependency (residual correlations < +.20). When matched for reliability, the item banks were between 45% and 75% shorter than paper-based WHOQOL measures. Across the four domains, a high standard of reliability (alpha>.90) could be gained with a median of 9 items. Conclusions Using CAT, simulated assessments were as reliable as paper-based forms of the WHOQOL with a fraction of the number of items. These properties suggest that these item banks are suitable for computerized adaptive assessment. These item banks have the potential for international development using existing alternative language versions of the WHOQOL items. PMID:27694100
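A minimal sketch of a CAT loop under a two-parameter IRT model, stopping once the standard error implies the target reliability; the item bank, respondent and stopping rule are simulated assumptions, not the WHOQOL item banks or the study's exact algorithm.

```python
# Minimal sketch of a CAT loop under a 2-parameter IRT model: administer the
# most informative remaining item, re-estimate ability on a grid, and stop once
# the standard error implies the target reliability (rel = 1 - SE^2 when the
# ability scale has unit variance). Item parameters and the simulated
# respondent are invented.
import numpy as np

rng = np.random.default_rng(1)
n_items = 40
a = rng.uniform(1.5, 2.5, n_items)               # discriminations
b = rng.normal(0.0, 1.0, n_items)                # difficulties
true_theta = 0.8
grid = np.linspace(-4, 4, 161)

def prob(theta, a, b):                           # 2PL response probability
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

administered, responses = [], []
theta_hat, se = 0.0, np.inf
for _ in range(n_items):
    p_cur = prob(theta_hat, a, b)
    info = a**2 * p_cur * (1 - p_cur)
    info[administered] = -np.inf                 # do not reuse items
    item = int(np.argmax(info))
    administered.append(item)
    responses.append(rng.random() < prob(true_theta, a[item], b[item]))

    # grid-based likelihood for theta given the responses so far
    p = prob(grid[:, None], a[administered], b[administered])
    like = np.prod(np.where(responses, p, 1 - p), axis=1)
    best = int(np.argmax(like))
    theta_hat = grid[best]
    se = 1.0 / np.sqrt(np.sum(a[administered]**2 * p[best] * (1 - p[best])))
    if 1.0 - se**2 >= 0.90:                      # target reliability reached
        break

print(f"items used: {len(administered)}, theta = {theta_hat:.2f}, SE = {se:.2f}")
```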
ERIC Educational Resources Information Center
Bulut, Okan
2013-01-01
The importance of subscores in educational and psychological assessments is undeniable. Subscores yield diagnostic information that can be used for determining how each examinee's abilities/skills vary over different content domains. One of the most common criticisms about reporting and using subscores is insufficient reliability of subscores.…
Öksüz, Çigdem; Alemdaroglu, Ipek; Kilinç, Muhammed; Abaoğlu, Hatice; Demirci, Cevher; Karahan, Sevilay; Yilmaz, Oznur; Yildirim, Sibel Aksu
2017-10-01
This study was performed to examine the reliability and validity of the Turkish version of the ABILHAND-Kids questionnaire, which assesses the manual functions of children with neuromuscular diseases (NMDs). A cross-sectional survey study design and Rasch analysis were used to assess the reliability and validity of the Turkish version of the scale. Ninety-three children with different neuromuscular disorders and their parents were included in the study. The scale was applied to the parents in face-to-face interviews twice: on their first visit and after an interval of 15 days. Test-retest reliability was assessed with the intraclass correlation coefficient (ICC), and the internal consistency of the multi-item subscales by calculating Cronbach's alpha values. Correlations with the Brooke Upper Extremity Functional Classification (BUEFC) and the Wee-Functional Independency Measurement (Wee-FIM) were used to determine construct validity. The ICC value for test-retest reliability was 0.94. The internal consistency was 0.81. Floor (1.1%) and ceiling (11.8%) effects were not significant. There were moderate correlations between the Turkish version of ABILHAND-Kids and Wee-FIM (0.67) and BUEFC (-0.37). Rasch analysis indicated good item fit, unidimensionality, and model fit. The Turkish version of the ABILHAND-Kids questionnaire was found to be a reliable and valid scale for the assessment of the manual ability of children with NMDs.
Space station electrical power system availability study
NASA Technical Reports Server (NTRS)
Turnquist, Scott R.; Twombly, Mark A.
1988-01-01
ARINC Research Corporation performed a preliminary reliability, availability, and maintainability (RAM) analysis of the NASA space station Electrical Power System (EPS). The analysis was performed using the UNIRAM RAM assessment methodology and software program developed by ARINC Research, and was carried out in two phases: EPS modeling and EPS RAM assessment. The EPS was modeled in four parts: the insolar power generation system, the eclipse power generation system, the power management and distribution system (both ring and radial power distribution control unit (PDCU) architectures), and the power distribution to the inner keel PDCUs. The EPS RAM assessment was conducted in five steps: using UNIRAM to perform baseline EPS model analyses and determine orbital replacement unit (ORU) criticalities; determining EPS sensitivity to on-orbit sparing of ORUs and indicating which ORUs may need to be spared on-orbit; determining EPS sensitivity to changes in ORU reliability; determining the expected annual number of ORU failures; and integrating the power generation system model results with the distribution system model results to assess the full EPS. Conclusions were drawn and recommendations were made.
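As a rough illustration of the kind of availability arithmetic that underlies a RAM assessment, the sketch below combines steady-state availabilities of hypothetical orbital replacement units in series and in redundancy; it is not the UNIRAM methodology itself and the figures are invented.

```python
# Minimal sketch of availability arithmetic behind a RAM model: steady-state
# availability of an orbital replacement unit (ORU) from MTBF and MTTR,
# combined for units in series and in active redundancy. The numbers are
# invented and this is not the UNIRAM methodology itself.
def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

a_solar   = availability(20_000, 48)      # hypothetical ORU figures
a_pdcu    = availability(50_000, 24)
a_battery = availability(15_000, 72)

series_availability = a_solar * a_pdcu * a_battery          # all must work
redundant_pdcu      = 1 - (1 - a_pdcu) ** 2                  # two PDCUs in parallel

print(round(series_availability, 5), round(redundant_pdcu, 8))
```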
NASA Astrophysics Data System (ADS)
Yusliana Ekawati, Elvin
2017-01-01
This study aimed to produce a model for assessing scientific attitudes by observation in physics learning based on the scientific approach (a case study of the dynamic fluid topic in high school). The development of the instruments adapted the Plomp model; the procedure included initial investigation, design, construction, testing, evaluation and revision. Testing was done in Surakarta, and the data obtained were analyzed using the Aiken formula to determine the content validity of the instrument, Cronbach's alpha to determine its reliability, and confirmatory factor analysis with the LISREL 8.50 program to determine construct validity. The results of this research were conceptual models, instruments and guidelines for scientific attitude assessment by observation. The constructs assessed by the instruments include components of curiosity, objectivity, suspended judgment, open-mindedness, honesty and perseverance. The construct validity of the instruments was satisfactory (factor loadings > 0.3). The reliability of the model is quite good, with an alpha value of 0.899 (> 0.7). The test showed that the model fits: the theoretical model is supported by the empirical data, with a p-value of 0.315 (≥ 0.05) and RMSEA of 0.027 (≤ 0.08).
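A minimal sketch of Aiken's V content-validity coefficient referred to above; the expert ratings are invented for illustration.

```python
# Minimal sketch of Aiken's V for content validity: V = sum(s) / (n * (c - 1)),
# where s is each expert's rating minus the lowest rating category, n is the
# number of experts and c the number of rating categories. The expert ratings
# below are invented for illustration.
import numpy as np

ratings = np.array([4, 5, 4, 4, 5, 3, 4])     # 7 experts, 1-5 scale, one item
lowest, c = 1, 5

s = ratings - lowest
V = s.sum() / (len(ratings) * (c - 1))
print(round(V, 3))                            # values closer to 1 = better content validity
```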
NASA Astrophysics Data System (ADS)
Rahardianto, Trias; Saputra, Aditya; Gomez, Christopher
2017-07-01
Research on landslide susceptibility has evolved rapidly over the last few decades thanks to the availability of large databases. Landslide research used to focus on discrete events, but the use of large inventory datasets has become a central pillar of landslide susceptibility, hazard, and risk assessment. Extracting meaningful information from large databases is now at the forefront of geoscientific research, following the big-data trend. The more comprehensive the information on past landslides available in a particular area, the better the produced map will be at supporting effective decision making, planning, and engineering practice. Landslide inventory data that are freely accessible online give many researchers and decision makers an opportunity to prevent casualties and economic loss caused by future landslides. These data are especially advantageous for areas with poor landslide historical records. Since the construction criteria for landslide inventory maps and their quality evaluation remain poorly defined, an assessment of the reliability of open-source landslide inventory maps is required. The present contribution aims to assess the reliability of open-source landslide inventory data based on the particular topographical setting of the observed area in Niigata prefecture, Japan. A Geographic Information System (GIS) platform and a statistical approach are applied to analyze the data. The frequency ratio method is used to model and assess the landslide map. The generated model showed unsatisfactory results, with an AUC value of 0.603 indicating low prediction accuracy and unreliability of the model.
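A minimal sketch of the frequency ratio calculation and an AUC check against the inventory; the class counts, cell assignments and labels are invented and do not reproduce the Niigata analysis.

```python
# Minimal sketch of the frequency ratio (FR) method: for each class of a
# conditioning factor, FR = (% of landslide cells in the class) /
# (% of all cells in the class); cells are then scored by the FR of the class
# they fall in, and the score is checked against the inventory with an AUC.
# All counts and labels below are invented.
import numpy as np
from sklearn.metrics import roc_auc_score

# slope classes: counts of landslide cells and of all cells per class
landslide_cells = np.array([ 10,  60, 180, 140,  40])
total_cells     = np.array([900, 800, 700, 500, 300])

fr = (landslide_cells / landslide_cells.sum()) / (total_cells / total_cells.sum())
print(fr.round(2))                     # FR > 1 marks classes prone to landslides

# toy susceptibility check: score each cell by its class FR, compare to inventory
rng = np.random.default_rng(0)
classes = rng.integers(0, 5, 2000)
is_landslide = rng.random(2000) < (landslide_cells / total_cells)[classes]
auc = roc_auc_score(is_landslide, fr[classes])
print(round(auc, 3))
```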
Hoben, Matthias; Estabrooks, Carole A.; Squires, Janet E.; Behrens, Johann
2016-01-01
We translated the Canadian residential long term care versions of the Alberta Context Tool (ACT) and the Conceptual Research Utilization (CRU) Scale into German, to study the association between organizational context factors and research utilization in German nursing homes. The rigorous translation process was based on best practice guidelines for tool translation, and we previously published methods and results of this process in two papers. Both instruments are self-report questionnaires used with care providers working in nursing homes. The aim of this study was to assess the factor structure, reliability, and measurement invariance (MI) between care provider groups responding to these instruments. In a stratified random sample of 38 nursing homes in one German region (Metropolregion Rhein-Neckar), we collected questionnaires from 273 care aides, 196 regulated nurses, 152 allied health providers, 6 quality improvement specialists, 129 clinical leaders, and 65 nursing students. The factor structure was assessed using confirmatory factor models. The first model included all 10 ACT concepts. We also decided a priori to run two separate models for the scale-based and the count-based ACT concepts as suggested by the instrument developers. The fourth model included the five CRU Scale items. Reliability scores were calculated based on the parameters of the best-fitting factor models. Multiple-group confirmatory factor models were used to assess MI between provider groups. Rather than the hypothesized ten-factor structure of the ACT, confirmatory factor models suggested 13 factors. The one-factor solution of the CRU Scale was confirmed. The reliability was acceptable (>0.7 in the entire sample and in all provider groups) for 10 of 13 ACT concepts, and high (0.90–0.96) for the CRU Scale. We could demonstrate partial strong MI for both ACT models and partial strict MI for the CRU Scale. Our results suggest that the scores of the German ACT and the CRU Scale for nursing homes are acceptably reliable and valid. However, as the ACT lacked strict MI, observed variables (or scale scores based on them) cannot be compared between provider groups. Rather, group comparisons should be based on latent variable models, which consider the different residual variances of each group. PMID:27656156
NASA Astrophysics Data System (ADS)
Zammouri, Mounira; Ribeiro, Luis
2017-05-01
A groundwater flow model of the transboundary Saharan aquifer system was developed in 2003 and is used for management and decision-making by Algeria, Tunisia and Libya. In decision-making processes, reliability plays a decisive role. This paper looks into the reliability assessment of the Saharan aquifer model and aims to detect the shortcomings of a model considered properly calibrated. After presenting the calibration results of the 2003 modelling effort, the uncertainty in the model arising from the scarcity of groundwater level and transmissivity data is analyzed using kriging and a stochastic approach. Structural analyses of the steady-state piezometry and of the logarithms of transmissivity were carried out for the Continental Intercalaire (CI) and the Complexe Terminal (CT) aquifers. The available data (piezometry and transmissivity) were compared to the calculated values using a geostatistical approach. Using a stochastic approach, 2500 realizations of a log-normal random transmissivity field of the CI aquifer were generated to assess the errors in the model output due to the uncertainty in transmissivity. Two types of poor calibration are shown: in some regions, calibration should be improved using the available data; in other areas, model refinement requires gathering new data to enhance knowledge of the aquifer system. The stochastic simulation results showed that the calculated drawdowns in 2050 could be higher than the values predicted by the calibrated model.
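A minimal sketch of propagating log-normal transmissivity uncertainty through a simple steady-state (Thiem) drawdown formula; the parameter values are invented, and the study's 2500 realizations were run through a calibrated regional model rather than this closed-form expression.

```python
# Minimal sketch: propagate a log-normal transmissivity uncertainty through a
# simple steady-state (Thiem) drawdown formula, s = Q / (2*pi*T) * ln(R / r),
# to see the spread that transmissivity uncertainty alone induces. Parameters
# are invented; the study used 2500 realizations in a calibrated regional model.
import numpy as np

rng = np.random.default_rng(0)

n_real = 2500
log10_T_mean, log10_T_sd = -2.3, 0.4             # hypothetical T statistics, m^2/s
T = 10.0 ** rng.normal(log10_T_mean, log10_T_sd, n_real)

Q, R, r = 0.05, 5000.0, 100.0                    # pumping rate, radius of influence, distance
drawdown = Q / (2 * np.pi * T) * np.log(R / r)

print(np.percentile(drawdown, [5, 50, 95]).round(2))   # spread of drawdown, m
```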
Støre-Valen, Jakob; Ryum, Truls; Pedersen, Geir A F; Pripp, Are H; Jose, Paul E; Karterud, Sigmund
2015-09-01
The Global Assessment of Functioning (GAF) Scale is used in routine clinical practice and research to estimate symptom and functional severity and longitudinal change. Concerns about poor interrater reliability have been raised, and the present study evaluated the effect of a Web-based GAF training program designed to improve interrater reliability in routine clinical practice. Clinicians rated up to 20 vignettes online, and received deviation scores as immediate feedback (i.e., own scores compared with expert raters) after each rating. Growth curves of absolute SD scores across the vignettes were modeled. A linear mixed effects model, using the clinician's deviation scores from expert raters as the dependent variable, indicated an improvement in reliability during training. Moderation by content of scale (symptoms; functioning), scale range (average; extreme), previous experience with GAF rating, profession, and postgraduate training were assessed. Training reduced deviation scores for inexperienced GAF raters, for individuals in clinical professions other than nursing and medicine, and for individuals with no postgraduate specialization. In addition, training was most beneficial for cases with average severity of symptoms compared with cases with extreme severity. The results support the use of Web-based training with feedback routines as a means to improve the reliability of GAF ratings performed by clinicians in mental health practice. These results especially pertain to clinicians in mental health practice who do not have a masters or doctoral degree. (c) 2015 APA, all rights reserved.
Nicholson, Patricia; Griffin, Patrick; Gillis, Shelley; Wu, Margaret; Dunning, Trisha
2013-09-01
Concern about the process of identifying underlying competencies that contribute to effective nursing performance has been debated with a lack of consensus surrounding an approved measurement instrument for assessing clinical performance. Although a number of methodologies are noted in the development of competency-based assessment measures, these studies are not without criticism. The primary aim of the study was to develop and validate a Performance Based Scoring Rubric, which included both analytical and holistic scales. The aim included examining the validity and reliability of the rubric, which was designed to measure clinical competencies in the operating theatre. The fieldwork observations of 32 nurse educators and preceptors assessing the performance of 95 instrument nurses in the operating theatre were used in the calibration of the rubric. The Rasch model, a particular model among Item Response Models, was used in the calibration of each item in the rubric in an attempt at improving the measurement properties of the scale. This is done by establishing the 'fit' of the data to the conditions demanded by the Rasch model. Acceptable reliability estimates, specifically a high Cronbach's alpha reliability coefficient (0.940), as well as empirical support for construct and criterion validity for the rubric were achieved. Calibration of the Performance Based Scoring Rubric using Rasch model revealed that the fit statistics for most items were acceptable. The use of the Rasch model offers a number of features in developing and refining healthcare competency-based assessments, improving confidence in measuring clinical performance. The Rasch model was shown to be useful in developing and validating a competency-based assessment for measuring the competence of the instrument nurse in the operating theatre with implications for use in other areas of nursing practice. Crown Copyright © 2012. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Miara, A.; Macknick, J.; Vorosmarty, C. J.; Corsi, F.; Fekete, B. M.; Newmark, R. L.; Tidwell, V. C.; Cohen, S. M.
2016-12-01
Thermoelectric plants supply 85% of electricity generation in the United States. Under a warming climate, the performance of these power plants may be reduced, as thermoelectric generation is dependent upon cool ambient temperatures and sufficient water supplies at adequate temperatures. In this study, we assess the vulnerability and reliability of 1,100 operational power plants (2015) across the contiguous United States under a comprehensive set of climate scenarios (five Global Circulation Models each with four Representative Concentration Pathways). We model individual power plant capacities using the Thermoelectric Power and Thermal Pollution model (TP2M) coupled with the Water Balance Model (WBM) at a daily temporal resolution and 5x5 km spatial resolution. Together, these models calculate power plant capacity losses that account for geophysical constraints and river network dynamics. Potential losses at the single-plant level are put into a regional energy security context by assessing the collective system-level reliability at the North-American Electricity Reliability Corporation (NERC) regions. Results show that the thermoelectric sector at the national level has low vulnerability under the contemporary climate and that system-level reliability in terms of available thermoelectric resources relative to thermoelectric demand is sufficient. Under future climates scenarios, changes in water availability and warm ambient temperatures lead to constraints on operational capacity and increased vulnerability at individual power plant sites across all regions in the United States. However, there is a strong disparity in regional vulnerability trends and magnitudes that arise from each region's climate, hydrology and technology mix. Despite increases in vulnerabilities at the individual power plant level, regional energy systems may still be reliable (with no system failures) due to sufficient back-up reserve capacities.
In vitro model to evaluate reliability and accuracy of a dental shade-matching instrument.
Kim-Pusateri, Seungyee; Brewer, Jane D; Dunford, Robert G; Wee, Alvin G
2007-11-01
There are several electronic shade-matching instruments available for clinical use; unfortunately, there are limited acceptable in vitro models to evaluate their reliability and accuracy. The purpose of this in vitro study was to evaluate the reliability and accuracy of a dental clinical shade-matching instrument. Using the shade-matching instrument (ShadeScan), color measurements were made of 3 commercial shade guides (VITA Classical, VITA 3D-Master, and Chromascop). Shade tabs were selected and placed in the middle of a gingival matrix (Shofu Gummy), with tabs of the same nominal shade from additional shade guides placed on both sides. Measurements were made of the central region of the shade tab inside a black box. For the reliability assessment, each shade tab from each of the 3 shade guide types was measured 10 times. For the accuracy assessment, each shade tab from 10 guides of each of the 3 types evaluated was measured once. Reliability, accuracy, and 95% confidence intervals were calculated for each shade tab. Differences were determined by 1-way ANOVA followed by the Bonferroni multiple comparison procedure. Reliability of ShadeScan was as follows: VITA Classical = 95.0%, VITA 3D-Master = 91.2%, and Chromascop = 76.5%. Accuracy of ShadeScan was as follows: VITA Classical = 65.0%, VITA 3D-Master = 54.2%, Chromascop = 84.5%. This in vitro study showed a varying degree of reliability and accuracy for ShadeScan, depending on the type of shade guide system used.
Reliability Assessment Approach for Stirling Convertors and Generators
NASA Technical Reports Server (NTRS)
Shah, Ashwin R.; Schreiber, Jeffrey G.; Zampino, Edward; Best, Timothy
2004-01-01
Stirling power conversion is being considered for use in a Radioisotope Power System for deep-space science missions because it offers a multifold increase in the conversion efficiency of heat to electric power. Quantifying the reliability of a Radioisotope Power System that utilizes Stirling power conversion technology is important in developing and demonstrating the capability for long-term success. A description of the Stirling power convertor is provided, along with a discussion about some of the key components. Ongoing efforts to understand component life, design variables at the component and system levels, related sources, and the nature of uncertainties is discussed. The requirement for reliability also is discussed, and some of the critical areas of concern are identified. A section on the objectives of the performance model development and a computation of reliability is included to highlight the goals of this effort. Also, a viable physics-based reliability plan to model the design-level variable uncertainties at the component and system levels is outlined, and potential benefits are elucidated. The plan involves the interaction of different disciplines, maintaining the physical and probabilistic correlations at all the levels, and a verification process based on rational short-term tests. In addition, both top-down and bottom-up coherency were maintained to follow the physics-based design process and mission requirements. The outlined reliability assessment approach provides guidelines to improve the design and identifies governing variables to achieve high reliability in the Stirling Radioisotope Generator design.
USDA-ARS?s Scientific Manuscript database
In risk assessment models, the solid-solution partition coefficient (Kd), and plant uptake factor (PUF), are often employed to model the fate and transport of trace elements in soils. The trustworthiness of risk assessments depends on the reliability of the parameters used. In this study, we exami...
Meylan, Grégoire; Reck, Barbara K; Rechberger, Helmut; Graedel, Thomas E; Schwab, Oliver
2017-10-17
Decision-makers traditionally expect "hard facts" from scientific inquiry, an expectation that the results of material flow analyses (MFAs) can hardly meet. MFA limitations are attributable to incompleteness of flowcharts, limited data quality, and model assumptions. Moreover, MFA results are, for the most part, based less on empirical observation but rather on social knowledge construction processes. Developing, applying, and improving the means of evaluating and communicating the reliability of MFA results is imperative. We apply two recently proposed approaches for making quantitative statements on MFA reliability to national minor metals systems: rhenium, gallium, and germanium in the United States in 2012. We discuss the reliability of results in policy and management contexts. The first approach consists of assessing data quality based on systematic characterization of MFA data and the associated meta-information and quantifying the "information content" of MFAs. The second is a quantification of data inconsistencies indicated by the "degree of data reconciliation" between the data and the model. A high information content and a low degree of reconciliation indicate reliable or certain MFA results. This article contributes to reliability and uncertainty discourses in MFA, exemplifying the usefulness of the approaches in policy and management, and to raw material supply discussions by providing country-level information on three important minor metals often considered critical.
Subject-level reliability analysis of fast fMRI with application to epilepsy.
Hao, Yongfu; Khoo, Hui Ming; von Ellenrieder, Nicolas; Gotman, Jean
2017-07-01
Recent studies have applied the new magnetic resonance encephalography (MREG) sequence to the study of interictal epileptic discharges (IEDs) in the electroencephalogram (EEG) of epileptic patients. However, there are no criteria to quantitatively evaluate different processing methods, to properly use the new sequence. We evaluated different processing steps of this new sequence under the common generalized linear model (GLM) framework by assessing the reliability of results. A bootstrap sampling technique was first used to generate multiple replicated data sets; a GLM with different processing steps was then applied to obtain activation maps, and the reliability of these maps was assessed. We applied our analysis in an event-related GLM related to IEDs. A higher reliability was achieved by using a GLM with head motion confound regressor with 24 components rather than the usual 6, with an autoregressive model of order 5 and with a canonical hemodynamic response function (HRF) rather than variable latency or patient-specific HRFs. Comparison of activation with IED field also favored the canonical HRF, consistent with the reliability analysis. The reliability analysis helps to optimize the processing methods for this fast fMRI sequence, in a context in which we do not know the ground truth of activation areas. Magn Reson Med 78:370-382, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
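A minimal sketch of the bootstrap reliability idea (resample, refit a voxel-wise GLM, threshold, compare replicate maps); the simulated data and the Dice overlap criterion are illustrative assumptions, and the actual pipeline (MREG preprocessing, AR(5) noise model, 24 motion regressors, HRF modelling) is far more involved.

```python
# Minimal sketch of bootstrap reliability for a GLM activation map: resample
# scans, refit an ordinary least-squares GLM per voxel, threshold the t-like
# statistic, and compare replicate maps with a Dice overlap. The simulated data
# and design stand in for the MREG pipeline, which is far more involved.
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_voxels = 200, 500
design = rng.random(n_scans)                       # IED-related regressor (toy)
active = np.zeros(n_voxels, bool)
active[:50] = True                                 # "truly active" voxels
data = rng.normal(size=(n_scans, n_voxels))
data[:, active] += 1.5 * design[:, None]           # add signal to active voxels

def activation_map(y, x):
    X = np.column_stack([np.ones_like(x), x])
    beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = res / (len(x) - 2)                    # residual variance per voxel
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return np.abs(beta[1] / se) > 3.0              # crude t-like threshold

maps = []
for _ in range(20):                                # bootstrap replicates
    idx = rng.integers(0, n_scans, n_scans)
    maps.append(activation_map(data[idx], design[idx]))

dice = [2 * np.sum(m1 & m2) / (np.sum(m1) + np.sum(m2))
        for i, m1 in enumerate(maps) for m2 in maps[i + 1:]]
print(round(float(np.mean(dice)), 3))              # average pairwise overlap
```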
Reliability Assessment for COTS Components in Space Flight Applications
NASA Technical Reports Server (NTRS)
Krishnan, G. S.; Mazzuchi, Thomas A.
2001-01-01
Systems built for space flight applications usually demand a very high degree of performance and a very high level of accuracy. Hence, design engineers are often prone to selecting state-of-the-art technologies for inclusion in their system design. Shrinking budgets also necessitate the use of COTS (Commercial Off-The-Shelf) components, which are construed as being less expensive. The performance and accuracy requirements for space flight applications are much more stringent than those for commercial applications, and the quantity of systems designed and developed for space applications is much lower than that produced for commercial applications. With a given set of requirements, are these COTS components reliable? This paper presents a model for assessing the reliability of COTS components in space applications and the associated effect on system reliability. We illustrate the method with a real application.
One-year test-retest reliability of intrinsic connectivity network fMRI in older adults
Guo, Cong C.; Kurth, Florian; Zhou, Juan; Mayer, Emeran A.; Eickhoff, Simon B; Kramer, Joel H.; Seeley, William W.
2014-01-01
“Resting-state” or task-free fMRI can assess intrinsic connectivity network (ICN) integrity in health and disease, suggesting a potential for use of these methods as disease-monitoring biomarkers. Numerous analytical options are available, including model-driven ROI-based correlation analysis and model-free, independent component analysis (ICA). High test-retest reliability will be a necessary feature of a successful ICN biomarker, yet available reliability data remains limited. Here, we examined ICN fMRI test-retest reliability in 24 healthy older subjects scanned roughly one year apart. We focused on the salience network, a disease-relevant ICN not previously subjected to reliability analysis. Most ICN analytical methods proved reliable (intraclass coefficients > 0.4) and could be further improved by wavelet analysis. Seed-based ROI correlation analysis showed high map-wise reliability, whereas graph theoretical measures and temporal concatenation group ICA produced the most reliable individual unit-wise outcomes. Including global signal regression in ROI-based correlation analyses reduced reliability. Our study provides a direct comparison between the most commonly used ICN fMRI methods and potential guidelines for measuring intrinsic connectivity in aging control and patient populations over time. PMID:22446491
Ecosystem Model Skill Assessment. Yes We Can!
Olsen, Erik; Fay, Gavin; Gaichas, Sarah; Gamble, Robert; Lucey, Sean; Link, Jason S
2016-01-01
Accelerated changes to global ecosystems call for holistic and integrated analyses of past, present and future states under various pressures to adequately understand current and projected future system states. Ecosystem models can inform management of human activities in a complex and changing environment, but are these models reliable? Ensuring that models are reliable for addressing management questions requires evaluating their skill in representing real-world processes and dynamics. Skill has been evaluated for just a limited set of biophysical models. A range of skill assessment methods have been reviewed but skill assessment of full marine ecosystem models has not yet been attempted. We assessed the skill of the Northeast U.S. (NEUS) Atlantis marine ecosystem model by comparing 10-year model forecasts with observed data. Model forecast performance was compared to that obtained from a 40-year hindcast. Multiple metrics (average absolute error, root mean squared error, modeling efficiency, and Spearman rank correlation) and a suite of time series (species biomass, fisheries landings, and ecosystem indicators) were used to adequately measure model skill. Overall, the NEUS model performed above average and thus better than expected for the key species that had been the focus of the model tuning. Model forecast skill was comparable to the hindcast skill, showing that model performance does not degenerate in a 10-year forecast mode, an important characteristic for an end-to-end ecosystem model to be useful for strategic management purposes. We identify best-practice approaches for end-to-end ecosystem model skill assessment that would improve both operational use of other ecosystem models and future model development. We show that it is possible to not only assess the skill of a complicated marine ecosystem model, but that it is necessary to do so to instill confidence in model results and encourage their use for strategic management. Our methods are applicable to any type of predictive model, and should be considered for use in fields outside ecology (e.g. economics, climate change, and risk assessment).
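A minimal sketch of the skill metrics named above, computed for one observed versus predicted time series; the series are invented.

```python
# Minimal sketch of the skill metrics named in the abstract, computed for one
# observed vs. predicted time series: average absolute error, root mean squared
# error, modeling efficiency (Nash-Sutcliffe form), and Spearman rank
# correlation. The series below are invented.
import numpy as np
from scipy.stats import spearmanr

observed  = np.array([3.1, 2.8, 3.5, 4.0, 3.7, 3.2, 2.9, 3.8, 4.2, 3.9])
predicted = np.array([3.0, 3.0, 3.3, 3.8, 3.9, 3.4, 3.1, 3.6, 4.0, 4.1])

aae  = np.mean(np.abs(observed - predicted))
rmse = np.sqrt(np.mean((observed - predicted) ** 2))
mef  = 1 - np.sum((observed - predicted) ** 2) / np.sum((observed - observed.mean()) ** 2)
rho, _ = spearmanr(observed, predicted)

print(round(aae, 3), round(rmse, 3), round(mef, 3), round(rho, 3))
```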
Pac, A; Oruba, Z; Olszewska-Czyż, I; Chomyszyn-Gajewska, M
2014-03-01
The individual evaluation of patients' motivation should be introduced to the protocol of periodontal treatment, as it could impact positively on effective treatment planning and treatment outcomes. However, a standardised tool measuring the extent of periodontal patients' motivation has not yet been proposed in the literature. Thus, the objective of the present study was to determine the validity and reliability of the Zychlińscy motivation scale adjusted to the needs of periodontology. Cross sectional study. Department of Periodontology and Oral Medicine, Dental University Clinic, Jagiellonian University, Krakow, Poland. 199 adult periodontal patients, aged 20-78. 14-item questionnaire. The items were adopted from the original Zychlińscy motivation assessment scale. Validity and reliability of the proposed motivation assessment instrument. The assessed Cronbach's alpha of 0.79 indicates the scale is a reliable tool. Principal component analysis revealed a model with three factors, which explained half of the total variance. Those factors represented: the patient's attitude towards treatment and oral hygiene practice; previous experiences during treatment; and the influence of external conditions on the patient's attitude towards treatment. The proposed scale proved to be a reliable and accurate tool for the evaluation of periodontal patients' motivation.
Jaeschke, Lina; Steinbrecher, Astrid; Jeran, Stephanie; Konigorski, Stefan; Pischon, Tobias
2018-04-20
24 h-accelerometry is now used to objectively assess physical activity (PA) in many observational studies like the German National Cohort; however, PA variability, observational time needed to estimate habitual PA, and reliability are unclear. We assessed 24 h-PA of 50 participants using triaxial accelerometers (ActiGraph GT3X+) over 2 weeks. Variability of overall PA and different PA intensities (time in inactivity and in low intensity, moderate, vigorous, and very vigorous PA) between days of assessment or days of the week was quantified using linear mixed-effects and random effects models. We calculated the required number of days to estimate PA, and calculated PA reliability using intraclass correlation coefficients. Between- and within-person variance accounted for 34.4-45.5% and 54.5-65.6%, respectively, of total variance in overall PA and PA intensities over the 2 weeks. Overall PA and times in low intensity, moderate, and vigorous PA decreased slightly over the first 3 days of assessment. Overall PA (p = 0.03), time in inactivity (p = 0.003), in low intensity PA (p = 0.001), in moderate PA (p = 0.02), and in vigorous PA (p = 0.04) slightly differed between days of the week, being highest on Wednesday and Friday and lowest on Sunday and Monday, with apparent differences between Saturday and Sunday. In nested random models, the day of the week accounted for < 19% of total variance in the PA parameters. On average, the required number of days to estimate habitual PA was around 1 week, being 7 for overall PA and ranging from 6 to 9 for the PA intensities. Week-to-week reliability was good (intraclass correlation coefficients, range, 0.68-0.82). Individual PA, as assessed using 24 h-accelerometry, is highly variable between days, but the day of assessment or the day of the week explain only small parts of this variance. Our data indicate that 1 week of assessment is necessary for reliable estimation of habitual PA.
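A minimal sketch of how the required number of monitoring days follows from the variance components via a Spearman-Brown-type relation; the variance shares are taken loosely from the ranges reported above, and the 0.80 reliability target is an assumption rather than a value the study states.

```python
# Minimal sketch: the number of monitoring days needed for a target reliability
# follows from the between- and within-person variance components via
# ICC_n = n*Vb / (n*Vb + Vw)  =>  n >= (ICC / (1 - ICC)) * Vw / Vb.
# Variance shares are taken loosely from the abstract (between ~40%,
# within ~60%); the 0.80 target is an assumption.
import math

var_between, var_within = 0.40, 0.60
target_icc = 0.80

n_days = (target_icc / (1 - target_icc)) * (var_within / var_between)
print(math.ceil(n_days))     # ~6 days under these assumptions, close to the reported ~1 week
```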
La Padula, Simone; Hersant, Barbara; SidAhmed, Mounia; Niddam, Jeremy; Meningaud, Jean Paul
2016-07-01
Most patients requesting aesthetic rejuvenation treatment expect to look healthier and younger. Some scales for ageing assessment have been proposed, but none is focused on patient age prediction. The aim of this study was to develop and validate a new facial rating scale assessing facial ageing sign severity. One thousand Caucasian patients were included and assessed. The Rasch model was used as part of the validation process. A score was attributed to each patient, based on the scales we developed. The correlation between the real age and the scores obtained, the inter-rater reliability and the test-retest reliability were analysed. The objective was to develop a tool enabling the assignment of a patient to a specific age range based on the calculated score. All scales exceeded criteria for acceptability, reliability and validity. The real age strongly correlated with the total facial score in both sex groups. The test-retest reliability confirmed this strong correlation. We developed a facial ageing scale which could be a useful tool to assess patients before and after rejuvenation treatment and an important new metric to be used in facial rejuvenation and regenerative clinical research. Copyright © 2016 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.
Siu, B W M; Au-Yeung, C C Y; Chan, A W L; Chan, L S Y; Yuen, K K; Leung, H W; Yan, C K; Ng, K K; Lai, A C H; Davies, S; Collins, M
Mapping forensic psychiatric services with the security needs of patients is a salient step in service planning, audit and review. A valid and reliable instrument for measuring the security needs of Chinese forensic psychiatric inpatients was not yet available. This study aimed to develop and validate the Chinese version of the Security Needs Assessment Profile for measuring the profiles of security needs of Chinese forensic psychiatric inpatients. The Security Needs Assessment Profile by Davis was translated into Chinese. Its face validity, content validity, construct validity and internal consistency reliability were assessed by measuring the security needs of 98 Chinese forensic psychiatric inpatients. Principal factor analysis for construct validity provided a six-factor security needs model explaining 68.7% of the variance. Based on the Cronbach's alpha coefficient, the internal consistency reliability was rated as acceptable for procedural security (0.73), and fair for both physical security (0.62) and relational security (0.58). A significant sex difference (p=0.002) in total security score was found. The Chinese version of the Security Needs Assessment Profile is a valid and reliable instrument for assessing the security needs of Chinese forensic psychiatric inpatients. Copyright © 2017 Elsevier Ltd. All rights reserved.
Issues in benchmarking human reliability analysis methods : a literature review.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lois, Erasmia; Forester, John Alan; Tran, Tuan Q.
There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessment (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study is currently underway that compares HRA methods with each other and against operator performance in simulator studies. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.
Test-retest reliability of effective connectivity in the face perception network.
Frässle, Stefan; Paulus, Frieder Michel; Krach, Sören; Jansen, Andreas
2016-02-01
Computational approaches have great potential for moving neuroscience toward mechanistic models of the functional integration among brain regions. Dynamic causal modeling (DCM) offers a promising framework for inferring the effective connectivity among brain regions and thus unraveling the neural mechanisms of both normal cognitive function and psychiatric disorders. While the benefit of such approaches depends heavily on their reliability, systematic analyses of the within-subject stability are rare. Here, we present a thorough investigation of the test-retest reliability of an fMRI paradigm for DCM analysis dedicated to unraveling intra- and interhemispheric integration among the core regions of the face perception network. First, we examined the reliability of face-specific BOLD activity in 25 healthy volunteers, who performed a face perception paradigm in two separate sessions. We found good to excellent reliability of BOLD activity within the DCM-relevant regions. Second, we assessed the stability of effective connectivity among these regions by analyzing the reliability of Bayesian model selection and model parameter estimation in DCM. Reliability was excellent for the negative free energy and good for model parameter estimation, when restricting the analysis to parameters with substantial effect sizes. Third, even when the experiment was shortened, reliability of BOLD activity and DCM results dropped only slightly as a function of the length of the experiment. This suggests that the face perception paradigm presented here provides reliable estimates for both conventional activation and effective connectivity measures. We conclude this paper with an outlook on potential clinical applications of the paradigm for studying psychiatric disorders. Hum Brain Mapp 37:730-744, 2016. © 2015 Wiley Periodicals, Inc.
Deltombe, T; Jamart, J; Recloux, S; Legrand, C; Vandenbroeck, N; Theys, S; Hanson, P
2007-03-01
We conducted a reliability comparison study to determine the intrarater and inter-rater reliability and the limits of agreement of the volume estimated by circumferential measurements using the frustum sign method and the disk model method, by water displacement volumetry, and by infrared optoelectronic volumetry in the assessment of upper limb lymphedema. Thirty women with lymphedema following axillary lymph node dissection for breast cancer were enrolled. In each patient, the volumes of the upper limbs were estimated by three physical therapists using circumference measurements, water displacement and optoelectronic volumetry. One of the physical therapists performed each measure twice. Intraclass correlation coefficients (ICCs), relative differences, and limits of agreement were determined. Intrarater and interrater reliability ICCs ranged from 0.94 to 1. Intrarater relative differences were 1.9% for the disk model method, 3.2% for the frustum sign model method, 2.9% for water displacement volumetry, and 1.5% for optoelectronic volumetry. Intrarater reliability was always better than interrater, except for the optoelectronic method. Intrarater and interrater limits of agreement were calculated for each technique. The disk model method and optoelectronic volumetry had better reliability than the frustum sign method and water displacement volumetry, which is usually considered to be the gold standard. In terms of low cost, simplicity, and reliability, we recommend the disk model method as the method of choice in clinical practice. Since intrarater reliability was always better than interrater reliability (except for optoelectronic volumetry), patients should therefore, ideally, always be evaluated by the same therapist. Additionally, the limits of agreement must be taken into account when determining the response of a patient to treatment.
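A short sketch of the two agreement statistics named above, the intrarater relative difference and the Bland-Altman 95% limits of agreement, computed from two repeated limb-volume measurements by the same rater. The volumes are simulated for illustration and do not reproduce the study's measurements.

```python
import numpy as np

rng = np.random.default_rng(2)
true_vol = rng.normal(2500, 300, 30)                  # 30 limbs, volume in ml (illustrative)
m1 = true_vol + rng.normal(0, 40, 30)                 # first measurement by the rater
m2 = true_vol + rng.normal(0, 40, 30)                 # repeat measurement by the same rater

diff = m1 - m2
relative_diff = 100 * np.abs(diff) / ((m1 + m2) / 2)  # percentage difference between repeats

# Bland-Altman 95% limits of agreement: mean difference +/- 1.96 SD of the differences.
bias = diff.mean()
loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))

print(f"mean relative difference = {relative_diff.mean():.1f}%")
print(f"limits of agreement = {loa[0]:.0f} to {loa[1]:.0f} ml")
```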
Tooth-size discrepancy: A comparison between manual and digital methods
Correia, Gabriele Dória Cabral; Habib, Fernando Antonio Lima; Vogel, Carlos Jorge
2014-01-01
Introduction Technological advances in Dentistry have emerged primarily in the area of diagnostic tools. One example is the 3D scanner, which can transform plaster models into three-dimensional digital models. Objective This study aimed to assess the reliability of tooth size-arch length discrepancy analysis measurements performed on three-dimensional digital models, and compare these measurements with those obtained from plaster models. Material and Methods To this end, plaster models of lower dental arches and their corresponding three-dimensional digital models acquired with a 3Shape R700T scanner were used. All of them had lower permanent dentition. Four different tooth size-arch length discrepancy calculations were performed on each model, two of which by manual methods using calipers and brass wire, and two by digital methods using linear measurements and parabolas. Results Data were statistically assessed using the Friedman test, and no statistically significant differences were found between the methods (P > 0.05); the linear digital method showed only a slight difference that did not reach statistical significance. Conclusions Based on the results, it is reasonable to assert that any of these resources used by orthodontists to clinically assess tooth size-arch length discrepancy can be considered reliable. PMID:25279529
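The Friedman test used above is a non-parametric repeated-measures comparison across methods applied to the same models. The sketch below shows how such a comparison could be run with SciPy; the discrepancy values and the number of models are invented for illustration and do not reproduce the study's results.

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(3)
n_models = 56                                    # number of dental models (assumed)
base = rng.normal(-1.0, 2.0, n_models)           # "true" tooth size-arch length discrepancy (mm)

caliper      = base + rng.normal(0, 0.3, n_models)        # manual: calipers
brass_wire   = base + rng.normal(0, 0.3, n_models)        # manual: brass wire
digital_lin  = base + 0.1 + rng.normal(0, 0.3, n_models)  # digital: linear, small offset assumed
digital_para = base + rng.normal(0, 0.3, n_models)        # digital: parabola

# Friedman test: do the four methods give systematically different discrepancy values?
stat, p = friedmanchisquare(caliper, brass_wire, digital_lin, digital_para)
print(f"Friedman chi-square = {stat:.2f}, p = {p:.3f}")
```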
Opportunities of probabilistic flood loss models
NASA Astrophysics Data System (ADS)
Schröter, Kai; Kreibich, Heidi; Lüdtke, Stefan; Vogel, Kristin; Merz, Bruno
2016-04-01
Oftentimes, traditional uni-variate damage models, such as depth-damage curves, fail to reproduce the variability of observed flood damage. However, reliable flood damage models are a prerequisite for the practical usefulness of the model results. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks, with traditional stage-damage functions. For model evaluation we use empirical damage data which are available from computer-aided telephone interviews that were compiled after the floods in 2002, 2005, 2006 and 2013 in the Elbe and Danube catchments in Germany. We carry out a split-sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further, we stratify the sample according to catchments, which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error), and the sharpness and reliability of the predictions, the latter represented by the proportion of observations that fall within the 5%-95% quantile predictive interval. The comparison of the uni-variate stage-damage function and the multi-variate model approach emphasises the importance of quantifying predictive uncertainty. With each explanatory variable, the multi-variate model reveals an additional source of uncertainty. However, the predictive performance in terms of bias (MBE), accuracy (MAE) and reliability (hit rate) is clearly improved in comparison to the uni-variate stage-damage function. Overall, probabilistic models provide quantitative information about prediction uncertainty, which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
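A sketch of the bagged-trees idea described above: the spread of per-tree predictions supplies an empirical 5%-95% predictive interval, and bias, mean absolute error and hit rate score a split-sample prediction. The building attributes, damage values and their functional relationship are synthetic assumptions for illustration, not the study's survey data or model configuration.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n = 800
X = np.column_stack([rng.uniform(0, 3, n),       # water depth (m), assumed predictor
                     rng.uniform(0, 72, n),      # inundation duration (h), assumed predictor
                     rng.integers(1, 4, n)])     # building type code, assumed predictor
rel_damage = np.clip(0.15 * X[:, 0] + 0.002 * X[:, 1] + rng.normal(0, 0.05, n), 0, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, rel_damage, test_size=0.3, random_state=0)

# Bagging of regression trees (the default base estimator is a decision tree).
bag = BaggingRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Per-tree predictions give an empirical predictive distribution for each building.
tree_preds = np.stack([t.predict(X_te) for t in bag.estimators_])
median = np.median(tree_preds, axis=0)
q05, q95 = np.quantile(tree_preds, [0.05, 0.95], axis=0)

mbe = np.mean(median - y_te)                        # systematic deviation (bias)
mae = np.mean(np.abs(median - y_te))                # precision
hit_rate = np.mean((y_te >= q05) & (y_te <= q95))   # reliability of the 5%-95% interval
print(f"MBE = {mbe:.3f}, MAE = {mae:.3f}, hit rate = {hit_rate:.2f}")
```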
Cobb, Stephen C; Joshi, Mukta N; Pomeroy, Robin L
2016-12-01
In-vitro and invasive in-vivo studies have reported relatively independent motion in the medial and lateral forefoot segments during gait. However, most current surface-based models have not defined medial and lateral forefoot or midfoot segments. The purpose of the current study was to determine the reliability of a 7-segment foot model that includes medial and lateral midfoot and forefoot segments during walking gait. Three-dimensional positions of marker clusters located on the leg and 6 foot segments were tracked as 10 participants completed 5 walking trials. To examine the reliability of the foot model, coefficients of multiple correlation (CMC) were calculated across the trials for each participant. Three-dimensional stance time series and range of motion (ROM) during stance were also calculated for each functional articulation. CMCs for all of the functional articulations were ≥ 0.80. Overall, the rearfoot complex (leg-calcaneus segments) was the most reliable articulation and the medial midfoot complex (calcaneus-navicular segments) was the least reliable. With respect to ROM, reliability was greatest for plantarflexion/dorsiflexion and least for abduction/adduction. Further, the stance ROM and time-series patterns results between the current study and previous invasive in-vivo studies that have assessed actual bone motion were generally consistent.
Moghadam, Manije; Salavati, Mahyar; Sahaf, Robab; Rassouli, Maryam; Moghadam, Mojgan; Kamrani, Ahmad Ali Akbari
2018-03-01
After forward-backward translation, the LSS was administered to 334 Persian speaking, cognitively healthy elderly aged 60 years and over recruited through convenience sampling. To analyze the validity of the model's constructs and the relationships between the constructs, a confirmatory factor analysis followed by PLS analysis was performed. The Construct validity was further investigated by calculating the correlations between the LSS and the "Short Form Health Survey" (SF-36) subscales measuring similar and dissimilar constructs. The LSS was re-administered to 50 participants a month later to assess the reliability. For the eight-factor model of the life satisfaction construct, adequate goodness of fit between the hypothesized model and the model derived from the sample data was attained (positive and statistically significant beta coefficients, good R-squares and acceptable GoF). Construct validity was supported by convergent and discriminant validity, and correlations between the LSS and SF-36 subscales. Minimum Intraclass Correlation Coefficient level of 0.60 was exceeded by all subscales. Minimum level of reliability indices (Cronbach's α, composite reliability and indicator reliability) was exceeded by all subscales. The Persian-version of the Life Satisfaction Scale is a reliable and valid instrument, with psychometric properties which are consistent with the original version.
77 FR 4282 - Gulf of Mexico Fishery Management Council; Public Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-27
... Workshop will evaluate the data used in the assessment and whether data uncertainties acknowledged/reported are within normal or expected levels, e.g., recruitment deviations; whether data were applied properly within the assessment model; are input data series reliable and sufficient to support the assessment...
NASA Astrophysics Data System (ADS)
Hardikar, Kedar Y.; Liu, Bill J. J.; Bheemreddy, Venkata
2016-09-01
Gaining an understanding of degradation mechanisms and their characterization are critical in developing relevant accelerated tests to ensure PV module performance warranty over a typical lifetime of 25 years. As newer technologies are adapted for PV, including new PV cell technologies, new packaging materials, and newer product designs, the availability of field data over extended periods of time for product performance assessment cannot be expected within the typical timeframe for business decisions. In this work, to enable product design decisions and product performance assessment for PV modules utilizing newer technologies, Simulation and Mechanism based Accelerated Reliability Testing (SMART) methodology and empirical approaches to predict field performance from accelerated test results are presented. The method is demonstrated for field life assessment of flexible PV modules based on degradation mechanisms observed in two accelerated tests, namely, Damp Heat and Thermal Cycling. The method is based on design of accelerated testing scheme with the intent to develop relevant acceleration factor models. The acceleration factor model is validated by extensive reliability testing under different conditions going beyond the established certification standards. Once the acceleration factor model is validated for the test matrix a modeling scheme is developed to predict field performance from results of accelerated testing for particular failure modes of interest. Further refinement of the model can continue as more field data becomes available. While the demonstration of the method in this work is for thin film flexible PV modules, the framework and methodology can be adapted to other PV products.
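The paper does not spell out its acceleration-factor forms here, so the sketch below uses two commonly assumed models, a Peck-style humidity/temperature factor for damp heat and a Coffin-Manson exponent for thermal cycling, purely to show how such a model converts accelerated-test hours or cycles into equivalent field exposure. The exponents, activation energy and field conditions are illustrative assumptions.

```python
import numpy as np

K_B = 8.617e-5  # Boltzmann constant, eV/K

def peck_af(rh_test, t_test_c, rh_field, t_field_c, n=2.7, ea=0.7):
    """Peck-style damp-heat acceleration factor (n and Ea are assumed, not from the paper)."""
    t_test, t_field = t_test_c + 273.15, t_field_c + 273.15
    return (rh_test / rh_field) ** n * np.exp(ea / K_B * (1 / t_field - 1 / t_test))

def coffin_manson_af(dt_test, dt_field, m=2.0):
    """Coffin-Manson thermal-cycling acceleration factor (exponent m assumed)."""
    return (dt_test / dt_field) ** m

# Example: 1000 h of 85C/85%RH testing vs. an assumed 25C/60%RH field environment,
# and 200 cycles of -40..85C vs. an assumed 20C daily field temperature swing.
af_dh = peck_af(85, 85, 60, 25)
af_tc = coffin_manson_af(125, 20)
print(f"1000 h damp heat   ~ {1000 * af_dh / 8760:.1f} field-years (assumed parameters)")
print(f"200 thermal cycles ~ {200 * af_tc / 365:.1f} years of daily cycling (assumed exponent)")
```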
Van der Elst, Wim; Molenberghs, Geert; Hilgers, Ralf-Dieter; Verbeke, Geert; Heussen, Nicole
2016-11-01
There are various settings in which researchers are interested in the assessment of the correlation between repeated measurements that are taken within the same subject (i.e., reliability). For example, the same rating scale may be used to assess the symptom severity of the same patients by multiple physicians, or the same outcome may be measured repeatedly over time in the same patients. Reliability can be estimated in various ways, for example, using the classical Pearson correlation or the intra-class correlation in clustered data. However, contemporary data often have a complex structure that goes well beyond the restrictive assumptions that are needed with the more conventional methods to estimate reliability. In the current paper, we propose a general and flexible modeling approach that allows for the derivation of reliability estimates, standard errors, and confidence intervals - appropriately taking hierarchies and covariates in the data into account. Our methodology is developed for continuous outcomes together with covariates of an arbitrary type. The methodology is illustrated in a case study, and a Web Appendix is provided which details the computations using the R package CorrMixed and the SAS software. Copyright © 2016 John Wiley & Sons, Ltd.
Development and validation of an instrument to assess perceived social influence on health behaviors
HOLT, CHERYL L.; CLARK, EDDIE M.; ROTH, DAVID L.; CROWTHER, MARTHA; KOHLER, CONNIE; FOUAD, MONA; FOUSHEE, RUSTY; LEE, PATRICIA A.; SOUTHWARD, PENNY L.
2012-01-01
Assessment of social influence on health behavior is often approached through a situational context. The current study adapted an existing, theory-based instrument from another content domain to assess Perceived Social Influence on Health Behavior (PSI-HB) among African Americans, using an individual difference approach. The adapted instrument was found to have high internal reliability (α = .81–.84) and acceptable test-retest reliability (r = .68–.85). A measurement model revealed a three-factor structure and supported the theoretical underpinnings. Scores were predictive of health behaviors, particularly among women. Future research using the new instrument may have applied value assessing social influence in the context of health interventions. PMID:20522506
2008-10-01
provide adequate means for thermal heat dissipation and cooling. Thus electronic packaging has four main functions [1]: • Signal distribution which... dissipation, involving structural and materials consideration. • Mechanical, chemical and electromagnetic protection of components and... nature when compared to phenomenological models. Microelectronic packaging industry typically spends several months building and reliability
Tan, Christine L; Hassali, Mohamed A; Saleem, Fahad; Shafie, Asrul A; Aljadhey, Hisham; Gan, Vincent B
2015-01-01
(i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish the reliability and validity of the questionnaire instrument. Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administering the questionnaire instrument twice at an interval of one week. Internal consistency was measured by Cronbach's alpha, and construct validity between the two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410), was conducted to assess the construct validity of the PVASQ. The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach's alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of the PVASQ is reliable and valid to predict Malaysian patients' intention to adopt pharmacy value-added services to collect partial medicine supply.
Ecosystem Model Skill Assessment. Yes We Can!
Olsen, Erik; Fay, Gavin; Gaichas, Sarah; Gamble, Robert; Lucey, Sean; Link, Jason S.
2016-01-01
Need to Assess the Skill of Ecosystem Models Accelerated changes to global ecosystems call for holistic and integrated analyses of past, present and future states under various pressures to adequately understand current and projected future system states. Ecosystem models can inform management of human activities in a complex and changing environment, but are these models reliable? Ensuring that models are reliable for addressing management questions requires evaluating their skill in representing real-world processes and dynamics. Skill has been evaluated for only a limited set of biophysical models. A range of skill assessment methods have been reviewed, but skill assessment of full marine ecosystem models has not yet been attempted. Northeast US Atlantis Marine Ecosystem Model We assessed the skill of the Northeast U.S. (NEUS) Atlantis marine ecosystem model by comparing 10-year model forecasts with observed data. Model forecast performance was compared to that obtained from a 40-year hindcast. Multiple metrics (average absolute error, root mean squared error, modeling efficiency, and Spearman rank correlation), and a suite of time-series (species biomass, fisheries landings, and ecosystem indicators) were used to adequately measure model skill. Overall, the NEUS model performed above average and thus better than expected for the key species that had been the focus of the model tuning. Model forecast skill was comparable to the hindcast skill, showing that model performance does not degenerate in a 10-year forecast mode, an important characteristic for an end-to-end ecosystem model to be useful for strategic management purposes. Skill Assessment Is Both Possible and Advisable We identify best-practice approaches for end-to-end ecosystem model skill assessment that would improve both operational use of other ecosystem models and future model development. We show that it is not only possible to assess the skill of a complicated marine ecosystem model, but also necessary to do so to instill confidence in model results and encourage their use for strategic management. Our methods are applicable to any type of predictive model, and should be considered for use in fields outside ecology (e.g. economics, climate change, and risk assessment). PMID:26731540
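The four skill metrics named above can be computed directly from paired forecast and observation series. The sketch below does so for a synthetic biomass time series; the numbers are invented for illustration and are not the NEUS Atlantis output.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(5)
obs = rng.lognormal(mean=3.0, sigma=0.4, size=40)     # observed biomass series (illustrative)
pred = obs * rng.normal(1.0, 0.2, size=40)            # model forecast with multiplicative error

aae = np.mean(np.abs(pred - obs))                     # average absolute error
rmse = np.sqrt(np.mean((pred - obs) ** 2))            # root mean squared error
mef = 1 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)  # modeling efficiency
rho, _ = spearmanr(obs, pred)                         # Spearman rank correlation

print(f"AAE = {aae:.2f}, RMSE = {rmse:.2f}, MEF = {mef:.2f}, rho = {rho:.2f}")
```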
Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel
2016-12-01
Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level 3.
Cramer, Emily
2016-01-01
Abstract Hospital performance reports often include rankings of unit pressure ulcer rates. Differentiating among units on the basis of quality requires reliable measurement. Our objectives were to describe and apply methods for assessing reliability of hospital‐acquired pressure ulcer rates and evaluate a standard signal‐noise reliability measure as an indicator of precision of differentiation among units. Quarterly pressure ulcer data from 8,199 critical care, step‐down, medical, surgical, and medical‐surgical nursing units from 1,299 US hospitals were analyzed. Using beta‐binomial models, we estimated between‐unit variability (signal) and within‐unit variability (noise) in annual unit pressure ulcer rates. Signal‐noise reliability was computed as the ratio of between‐unit variability to the total of between‐ and within‐unit variability. To assess precision of differentiation among units based on ranked pressure ulcer rates, we simulated data to estimate the probabilities of a unit's observed pressure ulcer rate rank in a given sample falling within five and ten percentiles of its true rank, and the probabilities of units with ulcer rates in the highest quartile and highest decile being identified as such. We assessed the signal‐noise measure as an indicator of differentiation precision by computing its correlations with these probabilities. Pressure ulcer rates based on a single year of quarterly or weekly prevalence surveys were too susceptible to noise to allow for precise differentiation among units, and signal‐noise reliability was a poor indicator of precision of differentiation. To ensure precise differentiation on the basis of true differences, alternative methods of assessing reliability should be applied to measures purported to differentiate among providers or units based on quality. © 2016 The Authors. Research in Nursing & Health published by Wiley Periodicals, Inc. PMID:27223598
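The signal-noise idea above can be illustrated with a method-of-moments decomposition, which is a simplification of the beta-binomial fit used in the paper: the variance of observed unit rates is split into true between-unit variance (signal) and binomial sampling noise (noise), and reliability is their ratio. The unit counts and rates below are simulated assumptions, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(6)
n_units = 500
patients_per_unit = rng.integers(80, 400, n_units)        # annual patients surveyed per unit
true_rate = rng.beta(2, 48, n_units)                       # unit-level "true" ulcer rates (~4%)
events = rng.binomial(patients_per_unit, true_rate)        # observed hospital-acquired ulcers
p_hat = events / patients_per_unit

# Method-of-moments decomposition of the observed rate variance.
within_var = np.mean(p_hat * (1 - p_hat) / patients_per_unit)  # average binomial sampling noise
between_var = max(p_hat.var(ddof=1) - within_var, 0.0)          # signal: between-unit variance

reliability = between_var / (between_var + within_var)
print(f"signal-noise reliability of annual unit rates = {reliability:.2f}")
```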
Mansberger, Steven L.; Sheppler, Christina R.; McClure, Tina M.; VanAlstine, Cory L.; Swanson, Ingrid L.; Stoumbos, Zoey; Lambert, William E.
2013-01-01
Purpose: To report the psychometrics of the Glaucoma Treatment Compliance Assessment Tool (GTCAT), a new questionnaire designed to assess adherence with glaucoma therapy. Methods: We developed the questionnaire according to the constructs of the Health Belief Model. We evaluated the questionnaire using data from a cross-sectional study with focus groups (n = 20) and a prospective observational case series (n=58). Principal components analysis provided assessment of construct validity. We repeated the questionnaire after 3 months for test-retest reliability. We evaluated predictive validity using an electronic dosing monitor as an objective measure of adherence. Results: Focus group participants provided 931 statements related to adherence, of which 88.7% (826/931) could be categorized into the constructs of the Health Belief Model. Perceived barriers accounted for 31% (288/931) of statements, cues-to-action 14% (131/931), susceptibility 12% (116/931), benefits 12% (115/931), severity 10% (91/931), and self-efficacy 9% (85/931). The principal components analysis explained 77% of the variance with five components representing Health Belief Model constructs. Reliability analyses showed acceptable Cronbach’s alphas (>.70) for four of the seven components (severity, susceptibility, barriers [eye drop administration], and barriers [discomfort]). Predictive validity was high, with several Health Belief Model questions significantly associated (P <.05) with adherence and a correlation coefficient (R2) of .40. Test-retest reliability was 90%. Conclusion: The GTCAT shows excellent repeatability, content, construct, and predictive validity for glaucoma adherence. A multisite trial is needed to determine whether the results can be generalized and whether the questionnaire accurately measures the effect of interventions to increase adherence. PMID:24072942
Domeyer, Philip J; Aletras, Vassilis; Anagnostopoulos, Fotios; Katsari, Vasiliki; Niakas, Dimitris
2017-01-01
The use of generic medicines is a cost-effective policy, often dictated by fiscal restraints. To our knowledge, no fully validated tool exploring students' knowledge and attitudes towards generic medicines exists. The aim of our study was to develop and validate a questionnaire exploring the knowledge and attitudes of M.Sc. in Health Care Management students and recent alumni towards generic drugs in Greece. The development of the questionnaire was a result of literature review and pilot-testing of its preliminary versions with researchers and students. The final version of the questionnaire contains 18 items measuring the respondents' knowledge and attitude towards generic medicines on a 5-point Likert scale. Given the ordinal nature of the data, ordinal alpha and polychoric correlations were computed. The sample was randomly split into two halves. Exploratory factor analysis, performed in the first sample, was used for the creation of multi-item scales. Confirmatory factor analysis and Generalized Linear Latent and Mixed Model analysis (GLLAMM) with the use of the rating scale model were used in the second sample to assess goodness of fit. An assessment of internal consistency reliability, test-retest reliability, and construct validity was also performed. Among 1402 persons contacted, 986 persons completed our questionnaire (response rate = 70.3%). Overall Cronbach's alpha was 0.871. The conjoint use of exploratory and confirmatory factor analysis resulted in a six-scale model, which seemed to fit the data well. Five of the six scales, namely trust, drug quality, state audit, fiscal impact and drug substitution, were found to be valid and reliable, while the knowledge scale suffered only from low inter-scale correlations and a ceiling effect. However, the subsequent confirmatory factor and GLLAMM analyses indicated a good fit of the model to the data. The ATTOGEN instrument proved to be a reliable and valid tool, suitable for assessing students' knowledge and attitudes towards generic medicines.
Validity and reliability of the robotic objective structured assessment of technical skills
Siddiqui, Nazema Y.; Galloway, Michael L.; Geller, Elizabeth J.; Green, Isabel C.; Hur, Hye-Chun; Langston, Kyle; Pitter, Michael C.; Tarr, Megan E.; Martino, Martin A.
2015-01-01
Objective Objective structured assessments of technical skills (OSATS) have been developed to measure the skill of surgical trainees. Our aim was to develop an OSATS specifically for trainees learning robotic surgery. Study Design This is a multi-institutional study in eight academic training programs. We created an assessment form to evaluate robotic surgical skill through five inanimate exercises. Obstetrics/gynecology, general surgery, and urology residents, fellows, and faculty completed five robotic exercises on a standard training model. Study sessions were recorded and randomly assigned to three blinded judges who scored performance using the assessment form. Construct validity was evaluated by comparing scores between participants with different levels of surgical experience; inter- and intra-rater reliability were also assessed. Results We evaluated 83 residents, 9 fellows, and 13 faculty, totaling 105 participants; 88 (84%) were from obstetrics/gynecology. Our assessment form demonstrated construct validity, with faculty and fellows performing significantly better than residents (mean scores: 89 ± 8 faculty; 74 ± 17 fellows; 59 ± 22 residents, p<0.01). In addition, participants with more robotic console experience scored significantly higher than those with fewer prior console surgeries (p<0.01). R-OSATS demonstrated good inter-rater reliability across all five drills (mean Cronbach's α: 0.79 ± 0.02). Intra-rater reliability was also high (mean Spearman's correlation: 0.91 ± 0.11). Conclusions We developed an assessment form for robotic surgical skill that demonstrates construct validity, inter- and intra-rater reliability. When paired with standardized robotic skill drills this form may be useful to distinguish between levels of trainee performance. PMID:24807319
Assessing the applicability of template-based protein docking in the twilight zone.
Negroni, Jacopo; Mosca, Roberto; Aloy, Patrick
2014-09-02
The structural modeling of protein interactions in the absence of close homologous templates is a challenging task. Recently, template-based docking methods have emerged to exploit local structural similarities to help ab-initio protocols provide reliable 3D models for protein interactions. In this work, we critically assess the performance of template-based docking in the twilight zone. Our results show that, while it is possible to find templates for nearly all known interactions, the quality of the obtained models is rather limited. We can increase the precision of the models at the expense of coverage, but it drastically reduces the potential applicability of the method, as illustrated by the whole-interactome modeling of nine organisms. Template-based docking is likely to play an important role in the structural characterization of the interaction space, but we still need to improve the repertoire of structural templates onto which we can reliably model protein complexes. Copyright © 2014 Elsevier Ltd. All rights reserved.
Garcia, Darren J.; Skadberg, Rebecca M.; Schmidt, Megan; ...
2018-03-05
The Diagnostic and Statistical Manual of Mental Disorders (5th ed. [DSM–5]; American Psychiatric Association, 2013) Section III Alternative Model for Personality Disorders (AMPD) represents a novel approach to the diagnosis of personality disorder (PD). In this model, PD diagnosis requires evaluation of level of impairment in personality functioning (Criterion A) and characterization by pathological traits (Criterion B). Questions about clinical utility, complexity, and difficulty in learning and using the AMPD have been expressed in recent scholarly literature. We examined the learnability, interrater reliability, and clinical utility of the AMPD using a vignette methodology and graduate student raters. Results showed that student clinicians can learn Criterion A of the AMPD to a high level of interrater reliability and agreement with expert ratings. Interrater reliability of the 25 trait facets of the AMPD varied but showed overall acceptable levels of agreement. Examination of severity indexes of PD impairment showed the level of personality functioning (LPF) added information beyond that of global assessment of functioning (GAF). Clinical utility ratings were generally strong. Lastly, the satisfactory interrater reliability of components of the AMPD indicates the model, including the LPF, is very learnable.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paret, Paul
The National Renewable Energy Laboratory (NREL) will conduct thermal and reliability modeling on three sets of power modules for the development of a next generation inverter for electric traction drive vehicles. These modules will be chosen by General Motors (GM) to represent three distinct technological approaches to inverter power module packaging. Likely failure mechanisms will be identified in each package and a physics-of-failure-based reliability assessment will be conducted.
NASA Astrophysics Data System (ADS)
Lin, Yi-Kuei; Huang, Cheng-Fu
2015-04-01
From a quality of service viewpoint, the transmission packet unreliability and transmission time are both critical performance indicators in a computer system when assessing the Internet quality for supervisors and customers. A computer system is usually modelled as a network topology where each branch denotes a transmission medium and each vertex represents a station of servers. Almost every branch has multiple capacities/states due to failure, partial failure, maintenance, etc. This type of network is known as a multi-state computer network (MSCN). This paper proposes an efficient algorithm that computes the system reliability, i.e., the probability that a specified amount of data can be sent through k (k ≥ 2) disjoint minimal paths within both the tolerable packet unreliability and time threshold. Furthermore, two routing schemes are established in advance to indicate the main and spare minimal paths to increase the system reliability (referred to as spare reliability). Thus, the spare reliability can be readily computed according to the routing scheme.
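As a rough illustration of the quantity described above (the probability that d units of data cross two disjoint minimal paths within a time threshold, given multi-state branch capacities), the sketch below uses plain Monte Carlo sampling rather than the paper's exact algorithm, and the branch-state distributions, data volume and threshold are invented assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Each disjoint minimal path is a list of branches; each branch has a discrete capacity
# distribution (states, probabilities). All numbers are illustrative only.
path_a = [([0, 1, 2, 3], [0.05, 0.10, 0.25, 0.60]),
          ([0, 2, 4],    [0.05, 0.15, 0.80])]
path_b = [([0, 1, 3],    [0.10, 0.20, 0.70]),
          ([0, 1, 2, 3], [0.05, 0.10, 0.25, 0.60])]

d, t_max = 12, 5      # data to send and time threshold (assumed units)

def sample_path_capacity(path):
    # A path's capacity is the minimum sampled capacity among its branches.
    return min(rng.choice(states, p=probs) for states, probs in path)

trials, success = 20_000, 0
for _ in range(trials):
    cap = sample_path_capacity(path_a) + sample_path_capacity(path_b)  # send over both paths
    if cap > 0 and d / cap <= t_max:
        success += 1

print(f"estimated reliability = {success / trials:.3f}")
```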
Reliability Analysis and Standardization of Spacecraft Command Generation Processes
NASA Technical Reports Server (NTRS)
Meshkat, Leila; Grenander, Sven; Evensen, Ken
2011-01-01
• In order to reduce commanding errors that are caused by humans, we create an approach and corresponding artifacts for standardizing the command generation process and conducting risk management during the design and assurance of such processes. • The literature review conducted during the standardization process revealed that very few atomic level human activities are associated with even a broad set of missions. • Applicable human reliability metrics for performing these atomic level tasks are available. • The process for building a "Periodic Table" of Command and Control Functions as well as Probabilistic Risk Assessment (PRA) models is demonstrated. • The PRA models are executed using data from human reliability data banks. • The Periodic Table is related to the PRA models via Fault Links.
Constructing a Grounded Theory of E-Learning Assessment
ERIC Educational Resources Information Center
Alonso-Díaz, Laura; Yuste-Tosina, Rocío
2015-01-01
This study traces the development of a grounded theory of assessment in e-learning environments, a field in need of research to establish the parameters of an assessment that is both reliable and worthy of higher learning accreditation. Using grounded theory as a research method, we studied an e-assessment model that does not require physical…
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
NASA Technical Reports Server (NTRS)
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both: A broad perspective on data analysis collection and evaluation issues. A narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
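One of the standard Bayesian data-assessment techniques covered by such guides is conjugate beta-binomial updating of a failure-on-demand probability. The sketch below is a minimal example of that idea; the Jeffreys prior and the failure counts are assumed for illustration, not taken from the NASA document.

```python
from scipy.stats import beta

# Prior belief about failure-on-demand probability (Jeffreys prior, a common PRA choice).
a0, b0 = 0.5, 0.5

# Observed evidence (assumed): 2 failures in 140 demands.
failures, demands = 2, 140

# Conjugate update: posterior is Beta(a0 + failures, b0 + successes).
posterior = beta(a0 + failures, b0 + demands - failures)

print(f"posterior mean = {posterior.mean():.4f}")
print(f"90% credible interval = ({posterior.ppf(0.05):.4f}, {posterior.ppf(0.95):.4f})")
```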
Evaluating the capabilities of watershed-scale models in estimating sediment yield at field-scale.
Sommerlot, Andrew R; Nejadhashemi, A Pouyan; Woznicki, Sean A; Giri, Subhasis; Prohaska, Michael D
2013-09-30
Many watershed model interfaces have been developed in recent years for predicting field-scale sediment loads. They share the goal of providing data for decisions aimed at improving watershed health and the effectiveness of water quality conservation efforts. The objectives of this study were to: 1) compare three watershed-scale models (Soil and Water Assessment Tool (SWAT), Field_SWAT, and the High Impact Targeting (HIT) model) against a calibrated field-scale model (RUSLE2) in estimating sediment yield from 41 randomly selected agricultural fields within the River Raisin watershed; 2) evaluate the statistical significance among models; 3) assess the watershed models' capabilities in identifying areas of concern at the field level; 4) evaluate the reliability of the watershed-scale models for field-scale analysis. The SWAT model produced the most similar estimates to RUSLE2 by providing the closest median and the lowest absolute error in sediment yield predictions, while the HIT model estimates were the worst. Concerning statistically significant differences between models, SWAT was the only model found to be not significantly different from the calibrated RUSLE2 at α = 0.05. Meanwhile, all models were incapable of identifying priority areas similar to the RUSLE2 model. Overall, SWAT provided the most correct estimates (51%) within the uncertainty bounds of RUSLE2 and is the most reliable among the studied models, while HIT is the least reliable. The results of this study suggest caution should be exercised when using watershed-scale models for field-level decision-making, while field-specific data is of paramount importance. Copyright © 2013 Elsevier Ltd. All rights reserved.
Psychometric Properties of an Abbreviated Instrument of the Five-Factor Model
ERIC Educational Resources Information Center
Mullins-Sweatt, Stephanie N.; Jamerson, Janetta E.; Samuel, Douglas B.; Olson, David R.; Widiger, Thomas A.
2006-01-01
Brief measures of the five-factor model (FFM) have been developed but none include an assessment of facets within each domain. The purpose of this study was to examine the validity of a simple, one-page, facet-level description of the FFM. Five data collections were completed to assess the reliability and the convergent and discriminant validity…
Binks-Cantrell, Emily; Joshi, R Malatesha; Washburn, Erin K
2012-10-01
Recent national reports have stressed the importance of teacher knowledge in teaching reading. However, in the past, teachers' knowledge of language and literacy constructs has typically been assessed with instruments that are not fully tested for validity. In the present study, an instrument was developed; and its reliability, item difficulty, and item discrimination were computed and examined to identify model fit by applying exploratory factor analysis. Such analyses showed that the instrument demonstrated adequate estimates of reliability in assessing teachers' knowledge of language constructs. The implications for professional development of in-service teachers as well as preservice teacher education are also discussed.
Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R.
2016-01-01
A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is possible that accounting for individual differences in response times may be an increasingly feasible option to strengthen the precision of individual scores. The present research evaluated the differential reliability of scores when using classical test theory and item response theory as compared to a conditional item response model which includes response time as an item parameter. Results indicated that the precision of student ability scores increased by an average of 5 % when using the conditional item response model, with greater improvements for those who were average or high ability. Implications for measurement models of speeded assessments are discussed. PMID:27721568
Evaluation of ceramics for stator application: Gas turbine engine report
NASA Technical Reports Server (NTRS)
Trela, W.; Havstad, P. H.
1978-01-01
Current ceramic materials, component fabrication processes, and reliability prediction capability for ceramic stators in an automotive gas turbine engine environment are assessed. Simulated engine duty cycle testing of stators conducted at temperatures up to 1093 C is discussed. Materials evaluated are SiC and Si3N4 fabricated from two near-net-shape processes: slip casting and injection molding. Stators for durability cycle evaluation, test specimens for material property characterization, and a reliability prediction model prepared to predict stator performance in the simulated engine environment are considered. The status and description of the work performed for the reliability prediction modeling, stator fabrication, material property characterization, and ceramic stator evaluation efforts are reported.
Stirling engine - Approach for long-term durability assessment
NASA Technical Reports Server (NTRS)
Tong, Michael T.; Bartolotta, Paul A.; Halford, Gary R.; Freed, Alan D.
1992-01-01
The approach employed by NASA Lewis for the long-term durability assessment of the Stirling engine hot-section components is summarized. The approach consists of: preliminary structural assessment; development of a viscoplastic constitutive model to accurately determine material behavior under high-temperature thermomechanical loads; an experimental program to characterize material constants for the viscoplastic constitutive model; finite-element thermal analysis and structural analysis using a viscoplastic constitutive model to obtain stress/strain/temperature at the critical location of the hot-section components for life assessment; and development of a life prediction model applicable for long-term durability assessment at high temperatures. The approach should aid in the provision of long-term structural durability and reliability of Stirling engines.
The impact of symptom stability on time frame and recall reliability in CFS.
Evans, Meredyth; Jason, Leonard A
This study is an investigation of the potential impact of perceived symptom stability on the recall reliability of symptom severity and frequency as reported by individuals with chronic fatigue syndrome (CFS). Symptoms were recalled using three different recall timeframes (the past week, the past month, and the past six months) and at two assessment points (with one week in between each assessment). Participants were 51 adults (45 women and 6 men), between the ages of 29 and 66, with a current diagnosis of CFS. Multilevel model (MLM) analyses were used to determine the optimal recall timeframe (in terms of test-retest reliability) for reporting symptoms perceived as variable and as stable over time. Headaches were recalled more reliably when they were reported as stable over time. Furthermore, the optimal timeframe in terms of test-retest reliability for stable symptoms was highly uniform, such that all Fukuda-defined CFS symptoms were more reliably recalled at the six-month timeframe. In contrast, the optimal timeframe for CFS symptoms perceived as variable differed across symptoms. Symptom stability and recall timeframe are important to consider in order to improve the accuracy and reliability of the current methods for diagnosing this illness.
NASA Astrophysics Data System (ADS)
Liu, Haixing; Savić, Dragan; Kapelan, Zoran; Zhao, Ming; Yuan, Yixing; Zhao, Hongbin
2014-07-01
Flow entropy is a measure of uniformity of pipe flows in water distribution systems. By maximizing flow entropy one can identify reliable layouts or connectivity in networks. In order to overcome the disadvantage of the common definition of flow entropy that does not consider the impact of pipe diameter on reliability, an extended definition of flow entropy, termed diameter-sensitive flow entropy, is proposed. This new methodology is then assessed by using other reliability methods, including Monte Carlo simulation, a pipe failure probability model, and a surrogate measure (resilience index) integrated with water demand and pipe failure uncertainty. The reliability assessment is based on a sample of WDS designs derived from an optimization process for each of the two benchmark networks. Correlation analysis is used to evaluate quantitatively the relationship between entropy and reliability. To compare the two entropy measures, a comparative analysis between the flow entropy and the new method is conducted. The results demonstrate that the diameter-sensitive flow entropy shows consistently much stronger correlation with the three reliability measures than simple flow entropy. Therefore, the new flow entropy method can be taken as a better surrogate measure for reliability and could potentially be integrated into the optimal design problem of WDSs. Sensitivity analysis results show that the velocity parameters used in the new flow entropy have no significant impact on the relationship between diameter-sensitive flow entropy and reliability.
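A minimal illustration of the underlying intuition, that more uniformly distributed pipe flows give higher entropy, using a simple Shannon entropy over flow fractions. This is deliberately not the full network flow-entropy definition, nor the diameter-sensitive extension used in the study; it only shows why a uniform layout scores higher than a skewed one.

```python
import numpy as np

def flow_entropy(flows):
    """Shannon entropy of pipe-flow fractions: higher when flow is spread more uniformly."""
    q = np.asarray(flows, dtype=float)
    p = q / q.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

uniform_layout = [10, 10, 10, 10]   # flow evenly shared among four pipes
skewed_layout = [34, 3, 2, 1]       # one pipe carries nearly all of the flow

print(f"uniform layout entropy = {flow_entropy(uniform_layout):.3f}")
print(f"skewed layout entropy  = {flow_entropy(skewed_layout):.3f}")
```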
Burt, Jenni; Abel, Gary; Elmore, Natasha; Campbell, John; Roland, Martin; Benson, John; Silverman, Jonathan
2014-01-01
Objectives To investigate initial reliability of the Global Consultation Rating Scale (GCRS: an instrument to assess the effectiveness of communication across an entire doctor–patient consultation, based on the Calgary-Cambridge guide to the medical interview), in simulated patient consultations. Design Multiple ratings of simulated general practitioner (GP)–patient consultations by trained GP evaluators. Setting UK primary care. Participants 21 GPs and six trained GP evaluators. Outcome measures GCRS score. Methods 6 GP raters used GCRS to rate randomly assigned video recordings of GP consultations with simulated patients. Each of the 42 consultations was rated separately by four raters. We considered whether a fixed difference between scores had the same meaning at all levels of performance. We then examined the reliability of GCRS using mixed linear regression models. We augmented our regression model to also examine whether there were systematic biases between the scores given by different raters and to look for possible order effects. Results Assessing the communication quality of individual consultations, GCRS achieved a reliability of 0.73 (95% CI 0.44 to 0.79) for two raters, 0.80 (0.54 to 0.85) for three and 0.85 (0.61 to 0.88) for four. We found an average difference of 1.65 (on a 0–10 scale) in the scores given by the least and most generous raters: adjusting for this evaluator bias increased reliability to 0.78 (0.53 to 0.83) for two raters; 0.85 (0.63 to 0.88) for three and 0.88 (0.69 to 0.91) for four. There were considerable order effects, with later consultations (after 15–20 ratings) receiving, on average, scores more than one point higher on a 0–10 scale. Conclusions GCRS shows good reliability with three raters assessing each consultation. We are currently developing the scale further by assessing a large sample of real-world consultations. PMID:24604483
Development of an Integrated Agricultural Planning Model Considering Climate Change
NASA Astrophysics Data System (ADS)
Santikayasa, I. P.
2016-01-01
The goal of this study is to develop an agricultural planning model to sustain future water use, based on estimates of crop water requirements, water availability and future climate projections. For this purpose, the Citarum river basin, located in West Java, Indonesia, is selected as the study area. Two emission scenarios, A2 and B2, were selected. For the crop water requirement estimation, the output of the HadCM3 AOGCM is statistically downscaled using SDSM and used as input to the WEAP model developed by SEI (Stockholm Environment Institute). The reliability of water use is assessed by comparing the irrigation water demand with the water allocation for the irrigation area. The water supply resources are assessed using the water planning tool. This study shows that temperature and precipitation over the study area are projected to increase in the future. Water availability is projected to increase under both the A2 and B2 emission scenarios, while the irrigation water requirement is expected to decrease under both scenarios. By comparing irrigation water demand with the water allocation for irrigation, the reliability of agricultural water use is expected to change in the 2050s and 2080s, whereas it will not change in the 2020s. The reliability under the A2 scenario is expected to be higher than under the B2 scenario. The combination of WEAP and SDSM proves useful for assessing and allocating water resources in the region.
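One common definition of irrigation reliability, assumed here since the abstract does not spell out its formula, is the fraction of time steps in which the allocated supply meets demand. The sketch below compares two synthetic scenario series on that basis; the demand and supply values are invented, not WEAP output.

```python
import numpy as np

rng = np.random.default_rng(8)
months = 12 * 30                              # a 30-year monthly series (illustrative)

demand = rng.normal(100, 15, months)          # irrigation water demand (Mm3/month, assumed)
supply_a2 = rng.normal(105, 25, months)       # allocated supply under a scenario labelled "A2"
supply_b2 = rng.normal(100, 25, months)       # allocated supply under a scenario labelled "B2"

def reliability(supply, demand):
    """Time-based reliability: share of months in which demand is fully met."""
    return np.mean(supply >= demand)

print(f"reliability A2 = {reliability(supply_a2, demand):.2f}")
print(f"reliability B2 = {reliability(supply_b2, demand):.2f}")
```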
de Vreede, Paul L; Samson, Monique M; van Meeteren, Nico L; Duursma, Sijmen A; Verhaar, Harald J
2006-08-01
The Assessment of Daily Activity Performance (ADAP) test was developed, modeled after the Continuous-scale Physical Functional Performance (CS-PFP) test, to provide a quantitative assessment of older adults' physical functional performance. The aim of this study was to determine the intra-examiner reliability and construct validity of the ADAP in a community-living older population, and to identify the importance of tester experience. Forty-three community-dwelling older women (mean age 75 ± 4.3 yr) were randomized to the test-retest reliability study (n=19) or the validation study (n=24). The intra-examiner reliability of an experienced (tester 1) and an inexperienced tester (tester 2) was assessed by comparing test and retest scores of 19 participants. Construct validity was assessed by comparing the ADAP scores of 24 participants with self-perceived function measured by the SF-36 Health Survey, muscle function tests, and the Timed Up and Go test (TUG). Tester 1 had good consistency and reliability scores (mean difference between test and retest scores (DIF), -1.05 ± 1.99; 95% confidence interval (CI), -2.58 to 0.48; Cronbach's alpha range, 0.83 to 0.98; intraclass correlation (ICC) range, 0.75 to 0.96; limits of agreement (LoA), -2.58 to 4.95). Tester 2 had lower reliability scores (DIF, -2.45 ± 4.36; 95% CI, -5.56 to 0.67; alpha range, 0.53 to 0.94; ICC range, 0.36 to 0.90; LoA, -6.09 to 10.99), with a systematic difference between test and retest scores for the ADAP domain lower-body strength (-3.81; 95% CI, -6.09 to -1.54). ADAP correlated with the SF-36 Physical Functioning scale (r=0.67), the TUG test (r=-0.91) and with isometric knee extensor strength (r=0.80). The ADAP test is a reliable and valid instrument. Our results suggest that testers should practise using the test, to improve reliability, before applying it in clinical settings.
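The limits of agreement quoted for each tester are the standard Bland-Altman quantities for test-retest data. A minimal sketch of how the bias and 95% limits of agreement are computed; the ADAP-style scores below are hypothetical.

```python
import numpy as np

def bland_altman_limits(test, retest):
    """Mean test-retest difference (bias) and 95% limits of agreement."""
    d = np.asarray(retest, dtype=float) - np.asarray(test, dtype=float)
    bias = d.mean()
    half_width = 1.96 * d.std(ddof=1)
    return bias, (bias - half_width, bias + half_width)

# Hypothetical total scores from one tester on two occasions
test = [62.1, 55.3, 70.4, 48.9, 66.0, 58.7]
retest = [63.0, 54.1, 71.2, 50.0, 65.2, 59.9]
print(bland_altman_limits(test, retest))
```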
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grabaskas, David; Brunett, Acacia J.; Passerini, Stefano
GE Hitachi Nuclear Energy (GEH) and Argonne National Laboratory (Argonne) participated in a two year collaboration to modernize and update the probabilistic risk assessment (PRA) for the PRISM sodium fast reactor. At a high level, the primary outcome of the project was the development of a next-generation PRA that is intended to enable risk-informed prioritization of safety- and reliability-focused research and development. A central Argonne task during this project was a reliability assessment of passive safety systems, which included the Reactor Vessel Auxiliary Cooling System (RVACS) and the inherent reactivity feedbacks of the metal fuel core. Both systems were examined utilizing a methodology derived from the Reliability Method for Passive Safety Functions (RMPS), with an emphasis on developing success criteria based on mechanistic system modeling while also maintaining consistency with the Fuel Damage Categories (FDCs) of the mechanistic source term assessment. This paper provides an overview of the reliability analyses of both systems, including highlights of the FMEAs, the construction of best-estimate models, uncertain parameter screening and propagation, and the quantification of system failure probability. In particular, special focus is given to the methodologies to perform the analysis of uncertainty propagation and the determination of the likelihood of violating FDC limits. Additionally, important lessons learned are also reviewed, such as optimal sampling methodologies for the discovery of low likelihood failure events and strategies for the combined treatment of aleatory and epistemic uncertainties.
Multi-model ensembles for assessment of flood losses and associated uncertainty
NASA Astrophysics Data System (ADS)
Figueiredo, Rui; Schröter, Kai; Weiss-Motz, Alexander; Martina, Mario L. V.; Kreibich, Heidi
2018-05-01
Flood loss modelling is a crucial part of risk assessments. However, it is subject to large uncertainty that is often neglected. Most models available in the literature are deterministic, providing only single point estimates of flood loss, and large disparities tend to exist among them. Adopting any one such model in a risk assessment context is likely to lead to inaccurate loss estimates and sub-optimal decision-making. In this paper, we propose the use of multi-model ensembles to address these issues. This approach, which has been applied successfully in other scientific fields, is based on the combination of different model outputs with the aim of improving the skill and usefulness of predictions. We first propose a model rating framework to support ensemble construction, based on a probability tree of model properties, which establishes relative degrees of belief between candidate models. Using 20 flood loss models in two test cases, we then construct numerous multi-model ensembles, based both on the rating framework and on a stochastic method, differing in terms of participating members, ensemble size and model weights. We evaluate the performance of ensemble means, as well as their probabilistic skill and reliability. Our results demonstrate that well-designed multi-model ensembles represent a pragmatic approach to consistently obtain more accurate flood loss estimates and reliable probability distributions of model uncertainty.
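As a rough illustration of the combination step described above (not the authors' rating framework itself), a weighted ensemble mean of member loss estimates can be written as follows; the loss values and weights are hypothetical.

```python
import numpy as np

def weighted_ensemble_mean(predictions, weights=None):
    """Combine member loss estimates; rows are models, columns are flood cases."""
    p = np.asarray(predictions, dtype=float)
    w = np.ones(p.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()              # normalize the relative degrees of belief
    return w @ p                 # weighted mean per flood case

# Hypothetical relative loss estimates (0-1) from three models for four cases
losses = [[0.10, 0.25, 0.40, 0.05],
          [0.15, 0.20, 0.55, 0.10],
          [0.05, 0.30, 0.35, 0.08]]
print(weighted_ensemble_mean(losses, weights=[0.5, 0.3, 0.2]))
```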
Spector, Aimee; Hebditch, Molly; Stoner, Charlotte R; Gibbor, Luke
2016-09-01
The ability to identify biological, social, and psychological issues for people with dementia is an important skill for healthcare professionals. Therefore, valid and reliable measures are needed to assess this ability. This study involves the development of a vignette-style measure to capture the extent to which health professionals use "Biopsychosocial" thinking in dementia care (VIG-Dem), based on the framework of the model developed by Spector and Orrell (2010). The development process consisted of Phase 1: developing and refining the vignettes; Phase 2: field testing (N = 9); and Phase 3: a pilot study to assess reliability and validity (N = 131). The VIG-Dem, consisting of two vignettes with open-ended questions and a standardized scoring scheme, was developed. Evidence of good inter-rater reliability, convergent validity, and test-retest reliability was established. The VIG-Dem has good psychometric properties and may provide a useful tool in dementia care research and practice.
ASSESSING THE RELIABILITY OF REGRESSION MODELS FOR USE IN DECISION MAKING
Can any ecological model be truly validated? Strictly speaking, the answer must be "No", because models are simplified sketches of reality, and there will always be some conditions under which they give distorted views of that reality. Policymakers might instead ask whether a mod...
Schwartz, Carolyn E; Rapkin, Bruce D
2004-01-01
The increasing evidence for response shift phenomena in quality of life (QOL) assessment points to the necessity to reconsider both the measurement model and the application of psychometric analyses. The proposed psychometric model posits that the QOL true score is always contingent upon parameters of the appraisal process. This new model calls into question existing methods for establishing the reliability and validity of QOL assessment tools and suggests several new approaches for describing the psychometric properties of these scales. Recommendations for integrating the assessment of appraisal into QOL research and clinical practice are discussed. PMID:15038830
Probabilistic Assessment of High-Throughput Wireless Sensor Networks
Kim, Robin E.; Mechitov, Kirill; Sim, Sung-Han; Spencer, Billie F.; Song, Junho
2016-01-01
Structural health monitoring (SHM) using wireless smart sensors (WSS) has the potential to provide rich information on the state of a structure. However, because of their distributed nature, maintaining highly robust and reliable networks can be challenging. Assessing WSS network communication quality before and after finalizing a deployment is critical to achieve a successful WSS network for SHM purposes. Early studies on WSS network reliability mostly used temporal signal indicators, composed of a smaller number of packets, to assess the network reliability. However, because WSS networks for SHM purposes often require high data throughput, i.e., a larger number of packets are delivered within the communication, such an approach is not sufficient. Instead, in this study, a model that can assess, probabilistically, the long-term performance of the network is proposed. The proposed model is based on readily available measured data sets that represent communication quality during high-throughput data transfer. Then, an empirical limit-state function is determined, which is further used to estimate the probability of network communication failure. Monte Carlo simulation is adopted in this paper and applied to a small and a full-bridge wireless network. By performing the proposed analysis in complex sensor networks, an optimized sensor topology can be achieved. PMID:27258270
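A minimal sketch of the kind of Monte Carlo failure-probability estimate described above, assuming a simple normal model for the packet success ratio of a high-throughput transfer; the distribution, its parameters, and the required threshold are illustrative placeholders rather than values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def communication_failure_probability(n_trials=100_000,
                                      psr_mean=0.97, psr_sd=0.02,
                                      required_psr=0.90):
    """Monte Carlo estimate of the probability that the packet success ratio
    falls below the level the SHM application requires.
    The normal model and all parameter values are illustrative placeholders."""
    psr = rng.normal(psr_mean, psr_sd, n_trials)
    return float(np.mean(psr < required_psr))

print(communication_failure_probability())
```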
Rosales, Roberto S; Martin-Hidalgo, Yolanda; Reboso-Morales, Luis; Atroshi, Isam
2016-03-03
The purpose of this study was to assess the reliability and construct validity of the Spanish version of the 6-item carpal tunnel syndrome (CTS) symptoms scale (CTS-6). In this cross-sectional study, 40 patients diagnosed with CTS based on clinical and neurophysiologic criteria completed the standard Spanish versions of the CTS-6 and the disabilities of the arm, shoulder and hand (QuickDASH) scales on two occasions with a 1-week interval. Internal-consistency reliability was assessed with the Cronbach alpha coefficient and test-retest reliability with the intraclass correlation coefficient, using a two-way random-effects model and an absolute agreement definition (ICC2,1). Cross-sectional precision was analyzed with the Standard Error of Measurement (SEM). Longitudinal precision of the test-retest reliability coefficient was assessed with the Standard Error of Measurement of the difference (SEMdiff) and the Minimal Detectable Change at the 95% confidence level (MDC95). For assessing construct validity, it was hypothesized that the CTS-6 would have a strong positive correlation with the QuickDASH, analyzed with the Pearson correlation coefficient (r). The standard Spanish version of the CTS-6 presented a Cronbach alpha of 0.81 with a SEM of 0.3. Test-retest reliability showed an ICC of 0.85 with a SEMdiff of 0.36 and an MDC95 of 0.7. The correlation between the CTS-6 and the QuickDASH was concordant with the a priori formulated construct hypothesis (r = 0.69). Conclusions: The standard Spanish version of the 6-item CTS symptoms scale showed good internal consistency, test-retest reliability and construct validity for outcomes assessment in CTS. The CTS-6 will be useful to clinicians and researchers in Spanish-speaking parts of the world. The use of standardized outcome measures across countries will also facilitate comparison of research results in carpal tunnel syndrome.
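The precision quantities quoted above are linked by standard formulas: SEM = SD × √(1 − reliability), SEMdiff = SEM × √2, and MDC95 = 1.96 × SEMdiff. A short sketch, checked only against the reported SEMdiff of 0.36; the SD used in any real calculation would come from the sample.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement from the score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)

def mdc95(sem_value: float) -> float:
    """Minimal detectable change at the 95% confidence level."""
    sem_diff = sem_value * math.sqrt(2.0)   # SE of a test-retest difference
    return 1.96 * sem_diff

# Illustrative check against the reported SEMdiff of 0.36:
print(round(1.96 * 0.36, 2))   # ~0.71, close to the reported MDC95 of 0.7
```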
ERIC Educational Resources Information Center
Maynard, Jennifer Leigh
2012-01-01
Emphasis on regular mathematics skill assessment, intervention, and progress monitoring under the RTI model has created a need for the development of assessment instruments that are psychometrically sound, reliable, universal, and brief. Important factors to consider when developing or selecting assessments for the school environment include what…
Maximizing Statistical Power When Verifying Probabilistic Forecasts of Hydrometeorological Events
NASA Astrophysics Data System (ADS)
DeChant, C. M.; Moradkhani, H.
2014-12-01
Hydrometeorological events (i.e. floods, droughts, precipitation) are increasingly being forecasted probabilistically, owing to the uncertainties in the underlying causes of the phenomenon. In these forecasts, the probability of the event, over some lead time, is estimated based on some model simulations or predictive indicators. By issuing probabilistic forecasts, agencies may communicate the uncertainty in the event occurring. Assuming that the assigned probability of the event is correct, which is referred to as a reliable forecast, the end user may perform some risk management based on the potential damages resulting from the event. Alternatively, an unreliable forecast may give false impressions of the actual risk, leading to improper decision making when protecting resources from extreme events. Due to this requisite for reliable forecasts to perform effective risk management, this study takes a renewed look at reliability assessment in event forecasts. Illustrative experiments will be presented, showing deficiencies in the commonly available approaches (Brier Score, Reliability Diagram). Overall, it is shown that the conventional reliability assessment techniques do not maximize the ability to distinguish between a reliable and unreliable forecast. In this regard, a theoretical formulation of the probabilistic event forecast verification framework will be presented. From this analysis, hypothesis testing with the Poisson-Binomial distribution is the most exact model available for the verification framework, and therefore maximizes one's ability to distinguish between a reliable and unreliable forecast. Application of this verification system was also examined within a real forecasting case study, highlighting the additional statistical power provided with the use of the Poisson-Binomial distribution.
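The Poisson-Binomial distribution referred to above is the distribution of the number of occurrences among independent events with unequal forecast probabilities; its PMF can be built by successive convolution. The sketch below uses hypothetical forecast probabilities and an assumed observed count, not data from the study.

```python
import numpy as np

def poisson_binomial_pmf(probs):
    """Exact PMF of the event count when trial i occurs with probability probs[i]."""
    pmf = np.array([1.0])
    for p in probs:
        pmf = np.convolve(pmf, [1.0 - p, p])   # add one Bernoulli trial at a time
    return pmf

# Hypothetical forecast probabilities for five independent flood events,
# with three events actually observed
forecast_p = [0.1, 0.4, 0.7, 0.2, 0.9]
observed_count = 3
pmf = poisson_binomial_pmf(forecast_p)
exceedance = pmf[observed_count:].sum()   # P(count >= observed) if the forecast is reliable
print(np.round(pmf, 3), round(exceedance, 3))
```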
Zeeman, Jacqueline M; McLaughlin, Jacqueline E; Cox, Wendy C
2017-11-01
With increased emphasis placed on non-academic skills in the workplace, a need exists to identify an admissions process that evaluates these skills. This study assessed the validity and reliability of an application review process involving three dedicated application reviewers in a multi-stage admissions model. A multi-stage admissions model was utilized during the 2014-2015 admissions cycle. After advancing through the academic review, each application was independently reviewed by two dedicated application reviewers utilizing a six-construct rubric (written communication, extracurricular and community service activities, leadership experience, pharmacy career appreciation, research experience, and resiliency). Rubric scores were extrapolated to a three-tier ranking to select candidates for on-site interviews. Kappa statistics were used to assess interrater reliability. A three-facet Many-Facet Rasch Model (MFRM) determined reviewer severity, candidate suitability, and rubric construct difficulty. The kappa statistic for candidates' tier rank score (n = 388 candidates) was 0.692 with a perfect agreement frequency of 84.3%. There was substantial interrater reliability between reviewers for the tier ranking (kappa: 0.654-0.710). Highest construct agreement occurred in written communication (kappa: 0.924-0.984). A three-facet MFRM analysis explained 36.9% of variance in the ratings, with 0.06% reflecting application reviewer scoring patterns (i.e., severity or leniency), 22.8% reflecting candidate suitability, and 14.1% reflecting construct difficulty. Utilization of dedicated application reviewers and a defined tiered rubric provided a valid and reliable method to effectively evaluate candidates during the application review process. These analyses provide insight into opportunities for improving the application review process among schools and colleges of pharmacy. Copyright © 2017 Elsevier Inc. All rights reserved.
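For readers unfamiliar with the agreement statistic used above, Cohen's kappa contrasts observed agreement with the agreement expected by chance from the raters' marginal distributions. A small self-contained sketch with hypothetical tier rankings (1-3) from two reviewers:

```python
import numpy as np

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters assigning categorical tiers."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    cats = np.union1d(r1, r2)
    n = len(r1)
    table = np.zeros((len(cats), len(cats)))
    for a, b in zip(r1, r2):
        table[np.where(cats == a)[0][0], np.where(cats == b)[0][0]] += 1
    po = np.trace(table) / n                              # observed agreement
    pe = (table.sum(axis=1) @ table.sum(axis=0)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical tier rankings from two reviewers for six candidates
print(round(cohen_kappa([1, 2, 2, 3, 1, 3], [1, 2, 3, 3, 1, 3]), 2))
```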
NASA Astrophysics Data System (ADS)
Fisher, W. P., Jr.; Elbaum, B.; Coulter, A.
2010-07-01
Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.
Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam
2017-01-01
The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT) obtained image over plaster model for the assessment of mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t -test was performed to qualitatively evaluate the data and P < 0.05 was considered statistically significant. Statistically, significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis.
Huisman, J.A.; Breuer, L.; Bormann, H.; Bronstert, A.; Croke, B.F.W.; Frede, H.-G.; Graff, T.; Hubrechts, L.; Jakeman, A.J.; Kite, G.; Lanini, J.; Leavesley, G.; Lettenmaier, D.P.; Lindstrom, G.; Seibert, J.; Sivapalan, M.; Viney, N.R.; Willems, P.
2009-01-01
An ensemble of 10 hydrological models was applied to the same set of land use change scenarios. There was general agreement about the direction of changes in the mean annual discharge and 90% discharge percentile predicted by the ensemble members, although a considerable range in the magnitude of predictions for the scenarios and catchments under consideration was obvious. Differences in the magnitude of the increase were attributed to the different mean annual actual evapotranspiration rates for each land use type. The ensemble of model runs was further analyzed with deterministic and probabilistic ensemble methods. The deterministic ensemble method based on a trimmed mean resulted in a single somewhat more reliable scenario prediction. The probabilistic reliability ensemble averaging (REA) method allowed a quantification of the model structure uncertainty in the scenario predictions. It was concluded that the use of a model ensemble has greatly increased our confidence in the reliability of the model predictions. © 2008 Elsevier Ltd.
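The deterministic trimmed-mean combination mentioned above simply averages the ensemble after discarding the most extreme members. A minimal sketch with hypothetical discharge-change predictions from ten members:

```python
import numpy as np
from scipy.stats import trim_mean

# Hypothetical change in mean annual discharge (%) predicted by 10 ensemble members
member_predictions = np.array([4.2, 5.1, 3.8, 6.0, 4.9, 12.5, 4.4, 5.5, -1.0, 5.0])

# Trim 20% of members from each tail before averaging, damping outlying models
print(round(trim_mean(member_predictions, 0.2), 2))
```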
Diaz, Claris; Farr, Tracy D; Harrison, David J; Fuller, Anna; Tokarczuk, Paweł F; Stewart, Andrew J; Paisey, Stephen J; Dunnett, Stephen B
2016-01-01
In order to test therapeutics, functional assessments are required. In pre-clinical stroke research, there is little consensus regarding the most appropriate behavioural tasks to assess deficits, especially when testing over extended times in milder models with short occlusion times and small lesion volumes. In this study, we comprehensively assessed 16 different behavioural tests, with the aim of identifying those that show robust, reliable and stable deficits for up to two months. These tasks are regularly used in stroke research, as well as being useful for examining striatal dysfunction in models of Huntington’s and Parkinson’s disease. Two cohorts of male Wistar rats underwent the intraluminal filament model of middle cerebral artery occlusion (30 min) and were imaged 24 h later. This resulted in primarily subcortical infarcts, with a small amount of cortical damage. Animals were tested, along with sham and naïve groups at 24 h, seven days, and one and two months. Following behavioural testing, brains were processed and striatal neuronal counts were performed alongside measurements of total brain and white matter atrophy. The staircase, adjusting steps, rotarod and apomorphine-induced rotations were the most reliable for assessing long-term deficits in the 30 min transient middle cerebral artery occlusion model of stroke. PMID:27317655
Savage, Trevor Nicholas; McIntosh, Andrew Stuart
2017-03-01
It is important to understand factors contributing to and directly causing sports injuries to improve the effectiveness and safety of sports skills. The characteristics of injury events must be evaluated and described meaningfully and reliably. However, many complex skills cannot be effectively investigated quantitatively because of ethical, technological and validity considerations. Increasingly, qualitative methods are being used to investigate human movement for research purposes, but there are concerns about reliability and measurement bias of such methods. Using the tackle in Rugby union as an example, we outline a systematic approach for developing a skill analysis protocol with a focus on improving objectivity, validity and reliability. Characteristics for analysis were selected using qualitative analysis and biomechanical theoretical models and epidemiological and coaching literature. An expert panel comprising subject matter experts provided feedback and the inter-rater reliability of the protocol was assessed using ten trained raters. The inter-rater reliability results were reviewed by the expert panel and the protocol was revised and assessed in a second inter-rater reliability study. Mean agreement in the second study improved and was comparable (52-90% agreement and ICC between 0.6 and 0.9) with other studies that have reported inter-rater reliability of qualitative analysis of human movement.
Saloheimo, T; González, S A; Erkkola, M; Milauskas, D M; Meisel, J D; Champagne, C M; Tudor-Locke, C; Sarmiento, O; Katzmarzyk, P T; Fogelholm, M
2015-01-01
Objective: The main aim of this study was to assess the reliability and validity of a food frequency questionnaire with 23 food groups (I-FFQ) among a sample of 9–11-year-old children from three different countries that differ on economical development and income distribution, and to assess differences between country sites. Furthermore, we assessed factors associated with I-FFQ's performance. Methods: This was an ancillary study of the International Study of Childhood Obesity, Lifestyle and the Environment. Reliability (n=321) and validity (n=282) components of this study had the same participants. Participation rates were 95% and 70%, respectively. Participants completed two I-FFQs with a mean interval of 4.9 weeks to assess reliability. A 3-day pre-coded food diary (PFD) was used as the reference method in the validity analyses. Wilcoxon signed-rank tests, intraclass correlation coefficients and cross-classifications were used to assess the reliability of I-FFQ. Spearman correlation coefficients, percentage difference and cross-classifications were used to assess the validity of I-FFQ. A logistic regression model was used to assess the relation of selected variables with the estimate of validity. Analyses based on information in the PFDs were performed to assess how participants interpreted food groups. Results: Reliability correlation coefficients ranged from 0.37 to 0.78 and gross misclassification for all food groups was <5%. Validity correlation coefficients were below 0.5 for 22/23 food groups, and they differed among country sites. For validity, gross misclassification was <5% for 22/23 food groups. Over- or underestimation did not appear for 19/23 food groups. Logistic regression showed that country of participation and parental education were associated (P⩽0.05) with the validity of I-FFQ. Analyses of children's interpretation of food groups suggested that the meaning of most food groups was understood by the children. Conclusion: I-FFQ is a moderately reliable method and its validity ranged from low to moderate, depending on food group and country site. PMID:27152180
Engineering Risk Assessment of Space Thruster Challenge Problem
NASA Technical Reports Server (NTRS)
Mathias, Donovan L.; Mattenberger, Christopher J.; Go, Susie
2014-01-01
The Engineering Risk Assessment (ERA) team at NASA Ames Research Center utilizes dynamic models with linked physics-of-failure analyses to produce quantitative risk assessments of space exploration missions. This paper applies the ERA approach to the baseline and extended versions of the PSAM Space Thruster Challenge Problem, which investigates mission risk for a deep space ion propulsion system with time-varying thruster requirements and operations schedules. The dynamic mission is modeled using a combination of discrete and continuous-time reliability elements within the commercially available GoldSim software. Loss-of-mission (LOM) probability results are generated via Monte Carlo sampling performed by the integrated model. Model convergence studies are presented to illustrate the sensitivity of integrated LOM results to the number of Monte Carlo trials. A deterministic risk model was also built for the three baseline and extended missions using the Ames Reliability Tool (ART), and results are compared to the simulation results to evaluate the relative importance of mission dynamics. The ART model did a reasonable job of matching the simulation models for the baseline case, while a hybrid approach using offline dynamic models was required for the extended missions. This study highlighted that state-of-the-art techniques can adequately adapt to a range of dynamic problems.
Flood loss model transfer: on the value of additional data
NASA Astrophysics Data System (ADS)
Schröter, Kai; Lüdtke, Stefan; Vogel, Kristin; Kreibich, Heidi; Thieken, Annegret; Merz, Bruno
2017-04-01
The transfer of models across geographical regions and flood events is a key challenge in flood loss estimation. Variations in local characteristics and continuous system changes require regional adjustments and continuous updating with current evidence. However, acquiring data on damage influencing factors is expensive and therefore assessing the value of additional data in terms of model reliability and performance improvement is of high relevance. The present study utilizes empirical flood loss data on direct damage to residential buildings available from computer aided telephone interviews that were carried out after the floods in 2002, 2005, 2006, 2010, 2011 and 2013 mainly in the Elbe and Danube catchments in Germany. Flood loss model performance is assessed for incrementally increased numbers of loss data which are differentiated according to region and flood event. Two flood loss modeling approaches are considered: (i) a multi-variable flood loss model approach using Random Forests and (ii) a uni-variable stage damage function. Both model approaches are embedded in a bootstrapping process which allows evaluating the uncertainty of model predictions. Predictive performance of both models is evaluated with regard to mean bias, mean absolute and mean squared errors, as well as hit rate and sharpness. Mean bias and mean absolute error give information about the accuracy of model predictions; mean squared error and sharpness about precision and hit rate is an indicator for model reliability. The results of incremental, regional and temporal updating demonstrate the usefulness of additional data to improve model predictive performance and increase model reliability, particularly in a spatial-temporal transfer setting.
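One way to obtain bootstrapped multi-variable predictions of the kind described above is to refit a Random Forest on resampled training data and inspect the spread of its predictions. The sketch below uses synthetic depth/duration/loss data and illustrates only the general scheme, not the authors' model setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.utils import resample

rng = np.random.default_rng(1)

# Synthetic loss data: water depth (m) and inundation duration (h) -> relative building loss
X = rng.uniform([0.1, 1.0], [3.0, 120.0], size=(200, 2))
y = np.clip(0.2 * X[:, 0] + 0.001 * X[:, 1] + rng.normal(0, 0.05, 200), 0, 1)

def bootstrap_predictions(X, y, x_new, n_boot=50):
    """Distribution of model predictions under resampling of the training data."""
    preds = []
    for _ in range(n_boot):
        Xb, yb = resample(X, y)                              # bootstrap sample
        model = RandomForestRegressor(n_estimators=50).fit(Xb, yb)
        preds.append(model.predict(x_new)[0])
    return np.array(preds)

preds = bootstrap_predictions(X, y, np.array([[1.5, 48.0]]))
print(preds.mean().round(3), np.percentile(preds, [5, 95]).round(3))
```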
A comparison of hydrologic models for ecological flows and water availability
Peter V. Caldwell; Jonathan G. Kennen; Ge Sun; Julie E. Kiang; Jon B. Butcher; Michele C. Eddy; Lauren E. Hay; Jacob H. LaFontaine; Ernie F. Hain; Stacy A. C. Nelson; Steve G. McNulty
2015-01-01
Robust hydrologic models are needed to help manage water resources for healthy aquatic ecosystems and reliable water supplies for people, but there is a lack of comprehensive model comparison studies that quantify differences in streamflow predictions among model applications developed to answer management questions. We assessed differences in daily streamflow...
Reliable models for assessing human exposures are important for understanding health risks from chemicals. The Stochastic Human Exposure and Dose Simulation model for multimedia, multi-route/pathway chemicals (SHEDS-Multimedia), developed by EPA’s Office of Research and Developm...
ERIC Educational Resources Information Center
Madison, Matthew J.; Bradshaw, Laine P.
2015-01-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other…
A cross-validation study of the TGMD-2: The case of an adolescent population.
Issartel, Johann; McGrane, Bronagh; Fletcher, Richard; O'Brien, Wesley; Powell, Danielle; Belton, Sarahjane
2017-05-01
This study proposes an extension of a widely used test evaluating fundamental movement skills proficiency to an adolescent population, with a specific emphasis on validity and reliability for this older age group. Cross-sectional observational study. A total of 844 participants (n=456 male; mean age 12.03 ± 0.49 years) participated in this study. The 12 fundamental movement skills of the TGMD-2 were assessed. Inter-rater reliability was examined to ensure a minimum of 95% consistency between coders. Confirmatory factor analysis was undertaken with a one-factor model (all 12 skills) and a two-factor model (6 locomotor skills and 6 object-control skills) as proposed by Ulrich et al. (2000). The model fit was examined using χ², TLI, CFI and RMSEA. Test-retest reliability was carried out with a subsample of 35 participants. The test-retest reliability reached an Intraclass Correlation Coefficient of 0.78 (locomotor), 0.76 (object related) and 0.91 (gross motor skill proficiency). The confirmatory factor analysis did not display a good fit for either the one-factor or the two-factor model, due to the very low contribution of several skills. A reduction in the number of skills to just seven (run, gallop, hop, horizontal jump, bounce, kick and roll) revealed an overall good fit by TLI, CFI and RMSEA measures. The proposed new model offers the possibility of longitudinal studies to track the maturation of fundamental movement skills across the child and adolescent spectrum, while also giving researchers a valid assessment tool to evaluate adolescent fundamental movement skills proficiency levels. Copyright © 2016 Sports Medicine Australia. All rights reserved.
Panagiotopoulou, O.; Wilshin, S. D.; Rayfield, E. J.; Shefelbine, S. J.; Hutchinson, J. R.
2012-01-01
Finite element modelling is well entrenched in comparative vertebrate biomechanics as a tool to assess the mechanical design of skeletal structures and to better comprehend the complex interaction of their form–function relationships. But what makes a reliable subject-specific finite element model? To approach this question, we here present a set of convergence and sensitivity analyses and a validation study as an example, for finite element analysis (FEA) in general, of ways to ensure a reliable model. We detail how choices of element size, type and material properties in FEA influence the results of simulations. We also present an empirical model for estimating heterogeneous material properties throughout an elephant femur (but of broad applicability to FEA). We then use an ex vivo experimental validation test of a cadaveric femur to check our FEA results and find that the heterogeneous model matches the experimental results extremely well, and far better than the homogeneous model. We emphasize how considering heterogeneous material properties in FEA may be critical, so this should become standard practice in comparative FEA studies along with convergence analyses, consideration of element size, type and experimental validation. These steps may be required to obtain accurate models and derive reliable conclusions from them. PMID:21752810
TVA-Based Assessment of Visual Attention Using Line-Drawings of Fruits and Vegetables
Wang, Tianlu; Gillebert, Celine R.
2018-01-01
Visuospatial attention and short-term memory allow us to prioritize, select, and briefly maintain part of the visual information that reaches our senses. These cognitive abilities are quantitatively accounted for by Bundesen’s theory of visual attention (TVA; Bundesen, 1990). Previous studies have suggested that TVA-based assessments are sensitive to inter-individual differences in spatial bias, visual short-term memory capacity, top-down control, and processing speed in healthy volunteers as well as in patients with various neurological and psychiatric conditions. However, most neuropsychological assessments of attention and executive functions, including TVA-based assessment, make use of alphanumeric stimuli and/or are performed verbally, which can pose difficulties for individuals who have troubles processing letters or numbers. Here we examined the reliability of TVA-based assessments when stimuli are used that are not alphanumeric, but instead based on line-drawings of fruits and vegetables. We compared five TVA parameters quantifying the aforementioned cognitive abilities, obtained by modeling accuracy data on a whole/partial report paradigm using conventional alphabet stimuli versus the food stimuli. Significant correlations were found for all TVA parameters, indicating a high parallel-form reliability. Split-half correlations assessing internal reliability, and correlations between predicted and observed data assessing goodness-of-fit were both significant. Our results provide an indication that line-drawings of fruits and vegetables can be used for a reliable assessment of attention and short-term memory. PMID:29535660
The Shutdown Dissociation Scale (Shut-D)
Schalinski, Inga; Schauer, Maggie; Elbert, Thomas
2015-01-01
The evolutionary model of the defense cascade by Schauer and Elbert (2010) provides a theoretical frame for a short interview to assess problems underlying and leading to the dissociative subtype of posttraumatic stress disorder. Based on known characteristics of the defense stages “fright,” “flag,” and “faint,” we designed a structured interview to assess the vulnerability for the respective types of dissociation. Most of the scales that assess dissociative phenomena are designed as self-report questionnaires. Their items are usually selected based on heuristic considerations rather than a theoretical model and thus include anything from minor dissociative experiences to major pathological dissociation. The shutdown dissociation scale (Shut-D) was applied in several studies in patients with a history of multiple traumatic events and different disorders that have been shown previously to be prone to symptoms of dissociation. The goal of the present investigation was to obtain psychometric characteristics of the Shut-D (including factor structure, internal consistency, retest reliability, and predictive, convergent and criterion-related concurrent validity). A total of 225 patients and 68 healthy controls were assessed. The Shut-D appears to have sufficient internal reliability, excellent retest reliability, high convergent validity, and satisfactory predictive validity, while the summed score of the scale reliably separates patients with exposure to trauma (in different diagnostic groups) from healthy controls. The Shut-D is a brief structured interview for assessing the vulnerability to dissociate as a consequence of exposure to traumatic stressors. The scale demonstrates high-quality psychometric properties and may be useful for researchers and clinicians in assessing shutdown dissociation as well as in predicting the risk of dissociative responding. PMID:25976478
NASA Technical Reports Server (NTRS)
Scheper, C.; Baker, R.; Frank, G.; Yalamanchili, S.; Gray, G.
1992-01-01
Systems for Space Defense Initiative (SDI) space applications typically require both high performance and very high reliability. These requirements present the systems engineer evaluating such systems with the extremely difficult problem of conducting performance and reliability trade-offs over large design spaces. A controlled development process supported by appropriate automated tools must be used to assure that the system will meet design objectives. This report describes an investigation of methods, tools, and techniques necessary to support performance and reliability modeling for SDI systems development. Models of the JPL Hypercubes, the Encore Multimax, and the C.S. Draper Lab Fault-Tolerant Parallel Processor (FTPP) parallel-computing architectures using candidate SDI weapons-to-target assignment algorithms as workloads were built and analyzed as a means of identifying the necessary system models, how the models interact, and what experiments and analyses should be performed. As a result of this effort, weaknesses in the existing methods and tools were revealed and capabilities that will be required for both individual tools and an integrated toolset were identified.
Pisarnturakit, Pagaporn P; Shaw, Bret R; Tanasukarn, Chanuantong; Vatanasomboon, Paranee
2012-09-01
Primary caregivers' child oral health care beliefs and practices are major factors in the prevention of Early Childhood Caries (ECC). This study assessed the validity and reliability of a newly developed scale, the Early Childhood Caries Perceptions Scale (ECCPS), used to measure beliefs regarding ECC preventive practices among primary caregivers of young children. The ECCPS was developed based on the Health Belief Model. The construct validity and reliability of the ECCPS were examined among 254 low-socioeconomic-status primary caregivers with children under five years old, recruited from 4 Bangkok Metropolitan Administration Health Centers and a kindergarten school. Exploratory factor analysis (EFA) revealed a four-factor structure. The four factors were labeled Perceived Susceptibility, Perceived Severity, Perceived Benefits and Perceived Barriers. Internal consistency, measured by Cronbach's coefficient alpha, was 0.897, 0.971, 0.975 and 0.789 for the four factors, respectively. The ECCPS demonstrated satisfactory levels of reliability and validity for assessing the health beliefs related to ECC prevention among low-socioeconomic primary caregivers.
A Comparison of Career-Related Assessment Tools/Models. Final [Report].
ERIC Educational Resources Information Center
WestEd, San Francisco, CA.
This document contains charts that evaluate career related assessment items. Chart categories include: Purpose/Current Uses/Format; Intended Population; Oregon Career Related Learning Standards Addressed; Relationship to the Standards; Relationship to Endorsement Area Frameworks; Evidence of Validity; Evidence of Reliability; Evidence of Fairness…
Perraton, Luke G.; Bower, Kelly J.; Adair, Brooke; Pua, Yong-Hao; Williams, Gavin P.; McGaw, Rebekah
2015-01-01
Introduction Hand-held dynamometry (HHD) has never previously been used to examine isometric muscle power. Rate of force development (RFD) is often used for muscle power assessment, however no consensus currently exists on the most appropriate method of calculation. The aim of this study was to examine the reliability of different algorithms for RFD calculation and to examine the intra-rater, inter-rater, and inter-device reliability of HHD as well as the concurrent validity of HHD for the assessment of isometric lower limb muscle strength and power. Methods 30 healthy young adults (age: 23±5yrs, male: 15) were assessed on two sessions. Isometric muscle strength and power were measured using peak force and RFD respectively using two HHDs (Lafayette Model-01165 and Hoggan microFET2) and a criterion-reference KinCom dynamometer. Statistical analysis of reliability and validity comprised intraclass correlation coefficients (ICC), Pearson correlations, concordance correlations, standard error of measurement, and minimal detectable change. Results Comparison of RFD methods revealed that a peak 200ms moving window algorithm provided optimal reliability results. Intra-rater, inter-rater, and inter-device reliability analysis of peak force and RFD revealed mostly good to excellent reliability (coefficients ≥ 0.70) for all muscle groups. Concurrent validity analysis showed moderate to excellent relationships between HHD and fixed dynamometry for the hip and knee (ICCs ≥ 0.70) for both peak force and RFD, with mostly poor to good results shown for the ankle muscles (ICCs = 0.31–0.79). Conclusions Hand-held dynamometry has good to excellent reliability and validity for most measures of isometric lower limb strength and power in a healthy population, particularly for proximal muscle groups. To aid implementation we have created freely available software to extract these variables from data stored on the Lafayette device. Future research should examine the reliability and validity of these variables in clinical populations. PMID:26509265
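The peak 200 ms moving-window algorithm identified above as the most reliable RFD estimate amounts to finding the steepest average slope of the force-time curve over any 200 ms window. A minimal sketch on a synthetic isometric force trace; the sampling rate and force profile are hypothetical.

```python
import numpy as np

def peak_rfd(force, fs, window_ms=200):
    """Peak rate of force development: steepest average slope of the force-time
    curve over any window of the given length (N/s)."""
    n = int(round(window_ms / 1000 * fs))
    force = np.asarray(force, dtype=float)
    slopes = (force[n:] - force[:-n]) / (n / fs)   # average slope of each window
    return slopes.max()

# Hypothetical 2 s isometric trial sampled at 1 kHz
fs = 1000
t = np.arange(0, 2, 1 / fs)
force = 400 / (1 + np.exp(-12 * (t - 0.5)))   # sigmoidal force rise to ~400 N
print(round(peak_rfd(force, fs), 1), "N/s")
```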
Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel
2016-01-01
Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r2). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r2 = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
Population-based validation of a German version of the Brief Resilience Scale
Wenzel, Mario; Stieglitz, Rolf-Dieter; Kunzler, Angela; Bagusat, Christiana; Helmreich, Isabella; Gerlicher, Anna; Kampa, Miriam; Kubiak, Thomas; Kalisch, Raffael; Lieb, Klaus; Tüscher, Oliver
2018-01-01
Smith and colleagues developed the Brief Resilience Scale (BRS) to assess the individual ability to recover from stress despite significant adversity. This study aimed to validate the German version of the BRS. We used data from a population-based (sample 1: n = 1.481) and a representative (sample 2: n = 1.128) sample of participants from the German general population (age ≥ 18) to assess reliability and validity. Confirmatory factor analyses (CFA) were conducted to compare one- and two-factorial models from previous studies with a method-factor model which especially accounts for the wording of the items. Reliability was analyzed. Convergent validity was measured by correlating BRS scores with mental health measures, coping, social support, and optimism. Reliability was good (α = .85, ω = .85 for both samples). The method-factor model showed excellent model fit (sample 1: χ2/df = 7.544; RMSEA = .07; CFI = .99; SRMR = .02; sample 2: χ2/df = 1.166; RMSEA = .01; CFI = 1.00; SRMR = .01) which was significantly better than the one-factor model (Δχ2(4) = 172.71, p < .001) or the two-factor model (Δχ2(3) = 31.16, p < .001). The BRS was positively correlated with well-being, social support, optimism, and the coping strategies active coping, positive reframing, acceptance, and humor. It was negatively correlated with somatic symptoms, anxiety and insomnia, social dysfunction, depression, and the coping strategies religion, denial, venting, substance use, and self-blame. To conclude, our results provide evidence for the reliability and validity of the German adaptation of the BRS as well as the unidimensional structure of the scale once method effects are accounted for. PMID:29438435
Jordan, D; McEwen, S A; Wilson, J B; McNab, W B; Lammerding, A M
1999-05-01
A study was conducted to provide a quantitative description of the amount of tag (mud, soil, and bedding) adhered to the hides of feedlot beef cattle and to appraise the statistical reliability of a subjective rating system for assessing this trait. Initially, a single rater obtained baseline data by assessing 2,417 cattle for 1 month at an Ontario beef processing plant. Analysis revealed that there was a strong tendency for animals within sale-lots to have a similar total tag score (intralot correlation = 0.42). Baseline data were summarized by fitting a linear model describing an individual's total tag score as the sum of their lot mean tag score (LMTS) plus an amount representing normal variation within the lot. LMTSs predicted by the linear model were adequately described by a beta distribution with parameters nu = 3.12 and omega = 5.82 scaled to fit on the 0-to-9 interval. Five raters, trained in use of the tag scoring system, made 1,334 tag score observations in a commercial abattoir, allowing reliability to be assessed at the individual level and at the lot level. High values for reliability were obtained for individual total tag score (0.84) and lot total tag score (0.83); these values suggest that the tag scoring system could be used in the marketing and slaughter of Ontario beef cattle to improve the cleanliness of animals presented for slaughter in an effort to control the entry of microbial contamination into abattoirs. Implications for the use of the tag scoring system in research are discussed.
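Given the reported beta parameters, the fitted distribution of lot mean tag scores can be reproduced directly, assuming nu and omega are the standard shape parameters of a beta distribution rescaled to the 0-9 interval.

```python
import numpy as np
from scipy.stats import beta

# Beta distribution reported for lot mean tag scores, rescaled to the 0-9 interval
# (assumes nu and omega are the usual shape parameters a and b)
a, b = 3.12, 5.82
lot_mean_tag = beta(a, b, loc=0, scale=9)

print(round(lot_mean_tag.mean(), 2))                 # expected lot mean tag score
print(np.round(lot_mean_tag.ppf([0.05, 0.95]), 2))   # 5th and 95th percentiles
```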
Reliability of Soft Tissue Model Based Implant Surgical Guides; A Methodological Mistake.
Sabour, Siamak; Dastjerdi, Elahe Vahid
2012-08-20
We were interested to read the paper by Maney P and colleagues published in the July 2012 issue of J Oral Implantol. The authors, who aimed to assess the reliability of soft tissue model based implant surgical guides, reported that accuracy was evaluated using software.1 I found the manuscript title of Maney P, et al. incorrect and misleading. Moreover, they reported that twenty-two sites (46.81%) were considered accurate (13 of 24 maxillary and 9 of 23 mandibular sites). As the authors point out in their conclusion, soft tissue models do not always provide sufficient accuracy for implant surgical guide fabrication. Reliability (precision) and validity (accuracy) are two different methodological issues in research. Sensitivity, specificity, PPV, NPV, likelihood ratio positive (true positive/false negative) and likelihood ratio negative (false positive/true negative), as well as the odds ratio (true results/false results, preferably more than 50), are among the tests used to evaluate the validity (accuracy) of a single test compared to a gold standard.2-4 It is not clear to which of the above-mentioned estimates of validity the reported twenty-two sites (46.81%) considered accurate relate. Reliability (repeatability or reproducibility) is often assessed with statistical tests such as Pearson's r, least squares and the paired t-test, all of which are among common mistakes in reliability analysis.5 Briefly, for quantitative variables the intraclass correlation coefficient (ICC) and for qualitative variables weighted kappa should be used, with caution, because kappa has its own limitations. Regarding reliability or agreement, it is good to know that in computing the kappa value only concordant cells are considered, whereas discordant cells should also be taken into account in order to reach a correct estimate of agreement (weighted kappa).2-4 As a take-home message, for reliability and validity analysis, appropriate tests should be applied.
Validation of the Physical Activity Questionnaire for Older Children (PAQ-C) among Chinese Children.
Wang, Jing Jing; Baranowski, Tom; Lau, Wc Patrick; Chen, Tzu An; Pitkethly, Amanda Jane
2016-03-01
This study initially validates the Chinese version of the Physical Activity Questionnaire for Older Children (PAQ-C), which has been identified as a potentially valid instrument to assess moderate-to-vigorous physical activity (MVPA) in children among diverse racial groups. The psychometric properties of the PAQ-C were assessed with 742 Hong Kong Chinese children via the scale's internal consistency reliability, test-retest reliability, confirmatory factor analysis (CFA) in the overall sample, multistep invariance tests across gender groups, and convergent validity with body mass index (BMI) and accelerometry-based MVPA. The Cronbach alpha coefficient (α=0.79), composite reliability value (ρ=0.81), and the intraclass correlation coefficient (α=0.82) indicate the satisfactory reliability of the PAQ-C score. The CFA indicated that the data fit a single-factor model, suggesting that the PAQ-C measures only one construct, MVPA over the previous 7 days. The multiple-group CFAs suggested that the factor loadings and variances and covariances of the PAQ-C measurement model were invariant across gender groups. The PAQ-C score was related to accelerometry-based MVPA (r=0.33) and inversely related to BMI (r=-0.18). This study demonstrates the reliability and validity of the PAQ-C in Chinese children. Copyright © 2016 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
Simulation-Based Training for Colonoscopy
Preisler, Louise; Svendsen, Morten Bo Søndergaard; Nerup, Nikolaj; Svendsen, Lars Bo; Konge, Lars
2015-01-01
The aim of this study was to create simulation-based tests with credible pass/fail standards for 2 different fidelities of colonoscopy models. Only competent practitioners should perform colonoscopy. Reliable and valid simulation-based tests could be used to establish basic competency in colonoscopy before practicing on patients. Twenty-five physicians (10 consultants with endoscopic experience and 15 fellows with very little endoscopic experience) were tested on 2 different simulator models: a virtual-reality simulator and a physical model. Tests were repeated twice on each simulator model. Metrics with discriminatory ability were identified for both modalities and reliability was determined. The contrasting-groups method was used to create pass/fail standards and the consequences of these were explored. The consultants performed significantly faster and scored significantly higher than the fellows on both models (P < 0.001). Reliability analysis showed Cronbach α = 0.80 and 0.87 for the virtual-reality and the physical model, respectively. The established pass/fail standards failed one of the consultants (virtual-reality simulator) and allowed one fellow to pass (physical model). The 2 tested simulation-based modalities provided reliable and valid assessments of competence in colonoscopy, and credible pass/fail standards were established for both tests. We propose to use these standards in simulation-based training programs before proceeding to supervised training on patients. PMID:25634177
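The contrasting-groups method cited above sets the pass/fail standard from the score distributions of the experienced and novice groups. A common simplification, sketched below with hypothetical simulator scores, places the cut score where normal curves fitted to the two groups intersect between the group means.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical simulator scores for the two contrasting groups
experienced = np.array([82, 88, 75, 90, 85, 79, 91, 86, 84, 80], dtype=float)
novices = np.array([55, 62, 48, 70, 58, 65, 52, 60, 66, 57, 61, 50, 59, 63, 54], dtype=float)

m1, s1 = experienced.mean(), experienced.std(ddof=1)
m0, s0 = novices.mean(), novices.std(ddof=1)

# Pass/fail standard: intersection of the two fitted normal densities,
# searched between the group means
grid = np.linspace(m0, m1, 2001)
cutoff = grid[np.argmin(np.abs(norm.pdf(grid, m0, s0) - norm.pdf(grid, m1, s1)))]
print(round(float(cutoff), 1))
```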
Ebara, Takeshi; Azuma, Ryohei; Shoji, Naoto; Matsukawa, Tsuyoshi; Yamada, Yasuyuki; Akiyama, Tomohiro; Kurihara, Takahiro; Yamada, Shota
2017-11-25
Objective measurements using built-in smartphone sensors that can measure physical activity/inactivity in daily working life have the potential to provide a new approach to assessing workers' health effects. The aim of this study was to elucidate the characteristics and reliability of built-in step counting sensors on smartphones for development of an easy-to-use objective measurement tool that can be applied in ergonomics or epidemiological research. To evaluate the reliability of step counting sensors embedded in seven major smartphone models, the 6-minute walk test was conducted and the following analyses of sensor precision and accuracy were performed: 1) relationship between actual step count and step count detected by sensors, 2) reliability between smartphones of the same model, and 3) false detection rates when sitting during office work, while riding the subway, and while driving. On five of the seven models, the intraclass correlation coefficient (ICC(3,1)) showed high reliability, with a range of 0.956-0.993. The other two models, however, had ranges of 0.443-0.504, and the relative error ratios of the sensor-detected step count to the actual step count were ±48.7%-49.4%. The level of agreement between units of the same model was ICC(3,1): 0.992-0.998. The false detection rates differed between the sitting conditions. These results suggest the need for appropriate adjustment of step counts measured by sensors, through means such as correction or calibration with a predictive model formula, in order to obtain the highly reliable measurement results that are sought in scientific investigation.
Development of modelling algorithm of technological systems by statistical tests
NASA Astrophysics Data System (ADS)
Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.
2018-03-01
The paper tackles the problem of economic assessment of design efficiency for various technological systems at the operation stage. The modelling algorithm for a technological system, built on statistical tests and incorporating a reliability index, allows the level of technical excellence of the machinery to be estimated and the efficiency of the design reliability to be evaluated against its performance. The economic feasibility of applying the system should be determined from the service quality of the technological system, together with forecasts of the volumes and range of spare parts supply.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dandini, Vincent John; Duran, Felicia Angelica; Wyss, Gregory Dane
2003-09-01
This article describes how features of event tree analysis and Monte Carlo-based discrete event simulation can be combined with concepts from object-oriented analysis to develop a new risk assessment methodology, with some of the best features of each. The resultant object-based event scenario tree (OBEST) methodology enables an analyst to rapidly construct realistic models for scenarios for which an a priori discovery of event ordering is either cumbersome or impossible. Each scenario produced by OBEST is automatically associated with a likelihood estimate because probabilistic branching is integral to the object model definition. The OBEST methodology is then applied to an aviation safety problem that considers mechanisms by which an aircraft might become involved in a runway incursion incident. The resulting OBEST model demonstrates how a close link between human reliability analysis and probabilistic risk assessment methods can provide important insights into aviation safety phenomenology.
CRAX/Cassandra Reliability Analysis Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robinson, D.
1999-02-10
Over the past few years Sandia National Laboratories has been moving toward an increased dependence on model- or physics-based analyses as a means to assess the impact of long-term storage on the nuclear weapons stockpile. These deterministic models have also been used to evaluate replacements for aging systems, often involving commercial off-the-shelf components (COTS). In addition, the models have been used to assess the performance of replacement components manufactured via unique, small-lot production runs. In either case, the limited amount of available test data dictates that the only logical course of action to characterize the reliability of these components is to specifically consider the uncertainties in material properties, operating environment, etc. within the physics-based (deterministic) model. This not only provides the ability to statistically characterize the expected performance of the component or system, but also provides direction regarding the benefits of additional testing on specific components within the system. An effort was therefore initiated to evaluate the capabilities of existing probabilistic methods and, if required, to develop new analysis methods to support the inclusion of uncertainty in the classical design tools used by analysts and design engineers at Sandia. The primary result of this effort is the CRAX (Cassandra Exoskeleton) reliability analysis software.
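The physics-based approach described above amounts to propagating uncertainty in material properties and operating environment through a deterministic model and reading off a failure probability. The sketch below illustrates that idea on a toy stress-versus-strength model; all distributions and parameter values are placeholders, not anything taken from CRAX/Cassandra.

```python
import numpy as np

rng = np.random.default_rng(42)

def stress(load, area):
    """Deterministic (physics-based) model: axial stress in a member (Pa)."""
    return load / area

def failure_probability(n=200_000):
    """Propagate uncertainty in load, geometry and strength through the model.
    All distributions below are illustrative placeholders."""
    load = rng.normal(10_000.0, 1_500.0, n)       # applied load, N
    area = rng.normal(4.0e-4, 2.0e-5, n)          # cross-sectional area, m^2
    strength = rng.normal(3.0e7, 3.0e6, n)        # material strength, Pa
    return float(np.mean(stress(load, area) > strength))

print(failure_probability())
```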
ERIC Educational Resources Information Center
Colombini, Crystal Broch; McBride, Maureen
2012-01-01
Composition assessment scholars have exhibited uneasiness with the language of norming, grounded in distaste for the psychometric assumption that achievement of consensus in a communal assessment setting is desirable even at the cost of individual pedagogical values. Responding to the problems of a "reliability" defined by homogeneous agreement,…
Yang, Huan; Meijer, Hil G E; Buitenweg, Jan R; van Gils, Stephan A
2016-01-01
Healthy or pathological states of nociceptive subsystems determine different stimulus-response relations measured from quantitative sensory testing. In turn, stimulus-response measurements may be used to assess these states. In a recently developed computational model, six model parameters characterize activation of nerve endings and spinal neurons. However, both model nonlinearity and the limited information in yes-no detection responses to electrocutaneous stimuli make it challenging to estimate the model parameters. Here, we address the question of whether and how one can overcome these difficulties for reliable parameter estimation. First, we fit the computational model to experimental stimulus-response pairs by maximizing the likelihood. To evaluate the balance between model fit and complexity, i.e., the number of model parameters, we use the Bayesian Information Criterion. We find that the computational model achieves a better balance than a conventional logistic model. Second, our theoretical analysis suggests varying the pulse width among applied stimuli as a necessary condition to prevent structural non-identifiability. In addition, the numerically implemented profile likelihood approach reveals structural and practical non-identifiability. Our model-based approach with integration of psychophysical measurements can be useful for a reliable assessment of states of the nociceptive system.
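The fit-versus-complexity balance referred to above is the standard Bayesian Information Criterion, BIC = k ln n - 2 ln L. A minimal sketch of comparing two fitted models on this basis is shown below; the log-likelihoods, parameter counts and sample size are placeholders, not values from the study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower values indicate a better
    balance between goodness of fit and model complexity."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Placeholder numbers: log-likelihoods of two models fitted to the same data.
n_obs = 400                     # number of stimulus-response pairs
bic_computational = bic(log_likelihood=-215.3, n_params=6, n_obs=n_obs)
bic_logistic      = bic(log_likelihood=-231.8, n_params=2, n_obs=n_obs)
print(bic_computational, bic_logistic)   # prefer the model with the smaller BIC
```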
ERIC Educational Resources Information Center
Ling, Guangming
2012-01-01
To assess the value of individual students' subscores on the Major Field Test in Business (MFT Business), I examined the test's internal structure with factor analysis and structural equation model methods, and analyzed the subscore reliabilities using the augmented scores method. Analyses of the internal structure suggested that the MFT Business…
ERIC Educational Resources Information Center
Wesolowski, Brian C.; Amend, Ross M.; Barnstead, Thomas S.; Edwards, Andrew S.; Everhart, Matthew; Goins, Quentin R.; Grogan, Robert J., III; Herceg, Amanda M.; Jenkins, S. Ira; Johns, Paul M.; McCarver, Christopher J.; Schaps, Robin E.; Sorrell, Gary W.; Williams, Jonathan D.
2017-01-01
The purpose of this study was to describe the development of a valid and reliable rubric to assess secondary-level solo instrumental music performance based on principles of invariant measurement. The research questions that guided this study included (1) What is the psychometric quality (i.e., validity, reliability, and precision) of a scale…
System reliability of randomly vibrating structures: Computational modeling and laboratory testing
NASA Astrophysics Data System (ADS)
Sundar, V. S.; Ammanagi, S.; Manohar, C. S.
2015-09-01
The problem of determination of system reliability of randomly vibrating structures arises in many application areas of engineering. We discuss in this paper approaches based on Monte Carlo simulations and laboratory testing to tackle problems of time variant system reliability estimation. The strategy we adopt is based on the application of Girsanov's transformation to the governing stochastic differential equations, which enables estimation of the probability of failure with a significantly smaller number of samples than is needed in a direct simulation study. Notably, we show that the ideas from Girsanov's transformation based Monte Carlo simulations can be extended to conduct laboratory testing to assess the system reliability of engineering structures with a reduced number of samples and hence reduced testing times. Illustrative examples include computational studies on a 10-degree of freedom nonlinear system model and laboratory/computational investigations on road load response of an automotive system tested on a four-post test rig.
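For stochastic differential equations, Girsanov's transformation plays the role that importance sampling plays in static problems: the excitation is biased toward the failure region and the estimate is corrected by a likelihood ratio. The static sketch below illustrates that variance-reduction idea on a toy limit state; the shift, distributions and limit-state function are assumptions for illustration only, not the scheme used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def limit_state(x):
    # Failure when g(x) < 0; a toy limit state with a small failure probability.
    return 4.5 - x

n = 20_000
# Crude Monte Carlo: sample from the nominal standard normal density.
x_mc = rng.standard_normal(n)
p_mc = np.mean(limit_state(x_mc) < 0.0)

# Importance sampling: sample from a density shifted toward the failure region,
# and reweight each sample by the likelihood ratio nominal/proposal.
shift = 4.5
x_is = rng.standard_normal(n) + shift
log_w = -0.5 * x_is**2 + 0.5 * (x_is - shift)**2   # log of N(0,1)/N(shift,1)
p_is = np.mean((limit_state(x_is) < 0.0) * np.exp(log_w))

print(f"crude MC: {p_mc:.2e}, importance sampling: {p_is:.2e}")
# True value: P(X > 4.5) for a standard normal, roughly 3.4e-6.
```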
Fatigue reliability of deck structures subjected to correlated crack growth
NASA Astrophysics Data System (ADS)
Feng, G. Q.; Garbatov, Y.; Guedes Soares, C.
2013-12-01
The objective of this work is to analyse the fatigue reliability of deck structures subjected to correlated crack growth. The stress intensity factors of the correlated cracks are obtained by finite element analysis, from which the geometry correction functions are derived. Monte Carlo simulations are applied to predict the statistical descriptors of correlated cracks based on the Paris-Erdogan equation. A probabilistic model of crack growth as a function of time is used to analyse the fatigue reliability of deck structures accounting for the crack propagation correlation. A deck structure is modelled as a series system of stiffened panels, where a stiffened panel is regarded as a parallel system composed of plates and longitudinal stiffeners. It is shown that the method developed here can be conveniently applied to perform the fatigue reliability assessment of structures subjected to correlated crack growth.
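As a simplified illustration of the Monte Carlo treatment of the Paris-Erdogan law da/dN = C (ΔK)^m with ΔK = Y Δσ √(πa), the sketch below samples the material coefficient C and counts how often a single crack reaches a critical size within the service life. All numerical values, distributions, and the single-crack setting (no correlation between cracks) are assumptions for illustration, not the correlated-crack model of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def cycles_to_critical(a0, a_crit, C, m, delta_sigma, Y, max_cycles, step=1000):
    """Integrate da/dN = C*(Y*delta_sigma*sqrt(pi*a))**m in blocks of 'step' cycles
    until the crack reaches a_crit or max_cycles is exhausted; returns the cycle count."""
    a, n = a0, 0
    while a < a_crit and n < max_cycles:
        delta_k = Y * delta_sigma * np.sqrt(np.pi * a)     # MPa*sqrt(m)
        a += step * C * delta_k**m                         # explicit Euler block
        n += step
    return n

n_sim, a0, a_crit = 2_000, 1e-3, 20e-3            # m: initial and critical crack sizes
m, Y, delta_sigma = 3.0, 1.0, 120.0               # Paris exponent, geometry factor, MPa
service_life = 1_000_000                          # cycles
C = rng.lognormal(mean=np.log(5e-12), sigma=0.3, size=n_sim)  # scatter in the Paris coefficient

cycles = np.array([cycles_to_critical(a0, a_crit, c, m, delta_sigma, Y, service_life) for c in C])
p_failure = np.mean(cycles < service_life)
print(f"estimated probability of reaching the critical crack size in service: {p_failure:.3f}")
```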
Dynamic one-dimensional modeling of secondary settling tanks and system robustness evaluation.
Li, Ben; Stenstrom, M K
2014-01-01
One-dimensional secondary settling tank models are widely used in current engineering practice for design and optimization, and usually can be expressed as a nonlinear hyperbolic or nonlinear strongly degenerate parabolic partial differential equation (PDE). Reliable numerical methods are needed to produce approximate solutions that converge to the exact analytical solutions. In this study, we introduced a reliable numerical technique, the Yee-Roe-Davis (YRD) method, as the governing PDE solver, and compared its reliability with the prevalent Stenstrom-Vitasovic-Takács (SVT) method by assessing their simulation results at various operating conditions. The YRD method also produced a similar solution to the previously developed Method G and the Engquist-Osher method. The YRD and SVT methods were also used for a time-to-failure evaluation, and the results show that the choice of numerical method can greatly impact the solution. Reliable numerical methods, such as the YRD method, are strongly recommended.
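The Yee-Roe-Davis scheme itself is not reproduced here, but the Engquist-Osher flux mentioned above as giving comparable results is easy to sketch for the convective part of the settling equation. Below, a Vesilind-type batch settling flux is used in a first-order explicit conservative update; the parameter values, initial profile and periodic boundaries are assumptions for illustration, and compression/dispersion terms as well as inlet/outlet boundary conditions are omitted.

```python
import numpy as np

# Vesilind-type batch settling flux f(u) = v0 * u * exp(-r * u); parameters are illustrative.
v0, r = 6.0, 0.5            # m/h, m^3/kg
u_star = 1.0 / r            # concentration at which f attains its maximum

def f(u):
    return v0 * u * np.exp(-r * u)

def eo_flux(ul, ur):
    """Engquist-Osher numerical flux for a flux f with f(0)=0 that is increasing on
    [0, u_star] and decreasing beyond: F(ul, ur) = f+(ul) + f-(ur)."""
    f_plus = f(np.minimum(ul, u_star))
    f_minus = f(np.maximum(ur, u_star)) - f(u_star)
    return f_plus + f_minus

# First-order explicit update of u_t + f(u)_x = 0 on a periodic grid (illustrative only).
nx, L, cfl = 200, 10.0, 0.4
dx = L / nx
u = np.where(np.linspace(0, L, nx) < 5.0, 5.0, 0.5)   # kg/m^3 initial profile
dt = cfl * dx / v0                                     # v0 bounds |f'(u)| for u >= 0
for _ in range(200):
    flux = eo_flux(u, np.roll(u, -1))                  # F at interfaces i+1/2
    u = u - dt / dx * (flux - np.roll(flux, 1))        # conservative update
```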
Modelling utility-scale wind power plants. Part 2: Capacity credit
NASA Astrophysics Data System (ADS)
Milligan, Michael R.
2000-10-01
As the worldwide use of wind turbine generators in utility-scale applications continues to increase, it will become increasingly important to assess the economic and reliability impact of these intermittent resources. Although the utility industry appears to be moving towards a restructured environment, basic economic and reliability issues will continue to be relevant to companies involved with electricity generation. This article is the second in a two-part series that addresses modelling approaches and results that were obtained in several case studies and research projects at the National Renewable Energy Laboratory (NREL). This second article focuses on wind plant capacity credit as measured with power system reliability indices. Reliability-based methods of measuring capacity credit are compared with wind plant capacity factor. The relationship between capacity credit and accurate wind forecasting is also explored. Published in 2000 by John Wiley & Sons, Ltd.
Proactive assessment of accident risk to improve safety on a system of freeways : [research brief].
DOT National Transportation Integrated Search
2012-05-01
As traffic safety on freeways continues to be a growing concern, much progress has been made in shifting from reactive (incident detection) to proactive (real-time crash risk assessment) traffic strategies. Reliable models that can take in real-time ...
Ouyang, Tingping; Fu, Shuqing; Zhu, Zhaoyu; Kuang, Yaoqiu; Huang, Ningsheng; Wu, Zhifeng
2008-11-01
The thermodynamic law is one of the most widely used scientific principles. The comparability between the environmental impact of urbanization and the thermodynamic entropy was systematically analyzed. Consequently, the concept "Urban Environment Entropy" was brought forward and the "Urban Environment Entropy" model was established for urbanization environmental impact assessment in this study. The model was then utilized in a case study for the assessment of river water quality in the Pearl River Delta Economic Zone. The results indicated that the assessment results of the model are consistent with those of the equalized synthetic pollution index method. Therefore, it can be concluded that the Urban Environment Entropy model has high reliability and can be applied widely in urbanization environmental assessment research using many different environmental parameters.
Development and validation of a Malawian version of the primary care assessment tool.
Dullie, Luckson; Meland, Eivind; Hetlevik, Øystein; Mildestvedt, Thomas; Gjesdal, Sturla
2018-05-16
Malawi does not have validated tools for assessing primary care performance from patients' experience. The aim of this study was to develop a Malawian version of the Primary Care Assessment Tool (PCAT-Mw) and to evaluate its reliability and validity in the assessment of the core primary care dimensions from adult patients' perspective in Malawi. A team of experts assessed the South African version of the primary care assessment tool (ZA-PCAT) for face and content validity. The adapted questionnaire underwent forward and backward translation and a pilot study. The tool was then used in an interviewer-administered cross-sectional survey in Neno district, Malawi, to test validity and reliability. Exploratory factor analysis was performed on a random half of the sample to evaluate internal consistency, reliability and construct validity of items and scales. The identified constructs were then tested with confirmatory factor analysis. Likert scale assumption testing and descriptive statistics were done on the final factor structure. The PCAT-Mw was further tested for intra-rater and inter-rater reliability. From the responses of 631 patients, a 29-item PCAT-Mw was constructed comprising seven multi-item scales, representing five primary care dimensions (first contact, continuity, comprehensiveness, coordination and community orientation). All seven scales achieved good internal consistency, item-total correlations and construct validity. Cronbach's alpha coefficient ranged from 0.66 to 0.91. A satisfactory goodness of fit model was achieved (GFI = 0.90, CFI = 0.91, RMSEA = 0.05, PCLOSE = 0.65). The full range of possible scores was observed for all scales. Scaling assumption tests were achieved for all except the two comprehensiveness scales. The intra-class correlation coefficient (ICC) was 0.90 (n = 44, 95% CI 0.81-0.94, p < 0.001) for intra-rater reliability and 0.84 (n = 42, 95% CI 0.71-0.96, p < 0.001) for inter-rater reliability. Comprehensive metric analyses supported the reliability and validity of PCAT-Mw in assessing the core concepts of primary care from adult patients' experience. This tool could be used for health service research in primary care in Malawi.
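For reference, the internal-consistency figures reported above are Cronbach's alpha coefficients, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A minimal numpy sketch of that computation is given below; the function name and the respondents-by-items data layout are assumptions for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, respondents x items (e.g., responses for one multi-item scale)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)
```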
Assessment of physical server reliability in multi cloud computing system
NASA Astrophysics Data System (ADS)
Kalyani, B. J. D.; Rao, Kolasani Ramchand H.
2018-04-01
Business organizations nowadays often function with more than one cloud provider. Spreading cloud deployment across multiple service providers creates room for competitive pricing that reduces the burden on enterprise budgets. To assess the software reliability of a multi-cloud application, a layered software reliability assessment paradigm is considered, with three levels of abstraction: the application layer, the virtualization layer, and the server layer. The reliability of each layer is assessed separately and then combined to obtain the reliability of the multi-cloud computing application. In this paper, we focus on how to assess the reliability of the server layer, present the required algorithms, and explore the steps in the assessment of server reliability.
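The paper's own algorithms are not reproduced here; as a generic illustration of how layer-level reliabilities might be combined, the sketch below treats the application, virtualization and server layers as a series system and models the server layer as redundant physical servers, under the usual independence assumption. All numerical values are placeholders.

```python
from math import prod

def parallel(reliabilities):
    """Redundant components: the layer works if at least one component works."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

def series(reliabilities):
    """Layers stacked in series: every layer must work (independence assumed)."""
    return prod(reliabilities)

# Placeholder reliabilities for one reporting period.
r_application = 0.995
r_virtualization = 0.990
r_server_layer = parallel([0.97, 0.97, 0.95])   # three redundant physical servers

r_multi_cloud_app = series([r_application, r_virtualization, r_server_layer])
print(f"multi-cloud application reliability: {r_multi_cloud_app:.5f}")
```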
Sullivan, Jennifer L; Rivard, Peter E; Shin, Marlena H; Rosen, Amy K
2016-09-01
The lack of a tool for categorizing and differentiating hospitals according to their high reliability organization (HRO)-related characteristics has hindered progress toward implementing and sustaining evidence-based HRO practices. Hospitals would benefit both from an understanding of the organizational characteristics that support HRO practices and from knowledge about the steps necessary to achieve HRO status to reduce the risk of harm and improve outcomes. The High Reliability Health Care Maturity (HRHCM) model, a model for health care organizations' achievement of high reliability with zero patient harm, incorporates three major domains critical for promoting HROs: Leadership, Safety Culture, and Robust Process Improvement®. A study was conducted to examine the content validity of the HRHCM model and evaluate whether it can differentiate hospitals' maturity levels for each of the model's components. Staff perceptions of patient safety at six US Department of Veterans Affairs (VA) hospitals were examined to determine whether all 14 HRHCM components were present and to characterize each hospital's level of organizational maturity. Twelve of the 14 components from the HRHCM model were detected; two additional characteristics emerged that are present in the HRO literature but not represented in the model: teamwork culture and system-focused tools for learning and improvement. Each hospital's level of organizational maturity could be characterized for 9 of the 14 components. The findings suggest the HRHCM model has good content validity and that there is differentiation between hospitals on model components. Additional research is needed to understand how these components can be used to build the infrastructure necessary for reaching high reliability.
NASA Astrophysics Data System (ADS)
Hanna, Ryan
Distributed energy resources (DERs), and increasingly microgrids, are becoming an integral part of modern distribution systems. Interest in microgrids--which are insular and autonomous power networks embedded within the bulk grid--stems largely from the vast array of flexibilities and benefits they can offer stakeholders. Managed well, they can improve grid reliability and resiliency, increase end-use energy efficiency by coupling electric and thermal loads, reduce transmission losses by generating power locally, and may reduce system-wide emissions, among many others. Whether these public benefits are realized, however, depends on whether private firms see a "business case", or private value, in investing. To this end, firms need models that evaluate costs, benefits, risks, and assumptions that underlie decisions to invest. The objectives of this dissertation are to assess the business case for microgrids that provide what industry analysts forecast as two primary drivers of market growth--that of providing energy services (similar to an electric utility) as well as reliability service to customers within. Prototypical first adopters are modeled--using an existing model to analyze energy services and a new model that couples that analysis with one of reliability--to explore interactions between technology choice, reliability, costs, and benefits. The new model has a bi-level hierarchy; it uses heuristic optimization to select and size DERs and analytical optimization to schedule them. It further embeds Monte Carlo simulation to evaluate reliability as well as regression models for customer damage functions to monetize reliability. It provides least-cost microgrid configurations for utility customers who seek to reduce interruption and operating costs. Lastly, the model is used to explore the impact of such adoption on system-wide greenhouse gas emissions in California. Results indicate that there are, at present, co-benefits for emissions reductions when customers adopt and operate microgrids for private benefit, though future analysis is needed as the bulk grid continues to transition toward a less carbon intensive system.
Validity and Reliability of Assessing Body Composition Using a Mobile Application.
Macdonald, Elizabeth Z; Vehrs, Pat R; Fellingham, Gilbert W; Eggett, Dennis; George, James D; Hager, Ronald
2017-12-01
The purpose of this study was to determine the validity and reliability of the LeanScreen (LS) mobile application that estimates percent body fat (%BF) using estimates of circumferences from photographs. The %BF of 148 weight-stable adults was estimated once using dual-energy x-ray absorptiometry (DXA). Each of two administrators assessed the %BF of each subject twice using the LS app and manually measured circumferences. A mixed-model ANOVA and Bland-Altman analyses were used to compare the estimates of %BF obtained from each method. Interrater and intrarater reliability values were determined using multiple measurements taken by each of the two administrators. The LS app and manually measured circumferences significantly underestimated (P < 0.05) the %BF determined using DXA by an average of -3.26 and -4.82 %BF, respectively. The LS app (6.99 %BF) and manually measured circumferences (6.76 %BF) had large limits of agreement. All interrater and intrarater reliability coefficients of estimates of %BF using the LS app and manually measured circumferences exceeded 0.99. The estimates of %BF from manually measured circumferences and the LS app were highly reliable. However, these field measures are not currently recommended for the assessment of body composition because of significant bias and large limits of agreement.
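The bias and limits-of-agreement figures quoted above come from a Bland-Altman analysis; a minimal sketch of that computation is given below. The array names and the 95% (1.96 SD) convention are assumptions for illustration.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Return the mean bias and 95% limits of agreement between two %BF methods."""
    a, b = np.asarray(method_a, float), np.asarray(method_b, float)
    diff = a - b                      # e.g., app-based %BF minus DXA %BF per subject
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```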
ERIC Educational Resources Information Center
Vowles, C. G.
2012-01-01
Controlled assessment (CA) was introduced as a valid and reliable replacement for coursework in GCSE English and English Literature assessments in 2009. I argue that CA lacks clear definition, typically mimics externally-assessed public examinations and, when interrogated through the Crooks eight-link chain model, is undermined by several threats…
How Reliable is the Acetabular Cup Position Assessment from Routine Radiographs?
Carvajal Alba, Jaime A.; Vincent, Heather K.; Sodhi, Jagdeep S.; Latta, Loren L.; Parvataneni, Hari K.
2017-01-01
Background: Cup position is crucial for optimal outcomes in total hip arthroplasty. Radiographic assessment of component position is routinely performed in the early postoperative period. Aims: The aims of this study were to determine in a controlled environment if routine radiographic methods accurately and reliably assess the acetabular cup position and to assess if there is a statistical difference related to the rater’s level of training. Methods: A pelvic model was mounted in a spatial frame. An acetabular cup was fixed in different degrees of version and inclination. Standardized radiographs were obtained. Ten observers including five fellowship-trained orthopaedic surgeons and five orthopaedic residents performed a blind assessment of cup position. Inclination was assessed from anteroposterior radiographs of the pelvis and version from cross-table lateral radiographs of the hip. Results: The radiographic methods used proved to be imprecise, especially when the cup was positioned at the extremes of version and inclination. Excellent inter-observer reliability (intraclass correlation coefficient > 0.9) was observed. There were no differences related to the level of training of the raters. Conclusions: These widely used radiographic methods should be interpreted cautiously and computed tomography should be utilized in cases when further intervention is contemplated. PMID:28852355
Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump.
Ruhi, S; Karim, M R
2016-01-01
Proper maintenance policy can play a vital role in the effective investigation of product reliability. Every engineered object such as a product, plant or infrastructure needs preventive and corrective maintenance. In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We obtain the data that the owner had collected and carry out an analysis and build models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump. Different competitive mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function, mean time to failure, etc., are estimated to assess the reliability of the pump. The Akaike Information Criterion, adjusted Anderson-Darling test statistic, Kolmogorov-Smirnov test statistic and root mean square error are considered to select the suitable models among a set of competitive models. The maximum likelihood estimation method via the EM algorithm is applied mainly for estimating the parameters of the models and reliability related quantities. In this study, it is found that a threefold mixture model (Weibull-Normal-Exponential) fits well for the hydraulic pump failure data set. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period at minimum cost for a hydraulic pump.
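As a much-simplified, complete-data analogue of the model selection described (the real analysis handles censoring and mixture models via the EM algorithm), the sketch below fits a single two-parameter Weibull to failure times with scipy and compares it to an exponential fit by AIC. The failure times are placeholders.

```python
import numpy as np
from scipy import stats

# Placeholder complete (uncensored) failure times in operating hours.
failures = np.array([520., 680., 750., 910., 1040., 1180., 1250., 1430., 1600., 1850.])

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

# Two-parameter Weibull (location fixed at zero).
c, loc, scale = stats.weibull_min.fit(failures, floc=0)
ll_weibull = stats.weibull_min.logpdf(failures, c, loc, scale).sum()

# Exponential distribution as a simpler competitor.
eloc, escale = stats.expon.fit(failures, floc=0)
ll_expon = stats.expon.logpdf(failures, eloc, escale).sum()

print("AIC Weibull:    ", aic(ll_weibull, n_params=2))
print("AIC exponential:", aic(ll_expon, n_params=1))
# Reliability at 1000 h under the Weibull fit:
print("R(1000 h) =", stats.weibull_min.sf(1000.0, c, loc, scale))
```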
Hierarchical Bayesian Model Averaging for Chance Constrained Remediation Designs
NASA Astrophysics Data System (ADS)
Chitsazan, N.; Tsai, F. T.
2012-12-01
Groundwater remediation designs rely heavily on simulation models, which are subject to various sources of uncertainty in their predictions. To develop a robust remediation design, it is crucial to understand the effect of uncertainty sources. In this research, we introduce a hierarchical Bayesian model averaging (HBMA) framework to segregate and prioritize sources of uncertainty in a multi-layer frame, where each layer targets a source of uncertainty. The HBMA framework provides insight into uncertainty priorities and propagation. In addition, HBMA allows evaluating model weights in different hierarchy levels and assessing the relative importance of models in each level. To account for uncertainty, we employ chance constrained (CC) programming for stochastic remediation design. Chance constrained programming has traditionally been used to account for parameter uncertainty. Recently, many studies have suggested that model structure uncertainty is not negligible compared to parameter uncertainty. Using chance constrained programming along with HBMA can provide a rigorous tool for groundwater remediation designs under uncertainty. In this research, the HBMA-CC was applied to a remediation design in a synthetic aquifer. The design was to develop a scavenger well approach to mitigate saltwater intrusion toward production wells. HBMA was employed to assess uncertainties from model structure, parameter estimation and kriging interpolation. An improved harmony search optimization method was used to find the optimal location of the scavenger well. We evaluated prediction variances of chloride concentration at the production wells through the HBMA framework. The results showed that choosing the single best model may lead to a significant error in evaluating prediction variances for two reasons. First, considering the single best model, variances that stem from uncertainty in the model structure will be ignored. Second, considering the best model with a non-dominant model weight may underestimate or overestimate prediction variances by ignoring other plausible propositions. Chance constraints allow a remediation design to be developed with a desired reliability. However, considering the single best model, the calculated reliability will be different from the desired reliability. We calculated the reliability of the design for the models at different levels of HBMA. The results showed that by moving toward the top layers of HBMA, the calculated reliability converges to the chosen reliability. We employed the chance constrained optimization along with the HBMA framework to find the optimal location and pumpage for the scavenger well. The results showed that using models at different levels in the HBMA framework, the optimal location of the scavenger well remained the same, but the optimal extraction rate was altered. Thus, we concluded that the optimal pumping rate was sensitive to the prediction variance. Also, the prediction variance changed when a different extraction rate was used. Using a very high extraction rate will cause prediction variances of chloride concentration at the production wells to approach zero regardless of which HBMA models are used.
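The hierarchical bookkeeping of HBMA is beyond a short sketch, but the core Bayesian model averaging step it builds on can be illustrated: approximate posterior model weights from BIC differences and combine per-model predictions so that the BMA variance contains both within-model and between-model terms. All numbers below are placeholders, and the BIC-based weighting is a common approximation rather than the study's own weighting scheme.

```python
import numpy as np

# Per-model predictions of chloride concentration at a production well (placeholders).
bic  = np.array([312.4, 315.1, 318.9])     # one value per candidate model
mean = np.array([240.0, 265.0, 300.0])     # mg/L, per-model predictive means
var  = np.array([150.0, 180.0, 260.0])     # per-model predictive variances

# Approximate posterior model weights: w_m proportional to exp(-0.5 * delta BIC_m).
delta = bic - bic.min()
w = np.exp(-0.5 * delta)
w /= w.sum()

bma_mean = np.sum(w * mean)
# BMA variance = weighted within-model variance + between-model variance.
bma_var = np.sum(w * var) + np.sum(w * (mean - bma_mean) ** 2)
print(f"weights: {np.round(w, 3)}, BMA mean: {bma_mean:.1f}, BMA variance: {bma_var:.1f}")
```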
An assessment of laser velocimetry in hypersonic flow
NASA Technical Reports Server (NTRS)
1992-01-01
Although extensive progress has been made in computational fluid mechanics, reliable flight vehicle designs and modifications still cannot be made without recourse to extensive wind tunnel testing. Future progress in the computation of hypersonic flow fields is restricted by the need for a reliable mean flow and turbulence modeling data base which could be used to aid in the development of improved empirical models for use in numerical codes. Currently, there are few compressible flow measurements which could be used for this purpose. In this report, the results of experiments designed to assess the potential for laser velocimeter measurements of mean flow and turbulent fluctuations in hypersonic flow fields are presented. Details of a new laser velocimeter system which was designed and built for this test program are described.
Improvement of automatic control system for high-speed current collectors
NASA Astrophysics Data System (ADS)
Sidorov, O. A.; Goryunov, V. N.; Golubkov, A. S.
2018-01-01
The article considers ways of regulating pantographs to provide quality and reliability of current collection at high speeds. To assess the impact of regulation, an integral criterion of current collection quality was proposed, taking into account the efficiency and reliability of pantograph operation. The study was carried out using a mathematical model of the interaction between the pantograph and the catenary system, allowing the contact force and the intensity of arcing in the contact zone to be assessed at different movement speeds. The simulation results allowed us to estimate the efficiency of different methods of pantograph regulation and to determine the best option.
Reliability of System Identification Techniques to Assess Standing Balance in Healthy Elderly
Maier, Andrea B.; Aarts, Ronald G. K. M.; van Gerven, Joop M. A.; Arendzen, J. Hans; Schouten, Alfred C.; Meskers, Carel G. M.; van der Kooij, Herman
2016-01-01
Objectives System identification techniques have the potential to assess the contribution of the underlying systems involved in standing balance by applying well-known disturbances. We investigated the reliability of standing balance parameters obtained with multivariate closed loop system identification techniques. Methods In twelve healthy elderly participants, balance tests were performed twice a day on three days. Body sway was measured during two minutes of standing with eyes closed and the Balance test Room (BalRoom) was used to apply four disturbances simultaneously: two sensory disturbances, to the proprioceptive and the visual system, and two mechanical disturbances applied at the leg and trunk segment. Using system identification techniques, sensitivity functions of the sensory disturbances and the neuromuscular controller were estimated. Based on the generalizability theory (G theory), systematic errors and sources of variability were assessed using linear mixed models and reliability was assessed by computing indexes of dependability (ID), standard error of measurement (SEM) and minimal detectable change (MDC). Results A systematic error was found between the first and second trial in the sensitivity functions. No systematic error was found in the neuromuscular controller and body sway. The reliability of 15 of 25 parameters and body sway were moderate to excellent when the results of two trials on three days were averaged. To reach an excellent reliability on one day in 7 out of 25 parameters, it was predicted that at least seven trials must be averaged. Conclusion This study shows that system identification techniques are a promising method to assess the underlying systems involved in standing balance in the elderly. However, most of the parameters do not appear to be reliable unless a large number of trials are collected across multiple days. To reach an excellent reliability in one third of the parameters, a training session for participants is needed and at least seven trials of two minutes must be performed on one day. PMID:26953694
A Passive System Reliability Analysis for a Station Blackout
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunett, Acacia; Bucknor, Matthew; Grabaskas, David
2015-05-03
The latest iterations of advanced reactor designs have included increased reliance on passive safety systems to maintain plant integrity during unplanned sequences. While these systems are advantageous in reducing the reliance on human intervention and availability of power, the phenomenological foundations on which these systems are built require a novel approach to a reliability assessment. Passive systems possess the unique ability to fail functionally without failing physically, a result of their explicit dependency on existing boundary conditions that drive their operating mode and capacity. Argonne National Laboratory is performing ongoing analyses that demonstrate various methodologies for the characterization of passive system reliability within a probabilistic framework. Two reliability analysis techniques are utilized in this work. The first approach, the Reliability Method for Passive Systems, provides a mechanistic technique employing deterministic models and conventional static event trees. The second approach, a simulation-based technique, utilizes discrete dynamic event trees to treat time-dependent phenomena during scenario evolution. For this demonstration analysis, both reliability assessment techniques are used to analyze an extended station blackout in a pool-type sodium fast reactor (SFR) coupled with a reactor cavity cooling system (RCCS). This work demonstrates the entire process of a passive system reliability analysis, including identification of important parameters and failure metrics, treatment of uncertainties and analysis of results.
Steenson, Sharalyn; Özcebe, Hilal; Arslan, Umut; Konşuk Ünlü, Hande; Araz, Özgür M; Yardim, Mahmut; Üner, Sarp; Bilir, Nazmi; Huang, Terry T-K
2018-01-01
Childhood obesity rates have been rising rapidly in developing countries. A better understanding of the risk factors and social context is necessary to inform public health interventions and policies. This paper describes the validation of several measurement scales for use in Turkey, which relate to child and parent perceptions of physical activity (PA) and enablers and barriers of physical activity in the home environment. The aim of this study was to assess the validity and reliability of several measurement scales in Turkey using a population sample across three socio-economic strata in the Turkish capital, Ankara. Surveys were conducted in Grade 4 children (mean age = 9.7 years for boys; 9.9 years for girls), and their parents, across 6 randomly selected schools, stratified by SES (n = 641 students, 483 parents). Construct validity of the scales was evaluated through exploratory and confirmatory factor analysis. Internal consistency of scales and test-retest reliability were assessed by Cronbach's alpha and intra-class correlation. The scales as a whole were found to have acceptable-to-good model fit statistics (PA Barriers: RMSEA = 0.076, SRMR = 0.0577, AGFI = 0.901; PA Outcome Expectancies: RMSEA = 0.054, SRMR = 0.0545, AGFI = 0.916, and PA Home Environment: RMSEA = 0.038, SRMR = 0.0233, AGFI = 0.976). The PA Barriers subscales showed good internal consistency and poor to fair test-retest reliability (personal α = 0.79, ICC = 0.29, environmental α = 0.73, ICC = 0.59). The PA Outcome Expectancies subscales showed good internal consistency and test-retest reliability (negative α = 0.77, ICC = 0.56; positive α = 0.74, ICC = 0.49). Only the PA Home Environment subscale on support for PA was validated in the final confirmatory model; it showed moderate internal consistency and test-retest reliability (α = 0.61, ICC = 0.48). This study is the first to validate measures of perceptions of physical activity and the physical activity home environment in Turkey. Our results support the originally hypothesized two-factor structures for Physical Activity Barriers and Physical Activity Outcome Expectancies. However, we found the one-factor rather than two-factor structure for Physical Activity Home Environment had the best model fit. This study provides general support for the use of these scales in Turkey in terms of validity, but test-retest reliability warrants further research.
Camera-tracking gaming control device for evaluation of active wrist flexion and extension.
Shefer Eini, Dalit; Ratzon, Navah Z; Rizzo, Albert A; Yeh, Shih-Ching; Lange, Belinda; Yaffe, Batia; Daich, Alexander; Weiss, Patrice L; Kizony, Rachel
Cross-sectional study. Measuring wrist range of motion (ROM) is an essential procedure in hand therapy clinics. The aim was to test the reliability and validity of a dynamic ROM assessment, the Camera Wrist Tracker (CWT). Wrist flexion and extension ROM of 15 patients with distal radius fractures and 15 matched controls were assessed with the CWT and with a universal goniometer. One-way model intraclass correlation coefficient analysis indicated high test-retest reliability for extension (ICC = 0.92) and moderate reliability for flexion (ICC = 0.49). Standard error for extension was 2.45° and for flexion was 4.07°. Repeated-measures analysis revealed a significant main effect for group; ROM was greater in the control group (F[1, 28] = 47.35; P < .001). The concurrent validity of the CWT was partially supported. The results indicate that the CWT may provide highly reliable scores for dynamic wrist extension ROM, and moderately reliable scores for flexion, in people recovering from a distal radius fracture. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
78 FR 75257 - Flutriafol; Pesticide Tolerances
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-11
... for which there is reliable information.'' This includes exposure through drinking water and in...), modeled drinking water estimates, and DEEM\\TM\\ ver. 7.81 default processing factors. ii. Chronic exposure... residues adjusted to account for the residues of concern for risk assessment, 100 PCT, modeled drinking...
Repeatability of the Oxford Foot Model in children with foot deformity.
McCahill, Jennifer; Stebbins, Julie; Koning, Bart; Harlaar, Jaap; Theologis, Tim
2018-03-01
The Oxford Foot Model (OFM) is a multi-segment, kinematic model developed to assess foot motion. It has previously been assessed for repeatability in healthy populations. To determine the OFM's reliability for detecting foot deformity, it is important to know its repeatability in pathological conditions. The aim of the study was to assess the repeatability of the OFM in children with foot deformity. Intra-tester repeatability was assessed for 45 children (15 typically developing, 15 hemiplegic, 15 clubfoot). Inter-tester repeatability was assessed in the clubfoot population. The mean absolute differences between testers (clubfoot) and sessions (clubfoot and hemiplegic) were calculated for each of 15 clinically relevant, kinematic variables and compared to typically developing children. Children with clubfoot showed a mean difference between visits of 2.9° and a mean difference between raters of 3.6°. Mean absolute differences were within one degree for the intra- and inter-rater reliability in 12/15 variables. Hindfoot rotation, forefoot/tibia abduction and forefoot supination were the most variable between testers. Overall, the clubfoot data were less variable than the typically developing population. Children with hemiplegia demonstrated slightly higher differences between sessions (mean 4.1°), with the most reliable data in the sagittal plane, and largest differences in the transverse plane. The OFM was designed to measure different types of foot deformity. The results of this study show that it provides repeatable results in children with foot deformity. To be distinguished from measurement artifact, changes in foot kinematics as a result of intervention or natural progression over time must be greater than the repeatability reported here. Copyright © 2018 Elsevier B.V. All rights reserved.
Austvoll-Dahlgren, Astrid; Guttersrud, Øystein; Nsangi, Allen; Semakula, Daniel; Oxman, Andrew D
2017-05-25
The Claim Evaluation Tools database contains multiple-choice items for measuring people's ability to apply the key concepts they need to know to be able to assess treatment claims. We assessed items from the database using Rasch analysis to develop an outcome measure to be used in two randomised trials in Uganda. Rasch analysis is a form of psychometric testing relying on Item Response Theory. It is a dynamic way of developing outcome measures that are valid and reliable. The objective was to assess the validity, reliability and responsiveness of 88 items addressing 22 key concepts using Rasch analysis. We administered four sets of multiple-choice items in English to 1114 people in Uganda and Norway, of which 685 were children and 429 were adults (including 171 health professionals). We scored all items dichotomously. We explored summary and individual fit statistics using the RUMM2030 analysis package. We used SPSS to perform distractor analysis. Most items conformed well to the Rasch model, but some items needed revision. Overall, the four item sets had satisfactory reliability. We did not identify significant response dependence between any pairs of items and, overall, the magnitude of multidimensionality in the data was acceptable. The items had a high level of difficulty. Most of the items conformed well to the Rasch model's expectations. Following revision of some items, we concluded that most of the items were suitable for use in an outcome measure for evaluating the ability of children or adults to assess treatment claims. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
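In the dichotomous Rasch model underlying this analysis, the probability that a person with ability theta answers an item of difficulty b correctly is exp(theta - b) / (1 + exp(theta - b)). The sketch below simulates responses under that model, which is the kind of expectation that observed item fit is checked against; the abilities, difficulties and sample sizes are invented, and the RUMM2030 fit statistics themselves are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)

def rasch_probability(theta, b):
    """P(correct) for person ability theta and item difficulty b (dichotomous Rasch model)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Invented person abilities and item difficulties on the same logit scale.
theta = rng.normal(0.0, 1.0, size=500)          # 500 respondents
b = np.linspace(-2.0, 2.0, 20)                  # 20 items from easy to hard

p = rasch_probability(theta[:, None], b[None, :])   # persons x items matrix
responses = rng.random(p.shape) < p                 # simulated dichotomous scores
print("observed vs expected proportion correct per item:")
print(np.round(responses.mean(axis=0), 2))
print(np.round(p.mean(axis=0), 2))
```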
Sasaki, Hatoko; Kakee, Naoko; Morisaki, Naho; Mori, Rintaro; Ravens-Sieberer, Ulrike; Bullinger, Monika
2018-05-02
This study examined the reliability and validity of the Japanese versions of the DISABKIDS-37 generic modules, a tool for assessing the health-related quality of life (HRQOL) of children and adolescents with a chronic condition. The study was conducted using a sample of 123 children/adolescents with a chronic medical condition, aged 8-18 years, and their parents. Focus interviews were performed to ensure content validity after translation. Classical psychometric tests were used to assess reliability and scale intercorrelations. The factor structure was examined with confirmatory factor analysis (CFA). Convergent validity was assessed by the correlation between the total score and the sub-scales of DISABKIDS-37 as well as the total score of KIDSCREEN-10. Both the children/adolescent and parent versions of the score showed good to high internal consistency, and the test-retest reliability correlations were r = 0.91 or above. The CFA revealed that the modified models for all domains were a better fit than the original 37-item scale model for both self-report and proxy-report. Moderate to high positive correlations were found for the associations within DISABKIDS-37 sub-scales and between the subscales and total score, except for the treatment sub-scale, which correlated weakly with the remaining sub-scales. The total score of the child-reported version of KIDSCREEN-10 correlated significantly and positively with the total score and all the sub-scales of the child-reported version of DISABKIDS-37 except the Treatment sub-scale in adolescents. The modified models of the Japanese version of the DISABKIDS generic module were psychometrically robust enough to assess the HRQOL of children with a chronic condition.
Bourret, A; Garant, D
2017-03-01
Quantitative genetics approaches, and particularly animal models, are widely used to assess the genetic (co)variance of key fitness related traits and infer adaptive potential of wild populations. Despite the importance of precision and accuracy of genetic variance estimates and their potential sensitivity to various ecological and population specific factors, their reliability is rarely tested explicitly. Here, we used simulations and empirical data collected from an 11-year study on tree swallow (Tachycineta bicolor), a species showing a high rate of extra-pair paternity and a low recruitment rate, to assess the importance of identity errors, structure and size of the pedigree on quantitative genetic estimates in our dataset. Our simulations revealed an important lack of precision in heritability and genetic-correlation estimates for most traits, a low power to detect significant effects and important identifiability problems. We also observed a large bias in heritability estimates when using the social pedigree instead of the genetic one (deflated heritabilities) or when not accounting for an important cause of resemblance among individuals (for example, permanent environment or brood effect) in model parameterizations for some traits (inflated heritabilities). We discuss the causes underlying the low reliability observed here and why they are also likely to occur in other study systems. Altogether, our results re-emphasize the difficulties of generalizing quantitative genetic estimates reliably from one study system to another and the importance of reporting simulation analyses to evaluate these important issues.
Sustainability of transport structures - some aspects of the nonlinear reliability assessment
NASA Astrophysics Data System (ADS)
Pukl, Radomír; Sajdlová, Tereza; Strauss, Alfred; Lehký, David; Novák, Drahomír
2017-09-01
Efficient techniques for both nonlinear numerical analysis of concrete structures and advanced stochastic simulation methods have been combined in order to offer an advanced tool for assessment of the realistic behaviour, failure and safety of transport structures. The utilized approach is based on randomization of the non-linear finite element analysis of the structural models. Degradation aspects such as carbonation of concrete can be accounted for in order to predict the durability of the investigated structure and its sustainability. Results can serve as a rational basis for the performance and sustainability assessment based on advanced nonlinear computer analysis of the structures of transport infrastructure such as bridges or tunnels. In the stochastic simulation, the input material parameters obtained from material tests, including their randomness and uncertainty, are represented as random variables or fields. Appropriate identification of material parameters is crucial for the virtual failure modelling of structures and structural elements. Inverse analysis using artificial neural networks and a virtual stochastic simulation approach is applied to determine the fracture mechanical parameters of the structural material and its numerical model. Structural response, reliability and sustainability have been investigated on different types of transport structures made from various materials using the above-mentioned methodology and tools.
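Randomization of the nonlinear finite element inputs is typically done with a small number of stratified samples; a plain-numpy Latin hypercube sketch for two independent material parameters is shown below, where each sample set would drive one nonlinear FE run. The distributions, parameter values and sample count are assumptions for illustration, not the paper's actual stochastic model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def latin_hypercube(n_samples, n_vars):
    """Stratified uniform(0,1) samples: one point per stratum, per variable."""
    u = (rng.random((n_samples, n_vars)) + np.arange(n_samples)[:, None]) / n_samples
    for j in range(n_vars):
        u[:, j] = rng.permutation(u[:, j])    # decouple the strata across variables
    return u

n = 30                                         # number of nonlinear FE runs
u = latin_hypercube(n, 2)
# Map the uniform samples to assumed material distributions (illustrative values).
compressive_strength = stats.norm.ppf(u[:, 0], loc=38.0, scale=4.0)      # MPa
fracture_energy      = stats.lognorm.ppf(u[:, 1], s=0.2, scale=150.0)    # N/m

print(np.column_stack([compressive_strength, fracture_energy])[:5])  # first five FE input sets
```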
Outdoor Leader Career Development: Exploration of a Career Path
ERIC Educational Resources Information Center
Wagstaff, Mark
2016-01-01
The purpose of this study was to assess the efficacy of the proposed Outdoor Leader Career Development Model (OLCDM) through the development of the Outdoor Leader Career Development Inventory (OLCDI). I assessed the reliability and validity of the OLCDI through exploratory factor analysis, principal component analysis, and varimax rotation, based…
TENI: A comprehensive battery for cognitive assessment based on games and technology.
Delgado, Marcela Tenorio; Uribe, Paulina Arango; Alonso, Andrés Aparicio; Díaz, Ricardo Rosas
2016-01-01
TENI (Test de Evaluación Neuropsicológica Infantil) is an instrument developed to assess cognitive abilities in children between 3 and 9 years of age. It is based on a model that incorporates games and technology as tools to improve the assessment of children's capacities. The test was standardized with two Chilean samples of 524 and 82 children living in urban zones. Evidence of reliability and validity based on current standards is presented. Data show good levels of reliability for all subtests. Some evidence of validity in terms of content, test structure, and association with other variables is presented. This instrument represents a novel approach and a new frontier in cognitive assessment. Further studies with clinical, rural, and cross-cultural populations are required.
NASA Astrophysics Data System (ADS)
Irwanto, Rohaeti, Eli; LFX, Endang Widjajanti; Suyanta
2017-05-01
This research aims to develop and determine the characteristics of an integrated assessment instrument. This research uses the 4-D model, which includes define, design, develop, and disseminate. The primary product was validated by expert judgment, its readability was tested by students, and its feasibility was assessed by chemistry teachers. This research involved 246 students of grade XI of four senior high schools in Yogyakarta, Indonesia. Data collection techniques include interview, questionnaire, and test. Data collection instruments include an interview guideline, item validation sheet, users' response questionnaire, instrument readability questionnaire, and essay test. The results show that the integrated assessment instrument has an Aiken validity value of 0.95. Item reliability was 0.99 and person reliability was 0.69. Teachers' response to the integrated assessment instrument is very good. Therefore, the integrated assessment instrument is feasible for measuring the students' analytical thinking and science process skills.
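The content-validity figure of 0.95 is an Aiken's V coefficient, V = sum(s) / (n (c - 1)), where s is each rater's score minus the lowest possible category, n is the number of raters and c is the number of rating categories. A minimal sketch is given below; the rating values and the 1-5 scale are placeholders, not the study's data.

```python
def aikens_v(ratings, lowest=1, highest=5):
    """Aiken's V for one item rated by several experts on a scale lowest..highest."""
    n = len(ratings)
    c = highest - lowest + 1
    s = sum(r - lowest for r in ratings)
    return s / (n * (c - 1))

print(aikens_v([5, 5, 4, 5, 4]))   # placeholder expert ratings -> 0.90
```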
[Psychometric properties and diagnostic value of 'lexical screening for aphasias'].
Pena-Chavez, R; Martinez-Jimenez, L; Lopez-Espinoza, M
2014-09-16
INTRODUCTION. Language assessment in persons with brain injury makes it possible to know whether they require language rehabilitation or not. Given the importance of a precise evaluation, assessment instruments must be valid and reliable, so as to avoid mistaken and subjective diagnoses. AIM. To validate 'lexical screening for aphasias' in a sample of 58 Chilean individuals. SUBJECTS AND METHODS. A screening-type language test, lasting 20 minutes and based on the lexical processing model devised by Patterson and Shewell (1987), was constructed. The sample was made up of two groups containing 29 aphasic subjects and 29 control subjects from different health centres in the regions of Biobio and Maule, Chile. Their ages ranged between 24 and 79 years and had between 0 and 17 years' schooling. Tests were carried out to determine discriminating validity, concurrent validity with the aphasia disorder assessment battery, reliability, sensitivity and specificity. RESULTS. The statistical analysis showed a high discriminating validity (p < 0.001), an acceptable mean concurrent validity with aphasia disorder assessment battery (rs = 0.65), high mean reliability (alpha = 0.87), moderate mean sensitivity (69%) and high mean specificity (86%). CONCLUSION. 'Lexical screening for aphasias' is valid and reliable for assessing language in persons with aphasias; it is sensitive for detecting aphasic subjects and is specific for precluding language disorders in persons with normal language abilities.
Reliability and Probabilistic Risk Assessment - How They Play Together
NASA Technical Reports Server (NTRS)
Safie, Fayssal M.; Stutts, Richard G.; Zhaofeng, Huang
2015-01-01
PRA methodology is one of the probabilistic analysis methods that NASA brought from the nuclear industry to assess the risk of LOM, LOV and LOC for launch vehicles. PRA is a system scenario based risk assessment that uses a combination of fault trees, event trees, event sequence diagrams, and probability and statistical data to analyze the risk of a system, a process, or an activity. It is a process designed to answer three basic questions: What can go wrong? How likely is it? What is the severity of the degradation? Since 1986, NASA, along with industry partners, has conducted a number of PRA studies to predict the overall launch vehicle risks. Planning Research Corporation conducted the first of these studies in 1988. In 1995, Science Applications International Corporation (SAIC) conducted a comprehensive PRA study. In July 1996, NASA conducted a two-year study (October 1996 - September 1998) to develop a model that provided the overall Space Shuttle risk and estimates of risk changes due to proposed Space Shuttle upgrades. After the Columbia accident, NASA conducted a PRA on the Shuttle External Tank (ET) foam. This study was the most focused and extensive risk assessment that NASA has conducted in recent years. It used a dynamic, physics-based, integrated system analysis approach to understand the integrated system risk due to ET foam loss in flight. Most recently, a PRA for the Ares I launch vehicle has been performed in support of the Constellation program. Reliability, on the other hand, addresses the loss of functions. In a broader sense, reliability engineering is a discipline that involves the application of engineering principles to the design and processing of products, both hardware and software, for meeting product reliability requirements or goals. It is a very broad design-support discipline. It has important interfaces with many other engineering disciplines. Reliability as a figure of merit (i.e., the metric) is the probability that an item will perform its intended function(s) for a specified mission profile. In general, the reliability metric can be calculated through analyses using reliability demonstration and reliability prediction methodologies. Reliability analysis is very critical for understanding component failure mechanisms and in identifying reliability critical design and process drivers. The following sections discuss the PRA process and reliability engineering in detail and provide an application where reliability analysis and PRA were jointly used in a complementary manner to support a Space Shuttle flight risk assessment.
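As a toy illustration of how the two perspectives meet, the sketch below takes component mission reliabilities (the reliability-engineering metric) and propagates their complements through a very small fault-tree-style logic to a scenario probability of the kind a PRA aggregates. The structure, component names and numbers are invented and assume independent events.

```python
def p_and(*probs):
    """Probability that all independent events occur (AND gate)."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def p_or(*probs):
    """Probability that at least one independent event occurs (OR gate)."""
    p = 1.0
    for q in probs:
        p *= (1.0 - q)
    return 1.0 - p

# Reliability-engineering inputs: mission reliabilities of individual items (invented).
r_engine, r_controller, r_backup_controller = 0.999, 0.995, 0.98

# Failure probabilities feed the scenario logic: the undesired end state occurs if
# the engine fails, or if both the primary and the backup controller fail.
p_scenario = p_or(1.0 - r_engine,
                  p_and(1.0 - r_controller, 1.0 - r_backup_controller))
print(f"scenario probability: {p_scenario:.2e}")
```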
Staggs, Vincent S; Cramer, Emily
2016-08-01
Hospital performance reports often include rankings of unit pressure ulcer rates. Differentiating among units on the basis of quality requires reliable measurement. Our objectives were to describe and apply methods for assessing reliability of hospital-acquired pressure ulcer rates and evaluate a standard signal-noise reliability measure as an indicator of precision of differentiation among units. Quarterly pressure ulcer data from 8,199 critical care, step-down, medical, surgical, and medical-surgical nursing units from 1,299 US hospitals were analyzed. Using beta-binomial models, we estimated between-unit variability (signal) and within-unit variability (noise) in annual unit pressure ulcer rates. Signal-noise reliability was computed as the ratio of between-unit variability to the total of between- and within-unit variability. To assess precision of differentiation among units based on ranked pressure ulcer rates, we simulated data to estimate the probabilities of a unit's observed pressure ulcer rate rank in a given sample falling within five and ten percentiles of its true rank, and the probabilities of units with ulcer rates in the highest quartile and highest decile being identified as such. We assessed the signal-noise measure as an indicator of differentiation precision by computing its correlations with these probabilities. Pressure ulcer rates based on a single year of quarterly or weekly prevalence surveys were too susceptible to noise to allow for precise differentiation among units, and signal-noise reliability was a poor indicator of precision of differentiation. To ensure precise differentiation on the basis of true differences, alternative methods of assessing reliability should be applied to measures purported to differentiate among providers or units based on quality. © 2016 The Authors. Research in Nursing & Health published by Wiley Periodicals, Inc.
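The beta-binomial machinery used in the study is not reproduced here, but a moment-based sketch conveys the signal-noise idea: estimate within-unit (sampling) variance from each unit's rate and denominator, estimate between-unit variance as the excess of the observed variance over that sampling noise, and take their ratio. The data below are simulated placeholders, and the moment estimator is a simplification of the beta-binomial model described above.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated annual data: patients surveyed and pressure ulcers observed per unit.
n_patients = rng.integers(80, 400, size=300)
true_rates = rng.beta(2.0, 60.0, size=300)                 # unit-to-unit variation (signal)
ulcers = rng.binomial(n_patients, true_rates)              # sampling variation (noise)

rates = ulcers / n_patients
within = np.mean(rates * (1.0 - rates) / n_patients)       # average sampling (noise) variance
between = max(rates.var(ddof=1) - within, 0.0)             # excess variance across units (signal)
signal_noise_reliability = between / (between + within)
print(f"signal-noise reliability: {signal_noise_reliability:.2f}")
```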
Boileau, C; Martel-Pelletier, J; Abram, F; Raynauld, J-P; Troncy, E; D'Anjou, M-A; Moreau, M; Pelletier, J-P
2008-07-01
Osteoarthritis (OA) structural changes take place over decades in humans. MRI can provide precise and reliable information on the joint structure and changes over time. In this study, we investigated the reliability of quantitative MRI in assessing knee OA structural changes in the experimental anterior cruciate ligament (ACL) dog model of OA. OA was surgically induced by transection of the ACL of the right knee in five dogs. High resolution three dimensional MRI using a 1.5 T magnet was performed at baseline, 4, 8 and 26 weeks post surgery. Cartilage volume/thickness, cartilage defects, trochlear osteophyte formation and subchondral bone lesion (hypersignal) were assessed on MRI images. Animals were killed 26 weeks post surgery and macroscopic evaluation was performed. There was a progressive and significant increase over time in the loss of knee cartilage volume, the cartilage defect and subchondral bone hypersignal. The trochlear osteophyte size also progressed over time. The greatest cartilage loss at 26 weeks was found on the tibial plateaus and in the medial compartment. There was a highly significant correlation between total knee cartilage volume loss or defect and subchondral bone hypersignal, and also a good correlation between the macroscopic and the MRI findings. This study demonstrated that MRI is a useful technology to provide a non-invasive and reliable assessment of the joint structural changes during the development of OA in the ACL dog model. The combination of this OA model with MRI evaluation provides a promising tool for the evaluation of new disease-modifying osteoarthritis drugs (DMOADs).
Radiation Measurements Performed with Active Detectors Relevant for Human Space Exploration
Narici, Livio; Berger, Thomas; Matthiä, Daniel; Reitz, Günther
2015-01-01
A reliable radiation risk assessment in space is a mandatory step for the development of countermeasures and long-duration mission planning in human spaceflight. Research in radiobiology provides information about possible risks linked to radiation. In addition, for a meaningful risk evaluation, the radiation exposure has to be assessed to a sufficient level of accuracy. Consequently, both the radiation models predicting the risks and the measurements used to validate such models must have an equivalent precision. Corresponding measurements can be performed both with passive and active devices. The former is easier to handle, cheaper, lighter, and smaller but they measure neither the time dependence of the radiation environment nor some of the details useful for a comprehensive radiation risk assessment. Active detectors provide most of these details and have been extensively used in the International Space Station. To easily access such an amount of data, a single point access is becoming essential. This review presents an ongoing work on the development of a tool that allows obtaining information about all relevant measurements performed with active detectors providing reliable inputs for radiation model validation. PMID:26697408
Raymond, Mark R; Clauser, Brian E; Furman, Gail E
2010-10-01
The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary least squares regression to adjust ratings, and then used generalizability theory to evaluate the impact of these adjustments on score reliability and the overall standard error of measurement. In addition, conditional standard errors of measurement were computed for both observed and adjusted scores to determine whether the improvements in measurement precision were uniform across the score distribution. Results indicated that measurement was generally less precise for communication ratings toward the lower end of the score distribution, and the improvement in measurement precision afforded by statistical modeling varied slightly across the score distribution, with the greatest improvement occurring in the upper-middle range of the score scale. Possible reasons for these patterns in measurement precision are discussed, as are the limitations of the statistical models used for adjusting performance ratings.
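As an illustration of the kind of adjustment described above, the following minimal sketch regresses ratings on rater identity and removes the estimated rater effects; the column names (`rater`, `rating`) and the single-predictor model are assumptions for illustration, not the model or covariates used in the study.

```python
# Minimal sketch of OLS-based rating adjustment (illustrative only).
import pandas as pd
import statsmodels.formula.api as smf

def adjust_ratings(df: pd.DataFrame) -> pd.Series:
    """Subtract each rater's estimated stringency (deviation of the rater's
    OLS-predicted rating from the grand mean) from the observed ratings."""
    fit = smf.ols("rating ~ C(rater)", data=df).fit()
    rater_effect = fit.fittedvalues - df["rating"].mean()
    return df["rating"] - rater_effect
```

Generalizability analyses and conditional standard errors of measurement would then be computed on both the observed and the adjusted scores to compare their precision.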
Reliability of VEP Recordings Using Chronically Implanted Screw Electrodes in Mice
Makowiecki, Kalina; Garrett, Andrew; Clark, Vince; Graham, Stuart L.; Rodger, Jennifer
2015-01-01
Purpose: Visual evoked potentials (VEPs) are widely used to objectively assess visual system function in animal models of ophthalmological diseases. Although use of chronically implanted electrodes is common in longitudinal VEP studies using rodent models, the reliability of recordings over time has not been assessed. We compared VEPs 1 and 7 days after electrode implantation in the adult mouse. We also examined stimulus-independent changes over time by assessing electroencephalogram (EEG) power and approximate entropy of the EEG signal. Methods: Stainless steel screws (600-μm diameter) were implanted into the skull overlying the right visual cortex and the orbitofrontal cortex of adult mice (C57Bl/6J, n = 7). Animals were reanesthetized 1 and 7 days after implantation to record VEP responses (flashed gratings) and EEG activity. Brain sections were stained for glial activation (GFAP) and cell death (TUNEL). Results: Reliability analysis using intraclass correlation coefficients showed that VEP recordings had high reliability within the same session, regardless of time after electrode implantation; peak latencies and approximate entropy of the EEG did not change significantly with time. However, there was poorer reliability between recordings obtained on different days, and a significant decrease in VEP amplitudes and EEG power. This amplitude decrease could be normalized by scaling to EEG power (within-subjects). Furthermore, glial activation was present at both time points but there was no evidence of cell death. Conclusions: These results indicate that VEP responses can be reliably recorded even after a relatively short recovery period, but that response peak amplitudes decrease over time. Although scaling the VEP trace to EEG power normalized this decrease, our results highlight that time-dependent cortical excitability changes are an important consideration in longitudinal VEP studies. Translational Relevance: This study shows changes in VEP characteristics over time in chronically implanted mice. Thus, future preclinical longitudinal studies should consider time in addition to amplitude and latency when designing and interpreting research. PMID:25938003
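For readers unfamiliar with the statistic, a self-contained sketch of ICC(3,1) computed from two-way ANOVA mean squares is shown below; the example scores are synthetic, not the study's data.

```python
import numpy as np

def icc_3_1(x: np.ndarray) -> float:
    """ICC(3,1): two-way mixed model, single measurement, consistency,
    for an n_subjects x k_sessions score matrix (Shrout & Fleiss)."""
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()    # between-subject
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()    # between-session
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Synthetic example: 7 animals, 2 recording sessions.
scores = np.array([[10.2, 9.8], [12.1, 11.9], [8.7, 9.0], [11.4, 11.1],
                   [9.9, 10.3], [13.0, 12.6], [10.8, 10.5]])
print(round(icc_3_1(scores), 3))
```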
Tan, Christine L.; Hassali, Mohamed A.; Saleem, Fahad; Shafie, Asrul A.; Aljadhey, Hisham; Gan, Vincent B.
2015-01-01
Objective: (i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish the reliability and validity of the questionnaire instrument. Methods: Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administering the questionnaire instrument twice at an interval of one week. Internal consistency was measured by Cronbach's alpha, and construct validity between the two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410), was conducted to assess construct validity of the PVASQ. Results: The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach's alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly onto the corresponding factors. Only one item was eliminated. Conclusions: This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of the PVASQ is reliable and valid to predict Malaysian patients' intention to adopt pharmacy value-added services to collect partial medicine supply. PMID:26445622
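A minimal sketch of the internal-consistency statistic used above (Cronbach's alpha) follows; the response matrix is hypothetical and merely stands in for the PVASQ item scores.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an n_respondents x k_items score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical responses: 5 respondents, 4 Likert-type items.
responses = np.array([[5, 4, 5, 4], [3, 3, 2, 3], [4, 4, 4, 5],
                      [2, 1, 2, 2], [5, 5, 4, 5]])
print(round(cronbach_alpha(responses), 3))
```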
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeffrey C. Joe; Ronald L. Boring
Probabilistic Risk Assessment (PRA) and Human Reliability Assessment (HRA) are important technical contributors to the United States (U.S.) Nuclear Regulatory Commission’s (NRC) risk-informed and performance-based approach to regulating U.S. commercial nuclear activities. Furthermore, all currently operating commercial NPPs in the U.S. are required by federal regulation to be staffed with crews of operators. Yet, aspects of team performance are underspecified in most HRA methods that are widely used in the nuclear industry. There are a variety of "emergent" team cognition and teamwork errors (e.g., communication errors) that are 1) distinct from individual human errors, and 2) important to understand from a PRA perspective. The lack of robust models or quantification of team performance is an issue that affects the accuracy and validity of HRA methods and models, leading to significant uncertainty in estimating HEPs. This paper describes research that has the objective to model and quantify team dynamics and teamwork within NPP control room crews for risk-informed applications, thereby improving the technical basis of HRA, which improves the risk-informed approach the NRC uses to regulate the U.S. commercial nuclear industry.
Short assessment of the Big Five: robust across survey methods except telephone interviewing.
Lang, Frieder R; John, Dennis; Lüdtke, Oliver; Schupp, Jürgen; Wagner, Gert G
2011-06-01
We examined measurement invariance and age-related robustness of a short 15-item Big Five Inventory (BFI-S) of personality dimensions, which is well suited for applications in large-scale multidisciplinary surveys. The BFI-S was assessed in three different interviewing conditions: computer-assisted or paper-assisted face-to-face interviewing, computer-assisted telephone interviewing, and a self-administered questionnaire. Randomized probability samples from a large-scale German panel survey and a related probability telephone study were used in order to test method effects on self-report measures of personality characteristics across early, middle, and late adulthood. Exploratory structural equation modeling was used in order to test for measurement invariance of the five-factor model of personality trait domains across different assessment methods. For the short inventory, findings suggest strong robustness of self-report measures of personality dimensions among young and middle-aged adults. In old age, telephone interviewing was associated with greater distortions in reliable personality assessment. It is concluded that the greater mental workload of telephone interviewing limits the reliability of self-report personality assessment. Face-to-face surveys and self-administered questionnaire completion are clearly better suited than phone surveys when personality traits in age-heterogeneous samples are assessed.
[New questionnaire to assess self-efficacy toward physical activity in children].
Aedo, Angeles; Avila, Héctor
2009-10-01
To design a questionnaire for assessment of self-efficacy toward physical activity in school children, as well as to measure its construct validity, test-retest reliability, and internal consistency. A four-stage multimethod approach was used: (1) bibliographic research followed by an exploratory study and the formulation of questions and responses based on a dichotomous scale of 14 items; (2) validation of the content by a panel of experts; (3) application of the preliminary version of the questionnaire to a sample of 900 school-aged children in Mexico City; and (4) determination of the construct validity, test-retest reliability, and internal consistency (Cronbach's alpha). Three factors were identified that explain 64.15% of the variance: the search for positive alternatives to physical activity, ability to deal with possible barriers to exercising, and expectations of skill or competence. The model was validated using a goodness-of-fit test, and the result (65% less than 0.05) indicated that the estimated factor model fit the data. Cronbach's alpha for internal consistency was 0.733; test-retest reliability was 0.867. The scale designed has adequate reliability and validity. These results are a good indicator of self-efficacy toward physical activity in school children, which is important when developing programs intended to promote such behavior in this age group.
Parent-reported social support for child's fruit and vegetable intake: validity of measures.
Dave, Jayna M; Evans, Alexandra E; Condrasky, Marge D; Williams, Joel E
2012-01-01
To develop and validate measures of parental social support to increase their child's fruit and vegetable (FV) consumption. Cross-sectional study design. School and home. Two hundred three parents with at least 1 elementary school-aged child. Parents completed a questionnaire that included instrumental social support scale (ISSPS), emotional social support scale (ESSPS), household FV availability and accessibility index, and demographics. Exploratory factor analysis with promax rotation was conducted to obtain the psychometric properties of ISSPS and ESSPS. Internal consistency and test-retest reliabilities were also assessed. Factor analysis indicated a 4-factor model for ESSPS: positive encouragement, negative role modeling, discouragement, and an item cluster called reinforcement. Psychometric properties indicated that ISSPS performed best as independent single scales with α = .87. Internal consistency reliabilities were acceptable, and test-retest reliabilities ranged from low to acceptable. Correlations between scales, subscales, and item clusters were significant (P < .05). In addition, ISSPS and the positive encouragement subscale were significantly correlated with household FV availability. The ISSPS and ESSPS subscales demonstrated good internal consistency reliability and are suitable for impact assessment of an intervention designed to target parents to help their children eat more fruit and vegetables. Copyright © 2012 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
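The exploratory factor analysis with promax rotation described above could be reproduced along the following lines; this sketch assumes the third-party factor_analyzer package, and the file name and four-factor solution are placeholders mirroring the ESSPS result, not the authors' code or data.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer  # assumed third-party package

# Hypothetical item-level data: one column per ESSPS questionnaire item.
essps_items = pd.read_csv("essps_items.csv")

fa = FactorAnalyzer(n_factors=4, rotation="promax")
fa.fit(essps_items)

# Pattern loadings; items are usually assigned to the factor with the
# highest absolute loading when interpreting the solution.
loadings = pd.DataFrame(fa.loadings_, index=essps_items.columns)
print(loadings.round(2))
```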
Reliability assessments in qualitative health promotion research.
Cook, Kay E
2012-03-01
This article contributes to the debate about the use of reliability assessments in qualitative research in general, and health promotion research in particular. In this article, I examine the use of reliability assessments in qualitative health promotion research in response to health promotion researchers' commonly held misconception that reliability assessments improve the rigor of qualitative research. All qualitative articles published in the journal Health Promotion International from 2003 to 2009 employing reliability assessments were examined. In total, 31.3% (20/64) of the articles employed some form of reliability assessment. The use of reliability assessments increased over the study period, ranging from <20% in 2003/2004 to 50% and above in 2008/2009, while at the same time the total number of qualitative articles decreased. The articles were then classified into four types of reliability assessments: the verification of thematic codes, the use of inter-rater reliability statistics, congruence in team coding and congruence in coding across sites. The merits of each type were discussed, with the subsequent discussion focusing on the deductive nature of reliable thematic coding, the limited depth of immediately verifiable data and the usefulness of such studies to health promotion and the advancement of the qualitative paradigm.
MEMS reliability: The challenge and the promise
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, W.M.; Tanner, D.M.; Miller, S.L.
1998-05-01
MicroElectroMechanical Systems (MEMS) that think, sense, act and communicate will open up a broad new array of cost effective solutions only if they prove to be sufficiently reliable. A valid reliability assessment of MEMS has three prerequisites: (1) statistical significance; (2) a technique for accelerating fundamental failure mechanisms, and (3) valid physical models to allow prediction of failures during actual use. These already exist for the microelectronics portion of such integrated systems. The challenge lies in the less well understood micromachine portions and their synergistic effects with microelectronics. This paper presents a methodology addressing these prerequisites and a description of the underlying physics of reliability for micromachines.
Katsari, Vasiliki; Niakas, Dimitris
2017-01-01
Introduction: The use of generic medicines is a cost-effective policy, often dictated by fiscal restraints. To our knowledge, no fully validated tool exploring students' knowledge and attitudes towards generic medicines exists. The aim of our study was to develop and validate a questionnaire exploring the knowledge and attitudes of M.Sc. in Health Care Management students and recent alumni towards generic drugs in Greece. Materials and methods: The development of the questionnaire was a result of literature review and pilot-testing of its preliminary versions with researchers and students. The final version of the questionnaire contains 18 items measuring the respondents' knowledge and attitude towards generic medicines on a 5-point Likert scale. Given the ordinal nature of the data, ordinal alpha and polychoric correlations were computed. The sample was randomly split into two halves. Exploratory factor analysis, performed in the first sample, was used for the creation of multi-item scales. Confirmatory factor analysis and Generalized Linear Latent and Mixed Model analysis (GLLAMM) with the use of the rating scale model were used in the second sample to assess goodness of fit. An assessment of internal consistency reliability, test-retest reliability, and construct validity was also performed. Results: Among 1402 persons contacted, 986 persons completed our questionnaire (response rate = 70.3%). Overall Cronbach's alpha was 0.871. The conjoint use of exploratory and confirmatory factor analysis resulted in a six-scale model, which seemed to fit the data well. Five of the six scales, namely trust, drug quality, state audit, fiscal impact and drug substitution, were found to be valid and reliable, while the knowledge scale suffered only from low inter-scale correlations and a ceiling effect. However, the subsequent confirmatory factor and GLLAMM analyses indicated a good fit of the model to the data. Conclusions: The ATTOGEN instrument proved to be a reliable and valid tool, suitable for assessing students' knowledge and attitudes towards generic medicines. PMID:29186163
77 FR 26954 - 1-Naphthaleneacetic acid; Pesticide Tolerances
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-08
... for which there is reliable information.'' This includes exposure through drinking water and in... exposure from drinking water. The Agency used screening level water exposure models in the dietary exposure analysis and risk assessment for NAA in drinking water. These simulation models take into account data on...
NASA Technical Reports Server (NTRS)
Dennehy, Cornelius J.
2010-01-01
This final report summarizes the results of a comparative assessment of the fault tolerance and reliability of different Guidance, Navigation and Control (GN&C) architectural approaches. This study was proactively performed by a combined Massachusetts Institute of Technology (MIT) and Draper Laboratory team as a GN&C "Discipline-Advancing" activity sponsored by the NASA Engineering and Safety Center (NESC). This systematic comparative assessment of GN&C system architectural approaches was undertaken as a fundamental step towards understanding the opportunities for, and limitations of, architecting highly reliable and fault tolerant GN&C systems composed of common avionic components. The primary goal of this study was to obtain architectural 'rules of thumb' that could positively influence future designs in the direction of an optimized (i.e., most reliable and cost-efficient) GN&C system. A secondary goal was to demonstrate the application and the utility of a systematic modeling approach that maps the entire possible architecture solution space.
Seng, Elizabeth K; Lovejoy, Travis I
2013-12-01
This study psychometrically evaluates the Motivational Interviewing Treatment Integrity Code (MITI) to assess fidelity to motivational interviewing to reduce sexual risk behaviors in people living with HIV/AIDS. Seventy-four sessions from a pilot randomized controlled trial of motivational interviewing to reduce sexual risk behaviors in people living with HIV were coded with the MITI. Participants reported sexual behavior at baseline, 3 months, and 6 months. Regarding reliability, excellent inter-rater reliability was achieved for measures of behavior frequency across the 12 sessions coded by both coders; global scales demonstrated poor intraclass correlations, but adequate percent agreement. Regarding validity, principal components analysis indicated that a two-factor model accounted for an adequate amount of variance in the data. These factors were associated with decreases in sexual risk behaviors after treatment. The MITI is a reliable and valid measurement of treatment fidelity for motivational interviewing targeting sexual risk behaviors in people living with HIV/AIDS.
Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam
2017-01-01
Aims and Objective: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images compared with plaster models for mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were retrieved from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data, and P < 0.05 was considered statistically significant. Results: Statistically significant differences were obtained on comparison of data between CBCT-derived images and plaster models; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. Conclusion: CBCT-derived images were less reliable than data obtained directly from the plaster model for mixed dentition analysis. PMID:28852639
NASA Astrophysics Data System (ADS)
Berthet, Lionel; Marty, Renaud; Bourgin, François; Viatgé, Julie; Piotte, Olivier; Perrin, Charles
2017-04-01
An increasing number of operational flood forecasting centres assess the predictive uncertainty associated with their forecasts and communicate it to the end users. This information can match the end-users' needs (i.e. prove to be useful for efficient crisis management) only if it is reliable: reliability is therefore a key quality for operational flood forecasts. In 2015, the French flood forecasting national and regional services (Vigicrues network; www.vigicrues.gouv.fr) implemented a framework to compute quantitative discharge and water level forecasts and to assess the predictive uncertainty. Among the possible technical options to achieve this goal, a statistical analysis of past forecasting errors of deterministic models has been selected (QUOIQUE method, Bourgin, 2014). It is a data-based and non-parametric approach based on as few assumptions as possible about the mathematical structure of the forecasting error. In particular, a very simple assumption is made regarding the predictive uncertainty distributions for large events outside the range of the calibration data: the multiplicative error distribution is assumed to be constant, whatever the magnitude of the flood. Indeed, the predictive distributions may not be reliable in extrapolation. However, estimating the predictive uncertainty for these rare events is crucial when major floods are of concern. In order to improve forecast reliability for major floods, an attempt is made to combine the operational strength of the empirical statistical analysis with a simple error model. Since the heteroscedasticity of forecast errors can considerably weaken the predictive reliability for large floods, this error modelling is based on the log-sinh transformation, which has been shown to significantly reduce the heteroscedasticity of the transformed error in a simulation context, even for flood peaks (Wang et al., 2012). Exploratory tests on some operational forecasts issued during the recent floods experienced in France (major spring floods in June 2016 on the Loire river tributaries and flash floods in fall 2016) will be shown and discussed. References: Bourgin, F. (2014). How to assess the predictive uncertainty in hydrological modelling? An exploratory work on a large sample of watersheds, AgroParisTech. Wang, Q. J., Shrestha, D. L., Robertson, D. E. and Pokhrel, P. (2012). A log-sinh transformation for data normalization and variance stabilization. Water Resources Research, W05514, doi:10.1029/2011WR010973.
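The log-sinh transformation of Wang et al. (2012) referred to above is simple to state; a short sketch with placeholder parameter values (not those of the Vigicrues application) is given below.

```python
import numpy as np

def log_sinh(y, a, b):
    """Log-sinh transform (Wang et al., 2012): z = (1/b) * ln(sinh(a + b*y))."""
    return np.log(np.sinh(a + b * y)) / b

def log_sinh_inverse(z, a, b):
    """Back-transform: y = (arcsinh(exp(b*z)) - a) / b."""
    return (np.arcsinh(np.exp(b * z)) - a) / b

# Placeholder parameters a, b; in practice they are fitted to the forecast errors.
y = np.array([50.0, 200.0, 800.0])            # e.g. discharges in m3/s
z = log_sinh(y, a=0.01, b=0.001)
print(np.allclose(log_sinh_inverse(z, a=0.01, b=0.001), y))  # True
```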
Validation of a combined health literacy and numeracy instrument for patients with type 2 diabetes.
Luo, Huabin; Patil, Shivajirao P; Wu, Qiang; Bell, Ronny A; Cummings, Doyle M; Adams, Alyssa D; Hambidge, Bertha; Craven, Kay; Gao, Fei
2018-05-20
This study aimed to validate a new consolidated measure of health literacy and numeracy (health literacy scale [HLS] plus the subjective numeracy scale [SNS]) in patients with type 2 diabetes (T2DM). A convenience sample (N = 102) of patients with T2DM was recruited from an academic family medicine center in the southeastern US between September and December 2017. Participants completed a questionnaire that included the composite HLS/SNS (22 questions) and a commonly used objective measure of health literacy, the S-TOFHLA (40 questions). Internal reliability of the HLS/SNS was assessed using Cronbach's alpha. Criterion and construct validity were assessed against the S-TOFHLA. The composite HLS/SNS had good internal reliability (Cronbach's alpha = 0.83). A confirmatory factor analysis revealed there were four factors in the new instrument. Model fit indices showed good model-data fit (RMSEA = 0.08). The Spearman's rank order correlation coefficient between the HLS/SNS and the S-TOFHLA was 0.45 (p < 0.01). Our study suggests that the composite HLS/SNS is a reliable, valid instrument. Published by Elsevier B.V.
The Brazilian version of the effort-reward imbalance questionnaire to assess job stress.
Chor, Dóra; Werneck, Guilherme Loureiro; Faerstein, Eduardo; Alves, Márcia Guimarães de Mello; Rotenberg, Lúcia
2008-01-01
The effort-reward imbalance (ERI) model has been used to assess the health impact of job stress. We aimed at describing the cross-cultural adaptation of the ERI questionnaire into Portuguese and some psychometric properties, in particular internal consistency, test-retest reliability, and factorial structure. We developed a Brazilian version of the ERI using a back-translation method and tested its reliability. The test-retest reliability study was conducted with 111 health workers and University staff. The current analyses are based on 89 participants, after exclusion of those with missing data. Reproducibility (intraclass correlation coefficients) for the "effort", "reward", and "overcommitment" dimensions of the scale was estimated at 0.76, 0.86, and 0.78, respectively. Internal consistency (Cronbach's alpha) estimates for these same dimensions were 0.68, 0.78, and 0.78, respectively. The exploratory factorial structure was fairly consistent with the model's theoretical components. We conclude that the results of this study represent the first evidence in favor of the application of the Brazilian Portuguese version of the ERI scale in health research in populations with similar socioeconomic characteristics.
The Revised Neurobehavioral Severity Scale (NSS-R) for Rodents.
Yarnell, Angela M; Barry, Erin S; Mountney, Andrea; Shear, Deborah; Tortella, Frank; Grunberg, Neil E
2016-04-08
Motor and sensory deficits are common following traumatic brain injury (TBI). Although rodent models provide valuable insight into the biological and functional outcomes of TBI, the success of translational research is critically dependent upon proper selection of sensitive, reliable, and reproducible assessments. Published literature includes various observational scales designed to evaluate post-injury functionality; however, the heterogeneity in TBI location, severity, and symptomology can complicate behavioral assessments. The importance of choosing behavioral outcomes that can be reliably and objectively quantified in an efficient manner is becoming increasingly important. The Revised Neurobehavioral Severity Scale (NSS-R) is a continuous series of specific, sensitive, and standardized observational tests that evaluate balance, motor coordination, and sensorimotor reflexes in rodents. The tasks follow a specific order designed to minimize interference: balance, landing, tail raise, dragging, righting reflex, ear reflex, eye reflex, sound reflex, tail pinch, and hindpaw pinch. The NSS-R has proven to be a reliable method differentiating brain-injured rodents from non-brain-injured rodents across many brain injury models. Copyright © 2016 John Wiley & Sons, Inc.
Sommer, Johanna; Lanier, Cédric; Perron, Noelle Junod; Nendaz, Mathieu; Clavet, Diane; Audétat, Marie-Claude
2016-04-01
The aim of this study was to develop a descriptive tool for peer review of clinical teaching skills. Two analogies framed our research: (1) between the patient-centered and the learner-centered approach; (2) between the structures of clinical encounters (Calgary-Cambridge communication model) and teaching sessions. During the course of one year, each step of the action research was carried out in collaboration with twelve clinical teachers from an outpatient general internal medicine clinic and with three experts in medical education. The content validation consisted of a literature review, expert opinion and the participatory research process. Interrater reliability was evaluated by three clinical teachers coding thirty audiotaped standardized learner-teacher interactions. This tool contains sixteen items covering the process and content of clinical supervisions. Descriptors define the expected teaching behaviors for three levels of competence. Interrater reliability was significant for eleven items (Kendall's coefficient p<0.05). This peer assessment tool has high reliability and can be used to facilitate the acquisition of teaching skills. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Ebara, Takeshi; Azuma, Ryohei; Shoji, Naoto; Matsukawa, Tsuyoshi; Yamada, Yasuyuki; Akiyama, Tomohiro; Kurihara, Takahiro; Yamada, Shota
2017-01-01
Objectives: Objective measurements using built-in smartphone sensors that can measure physical activity/inactivity in daily working life have the potential to provide a new approach to assessing workers' health effects. The aim of this study was to elucidate the characteristics and reliability of built-in step counting sensors on smartphones for development of an easy-to-use objective measurement tool that can be applied in ergonomics or epidemiological research. Methods: To evaluate the reliability of step counting sensors embedded in seven major smartphone models, the 6-minute walk test was conducted and the following analyses of sensor precision and accuracy were performed: 1) relationship between actual step count and step count detected by sensors, 2) reliability between smartphones of the same model, and 3) false detection rates when sitting during office work, while riding the subway, and driving. Results: On five of the seven models, the intraclass correlation coefficient (ICC(3,1)) showed high reliability with a range of 0.956-0.993. The other two models, however, had ranges of 0.443-0.504, and the relative error ratios of the sensor-detected step count to the actual step count were ±48.7%-49.4%. The level of agreement between units of the same model was ICC(3,1): 0.992-0.998. The false detection rates differed between the sitting conditions. Conclusions: These results suggest the need for appropriate adjustment of step counts measured by sensors, through means such as correction or calibration with a predictive model formula, in order to obtain the highly reliable measurement results that are sought in scientific investigation. PMID:28835575
NASA Astrophysics Data System (ADS)
Martowicz, Adam; Uhl, Tadeusz
2012-10-01
The paper discusses the applicability of a reliability- and performance-based multi-criteria robust design optimization technique for micro-electromechanical systems, considering their technological uncertainties. Micro-devices are now widely applied, especially in the automotive industry, taking advantage of combining the mechanical structure and the electronic control circuit on one board. Their frequent use motivates the elaboration of virtual prototyping tools that can be applied in design optimization with the introduction of technological uncertainties and reliability. The authors present a procedure for the optimization of micro-devices, which is based on the theory of reliability-based robust design optimization. This takes into consideration the performance of a micro-device and its reliability assessed by means of uncertainty analysis. The procedure assumes that, for each checked design configuration, the assessment of uncertainty propagation is performed with the meta-modeling technique. The described procedure is illustrated with an example of the optimization carried out for a finite element model of a micro-mirror. The multi-physics approach allowed the introduction of several physical phenomena to correctly model the electrostatic actuation and the squeezing effect present between electrodes. The optimization was preceded by sensitivity analysis to establish the design and uncertain domains. The genetic algorithms fulfilled the defined optimization task effectively. The best discovered individuals are characterized by a minimized value of the multi-criteria objective function, while simultaneously satisfying the constraint on material strength. The restriction of the maximum equivalent stresses was introduced with a conditionally formulated objective function with a penalty component. The results were successfully verified with a global uniform search through the input design domain.
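The conditionally formulated objective with a penalty component mentioned above can be sketched as follows; the surrogate functions stand in for metamodel evaluations of the finite element micro-mirror model and are hypothetical, as are the weights.

```python
# Illustrative sketch only; not the authors' objective function.
def objective(design, performance_mean, performance_std, max_equivalent_stress,
              stress_limit, w_mean=1.0, w_std=1.0, penalty_weight=1e3):
    """Robust multi-criteria objective: combine expected performance and its
    spread under uncertainty, and penalize designs whose maximum equivalent
    stress exceeds the material strength limit."""
    f = w_mean * performance_mean(design) + w_std * performance_std(design)
    stress = max_equivalent_stress(design)
    if stress > stress_limit:
        f += penalty_weight * (stress - stress_limit) ** 2  # penalty component
    return f

# Toy demo with made-up surrogates of a single design variable.
value = objective(design=2.0,
                  performance_mean=lambda x: (x - 3.0) ** 2,
                  performance_std=lambda x: 0.1 * x,
                  max_equivalent_stress=lambda x: 40.0 * x,
                  stress_limit=100.0)
print(value)  # ~1.2: stress constraint satisfied, so no penalty is added
```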
A flexible count data regression model for risk analysis.
Guikema, Seth D; Coffelt, Jeremy P
2008-02-01
In many cases, risk and reliability analyses involve estimating the probabilities of discrete events such as hardware failures and occurrences of disease or death. There is often additional information in the form of explanatory variables that can be used to help estimate the likelihood of different numbers of events in the future through the use of an appropriate regression model, such as a generalized linear model. However, existing generalized linear models (GLM) are limited in their ability to handle the types of variance structures often encountered in using count data in risk and reliability analysis. In particular, standard models cannot handle both underdispersed data (variance less than the mean) and overdispersed data (variance greater than the mean) in a single coherent modeling framework. This article presents a new GLM based on a reformulation of the Conway-Maxwell Poisson (COM) distribution that is useful for both underdispersed and overdispersed count data and demonstrates this model by applying it to the assessment of electric power system reliability. The results show that the proposed COM GLM can fit data as well as the commonly used existing models for overdispersed data sets while outperforming these commonly used models for underdispersed data sets.
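The Conway-Maxwell Poisson distribution underlying the proposed GLM has the probability mass function P(Y=y) = λ^y / ((y!)^ν Z(λ, ν)); a small sketch of this pmf (not of the authors' GLM reformulation) is shown below.

```python
import math

def com_poisson_pmf(y: int, lam: float, nu: float, terms: int = 200) -> float:
    """COM-Poisson pmf, computed in log space with a truncated normalizing
    constant Z(lam, nu).  nu < 1 allows overdispersion, nu > 1 underdispersion,
    and nu = 1 recovers the ordinary Poisson distribution."""
    log_term = lambda j: j * math.log(lam) - nu * math.lgamma(j + 1)
    log_z = math.log(sum(math.exp(log_term(j)) for j in range(terms)))
    return math.exp(log_term(y) - log_z)

# Sanity check: nu = 1 reproduces the Poisson pmf.
from scipy.stats import poisson
print(abs(com_poisson_pmf(3, lam=2.5, nu=1.0) - poisson.pmf(3, 2.5)) < 1e-9)
```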
Malinowsky, Camilla; Kassberg, Ann-Charlotte; Larsson-Lund, Maria; Kottorp, Anders
2016-01-01
To evaluate the test-retest reliability of the Management of Everyday Technology Assessment (META) in a sample of people with acquired brain injury (ABI). The META was administered twice within a two-week period to 25 people with ABI. A Rasch measurement model was used to convert the META ordinal raw scores into equal-interval linear measures of each participant's ability to manage everyday technology (ET). Test-retest reliability, i.e. the stability of the person ability measures in the META, was examined by a standardized difference Z-test and an intraclass correlation analysis (ICC 1). The results showed that the paired person ability measures generated from the META were stable over the test-retest period for 22 of the 25 subjects. The ICC 1 correlation was 0.63, which indicates good overall reliability. The META demonstrated acceptable test-retest reliability in a sample of people with ABI. The results illustrate the importance of using sufficiently challenging ETs (relative to a person's abilities) to generate stable META measurements over time. Implications for Rehabilitation: The findings add evidence regarding the test-retest reliability of the person ability measures generated from the observation-based assessment META in a sample of people with ABI. The META might support professionals in the evaluation of interventions that are designed to improve clients' performance of activities, including the ability to manage ET.
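The standardized difference Z-test used to check the stability of the Rasch person measures can be written compactly; the sketch below uses a dichotomous Rasch probability for illustration (the META itself is scored polytomously) and made-up logit values.

```python
import math

def rasch_prob(theta: float, delta: float) -> float:
    """Dichotomous Rasch model: P(success) for person ability theta and item
    difficulty delta, both in logits (simplified illustration)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def standardized_difference(theta_1, se_1, theta_2, se_2) -> float:
    """Z statistic for the stability of a person measure across two occasions."""
    return (theta_1 - theta_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)

# Made-up example: 0.85 +/- 0.40 logits at test, 0.60 +/- 0.42 at retest.
print(round(rasch_prob(0.85, 0.0), 2))              # success probability vs. an average item
z = standardized_difference(0.85, 0.40, 0.60, 0.42)
print(abs(z) < 1.96)                                # True: no significant change
```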
Constructing the "Best" Reliability Data for the Job
NASA Technical Reports Server (NTRS)
DeMott, D. L.; Kleinhammer, R. K.
2014-01-01
Modern business and technical decisions are based on the results of analyses. When considering assessments using "reliability data", the concern is how long a system will continue to operate as designed. Generally, the results are only as good as the data used. Ideally, a large set of pass/fail tests or observations to estimate the probability of failure of the item under test would produce the best data. However, this is a costly endeavor if used for every analysis and design. Developing specific data is costly and time consuming. Instead, analysts rely on available data to assess reliability. Finding data relevant to the specific use and environment for any project is difficult, if not impossible. Instead, we attempt to develop the "best" or composite analog data to support our assessments. One method used incorporates processes for reviewing existing data sources and identifying the available information based on similar equipment, then using that generic data to derive an analog composite. Dissimilarities in equipment descriptions, environment of intended use, quality and even failure modes impact the "best" data incorporated in an analog composite. Once developed, this composite analog data provides a "better" representation of the reliability of the equipment or component and can be used to support early risk or reliability trade studies, or analytical models to establish the predicted reliability data points. Data that is more representative of reality and more project specific would provide a more accurate analysis, and hopefully a better final decision.
Alsalaheen, Bara; Haines, Jamie; Yorke, Amy; Broglio, Steven P
2015-12-01
To examine the reliability, convergent, and discriminant validity of the limits of stability (LOS) test to assess dynamic postural stability in adolescents using a portable forceplate system. Cross-sectional reliability observational study. School setting. Adolescents (N=36) completed all measures during the first session. To examine the reliability of the LOS test, a subset of 15 participants repeated the LOS test after 1 week. Not applicable. Outcome measurements included the LOS test, Balance Error Scoring System, Instrumented Balance Error Scoring System, and Modified Clinical Test for Sensory Interaction on Balance. A significant relation was observed among LOS composite scores (r=.36-.87, P<.05). However, no relation was observed between LOS and static balance outcome measurements. The reliability of the LOS composite scores ranged from moderate to good (intraclass correlation coefficient model 2,1=.73-.96). The results suggest that the LOS composite scores provide unique information about dynamic postural stability, and the LOS test completed at 100% of the theoretical limit appeared to be a reliable test of dynamic postural stability in adolescents. Clinicians should use dynamic balance measurement as part of their balance assessment and should not use static balance testing (eg, Balance Error Scoring System) to make inferences about dynamic balance, especially when balance assessment is used to determine rehabilitation outcomes, or when making return to play decisions after injury. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Constructing the Best Reliability Data for the Job
NASA Technical Reports Server (NTRS)
Kleinhammer, R. K.; Kahn, J. C.
2014-01-01
Modern business and technical decisions are based on the results of analyses. When considering assessments using "reliability data", the concern is how long a system will continue to operate as designed. Generally, the results are only as good as the data used. Ideally, a large set of pass/fail tests or observations to estimate the probability of failure of the item under test would produce the best data. However, this is a costly endeavor if used for every analysis and design. Developing specific data is costly and time consuming. Instead, analysts rely on available data to assess reliability. Finding data relevant to the specific use and environment for any project is difficult, if not impossible. Instead, we attempt to develop the "best" or composite analog data to support our assessments. One method used incorporates processes for reviewing existing data sources and identifying the available information based on similar equipment, then using that generic data to derive an analog composite. Dissimilarities in equipment descriptions, environment of intended use, quality and even failure modes impact the "best" data incorporated in an analog composite. Once developed, this composite analog data provides a "better" representation of the reliability of the equipment or component and can be used to support early risk or reliability trade studies, or analytical models to establish the predicted reliability data points. Data that is more representative of reality and more project specific would provide a more accurate analysis, and hopefully a better final decision.
An overview of reliability assessment and control for design of civil engineering structures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Field, R.V. Jr.; Grigoriadis, K.M.; Bergman, L.A.
1998-06-01
Random variations, whether they occur in the input signal or the system parameters, are phenomena that occur in nearly all engineering systems of interest. As a result, nondeterministic modeling techniques must somehow account for these variations to ensure validity of the solution. As might be expected, this is a difficult proposition and the focus of many current research efforts. Controlling seismically excited structures is one pertinent application of nondeterministic analysis and is the subject of the work presented herein. This overview paper is organized into two sections. First, techniques to assess system reliability, in a context familiar to civil engineers, are discussed. Second, and as a consequence of the first, active control methods that ensure good performance in this random environment are presented. It is the hope of the authors that these discussions will ignite further interest in the area of reliability assessment and design of controlled civil engineering structures.
Validation of the Practice Environment Scale to the Brazilian culture.
Gasparino, Renata C; Guirardello, Edinêis de B
2017-07-01
To validate the Brazilian version of the Practice Environment Scale. The Practice Environment Scale is a tool that evaluates the presence of characteristics that are favourable for professional nursing practice because a better work environment contributes to positive results for patients, professionals and institutions. Methodological study including 209 nurses. Validity was assessed via a confirmatory factor analysis using structural equation modelling, in which the correlations between the instrument and the following variables were tested: burnout, job satisfaction, safety climate, perception of quality of care and intention to leave the job. Subgroups were compared and the reliability was assessed using Cronbach's alpha and the composite reliability. Factor analysis resulted in exclusion of seven items. Significant correlations were obtained between the subscales and all variables in the study. The reliability was considered acceptable. The Brazilian version of the Practice Environment Scale is a valid and reliable tool used to assess the characteristics that promote professional nursing practice. Use of this tool in Brazilian culture should allow managers to implement changes that contribute to the achievement of better results, in addition to identifying and comparing the environments of health institutions. © 2017 John Wiley & Sons Ltd.
Parts and Components Reliability Assessment: A Cost Effective Approach
NASA Technical Reports Server (NTRS)
Lee, Lydia
2009-01-01
System reliability assessment is a methodology which incorporates reliability analyses performed at parts and components level such as Reliability Prediction, Failure Modes and Effects Analysis (FMEA) and Fault Tree Analysis (FTA) to assess risks, perform design tradeoffs, and therefore, to ensure effective productivity and/or mission success. The system reliability is used to optimize the product design to accommodate today's mandated budget, manpower, and schedule constraints. Standard-based reliability assessment is an effective approach consisting of reliability predictions together with other reliability analyses for electronic, electrical, and electro-mechanical (EEE) complex parts and components of large systems based on failure rate estimates published by the United States (U.S.) military or commercial standards and handbooks. Many of these standards are globally accepted and recognized. The reliability assessment is especially useful during the initial stages when the system design is still in development and hard failure data is not yet available or manufacturers are not contractually obliged by their customers to publish the reliability estimates/predictions for their parts and components. This paper presents a methodology to assess system reliability using parts and components reliability estimates to ensure effective productivity and/or mission success in an efficient manner, at low cost, and on a tight schedule.
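As a concrete illustration of how part-level failure-rate estimates roll up to a system figure, the sketch below applies the usual exponential-reliability assumption with handbook-style failure rates; the rates and mission time are placeholders, not values from any standard.

```python
import math

def part_reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Constant-failure-rate (exponential) part reliability R = exp(-lambda*t)."""
    return math.exp(-failure_rate_per_hour * hours)

def series(reliabilities):
    """Series system: every part must work, R_sys = product(R_i)."""
    r = 1.0
    for ri in reliabilities:
        r *= ri
    return r

def parallel(reliabilities):
    """Redundant (parallel) system: R_sys = 1 - product(1 - R_i)."""
    q = 1.0
    for ri in reliabilities:
        q *= (1.0 - ri)
    return 1.0 - q

# Placeholder failure rates (per hour) for three EEE parts, 1000-hour mission.
parts = [part_reliability(lam, 1000.0) for lam in (2e-6, 5e-6, 1e-6)]
print(round(series(parts), 6), round(parallel(parts), 9))
```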
Assessment of wheelchair driving performance in a virtual reality-based simulator
Mahajan, Harshal P.; Dicianno, Brad E.; Cooper, Rory A.; Ding, Dan
2013-01-01
Objective: To develop a virtual reality (VR)-based simulator that can assist clinicians in performing standardized wheelchair driving assessments. Design: A completely within-subjects repeated measures design. Methods: Participants drove their wheelchairs along a virtual driving circuit modeled after the Power Mobility Road Test (PMRT) and in a hallway with decreasing width. The virtual simulator was displayed on a computer screen and on VR screens, and participants interacted with it using a set of instrumented rollers and a wheelchair joystick. Driving performances of participants were estimated and compared using quantitative metrics from the simulator. Qualitative ratings from two experienced clinicians were used to estimate intra- and inter-rater reliability. Results: Ten regular wheelchair users (seven men, three women; mean age ± SD, 39.5 ± 15.39 years) participated. The virtual PMRT scores from the two clinicians show high inter-rater reliability (78–90%) and high intra-rater reliability (71–90%) for all test conditions. More research is required to explore user preferences and the effectiveness of the two control methods (rollers and mathematical model) and the display screens. Conclusions: The virtual driving simulator seems to be a promising tool for wheelchair driving assessment that clinicians can use to supplement their real-world evaluations. PMID:23820148
Work-related stress assessed by a text message single-item stress question.
Arapovic-Johansson, B; Wåhlin, C; Kwak, L; Björklund, C; Jensen, I
2017-12-02
Given the prevalence of work stress-related ill-health in the Western world, it is important to find cost-effective, easy-to-use and valid measures which can be used both in research and in practice. To examine the validity and reliability of the single-item stress question (SISQ), distributed weekly by short message service (SMS) and used for measurement of work-related stress. The convergent validity was assessed through associations between the SISQ and subscales of the Job Demand-Control-Support model, the Effort-Reward Imbalance model and scales measuring depression, exhaustion and sleep. The predictive validity was assessed using SISQ data collected through SMS. The reliability was analysed by the test-retest procedure. Correlations between the SISQ and all the subscales except for job strain and esteem reward were significant, ranging from -0.186 to 0.627. The SISQ could also predict sick leave, depression and exhaustion at 12-month follow-up. The reliability analysis revealed satisfactory stability, with a weighted kappa between 0.804 and 0.868. The SISQ, administered through SMS, can be used for the screening of stress levels in a working population. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com
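For reference, a weighted kappa for test-retest agreement on an ordinal single item can be computed as sketched below; the ratings are invented, and the quadratic weighting is an assumption since the study does not state which weighting scheme was used.

```python
from sklearn.metrics import cohen_kappa_score

# Invented 5-point SISQ responses at test and one-week retest.
test   = [1, 2, 3, 3, 4, 2, 5, 3, 2, 4]
retest = [1, 2, 3, 4, 4, 2, 5, 3, 3, 4]
print(round(cohen_kappa_score(test, retest, weights="quadratic"), 3))
```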
A measure of early physical functioning (EPF) post-stroke.
Finch, Lois E; Higgins, Johanne; Wood-Dauphinee, Sharon; Mayo, Nancy E
2008-07-01
To develop a comprehensive measure of Early Physical Functioning (EPF) post-stroke quantified through Rasch analysis and conceptualized using the International Classification of Functioning, Disability and Health (ICF). An observational cohort study. A cohort of 262 subjects (mean age 71.6 (standard deviation 12.5) years) hospitalized post-acute stroke. Functional assessments were made within 3 days of stroke with items from valid and reliable indices commonly utilized to evaluate stroke survivors. Information on important variables was also collected. Principal component and Rasch analyses confirmed the factor structure and dimensionality of the measure. Rasch analysis combined items across ICF components to develop the measure. Items were deleted iteratively; those retained fit the model and were related to the construct. Reliability and validity were assessed. A 38-item unidimensional measure of the EPF met all Rasch model requirements. The item difficulty matched the person ability (mean person measure: -0.31; standard error 0.37 logits), and reliability of the person-item hierarchy was excellent at 0.97. Initial validity was adequate. The 38-item EPF measure was developed. It expands the range of assessment post acute stroke; it covers a broad spectrum of difficulty with good initial psychometric properties that, once revalidated, can assist in planning and evaluating early interventions.
Reliability and validity of the Euthanasia Attitude Scale (EAS) for Hong Kong medical doctors.
Tang, Wai-Kiu; Mak, Kwok-Kei; Kam, Philip Ming-Ho; Ho, Joanna Wing-Kiu; Chan, Denise Che-Ying; Suen, To-Lam; Lau, Michael Chak-Kwan; Cheng, Adrian Ka-Chun; Wan, Yuen-Ting; Wan, Ho-Yan; Hussain, Assad
2010-08-01
This study aimed to examine the reliability and validity of the Euthanasia Attitude Scale (EAS) in Hong Kong medical doctors. A total of 107 medical doctors (61.7% men) participated in a survey at clinical settings in 2008. The 21-item EAS was used to assess their attitudes toward euthanasia. The mean (standard deviation) and median of the EAS were 63.60 (60.31) and 63.00. Total EAS scores correlated well with "Ethical Considerations," "Practical Considerations," and "Treasuring Life" (Spearman rho = .37-.96, P < .001) but not "Naturalistic Beliefs." The construct validity of the 3-factor model was appropriate (Kaiser-Meyer-Olkin [KMO] value = 0.90) and showed high internal consistency (Cronbach alpha = .79-.92). The EAS may be a reliable and valid measure for assessing attitudes toward euthanasia in medical professionals.
Using meta-quality to assess the utility of volunteered geographic information for science.
Langley, Shaun A; Messina, Joseph P; Moore, Nathan
2017-11-06
Volunteered geographic information (VGI) has strong potential to be increasingly valuable to scientists in collaboration with non-scientists. The abundance of mobile phones and other wireless forms of communication opens up significant opportunities for the public to get involved in scientific research. As these devices and activities become more abundant, questions of uncertainty and error in volunteer data are emerging as critical components for using volunteer-sourced spatial data. Here we present a methodology for using VGI and assessing its sensitivity to three types of error. More specifically, this study evaluates the reliability of data from volunteers based on their historical patterns. The specific context is a case study in surveillance of tsetse flies, the primary vector of African trypanosomiasis and therefore a public health concern. Reliability, as measured by a reputation score, determines the threshold for accepting the volunteered data for inclusion in a tsetse presence/absence model. Higher reputation scores are successful in identifying areas of higher modeled tsetse prevalence. A dynamic threshold is needed, but the quality of VGI will improve as more data are collected and the errors in identifying reliable participants will decrease. This system allows for two-way communication between researchers and the public, and a way to evaluate the reliability of VGI. Boosting the public's ability to participate in such work can improve disease surveillance and promote citizen science. In the absence of active surveillance, VGI can provide valuable spatial information given that the data are reliable.
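One way such a reputation-based filter could look in code is sketched below; the update rule, reward/penalty values and threshold are hypothetical illustrations, not the authors' algorithm.

```python
def update_reputation(reputation: float, agrees_with_evidence: bool,
                      reward: float = 0.05, penalty: float = 0.10) -> float:
    """Raise a volunteer's reputation when a report agrees with independent
    evidence (e.g., the modeled tsetse presence/absence), lower it otherwise;
    keep the score bounded in [0, 1]."""
    new_rep = reputation + reward if agrees_with_evidence else reputation - penalty
    return min(1.0, max(0.0, new_rep))

def accept_report(reputation: float, threshold: float = 0.6) -> bool:
    """Only reports from sufficiently reliable volunteers enter the model."""
    return reputation >= threshold

print(accept_report(update_reputation(0.58, agrees_with_evidence=True)))  # True
```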
Alwan, Faris M.; Baharum, Adam; Hassan, Geehan S.
2013-01-01
The reliability of the electrical distribution system is a contemporary research field due to diverse applications of electricity in everyday life and diverse industries. However, only a few research papers exist in the literature. This paper proposes a methodology for assessing the reliability of 33/11 kilovolt high-power stations based on the average time between failures. The objective of this paper is to find the optimal fit for the failure data via the time between failures. We determine the parameter estimation for all components of the station. We also estimate the reliability value of each component and the reliability value of the system as a whole. The best-fitting distribution for the time between failures is a three-parameter Dagum distribution with one scale parameter and two shape parameters. Our analysis reveals that the reliability value decreased by 38.2% every 30 days. We believe that the current paper is the first to address this issue and its analysis. Thus, the results obtained in this research reflect its originality. We also suggest the practicality of using these results for power systems for both the maintenance of power systems models and preventive maintenance models. PMID:23936346
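The reliability implied by a fitted Dagum time-between-failures distribution can be evaluated as sketched below; the parameter values are placeholders, not the station estimates reported in the paper.

```python
def dagum_cdf(t: float, a: float, b: float, p: float) -> float:
    """Dagum CDF F(t) = (1 + (t/b)**(-a))**(-p), with shape parameters a, p
    and scale parameter b."""
    return (1.0 + (t / b) ** (-a)) ** (-p)

def reliability(t: float, a: float, b: float, p: float) -> float:
    """Survival/reliability function R(t) = 1 - F(t) for time between failures t."""
    return 1.0 - dagum_cdf(t, a, b, p)

# Placeholder parameters; evaluate reliability at 30, 60 and 90 days.
for t in (30.0, 60.0, 90.0):
    print(t, round(reliability(t, a=1.8, b=45.0, p=0.9), 3))
```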
An overview of the mathematical and statistical analysis component of RICIS
NASA Technical Reports Server (NTRS)
Hallum, Cecil R.
1987-01-01
Mathematical and statistical analysis components of RICIS (Research Institute for Computing and Information Systems) can be used in the following problem areas: (1) quantification and measurement of software reliability; (2) assessment of changes in software reliability over time (reliability growth); (3) analysis of software-failure data; and (4) decision logic for whether to continue or stop testing software. Other areas of interest to NASA/JSC where mathematical and statistical analysis can be successfully employed include: math modeling of physical systems, simulation, statistical data reduction, evaluation methods, optimization, algorithm development, and mathematical methods in signal processing.
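As an example of the reliability-growth modeling mentioned in item (2), a commonly used choice (not necessarily the one adopted in RICIS work) is the Goel-Okumoto non-homogeneous Poisson process, sketched below with placeholder parameters.

```python
import math

def go_mean_failures(t: float, a: float, b: float) -> float:
    """Goel-Okumoto mean value function m(t) = a * (1 - exp(-b*t)):
    expected cumulative failures found by test time t."""
    return a * (1.0 - math.exp(-b * t))

def conditional_reliability(x: float, t: float, a: float, b: float) -> float:
    """Probability of no failure in (t, t+x] after testing up to time t:
    R(x | t) = exp(-(m(t+x) - m(t)))."""
    return math.exp(-(go_mean_failures(t + x, a, b) - go_mean_failures(t, a, b)))

# Placeholder parameters: a = 120 expected total faults, b = 0.05 per time unit.
print(round(conditional_reliability(x=10.0, t=100.0, a=120.0, b=0.05), 3))
```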
Kim, Youl-Ri; Tyrer, Peter; Lee, Hong-Seock; Kim, Sung-Gon; Connan, Frances; Kinnaird, Emma; Olajide, Kike; Crawford, Mike
2016-05-01
The underlying core of personality is insufficiently assessed by any single instrument. This has led to the development of instruments adapted for written records in the assessment of personality disorder. To test the construct validity and inter-rater reliability of a new personality assessment method. This study (four parts) assessed the construct validity of the Schedule for Personality Assessment from Notes and Documents (SPAN-DOC), a dimensional assessment from clinical records. We examined inter-rater reliability using case vignettes (Part 1) and convergent validity in three ways: by comparison with the NEO Five-Factor Inventory in 130 Korean patients (Part 2), by comparison with agreed ICD-11 personality severity levels in two populations (Part 3), and by determining its use in assessing personality status in 90 British patients with eating disorders (Part 4). Internal consistency (alpha = .90) and inter-rater reliability (intraclass correlation coefficient ≥ .88) were satisfactory. Each factor in the five-factor model of personality was correlated with conceptually valid SPAN-DOC variables. The SPAN-DOC domain traits in those with eating disorders were categorized into 3 clusters: self-aggrandisement, emotionally unstable, and anxious/dependent. This study provides preliminary support for the usefulness of SPAN-DOC in the assessment of personality disorder. Copyright © 2016 John Wiley & Sons, Ltd.
Wood, David L; Sawicki, Gregory S; Miller, M David; Smotherman, Carmen; Lukens-Bull, Katryne; Livingood, William C; Ferris, Maria; Kraemer, Dale F
2014-01-01
National consensus statements recommend that providers regularly assess the transition readiness skills of adolescent and young adults (AYA). In 2010 we developed a 29-item version of Transition Readiness Assessment Questionnaire (TRAQ). We reevaluated item performance and factor structure, and reassessed the TRAQ's reliability and validity. We surveyed youth from 3 academic clinics in Jacksonville, Florida; Chapel Hill, North Carolina; and Boston, Massachusetts. Participants were AYA with special health care needs aged 14 to 21 years. From a convenience sample of 306 patients, we conducted item reduction strategies and exploratory factor analysis (EFA). On a second convenience sample of 221 patients, we conducted confirmatory factor analysis (CFA). Internal reliability was assessed by Cronbach's alpha and criterion validity. Analyses were conducted by the Wilcoxon rank sum test and mixed linear models. The item reduction and EFA resulted in a 20-item scale with 5 identified subscales. The CFA conducted on a second sample provided a good fit to the data. The overall scale has high reliability overall (Cronbach's alpha = .94) and good reliability for 4 of the 5 subscales (Cronbach's alpha ranging from .90 to .77 in the pooled sample). Each of the 5 subscale scores were significantly higher for adolescents aged 18 years and older versus those younger than 18 (P < .0001) in both univariate and multivariate analyses. The 20-item, 5-factor structure for the TRAQ is supported by EFA and CFA on independent samples and has good internal reliability and criterion validity. Additional work is needed to expand or revise the TRAQ subscales and test their predictive validity. Copyright © 2014 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
Wennberg, Richard; Cheyne, Douglas
2014-05-01
To assess the reliability of MEG source imaging (MSI) of anterior temporal spikes through detailed analysis of the localization and orientation of source solutions obtained for a large number of spikes that were separately confirmed by intracranial EEG to be focally generated within a single, well-characterized spike focus. MSI was performed on 64 identical right anterior temporal spikes from an anterolateral temporal neocortical spike focus. The effects of different volume conductors (sphere and realistic head model), removal of noise with low frequency filters (LFFs) and averaging multiple spikes were assessed in terms of the reliability of the source solutions. MSI of single spikes resulted in scattered dipole source solutions that showed reasonable reliability for localization at the lobar level, but only for solutions with a goodness-of-fit exceeding 80% using a LFF of 3 Hz. Reliability at a finer level of intralobar localization was limited. Spike averaging significantly improved the reliability of source solutions and averaging 8 or more spikes reduced dependency on goodness-of-fit and data filtering. MSI performed on topographically identical individual spikes from an intracranially defined classical anterior temporal lobe spike focus was limited by low reliability (i.e., scattered source solutions) in terms of fine, sublobar localization within the ipsilateral temporal lobe. Spike averaging significantly improved reliability. MSI performed on individual anterior temporal spikes is limited by low reliability. Reduction of background noise through spike averaging significantly improves the reliability of MSI solutions. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
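The improvement from averaging 8 or more spikes follows from the fact that uncorrelated background noise in an N-spike average falls roughly as 1/sqrt(N). The toy simulation below illustrates that scaling with a hypothetical waveform and noise level, not the actual MEG recordings.

```python
# Illustration of the averaging effect described above: with uncorrelated sensor noise,
# the residual noise in an N-spike average shrinks roughly as 1/sqrt(N).
import numpy as np

rng = np.random.default_rng(1)
n_spikes, n_samples = 64, 200
template = np.exp(-0.5 * ((np.arange(n_samples) - 100) / 8.0) ** 2)   # idealised spike waveform
trials = template + 0.8 * rng.normal(size=(n_spikes, n_samples))      # single-trial "recordings"

for n in (1, 8, 64):
    avg = trials[:n].mean(axis=0)
    residual = np.std(avg - template)
    print(f"N={n:2d}  residual noise = {residual:.3f}  (expected ~ {0.8/np.sqrt(n):.3f})")
```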
Wisconsin's Model Academic Standards for Music.
ERIC Educational Resources Information Center
Nikolay, Pauli; Grady, Susan; Stefonek, Thomas
To assist parents and educators in preparing students for the 21st century, Wisconsin citizens have become involved in the development of challenging academic standards in 12 curricular areas. Having clear standards for students and teachers makes it possible to develop rigorous local curricula and valid, reliable assessments. This model of…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-15
... that is based on rigorous scientifically based research methods to assess the effectiveness of a...) Relies on measurements or observational methods that provide reliable and valid data across evaluators... of innovative, cohesive models that are based on research and have demonstrated that they effectively...
Lance A. Vickers; Thomas R. Fox; David L. Loftis; David A. Boucugnani
2013-01-01
The difficulty of achieving reliable oak (Quercus spp.) regeneration is well documented. Application of silvicultural techniques to facilitate oak regeneration largely depends on current regeneration potential. A computer model to assess regeneration potential based on existing advanced reproduction in Appalachian hardwoods was developed by David...
The reliability of the Adelaide in-shoe foot model.
Bishop, Chris; Hillier, Susan; Thewlis, Dominic
2017-07-01
Understanding the biomechanics of the foot is essential for many areas of research and clinical practice, such as orthotic interventions and footwear development. Despite the widespread attention paid to the biomechanics of the foot during gait, what largely remains unknown is how the foot moves inside the shoe. This study investigated the reliability of the Adelaide In-Shoe Foot Model, which was designed to quantify in-shoe foot kinematics and kinetics during walking. Intra-rater reliability was assessed in 30 participants over five walking trials whilst wearing shoes during two data collection sessions, separated by one week. Sufficient reliability for use was interpreted as a coefficient of multiple correlation and intra-class correlation coefficient of >0.61. Inter-rater reliability was investigated separately in a second sample of 10 adults by two researchers with experience in applying markers for the purpose of motion analysis. The results indicated good consistency in waveform estimation for most kinematic and kinetic data, as well as good inter- and intra-rater reliability. The exceptions are the peak medial ground reaction force, the minimum abduction angle and the peak abduction/adduction external hindfoot joint moments, which showed less than acceptable repeatability. Based on our results, the Adelaide in-shoe foot model can be used with confidence for 24 commonly measured biomechanical variables during shod walking. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Biryuk, V. V.; Tsapkova, A. B.; Larin, E. A.; Livshiz, M. Y.; Sheludko, L. P.
2018-01-01
A set of mathematical models for calculating the reliability indexes of structurally complex multifunctional combined installations in heat and power supply systems was developed. Reliability of energy supply is considered a required condition for the creation and operation of heat and power supply systems. The optimal value of the power supply system coefficient F is based on an economic assessment of the consumers' losses caused by the under-supply of electric power and on the additional system expenses for the creation and operation of an emergency capacity reserve. The rationing of reliability indexes (RI) for industrial heat supply is based on the concept of a technological safety margin for production processes. The definition of rationed RI values for the heat supply of communal consumers is based on the air temperature level inside the heated premises. The complex makes it possible to solve a number of practical tasks for ensuring the reliability of heat supply to consumers. A probabilistic model is developed for calculating the reliability indexes of combined multipurpose heat and power plants in heat-and-power supply systems. The complex of models and calculation programs can be used to solve a wide range of specific tasks in the optimization of schemes and parameters of combined heat and power plants and systems, as well as in determining the efficiency of various redundancy methods to ensure a specified reliability of power supply.
Risk assessment of storm surge disaster based on numerical models and remote sensing
NASA Astrophysics Data System (ADS)
Liu, Qingrong; Ruan, Chengqing; Zhong, Shan; Li, Jian; Yin, Zhonghui; Lian, Xihu
2018-06-01
Storm surge is one of the most serious ocean disasters in the world. Risk assessment of storm surge disaster for coastal areas has important implications for planning economic development and reducing disaster losses. Based on risk assessment theory, this paper combines coastal hydrological observations, a numerical storm surge model and multi-source remote sensing data, proposes methods for valuing the hazard and vulnerability of storm surge, and builds a storm surge risk assessment model. Storm surges for different recurrence periods are simulated with the numerical model, and the flooded areas and flooding depths are calculated and used to assess the storm surge hazard; remote sensing data and GIS technology are used to extract key coastal objects and classify coastal land use, which serve as the basis for the vulnerability assessment. The storm surge risk assessment model is applied to a typical coastal city, and the result demonstrates the reliability and validity of the model. The building and application of the storm surge risk assessment model provides a basic reference for city development planning and strengthens disaster prevention and mitigation.
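The hazard-times-vulnerability combination described above can be sketched as a cell-wise product of a normalized flooding-depth grid and a land-use vulnerability grid; the grids, weights and class breaks below are hypothetical stand-ins for the paper's actual scheme.

```python
# A schematic of the hazard-times-vulnerability combination described above, on small
# hypothetical raster grids; the paper's actual weighting scheme is not reproduced.
import numpy as np

flood_depth = np.array([[0.0, 0.5, 1.2],
                        [0.2, 1.8, 2.5],
                        [0.0, 0.3, 0.9]])               # simulated inundation depth (m)
land_use_weight = np.array([[0.2, 0.8, 0.8],
                            [0.2, 1.0, 1.0],
                            [0.1, 0.5, 0.8]])           # vulnerability weight per land-use class

hazard = flood_depth / flood_depth.max()                # normalise hazard to [0, 1]
risk = hazard * land_use_weight                         # cell-wise risk index
risk_class = np.digitize(risk, bins=[0.25, 0.5, 0.75])  # 0 = low ... 3 = very high
print(risk_class)
```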
Salum, Giovanni A; DeSousa, Diogo Araújo; Manfro, Gisele Gus; Pan, Pedro Mario; Gadelha, Ary; Brietzke, Elisa; Miguel, Eurípedes Constantino; Mari, Jair J; do Rosário, Maria Conceição; Grassi-Oliveira, Rodrigo
2016-01-01
To investigate the validity and reliability of a multi-informant approach to measuring child maltreatment (CM) comprising seven questions assessing CM administered to children and their parents in a large community sample. Our sample comprised 2,512 children aged 6 to 12 years and their parents. Child maltreatment (CM) was assessed with three questions answered by the children and four answered by their parents, covering physical abuse, physical neglect, emotional abuse and sexual abuse. Confirmatory factor analysis was used to compare the fit indices of different models. Convergent and divergent validity were tested using parent-report and teacher-report scores on the Strengths and Difficulties Questionnaire. Discriminant validity was investigated using the Development and Well-Being Assessment to divide subjects into five diagnostic groups: typically developing controls (n = 1,880), fear disorders (n = 108), distress disorders (n = 76), attention deficit hyperactivity disorder (n = 143) and oppositional defiant disorder/conduct disorder (n = 56). A higher-order model with one higher-order factor (child maltreatment) encompassing two lower-order factors (child report and parent report) exhibited the best fit to the data and this model's reliability results were acceptable. As expected, child maltreatment was positively associated with measures of psychopathology and negatively associated with prosocial measures. All diagnostic category groups had higher levels of overall child maltreatment than typically developing children. We found evidence for the validity and reliability of this brief measure of child maltreatment using data from a large survey combining information from parents and their children.
Guvenc, Gulten; Seven, Memnun; Akyuz, Aygul
2016-06-01
To adapt and psychometrically test the Health Belief Model Scale for Human Papilloma Virus (HPV) and Its Vaccination (HBMS-HPVV) for use in a Turkish population and to assess the Human Papilloma Virus Knowledge score (HPV-KS) among female college students. Instrument adaptation and psychometric testing study. The sample consisted of 302 nursing students at a nursing school in Turkey between April and May 2013. Questionnaire-based data were collected from the participants. Information regarding HBMS-HPVV and HPV knowledge and descriptive characteristic of participants was collected using translated HBMS-HPVV and HPV-KS. Test-retest reliability was evaluated and Cronbach α was used to assess internal consistency reliability, and exploratory factor analysis was used to assess construct validity of the HBMS-HPVV. The scale consists of 4 subscales that measure 4 constructs of the Health Belief Model covering the perceived susceptibility and severity of HPV and the benefits and barriers. The final 14-item scale had satisfactory validity and internal consistency. Cronbach α values for the 4 subscales ranged from 0.71 to 0.78. Total HPV-KS ranged from 0 to 8 (scale range, 0-10; 3.80 ± 2.12). The HBMS-HPVV is a valid and reliable instrument for measuring young Turkish women's beliefs and attitudes about HPV and its vaccination. Copyright © 2015 North American Society for Pediatric and Adolescent Gynecology. Published by Elsevier Inc. All rights reserved.
Introducing the TAPS Pyramid Model
ERIC Educational Resources Information Center
Earle, Sarah
2015-01-01
The Teacher Assessment in Primary Science (TAPS) project is a three-year project based at Bath Spa University and funded by the Primary Science Teaching Trust (PSTT). It aims to develop support for a valid, reliable and manageable system of science assessment that will have a positive impact on children's learning. In this article, the author…
ERIC Educational Resources Information Center
Kaspar, Roman; Döring, Ottmar; Wittmann, Eveline; Hartig, Johannes; Weyland, Ulrike; Nauerth, Annette; Möllers, Michaela; Rechenbach, Simone; Simon, Julia; Worofka, Iberé
2016-01-01
Valid and reliable standardized assessment of nursing competencies is needed to monitor the quality of vocational education and training (VET) in nursing and evaluate learning outcomes for care work trainees with increasingly heterogeneous learning backgrounds. To date, however, the modeling of professional competencies has not yet evolved into…
Accurate assessment of chronic human exposure to atmospheric criteria pollutants, such as ozone, is critical for understanding human health risks associated with living in environments with elevated ambient pollutant concentrations. In this study, we analyzed a data set from a...
A novel integrated assessment methodology of urban water reuse.
Listowski, A; Ngo, H H; Guo, W S; Vigneswaran, S
2011-01-01
Wastewater is no longer considered a waste product, and water reuse needs to play a stronger part in securing urban water supply. Although treatment technologies for water reclamation have significantly improved, the question that deserves further analysis is how the selection of a particular wastewater treatment technology relates to performance and sustainability. The proposed assessment model integrates: (i) technology, characterised by selected quantity and quality performance parameters; (ii) productivity, efficiency and reliability criteria; (iii) quantitative performance indicators; and (iv) development of an evaluation model. The challenges related to the hierarchy and selection of performance indicators have been resolved through the case study analysis. The goal of this study is to validate a new assessment methodology in relation to the performance of microfiltration (MF) technology, a key element of the treatment process. Specific performance data and measurements were obtained at Control and Data Acquisition Points (CP) to satisfy the input-output inventory in relation to water resources, products, material flows, energy requirements, chemicals use, etc. The performance assessment process comprises analysis and the necessary linking across important parametric functions, leading to reliable outcomes and results.
Castaño, Fernando; Beruvides, Gerardo; Villalonga, Alberto; Haber, Rodolfo E
2018-05-10
On-chip LiDAR sensors for vehicle collision avoidance are a rapidly expanding area of research and development. The assessment of reliable obstacle detection using data collected by LiDAR sensors has become a key issue that the scientific community is actively exploring. The design of a self-tuning methodology and its implementation are presented in this paper, to maximize the reliability of a LiDAR sensor network for obstacle detection in 'Internet of Things' (IoT) mobility scenarios. The Webots Automobile 3D simulation tool for emulating sensor interaction in complex driving environments is selected in order to achieve that objective. Furthermore, a model-based framework is defined that employs a point-cloud clustering technique and an error-based prediction model library composed of a multilayer perceptron neural network, k-nearest neighbors and linear regression models. Finally, a reinforcement learning technique, specifically a Q-learning method, is implemented to determine the number of LiDAR sensors required to increase sensor reliability for obstacle localization tasks. In addition, an IoT driving assistance user scenario connecting a five-LiDAR sensor network is designed and implemented to validate the accuracy of the computational intelligence-based framework. The results demonstrated that the self-tuning method is an appropriate strategy to increase the reliability of the sensor network while minimizing detection thresholds.
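The Q-learning step described above can be sketched as a small tabular learner whose actions are candidate sensor counts and whose reward trades detection performance against sensor cost. The reward model below is a hypothetical stand-in for the paper's Webots-based evaluation, and the update shown is the simplified single-step (bandit-style) form of Q-learning.

```python
# A tabular, single-step Q-learning sketch of the sensor-count selection idea described above.
# The reward function and "reliability" simulator are hypothetical stand-ins for the
# paper's Webots-based evaluation.
import numpy as np

rng = np.random.default_rng(42)
actions = np.arange(1, 6)                 # candidate number of LiDAR sensors (1..5)
q = np.zeros(len(actions))
alpha, epsilon, cost_per_sensor = 0.1, 0.2, 0.05

def simulated_reward(n_sensors: int) -> float:
    detection = 1.0 - 0.5 ** n_sensors    # toy model: each sensor independently detects 50%
    noise = 0.02 * rng.normal()
    return detection - cost_per_sensor * n_sensors + noise

for episode in range(2000):
    i = rng.integers(len(actions)) if rng.random() < epsilon else int(np.argmax(q))
    r = simulated_reward(actions[i])
    q[i] += alpha * (r - q[i])            # single-step (bandit-style) Q-update

print("Q-values:", np.round(q, 3), "-> chosen sensor count:", actions[int(np.argmax(q))])
```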
Bem Sex Role Inventory Validation in the International Mobility in Aging Study.
Ahmed, Tamer; Vafaei, Afshin; Belanger, Emmanuelle; Phillips, Susan P; Zunzunegui, Maria-Victoria
2016-09-01
This study investigated the measurement structure of the Bem Sex Role Inventory (BSRI) with different factor analysis methods. Most previous studies on validity applied exploratory factor analysis (EFA) to examine the BSRI. We aimed to assess the psychometric properties and construct validity of the 12-item short-form BSRI administered to 1,995 older adults from wave 1 of the International Mobility in Aging Study (IMIAS). We used Cronbach's alpha to assess internal consistency reliability and confirmatory factor analysis (CFA) to assess psychometric properties. EFA revealed a three-factor model, further confirmed by CFA and compared with the original two-factor structure model. Results revealed that a two-factor solution (instrumentality-expressiveness) has satisfactory construct validity and superior fit to the data compared to the three-factor solution. The two-factor solution confirms expected gender differences in older adults. The 12-item BSRI provides a brief, psychometrically sound, and reliable instrument in international samples of older adults.
Mitchell, John D; Amir, Rabia; Montealegre-Gallegos, Mario; Mahmood, Feroze; Shnider, Marc; Mashari, Azad; Yeh, Lu; Bose, Ruma; Wong, Vanessa; Hess, Philip; Amador, Yannis; Jeganathan, Jelliffe; Jones, Stephanie B; Matyal, Robina
2018-06-01
While standardized examinations and data from simulators and phantom models can assess knowledge and manual skills for ultrasound, an Objective Structured Clinical Examination (OSCE) could assess workflow understanding. We recruited 8 experts to develop an OSCE to assess workflow understanding in perioperative ultrasound. The experts used a binary grading system to score 19 graduating anesthesia residents at 6 stations. Overall average performance was 86.2%, and 3 stations had an acceptable internal reliability (Kuder-Richardson formula 20 coefficient >0.5). After refinement, this OSCE can be combined with standardized examinations and data from simulators and phantom models to assess proficiency in ultrasound.
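The internal-reliability criterion cited above (Kuder-Richardson formula 20 coefficient > 0.5) can be computed from a binary examinees-by-items score matrix as in the sketch below; the simulated scores are hypothetical.

```python
# A minimal sketch of the Kuder-Richardson formula 20 used above as the internal-reliability
# criterion, computed on a hypothetical residents-by-items binary score matrix.
import numpy as np

def kr20(binary_scores: np.ndarray) -> float:
    """binary_scores: rows = examinees, columns = dichotomously scored items (0/1)."""
    k = binary_scores.shape[1]
    p = binary_scores.mean(axis=0)
    total_var = binary_scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - np.sum(p * (1 - p)) / total_var)

rng = np.random.default_rng(3)
ability = rng.normal(size=(19, 1))                       # 19 residents, latent skill
scores = (ability + rng.normal(size=(19, 6)) > 0).astype(int)   # 6 binary checklist items
print(round(kr20(scores), 2))
```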
An Online Risk Monitor System (ORMS) to Increase Safety and Security Levels in Industry
NASA Astrophysics Data System (ADS)
Zubair, M.; Rahman, Khalil Ur; Hassan, Mehmood Ul
2013-12-01
The main idea of this research is to develop an Online Risk Monitor System (ORMS) based on Living Probabilistic Safety Assessment (LPSA). The article highlights the essential features and functions of ORMS. The basic models and modules, such as the Reliability Data Update Model (RDUM), running time update, redundant system unavailability update, Engineered Safety Features (ESF) unavailability update and general system update, are described in this study. ORMS not only provides quantitative analysis but also highlights qualitative aspects of risk measures. ORMS is capable of automatically updating the online risk models and reliability parameters of equipment. ORMS can support the decision-making process of operators and managers in nuclear power plants.
Brazhnik, Olga; Jones, John F.
2007-01-01
Producing reliable information is the ultimate goal of data processing. The ocean of data created with the advances of science and technologies calls for integration of data coming from heterogeneous sources that are diverse in their purposes, business rules, underlying models and enabling technologies. Reference models, Semantic Web, standards, ontology, and other technologies enable fast and efficient merging of heterogeneous data, while the reliability of produced information is largely defined by how well the data represent the reality. In this paper we initiate a framework for assessing the informational value of data that includes data dimensions; aligning data quality with business practices; identifying authoritative sources and integration keys; merging models; uniting updates of varying frequency and overlapping or gapped data sets. PMID:17071142
Development and construct validity of the Classroom Strategies Scale-Observer Form.
Reddy, Linda A; Fabiano, Gregory; Dudek, Christopher M; Hsu, Louis
2013-12-01
Research on progress monitoring has almost exclusively focused on student behavior and not on teacher practices. This article presents the development and validation of a new teacher observational assessment (Classroom Strategies Scale) of classroom instructional and behavioral management practices. The theoretical underpinnings and empirical basis for the instructional and behavioral management scales are presented. The Classroom Strategies Scale (CSS) evidenced overall good reliability estimates including internal consistency, interrater reliability, test-retest reliability, and freedom from item bias on important teacher demographics (age, educational degree, years of teaching experience). Confirmatory factor analyses (CFAs) of CSS data from 317 classrooms were carried out to assess the level of empirical support for (a) a 4 first-order factor theory concerning teachers' instructional practices, and (b) a 4 first-order factor theory concerning teachers' behavior management practice. Several fit indices indicated acceptable fit of the (a) and (b) CFA models to the data, as well as acceptable fit of less parsimonious alternative CFA models that included 1 or 2 second-order factors. Information-theory-based indices generally suggested that the (a) and (b) CFA models fit better than some more parsimonious alternative CFA models that included constraints on relations of first-order factors. Overall, CFA first-order and higher order factor results support the CSS-Observer Total, Composite, and subscales. Suggestions for future measurement development efforts are outlined. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Identification of reliable gridded reference data for statistical downscaling methods in Alberta
NASA Astrophysics Data System (ADS)
Eum, H. I.; Gupta, A.
2017-12-01
Climate models provide essential information to assess impacts of climate change at regional and global scales, and statistical downscaling methods are applied to prepare climate model data for various applications, such as hydrologic and ecologic modelling at the watershed scale. As the reliability and (spatial and temporal) resolution of statistically downscaled climate data depend mainly on the reference data, identifying the most reliable reference dataset is crucial for statistical downscaling. A growing number of gridded climate products are available for key climate variables that are the main inputs to regional modelling systems. However, inconsistencies among these climate products, for example different combinations of climate variables, varying data domains and data lengths, and data accuracy that varies with the physiographic characteristics of the landscape, have caused significant challenges in selecting the most suitable reference climate data for various environmental studies and modelling. Employing several observation-based daily gridded climate products available in the public domain, i.e. thin-plate-spline regression products (ANUSPLIN and TPS), an inverse distance method (Alberta Townships), a numerical climate model (North American Regional Reanalysis) and an optimum interpolation technique (Canadian Precipitation Analysis), this study evaluates the accuracy of the climate products at each grid point by comparing them with Adjusted and Homogenized Canadian Climate Data (AHCCD) observations of precipitation and minimum and maximum temperature over the province of Alberta. Based on the performance of the climate products at AHCCD stations, we ranked the reliability of these publicly available climate products by station elevation, discretized into several classes. According to the rank of climate products for each elevation class, we identified the most reliable climate products based on the elevation of target points. A web-based system was developed to allow users to easily select the most reliable reference climate data at each target point based on the elevation of the grid cell. By constructing the best combination of reference data for the study domain, the accuracy and reliability of statistically downscaled climate projections can be significantly improved.
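The ranking step described above can be sketched as scoring each gridded product against station observations and ranking products within elevation classes; the column names, simulated values and choice of RMSE as the error metric below are illustrative assumptions rather than the study's exact procedure.

```python
# A sketch of the ranking step described above: score each gridded product against station
# observations and rank products within elevation classes. Product columns, simulated values
# and the RMSE metric are illustrative assumptions, not the study's exact procedure.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
stations = pd.DataFrame({
    "elev_m": rng.uniform(300, 2500, 200),
    "obs_precip": rng.gamma(2.0, 1.5, 200),
})
products = ["ANUSPLIN", "TPS", "NARR", "CaPA"]
for p in products:                                   # hypothetical product estimates at stations
    stations[p] = stations["obs_precip"] + rng.normal(0, 0.3 + 0.2 * rng.random(), 200)

stations["elev_class"] = pd.cut(stations["elev_m"], bins=[0, 800, 1600, 3000],
                                labels=["low", "mid", "high"])

def rmse(est, obs):
    return float(np.sqrt(np.mean((est - obs) ** 2)))

scores = (stations.groupby("elev_class", observed=True)
          .apply(lambda g: pd.Series({p: rmse(g[p], g["obs_precip"]) for p in products})))
print(scores.rank(axis=1))                           # rank 1 = most reliable product per class
```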
A Methodology for Quantifying Certain Design Requirements During the Design Phase
NASA Technical Reports Server (NTRS)
Adams, Timothy; Rhodes, Russel
2005-01-01
A methodology for developing and balancing quantitative design requirements for safety, reliability, and maintainability has been proposed. Conceived as the basis of a more rational approach to the design of spacecraft, the methodology would also be applicable to the design of automobiles, washing machines, television receivers, or almost any other commercial product. Heretofore, it has been common practice to start by determining the requirements for reliability of elements of a spacecraft or other system to ensure a given design life for the system. Next, safety requirements are determined by assessing the total reliability of the system and adding redundant components and subsystems necessary to attain safety goals. As thus described, common practice leaves the maintainability burden to fall to chance; therefore, there is no control of recurring costs or of the responsiveness of the system. The means that have been used in assessing maintainability have been oriented toward determining the logistical sparing of components so that the components are available when needed. The process established for developing and balancing quantitative requirements for safety (S), reliability (R), and maintainability (M) derives and integrates NASA's top-level safety requirements and the controls needed to obtain program key objectives for safety and recurring cost (see figure). Being quantitative, the process conveniently uses common mathematical models. Even though the process is shown as being worked from the top down, it can also be worked from the bottom up. This process uses three math models: (1) the binomial distribution (greater-than-or-equal-to case), (2) reliability for a series system, and (3) the Poisson distribution (less-than-or-equal-to case). The zero-fail case for the binomial distribution approximates the commonly known exponential distribution or "constant failure rate" distribution. Either model can be used. The binomial distribution was selected for modeling flexibility because it conveniently addresses both the zero-fail and failure cases. The failure case is typically used for unmanned spacecraft as with missiles.
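The three models named above can be exercised with a few lines of arithmetic; the requirement values below are hypothetical and serve only to show how a top-level target flows down to element-level numbers.

```python
# Worked numbers for the three models named above, with hypothetical requirement values.
import math

# (1) Series system: system reliability is the product of element reliabilities, so an
#     equal-allocation requirement for n elements is R_elem = R_sys ** (1/n).
r_sys, n_elements = 0.999, 10
r_elem = r_sys ** (1.0 / n_elements)

# (2) Binomial, zero-fail case: probability of zero failures in n independent demands is R**n.
n_demands = 50
p_zero_fail = r_elem ** n_demands

# (3) Poisson, less-than-or-equal-to case: probability of at most k maintenance actions when
#     the expected number over the design life is lam.
lam, k = 2.0, 3
p_at_most_k = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

print(f"per-element reliability requirement: {r_elem:.5f}")
print(f"P(zero failures in {n_demands} demands): {p_zero_fail:.3f}")
print(f"P(<= {k} maintenance actions): {p_at_most_k:.3f}")
```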
ERIC Educational Resources Information Center
Simons, Johan; Daly, Daniel; Theodorou, Fani; Caron, Cindy; Simons, Joke; Andoniadou, Elena
2008-01-01
The purpose of this study was to assess validity and reliability of the TGMD-2 on Flemish children with intellectual disability. The total sample consisted of 99 children aged 7-10 years of which 67 were boys and 32 were girls. A factor analysis supported a two factor model of the TGMD-2. A low significant age effect was also found for the object…
The reliability of vertical jump tests between the Vertec and My Jump phone application.
Yingling, Vanessa R; Castro, Dimitri A; Duong, Justin T; Malpartida, Fiorella J; Usher, Justin R; O, Jenny
2018-01-01
The vertical jump is used to estimate sports performance capabilities and physical fitness in children, elderly, non-athletic and injured individuals. Different jump techniques and measurement tools are available to assess vertical jump height and peak power; however, their use is limited by access to laboratory settings, excessive cost and/or time constraints thus making these tools oftentimes unsuitable for field assessment. A popular field test uses the Vertec and the Sargent vertical jump with countermovement; however, new low cost, easy to use tools are becoming available, including the My Jump iOS mobile application (app). The purpose of this study was to assess the reliability of the My Jump relative to values obtained by the Vertec for the Sargent stand and reach vertical jump (VJ) test. One hundred and thirty-five healthy participants aged 18-39 years (94 males, 41 females) completed three maximal Sargent VJ with countermovement that were simultaneously measured using the Vertec and the My Jump . Jump heights were quantified for each jump and peak power was calculated using the Sayers equation. Four separate ICC estimates and their 95% confidence intervals were used to assess reliability. Two analyses (with jump height and calculated peak power as the dependent variables, respectively) were based on a single rater, consistency, two-way mixed-effects model, while two others (with jump height and calculated peak power as the dependent variables, respectively) were based on a single rater, absolute agreement, two-way mixed-effects model. Moderate to excellent reliability relative to the degree of consistency between the Vertec and My Jump values was found for jump height (ICC = 0.813; 95% CI [0.747-0.863]) and calculated peak power (ICC = 0.926; 95% CI [0.897-0.947]). However, poor to good reliability relative to absolute agreement for VJ height (ICC = 0.665; 95% CI [0.050-0.859]) and poor to excellent reliability relative to absolute agreement for peak power (ICC = 0.851; 95% CI [0.272-0.946]) between the Vertec and My Jump values were found; Vertec VJ height, and thus, Vertec calculated peak power values, were significantly higher than those calculated from My Jump values ( p < 0.0001). The My Jump app may provide a reliable measure of vertical jump height and calculated peak power in multiple field and laboratory settings without the need of costly equipment such as force plates or Vertec. The reliability relative to degree of consistency between the Vertec and My Jump app was moderate to excellent. However, the reliability relative to absolute agreement between Vertec and My Jump values contained significant variation (based on CI values), thus, it is recommended that either the My Jump or the Vertec be used to assess VJ height in repeated measures within subjects' designs; these measurement tools should not be considered interchangeable within subjects or in group measurement designs.
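The distinction drawn above between consistency and absolute-agreement ICCs can be made concrete with a short calculation from a subjects-by-devices matrix; the simulated jump heights below, including a systematic offset between devices, are hypothetical.

```python
# A sketch of the two ICC forms contrasted above (consistency vs. absolute agreement),
# computed from a subjects-by-devices matrix; the data here are hypothetical jump heights.
import numpy as np

def icc_consistency_and_agreement(x: np.ndarray):
    """x: n subjects (rows) by k measurement devices/raters (columns)."""
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1, keepdims=True)
    col_means = x.mean(axis=0, keepdims=True)
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)
    mse = np.sum((x - row_means - col_means + grand) ** 2) / ((n - 1) * (k - 1))
    icc_c = (msr - mse) / (msr + (k - 1) * mse)                        # consistency, ICC(3,1)
    icc_a = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)  # absolute agreement, ICC(2,1)
    return icc_c, icc_a

rng = np.random.default_rng(5)
true_height = rng.normal(45, 8, 135)                  # hypothetical jump heights (cm)
vertec = true_height + rng.normal(0, 2, 135) + 3.0    # one device carries a systematic offset
my_jump = true_height + rng.normal(0, 2, 135)
icc_c, icc_a = icc_consistency_and_agreement(np.column_stack([vertec, my_jump]))
print(f"consistency ICC = {icc_c:.2f}, absolute-agreement ICC = {icc_a:.2f}")
```

Because one device carries a systematic offset, the absolute-agreement ICC comes out lower than the consistency ICC, which mirrors the pattern reported above.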
Empirical methods for assessing meaningful neuropsychological change following epilepsy surgery.
Sawrie, S M; Chelune, G J; Naugle, R I; Lüders, H O
1996-11-01
Traditional methods for assessing the neurocognitive effects of epilepsy surgery are confounded by practice effects, test-retest reliability issues, and regression to the mean. This study employs 2 methods for assessing individual change that allow direct comparison of changes across both individuals and test measures. Fifty-one medically intractable epilepsy patients completed a comprehensive neuropsychological battery twice, approximately 8 months apart, prior to any invasive monitoring or surgical intervention. First, a Reliable Change (RC) index score was computed for each test score to take into account the reliability of that measure, and a cutoff score was empirically derived to establish the limits of statistically reliable change. These indices were subsequently adjusted for expected practice effects. The second approach used a regression technique to establish "change norms" along a common metric that models both expected practice effects and regression to the mean. The RC index scores provide the clinician with a statistical means of determining whether a patient's retest performance is "significantly" changed from baseline. The regression norms for change allow the clinician to evaluate the magnitude of a given patient's change on 1 or more variables along a common metric that takes into account the reliability and stability of each test measure. Case data illustrate how these methods provide an empirically grounded means for evaluating neurocognitive outcomes following medical interventions such as epilepsy surgery.
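The Reliable Change approach described above can be written down directly; the sketch below implements a practice-adjusted RC index in the Jacobson-Truax style, with hypothetical normative values (baseline SD, test-retest reliability, mean practice effect) and an illustrative 90% cutoff.

```python
# A minimal sketch of the practice-adjusted Reliable Change index described above; the
# normative values (test-retest r, baseline SD, mean practice effect) are hypothetical.
import math

def reliable_change_index(x1, x2, sd_baseline, retest_r, practice_effect=0.0):
    """Jacobson-Truax style RC index, adjusted for an expected practice effect."""
    sem = sd_baseline * math.sqrt(1.0 - retest_r)       # standard error of measurement
    s_diff = math.sqrt(2.0 * sem ** 2)                  # SE of the difference score
    return (x2 - x1 - practice_effect) / s_diff

rc = reliable_change_index(x1=92, x2=85, sd_baseline=12, retest_r=0.85, practice_effect=2.5)
print(f"RC = {rc:.2f}  ->  {'reliable decline' if rc < -1.645 else 'no reliable change'}")
```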
Cavitating Propeller Performance in Inclined Shaft Conditions with OpenFOAM: PPTC 2015 Test Case
NASA Astrophysics Data System (ADS)
Gaggero, Stefano; Villa, Diego
2018-05-01
In this paper, we present our analysis of the non-cavitating and cavitating unsteady performances of the Potsdam Propeller Test Case (PPTC) in oblique flow. For our calculations, we used the Reynolds-averaged Navier-Stokes equation (RANSE) solver from the open-source OpenFOAM libraries. We selected the homogeneous mixture approach to solve for multiphase flow with phase change, using the volume of fluid (VoF) approach to solve the multiphase flow and modeling the mass transfer between vapor and water with the Schnerr-Sauer model. Comparing the model results with the experimental measurements collected during the Second Workshop on Cavitation and Propeller Performance - SMP'15 enabled our assessment of the reliability of the open-source calculations. Comparisons with the numerical data collected during the workshop enabled further analysis of the reliability of different flow solvers from which we produced an overview of recommended guidelines (mesh arrangements and solver setups) for accurate numerical prediction even in off-design conditions. Lastly, we propose a number of calculations using the boundary element method developed at the University of Genoa for assessing the reliability of this dated but still widely adopted approach for design and optimization in the preliminary stages of very demanding test cases.
Zumpano, Camila Eugênia; Mendonça, Tânia Maria da Silva; Silva, Carlos Henrique Martins da; Correia, Helena; Arnold, Benjamin; Pinto, Rogério de Melo Costa
2017-01-23
This study aimed to perform the cross-cultural adaptation and validation of the Patient-Reported Outcomes Measurement Information System (PROMIS) Global Health scale in the Portuguese language. The ten Global Health items were cross-culturally adapted by the method proposed in the Functional Assessment of Chronic Illness Therapy (FACIT). The instrument's final version in Portuguese was self-administered by 1,010 participants in Brazil. The scale's precision was verified by floor and ceiling effects analysis, internal consistency reliability, and test-retest reliability. Exploratory and confirmatory factor analyses were used to assess the construct's validity and the instrument's dimensionality. Calibration of the items used the Graded Response Model proposed by Samejima. Four global items required adjustments after the pretest. Analysis of the psychometric properties showed that the Global Health scale has good reliability, with a Cronbach's alpha of 0.83 and an intra-class correlation of 0.89. Exploratory and confirmatory factor analyses showed good fit for the previously established two-dimensional model. The Global Physical Health and Global Mental Health scales showed good latent trait coverage according to the Graded Response Model. The PROMIS Global Health items showed equivalence in Portuguese compared to the original version and satisfactory psychometric properties for application in clinical practice and research in the Brazilian population.
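Item calibration with Samejima's graded response model rests on logistic boundary probabilities whose differences give the category probabilities; the item parameters in the sketch below are hypothetical, not the calibrated PROMIS values.

```python
# A sketch of Samejima's graded response model used for the item calibration above:
# boundary probabilities are logistic in theta, and category probabilities are their
# differences. The item parameters here are hypothetical.
import numpy as np

def grm_category_probs(theta: float, a: float, b: np.ndarray) -> np.ndarray:
    """a: discrimination; b: ordered category thresholds (length m-1 for m categories)."""
    boundary = 1.0 / (1.0 + np.exp(-a * (theta - b)))      # P(X >= k) for k = 1..m-1
    cum = np.concatenate(([1.0], boundary, [0.0]))
    return cum[:-1] - cum[1:]                              # P(X = k) for k = 0..m-1

probs = grm_category_probs(theta=0.5, a=1.8, b=np.array([-1.2, 0.0, 1.1, 2.0]))
print(np.round(probs, 3), probs.sum())                     # category probabilities sum to 1
```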
Developing a model for hospital inherent safety assessment: Conceptualization and validation.
Yari, Saeed; Akbari, Hesam; Gholami Fesharaki, Mohammad; Khosravizadeh, Omid; Ghasemi, Mohammad; Barsam, Yalda; Akbari, Hamed
2018-01-01
Paying attention to the safety of hospitals, as the most crucial institutions for providing medical and health services, in which a bundle of facilities, equipment, and human resources exists, is of significant importance. The present research aims at developing a model for assessing hospitals' safety based on principles of inherent safety design. Face validity (30 experts), content validity (20 experts), construct validity (268 examples), convergent validity, and divergent validity were employed to validate the prepared questionnaire; item analysis, Cronbach's alpha, the ICC test (to measure reliability), and the composite reliability coefficient were used to measure primary reliability. The relationships between variables and factors were confirmed at the 0.05 significance level by confirmatory factor analysis (CFA) and structural equation modeling (SEM) using Smart-PLS. R-square and factor loading values, which were higher than 0.67 and 0.300 respectively, indicated a strong fit. Moderation (0.970), simplification (0.959), substitution (0.943), and minimization (0.5008) had the largest weights, in that order, in determining the inherent safety of a hospital. Moderation, simplification, and substitution carry more weight for inherent safety, while minimization carries the least, which could be due to its definition as minimizing the risk.
Anderson, Donald D; Segal, Neil A; Kern, Andrew M; Nevitt, Michael C; Torner, James C; Lynch, John A
2012-01-01
Recent findings suggest that contact stress is a potent predictor of subsequent symptomatic osteoarthritis development in the knee. However, much larger numbers of knees (likely on the order of hundreds, if not thousands) need to be reliably analyzed to achieve the statistical power necessary to clarify this relationship. This study assessed the reliability of new semiautomated computational methods for estimating contact stress in knees from large population-based cohorts. Ten knees of subjects from the Multicenter Osteoarthritis Study were included. Bone surfaces were manually segmented from sequential 1.0 Tesla magnetic resonance imaging slices by three individuals on two nonconsecutive days. Four individuals then registered the resulting bone surfaces to corresponding bone edges on weight-bearing radiographs, using a semi-automated algorithm. Discrete element analysis methods were used to estimate contact stress distributions for each knee. Segmentation and registration reliabilities (day-to-day and interrater) for peak and mean medial and lateral tibiofemoral contact stress were assessed with Shrout-Fleiss intraclass correlation coefficients (ICCs). The segmentation and registration steps of the modeling approach were found to have excellent day-to-day (ICC 0.93-0.99) and good inter-rater reliability (0.84-0.97). This approach for estimating compartment-specific tibiofemoral contact stress appears to be sufficiently reliable for use in large population-based cohorts.
NASA Astrophysics Data System (ADS)
Yu, Bo; Ning, Chao-lie; Li, Bing
2017-03-01
A probabilistic framework for durability assessment of concrete structures in marine environments was proposed in terms of reliability and sensitivity analysis, which takes into account the uncertainties under the environmental, material, structural and executional conditions. A time-dependent probabilistic model of chloride ingress was established first to consider the variations in various governing parameters, such as the chloride concentration, chloride diffusion coefficient, and age factor. Then the Nataf transformation was adopted to transform the non-normal random variables from the original physical space into the independent standard Normal space. After that the durability limit state function and its gradient vector with respect to the original physical parameters were derived analytically, based on which the first-order reliability method was adopted to analyze the time-dependent reliability and parametric sensitivity of concrete structures in marine environments. The accuracy of the proposed method was verified by comparing with the second-order reliability method and the Monte Carlo simulation. Finally, the influences of environmental conditions, material properties, structural parameters and execution conditions on the time-dependent reliability of concrete structures in marine environments were also investigated. The proposed probabilistic framework can be implemented in the decision-making algorithm for the maintenance and repair of deteriorating concrete structures in marine environments.
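The durability limit state described above, g = C_crit - C(x, t) with Fick-type chloride ingress, can also be checked by crude Monte Carlo sampling; the paper itself uses FORM with a Nataf transformation, and all distribution parameters below are hypothetical.

```python
# A crude Monte Carlo sketch of the durability limit state described above,
# g = C_crit - C(x, t) with C(x, t) = C_s * (1 - erf(x / (2 * sqrt(D * t)))).
# The paper uses FORM with a Nataf transformation; all parameters here are hypothetical.
import numpy as np
from scipy.special import erf
from scipy.stats import norm

rng = np.random.default_rng(11)
n = 200_000
t = 50.0 * 365.25 * 24 * 3600                              # 50 years, in seconds
cover = rng.normal(60e-3, 6e-3, n)                         # concrete cover depth (m)
D = rng.lognormal(mean=np.log(5e-13), sigma=0.4, size=n)   # chloride diffusion coeff. (m^2/s)
c_s = rng.lognormal(mean=np.log(3.0), sigma=0.2, size=n)   # surface chloride (% binder)
c_crit = rng.normal(0.6, 0.1, n)                           # critical chloride content (% binder)

c_at_rebar = c_s * (1.0 - erf(cover / (2.0 * np.sqrt(D * t))))
pf = np.mean(c_crit - c_at_rebar < 0.0)                    # probability of depassivation
beta = -norm.ppf(pf)                                       # corresponding reliability index
print(f"P_f = {pf:.3f}, reliability index beta = {beta:.2f}")
```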
Burns, Scott A; Cleland, Joshua A; Carpenter, Kristin; Mintken, Paul E
2016-03-01
Examine the interrater reliability of cervicothoracic and shoulder physical examination in patients with a primary complaint of shoulder pain. Single-group repeated-measures design for interrater reliability. Orthopaedic physical therapy clinics. Twenty-one patients with a primary complaint of shoulder pain underwent a standardized examination by a physical therapist (PT). A PT conducted the first examination and one of two additional PTs conducted the 2nd examination. The Cohen κ and weighted κ were used to calculate the interrater reliability of ordinal level data. Intraclass correlation coefficients model 2,1 (ICC2,1) and the 95% confidence intervals were calculated to determine the interrater reliability. The kappa coefficients ranged from -.24 to .83 for the mobility assessment of the glenohumeral, acromioclavicular and sternoclavicular joints. The kappa coefficients ranged from -.20 to .58 for joint mobility assessment of the cervical and thoracic spine. The kappa coefficients ranged from .23 to 1.0 for special tests of the shoulder and cervical spine. The present study reported the reliability of a comprehensive upper quarter physical examination for a group of patients with a primary report of shoulder pain. The reliability varied considerably for the cervical and shoulder examination and was significantly higher for the examination of muscle length and cervical range of motion. Copyright © 2015 Elsevier Ltd. All rights reserved.
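The kappa statistics reported above can be reproduced in a few lines for hypothetical ratings; unweighted kappa suits binary special tests, while a weighted kappa respects the ordering of ordinal mobility grades.

```python
# A small sketch of the agreement statistics used above: unweighted Cohen's kappa for a
# binary special test and linearly weighted kappa for an ordinal mobility grading.
# The ratings below are hypothetical.
from sklearn.metrics import cohen_kappa_score

rater1_special_test = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0]
rater2_special_test = [1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
print("kappa:", round(cohen_kappa_score(rater1_special_test, rater2_special_test), 2))

rater1_mobility = [2, 3, 1, 2, 2, 3, 1, 1, 2, 3, 2, 2, 1, 3, 2, 2, 3, 1, 2, 2, 1]
rater2_mobility = [2, 3, 2, 2, 1, 3, 1, 2, 2, 3, 2, 1, 1, 3, 2, 3, 3, 1, 2, 2, 2]
print("weighted kappa:",
      round(cohen_kappa_score(rater1_mobility, rater2_mobility, weights="linear"), 2))
```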
O'Connor, Teresia M; Cerin, Ester; Hughes, Sheryl O; Robles, Jessica; Thompson, Deborah I; Mendoza, Jason A; Baranowski, Tom; Lee, Rebecca E
2014-01-15
Latino preschoolers (3-5 year old children) have among the highest rates of obesity. Low levels of physical activity (PA) are a risk factor for obesity. Characterizing what Latino parents do to encourage or discourage their preschooler to be physically active can help inform interventions to increase their PA. The objective was therefore to develop and assess the psychometrics of a new instrument: the Preschooler Physical Activity Parenting Practices (PPAPP) among a Latino sample, to assess parenting practices used to encourage or discourage PA among preschool-aged children. Cross-sectional study of 240 Latino parents who reported the frequency of using PA parenting practices. 95% of respondents were mothers; 42% had more than a high school education. Child mean age was 4.5 (±0.9) years (52% male). Test-retest reliability was assessed in 20%, 2 weeks later. We assessed the fit of a priori models using Confirmatory factor analyses (CFA). In a separate sub-sample (35%), preschool-aged children wore accelerometers to assess associations with their PA and PPAPP subscales. The a-priori models showed poor fit to the data. A modified factor structure for encouraging PPAPP had one multiple-item scale: engagement (15 items), and two single-items (have outdoor toys; not enroll in sport-reverse coded). The final factor structure for discouraging PPAPP had 4 subscales: promote inactive transport (3 items), promote screen time (3 items), psychological control (4 items) and restricting for safety (4 items). Test-retest reliability (ICC) for the two scales ranged from 0.56-0.85. Cronbach's alphas ranged from 0.5-0.9. Several sub-factors correlated in the expected direction with children's objectively measured PA. The final models for encouraging and discouraging PPAPP had moderate to good fit, with moderate to excellent test-retest reliabilities. The PPAPP should be further evaluated to better assess its associations with children's PA and offers a new tool for measuring PPAPP among Latino families with preschool-aged children.
Big data analytics for the Future Circular Collider reliability and availability studies
NASA Astrophysics Data System (ADS)
Begy, Volodimir; Apollonio, Andrea; Gutleber, Johannes; Martin-Marquez, Manuel; Niemi, Arto; Penttinen, Jussi-Pekka; Rogova, Elena; Romero-Marin, Antonio; Sollander, Peter
2017-10-01
Responding to the European Strategy for Particle Physics update 2013, the Future Circular Collider study explores scenarios of circular frontier colliders for the post-LHC era. One branch of the study assesses industrial approaches to model and simulate the reliability and availability of the entire particle collider complex based on the continuous monitoring of CERN’s accelerator complex operation. The modelling is based on an in-depth study of the CERN injector chain and LHC, and is carried out as a cooperative effort with the HL-LHC project. The work so far has revealed that a major challenge is obtaining accelerator monitoring and operational data with sufficient quality, to automate the data quality annotation and calculation of reliability distribution functions for systems, subsystems and components where needed. A flexible data management and analytics environment that permits integrating the heterogeneous data sources, the domain-specific data quality management algorithms and the reliability modelling and simulation suite is a key enabler to complete this accelerator operation study. This paper describes the Big Data infrastructure and analytics ecosystem that has been put in operation at CERN, serving as the foundation on which reliability and availability analysis and simulations can be built. This contribution focuses on data infrastructure and data management aspects and presents case studies chosen for its validation.
Kramp, Kelvin H; van Det, Marc J; Hoff, Christiaan; Lamme, Bas; Veeger, Nic J G M; Pierie, Jean-Pierre E N
2015-01-01
Global Operative Assessment of Laparoscopic Skills (GOALS) assessment has been designed to evaluate skills in laparoscopic surgery. A longitudinal blinded study of randomized video fragments was conducted to estimate the validity and reliability of GOALS in novice trainees. In total, 10 trainees each performed 6 consecutive laparoscopic cholecystectomies. Sixty procedures were recorded on video. Video fragments of (1) opening of the peritoneum; (2) dissection of Calot's triangle and achievement of critical view of safety; and (3) dissection of the gallbladder from the liver bed were blinded, randomized, and rated by 2 consultant surgeons using GOALS. Also, a grade was given for overall competence. The correlation of GOALS with live observation Objective Structured Assessment of Technical Skills (OSATS) scores was calculated. Construct validity was estimated using the Friedman 2-way analysis of variance by ranks and the Wilcoxon signed-rank test. The interrater reliability was calculated using the absolute and consistency agreement 2-way random-effects model intraclass correlation coefficient. A high correlation was found between mean GOALS score (r = 0.879, p = 0.021) and mean OSATS score. The GOALS score increased significantly across the 6 procedures (p = 0.002). The trainees performed significantly better on their sixth when compared with their first cholecystectomy (p = 0.004). The consistency agreement interrater reliability was 0.37 for the mean GOALS score (p = 0.002) and 0.55 for overall competence (p < 0.001) of the 3 video fragments. The validity observed in this randomized blinded longitudinal study supports the existing evidence that GOALS is a valid tool for assessment of novice trainees. A relatively low reliability was found in this study. Copyright © 2014 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.
Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy
2017-01-01
Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales excepting Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome, at at-least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs. This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.
Li, Wei Bo; Greiter, Matthias; Oeh, Uwe; Hoeschen, Christoph
2011-12-01
The reliability of biokinetic models is essential for the assessment of internal doses and a radiation risk analysis for the public and occupational workers exposed to radionuclides. In the present study, a method for assessing the reliability of biokinetic models by means of uncertainty and sensitivity analysis was developed. In the first part of the paper, the parameter uncertainty was analyzed for two biokinetic models of zirconium (Zr); one was reported by the International Commission on Radiological Protection (ICRP), and one was developed at the Helmholtz Zentrum München-German Research Center for Environmental Health (HMGU). In the second part of the paper, the parameter uncertainties and distributions of the Zr biokinetic models evaluated in Part I are used as the model inputs for identifying the most influential parameters in the models. Furthermore, the most influential model parameter on the integral of the radioactivity of Zr over 50 y in source organs after ingestion was identified. The results of the systemic HMGU Zr model showed that over the first 10 d, the parameters of transfer rates between blood and other soft tissues have the largest influence on the content of Zr in the blood and the daily urinary excretion; however, after day 1,000, the transfer rate from bone to blood becomes dominant. For the retention in bone, the transfer rate from blood to bone surfaces has the most influence out to the endpoint of the simulation; the transfer rate from blood to the upper larger intestine contributes a lot in the later days; i.e., after day 300. The alimentary tract absorption factor (fA) influences mostly the integral of radioactivity of Zr in most source organs after ingestion.
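The sensitivity analysis described above can be sketched as a one-at-a-time perturbation of transfer rates in a small compartment model; the toy blood-bone-urine system and rate constants below are hypothetical and are not the ICRP or HMGU zirconium parameters.

```python
# A schematic one-at-a-time sensitivity check on a toy biokinetic compartment model
# (blood <-> bone, with urinary excretion). The transfer rates are hypothetical and are
# not the ICRP or HMGU zirconium values.
import numpy as np
from scipy.integrate import solve_ivp

def biokinetics(t, y, k_blood_bone, k_bone_blood, k_blood_urine):
    blood, bone, urine = y
    d_blood = -(k_blood_bone + k_blood_urine) * blood + k_bone_blood * bone
    d_bone = k_blood_bone * blood - k_bone_blood * bone
    d_urine = k_blood_urine * blood
    return [d_blood, d_bone, d_urine]

def bone_content_at(t_days, rates):
    sol = solve_ivp(biokinetics, (0.0, t_days), [1.0, 0.0, 0.0], args=rates,
                    dense_output=True, rtol=1e-8)
    return sol.sol(t_days)[1]

base = (0.3, 0.002, 0.5)                       # per-day transfer rates (hypothetical)
ref = bone_content_at(1000.0, base)
for i, name in enumerate(["blood->bone", "bone->blood", "blood->urine"]):
    bumped = list(base)
    bumped[i] *= 1.10                          # +10% one-at-a-time perturbation
    sens = (bone_content_at(1000.0, tuple(bumped)) - ref) / ref / 0.10
    print(f"normalised sensitivity of bone content to {name}: {sens:+.2f}")
```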
Kyongho Son; Christina Tague; Carolyn Hunsaker
2016-01-01
The effect of fine-scale topographic variability on model estimates of ecohydrologic responses to climate variability in California's Sierra Nevada watersheds has not been adequately quantified and may be important for supporting reliable climate-impact assessments. This study tested the effect of digital elevation model (DEM) resolution on model accuracy and estimates...
NASA Astrophysics Data System (ADS)
Zemenkova, M. Yu; Zemenkov, Yu D.; Shantarin, V. D.
2016-10-01
The paper reviews the development of methodology for calculation of hydrocarbon emissions during seepage and evaporation to monitor the reliability and safety of hydrocarbon storage and transportation. The authors have analyzed existing methods, models and techniques for assessing the amount of evaporated oil. Models used for predicting the material balance of multicomponent two-phase systems have been discussed. The results of modeling the open-air hydrocarbon evaporation from an oil spill are provided and exemplified by an emergency pit. Dependences and systems of differential equations have been obtained to assess parameters of mass transfer from the open surface of a liquid multicomponent mixture.
Modern psychometrics for assessing achievement goal orientation: a Rasch analysis.
Muis, Krista R; Winne, Philip H; Edwards, Ordene V
2009-09-01
A program of research is needed that assesses the psychometric properties of instruments designed to quantify students' achievement goal orientations to clarify inconsistencies across previous studies and to provide a stronger basis for future research. We conducted traditional psychometric and modern Rasch-model analyses of the Achievement Goals Questionnaire (AGQ, Elliot & McGregor, 2001) and the Patterns of Adaptive Learning Scale (PALS, Midgley et al., 2000) to provide an in-depth analysis of the two most popular instruments in educational psychology. For Study 1, 217 undergraduate students enrolled in educational psychology courses participated. Thirty-four were male and 181 were female (two did not respond). Participants completed the AGQ in the context of their educational psychology class. For Study 2, 126 undergraduate students enrolled in educational psychology courses participated. Thirty were male and 95 were female (one did not respond). Participants completed the PALS in the context of their educational psychology class. Traditional psychometric assessments of the AGQ and PALS replicated previous studies. For both, reliability estimates ranged from good to very good for raw subscale scores, and fit for the models of goal orientations was good. Based on traditional psychometrics, the AGQ and PALS are valid and reliable indicators of achievement goals. Rasch analyses revealed that estimates of reliability for items were very good, but respondent ability estimates varied from poor to good for both the AGQ and PALS. These findings indicate that items validly and reliably reflect a group's aggregate goal orientation, but using either instrument to characterize an individual's goal orientation is hazardous.
Lehotkay, R; Saraswathi Devi, T; Raju, M V R; Bada, P K; Nuti, S; Kempf, N; Carminati, G Galli
2015-03-01
In this study, realised in collaboration with the Department of Psychology and Parapsychology of Andhra University, the Aberrant Behavior Checklist-Community (ABC-C) was validated in Telugu, the official language of Andhra Pradesh, one of India's 28 states. To assess the factor validity and reliability of this Telugu version, 120 participants with moderate to profound intellectual disability (94 men and 26 women, mean age 25.2, SD 7.1) were rated by the staff of the Lebenshilfe Institution for Mentally Handicapped in Visakhapatnam, Andhra Pradesh, India. Rating data were analysed with a confirmatory factor analysis. The internal consistency was estimated by Cronbach's alpha. To confirm the test-retest reliability, 50 participants were rated twice with an interval of 4 weeks, and 50 were rated by pairs of raters to assess inter-rater reliability. Confirmatory factor analysis revealed that the root mean square error of approximation (RMSEA) was equal to 0.06, the comparative fit index (CFI) was equal to 0.77, and the Tucker-Lewis index (TLI) was equal to 0.77, which indicated that the model with five correlated factors had a good fit. Coefficient alpha ranged from 0.85 to 0.92 across the five subscales. Spearman's rank correlation coefficients for inter-rater reliability tests ranged from 0.65 to 0.75, and the correlations for test-retest reliability ranged from 0.58 to 0.76. All reliability coefficients were statistically significant (P < 0.01). The Telugu version of the ABC-C thus showed factor validity and reliability comparable to the original English version and appears to be useful for assessing behaviour disorders in Indian people with intellectual disabilities. © 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
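For readers unfamiliar with the reliability statistics cited above, the following Python sketch shows how Cronbach's alpha for a subscale and a Spearman test-retest correlation are typically computed; the rating matrix here is randomly generated and is not the ABC-C Telugu data.

import numpy as np
from scipy.stats import spearmanr

def cronbach_alpha(items):
    """items: (n_subjects, n_items) ratings for one subscale."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))                            # latent trait for 50 subjects
t1 = np.clip(np.round(1.5 + latent + rng.normal(0, 0.7, (50, 8))), 0, 3)  # 8 items, 0-3 scale
t2 = np.clip(t1 + rng.integers(-1, 2, size=t1.shape), 0, 3)  # 4-week retest with noise

print("Cronbach's alpha (time 1):", round(cronbach_alpha(t1), 3))
rho, p = spearmanr(t1.sum(axis=1), t2.sum(axis=1))
print("test-retest Spearman rho :", round(rho, 2), " p =", round(p, 4))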
External validation of Global Evaluative Assessment of Robotic Skills (GEARS).
Aghazadeh, Monty A; Jayaratna, Isuru S; Hung, Andrew J; Pan, Michael M; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C
2015-11-01
We demonstrate the construct validity, reliability, and utility of the Global Evaluative Assessment of Robotic Skills (GEARS), a clinical assessment tool designed to measure robotic technical skills, in an independent cohort using an in vivo animal training model. Using a cross-sectional observational study design, 47 voluntary participants were categorized as experts (>30 robotic cases completed as primary surgeon) or trainees. The trainee group was further divided into intermediates (≥5 but ≤30 cases) or novices (<5 cases). All participants completed a standardized in vivo robotic task in a porcine model. Task performance was evaluated by two expert robotic surgeons and self-assessed by the participants using the GEARS assessment tool. The Kruskal-Wallis test was used to compare GEARS performance scores to determine construct validity; Spearman's rank correlation measured interobserver reliability; and Cronbach's alpha was used to assess internal consistency. Performance evaluations were completed on nine experts and 38 trainees (14 intermediate, 24 novice). Experts demonstrated superior performance compared to intermediates and novices overall and in all individual domains (p < 0.0001). In comparing intermediates and novices, the overall performance difference trended toward significance (p = 0.0505), while the individual domains of efficiency and autonomy were significantly different between groups (p = 0.0280 and 0.0425, respectively). Interobserver reliability between expert ratings was confirmed with a strong correlation observed (r = 0.857, 95 % CI [0.691, 0.941]). Expert and participant scoring showed less agreement (r = 0.435, 95 % CI [0.121, 0.689] and r = 0.422, 95 % CI [0.081, 0.672]). Internal consistency was excellent for experts and participants (α = 0.96, 0.98, 0.93). In an independent cohort, GEARS was able to differentiate between different robotic skill levels, demonstrating excellent construct validity. As a standardized assessment tool, GEARS maintained consistency and reliability for an in vivo robotic surgical task and may be applied for skills evaluation in a broad range of robotic procedures.
Data Applicability of Heritage and New Hardware for Launch Vehicle System Reliability Models
NASA Technical Reports Server (NTRS)
Al Hassan, Mohammad; Novack, Steven
2015-01-01
Many launch vehicle systems are designed and developed using heritage and new hardware. In most cases, the heritage hardware undergoes modifications to fit new functional system requirements, impacting the failure rates and, ultimately, the reliability data. New hardware, which lacks historical data, is often compared to like systems when estimating failure rates. Some qualification of applicability for the data source to the current system should be made. Accurately characterizing the reliability data applicability and quality under these circumstances is crucial to developing model estimations that support confident decisions on design changes and trade studies. This presentation will demonstrate a data-source classification method that ranks reliability data according to applicability and quality criteria to a new launch vehicle. This method accounts for similarities/dissimilarities in source and applicability, as well as operating environments like vibrations, acoustic regime, and shock. This classification approach will be followed by uncertainty-importance routines to assess the need for additional data to reduce uncertainty.
NASA Astrophysics Data System (ADS)
Strunz, Richard; Herrmann, Jeffrey W.
2011-12-01
The hot fire test strategy for liquid rocket engines has always been a concern of the space industry and agencies alike because no recognized standard exists. Previous hot fire test plans focused on the verification of performance requirements but did not explicitly include reliability as a dimensioning variable. The stakeholders are, however, concerned about a hot fire test strategy that balances reliability, schedule, and affordability. A multiple-criteria test planning model is presented that provides a framework to optimize the hot fire test strategy with respect to stakeholder concerns. The Staged Combustion Rocket Engine Demonstrator, a program of the European Space Agency, is used as an example to provide a quantitative answer to the claim that a reduced-thrust-scale demonstrator is cost beneficial for a subsequent flight engine development. Scalability aspects of major subsystems are considered in the definition of the prior information within the Bayesian framework. The model is also applied to assess the impact of an increase in the demonstrated reliability level on schedule and affordability.
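The abstract does not give its formulas, but the flavour of reliability-driven test planning can be sketched with the classical zero-failure (success-run) relation and a Beta-Binomial update; the prior parameters and target values below are invented for illustration and are not those of the demonstrator study.

from math import ceil, log
from scipy.stats import beta

# Classical success-run requirement: n zero-failure tests to demonstrate
# reliability R at confidence C, from n = ln(1 - C) / ln(R).
R_target, C = 0.99, 0.90
print("zero-failure tests (classical):", ceil(log(1.0 - C) / log(R_target)))

# Bayesian variant: a Beta(a0, b0) prior on engine reliability stands in for
# prior (e.g. sub-scale demonstrator) information; values are illustrative only.
a0, b0 = 8.0, 1.0
for n_success in (0, 50, 100, 200, 300):
    posterior = beta(a0 + n_success, b0)          # assumes no failures observed
    lower = posterior.ppf(1.0 - C)                # one-sided 90% lower bound on R
    print(f"{n_success:3d} successful hot fire tests -> 90% lower bound on R = {lower:.4f}")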
A Student Assessment Tool for Standardized Patient Simulations (SAT-SPS): Psychometric analysis.
Castro-Yuste, Cristina; García-Cabanillas, María José; Rodríguez-Cornejo, María Jesús; Carnicer-Fuentes, Concepción; Paloma-Castro, Olga; Moreno-Corral, Luis Javier
2018-05-01
The evaluation of the level of clinical competence acquired by the student is a complex process that must meet various requirements to ensure its quality. The psychometric analysis of the data collected by the assessment tools used is a fundamental aspect to guarantee the student's competence level. To conduct a psychometric analysis of an instrument which assesses clinical competence in nursing students at simulation stations with standardized patients in OSCE-format tests. The construct of clinical competence was operationalized as a set of observable and measurable behaviors, measured by the newly-created Student Assessment Tool for Standardized Patient Simulations (SAT-SPS), which was comprised of 27 items. The categories assigned to the items were 'incorrect or not performed' (0), 'acceptable' (1), and 'correct' (2). 499 nursing students. Data were collected by two independent observers during the assessment of the students' performance at a four-station OSCE with standardized patients. Descriptive statistics were used to summarize the variables. The difficulty levels and floor and ceiling effects were determined for each item. Reliability was analyzed using internal consistency and inter-observer reliability. The validity analysis was performed considering face validity, content and construct validity (through exploratory factor analysis), and criterion validity. Internal reliability and inter-observer reliability were higher than 0.80. The construct validity analysis suggested a three-factor model accounting for 37.1% of the variance. These three factors were named 'Nursing process', 'Communication skills', and 'Safe practice'. A significant correlation was found between the scores obtained and the students' grades in general, as well as with the grades obtained in subjects with clinical content. The assessment tool has proven to be sufficiently reliable and valid for the assessment of the clinical competence of nursing students using standardized patients. This tool has three main components: the nursing process, communication skills, and safety management. Copyright © 2018 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boring, Ronald; Mandelli, Diego; Rasmussen, Martin
2016-06-01
This report presents an application of a computation-based human reliability analysis (HRA) framework called the Human Unimodel for Nuclear Technology to Enhance Reliability (HUNTER). HUNTER has been developed not as a standalone HRA method but rather as a framework that ties together different HRA methods to model the dynamic risk of human activities as part of an overall probabilistic risk assessment (PRA). While we have adopted particular methods to build an initial model, the HUNTER framework is meant to be intrinsically flexible to new pieces that achieve particular modeling goals. In the present report, the HUNTER implementation has the following goals: integration with a high-fidelity thermal-hydraulic model capable of modeling nuclear power plant behaviors and transients; consideration of a PRA context; incorporation of a solid psychological basis for operator performance; and demonstration of a functional dynamic model of a plant upset condition and appropriate operator response. This report outlines these efforts and presents the case study of a station blackout scenario to demonstrate the various modules developed to date under the HUNTER research umbrella.
NASA Astrophysics Data System (ADS)
Schöniger, Anneli; Wöhling, Thomas; Nowak, Wolfgang
2014-05-01
Bayesian model averaging ranks the predictive capabilities of alternative conceptual models based on Bayes' theorem. Each model is weighted by its posterior probability of being the best one in the considered set of models. Finally, their predictions are combined into a robust weighted average and the predictive uncertainty can be quantified. This rigorous procedure does not, however, account for possible instabilities due to measurement noise in the calibration data set. This is a major drawback, since posterior model weights may suffer a lack of robustness related to the uncertainty in noisy data, which may compromise the reliability of model ranking. We present a new statistical concept to account for measurement noise as a source of uncertainty for the weights in Bayesian model averaging. Our suggested upgrade reflects the limited information content of data for the purpose of model selection. It allows us to assess the significance of the determined posterior model weights, the confidence in model selection, and the accuracy of the quantified predictive uncertainty. Our approach rests on a brute-force Monte Carlo framework. We determine the robustness of model weights against measurement noise by repeatedly perturbing the observed data with random realizations of measurement error. Then, we analyze the induced variability in posterior model weights and introduce this "weighting variance" as an additional term into the overall prediction uncertainty analysis scheme. We further determine the theoretical upper limit on the performance of the model set that is imposed by measurement noise. As an extension to the merely relative model ranking, this analysis provides a measure of absolute model performance. To decide whether better data or longer time series are needed to ensure a robust basis for model selection, we resample the measurement time series and assess the convergence of model weights for increasing time series length. We illustrate our suggested approach with an application to model selection between different soil-plant models, following up on a study by Wöhling et al. (2013). Results show that measurement noise compromises the reliability of model ranking and causes a significant amount of weighting uncertainty if the calibration data time series is not long enough to compensate for its noisiness. This additional contribution to the overall predictive uncertainty is neglected without our approach. Thus, we strongly advocate including our suggested upgrade in the Bayesian model averaging routine.
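The perturbation idea is easy to prototype. The Python sketch below uses two hypothetical linear models, Gaussian errors, equal prior weights, and an assumed error standard deviation (none of which come from the study) to compute posterior model weights and then re-compute them under repeated noise realizations, yielding the "weighting variance" described above.

import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(0.0, 10.0, 30)

# Two hypothetical competing models of the same observable (they stand in for
# the soil-plant models of the study, which are not reproduced here).
def model_a(x): return 1.0 + 0.9 * x
def model_b(x): return 1.0 + 0.7 * x + 0.02 * x**2

sigma = 0.5                                        # assumed measurement-error std. dev.
observed = model_a(t) + rng.normal(0.0, sigma, t.size)

def bma_weights(data):
    """Posterior model weights from Gaussian likelihoods and equal priors."""
    loglik = np.array([-0.5 * np.sum((data - m(t))**2) / sigma**2
                       for m in (model_a, model_b)])
    w = np.exp(loglik - loglik.max())
    return w / w.sum()

# Repeatedly perturb the data with fresh error realizations and record the
# induced spread of the weights (the "weighting variance" of the abstract).
reps = np.array([bma_weights(observed + rng.normal(0.0, sigma, t.size))
                 for _ in range(2000)])
print("mean weights     :", reps.mean(axis=0).round(3))
print("weight std. dev. :", reps.std(axis=0).round(3))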
Assessment of NDE reliability data
NASA Technical Reports Server (NTRS)
Yee, B. G. W.; Couchman, J. C.; Chang, F. H.; Packman, D. F.
1975-01-01
Twenty sets of relevant nondestructive test (NDT) reliability data were identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations was formulated, and a model to grade the quality and validity of the data sets was developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, were formulated for each NDE method. A comprehensive computer program was written and debugged to calculate the probability of flaw detection at several confidence limits by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. An example of the calculated reliability of crack detection in bolt holes by an automatic eddy current method is presented.
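The binomial probability-of-detection calculation mentioned above can be reproduced in a few lines of Python; the hit/miss counts below are made up, and the Clopper-Pearson bound is one standard choice of binomial confidence limit (the original program's exact conventions are not specified in the abstract).

from scipy.stats import beta

def pod_lower_bound(hits, trials, confidence=0.95):
    """One-sided lower confidence bound on the probability of detection
    from binomial hit/miss data (Clopper-Pearson construction)."""
    if hits == 0:
        return 0.0
    return beta.ppf(1.0 - confidence, hits, trials - hits + 1)

hits, trials = 28, 29   # e.g. 28 detected cracks in 29 inspected bolt holes (made-up counts)
print("point estimate of POD:", round(hits / trials, 3))
for c in (0.90, 0.95, 0.99):
    print(f"{int(c * 100)}% lower confidence bound:", round(pod_lower_bound(hits, trials, c), 3))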
Improving Learner Handovers in Medical Education.
Warm, Eric J; Englander, Robert; Pereira, Anne; Barach, Paul
2017-07-01
Multiple studies have demonstrated that the information included in the Medical Student Performance Evaluation fails to reliably predict medical students' future performance. This faulty transfer of information can lead to harm when poorly prepared students fail out of residency or, worse, are shuttled through the medical education system without an honest accounting of their performance. Such poor learner handovers likely arise from two root causes: (1) the absence of agreed-on outcomes of training and/or accepted assessments of those outcomes, and (2) the lack of standardized ways to communicate the results of those assessments. To improve the current learner handover situation, an authentic, shared mental model of competency is needed; high-quality tools to assess that competency must be developed and tested; and transparent, reliable, and safe ways to communicate this information must be created. To achieve these goals, the authors propose using a learner handover process modeled after a patient handover process. The CLASS model includes a description of the learner's Competency attainment, a summary of the Learner's performance, an Action list and statement of Situational awareness, and Synthesis by the receiving program. This model also includes coaching oriented towards improvement along the continuum of education and care. Just as studies have evaluated patient handover models using metrics that matter most to patients, studies must evaluate this learner handover model using metrics that matter most to providers, patients, and learners.
Tabuse, Hideaki; Kalali, Amir; Azuma, Hideki; Ozaki, Norio; Iwata, Nakao; Naitoh, Hiroshi; Higuchi, Teruhiko; Kanba, Shigenobu; Shioe, Kunihiko; Akechi, Tatsuo; Furukawa, Toshi A
2007-09-30
The Hamilton Rating Scale for Depression (HAMD) is the de facto international gold standard for the assessment of depression. There are some criticisms, however, especially with regard to its inter-rater reliability, due to the lack of standardized questions or explicit scoring procedures. The GRID-HAMD was developed to provide standardized explicit scoring conventions and a structured interview guide for administration and scoring of the HAMD. We developed the Japanese version of the GRID-HAMD and examined its inter-rater reliability among experienced and inexperienced clinicians (n=70), how rater characteristics may affect it, and how training can improve it in the course of a model training program using videotaped interviews. The results showed that the inter-rater reliability of the GRID-HAMD total score was excellent to almost perfect and those of most individual items were also satisfactory to excellent, both with experienced and inexperienced raters, and both before and after the training. With its standardized definitions, questions and detailed scoring conventions, the GRID-HAMD appears to be the best achievable set of interview guides for the HAMD and can provide a solid tool for highly reliable assessment of depression severity.
Validation of the Arabic Version of the Iowa Infant Feeding Attitude Scale among Lebanese Women.
Charafeddine, Lama; Tamim, Hani; Soubra, Marwa; de la Mora, Arlene; Nabulsi, Mona
2016-05-01
There is a need in the Arab world for validated instruments that can reliably assess infant feeding attitudes among women. The 17-item Iowa Infant Feeding Attitude Scale (IIFAS) has consistently shown good reliability and validity in different cultures and the ability to predict breastfeeding intention and exclusivity. This study assessed the psychometric properties of the Arabic version of the IIFAS (IIFAS-A). After translation to classical Arabic and back-translation to English, the IIFAS-A was pilot tested among 20 women for comprehension, clarity, length, and cultural appropriateness. The IIFAS-A was then validated among 170 women enrolled in a breastfeeding promotion and support clinical trial in Lebanon. The IIFAS-A showed acceptable internal consistency reliability (Cronbach's α = 0.640), with principal components analysis revealing that it is unidimensional. The 17 items had good interitem reliabilities ranging between 0.599 and 0.665. The number of breastfed children was the only predictor of the overall IIFAS-A score in a multivariate stepwise regression model (β = 1.531, P < .0001). The 17-item IIFAS-A is a reliable and valid instrument for measuring women's infant feeding attitudes in the Arab context. © The Author(s) 2015.
Dependability modeling and assessment in UML-based software development.
Bernardi, Simona; Merseguer, José; Petriu, Dorina C
2012-01-01
Assessment of software nonfunctional properties (NFP) is an important problem in software development. In the context of model-driven development, an emerging approach for the analysis of different NFPs consists of the following steps: (a) to extend the software models with annotations describing the NFP of interest; (b) to transform automatically the annotated software model to the formalism chosen for NFP analysis; (c) to analyze the formal model using existing solvers; (d) to assess the software based on the results and give feedback to designers. Such a modeling→analysis→assessment approach can be applied to any software modeling language, be it general purpose or domain specific. In this paper, we focus on UML-based development and on the dependability NFP, which encompasses reliability, availability, safety, integrity, and maintainability. The paper presents the profile used to extend UML with dependability information, the model transformation to generate a DSPN formal model, and the assessment of the system properties based on the DSPN results.
Gale, T C E; Roberts, M J; Sice, P J; Langton, J A; Patterson, F C; Carr, A S; Anderson, I R; Lam, W H; Davies, P R F
2010-11-01
Assessment centres are an accepted method of recruitment in industry and are gaining popularity within medicine. We describe the development and validation of a selection centre for recruitment to speciality training in anaesthesia, based on an assessment centre model incorporating the rating of candidates' non-technical skills. Expert consensus identified non-technical skills suitable for assessment at the point of selection. Four stations (structured interview, portfolio review, presentation, and simulation) were developed, the latter two being realistic scenarios of work-related tasks. Evaluation of the selection centre focused on applicant and assessor feedback ratings, inter-rater agreement, and internal consistency reliability coefficients. Predictive validity was sought via correlations of selection centre scores with subsequent workplace-based ratings of appointed trainees. Two hundred and twenty-four candidates were assessed over two consecutive annual recruitment rounds; 68 were appointed and followed up during training. Candidates and assessors demonstrated strong approval of the selection centre, with more than 70% of ratings 'good' or 'excellent'. Mean inter-rater agreement coefficients ranged from 0.62 to 0.77 and internal consistency reliability of the selection centre score was high (Cronbach's α=0.88-0.91). The overall selection centre score was a good predictor of workplace performance during the first year of appointment. An assessment centre model based on the rating of non-technical skills can produce a reliable and valid selection tool for recruitment to speciality training in anaesthesia. Early results on predictive validity are encouraging and justify further development and evaluation.
Austvoll-Dahlgren, Astrid; Guttersrud, Øystein; Nsangi, Allen; Semakula, Daniel; Oxman, Andrew D
2017-01-01
Background: The Claim Evaluation Tools database contains multiple-choice items for measuring people's ability to apply the key concepts they need to know to be able to assess treatment claims. We assessed items from the database using Rasch analysis to develop an outcome measure to be used in two randomised trials in Uganda. Rasch analysis is a form of psychometric testing relying on Item Response Theory. It is a dynamic way of developing outcome measures that are valid and reliable. Objectives: To assess the validity, reliability and responsiveness of 88 items addressing 22 key concepts using Rasch analysis. Participants: We administered four sets of multiple-choice items in English to 1114 people in Uganda and Norway, of whom 685 were children and 429 were adults (including 171 health professionals). We scored all items dichotomously. We explored summary and individual fit statistics using the RUMM2030 analysis package. We used SPSS to perform distractor analysis. Results: Most items conformed well to the Rasch model, but some items needed revision. Overall, the four item sets had satisfactory reliability. We did not identify significant response dependence between any pairs of items and, overall, the magnitude of multidimensionality in the data was acceptable. The items had a high level of difficulty. Conclusion: Most of the items conformed well to the Rasch model's expectations. Following revision of some items, we concluded that most of the items were suitable for use in an outcome measure for evaluating the ability of children or adults to assess treatment claims. PMID:28550019
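For orientation, the dichotomous Rasch model used in such analyses gives the probability of a correct response as a logistic function of the difference between person ability and item difficulty. The short Python sketch below illustrates this relationship; the item difficulties are hypothetical, not the calibrated Claim Evaluation Tools values.

import numpy as np

def rasch_prob(theta, difficulty):
    """Dichotomous Rasch model: P(correct) for ability theta and item
    difficulty, both expressed in logits."""
    return 1.0 / (1.0 + np.exp(-(theta - difficulty)))

item_difficulties = np.array([-0.5, 0.3, 1.1, 1.8])      # hypothetical logits
for theta in (-1.0, 0.0, 1.0, 2.0):
    p = rasch_prob(theta, item_difficulties)
    print(f"ability {theta:+.1f}: P(correct) = {np.round(p, 2)}, "
          f"expected score = {p.sum():.2f} of {item_difficulties.size}")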
Pololi, Linda H; Evans, Arthur T; Nickell, Leslie; Reboli, Annette C; Coplit, Lisa D; Stuber, Margaret L; Vasiliou, Vasilia; Civian, Janet T; Brennan, Robert T
2017-06-01
A practical, reliable, and valid instrument is needed to measure the impact of the learning environment on medical students' well-being and educational experience and to meet medical school accreditation requirements. From 2012 to 2015, medical students were surveyed at the end of their first, second, and third year of studies at four medical schools. The survey assessed students' perceptions of the following nine dimensions of the school culture: vitality, self-efficacy, institutional support, relationships/inclusion, values alignment, ethical/moral distress, work-life integration, gender equity, and ethnic minority equity. The internal reliability of each of the nine dimensions was measured. Construct validity was evaluated by assessing relationships predicted by our conceptual model and prior research. Assessment was made of whether the measurements were sensitive to differences over time and across institutions. Six hundred and eighty-six students completed the survey (49 % women; 9 % underrepresented minorities), with a response rate of 89 % (range over the student cohorts 72-100 %). Internal consistency of each dimension was high (Cronbach's α 0.71-0.86). The instrument was able to detect significant differences in the learning environment across institutions and over time. Construct validity was supported by demonstrating several relationships predicted by our conceptual model. The C-Change Medical Student Survey is a practical, reliable, and valid instrument for assessing the learning environment of medical students. Because it is sensitive to changes over time and differences across institutions, results could potentially be used to facilitate and monitor improvements in the learning environment of medical students.
Validation and Reliability of a Novel Test of Upper Body Isometric Strength.
Bellar, David; Marcus, Lena; Judge, Lawrence W
2015-09-29
The purpose of the present investigation was to examine the association of a novel test of upper body isometric strength against a 1RM bench press measurement. Forty college age adults (n = 20 female, n = 20 male; age 22.8 ± 2.8 years; body height 171.6 ± 10.8 cm; body mass 73.5 ± 16.3 kg; body fat 23.1 ± 5.4%) volunteered for the present investigation. The participants reported to the lab on three occasions. The first visit included anthropometric measurements and familiarization with both the upper body isometric test and bench press exercise. The final visits were conducted in a randomized order, with one being a 1RM assessment on the bench press and the other consisting of three trials of the upper body isometric assessment. For the isometric test, participants were positioned in a "push-up" style position while tethered (stainless steel chain) to a load cell (high frequency) anchored to the ground. The peak isometric force was consistent across all three trials (ICC = 0.98) suggesting good reliability. Multiple regression analysis was completed with the predictors: peak isometric force, gender, against the outcome variable 1RM bench press. The analysis resulted in a significant model (r2 = 0.861, p≤0.001) with all predictor variables attaining significance in the model (p<0.05). Isometric peak strength had the greatest effect on the model (Beta = 5.19, p≤0.001). Results from this study suggest that the described isometric upper body strength assessment is likely a valid and reliable tool to determine strength. Further research is warranted to gather a larger pool of data in regard to this assessment.
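A regression of 1RM bench press on peak isometric force and gender, of the kind reported above, can be sketched in Python as follows; the data are synthetic and the coefficients are not those of the study.

import numpy as np

rng = np.random.default_rng(1)
n = 40
gender = rng.integers(0, 2, n)                               # 0 = female, 1 = male
iso_force = 400 + 250 * gender + rng.normal(0, 60, n)        # peak isometric force (N)
bench_1rm = 5 + 0.15 * iso_force + 20 * gender + rng.normal(0, 8, n)  # synthetic 1RM (kg)

# Ordinary least squares: 1RM ~ intercept + isometric force + gender
X = np.column_stack([np.ones(n), iso_force, gender])
coef, *_ = np.linalg.lstsq(X, bench_1rm, rcond=None)
pred = X @ coef
r2 = 1.0 - np.sum((bench_1rm - pred)**2) / np.sum((bench_1rm - bench_1rm.mean())**2)
print("intercept, slope(force), slope(gender):", coef.round(3))
print("R^2:", round(r2, 3))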
Duane, B G; Humphris, G; Richards, D; Okeefe, E J; Gordon, K; Freeman, R
2014-12-01
To assess the use of the WCMT in two Scottish health boards and to consider the impact of simplifying the tool to improve efficient use. A retrospective analysis of routine WCMT data (47,276 cases). Public Dental Service (PDS) within NHS Lothian and Highland. The WCMT consists of six criteria. Each criterion is measured independently on a four-point scale to assess patient complexity and the dental care for the disabled/impaired patient. Psychometric analyses of the data-set were conducted. Conventional internal consistency coefficients were calculated. Latent variable modelling was performed to assess the 'fit' of the raw data to a pre-specified measurement model. A Confirmatory Factor Analysis (CFA) was used to test three potential changes to the existing WCMT: the removal of the oral risk factor question, the removal of the original weightings for scoring the tool, and collapsing the 4-point rating scale to three categories. The removal of the oral risk factor question had little impact on the reliability of the proposed simplified CMT to discriminate between levels of patient complexity. The removal of weighting and collapsing each item's rating scale to three categories had limited impact on the reliability of the revised tool. The CFA analysis provided strong evidence that a new, proposed simplified Case Mix Tool (sCMT) would operate closely to the pre-specified measurement model (the WCMT). The modified sCMT can thus provide, without reducing reliability, a useful measure of the complexity of patient care. The proposed sCMT may be implemented within primary care dentistry to record patient complexity as part of an oral health assessment.
ERIC Educational Resources Information Center
Keller, Lisa A.; Clauser, Brian E.; Swanson, David B.
2010-01-01
In recent years, demand for performance assessments has continued to grow. However, performance assessments are notorious for lower reliability, and in particular, low reliability resulting from task specificity. Since reliability analyses typically treat the performance tasks as randomly sampled from an infinite universe of tasks, these estimates…
Soleimani, Mohammad Ali; Bahrami, Nasim; Yaghoobzadeh, Ameneh; Banihashemi, Hedieh; Nia, Hamid Sharif; Haghdoost, Ali Akbar
2016-01-01
Due to increasing recognition of the importance of death anxiety for understanding human nature, it is important that researchers who investigate death anxiety have reliable and valid instruments with which to measure it. The purpose of this study was to evaluate the validity and reliability of the Persian version of the Templer Death Anxiety Scale (TDAS) in family caregivers of cancer patients. A sample of 326 caregivers of cancer patients completed a 15-item questionnaire. Principal components analysis (PCA) followed by a varimax rotation was used to assess the factor structure of the scale. The construct validity of the scale was assessed using exploratory and confirmatory factor analyses. Convergent and discriminant validity were also examined. Reliability was assessed with Cronbach's alpha coefficients and construct reliability. Based on the results of the PCA and consideration of the meaning of our items, a three-factor solution, explaining 60.38% of the variance, was identified. A confirmatory factor analysis (CFA) then supported the adequacy of the three-domain structure of the scale. Goodness-of-fit indices showed an acceptable fit overall for the full model (χ²(df = 61) = 262.32, χ²/df = 2.04; adjusted goodness-of-fit index (AGFI) = 0.922, parsimonious comparative fit index (PCFI) = 0.703, normed fit index (NFI) = 0.912, CMIN/df = 2.048, root mean square error of approximation (RMSEA) = 0.055). Convergent and discriminant validity were supported. The Cronbach's alpha and construct reliability were greater than 0.70. The findings show that the Persian version of the TDAS has a three-factor structure and acceptable validity and reliability.
Kutlay, Sehim; Kuçukdeveci, Ayse A; Elhan, Atilla H; Yavuzer, Gunes; Tennant, Alan
2007-02-28
Assessment of cognitive impairment with a valid cognitive screening tool is essential in neurorehabilitation. The aim of this study was to test the reliability and validity of the Turkish-adapted version of the Middlesex Elderly Assessment of Mental State (MEAMS) among acquired brain injury patients in Turkey. Some 155 patients with acquired brain injury admitted for rehabilitation were assessed by the adapted version of MEAMS at admission and discharge. Reliability was tested by internal consistency, intra-class correlation coefficient (ICC) and person separation index; internal construct validity by Rasch analysis; external construct validity by associations with physical and cognitive disability (FIM); and responsiveness by Effect Size. Reliability was found to be good with Cronbach's alpha of 0.82 at both admission and discharge; and likewise an ICC of 0.80. Person separation index was 0.813. Internal construct validity was good by fit of the data to the Rasch model (mean item fit -0.178; SD 1.019). Items were substantially free of differential item functioning. External construct validity was confirmed by expected associations with physical and cognitive disability. Effect size was 0.42 compared with 0.22 for cognitive FIM. The reliability and validity of the Turkish version of MEAMS as a cognitive impairment screening tool in acquired brain injury has been demonstrated.
Confirmatory factor analysis of the Chinese Breast Cancer Screening Beliefs Questionnaire.
Kwok, Cannas; Fethney, Judith; White, Kate
2012-01-01
Chinese women have been consistently reported as having low breast cancer screening practices. The Chinese Breast Cancer Screening Beliefs Questionnaire (CBCSB) was designed to assess Chinese Australian women's beliefs, knowledge, and attitudes toward breast cancer and screening practices. The objectives of the study were to confirm the factor structure of the CBCSB with a new, larger sample of immigrant Chinese Australian women and to report its clinical validity. A convenience sample of 785 Chinese Australian women was recruited from Chinese community organizations and shopping malls. Cronbach α was used to assess internal consistency reliability, and Amos v18 was used for confirmatory factor analysis. Clinical validity was assessed through linear regression using SPSS v18. The 3-factor structure of the CBCSB was confirmed, although the model required respecification to arrive at a suitable model fit as measured by the goodness-of-fit index (0.98), adjusted goodness-of-fit index (0.97), normed fit index (0.95), and root mean square error of approximation (0.031). Internal consistency reliability coefficients were satisfactory (>.6). Women who engaged in all 3 types of screening had more proactive attitudes to health checkups and perceived fewer barriers to mammographic screening. The CBCSB is a valid and reliable tool for assessing Chinese women's beliefs, knowledge, and attitudes about breast cancer and breast cancer screening practices. The CBCSB can be used to provide practicing nurses with insights into the provision of culturally sensitive breast health education.
ERIC Educational Resources Information Center
Raphael, Dennis; And Others
1996-01-01
A conceptual model of quality of life, developed at the Centre for Health Promotion at the University of Toronto (Canada), and associated instrumentation for collecting data from persons with developmental disabilities are presented. Results from a preliminary study with 41 participants support the reliability and validity of the model's…
NASA Astrophysics Data System (ADS)
Bruno, Delia Evelina; Barca, Emanuele; Goncalves, Rodrigo Mikosz; de Araujo Queiroz, Heithor Alexandre; Berardi, Luigi; Passarella, Giuseppe
2018-01-01
In this paper, the Evolutionary Polynomial Regression data modelling strategy has been applied to study small-scale, short-term coastal morphodynamics, given its capability to treat a wide database of known information non-linearly. Simple linear and multilinear regression models were also applied, to assess the balance between computational load and estimation reliability across the three models. Although a more complex model can be expected to improve the predictions, a slight worsening of the estimates can sometimes be accepted in exchange for the time saved in data organization and computation. The models' outcomes were validated through a detailed statistical error analysis, which revealed slightly better estimates from the polynomial model than from the multilinear model, as expected. On the other hand, even though the data organization was identical for the two models, the multilinear one required a simpler simulation setting and a faster run time. Finally, the most reliable evolutionary polynomial regression model was used to examine how uncertainty grows as the extrapolation time of the estimate is extended. The degree of overlap between the confidence band of the mean of the known coast position and the prediction band of the estimated position can be a good index of how the reliability of the estimates degrades when the extrapolation time becomes too long. The proposed models and tests have been applied to a coastal sector located near Torre Colimena in the Apulia region, south Italy.
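The comparison between a multilinear and a polynomial fit can be illustrated with a toy shoreline time series; the positions below are synthetic, and Evolutionary Polynomial Regression itself (which searches over candidate model structures) is not reproduced here.

import numpy as np

rng = np.random.default_rng(7)
years = np.arange(2000.0, 2016.0)
# Synthetic shoreline positions (m); they stand in for the surveyed coast data.
position = (120.0 - 1.2 * (years - 2000) + 0.05 * (years - 2000)**2
            + rng.normal(0.0, 1.5, years.size))

for degree, label in ((1, "multilinear (degree 1)"), (2, "polynomial (degree 2)")):
    coeffs = np.polyfit(years, position, degree)
    fitted = np.polyval(coeffs, years)
    rmse = np.sqrt(np.mean((position - fitted)**2))
    print(f"{label:22s}: RMSE = {rmse:4.2f} m, "
          f"extrapolated 2025 position = {np.polyval(coeffs, 2025.0):6.2f} m")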
NASA Technical Reports Server (NTRS)
Oswald, Fred B.; Savage, Michael; Zaretsky, Erwin V.
2015-01-01
The U.S. Space Shuttle fleet was originally intended to have a life of 100 flights for each vehicle, lasting over a 10-year period, with minimal scheduled maintenance or inspection. The first space shuttle flight was that of the Space Shuttle Columbia (OV-102), launched April 12, 1981. The disaster that destroyed Columbia occurred on its 28th flight, February 1, 2003, nearly 22 years after its first launch. In order to minimize risk of losing another Space Shuttle, a probabilistic life and reliability analysis was conducted for the Space Shuttle rudder/speed brake actuators to determine the number of flights the actuators could sustain. A life and reliability assessment of the actuator gears was performed in two stages: a contact stress fatigue model and a gear tooth bending fatigue model. For the contact stress analysis, the Lundberg-Palmgren bearing life theory was expanded to include gear-surface pitting for the actuator as a system. The mission spectrum of the Space Shuttle rudder/speed brake actuator was combined into equivalent effective hinge moment loads including an actuator input preload for the contact stress fatigue and tooth bending fatigue models. Gear system reliabilities are reported for both models and their combination. Reliability of the actuator bearings was analyzed separately, based on data provided by the actuator manufacturer. As a result of the analysis, the reliability of one half of a single actuator was calculated to be 98.6 percent for 12 flights. Accordingly, each actuator was subsequently limited to 12 flights before removal from service in the Space Shuttle.
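As a hedged sketch of how a per-element reliability at a reference number of flights can be extrapolated and combined into a system value, the Python snippet below assumes a two-parameter Weibull life distribution with an illustrative slope and a simple series system of eight actuator halves; neither assumption is taken from the NASA analysis, and only the 98.6 percent figure at 12 flights comes from the abstract.

import math

def weibull_reliability(n_flights, r_ref, n_ref, slope):
    """Scale a reference reliability r_ref at n_ref flights to n_flights,
    assuming a two-parameter Weibull life distribution with the given slope."""
    return math.exp(math.log(r_ref) * (n_flights / n_ref) ** slope)

r_half_at_12 = 0.986          # reported reliability of one actuator half at 12 flights
for flights in (12, 20, 28):
    r_half = weibull_reliability(flights, r_half_at_12, 12, slope=2.0)  # slope assumed
    r_system = r_half ** 8    # illustrative series combination of eight actuator halves
    print(f"{flights:2d} flights: half = {r_half:.4f}, eight-half system = {r_system:.4f}")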
NASA Astrophysics Data System (ADS)
Suhir, E.
2014-05-01
The well-known and widely used experimental reliability "passport" of a mass-manufactured electronic or photonic product, the bathtub curve, reflects the combined contribution of the statistics-related and reliability-physics (physics-of-failure)-related processes. As time progresses, the first process results in a decreasing failure rate, while the second process, associated with material aging and degradation, leads to an increasing failure rate. An attempt has been made in this analysis to assess the level of the reliability-physics-related aging process from the available bathtub curve (diagram). It is assumed that the products of interest underwent burn-in testing and therefore the obtained bathtub curve does not contain the infant mortality portion. It has also been assumed that the two random processes in question are statistically independent, and that the failure rate of the physical process can be obtained by deducting the theoretically assessed statistical failure rate from the bathtub curve ordinates. In the numerical example carried out, the Rayleigh distribution was used for the statistical failure rate, for the sake of a relatively simple illustration. The developed methodology can be used in reliability physics evaluations when there is a need to better understand the roles of the statistics-related and reliability-physics-related irreversible random processes in reliability evaluations. Future work should include investigations of how the powerful and flexible methods and approaches of statistical mechanics can be effectively employed, in addition to reliability physics techniques, to model the operational reliability of electronic and photonic products.
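The subtraction step described above is straightforward to prototype; in the Python sketch below both the bathtub ordinates and the Rayleigh parameter are invented, so the numbers only illustrate the bookkeeping, not the paper's results.

import numpy as np

t = np.linspace(0.1, 10.0, 100)                 # service time, arbitrary units

# Illustrative steady-state bathtub ordinates (burn-in portion already removed);
# they stand in for the measured curve, which is not reproduced in the abstract.
lam_total = 0.02 + 0.004 * t**1.5

# Assumed statistics-related component; a Rayleigh law has hazard rate t / sigma^2,
# with sigma chosen here purely for illustration.
sigma = 12.0
lam_stat = t / sigma**2

# Physics-of-failure (aging) component estimated by subtraction.
lam_physics = np.clip(lam_total - lam_stat, 0.0, None)

for i in (0, 24, 49, 74, 99):
    print(f"t = {t[i]:5.2f}: total = {lam_total[i]:.4f}, "
          f"statistical = {lam_stat[i]:.4f}, physics = {lam_physics[i]:.4f}")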
Yellepeddi, Venkata; Rower, Joseph; Liu, Xiaoxi; Kumar, Shaun; Rashid, Jahidur; Sherwin, Catherine M T
2018-05-18
Physiologically based pharmacokinetic modeling and simulation is an important tool for predicting the pharmacokinetics, pharmacodynamics, and safety of drugs in pediatrics. Physiologically based pharmacokinetic modeling is applied in pediatric drug development for first-time-in-pediatric dose selection, simulation-based trial design, correlation with target organ toxicities, risk assessment by investigating possible drug-drug interactions, real-time assessment of pharmacokinetic-safety relationships, and assessment of non-systemic biodistribution targets. This review summarizes the details of a physiologically based pharmacokinetic modeling approach in pediatric drug research, emphasizing reports on pediatric physiologically based pharmacokinetic models of individual drugs. We also compare and contrast the strategies employed by various researchers in pediatric physiologically based pharmacokinetic modeling and provide a comprehensive overview of physiologically based pharmacokinetic modeling strategies and approaches in pediatrics. We discuss the impact of physiologically based pharmacokinetic models on regulatory reviews and product labels in the field of pediatric pharmacotherapy. Additionally, we examine in detail the current limitations and future directions of physiologically based pharmacokinetic modeling in pediatrics with regard to the ability to predict plasma concentrations and pharmacokinetic parameters. Despite the skepticism and concern in the pediatric community about the reliability of physiologically based pharmacokinetic models, there is substantial evidence that pediatric physiologically based pharmacokinetic models have been used successfully to predict differences in pharmacokinetics between adults and children for several drugs. It is obvious that the use of physiologically based pharmacokinetic modeling to support various stages of pediatric drug development is highly attractive and will rapidly increase, provided the robustness and reliability of these techniques are well established.
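A full PBPK model is far beyond an abstract-sized example, but the kind of size-and-maturation scaling such models embed can be hinted at with a toy clearance calculation in Python; the allometric exponent, maturation parameters, and adult clearance below are generic placeholders, not values for any specific drug.

def pediatric_clearance(adult_cl, weight_kg, age_yr, tm50=1.0, hill=3.0):
    """Toy scaling of an adult clearance to a child: allometric body-weight
    scaling (exponent 0.75) times a sigmoidal maturation function.
    All parameters are generic placeholders, not drug-specific values."""
    size = (weight_kg / 70.0) ** 0.75
    maturation = age_yr**hill / (tm50**hill + age_yr**hill)
    return adult_cl * size * maturation

for age, weight in ((0.5, 7.5), (2, 12), (6, 21), (12, 40)):
    cl = pediatric_clearance(adult_cl=20.0, weight_kg=weight, age_yr=age)
    print(f"age {age:>4} yr, {weight:>4} kg -> scaled clearance ~ {cl:5.1f} L/h")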
Camolas, José; Ferreira, André; Mannucci, Edoardo; Mascarenhas, Mário; Carvalho, Manuel; Moreira, Pedro; do Carmo, Isabel; Santos, Osvaldo
2016-06-01
Several health-related quality-of-life (HRQoL) dimensions are affected by obesity. Our goal was to characterize the psychometric properties of the ORWELL-R, a new obesity-related quality-of-life instrument for assessing the "individual experience of overweightness". This psychometric assessment included two different samples: one multicenter clinical sample, used for assessing internal consistency, construct validity and temporal reliability; and a community sample (collected through a cross-sectional mailing survey design), used for additional construct validity assessment and model fit confirmation. Overall, 946 persons participated (188 from the clinical sample; 758 from the community sample). An alpha coefficient of 0.925 (clinical sample) and 0.934 (community sample) was found. Three subscales were identified (53.2 % of variance): Body environment experience (alpha = 0.875), Illness perception and distress (alpha = 0.864), and Physical symptoms (alpha = 0.674). Adequate test-retest reliability was confirmed (ICC: 0.78 for the overall score). ORWELL-R scores were worse in the clinical sample. Worse HRQoL, as measured by higher ORWELL-R scores, was associated with increases in BMI. ORWELL-R scores were associated with IWQOL-Lite scores and with lower happiness scores. The ORWELL-R shows good internal consistency and adequate test-retest reliability. Good construct validity was also observed (for convergent and discriminant validity) and confirmed through confirmatory factor analysis (in both the clinical and community samples). The data presented support the ORWELL-R as a reliable and useful instrument to assess obesity-related QoL, in both research and clinical contexts.
Issues in the Pharmacokinetics of Trichloroethylene and Its Metabolites
Chiu, Weihsueh A.; Okino, Miles S.; Lipscomb, John C.; Evans, Marina V.
2006-01-01
Much progress has been made in understanding the complex pharmacokinetics of trichloroethylene (TCE). Qualitatively, it is clear that TCE is metabolized to multiple metabolites either locally or into systemic circulation. Many of these metabolites are thought to have toxicologic importance. In addition, efforts to develop physiologically based pharmacokinetic (PBPK) models have led to a better quantitative assessment of the dosimetry of TCE and several of its metabolites. As part of a mini-monograph on key issues in the health risk assessment of TCE, this article is a review of a number of the current scientific issues in TCE pharmacokinetics and recent PBPK modeling efforts with a focus on literature published since 2000. Particular attention is paid to factors affecting PBPK modeling for application to risk assessment. Recent TCE PBPK modeling efforts, coupled with methodologic advances in characterizing uncertainty and variability, suggest that rigorous application of PBPK modeling to TCE risk assessment appears feasible at least for TCE and its major oxidative metabolites trichloroacetic acid and trichloroethanol. However, a number of basic structural hypotheses such as enterohepatic recirculation, plasma binding, and flow- or diffusion-limited treatment of tissue distribution require additional evaluation and analysis. Moreover, there are a number of metabolites of potential toxicologic interest, such as chloral, dichloroacetic acid, and those derived from glutathione conjugation, for which reliable pharmacokinetic data is sparse because of analytical difficulties or low concentrations in systemic circulation. It will be a challenge to develop reliable dosimetry for such cases. PMID:16966104
Fundamental Movement Skills Are More than Run, Throw and Catch: The Role of Stability Skills.
Rudd, James R; Barnett, Lisa M; Butson, Michael L; Farrow, Damian; Berry, Jason; Polman, Remco C J
2015-01-01
In the motor development literature, fundamental movement skills are divided into three constructs: locomotor, object control and stability skills. Most fundamental movement skills research has focused on children's competency in locomotor and object control skills. The first aim of this study was to validate a test battery to assess the construct of stability skills in children aged 6 to 10 (M age = 8.2, SD = 1.2). Secondly, we assessed how the stability skills construct fitted into a model of fundamental movement skill. The Delphi method was used to select the stability skill battery. Confirmatory factor analysis (CFA) was used to assess whether the skills loaded onto the same construct, and a new model of FMS was developed using structural equation modelling. Three postural control tasks were selected (the log roll, rock and back support) because they had good face and content validity. These skills also demonstrated good predictive validity, with gymnasts scoring significantly better than children without gymnastic training, children from a high SES school performing better than those from mid and low SES schools, and the mid SES children scoring better than the low SES children (all p < .05). Inter-rater reliability tests were excellent for all three skills (ICC = 0.81, 0.87, 0.87), as was test-retest reliability (ICC 0.87-0.95). CFA provided good construct validity, and structural equation modelling revealed stability skills to be an independent factor in an overall FMS model which included locomotor (r = .88), object control (r = .76) and stability skills (r = .81). This study provides a rationale for the inclusion of stability skills in FMS assessment. The stability skills could be used alongside other FMS assessment tools to provide a holistic assessment of children's fundamental movement skills.
Gorgulho, B M; Pot, G K; Marchioni, D M
2017-05-01
The aim of this study was to evaluate the validity and reliability of the Main Meal Quality Index when applied to the UK population. The indicator was developed to assess meal quality in different populations, and is composed of 10 components: fruit, vegetables (excluding potatoes), ratio of animal protein to total protein, fiber, carbohydrate, total fat, saturated fat, processed meat, sugary beverages and desserts, and energy density, resulting in a score range of 0-100 points. The performance of the indicator was measured using strategies for assessing content validity, construct validity, discriminant validity and reliability, including principal component analysis, linear regression models and Cronbach's alpha. The indicator presented good reliability. The Main Meal Quality Index has been shown to be valid for use as an instrument to evaluate, monitor and compare the quality of meals consumed by adults in the United Kingdom.
Validation of new psychosocial factors questionnaires: a Colombian national study.
Villalobos, Gloria H; Vargas, Angélica M; Rondón, Martin A; Felknor, Sarah A
2013-01-01
The study of workers' health problems possibly associated with stressful conditions requires valid and reliable tools for monitoring risk factors. The present study validates two questionnaires to assess psychosocial risk factors for stress-related illnesses within a sample of Colombian workers. The validation process was based on a representative sample survey of 2,360 Colombian employees, aged 18-70 years. Worker response rate was 90%; 46% of the responders were women. Internal consistency was calculated, construct validity was tested with factor analysis and concurrent validity was tested with Spearman correlations. The questionnaires demonstrated adequate reliability (0.88-0.95). Factor analysis confirmed the dimensions proposed in the measurement model. Concurrent validity resulted in significant correlations with stress and health symptoms. "Work and Non-work Psychosocial Factors Questionnaires" were found to be valid and reliable for the assessment of workers' psychosocial factors, and they provide information for research and intervention. Copyright © 2012 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Karmalkar, A.
2017-12-01
Ensembles of dynamically downscaled climate change simulations are routinely used to capture uncertainty in projections at regional scales. I assess the reliability of two such ensembles for North America, NARCCAP and NA-CORDEX, by investigating the impact of model selection on representing uncertainty in regional projections, and the ability of the regional climate models (RCMs) to provide reliable information. These aspects, discussed for the six regions used in the US National Climate Assessment, provide an important perspective on the interpretation of downscaled results. I show that selecting general circulation models for downscaling based on their equilibrium climate sensitivities is a reasonable choice, but the six models chosen for NA-CORDEX do a poor job of representing uncertainty in winter temperature and precipitation projections in many parts of the eastern US, which leads to overconfident projections. The RCM performance is highly variable across models, regions, and seasons, and the ability of the RCMs to provide improved seasonal mean performance relative to their parent GCMs seems limited in both RCM ensembles. Additionally, the ability of the RCMs to simulate historical climates is not strongly related to their ability to simulate climate change across the ensemble. This finding suggests that models' historical performance is of limited use for constraining their projections. Given these challenges in dynamical downscaling, the RCM results should not be used in isolation. The information discussed here on how well the RCM ensembles represent known uncertainties in regional climate change projections needs to be communicated clearly to inform management decisions.
Niraula, Rewati; Norman, Laura A.; Meixner, Thomas; Callegary, James B.
2012-01-01
In most watershed-modeling studies, flow is calibrated at one monitoring site, usually at the watershed outlet. Like many arid and semi-arid watersheds, the main reach of the Santa Cruz watershed, located on the Arizona-Mexico border, is discontinuous for most of the year except during large flood events, and therefore the flow characteristics at the outlet do not represent the entire watershed. Calibration is required at multiple locations along the Santa Cruz River to improve model reliability. The objective of this study was to best portray surface water flow in this semiarid watershed and evaluate the effect of multi-gage calibration on flow predictions. In this study, the Soil and Water Assessment Tool (SWAT) was calibrated at seven monitoring stations, which improved model performance and increased the reliability of flow predictions in the Santa Cruz watershed. The most sensitive parameters affecting flow were found to be the curve number (CN2), soil evaporation compensation coefficient (ESCO), threshold water depth in the shallow aquifer for return flow to occur (GWQMN), base flow alpha factor (Alpha_Bf), and effective hydraulic conductivity of the soil layer (Ch_K2). In comparison, when the model was established with a single calibration at the watershed outlet, flow predictions at the other monitoring gages were inaccurate. This study emphasizes the importance of multi-gage calibration for developing a reliable watershed model in arid and semiarid environments. The developed model, with further calibration of water quality parameters, will be an integral part of the Santa Cruz Watershed Ecosystem Portfolio Model (SCWEPM), an online decision support tool, to assess the impacts of climate change and urban growth in the Santa Cruz watershed.
Correcting Measurement Error in Latent Regression Covariates via the MC-SIMEX Method
ERIC Educational Resources Information Center
Rutkowski, Leslie; Zhou, Yan
2015-01-01
Given the importance of large-scale assessments to educational policy conversations, it is critical that subpopulation achievement is estimated reliably and with sufficient precision. Despite this importance, biased subpopulation estimates have been found to occur when variables in the conditioning model side of a latent regression model contain…
Judgment Research and the Dimensional Model of Personality
ERIC Educational Resources Information Center
Garb, Howard N.
2008-01-01
Comments on the original article "Plate tectonics in the classification of personality disorder: Shifting to a dimensional model," by T. A. Widiger and T. J. Trull. The purpose of this comment is to address (a) whether psychologists know how personality traits are currently assessed by clinicians and (b) the reliability and validity of those…
COCOA: A New Validated Instrument to Assess Medical Students' Attitudes towards Older Adults
ERIC Educational Resources Information Center
Hollar, David; Roberts, Ellen; Busby-Whitehead, Jan
2011-01-01
This study tested the reliability and validity of the Carolina Opinions on Care of Older Adults (COCOA) survey compared with the Geriatric Assessment Survey (GAS). Participants were first year medical students (n = 160). A Linear Structural Relations (LISREL) measurement model for COCOA had a moderately strong fit that was significantly better…
ERIC Educational Resources Information Center
Cameron, Kim; And Others
This study attempted to develop a reliable and valid instrument for assessing work environment and continuous quality improvement efforts in the non-academic sectors of colleges and universities, particularly those institutions that have adopted Total Quality Management programs. A model of a work environment for continuous quality improvement was…
ERIC Educational Resources Information Center
Farwell, Tricia M.; Alligood, Leon; Fitzgerald, Sharon; Blake, Ken
2016-01-01
This article introduces an objective grammar and math assessment and evaluates the assessment's outcome and reliability when fielded among eighty-one students in media writing courses. In addition, the article proposes a rubric for grading straight news leads and compares the rubric's reliability with the reliability of rating straight news leads…
San José, Verónica; Bellot-Arcís, Carlos; Tarazona, Beatriz; Zamora, Natalia; O Lagravère, Manuel
2017-01-01
Background To compare the reliability and accuracy of direct and indirect dental measurements derived from two types of 3D virtual models, generated by intraoral laser scanning (ILS) and segmented cone beam computed tomography (CBCT), comparing these with a 2D digital model. Material and Methods One hundred patients were selected. All patients' records included initial plaster models, an intraoral scan and a CBCT. Patients' dental arches were scanned with the iTero® intraoral scanner, while the CBCTs were segmented to create three-dimensional models. To obtain 2D digital models, plaster models were scanned using a conventional 2D scanner. Once digital models had been obtained using these three methods, direct dental measurements were taken and indirect measurements were calculated. Differences between methods were assessed by means of paired t-tests and regression models. Intra- and inter-observer error were analyzed using Dahlberg's d and coefficients of variation. Results Intraobserver and interobserver error for the ILS model was less than 0.44 mm, while for segmented CBCT models the error was less than 0.97 mm. ILS models provided statistically and clinically acceptable accuracy for all dental measurements, while CBCT models showed a tendency to underestimate measurements in the lower arch, although within the limits of clinical acceptability. Conclusions ILS and CBCT segmented models are both reliable and accurate for dental measurements. Integration of ILS with CBCT scans would provide dental and skeletal information together. Key words: CBCT, intraoral laser scanner, 2D digital models, 3D models, dental measurements, reliability. PMID:29410764
Zijlstra, Agnes; Zijlstra, Wiebren
2013-09-01
Inverted pendulum (IP) models of human walking allow for wearable motion-sensor based estimations of spatio-temporal gait parameters during unconstrained walking in daily-life conditions. At present it is unclear to what extent different IP based estimations yield different results, and reliability and validity have not been investigated in older persons without a specific medical condition. The aim of this study was to compare reliability and validity of four different IP based estimations of mean step length in independent-living older persons. Participants were assessed twice and walked at different speeds while wearing a tri-axial accelerometer at the lower back. For all step-length estimators, test-retest intra-class correlations approached or were above 0.90. Intra-class correlations with reference step length were above 0.92 with a mean error of 0.0 cm when (1) multiplying the estimated center-of-mass displacement during a step by an individual correction factor in a simple IP model, or (2) adding an individual constant for bipedal stance displacement to the estimated displacement during single stance in a 2-phase IP model. When applying generic corrections or constants in all subjects (i.e. multiplication by 1.25, or adding 75% of foot length), correlations were above 0.75 with mean errors of 2.0 and 1.2 cm, respectively. Although the results indicate that an individual adjustment of the IP models provides better estimations of mean step length, the ease of a generic adjustment can be favored when merely evaluating intra-individual differences. Further studies should determine the validity of these IP based estimations for assessing gait in daily life. Copyright © 2013 Elsevier B.V. All rights reserved.
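As an illustration of the kind of computation the abstract describes, the sketch below applies the standard inverted-pendulum geometry (step length ≈ 2·sqrt(2·l·h − h²), where l is the pendulum length and h the vertical center-of-mass excursion) together with the generic corrections mentioned above (multiplying by 1.25 or adding 75% of foot length). The function names and numeric inputs are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ip_step_length(com_drop_m, leg_length_m):
    """Inverted-pendulum geometry: step length from the vertical
    center-of-mass excursion h and pendulum (leg) length l."""
    h = np.asarray(com_drop_m)
    return 2.0 * np.sqrt(2.0 * leg_length_m * h - h ** 2)

def corrected_step_length(raw, foot_length_m=None, factor=1.25):
    """Generic corrections mentioned in the abstract (illustrative)."""
    if foot_length_m is not None:            # add 75% of foot length
        return raw + 0.75 * foot_length_m
    return factor * raw                      # or multiply by 1.25

# Example: 3 cm COM drop, 0.90 m leg length, 0.26 m foot length
raw = ip_step_length(0.03, 0.90)
print(raw, corrected_step_length(raw, foot_length_m=0.26))
```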
Validation of a Dry Model for Assessing the Performance of Arthroscopic Hip Labral Repair.
Phillips, Lisa; Cheung, Jeffrey J H; Whelan, Daniel B; Murnaghan, Michael Lucas; Chahal, Jas; Theodoropoulos, John; Ogilvie-Harris, Darrell; Macniven, Ian; Dwyer, Tim
2017-07-01
Arthroscopic hip labral repair is a technically challenging and demanding surgical technique with a steep learning curve. Arthroscopic simulation allows trainees to develop these skills in a safe environment. The purpose of this study was to evaluate the use of a combination of assessment ratings for the performance of arthroscopic hip labral repair on a dry model. Cross-sectional study; Level of evidence, 3. A total of 47 participants including orthopaedic surgery residents (n = 37), sports medicine fellows (n = 5), and staff surgeons (n = 5) performed arthroscopic hip labral repair on a dry model. Prior arthroscopic experience was noted. Participants were evaluated by 2 orthopaedic surgeons using a task-specific checklist, the Arthroscopic Surgical Skill Evaluation Tool (ASSET), task completion time, and a final global rating scale. All procedures were video-recorded and scored by an orthopaedic fellow blinded to the level of training of each participant. The internal consistency/reliability (Cronbach alpha) using the total ASSET score for the procedure was high (intraclass correlation coefficient > 0.9). One-way analysis of variance for the total ASSET score demonstrated a difference between participants based on the level of training (F(3,43) = 27.8, P < .001). A good correlation was seen between the ASSET score and previous exposure to arthroscopic procedures (r = 0.52-0.73, P < .001). The interrater reliability for the ASSET score was excellent (>0.9). The results of this study demonstrate that the use of dry models to assess the performance of arthroscopic hip labral repair by trainees is both valid and reliable. Further research will be required to demonstrate a correlation with performance on cadaveric specimens or in the operating room.
The Importance of Human Reliability Analysis in Human Space Flight: Understanding the Risks
NASA Technical Reports Server (NTRS)
Hamlin, Teri L.
2010-01-01
HRA is a method used to describe, qualitatively and quantitatively, the occurrence of human failures in the operation of complex systems that affect availability and reliability. Modeling human actions with their corresponding failure in a PRA (Probabilistic Risk Assessment) provides a more complete picture of the risk and risk contributions. A high quality HRA can provide valuable information on potential areas for improvement, including training, procedural, equipment design and need for automation.
A model to predict accommodations needed by disabled persons.
Babski-Reeves, Kari; Williams, Sabrina; Waters, Tzer Nan; Crumpton-Young, Lesia L; McCauley-Bell, Pamela
2005-09-01
In this paper, several approaches to assist employers in the accommodation process for disabled employees are discussed and a mathematical model is proposed to assist employers in predicting the accommodation level needed by an individual with a mobility-related disability. This study investigates the validity and reliability of this model in assessing the accommodation level needed by individuals utilizing data collected from twelve individuals with mobility-related disabilities. Based on the results of the statistical analyses, this proposed model produces a feasible preliminary measure for assessing the accommodation level needed for persons with mobility-related disabilities. Suggestions for practical application of this model in an industrial setting are addressed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chu, Tsong-Lun; Varuttamaseni, Athi; Baek, Joo-Seok
The U.S. Nuclear Regulatory Commission (NRC) encourages the use of probabilistic risk assessment (PRA) technology in all regulatory matters, to the extent supported by the state-of-the-art in PRA methods and data. Although much has been accomplished in the area of risk-informed regulation, risk assessment for digital systems has not been fully developed. The NRC established a plan for research on digital systems to identify and develop methods, analytical tools, and regulatory guidance for (1) including models of digital systems in the PRAs of nuclear power plants (NPPs), and (2) incorporating digital systems in the NRC's risk-informed licensing and oversight activities. Under NRC's sponsorship, Brookhaven National Laboratory (BNL) explored approaches for addressing the failures of digital instrumentation and control (I and C) systems in the current NPP PRA framework. Specific areas investigated included PRA modeling of digital hardware, development of a philosophical basis for defining software failure, and identification of desirable attributes of quantitative software reliability methods. Based on the earlier research, statistical testing is considered a promising method for quantifying software reliability. This paper describes a statistical software testing approach for quantifying software reliability and applies it to the loop-operating control system (LOCS) of an experimental loop of the Advanced Test Reactor (ATR) at Idaho National Laboratory (INL).
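One common way to quantify software reliability from statistical testing (not necessarily the exact BNL approach) is the zero-failure confidence bound: if n operationally representative test cases all succeed, the upper bound on the per-demand failure probability at confidence C is 1 − (1 − C)^(1/n). A minimal sketch, with the function names as assumptions:

```python
import math

def zero_failure_upper_bound(n_tests, confidence=0.95):
    """Upper confidence bound on the per-demand failure probability when
    n_tests statistically representative test cases all succeed."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_tests)

def tests_needed(p_target, confidence=0.95):
    """Number of failure-free tests needed to demonstrate p <= p_target."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_target))

print(zero_failure_upper_bound(3000))   # roughly 1e-3 at 95% confidence
print(tests_needed(1e-3))               # roughly 3000 failure-free tests
```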
Jalalian, Mehrdad; Latiff, Latiffah; Hassan, Syed Tajuddin Syed; Hanachi, Parichehr; Othman, Mohamed
2010-05-01
University students are a target group for blood donor programs. To develop a blood donation culture among university students, it is important to identify factors used to predict their intent to donate blood. This study attempted to develop a valid and reliable measurement tool to be employed in assessing variables in a blood donation behavior model based on the Theory of Planned Behavior (TPB), a commonly used theoretical foundation for social psychology studies. We employed an elicitation study, in which we determined the commonly held behavioral and normative beliefs about blood donation. We used the results of the elicitation study and a standard format for creating questionnaire items for all constructs of the TPB model to prepare the first draft of the measurement tool. After piloting the questionnaire, we prepared the final draft of the questionnaire to be used in our main study. Examination of internal consistency using Cronbach's alpha coefficient and item-total statistics indicated that the constructs "Intention" and "Self efficacy" had the highest reliability. Removing one item from each of the constructs "Attitude," "Subjective norm," "Self efficacy," or "Behavioral beliefs" can considerably increase the reliability of the measurement tool; however, such action is controversial, especially for the variables "attitude" and "subjective norm." We therefore retained all items of the first draft questionnaire in our main study to make it a reliable measurement tool.
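The internal-consistency statistics named in the abstract can be reproduced with a few lines of code. The sketch below computes Cronbach's alpha and corrected item-total correlations on simulated item scores; the data and names are illustrative, not the study's instrument.

```python
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents, k_items) matrix of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    out = []
    for j in range(items.shape[1]):
        rest = np.delete(items, j, axis=1).sum(axis=1)
        out.append(np.corrcoef(items[:, j], rest)[0, 1])
    return np.array(out)

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
scores = latent + rng.normal(scale=0.8, size=(200, 5))   # 5 correlated items
print(cronbach_alpha(scores), corrected_item_total(scores))
```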
Kook, Seung Hee; Varni, James W
2008-06-02
The Pediatric Quality of Life Inventory (PedsQL) is a child self-report and parent proxy-report instrument designed to assess health-related quality of life (HRQOL) in healthy and ill children and adolescents. It has been translated into over 70 international languages and proposed as a valid and reliable pediatric HRQOL measure. This study aimed to assess the psychometric properties of the Korean translation of the PedsQL 4.0 Generic Core Scales. Following the guidelines for linguistic validation, the original US English scales were translated into Korean and cognitive interviews were administered. The field testing responses of 1425 school children and adolescents and 1431 parents to the Korean version of PedsQL 4.0 Generic Core Scales were analyzed utilizing confirmatory factor analysis and the Rasch model. Consistent with studies using the US English instrument and other translation studies, score distributions were skewed toward higher HRQOL in a predominantly healthy population. Confirmatory factor analysis supported a four-factor and a second order-factor model. The analysis using the Rasch model showed that person reliabilities are low, item reliabilities are high, and the majority of items fit the model's expectation. The Rasch rating scale diagnostics showed that PedsQL 4.0 Generic Core Scales in general have the optimal number of response categories, but category 4 (almost always a problem) is somewhat problematic for the healthy school sample. The agreements between child self-report and parent proxy-report were moderate. The results demonstrate the feasibility, validity, item reliability, item fit, and agreement between child self-report and parent proxy-report of the Korean version of PedsQL 4.0 Generic Core Scales for school population health research in Korea. However, the utilization of the Korean version of the PedsQL 4.0 Generic Core Scales for healthy school populations needs to consider low person reliability, ceiling effects and cultural differences, and further validation studies on Korean clinical samples are required.
Qi, Bing-Bing; Resnick, Barbara
2014-01-01
To assess the psychometric properties of the Chinese versions of the self-efficacy and outcome expectations for osteoporosis medication adherence (SEOMA-C and OEOMA-C) scales. The back-translated tools were assessed for internal consistency and R2 via structural equation modeling, confirmatory factor analyses, hypothesis testing, and criterion-related validity among 110 (81 females, 29 males) Mandarin-speaking immigrants (mean age = 63.44, SD = 9.63). Cronbach's alpha for the SEOMA-C and OEOMA-C was .904 and .937, respectively. The measurement model showed fair to good fit to the data. Previous bone mineral density (BMD) testing, calcaneus BMD, self-efficacy for exercise, and osteoporosis medication adherence were positively related to SEOMA-C scores. These scales show preliminary evidence of validity and reliability. Further refined and culturally sensitive items could be explored and added.
Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S
2007-01-01
Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078
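A minimal sketch of the agreement statistics used in the abstract (weighted kappa and percent agreement), here using scikit-learn's cohen_kappa_score; the director/staff ratings are hypothetical values for illustration.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def percent_agreement(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return 100.0 * np.mean(a == b)

# Hypothetical director vs. staff ratings on a 4-point item
director = [3, 2, 4, 4, 1, 2, 3, 4, 2, 3]
staff    = [3, 2, 3, 4, 1, 1, 3, 4, 2, 4]

kappa = cohen_kappa_score(director, staff, weights="linear")  # weighted kappa
print(f"weighted kappa = {kappa:.2f}, "
      f"percent agreement = {percent_agreement(director, staff):.1f}%")
```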
A Study of Wind Turbine Comprehensive Operational Assessment Model Based on EM-PCA Algorithm
NASA Astrophysics Data System (ADS)
Zhou, Minqiang; Xu, Bin; Zhan, Yangyan; Ren, Danyuan; Liu, Dexing
2018-01-01
To assess wind turbine performance accurately and provide a theoretical basis for wind farm management, a hybrid assessment model based on the Entropy Method and Principal Component Analysis (EM-PCA) was established, which takes most factors of operational performance into consideration and reaches a comprehensive result. To verify the model, six wind turbines were chosen as the research objects; the ranking obtained by the proposed method was 4# > 6# > 1# > 5# > 2# > 3#, in complete conformity with the theoretical ranking, which indicates that the reliability and effectiveness of the EM-PCA method are high. The method can guide state comparisons among different units and support wind farm operational assessment.
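The paper's exact EM-PCA formulation is not spelled out in the abstract, but the entropy-weighting step it builds on is standard. The sketch below derives objective indicator weights via the entropy method and ranks units by a weighted composite; the indicator matrix is hypothetical and assumes all indicators are benefit-type (larger is better).

```python
import numpy as np

def entropy_weights(X):
    """Objective indicator weights via the entropy method.
    X: (n_units, m_indicators), all positive, larger = better."""
    X = np.asarray(X, dtype=float)
    P = X / X.sum(axis=0)                          # column-wise proportions
    k = 1.0 / np.log(X.shape[0])
    e = -k * np.sum(np.where(P > 0, P * np.log(P), 0.0), axis=0)
    d = 1.0 - e                                    # degree of diversification
    return d / d.sum()

# Hypothetical indicators for six turbines: availability, output (MWh), capacity factor
X = np.array([[0.92, 310, 0.31],
              [0.88, 290, 0.29],
              [0.95, 330, 0.33],
              [0.90, 300, 0.30],
              [0.85, 280, 0.28],
              [0.93, 320, 0.32]])
w = entropy_weights(X)
scores = (X / X.max(axis=0)) @ w                   # simple weighted composite
print("ranking (best first):", np.argsort(-scores) + 1)
```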
Alfredsson, Jayne; Plichart, Patrick; Zary, Nabil
2012-01-01
Research on computer-supported scoring of assessments in health care education has mainly focused on automated scoring. Little attention has been given to how informatics can support the currently predominant human-based grading approach. This paper reports steps taken to develop a model for a computer-supported scoring process that focuses on optimizing a task previously undertaken without computer support. The model was also implemented in the open source assessment platform TAO in order to study its benefits. The ability to score test takers anonymously, analytics on grader reliability, and a more time-efficient process are examples of the observed benefits. Computer-supported scoring will increase the quality of the assessment results.
Reliability Technology to Achieve Insertion of Advanced Packaging (RELTECH) program
NASA Astrophysics Data System (ADS)
Fayette, Daniel F.; Speicher, Patricia; Stoklosa, Mark J.; Evans, Jillian V.; Evans, John W.; Gentile, Mike; Pagel, Chuck A.; Hakim, Edward
1993-08-01
A joint military-commercial effort to evaluate multichip module (MCM) structures is discussed. The program, Reliability Technology to Achieve Insertion of Advanced Packaging (RELTECH), has been designed to identify the failure mechanisms that are possible in MCM structures. The RELTECH test vehicles, technical assessment task, product evaluation plan, reliability modeling task, accelerated and environmental testing, and post-test physical analysis and failure analysis are described. The information obtained through RELTECH can be used to address standardization issues, through development of cost effective qualification and appropriate screening criteria, for inclusion into a commercial specification and the MIL-H-38534 general specification for hybrid microcircuits.
Zhang, Ju; Ye, Wenqin; Fan, Fan
2015-01-01
Introduction: To provide high-quality nursing care, a reliable and feasible competency assessment tool is critical. Although several questionnaire-based competency assessment tools have been reported, a tool specific for obstetric nurses in rooming-in wards is lacking. Therefore, the purpose of this research is to develop a competency assessment tool for obstetric rooming-in ward nurses. Methods: A literature review was conducted to inform individual intensive interviews with 14 nurse managers, educators, and primary nurses in rooming-in wards. Expert reviews (n = 15) were conducted to identify emergent themes in a Delphi fashion. A competency assessment questionnaire was then developed and tested with 246 rooming-in ward nurses in local hospitals. Results: We constructed a three-factor linear model for obstetric rooming-in nurse competency assessment. Further refinement resulted in a self-assessment questionnaire containing three first-tier, 12 second-tier, and 43 third-tier items for easy implementation. The questionnaire was reliable, had satisfactory content validity, and had construct validity. Discussion: Our competency assessment tool provides a systematic, easy, and operational subjective evaluation model for nursing managers and administrators to evaluate obstetric rooming-in ward primary nurses. The application of this tool will facilitate various human resources functions, such as evaluation of nurse training/education effects, and will eventually promote high-quality nursing care delivery. PMID:26770468
Lang, Jonas W B
2014-07-01
The measurement of implicit or unconscious motives using the picture story exercise (PSE) has long been a target of debate in the psychological literature. Most debates have centered on the apparent paradox that PSE measures of implicit motives typically show low internal consistency reliability on common indices like Cronbach's alpha but nevertheless predict behavioral outcomes. I describe a dynamic Thurstonian item response theory (IRT) model that builds on dynamic system theories of motivation, theorizing on the PSE response process, and recent advancements in Thurstonian IRT modeling of choice data. To assess the model's capability to explain the internal consistency paradox, I first fitted the model to archival data (Gurin, Veroff, & Feld, 1957) and then simulated data based on bias-corrected model estimates from the real data. Simulation results revealed that the average squared correlation reliability for the motives in the Thurstonian IRT model was .74 and that Cronbach's alpha values were similar to the real data (<.35). These findings suggest that PSE motive measures have long been reliable and increase the scientific value of extant evidence from motivational research using PSE motive measures. (c) 2014 APA, all rights reserved.
Thibault, Pascal; Abbott, J Haxby; Jensen, Mark P
2018-01-01
Background Pain catastrophizing is an exaggerated negative cognitive response related to pain. It is commonly assessed using the Pain Catastrophizing Scale (PCS). Translation and validation of the scale in a new language would facilitate cross-cultural comparisons of the role that pain catastrophizing plays in patient function. Purpose The aim of this study was to translate and culturally adapt the PCS into Nepali (Nepali version of PCS [PCS-NP]) and evaluate its clinimetric properties. Methods We translated, cross-culturally adapted, and performed an exploratory factor analysis (EFA) of the PCS-NP in a sample of adults with chronic pain (N=143). We then confirmed the resulting factor model in a separate sample (N=272) and compared this model with 1-, 2-, and 3-factor models previously identified using confirmatory factor analyses (CFAs). We also computed internal consistencies, test–retest reliabilities, standard error of measurement (SEM), minimal detectable change (MDC), and limits of agreement with 95% confidence interval (LOA95%) of the PCS-NP scales. Concurrent validity with measures of depression, anxiety, and pain intensity was assessed by computing Pearson’s correlation coefficients. Results The PCS-NP was comprehensible and culturally acceptable. We extracted a two-factor solution using EFA and confirmed this model using CFAs in the second sample. Adequate fit was also found for a one-factor model and different two- and three-factor models based on prior studies. The PCS-NP scores evidenced excellent reliability and temporal stability, and demonstrated validity via moderate-to-strong associations with measures of depression, anxiety, and pain intensity. The SEM and MDC for the PCS-NP total score were 2.52 and 7.86, respectively (range of PCS scores 0–52). LOA95% was between −15.17 and +16.02 for the total PCS-NP scores. Conclusion The PCS-NP is a valid and reliable instrument to assess pain catastrophizing in Nepalese individuals with chronic pain. PMID:29430196
Assessments of a Turbulence Model Based on Menter's Modification to Rotta's Two-Equation Model
NASA Technical Reports Server (NTRS)
Abdol-Hamid, Khaled S.
2013-01-01
The main objective of this paper is to construct a turbulence model with a more reliable second equation simulating length scale. In the present paper, we assess the length scale equation based on Menter's modification to Rotta's two-equation model. Rotta shows that a reliable second equation can be formed in an exact transport equation from the turbulent length scale L and kinetic energy. Rotta's equation is well suited for term-by-term modeling and shows some interesting features compared to other approaches. The most important difference is that the formulation leads to a natural inclusion of higher order velocity derivatives into the source terms of the scale equation, which has the potential to enhance the capability of Reynolds-averaged Navier-Stokes (RANS) to simulate unsteady flows. The model is implemented in the PAB3D solver with complete formulation, usage methodology, and validation examples to demonstrate its capabilities. The detailed studies include grid convergence. Near-wall and shear flow cases are documented and compared with experimental and Large Eddy Simulation (LES) data. The results from this formulation are as good as or better than those of the well-known SST turbulence model and much better than k-epsilon results. Overall, the study provides useful insights into the model's capability in predicting attached and separated flows.
Bachmann, Monica; de Boer, Wout; Schandelmaier, Stefan; Leibold, Andrea; Marelli, Renato; Jeger, Joerg; Hoffmann-Richter, Ulrike; Mager, Ralph; Schaad, Heinz; Zumbrunn, Thomas; Vogel, Nicole; Bänziger, Oskar; Busse, Jason W; Fischer, Katrin; Kunz, Regina
2016-07-29
Work capacity evaluations by independent medical experts are widely used to inform insurers whether injured or ill workers are capable of engaging in competitive employment. In many countries, evaluation processes lack a clearly structured approach, standardized instruments, and an explicit focus on claimants' functional abilities. Evaluation of subjective complaints, such as mental illness, presents additional challenges in the determination of work capacity. We have therefore developed a process for functional evaluation of claimants with mental disorders which complements the usual psychiatric evaluation. Here we report the design of a study to measure the reliability of our approach in determining work capacity among patients with mental illness applying for disability benefits. We will conduct a multi-center reliability study, in which 20 psychiatrists trained in our functional evaluation process will assess 30 claimants presenting with mental illness for eligibility to receive disability benefits [Reliability of Functional Evaluation in Psychiatry, RELY-study]. The functional evaluation process entails a five-step structured interview and a reporting instrument (Instrument of Functional Assessment in Psychiatry [IFAP]) to document the severity of work-related functional limitations. We will videotape all evaluations, which will be viewed by three psychiatrists who will independently rate claimants' functional limitations. Our primary outcome measure is the evaluation of the claimant's work capacity as a percentage (0 to 100 %), and our secondary outcomes are the 12 mental functions and 13 functional capacities assessed by the IFAP instrument. Inter-rater reliability of the four psychiatric experts will be explored using multilevel models to estimate the intraclass correlation coefficient (ICC). Additional analyses include subgroups according to mental disorder, the typicality of claimants, and claimant-perceived fairness of the assessment process. We hypothesize that a structured functional approach will show moderate reliability (ICC ≥ 0.6) of psychiatric evaluation of work capacity. Enrollment of actual claimants with mental disorders referred for evaluation by disability/accident insurers will increase the external validity of our findings. Should we find moderate levels of reliability, we will continue with a randomized trial to test the reliability of a structured functional approach versus evaluation-as-usual.
NASA Astrophysics Data System (ADS)
Reinert, K. A.
The use of linear decision rules (LDR) and chance constrained programming (CCP) to optimize the performance of wind energy conversion clusters coupled to storage systems is described. Storage is modelled by LDR and output by CCP. The linear allocation rule and linear release rule prescribe the size and optimize a storage facility with a bypass. Chance constraints are introduced to treat reliability explicitly in terms of an appropriate value from an inverse cumulative distribution function. Details of the deterministic programming structure and a sample problem involving a 500 kW and a 1.5 MW WECS are provided, considering an installed cost of $1/kW. Four demand patterns and three levels of reliability are analyzed for optimizing the generator choice and the storage configuration for base load and peak operating conditions. Deficiencies in the model's ability to predict reliability and to account for serial correlations are noted, but the model is concluded to be useful for narrowing WECS design options.
Imura, Tomoya; Takamura, Masahiro; Okazaki, Yoshihiro; Tokunaga, Satoko
2016-10-01
We developed a scale to measure time management and assessed its reliability and validity. We then used this scale to examine the impact of time management on psychological stress response. In Study 1-1, we developed the scale and assessed its internal consistency and criterion-related validity. Findings from a factor analysis revealed three elements of time management, “time estimation,” “time utilization,” and “taking each moment as it comes.” In Study 1-2, we assessed the scale’s test-retest reliability. In Study 1-3, we assessed the validity of the constructed scale. The results indicate that the time management scale has good reliability and validity. In Study 2, we performed a covariance structural analysis to verify our model that hypothesized that time management influences perceived control of time and psychological stress response, and perceived control of time influences psychological stress response. The results showed that time estimation increases the perceived control of time, which in turn decreases stress response. However, we also found that taking each moment as it comes reduces perceived control of time, which in turn increases stress response.
Quantified Risk Ranking Model for Condition-Based Risk and Reliability Centered Maintenance
NASA Astrophysics Data System (ADS)
Chattopadhyaya, Pradip Kumar; Basu, Sushil Kumar; Majumdar, Manik Chandra
2017-06-01
In the recent past, the risk and reliability centered maintenance (RRCM) framework was introduced with a shift in methodological focus from reliability and probabilities (expected values) to reliability, uncertainty and risk. In this paper, the authors explain a novel methodology for risk quantification and for ranking critical items to prioritize maintenance actions on the basis of condition-based risk and reliability centered maintenance (CBRRCM). The critical items are identified through criticality analysis of the RPN values of items of a system, and the maintenance significant precipitating factors (MSPF) of items are evaluated. The criticality of risk is assessed using three risk coefficients. The likelihood risk coefficient treats the probability as a fuzzy number. The abstract risk coefficient deduces risk influenced by uncertainty and sensitivity, besides other factors. The third risk coefficient, called the hazardous risk coefficient, accounts for anticipated hazards which may occur in the future; the risk is deduced from criteria of consequences on safety, environment, maintenance and economic risks, with corresponding cost for consequences. The characteristic values of all three risk coefficients are obtained with a particular test. With a few more tests on the system, the values may change significantly within the controlling range of each coefficient, hence 'random number simulation' is used to obtain one distinctive value for each coefficient. The risk coefficients are statistically added to obtain the final risk coefficient of each critical item, and the final rankings of critical items are then estimated. The prioritized ranking of critical items using the developed mathematical model for risk assessment should be useful in optimizing financial losses and the timing of maintenance actions.
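A rough sketch of the 'random number simulation' step described above: each risk coefficient of an item is sampled within an assumed controlling range, reduced to one distinctive value, and the three coefficients are added to rank items. The item names, triangular distributions, and ranges are illustrative assumptions, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical controlling ranges (low, mode, high) for the likelihood,
# abstract, and hazardous risk coefficients of each critical item.
items = {
    "bearing":   [(0.2, 0.4, 0.6), (0.1, 0.3, 0.5), (0.3, 0.5, 0.8)],
    "gearbox":   [(0.3, 0.5, 0.7), (0.2, 0.4, 0.6), (0.2, 0.3, 0.5)],
    "lube pump": [(0.1, 0.2, 0.4), (0.1, 0.2, 0.3), (0.1, 0.2, 0.4)],
}

def distinctive_value(low, mode, high, n=10_000):
    """One representative value per coefficient via random sampling."""
    return rng.triangular(low, mode, high, size=n).mean()

final = {name: sum(distinctive_value(*rng_range) for rng_range in ranges)
         for name, ranges in items.items()}
for name, risk in sorted(final.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} final risk coefficient = {risk:.3f}")
```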
NASA Technical Reports Server (NTRS)
English, Thomas
2005-01-01
A standard tool of reliability analysis used at NASA-JSC is the event tree. An event tree is simply a probability tree, with the probabilities determining the next step through the tree specified at each node. The nodal probabilities are determined by a reliability study of the physical system at work for a particular node. The reliability study performed at a node is typically referred to as a fault tree analysis, with the potential of a fault tree existing for each node on the event tree. When examining an event tree it is obvious why the event tree/fault tree approach has been adopted. Typical event trees are quite complex in nature, and the event tree/fault tree approach provides a systematic and organized approach to reliability analysis. The purpose of this study was twofold. First, we wanted to explore the possibility that a semi-Markov process can create dependencies between sojourn times (the times it takes to transition from one state to the next) that can decrease the uncertainty when estimating time to failure. Using a generalized semi-Markov model, we studied a four-element reliability model and were able to demonstrate such sojourn time dependencies. Second, we wanted to study the use of semi-Markov processes to introduce a time variable into the event tree diagrams that are commonly developed in PRA (Probabilistic Risk Assessment) analyses. Event tree end states which change with time are more representative of failure scenarios than are the usual static probability-derived end states.
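A minimal sketch of introducing time through a semi-Markov process: transition probabilities form an embedded chain, sojourn times are drawn from transition-dependent distributions, and time to the absorbing failure state is estimated by simulation. The four states, transition matrix, and Weibull sojourn distributions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 4-state semi-Markov model: 0=nominal, 1=degraded,
# 2=backup active, 3=failed (absorbing).
P = np.array([[0.0, 0.7, 0.3, 0.0],
              [0.0, 0.0, 0.6, 0.4],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])

def sojourn(state, next_state):
    """Sojourn time depends on the transition taken (Weibull shapes are
    illustrative); this is what distinguishes a semi-Markov model from a
    Markov chain with exponential holding times."""
    shape = 1.5 if next_state == 3 else 0.9
    return rng.weibull(shape) * 100.0   # hours

def time_to_failure():
    state, t = 0, 0.0
    while state != 3:
        nxt = rng.choice(4, p=P[state])
        t += sojourn(state, nxt)
        state = nxt
    return t

ttf = np.array([time_to_failure() for _ in range(20_000)])
print(f"mean TTF = {ttf.mean():.0f} h, 5th pct = {np.percentile(ttf, 5):.0f} h")
```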
The reliability of in-training assessment when performance improvement is taken into account.
van Lohuizen, Mirjam T; Kuks, Jan B M; van Hell, Elisabeth A; Raat, A N; Stewart, Roy E; Cohen-Schotanus, Janke
2010-12-01
During in-training assessment students are frequently assessed over a longer period of time and therefore it can be expected that their performance will improve. We studied whether there really is a measurable performance improvement when students are assessed over an extended period of time and how this improvement affects the reliability of the overall judgement. In-training assessment results were obtained from 104 students on rotation at our university hospital or at one of the six affiliated hospitals. Generalisability theory was used in combination with multilevel analysis to obtain reliability coefficients and to estimate the number of assessments needed for reliable overall judgement, both including and excluding performance improvement. Students' clinical performance ratings improved significantly from a mean of 7.6 at the start to a mean of 7.8 at the end of their clerkship. When taking performance improvement into account, reliability coefficients were higher. The number of assessments needed to achieve a reliability of 0.80 or higher decreased from 17 to 11. Therefore, when studying reliability of in-training assessment, performance improvement should be considered.
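The 'number of assessments needed' calculation in generalizability theory follows the decision-study projection Eρ²(n) = σ²_p / (σ²_p + σ²_δ/n). The sketch below inverts this for a target coefficient of 0.80; the variance components are illustrative, not the study's estimates.

```python
import math

def projected_reliability(var_person, var_error, n_assessments):
    """D-study projection: Eρ²(n) = σ²_p / (σ²_p + σ²_δ / n)."""
    return var_person / (var_person + var_error / n_assessments)

def n_needed(var_person, var_error, target=0.80):
    """Smallest number of assessments giving the target coefficient."""
    return math.ceil(target / (1 - target) * var_error / var_person)

# Illustrative variance components (not the study's actual estimates)
var_p, var_delta = 0.12, 0.50
print(n_needed(var_p, var_delta),                     # assessments for 0.80
      round(projected_reliability(var_p, var_delta, 11), 2))
```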
Magnan, Morris A; Maklebust, Joann
2008-01-01
To evaluate the effect of Web-based Braden Scale training on the reliability and precision of pressure ulcer risk assessments made by registered nurses (RNs) working in acute care settings. Pretest-posttest, 2-group, quasi-experimental design. Five hundred Braden Scale risk assessments were made on 102 acute care patients deemed to be at various levels of risk for pressure ulceration. Assessments were made by RNs working in acute care hospitals at 3 different medical centers where the Braden Scale was in regular daily use (2 medical centers) or new to the setting (1 medical center). The Braden Scale for Predicting Pressure Sore Risk was used to guide pressure ulcer risk assessments. A Web-based version of the Detroit Medical Center Braden Scale Computerized Training Module was used to teach nurses correct use of the Braden Scale and selection of risk-based pressure ulcer prevention interventions. In the aggregate, RNs generated reliable Braden Scale pressure ulcer risk assessments 65% of the time after training. The effect of Web-based Braden Scale training on the reliability and precision of assessments varied according to familiarity with the scale. With training, new users of the scale made reliable assessments 84% of the time and significantly improved the precision of their assessments. The reliability and precision of Braden Scale risk assessments made by its regular users were unaffected by training. Technology-assisted Braden Scale training improved both the reliability and precision of risk assessments made by new users of the scale, but had virtually no effect on the reliability or precision of risk assessments made by regular users of the instrument. Further research is needed to determine the best approaches for improving the reliability and precision of Braden Scale assessments made by its regular users.
Bruen, Catherine; Kreiter, Clarence; Wade, Vincent; Pawlikowska, Teresa
2017-01-01
Experience with simulated patients supports undergraduate learning of medical consultation skills. Adaptive simulations are being introduced into this environment. The authors investigate whether it can underpin valid and reliable assessment by conducting a generalizability analysis using IT data analytics from the interaction of medical students (in psychiatry) with adaptive simulations to explore the feasibility of adaptive simulations for supporting automated learning and assessment. The generalizability (G) study was focused on two clinically relevant variables: clinical decision points and communication skills. While the G study on the communication skills score yielded low levels of true score variance, the results produced by the decision points, indicating clinical decision-making and confirming user knowledge of the process of the Calgary-Cambridge model of consultation, produced reliability levels similar to what might be expected with rater-based scoring. The findings indicate that adaptive simulations have potential as a teaching and assessment tool for medical consultations.
Little, N Keith
2010-01-01
Clinical Pastoral Education is professional training for pastoral care. This paper compares CPE against the professional training model. While limiting the discussion to Christian pastoral care, the professional education model suggests a clarification of the trainee's theological and other entry requirements for a basic unit, a more thoughtful provision of information during CPE training, a careful attention to group membership and an appropriate integration with the theological curriculum. It also suggests more specific competency standards and more reliable, valid and objective assessment methods.
Ahmad, Sohail; Ismail, Ahmad Izuanuddin; Khan, Tahir Mehmood; Akram, Waqas; Mohd Zim, Mohd Arif; Ismail, Nahlah Elkudssiah
2017-04-01
The degree of stigmatisation, self-esteem and knowledge either directly or indirectly influence the control and self-management of asthma. To date, there is no valid and reliable instrument that can assess these key issues collectively. The main aim of this study was to test the reliability and validity of the newly devised and translated "Stigmatisation Degree, Self-Esteem and Knowledge Questionnaire" among adult asthma patients using the Rasch measurement model. This cross-sectional study recruited thirty adult asthma patients from two respiratory specialist clinics in Selangor, Malaysia. The newly devised self-administered questionnaire was adapted from relevant publications and translated into the Malay language using international standard translation guidelines. Content and face validation were performed. The data were extracted and analysed for real item reliability and construct validation using the Rasch model. The translated "Stigmatisation Degree, Self-Esteem and Knowledge Questionnaire" showed high real item reliability values of 0.90, 0.86 and 0.89 for stigmatisation degree, self-esteem, and knowledge of asthma, respectively. Furthermore, all values of the point measure correlation (PTMEA Corr) analysis were within the acceptable range specified for the Rasch model. Infit/outfit mean square values and Z standard (ZSTD) values of each item verified the construct validity and suggested retaining all the items in the questionnaire. The reliability analyses and output tables of item measures for construct validation confirmed the translated Malaysian version of the "Stigmatisation Degree, Self-Esteem and Knowledge Questionnaire" to be a valid and highly reliable questionnaire.
Good modeling practice guidelines for applying multimedia models in chemical assessments.
Buser, Andreas M; MacLeod, Matthew; Scheringer, Martin; Mackay, Don; Bonnell, Mark; Russell, Mark H; DePinto, Joseph V; Hungerbühler, Konrad
2012-10-01
Multimedia mass balance models of chemical fate in the environment have been used for over 3 decades in a regulatory context to assist decision making. As these models become more comprehensive, reliable, and accepted, there is a need to recognize and adopt principles of Good Modeling Practice (GMP) to ensure that multimedia models are applied with transparency and adherence to accepted scientific principles. We propose and discuss 6 principles of GMP for applying existing multimedia models in a decision-making context, namely 1) specification of the goals of the model assessment, 2) specification of the model used, 3) specification of the input data, 4) specification of the output data, 5) conduct of a sensitivity and possibly also uncertainty analysis, and finally 6) specification of the limitations and limits of applicability of the analysis. These principles are justified and discussed with a view to enhancing the transparency and quality of model-based assessments. Copyright © 2012 SETAC.
Mejia, Amanda F; Nebel, Mary Beth; Barber, Anita D; Choe, Ann S; Pekar, James J; Caffo, Brian S; Lindquist, Martin A
2018-05-15
Reliability of subject-level resting-state functional connectivity (FC) is determined in part by the statistical techniques employed in its estimation. Methods that pool information across subjects to inform estimation of subject-level effects (e.g., Bayesian approaches) have been shown to enhance reliability of subject-level FC. However, fully Bayesian approaches are computationally demanding, while empirical Bayesian approaches typically rely on using repeated measures to estimate the variance components in the model. Here, we avoid the need for repeated measures by proposing a novel measurement error model for FC describing the different sources of variance and error, which we use to perform empirical Bayes shrinkage of subject-level FC towards the group average. In addition, since the traditional intra-class correlation coefficient (ICC) is inappropriate for biased estimates, we propose a new reliability measure denoted the mean squared error intra-class correlation coefficient (ICC_MSE) to properly assess the reliability of the resulting (biased) estimates. We apply the proposed techniques to test-retest resting-state fMRI data on 461 subjects from the Human Connectome Project to estimate connectivity between 100 regions identified through independent components analysis (ICA). We consider both correlation and partial correlation as the measure of FC and assess the benefit of shrinkage for each measure, as well as the effects of scan duration. We find that shrinkage estimates of subject-level FC exhibit substantially greater reliability than traditional estimates across various scan durations, even for the most reliable connections and regardless of connectivity measure. Additionally, we find partial correlation reliability to be highly sensitive to the choice of penalty term, and to be generally worse than that of full correlations except for certain connections and a narrow range of penalty values. This suggests that the penalty needs to be chosen carefully when using partial correlations. Copyright © 2018. Published by Elsevier Inc.
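The following sketch illustrates the general idea of empirical-Bayes shrinkage of subject-level connectivity toward the group average (a James-Stein-style weighting by between- versus within-subject variance); it is not the authors' measurement-error model, and the within-subject variance and simulated data are assumptions.

```python
import numpy as np

def eb_shrink(subject_fc, within_var):
    """Shrink subject-level connectivity values toward the group average.
    subject_fc: (n_subjects, n_edges) Fisher-z connectivity estimates;
    within_var: assumed sampling variance of a single subject's estimate."""
    group_mean = subject_fc.mean(axis=0)
    between_var = np.maximum(subject_fc.var(axis=0, ddof=1) - within_var, 0)
    lam = between_var / (between_var + within_var)   # shrinkage weight
    return lam * subject_fc + (1 - lam) * group_mean

rng = np.random.default_rng(7)
true_fc = rng.normal(0.3, 0.15, size=(50, 1))          # 50 subjects, 1 edge
noisy = true_fc + rng.normal(0, 0.2, size=(50, 1))      # noisy estimates
shrunk = eb_shrink(noisy, within_var=0.2 ** 2)
print("raw MSE   ", float(((noisy - true_fc) ** 2).mean()))
print("shrunk MSE", float(((shrunk - true_fc) ** 2).mean()))
```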
Velten, Julia; Scholten, Saskia; Graham, Cynthia A; Margraf, Jürgen
2016-02-01
The Sexual Excitation Sexual/Inhibition Inventory for Women (SESII-W) is a self-report questionnaire for assessing propensities of sexual excitation (SE) and sexual inhibition (SI) in women. According to the dual control model of sexual response, these two factors differ between individuals and influence the occurrence of sexual arousal in given situations. Extreme levels of SE and SI are postulated to be associated with sexual problems or risky sexual behaviors. Psychometric evaluation of the original scale yielded two higher order and eight lower order factors as well as satisfactory to good construct validity and reliability. The present study was designed to assess the psychometric properties of a German version of the SESII-W utilizing a large convenience sample of 2206 women. Confirmatory factor analysis showed a satisfactory overall model fit, with support for the five lower order factors of SE (Arousability, Sexual Power Dynamics, Smell, Partner Characteristics, Setting) and the three lower order factors of SI (Relationship Importance, Arousal Contingency, and Concerns about Sexual Function). Additionally, the scale demonstrated good convergent and discriminant validity, internal consistency, and test-retest reliability. The German SESII-W is a sufficiently reliable and valid measure for assessing SE and SI in women. Hence, its use can be recommended for future research in Germany that investigates women's sexual behaviors and experiences.
A Report on Simulation-Driven Reliability and Failure Analysis of Large-Scale Storage Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wan, Lipeng; Wang, Feiyi; Oral, H. Sarp
High-performance computing (HPC) storage systems provide data availability and reliability using various hardware and software fault tolerance techniques. Usually, reliability and availability are calculated at the subsystem or component level using limited metrics such as mean time to failure (MTTF) or mean time to data loss (MTTDL). This often means settling on simple and disconnected failure models (such as an exponential failure rate) to achieve tractable and closed-form solutions. However, such models have been shown to be insufficient in assessing end-to-end storage system reliability and availability. We propose a generic simulation framework aimed at analyzing the reliability and availability of storage systems at scale, and investigating what-if scenarios. The framework is designed for an end-to-end storage system, accommodating the various components and subsystems, their interconnections, failure patterns and propagation, and performs dependency analysis to capture a wide range of failure cases. We evaluate the framework against a large-scale storage system that is in production and analyze its failure projections toward and beyond the end of its lifecycle. We also examine the potential operational impact by studying how different types of components affect the overall system reliability and availability, and present the preliminary results.
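As a toy version of simulation-driven reliability analysis, the sketch below estimates time to data loss for a group that tolerates a fixed number of component failures, using exponential component lifetimes and no repair; a real assessment such as the one described would add rebuilds, correlated failures, and subsystem dependencies. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def system_ttf_samples(n_components=10, tolerate=2, mttf_h=1.0e5,
                       n_runs=50_000):
    """Monte Carlo time-to-data-loss for a group that survives up to
    `tolerate` component failures (no repair modeled): the system fails
    at the (tolerate+1)-th smallest component failure time."""
    fails = rng.exponential(mttf_h, size=(n_runs, n_components))
    fails.sort(axis=1)
    return fails[:, tolerate]           # index 2 -> third failure

ttf = system_ttf_samples()
print(f"estimated MTTDL ≈ {ttf.mean():.3g} h, "
      f"P(loss within 5 years) ≈ {(ttf < 5 * 8760).mean():.4f}")
```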
A systematic review of the factor structure and reliability of the Spence Children's Anxiety Scale.
Orgilés, Mireia; Fernández-Martínez, Iván; Guillén-Riquelme, Alejandro; Espada, José P; Essau, Cecilia A
2016-01-15
The Spence Children's Anxiety Scale (SCAS) is a widely used instrument for assessing symptoms of anxiety disorders among children and adolescents. Previous studies have demonstrated its good reliability for children and adolescents from different backgrounds. However, remarkable variability in the reliability of the SCAS across studies and inconsistent results regarding its factor structure have been found. The present study aims to examine the SCAS factor structure by means of a systematic review with narrative synthesis, the mean reliability of the SCAS by means of a meta-analysis, and the influence of moderators on the SCAS reliability. Databases employed to collect the studies included Google Scholar, PsycARTICLES, PsycINFO, Web of Science, and Scopus, covering publications since 1997. Twenty-nine and 32 studies, which examined the factor structure and the internal consistency of the SCAS, respectively, were included. The SCAS was found to have strong internal consistency, influenced by different moderators. The systematic review demonstrated that the original six-factor model was supported by most studies. Factorial invariance studies (across age, gender, country) and test-retest reliability of the SCAS were not examined in this study. It is concluded that the SCAS is a reliable instrument for cross-cultural use, and it is suggested that the original six-factor model is appropriate for cross-cultural application. Copyright © 2015 Elsevier B.V. All rights reserved.
Li, Wei Bo; Greiter, Matthias; Oeh, Uwe; Hoeschen, Christoph
2011-12-01
The reliability of biokinetic models is essential in internal dose assessments and radiation risk analysis for the public, occupational workers, and patients exposed to radionuclides. In this paper, a method for assessing the reliability of biokinetic models by means of uncertainty and sensitivity analysis was developed. The paper is divided into two parts. In the first part of the study published here, the uncertainty sources of the model parameters for zirconium (Zr), developed by the International Commission on Radiological Protection (ICRP), were identified and analyzed. Furthermore, the uncertainty of the biokinetic experimental measurement performed at the Helmholtz Zentrum München-German Research Center for Environmental Health (HMGU) for developing a new biokinetic model of Zr was analyzed according to the Guide to the Expression of Uncertainty in Measurement, published by the International Organization for Standardization. The confidence interval and distribution of model parameters of the ICRP and HMGU Zr biokinetic models were evaluated. As a result of the computer biokinetic modeling, the mean, standard uncertainty, and confidence interval of model predictions calculated based on the model parameter uncertainty were presented and compared to the plasma clearance and urinary excretion measured after intravenous administration. It was shown that for the most important compartment, the plasma, the uncertainty evaluated for the HMGU model was much smaller than that for the ICRP model; that phenomenon was observed for other organs and tissues as well. The uncertainty of the integral of the radioactivity of Zr up to 50 y calculated by the HMGU model after ingestion by adult members of the public was shown to be smaller by a factor of two than that of the ICRP model. It was also shown that the distribution type of the model parameter strongly influences the model prediction, and the correlation of the model input parameters affects the model prediction to a certain extent depending on the strength of the correlation. In the case of model prediction, the qualitative comparison of the model predictions with the measured plasma and urinary data showed the HMGU model to be more reliable than the ICRP model; quantitatively, the uncertainty of the model prediction by the HMGU systemic biokinetic model is smaller than that of the ICRP model. The uncertainty information on the model parameters analyzed in this study was used in the second part of the paper regarding a sensitivity analysis of the Zr biokinetic models.
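A minimal sketch of propagating parameter uncertainty through a biokinetic model: sample an uncertain clearance rate, run a toy one-compartment retention model, and summarize the spread of predictions. The lognormal parameter distribution and single-compartment form are assumptions for illustration and are far simpler than the ICRP or HMGU models.

```python
import numpy as np

rng = np.random.default_rng(11)

def plasma_retention(t_d, k_clear):
    """Toy one-compartment model: fraction remaining in plasma at time t
    after intravenous injection, with clearance rate k (1/day)."""
    return np.exp(-np.outer(k_clear, t_d))

# Assumed lognormal uncertainty on the clearance rate (illustrative)
k_samples = rng.lognormal(mean=np.log(0.5), sigma=0.3, size=5000)
t = np.array([0.25, 1, 2, 7, 14])            # days post-injection

pred = plasma_retention(t, k_samples)        # (5000, 5) predictions
mean = pred.mean(axis=0)
lo, hi = np.percentile(pred, [2.5, 97.5], axis=0)
for ti, m, l, h in zip(t, mean, lo, hi):
    print(f"day {ti:5.2f}: mean {m:.3f}  95% interval [{l:.3f}, {h:.3f}]")
```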
Improved reliability of wind turbine towers with active tuned mass dampers (ATMDs)
NASA Astrophysics Data System (ADS)
Fitzgerald, Breiffni; Sarkar, Saptarshi; Staino, Andrea
2018-04-01
Modern multi-megawatt wind turbines are composed of slender, flexible, and lightly damped blades and towers. These components exhibit high susceptibility to wind-induced vibrations. As the size, flexibility, and cost of the towers have increased in recent years, the need to protect these structures against damage induced by turbulent aerodynamic loading has become apparent. This paper combines structural dynamic models and probabilistic assessment tools to demonstrate improvements in structural reliability when modern wind turbine towers are equipped with active tuned mass dampers (ATMDs). The study proposes a multi-modal wind turbine model for control design and analysis and incorporates an ATMD into the tower of this model. The model is subjected to stochastically generated wind loads of varying speeds to develop wind-induced probabilistic demand models for towers of modern multi-megawatt wind turbines under structural uncertainty. Numerical simulations have been carried out to ascertain the effectiveness of the active control system in improving the structural performance of the wind turbine and its reliability. The study constructs fragility curves, which illustrate reductions in the vulnerability of towers to wind loading owing to the inclusion of the damper. Results show that the active controller is successful in increasing the reliability of the tower responses, with a strong reduction in the probability of exceeding a given displacement at the rated wind speed.
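A fragility curve of the kind described here is simply the probability of exceeding a response threshold as a function of load intensity. The sketch below estimates such curves by Monte Carlo sampling from an assumed lognormal demand model; the displacement limit, demand scaling, and ATMD reduction factor are invented for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative fragility estimation: for each mean wind speed, draw peak
# tower-top displacements from a lognormal demand model (hypothetical
# parameters) and count exceedances of a displacement limit. The ATMD case
# simply scales the median demand down by an assumed reduction factor.
wind_speeds = np.arange(4, 26, 2)          # m/s
limit = 0.40                               # m, assumed displacement threshold
n_sims = 20_000
reduction = 0.75                           # assumed ATMD demand reduction

def fragility(median_scale):
    probs = []
    for v in wind_speeds:
        median = median_scale * 0.02 * v   # hypothetical demand vs wind speed
        peaks = rng.lognormal(np.log(median), 0.35, n_sims)
        probs.append(np.mean(peaks > limit))
    return np.array(probs)

p_passive = fragility(1.0)
p_atmd = fragility(reduction)
for v, p0, p1 in zip(wind_speeds, p_passive, p_atmd):
    print(f"{v:2d} m/s  P(exceed) uncontrolled={p0:.3f}  with ATMD={p1:.3f}")
```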
NASA Technical Reports Server (NTRS)
Hoppa, Mary Ann; Wilson, Larry W.
1994-01-01
There are many software reliability models which try to predict future performance of software based on data generated by the debugging process. Our research has shown that by improving the quality of the data one can greatly improve the predictions. We are working on methodologies which control some of the randomness inherent in the standard data generation processes in order to improve the accuracy of predictions. Our contribution is twofold in that we describe an experimental methodology using a data structure called the debugging graph and apply this methodology to assess the robustness of existing models. The debugging graph is used to analyze the effects of various fault recovery orders on the predictive accuracy of several well-known software reliability algorithms. We found that, along a particular debugging path in the graph, the predictive performance of different models can vary greatly. Similarly, just because a model 'fits' a given path's data well does not guarantee that the model would perform well on a different path. Further we observed bug interactions and noted their potential effects on the predictive process. We saw that not only do different faults fail at different rates, but that those rates can be affected by the particular debugging stage at which the rates are evaluated. Based on our experiment, we conjecture that the accuracy of a reliability prediction is affected by the fault recovery order as well as by fault interaction.
NASA Astrophysics Data System (ADS)
El-Jaat, Majda; Hulley, Michael; Tétreault, Michel
2018-02-01
Despite the broad impact and importance of saltwater intrusion in coastal aquifers, little research has been directed towards forecasting saltwater intrusion in areas where the source of saltwater is uncertain. Saline contamination in inland groundwater supplies is a concern for numerous communities in the southern US, including the city of Deltona, Florida. Furthermore, conventional numerical tools for forecasting saltwater contamination are heavily dependent on reliable characterization of the physical characteristics of underlying aquifers, information that is often absent or challenging to obtain. To overcome these limitations, a reliable alternative data-driven model for forecasting salinity in a groundwater supply was developed for Deltona using the fast orthogonal search (FOS) method. FOS was applied to monthly water-demand data and corresponding chloride concentrations at water supply wells. Groundwater salinity measurements from Deltona water supply wells were used to evaluate the forecasting capability and accuracy of the FOS model. Accurate and reliable groundwater salinity forecasting is necessary to support effective and sustainable coastal-water resource planning and management. The 27 available water supply wells for Deltona were randomly split into three test groups for the purposes of FOS model development and performance assessment. Based on four performance indices (RMSE, RSR, NSEC, and R), the FOS model proved to be a reliable and robust forecaster of groundwater salinity. FOS is relatively inexpensive to apply, is not based on rigorous physical characterization of the water supply aquifer, and yields reliable estimates of groundwater salinity in active water supply wells.
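The four performance indices named above are standard goodness-of-fit measures that can be computed directly once observed and simulated series are aligned. The sketch below uses common definitions (NSEC is written here as the Nash-Sutcliffe efficiency); the chloride series is invented for illustration, and the exact index formulations used in the paper may differ.

```python
import numpy as np

def performance_indices(observed, simulated):
    """Common goodness-of-fit indices for hydrologic forecasts."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    err = simulated - observed
    rmse = np.sqrt(np.mean(err ** 2))
    rsr = rmse / np.std(observed)                       # RMSE / std of obs
    nse = 1.0 - np.sum(err ** 2) / np.sum((observed - observed.mean()) ** 2)
    r = np.corrcoef(observed, simulated)[0, 1]
    return {"RMSE": rmse, "RSR": rsr, "NSE": nse, "R": r}

# Hypothetical monthly chloride concentrations (mg/L) at one supply well.
obs = [120, 135, 150, 160, 158, 170, 182, 176, 168, 175, 190, 200]
sim = [118, 140, 148, 165, 150, 172, 178, 180, 165, 170, 195, 198]
print(performance_indices(obs, sim))
```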
2018-01-01
We propose a novel approach to modelling rater effects in scoring-based assessment. The approach is based on a Bayesian hierarchical model and simulations from the posterior distribution. We apply it to large-scale essay assessment data over a period of 5 years. Empirical results suggest that the model provides a good fit both for the total scores and for individual rubrics. We estimate the median impact of rater effects on the final grade to be ± 2 points on a 50-point scale, while 10% of essays would receive a score differing from their actual quality by at least 5 points. Most of the impact is due to rater unreliability, not rater bias. PMID:29614129
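The decomposition of score distortion into rater bias and rater unreliability can be mimicked with a small forward simulation. The sketch below generates latent essay qualities, adds per-rater severity and noise, and summarizes the resulting impact on a 50-point grade; every parameter is an invented placeholder rather than an estimate from the study, and the actual Bayesian hierarchical model would instead be fit to observed scores.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulation sketch of rater effects on a 50-point essay score (all
# parameters are illustrative, not estimates from the study).
n_essays, n_raters = 2000, 40
true_quality = rng.normal(30, 6, n_essays)          # latent essay quality
rater_bias = rng.normal(0, 1.0, n_raters)           # systematic severity
rater_sd = rng.uniform(1.5, 3.5, n_raters)          # unreliability (noise sd)

assigned = rng.integers(0, n_raters, n_essays)      # one rater per essay
observed = (true_quality
            + rater_bias[assigned]
            + rng.normal(0, rater_sd[assigned]))
observed = np.clip(np.round(observed), 0, 50)

impact = observed - np.round(np.clip(true_quality, 0, 50))
print(f"median |impact| = {np.median(np.abs(impact)):.1f} points")
print(f"share with |impact| >= 5: {np.mean(np.abs(impact) >= 5):.1%}")
```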
NASA Astrophysics Data System (ADS)
Vivoni, E. R.; Mayer, A. S.; Halvorsen, K. E.; Robles-Morua, A.; Kossak, D.
2016-12-01
A series of iterative participatory modeling workshops were held in Sonora, México with the goal of developing water resources management strategies in a water-stressed basin subject to hydro-climatic variability and change. A model of the water resources system, consisting of watershed hydrology, water resources infrastructure, and groundwater models, was developed deliberatively in the workshops, along with scenarios of future climate and development. Participants used the final version of the water resources systems model to select from supply-side and demand-side water resources management strategies. The performance of the strategies was based on the reliability of meeting current and future demands at a daily time scale over a year's period. Pre- and post-workshop surveys were developed and administered. The survey questions focused on evaluation of participants' modeling capacity and the utility and accuracy of the models. The selected water resources strategies and the associated, expected reliability varied widely among participants. Most participants could be clustered into three groups with roughly equal numbers of participants that varied in terms of reliance on expanding infrastructure vs. demand modification; expectations of reliability; and perceptions of social, environmental, and economic impacts. The wide range of strategies chosen and associated reliabilities indicate that there is a substantial degree of uncertainty in how future water resources decisions could be made in the region. The pre- and post-survey results indicate that participants believed their modeling abilities increased and that their belief in the utility of models increased as a result of the workshops.
Ko, Junsu; Park, Hahnbeom; Seok, Chaok
2012-08-10
Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on "Seok-server," which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods. Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.
A Zebrafish Larval Model to Assess Virulence of Porcine Streptococcus suis Strains
Zaccaria, Edoardo; Cao, Rui; Wells, Jerry M.; van Baarlen, Peter
2016-01-01
Streptococcus suis is an encapsulated Gram-positive bacterium, and the leading cause of sepsis and meningitis in young pigs, resulting in considerable economic losses in the porcine industry. It is also considered an emerging zoonotic agent. In the environment, both avirulent and virulent strains occur in pigs, and virulent strains appear to cause disease in both humans and pigs. There is a need for a convenient, reliable and standardized animal model to assess S. suis virulence. A zebrafish (Danio rerio) larvae infection model has several advantages, including transparency of larvae, low cost, ease of use and exemption from ethical legislation up to 6 days post fertilization, but has not been previously established as a model for S. suis. Microinjection of different porcine strains of S. suis in zebrafish larvae resulted in highly reproducible dose- and strain-dependent larval death, strongly correlating with the presence of the S. suis capsule and with the original virulence of the strain in pigs. Additionally, we compared the virulence of the two-component system mutant of ciaRH, which is attenuated for virulence in both mice and pigs in vivo. Infection of larvae with the ΔciaRH strain resulted in a significantly higher survival rate compared to infection with the S10 wild-type strain. Our data demonstrate that zebrafish larvae are a rapid and reliable model to assess the virulence of clinical porcine S. suis isolates. PMID:26999052
Assessing Psychodynamic Conflict.
Simmonds, Joshua; Constantinides, Prometheas; Perry, J Christopher; Drapeau, Martin; Sheptycki, Amanda R
2015-09-01
Psychodynamic psychotherapies suggest that symptomatic relief is provided, in part, by the resolution of psychic conflicts. Clinical researchers have used innovative methods to investigate such phenomena. This article aims to review the literature on quantitative psychodynamic conflict rating scales. An electronic search of the literature was conducted to retrieve quantitative observer-rated scales used to assess conflict, noting each measure's theoretical model, information source, and the training and clinical experience required. Scales were also examined for levels of reliability and validity. Five quantitative observer-rated conflict scales were identified. Reliability varied from poor to excellent, with each measure demonstrating good validity. However, a small number of studies and limited links to current conflict theory suggest that further clinical research is needed.
Probabilistic simulation of the human factor in structural reliability
NASA Astrophysics Data System (ADS)
Chamis, Christos C.; Singhal, Surendra N.
1994-09-01
The formal approach described herein computationally simulates the probable ranges of uncertainties for the human factor in probabilistic assessments of structural reliability. Human factors such as marital status, professional status, home life, job satisfaction, work load, and health are studied by using a multifactor interaction equation (MFIE) model to demonstrate the approach. Parametric studies in conjunction with judgment are used to select reasonable values for the participating factors (primitive variables). Subsequently performed probabilistic sensitivity studies assess the suitability of the MFIE as well as the validity of the whole approach. Results show that uncertainties range from 5 to 30 percent for the most optimistic case, assuming 100 percent for no error (perfect performance).
Chalmers, E V; McIntyre, G T; Wang, W; Gillgrass, T; Martin, C B; Mossey, P A
2016-09-01
This study was undertaken to evaluate intraoral 3D scans for assessing dental arch relationships and obtain patient/parent perceptions of impressions and intraoral 3D scanning. Forty-three subjects with nonsyndromic unilateral cleft lip and palate (UCLP) had impressions taken for plaster models. These and the teeth were scanned using the R700 Orthodontic Study Model Scanner and Trios® Digital Impressions Scanner (3Shape A/S, Copenhagen, Denmark) to create indirect and direct digital models. All model formats were scored by three observers on two occasions using the GOSLON and modified Huddart Bodenham (MHB) indices. Participants and parents scored their perceptions of impressions and scanning from 1 (very good) to 5 (very bad). Intra- and interexaminer reliability were tested using GOSLON and MHB data (Cronbach's alpha >0.9). Bland and Altman plots were created for MHB data, with each model medium (one-sample t tests, P < .05) and questionnaire data (Wilcoxon signed ranks, P < .05) tested. Intra- and interexaminer reliability (>0.9) were good for all formats, with the direct digital models having the lowest interexaminer differences. Participants had higher ratings for scanning comfort (84.8%) than impressions (44.2%) (P < .05) and for scanning time (56.6%) than impressions (51.2%) (P > .05). None disliked scanning, but 16.3% disliked impressions. Data for parents and children positively correlated (P < .05). Reliability of scoring dental arch relationships using intraoral 3D scans was superior to that of indirect digital and plaster models; subjects with UCLP preferred intraoral 3D scanning to dental impressions, a preference mirrored by parents/carers; and this study supports the replacement of conventional impressions with intraoral 3D scans in longitudinal evaluations of the outcomes of cleft care.
Dima, Alexandra Lelia; Schulz, Peter Johannes
2017-01-01
Background The eHealth Literacy Scale (eHEALS) is a tool to assess consumers’ comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. Objective The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Methods Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. Results CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. Conclusions The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers’ eHealth literacy. PMID:28400356
Lin, Xiaotong; Liu, Mei; Chen, Xue-wen
2009-04-29
Protein-protein interactions play vital roles in nearly all cellular processes and are involved in the construction of biological pathways such as metabolic and signal transduction pathways. Although large-scale experiments have enabled the discovery of thousands of previously unknown linkages among proteins in many organisms, high-throughput interaction data are often associated with high error rates. Since protein interaction networks are utilized in numerous biological inferences, these experimental errors inevitably affect the quality of such predictions. Thus, it is essential to assess the quality of protein interaction data. In this paper, a novel Bayesian network-based integrative framework is proposed to assess the reliability of protein-protein interactions. We develop a cross-species in silico model that assigns likelihood scores to individual protein pairs based on information entirely extracted from model organisms. Our proposed approach integrates multiple microarray datasets and novel features derived from gene ontology. Furthermore, the confidence scores for cross-species protein mappings are explicitly incorporated into our model. Applying our model to predict protein interactions in the human genome, we are able to achieve 80% sensitivity and 70% specificity. Finally, we assess the overall quality of the experimentally determined yeast protein-protein interaction dataset. We observe that the more high-throughput experiments confirm an interaction, the higher the likelihood score, which confirms the effectiveness of our approach. This study demonstrates that model organisms certainly provide important information for protein-protein interaction inference and assessment. The proposed method is able to assess not only the overall quality of an interaction dataset, but also the quality of individual protein-protein interactions. We expect the method to continually improve as more high-quality interaction data from more model organisms become available, and it is readily scalable to genome-wide application.
Reviewing the psychometric properties of contemporary circadian typology measures.
Di Milia, Lee; Adan, Ana; Natale, Vincenzo; Randler, Christoph
2013-12-01
The accurate measurement of circadian typology (CT) is critical because the construct has implications for a number of health disorders. In this review, we focus on the evidence to support the reliability and validity of the more commonly used CT scales: the Morningness-Eveningness Questionnaire (MEQ), reduced Morningness-Eveningness Questionnaire (rMEQ), the Composite Scale of Morningness (CSM), and the Preferences Scale (PS). In addition, we also consider the Munich ChronoType Questionnaire (MCTQ). In terms of reliability, the MEQ, CSM, and PS consistently report high levels of reliability (>0.80), whereas the reliability of the rMEQ is satisfactory. The stability of these scales is sound at follow-up periods up to 13 months. The MCTQ is not a scale; therefore, its reliability cannot be assessed. Although it is possible to determine the stability of the MCTQ, these data are yet to be reported. Validity must be given equal weight in assessing the measurement properties of CT instruments. Most commonly reported are convergent and construct validity. The MEQ, rMEQ, and CSM are highly correlated and this is to be expected, given that these scales share common items. The level of agreement between the MCTQ and the MEQ is satisfactory, but the correlation between these two constructs decreases in line with the number of "corrections" applied to the MCTQ. The interesting question is whether CT is best represented by a psychological preference for behavior or by using a biomarker such as sleep midpoint. Good-quality subjective and objective data suggest adequate construct validity for each of the CT instruments, but a major limitation of this literature is the scarcity of studies that assess the predictive validity of these instruments. We make a number of recommendations with the aim of advancing science. Future studies need to (1) focus on collecting data from representative samples that consider a number of environmental factors; (2) employ longitudinal designs to allow the predictive validity of CT measures to be assessed and preferably make use of objective data; (3) employ contemporary statistical approaches, including structural equation modeling and item-response models; and (4) provide better information concerning sample selection and a rationale for choosing cutoff points.
Earthquake Source Inversion Blindtest: Initial Results and Further Developments
NASA Astrophysics Data System (ADS)
Mai, P.; Burjanek, J.; Delouis, B.; Festa, G.; Francois-Holden, C.; Monelli, D.; Uchide, T.; Zahradnik, J.
2007-12-01
Images of earthquake ruptures, obtained from modelling/inverting seismic and/or geodetic data, exhibit a high degree of spatial complexity. This earthquake source heterogeneity controls seismic radiation, and is determined by the details of the dynamic rupture process. In turn, such rupture models are used for studying source dynamics and for ground-motion prediction. But how reliable and trustworthy are these earthquake source inversions? Rupture models for a given earthquake, obtained by different research teams, often display striking disparities (see http://www.seismo.ethz.ch/srcmod). However, well-resolved, robust, and hence reliable source-rupture models are an integral part of better understanding earthquake source physics and improving seismic hazard assessment. Therefore it is timely to conduct a large-scale validation exercise for comparing the methods, parameterization, and data-handling in earthquake source inversions. We recently started a blind test in which several research groups derive a kinematic rupture model from synthetic seismograms calculated for an input model unknown to the source modelers. The first results, for an input rupture model with heterogeneous slip but constant rise time and rupture velocity, reveal large differences between the input and inverted model in some cases, while a few studies achieve high correlation between the input and inferred model. Here we report on the statistical assessment of the set of inverted rupture models to quantitatively investigate their degree of (dis-)similarity. We briefly discuss the different inversion approaches, their possible strengths and weaknesses, and the use of appropriate misfit criteria. Finally we present new blind-test models, with increasing source complexity and ambient noise on the synthetics. The goal is to attract a large group of source modelers to join this source-inversion blind test in order to conduct a large-scale validation exercise to rigorously assess the performance and reliability of current inversion methods and to discuss future developments.
Validity and Reliability of the 8-Item Work Limitations Questionnaire.
Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C
2017-12-01
Purpose To evaluate the factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods The research aims were tested through a secondary analysis of de-identified data from employees who completed an annual Health Assessment between 2009 and 2015. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach, while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations of the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and with demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables, while weak associations were observed with the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
Vasconcelos-Raposo, José; Fernandes, Helder Miguel; Teixeira, Carla M
2013-01-01
The purpose of the present study was to assess the factor structure and reliability of the Depression, Anxiety and Stress Scales (DASS-21) in a large Portuguese community sample. Participants were 1020 adults (585 women and 435 men), with a mean age of 36.74 (SD = 11.90) years. All scales revealed good reliability, with Cronbach's alpha values between .80 (anxiety) and .84 (depression). The internal consistency of the total score was .92. Confirmatory factor analysis revealed that the best-fitting model (*CFI = .940, *RMSEA = .038) consisted of a latent component of general psychological distress (or negative affectivity) plus orthogonal depression, anxiety and stress factors. The Portuguese version of the DASS-21 showed good psychometric properties (factorial validity and reliability) and thus can be used as a reliable and valid instrument for measuring depression, anxiety and stress symptoms.
NASA Astrophysics Data System (ADS)
Jayasankar, C. B.; Surendran, Sajani; Rajendran, Kavirajan
2015-05-01
Representative Concentration Pathway 8.5 simulations from Coupled Model Intercomparison Project phase 5 coupled global climate models, which underpin the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, are analyzed to derive robust signals of projected changes in Indian summer monsoon rainfall (ISMR) and its variability. Models project a clear future temperature increase but diverse changes in ISMR with substantial intermodel spread. Objective measures of interannual variability (IAV) yield a nearly equal chance of future increase or decrease. This leads to discrepancy in quantifying changes in ISMR and its variability. However, based primarily on the physical association between mean changes in ISMR and its IAV, and on objective methods such as k-means clustering with Dunn's validity index, mean seasonal cycle, and reliability ensemble averaging, projections fall into distinct groups. Physically consistent groups of models with the highest reliability project a future reduction in the frequency of light rainfall but an increase in high to extreme rainfall, and thereby a future increase in ISMR of 0.74 ± 0.36 mm/day, along with increased future IAV. These robust estimates of future changes are important for useful impact assessments.
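Grouping model projections with k-means and screening the partition with Dunn's validity index, as mentioned above, can be sketched as follows. The two projection features and all numbers are synthetic stand-ins for the CMIP5 diagnostics used in the paper, and the Dunn index is written in its basic min-separation over max-diameter form.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Hypothetical per-model features: projected change in mean ISMR (mm/day)
# and change in its interannual variability (values are invented).
features = np.column_stack([
    rng.normal(0.5, 0.6, 30),     # delta mean rainfall
    rng.normal(0.1, 0.3, 30),     # delta IAV
])

def dunn_index(X, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = [X[labels == k] for k in np.unique(labels)]
    min_sep = min(cdist(a, b).min()
                  for i, a in enumerate(clusters)
                  for b in clusters[i + 1:])
    max_diam = max(cdist(c, c).max() for c in clusters)
    return min_sep / max_diam

for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    print(f"k={k}  Dunn index={dunn_index(features, labels):.3f}")
```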
Mortazavi, Forough; Mortazavi, Saideh S; Khosrorad, Razieh
2015-09-01
Procrastination is a common behavior that affects different aspects of life. The procrastination assessment scale-student (PASS) evaluates academic procrastination apropos its frequency and reasons. The aims of the present study were to translate, culturally adapt, and validate the Farsi version of the PASS in a sample of Iranian medical students. In this cross-sectional study, the PASS was translated into Farsi through the forward-backward method, and its content validity was thereafter assessed by a panel of 10 experts. The Farsi version of the PASS was subsequently distributed among 423 medical students. The internal reliability of the PASS was assessed using Cronbach's alpha. An exploratory factor analysis (EFA) was conducted on 18 items and then 28 items of the scale to find new models. The construct validity of the scale was assessed using both EFA and confirmatory factor analysis. The predictive validity of the scale was evaluated by calculating the correlation between the academic procrastination scores and the students' average scores in the previous semester. The reliability of the first and second parts of the scale was 0.781 and 0.861, respectively. An EFA on 18 items of the scale found 4 factors which jointly explained 53.2% of the variance; the model was marginally acceptable (root mean square error of approximation [RMSEA] = 0.098, standardized root mean square residual [SRMR] = 0.076, χ²/df = 4.8, comparative fit index [CFI] = 0.83). An EFA on 28 items of the scale found 4 factors which altogether explained 42.62% of the variance; the model was acceptable (RMSEA = 0.07, SRMR = 0.07, χ²/df = 2.8, incremental fit index = 0.90, CFI = 0.90). There was a negative correlation between the procrastination scores and the students' average scores (r = -0.131, P = 0.02). The Farsi version of the PASS is a valid and reliable tool to measure academic procrastination in Iranian undergraduate medical students.
MacDonald, James; Duerson, Drew
2015-07-01
Baseline assessments using computerized neurocognitive tests are frequently used in the management of sport-related concussions. Such testing is often done on an annual basis in a community setting. Reliability is a fundamental test characteristic that should be established for such tests. Our study examined the test-retest reliability of a computerized neurocognitive test in high school athletes over 1 year. Repeated measures design. Two American high schools. High school athletes (N = 117) participating in American football or soccer during the 2011-2012 and 2012-2013 academic years. All study participants completed 2 baseline computerized neurocognitive tests taken 1 year apart at their respective schools. The test measures performance on 4 cognitive tasks: identification speed (Attention), detection speed (Processing Speed), one card learning accuracy (Learning), and one back speed (Working Memory). Reliability was assessed by measuring the intraclass correlation coefficient (ICC) between the repeated measures of the 4 cognitive tasks. Pearson and Spearman correlation coefficients were calculated as a secondary outcome measure. The measure for identification speed performed best (ICC = 0.672; 95% confidence interval, 0.559-0.760) and the measure for one card learning accuracy performed worst (ICC = 0.401; 95% confidence interval, 0.237-0.542). All tests had marginal or low reliability. In a population of high school athletes, computerized neurocognitive testing performed in a community setting demonstrated low to marginal test-retest reliability on baseline assessments 1 year apart. Further investigation should focus on (1) improving the reliability of individual tasks tested, (2) controlling for external factors that might affect test performance, and (3) identifying the ideal time interval to repeat baseline testing in high school athletes. Computerized neurocognitive tests are used frequently in high school athletes, often within a model of baseline testing of asymptomatic individuals before the start of a sporting season. This study adds to the evidence that suggests in this population such testing may lack sufficient reliability to support clinical decision making.
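Test-retest reliability of the kind reported here is usually summarized with an intraclass correlation coefficient. The sketch below implements the single-measure, two-way random-effects ICC(2,1) from its ANOVA mean squares and applies it to simulated year-apart scores; the score distributions and practice effect are invented, and the specific ICC form used in the study is not stated in the abstract.

```python
import numpy as np

def icc_2_1(scores):
    """Two-way random-effects, absolute-agreement, single-measure ICC(2,1).

    `scores` is an (n_subjects, k_sessions) array of test scores.
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    col_means = scores.mean(axis=0)

    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((scores - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical baseline scores from two sessions taken one year apart.
rng = np.random.default_rng(4)
true_ability = rng.normal(100, 10, 117)
session_1 = true_ability + rng.normal(0, 7, 117)
session_2 = true_ability + rng.normal(2, 7, 117)   # slight practice effect
print(f"ICC(2,1) = {icc_2_1(np.column_stack([session_1, session_2])):.2f}")
```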
Validation of Medical Tourism Service Quality Questionnaire (MTSQQ) for Iranian Hospitals.
Qolipour, Mohammad; Torabipour, Amin; Khiavi, Farzad Faraji; Malehi, Amal Saki
2017-03-01
Assessing service quality is one of the basic requirements to develop the medical tourism industry. There is no valid and reliable tool to measure the service quality of medical tourism. This study aimed to determine the reliability and validity of a Persian version of the medical tourism service quality questionnaire for Iranian hospitals. To validate the medical tourism service quality questionnaire (MTSQQ), a cross-sectional study was conducted on 250 Iraqi patients referred to hospitals in Ahvaz (Iran) in 2015. To design the questionnaire and determine its content validity, the Delphi Technique (3 rounds) with the participation of 20 medical tourism experts was used. Construct validity of the questionnaire was assessed through exploratory and confirmatory factor analysis. Reliability was assessed using Cronbach's alpha coefficient. Data were analyzed using Excel 2007, SPSS version 18, and LISREL 8.0 software. The content validity of the questionnaire was confirmed with CVI = 0.775. According to the exploratory factor analysis, the MTSQQ included 31 items and 8 dimensions (tangibility, reliability, responsiveness, assurance, empathy, exchange and travel facilities, technical and infrastructure facilities, and safety and security). Construct validity of the questionnaire was confirmed based on the goodness-of-fit indices of the model (RMSEA = 0.032, CFI = 0.98, GFI = 0.88). Cronbach's alpha coefficient was 0.837 for the expectation questionnaire and 0.919 for the perception questionnaire. The results of the study showed that the medical tourism SERVQUAL questionnaire with 31 items and 8 dimensions was a valid and reliable tool to measure the service quality of medical tourism in Iranian hospitals.
Boerboom, T B B; Dolmans, D H J M; Jaarsma, A D C; Muijtjens, A M M; Van Beukelen, P; Scherpbier, A J J A
2011-01-01
Feedback to aid teachers in improving their teaching requires validated evaluation instruments. When implementing an evaluation instrument in a different context, it is important to collect validity evidence from multiple sources. We examined the validity and reliability of the Maastricht Clinical Teaching Questionnaire (MCTQ) as an instrument to evaluate individual clinical teachers during short clinical rotations in veterinary education. We examined four sources of validity evidence: (1) Content was examined based on theory of effective learning. (2) Response process was explored in a pilot study. (3) Internal structure was assessed by confirmatory factor analysis using 1086 student evaluations and reliability was examined utilizing generalizability analysis. (4) Relations with other relevant variables were examined by comparing factor scores with other outcomes. Content validity was supported by theory underlying the cognitive apprenticeship model on which the instrument is based. The pilot study resulted in an additional question about supervision time. A five-factor model showed a good fit with the data. Acceptable reliability was achievable with 10-12 questionnaires per teacher. Correlations between the factors and overall teacher judgement were strong. The MCTQ appears to be a valid and reliable instrument to evaluate clinical teachers' performance during short rotations.
Grimm, Annegret; Gruber, Bernd; Henle, Klaus
2014-01-01
Reliable estimates of population size are fundamental in many ecological studies and biodiversity conservation. Selecting appropriate methods to estimate abundance is often very difficult, especially if data are scarce. Most studies concerning the reliability of different estimators used simulation data based on assumptions about capture variability that do not necessarily reflect conditions in natural populations. Here, we used data from an intensively studied closed population of the arboreal gecko Gehyra variegata to construct reference population sizes for assessing twelve different population size estimators in terms of bias, precision, accuracy, and their 95%-confidence intervals. Two of the reference populations reflect natural biological entities, whereas the other reference populations reflect artificial subsets of the population. Since individual heterogeneity was assumed, we tested modifications of the Lincoln-Petersen estimator, a set of models in programs MARK and CARE-2, and a truncated geometric distribution. Ranking of methods was similar across criteria. Models accounting for individual heterogeneity performed best in all assessment criteria. For populations from heterogeneous habitats without obvious covariates explaining individual heterogeneity, we recommend using the moment estimator or the interpolated jackknife estimator (both implemented in CAPTURE/MARK). If data for capture frequencies are substantial, we recommend the sample coverage or the estimating equation (both models implemented in CARE-2). Depending on the distribution of catchabilities, our proposed multiple Lincoln-Petersen and a truncated geometric distribution obtained comparably good results. The former usually resulted in a minimum population size and the latter can be recommended when there is a long tail of low capture probabilities. Models with covariates and mixture models performed poorly. Our approach identified suitable methods and extended options to evaluate the performance of mark-recapture population size estimators under field conditions, which is essential for selecting an appropriate method and obtaining reliable results in ecology and conservation biology, and thus for sound management. PMID:24896260
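Among the estimators compared above, the Lincoln-Petersen family is the simplest to state. The sketch below uses Chapman's bias-corrected form with a normal-approximation confidence interval; the capture counts are invented for illustration, and the study's own multiple Lincoln-Petersen variant and heterogeneity models are more elaborate.

```python
import math

def chapman_estimate(n1, n2, m2):
    """Chapman's bias-corrected Lincoln-Petersen estimator of population size.

    n1: animals marked in the first session,
    n2: animals captured in the second session,
    m2: marked animals among the second-session captures.
    """
    n_hat = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)
           / ((m2 + 1) ** 2 * (m2 + 2)))
    half_width = 1.96 * math.sqrt(var)
    return n_hat, (n_hat - half_width, n_hat + half_width)

# Hypothetical gecko capture data (illustrative numbers only).
n_hat, ci = chapman_estimate(n1=62, n2=55, m2=24)
print(f"N ≈ {n_hat:.0f}, 95% CI ({ci[0]:.0f}, {ci[1]:.0f})")
```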
Torabinia, Mansour; Mahmoudi, Sara; Dolatshahi, Mojtaba; Abyaz, Mohamad Reza
2017-01-01
Background: In line with the broader trend in psychology, researchers in the field of work and organizational psychology have become progressively interested in employees’ positive experiences at work, such as work engagement. This study had 2 main purposes: to assess the psychometric properties of the Utrecht Work Engagement Scale (UWES), and to examine the association between work engagement and burnout in nurses. Methods: The present methodological study was conducted in 2015 and included 248 females and 34 males with 6 months to 30 years of job experience. After the translation process, face and content validity were evaluated by qualitative and quantitative methods. Moreover, the content validity ratio, scale-level content validity index, and item-level content validity index were calculated for this scale. Construct validity was determined by factor analysis. Moreover, internal consistency and stability reliability were assessed. Factor analysis, test-retest, Cronbach’s alpha, and association analysis were used as statistical methods. Results: Face and content validity were acceptable. Exploratory factor analysis suggested a new 3-factor model retaining the same 17 items, with some items relocated relative to the factor structure of the original version. Divergent validity of this new model of the Persian UWES was supported using the Copenhagen Burnout Inventory. Internal consistency reliability for the total scale and the subscales ranged from 0.76 to 0.89. Results from the Pearson correlation test indicated a high degree of test-retest reliability (r = 0.89), and the ICC was 0.91. Engagement was negatively related to burnout and overtime per month, whereas it was positively related to age and job experience. Conclusion: The Persian 3-factor model of the Utrecht Work Engagement Scale is a valid and reliable instrument to measure work engagement in Iranian nurses as well as in other medical professionals. PMID:28955665
Reliability assessment of multiple quantum well avalanche photodiodes
NASA Technical Reports Server (NTRS)
Yun, Ilgu; Menkara, Hicham M.; Wang, Yang; Oguzman, Isamil H.; Kolnik, Jan; Brennan, Kevin F.; May, Gray S.; Wagner, Brent K.; Summers, Christopher J.
1995-01-01
The reliability of doped-barrier AlGaAs/GaAs multi-quantum well avalanche photodiodes fabricated by molecular beam epitaxy is investigated via accelerated life tests. Dark current and breakdown voltage were the parameters monitored. The activation energy of the degradation mechanism and the median device lifetime were determined. Device failure probability as a function of time was computed using the lognormal model. Analysis using the electron beam induced current method revealed the degradation to be caused by ionic impurities or contamination in the passivation layer.
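An accelerated life test of this type is commonly reduced to an Arrhenius fit of median lifetimes against inverse temperature, followed by a lognormal failure-probability curve at the use condition. The sketch below follows that generic recipe; the temperatures, median lifetimes, and lognormal sigma are invented and are not the values reported for these photodiodes.

```python
import numpy as np
from scipy.stats import lognorm

# Hypothetical accelerated-life-test results: median time to failure (hours)
# at three elevated junction temperatures (values are illustrative).
k_B = 8.617e-5                               # Boltzmann constant, eV/K
temps_K = np.array([398.0, 423.0, 448.0])
t50_hours = np.array([9.0e3, 2.6e3, 9.0e2])

# Arrhenius fit: ln(t50) = ln(A) + Ea / (k_B * T)
slope, intercept = np.polyfit(1.0 / (k_B * temps_K), np.log(t50_hours), 1)
ea = slope                                   # activation energy in eV
print(f"activation energy ≈ {ea:.2f} eV")

# Extrapolate the median life to an assumed operating temperature and
# evaluate a lognormal failure-probability curve (sigma assumed).
t_use = 300.0                                # K
t50_use = np.exp(intercept + ea / (k_B * t_use))
sigma = 0.8
hours = np.array([1e4, 1e5, 1e6])
p_fail = lognorm.cdf(hours, s=sigma, scale=t50_use)
for h, p in zip(hours, p_fail):
    print(f"P(failure by {h:.0e} h) = {p:.3f}")
```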
Jucherski, Andrzej; Nastawny, Maria; Walczowski, Andrzej; Jóźwiakowski, Krzysztof; Gajewska, Magdalena
2017-06-01
The aim of the present study was to assess the technological reliability of a domestic hybrid wastewater treatment installation consisting of a classic three-chambered (volume 6 m3) septic tank, a vertical flow trickling bed filled with granules of a calcinated clay material (KERAMZYT), a special wetland bed constructed on a slope, and a permeable pond used as a receiver. The test treatment plant was located at a mountain eco-tourist farm on the periphery of the spa municipality of Krynica-Zdrój, Poland. The plant's operational reliability in reducing the concentration of organic matter, measured as biochemical oxygen demand (BOD5) and chemical oxygen demand (COD), was 100% when modelled by both the Weibull and the lognormal distributions. The respective reliability values for total nitrogen removal were 76.8% and 77.0%, total suspended solids - 99.5% and 92.6%, and PO4-P - 98.2% and 95.2%, with the differences being negligible. The installation was characterized by a very high level of technological reliability when compared with other solutions of this type. The Weibull method employed for statistical evaluation of technological reliability can also be used for comparison purposes. From the ecological perspective, the facility presented in the study has proven to be an effective tool for protecting local aquifer areas.
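Technological reliability in this Weibull sense is the probability that effluent concentrations stay below the permitted limit. A minimal sketch, assuming synthetic BOD5 monitoring data and an invented discharge limit, fits a two-parameter Weibull distribution and reads the reliability off its CDF.

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(5)

# Hypothetical effluent BOD5 concentrations (mg/L) from routine monitoring.
effluent = rng.weibull(2.2, 60) * 12.0

# Fit a two-parameter Weibull distribution (location fixed at zero) and
# express technological reliability as the probability that the effluent
# stays below an assumed discharge limit.
shape, loc, scale = weibull_min.fit(effluent, floc=0)
limit = 25.0                                   # mg/L, assumed permit value
reliability = weibull_min.cdf(limit, shape, loc=loc, scale=scale)
print(f"Weibull shape={shape:.2f}, scale={scale:.1f} mg/L")
print(f"estimated reliability (BOD5 <= {limit} mg/L): {reliability:.1%}")
```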
Assessment of Prevalence of Persons with Down Syndrome: A Theory-Based Demographic Model
ERIC Educational Resources Information Center
de Graaf, Gert; Vis, Jeroen C.; Haveman, Meindert; van Hove, Geert; de Graaf, Erik A. B.; Tijssen, Jan G. P.; Mulder, Barbara J. M.
2011-01-01
Background: The Netherlands are lacking reliable empirical data in relation to the development of birth and population prevalence of Down syndrome. For the UK and Ireland there are more historical empirical data available. A theory-based model is developed for predicting Down syndrome prevalence in the Netherlands from the 1950s onwards. It is…
Importance of Nuclear Physics to NASA's Space Missions
NASA Technical Reports Server (NTRS)
Tripathi, R. K.; Wilson, J. W.; Cucinotta, F. A.
2001-01-01
We show that nuclear physics is extremely important for accurate risk assessments for space missions. Due to the paucity of experimental radiation-interaction data, it is imperative to develop reliable and accurate models for the interaction of radiation with matter. State-of-the-art nuclear cross-section models have been developed at the NASA Langley Research Center and are discussed.
ERIC Educational Resources Information Center
Royal, Kenneth D.
2010-01-01
Quality measurement is essential in every form of research, including institutional research and assessment. This paper addresses the erroneous assumptions institutional researchers often make with regard to survey research and provides an alternative method to producing more valid and reliable measures. Rasch measurement models are discussed and…
ERIC Educational Resources Information Center
Shindler, John; Taylor, Clint; Cadenas, Herminia; Jones, Albert
This study was a pilot effort to examine the efficacy of an analytic trait scale school climate assessment instrument and democratic change system in two urban high schools. Pilot study results indicate that the instrument shows promising soundness in that it exhibited high levels of validity and reliability. In addition, the analytic trait format…
ERIC Educational Resources Information Center
Südkamp, Anna; Pohl, Steffi; Weinert, Sabine
2015-01-01
Including students with special educational needs in learning (SEN-L) is a challenge for large-scale assessments. In order to draw inferences with respect to students with SEN-L and to compare their scores to students in general education, one needs to assure that the measurement model is reliable and that the same construct is measured for…
NASA Technical Reports Server (NTRS)
Cohen, Gerald C. (Inventor); McMann, Catherine M. (Inventor)
1991-01-01
An improved method and system for automatically generating reliability models for use with a reliability evaluation tool is described. The reliability model generator of the present invention includes means for storing a plurality of low level reliability models which represent the reliability characteristics for low level system components. In addition, the present invention includes means for defining the interconnection of the low level reliability models via a system architecture description. In accordance with the principles of the present invention, a reliability model for the entire system is automatically generated by aggregating the low level reliability models based on the system architecture description.
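The generator described in this abstract aggregates stored low-level reliability models according to a system architecture description. A minimal sketch of that idea, assuming constant-failure-rate components and a hand-written series/parallel architecture, is shown below; the component names, rates, and block structure are invented placeholders and do not reflect the patented generator's model or description formats.

```python
import math

# Low-level reliability models: component name -> constant failure rate (1/h).
component_lambda = {"cpu_a": 2e-5, "cpu_b": 2e-5, "bus": 5e-6, "sensor": 1e-5}

def r_exp(lmbda, t):
    """Reliability of a single component with a constant failure rate."""
    return math.exp(-lmbda * t)

def r_series(reliabilities):
    """All elements must survive."""
    out = 1.0
    for r in reliabilities:
        out *= r
    return out

def r_parallel(reliabilities):
    """System survives if at least one redundant element survives."""
    q = 1.0
    for r in reliabilities:
        q *= (1.0 - r)
    return 1.0 - q

# Architecture description: redundant CPUs in parallel, placed in series
# with the bus and the sensor.
def system_reliability(t):
    cpus = r_parallel([r_exp(component_lambda["cpu_a"], t),
                       r_exp(component_lambda["cpu_b"], t)])
    return r_series([cpus,
                     r_exp(component_lambda["bus"], t),
                     r_exp(component_lambda["sensor"], t)])

print(f"R(1000 h) = {system_reliability(1000.0):.5f}")
```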
Sutherland, Rebecca; Trembath, David; Hodge, Antoinette; Drevensek, Suzi; Lee, Sabrena; Silove, Natalie; Roberts, Jacqueline
2017-01-01
Introduction Telehealth can be an effective way to provide speech pathology intervention to children with speech and language impairments. However, the provision of reliable and feasible standardised language assessments via telehealth to establish children's needs for intervention and to monitor progress has not yet been well established. Further, there is limited information about children's reactions to telehealth. This study aimed to examine the reliability and feasibility of conducting standardised language assessment with school-aged children with known or suspected language impairment via a telehealth application using consumer grade computer equipment within a public school setting. Method Twenty-three children (aged 8-12 years) participated. Each child was assessed using a standardised language assessment comprising six subtests. Two subtests were administered by a speech pathologist face-to-face (local clinician) and four subtests were administered via telehealth. All subtests were completed within a single visit to the clinic service, with a break between the face to face and telehealth sessions. The face-to-face clinician completed behaviour observation checklists in the telehealth and face to face conditions and provided feedback on the audio and video quality of the application from the child's point of view. Parent feedback about their child's experience was elicited via survey. Results There was strong inter-rater reliability in the telehealth and face-to-face conditions (correlation coefficients ranged from r = 0.96-1.0 across the subtests) and good agreement on all measures. Similar levels of attention, distractibility and anxiety were observed in the two conditions. Clinicians rated only one session of 23 as having poor audio quality and no sessions were rated as having poor visual quality. Parent and child reactions to the use of telehealth were largely positive and supportive of using telehealth to assess rural children. Discussion The findings support the use of telehealth in the language assessment of school-aged children using a web application and commercially available computer equipment. This reliable and innovative service delivery model has the potential to be used by speech pathologists to provide assessments to children in remote communities.
Llamas-Ramos, Inés; Llamas-Ramos, Rocío; Buz, José; Cortés-Rodríguez, María; Martín-Nogueras, Ana María
2018-06-01
The Memorial Symptom Assessment Scale (MSAS) is a self-rating instrument for the assessment of symptom distress in cancer patients. The Spanish version of the MSAS has recently been validated. However, we lack evidence of the internal construct validity of the shorter versions (short form [MSAS-SF] and condensed form [CMSAS]). In addition, rigorous testing of these scales with modern psychometric methods is needed. The aim of this study was to evaluate the internal construct validity and reliability of the Spanish versions of the MSAS-SF and CMSAS in oncology outpatients using Rasch analysis. Data from a convenience sample of oncology outpatients receiving chemotherapy (n = 306; mean age 60 years; 63% women) at a university hospital were analyzed. The Rasch unidimensional measurement model was used to examine response category functioning, item hierarchy, targeting, unidimensionality, reliability, and differential item functioning by age, gender, and marital status. The response category structure of the symptom distress items was improved by collapsing two categories. The scales were adequately targeted to the study patients, showed overall Rasch model fit (mean Infit MnSq ranged from 0.98 to 1.05), met criteria for unidimensionality, and the reliability of scores was good (person reliability > 0.80), except for the CMSAS prevalence scale. Only four items showed differential item functioning. The present study demonstrated that the Spanish versions of the MSAS-SF and CMSAS have adequate psychometric properties to evaluate symptom distress in oncology outpatients. Additional studies of the CMSAS are recommended.
Smith, L
2001-01-01
Background—No published quantitative instrument exists to measure maternal satisfaction with the quality of different models of labour care in the UK. Methods—A quantitative psychometric multidimensional maternal satisfaction questionnaire, the Women's Views of Birth Labour Satisfaction Questionnaire (WOMBLSQ), was developed using principal components analysis with varimax rotation of successive versions. Internal reliability and content and construct validity were assessed. Results—Of 300 women sent the first version (WOMBLSQ1), 120 (40%) replied; of 300 sent WOMBLSQ2, 188 (62.7%) replied; of 500 women sent WOMBLSQ3, 319 (63.8%) replied; and of 2400 women sent WOMBLSQ4, 1683 (70.1%) replied. The latter two versions consisted of 10 dimensions in addition to general satisfaction. These were (Cronbach's alpha): professional support in labour (0.91), expectations of labour (0.90), home assessment in early labour (0.90), holding the baby (0.87), support from husband/partner (0.83), pain relief in labour (0.83), pain relief immediately after labour (0.65), knowing labour carers (0.82), labour environment (0.80), and control in labour (0.62). There were moderate correlations (range 0.16–0.73) between individual dimensions and the general satisfaction scale (0.75). Scores on individual dimensions were significantly related to a range of clinical and demographic variables. Conclusion—This multidimensional labour satisfaction instrument has good validity and internal reliability. It could be used to assess care in labour across different models of maternity care, or as a prelude to in depth exploration of specific areas of concern. Its external reliability and transferability to care outside the South West region needs further evaluation, particularly in terms of ethnicity and social class. Key Words: Women's Views of Birth Labour Satisfaction Questionnaire (WOMBLSQ); labour; questionnaire PMID:11239139
NASA Technical Reports Server (NTRS)
Wheeler, J. T.
1990-01-01
The Weibull process, identified as the inhomogeneous Poisson process with the Weibull intensity function, is used to model the reliability growth assessment of the space shuttle main engine test and flight failure data. Additional tables of percentage-point probabilities for several different values of the confidence coefficient have been generated for setting (1-alpha)100-percent two-sided confidence interval estimates on the mean time between failures. The tabled data pertain to two cases: (1) time-terminated testing, and (2) failure-terminated testing. The critical values of the three test statistics, namely Cramer-von Mises, Kolmogorov-Smirnov, and chi-square, were calculated and tabled for use in the goodness-of-fit tests for the engine reliability data. Numerical results are presented for five different groupings of the engine data that reflect the actual response to the failures.
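For a time-terminated test, the Weibull-process (Crow-AMSAA) parameters have simple closed-form maximum-likelihood estimates, from which the instantaneous MTBF follows. The sketch below applies those standard formulas to invented failure times; it does not reproduce the engine data or the tabulated confidence factors described in the report.

```python
import math

# Failure times (hours) within a single test of total length T (values are
# illustrative, not the SSME data).
failure_times = [35.0, 110.0, 210.0, 380.0, 600.0, 900.0, 1300.0]
T = 1500.0                                    # time-terminated test length
n = len(failure_times)

# Crow-AMSAA / Weibull-process maximum-likelihood estimates for a
# time-terminated test: intensity u(t) = lambda * beta * t**(beta - 1).
beta_hat = n / sum(math.log(T / t) for t in failure_times)
lambda_hat = n / T ** beta_hat

# Instantaneous failure intensity and MTBF at the end of the test.
intensity_T = lambda_hat * beta_hat * T ** (beta_hat - 1.0)
mtbf_T = 1.0 / intensity_T
print(f"beta = {beta_hat:.2f}, lambda = {lambda_hat:.4g}")
print(f"instantaneous MTBF at T = {mtbf_T:.0f} h "
      f"({'reliability growth' if beta_hat < 1 else 'deterioration'})")
```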
The reliability of the Australasian Triage Scale: a meta-analysis
Ebrahimi, Mohsen; Heydari, Abbas; Mazlom, Reza; Mirhaghi, Amir
2015-01-01
BACKGROUND: Although the Australasian Triage Scale (ATS) was developed two decades ago, its reliability has not been defined; therefore, we present a meta-analysis of the reliability of the ATS in order to reveal to what extent the ATS is reliable. DATA SOURCES: Electronic databases were searched to March 2014. The included studies were those that reported sample size, reliability coefficients, and an adequate description of the ATS reliability assessment. The guidelines for reporting reliability and agreement studies (GRRAS) were used. Two reviewers independently examined abstracts and extracted data. The effect size was obtained by the z-transformation of reliability coefficients. Data were pooled with random-effects models, and meta-regression was done based on the method of moments estimator. RESULTS: Six studies were ultimately included. The pooled coefficient for the ATS was substantial at 0.428 (95%CI 0.340–0.509). The rate of mis-triage was less than fifty percent. The agreement on the adult version is higher than on the pediatric version. CONCLUSION: The ATS has shown an acceptable level of overall reliability in the emergency department, but it needs more development to reach an almost perfect agreement. PMID:26056538
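Pooling reliability coefficients via the z-transformation and a random-effects model, as described above, can be sketched with the DerSimonian-Laird method-of-moments estimator. The per-study coefficients and sample sizes below are invented, and treating each coefficient like a correlation with variance 1/(n-3) is a simplifying approximation rather than the paper's exact variance formula.

```python
import numpy as np

# Hypothetical per-study agreement coefficients and sample sizes.
kappa = np.array([0.36, 0.41, 0.45, 0.40, 0.52, 0.44])
n = np.array([120, 200, 95, 310, 150, 180])

# Fisher z-transformation and inverse-variance weighting; tau^2 from the
# DerSimonian-Laird (method of moments) estimator for a random-effects pool.
z = np.arctanh(kappa)
v = 1.0 / (n - 3)                         # approximate within-study variance
w = 1.0 / v
z_fixed = np.sum(w * z) / np.sum(w)
q = np.sum(w * (z - z_fixed) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - (len(z) - 1)) / c)

w_star = 1.0 / (v + tau2)
z_pooled = np.sum(w_star * z) / np.sum(w_star)
se = 1.0 / np.sqrt(np.sum(w_star))
lo, hi = np.tanh(z_pooled - 1.96 * se), np.tanh(z_pooled + 1.96 * se)
print(f"pooled coefficient = {np.tanh(z_pooled):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```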
de Montbrun, Sandra; Roberts, Patricia L; Satterthwaite, Lisa; MacRae, Helen
2016-07-01
To implement the Colorectal Objective Structured Assessment of Technical Skill (COSATS) into American Board of Colon and Rectal Surgery (ABCRS) certification and build evidence of validity for the interpretation of the scores of this high-stakes assessment tool. Currently, technical skill assessment is not a formal component of board certification. With the technical demands of surgical specialties, documenting competence in technical skill at the time of certification with a valid tool is ideal. In September 2014, the COSATS became a mandatory component of ABCRS certification. Seventy candidates took the examination, with their performance evaluated by expert colorectal surgeons using a task-specific checklist, global rating scale, and overall performance scale. Passing scores were set and compared using 2 standard-setting methodologies, using compensatory and conjunctive models. Inter-rater reliability and the reliability of the pass/fail decision were calculated using Cronbach alpha and Subkoviak methodology, respectively. Overall COSATS scores and pass/fail status were compared with results on the ABCRS oral examination. The pass rate ranged from 85.7% to 90%. Inter-rater reliability (0.85) and reliability of the pass/fail decision (0.87 and 0.84) were high. A low positive correlation (r = 0.25) was seen between the COSATS and the oral examination. All individuals who failed the COSATS passed the ABCRS oral examination. COSATS is the first technical skill examination used in national surgical board certification. This study suggests that the current certification process may be failing to identify individuals who have demonstrated technical deficiencies on this standardized assessment tool.
Gerling, Marco; Zhao, Ying; Nania, Salvatore; Norberg, K Jessica; Verbeke, Caroline S; Englert, Benjamin; Kuiper, Raoul V; Bergström, Asa; Hassan, Moustapha; Neesse, Albrecht; Löhr, J Matthias; Heuchel, Rainer L
2014-01-01
In preclinical cancer studies, non-invasive functional imaging has become an important tool for assessing tumor development and therapeutic effects. Tumor hypoxia is closely associated with tumor aggressiveness and is therefore a key parameter to monitor. Recently, photoacoustic (PA) imaging with inherently co-registered high-frequency ultrasound (US) has reached preclinical applicability, allowing parallel collection of anatomical and functional information. Dual-wavelength PA imaging can be used to quantify tissue oxygen saturation based on the difference in absorbance spectra between oxygenated and deoxygenated hemoglobin. A new bi-modal PA/US system for small animal imaging was employed to test the feasibility and reliability of dual-wavelength PA for measuring relative tissue oxygenation. Murine models of pancreatic and colon cancer were imaged, and differences in tissue oxygenation were compared with immunohistochemistry for hypoxia in the corresponding tissue regions. Functional studies demonstrated the feasibility and reliability of oxygenation detection in murine tissue in vivo. Tumor models exhibited different levels of hypoxia in localized regions, which correlated positively with immunohistochemical staining for hypoxia. Contrast-enhanced imaging yielded complementary information on tissue perfusion using the same system. Bi-modal PA/US imaging can thus be used to reliably detect hypoxic tumor regions in murine tumor models, providing the ability to collect anatomical and functional information on tumor growth and treatment response in vivo in longitudinal preclinical studies.
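The dual-wavelength oxygen-saturation estimate rests on spectral unmixing: at each wavelength the measured absorption is a weighted sum of the oxy- and deoxyhemoglobin extinction coefficients, so two wavelengths give a 2 x 2 linear system in the two concentrations. A minimal sketch follows; the extinction coefficients are rounded, literature-style placeholders rather than values from this study.

```python
import numpy as np

def estimate_so2(mu_a, eps_hbo2, eps_hb):
    """Solve mu_a(lambda_i) = eps_HbO2(lambda_i)*C_HbO2 + eps_Hb(lambda_i)*C_Hb
    for the two concentrations and return oxygen saturation C_HbO2/(C_HbO2 + C_Hb)."""
    E = np.column_stack([eps_hbo2, eps_hb])   # 2x2 extinction matrix
    c_hbo2, c_hb = np.linalg.solve(E, mu_a)
    return c_hbo2 / (c_hbo2 + c_hb)

# Placeholder molar extinction coefficients at ~750 nm and ~850 nm (cm^-1 / M)
eps_hbo2 = np.array([518.0, 1058.0])    # oxyhemoglobin
eps_hb   = np.array([1405.0, 691.0])    # deoxyhemoglobin
mu_a     = np.array([0.9e3, 0.95e3])    # hypothetical measured absorption values
print(f"estimated sO2 = {estimate_so2(mu_a, eps_hbo2, eps_hb):.2%}")
```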
Mash, Bob; Derese, Anselme
2013-01-01
Background: Competency-based education and the validity and reliability of workplace-based assessment of postgraduate trainees have received increasing attention worldwide. Family medicine was recognised as a speciality in South Africa six years ago, and a satisfactory portfolio of learning is a prerequisite to sit the national exit exam. A massive scaling up of the number of family physicians is needed to meet the health needs of the country. Aim: The aim of this study was to develop a reliable, robust and feasible portfolio assessment tool (PAT) for South Africa. Methods: Six raters each rated nine portfolios from the Stellenbosch University programme, using the PAT, to test inter-rater reliability. This rating was repeated three months later to determine test–retest reliability. Following initial analysis and feedback, the PAT was modified and the inter-rater reliability again assessed on nine new portfolios. An acceptable intra-class correlation was considered to be > 0.80. Results: The total score was found to be reliable, with a coefficient of 0.92. For test–retest reliability, the difference in mean total score was 1.7%, which was not statistically significant. Amongst the subsections, only assessment of the educational meetings and the logbook showed reliability coefficients > 0.80. Conclusion: This was the first attempt to develop a reliable, robust and feasible national portfolio assessment tool for postgraduate family medicine training in the South African context. The tool was reliable for the total score, but the low reliability of several sections of the PAT led to 12 recommendations regarding the use of the portfolio, the design of the PAT and the training of raters.
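A minimal sketch of a two-way random-effects intra-class correlation (ICC(2,1), absolute agreement) of the kind reported above, computed from the ANOVA mean squares of a portfolios-by-raters score matrix; the scores are simulated, not the study's data.

```python
import numpy as np

def icc_2_1(X):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.
    X is an (n portfolios x k raters) matrix of total scores."""
    X = np.asarray(X, float)
    n, k = X.shape
    grand = X.mean()
    ms_rows = k * np.sum((X.mean(axis=1) - grand) ** 2) / (n - 1)   # portfolios
    ms_cols = n * np.sum((X.mean(axis=0) - grand) ** 2) / (k - 1)   # raters
    ss_err = np.sum((X - X.mean(axis=1, keepdims=True)
                       - X.mean(axis=0, keepdims=True) + grand) ** 2)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err
                                 + k * (ms_cols - ms_err) / n)

# Invented scores: 9 portfolios rated by 6 raters (percentage marks)
rng = np.random.default_rng(1)
quality = rng.normal(65, 10, size=(9, 1))
scores = quality + rng.normal(0, 4, size=(9, 6))
print(f"ICC(2,1) = {icc_2_1(scores):.2f}")
```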
Reliability and Validity of Ambulatory Cognitive Assessments
Sliwinski, Martin J.; Mogle, Jacqueline A.; Hyun, Jinshil; Munoz, Elizabeth; Smyth, Joshua M.; Lipton, Richard B.
2017-01-01
Mobile technologies are increasingly used to measure cognitive function outside of traditional clinic and laboratory settings. Although ambulatory assessments of cognitive function conducted in people's natural environments offer potential advantages over traditional assessment approaches, the psychometrics of ambulatory cognitive assessment procedures have been understudied. We evaluated the reliability and construct validity of ambulatory assessments of working memory and perceptual speed administered via smartphones as part of an ecological momentary assessment (EMA) protocol in a diverse adult sample (N = 219). Results indicated excellent between-person reliability (≥.97) for average scores and evidence of reliable within-person variability across measurement occasions (.41–.53). The ambulatory tasks also exhibited construct validity, as evidenced by their loadings on working memory and perceptual speed factors defined by the in-lab assessments. Our findings demonstrate that averaging across brief cognitive assessments made in uncontrolled naturalistic settings provides measurements that are comparable in reliability to assessments made in controlled laboratory environments. PMID:27084835
Mentiplay, Benjamin F; Tan, Dawn; Williams, Gavin; Adair, Brooke; Pua, Yong-Hao; Bower, Kelly J; Clark, Ross A
2018-04-27
Isometric rate of torque development examines how quickly force can be exerted and may resemble everyday task demands more closely than isometric strength. Rate of torque development may therefore provide further insight into the relationship between muscle function and gait following stroke. The aims of this study were to examine the test-retest reliability of hand-held dynamometry for measuring isometric rate of torque development following stroke, to examine associations between strength and rate of torque development, and to compare the relationships of strength and rate of torque development with gait velocity. Sixty-three post-stroke adults participated (60 years, 34 male). Gait velocity was assessed using the fast-paced 10 m walk test. Isometric strength and rate of torque development of seven lower-limb muscle groups were assessed with hand-held dynamometry. Intraclass correlation coefficients were calculated for reliability, and Spearman's rho correlations were calculated for associations. Regression analyses using partial F-tests were used to compare strength and rate of torque development in their relationship with gait velocity. Good to excellent reliability was shown for strength and rate of torque development (0.82-0.97). Strong associations were found between strength and rate of torque development (0.71-0.94). Despite these high correlations, rate of torque development failed to add significant value to regression models that already contained strength. Assessment of isometric rate of torque development with hand-held dynamometry is reliable following stroke; however, isometric strength demonstrated a stronger relationship with gait velocity. Further research should examine the relationship between dynamic measures of muscle strength/torque and gait after stroke.
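A minimal sketch of the partial F-test comparison described above: it asks whether adding a rate-of-torque-development predictor to a regression that already contains strength significantly reduces the residual sum of squares. The data are simulated, not the study's measurements.

```python
import numpy as np
from scipy import stats

def partial_f_test(y, X_reduced, X_full):
    """Compare nested OLS models: does the full model (extra columns) explain
    significantly more variance than the reduced model?"""
    def rss(X):
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        return np.sum((y - X1 @ beta) ** 2), X1.shape[1]
    rss_r, p_r = rss(X_reduced)
    rss_f, p_f = rss(X_full)
    df1, df2 = p_f - p_r, len(y) - p_f
    F = ((rss_r - rss_f) / df1) / (rss_f / df2)
    return F, stats.f.sf(F, df1, df2)

# Invented data: gait velocity predicted by strength, then strength + RTD
rng = np.random.default_rng(2)
strength = rng.normal(1.0, 0.3, 63)
rtd = 0.9 * strength + rng.normal(0, 0.1, 63)     # RTD tracks strength closely
gait = 0.8 * strength + rng.normal(0, 0.15, 63)
F, p = partial_f_test(gait, strength[:, None], np.column_stack([strength, rtd]))
print(f"partial F = {F:.2f}, p = {p:.3f}")
```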
Wyles, Susannah M; Miskovic, Danilo; Ni, Zhifang; Darzi, Ara W; Valori, Roland M; Coleman, Mark G; Hanna, George B
2016-03-01
There is a lack of educational tools available for the critique of surgical teaching, particularly for advanced laparoscopic surgery. The aim was to develop and implement a tool that assesses training quality and structures feedback for trainers in the English National Training Programme for laparoscopic colorectal surgery. Semi-structured interviews were performed and analysed, and items were extracted. Through a Delphi process, essential items pertaining to desirable trainer characteristics, training structure and feedback were determined. An assessment tool (Structured Training Trainer Assessment Report, STTAR) was developed and tested for feasibility, acceptability and educational impact. Interview transcripts (29 surgical trainers, 10 trainees, four educationalists) were analysed, and item lists were created and distributed for consensus opinion (11 trainers and seven trainees). The STTAR consisted of 64 factors, and its web-based version, the mini-STTAR, included 21 factors categorised into four groups (training structure, training behaviour, trainer attributes and role modelling) and structured around a training session timeline (beginning, middle and end). The STTAR (six trainers, 48 different assessments) demonstrated good internal consistency (α = 0.88) and inter-rater reliability (ICC = 0.75). The mini-STTAR demonstrated good inter-item reliability (α = 0.79) and intra-observer reliability across 85 different trainer/trainee combinations (r = 0.701, p < 0.001). Both were found to be feasible and acceptable. The educational report for trainers was rated as useful (4.4 out of 5). An assessment tool that evaluates training quality was developed and shown to be reliable, acceptable and of educational value. It has been successfully implemented into the English National Training Programme for laparoscopic colorectal surgery.
A rater training protocol to assess team performance.
Eppich, Walter; Nannicelli, Anna P; Seivert, Nicholas P; Sohn, Min-Woong; Rozenfeld, Ranna; Woods, Donna M; Holl, Jane L
2015-01-01
Simulation-based methodologies are increasingly used to assess teamwork and communication skills and to provide team training, and formative feedback on team performance is an essential component. While effective use of simulation for assessment or training requires accurate rating of team performance, examples of rater-training programs in health care are scarce. We describe our rater training program and report inter-rater reliability during the phases of training and independent rating. We selected an assessment tool shown to yield valid and reliable results and developed a rater training protocol with an accompanying rater training handbook. The rater training program was modeled after previously described high-stakes assessments and delivered in 3 facilitated training sessions. Adjacent agreement was used to measure inter-rater reliability between raters. Nine raters with a background in health care and/or patient safety evaluated team performance in 42 in-situ simulations using post-hoc video review. Adjacent agreement increased from the second training session (83.6%) to the third training session (85.6%) when evaluating the same video segments. Adjacent agreement for the rating of overall team performance, which was added for the third training session, was 78.3%. Adjacent agreement was 97% at 4 weeks post-training and 90.6% at the end of independent rating of all simulation videos. Rater training is an important element of team performance assessment, and providing examples of rater training programs is essential. Articulating key rating anchors promotes adequate inter-rater reliability, and using adjacent agreement as a measure allows differentiation between high- and low-performing teams on video review.
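A minimal sketch of adjacent agreement as used above: the proportion of paired ratings that fall within one scale point of each other, averaged over all rater pairs and items; the ratings are simulated.

```python
import numpy as np
from itertools import combinations

def adjacent_agreement(ratings, tolerance=1):
    """Proportion of rater pairs whose scores differ by at most `tolerance`
    scale points, pooled over all rated items.
    `ratings` is an (items x raters) integer matrix."""
    R = np.asarray(ratings)
    hits, total = 0, 0
    for i, j in combinations(range(R.shape[1]), 2):
        diffs = np.abs(R[:, i] - R[:, j])
        hits += np.sum(diffs <= tolerance)
        total += len(diffs)
    return hits / total

# Invented ratings: 42 simulation videos scored 1-5 by 3 raters
rng = np.random.default_rng(3)
base = rng.integers(1, 6, size=(42, 1))
ratings = np.clip(base + rng.integers(-1, 2, size=(42, 3)), 1, 5)
print(f"adjacent agreement = {adjacent_agreement(ratings):.1%}")
```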
Griffiths, A; Cox, T; Karanika, M; Khan, S; Tomás, J M
2006-10-01
To examine the factor structure, reliability, and validity of a new context-specific questionnaire for the assessment of work and organisational factors. The Work Organisation Assessment Questionnaire (WOAQ) was developed as part of a risk assessment and risk reduction methodology for hazards inherent in the design and management of work in the manufacturing sector. Two studies were conducted. Data were collected from 524 white- and blue-collar employees from a range of manufacturing companies. Exploratory factor analysis was carried out on 28 items that described the most commonly reported failures of work design and management in companies in the manufacturing sector. Concurrent validity data were also collected, and a reliability study was conducted with a further 156 employees. Principal component analysis with varimax rotation revealed a strong 28-item, five-factor structure. The factors were named: quality of relationships with management, reward and recognition, workload, quality of relationships with colleagues, and quality of the physical environment. Analyses also revealed a more general summative factor. Results indicated that the questionnaire has good internal consistency, test-retest reliability, and validity. Being associated with poor employee health and changes in health-related behaviour, the WOAQ factors are possible hazards, and it is argued that the strength of those associations offers some estimation of risk. Feedback from the organisations involved indicated that the WOAQ was easy to use and meaningful to them as part of their risk assessment procedures. The studies reported here describe a model of the hazards to employee health and health-related behaviour inherent in the design and management of work in the manufacturing sector, and they offer an instrument for their assessment. The scales derived to form the WOAQ were shown to be reliable, valid, and meaningful to the user population.
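A minimal sketch of the principal component analysis with varimax rotation described above, with a compact varimax implementation; the response matrix is random stand-in data rather than WOAQ responses.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of an (items x factors) loading matrix."""
    L = np.asarray(loadings, float).copy()
    p, k = L.shape
    R = np.eye(k)
    var_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag(np.sum(Lr ** 2, axis=0)) / p))
        R = u @ vt
        if s.sum() < var_old * (1 + tol):   # criterion stopped improving
            break
        var_old = s.sum()
    return L @ R

# Illustrative stand-in for 524 respondents x 28 questionnaire items
rng = np.random.default_rng(4)
X = rng.normal(size=(524, 28))
Xz = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise items
eigvals, eigvecs = np.linalg.eigh(np.cov(Xz, rowvar=False))
order = np.argsort(eigvals)[::-1][:5]              # keep five components
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
rotated = varimax(loadings)
print(rotated.shape)                               # (28, 5) rotated loadings
```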
Macdermid, Paul William; Fink, Philip W; Stannard, Stephen R
2015-01-01
This investigation set out to assess the effect of five different models of mountain bike tyre on rolling performance over hard-pack mud. Independent characteristics included total weight, volume, tread surface area and tread depth. One male cyclist performed multiple (30) trials of a deceleration field test to assess reliability. Further tests performed on a separate occasion included multiple (15) trials of the deceleration test and six fixed power output hill climb tests for each tyre. The deceleration test proved to be a reliable means of assessing rolling performance via differences in initial and final speed (coefficient of variation (CV) = 4.52%). Overall differences in tyre performance were found for both the deceleration test (P = 0.014) and the hill climb (P = 0.032), and significant models (P < 0.0001 and P = 0.049) were generated that allow tyre performance to be predicted from tyre characteristics. The ideal tyre for rolling and climbing performance on hard-pack surfaces would therefore be one that reduces weight through reductions in tread surface area and tread depth while keeping volume high.
New methods for analyzing semantic graph based assessments in science education
NASA Astrophysics Data System (ADS)
Vikaros, Lance Steven
This research investigated how the scoring of semantic graphs (known to many as concept maps) could be improved and automated in order to address issues of inter-rater reliability and scalability. As part of the NSF-funded SENSE-IT project to introduce secondary school science students to sensor networks (NSF Grant No. 0833440), semantic graphs illustrating how temperature change affects water ecology were collected from 221 students across 16 schools. The graphing task did not constrain students' use of terms, as is often done with semantic graph-based assessment due to coding and scoring concerns. The graphing software provided real-time feedback to help students learn how to construct graphs, stay on topic and effectively communicate ideas. The collected graphs were scored by human raters using assessment methods expected to boost reliability, including adaptations of traditional holistic and propositional scoring methods, use of expert raters, topical rubrics, and criterion graphs. High levels of inter-rater reliability were achieved, demonstrating that vocabulary constraints may not be necessary after all. To investigate a new approach to automating the scoring of graphs, thirty-two graph features characterizing each graph's structure, semantics, configuration and process of construction were then used to predict human raters' scoring, in order to identify feature patterns correlated with raters' evaluations of graphs' topical accuracy and complexity. Results led to the development of a regression model able to predict raters' scoring with 77% accuracy, with 46% accuracy expected when used to score new sets of graphs, as estimated via cross-validation tests. Although such performance is comparable to other graph- and essay-based scoring systems, cross-context testing of the model and the methods used to develop it would be needed before it could be recommended for widespread use. Still, the findings suggest techniques for improving the reliability and scalability of semantic graph-based assessments without requiring constraints on how ideas are expressed.
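A minimal sketch of the feature-based scoring approach described above: regress human raters' scores on graph features and estimate out-of-sample accuracy by cross-validation. The features and scores are simulated stand-ins, not the SENSE-IT data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Stand-in data: 221 graphs x 32 structural/semantic features, plus rater scores
rng = np.random.default_rng(5)
features = rng.normal(size=(221, 32))
true_weights = rng.normal(size=32)
scores = features @ true_weights + rng.normal(0, 2.0, 221)   # simulated rater scores

model = LinearRegression()
cv_r2 = cross_val_score(model, features, scores, cv=10, scoring="r2")
model.fit(features, scores)
print(f"in-sample R^2 = {model.score(features, scores):.2f}, "
      f"10-fold CV R^2 = {cv_r2.mean():.2f}")
```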
Reliability analysis and initial requirements for FC systems and stacks
NASA Astrophysics Data System (ADS)
Åström, K.; Fontell, E.; Virtanen, S.
In the year 2000, Wärtsilä Corporation started an R&D program to develop SOFC systems for CHP applications. The program aims to bring to market highly efficient, clean and cost-competitive fuel cell systems with rated power output in the range of 50-250 kW for distributed generation and marine applications. In the program, Wärtsilä focuses on system integration and development. System reliability and availability are key issues determining the competitiveness of SOFC technology. At Wärtsilä, methods have been implemented for analysing the system with respect to reliability and safety, as well as for defining reliability requirements for system components. A fault tree representation is used as the basis for reliability prediction analysis, and a dynamic simulation technique has been developed to allow for non-static properties in the fault tree logic modelling. Special emphasis has been placed on reliability analysis of the fuel cell stacks in the system. A method has been developed for assessing reliability and critical-failure predictability requirements for fuel cell stacks in a system consisting of several stacks. The method is based on a qualitative model of the stack configuration in which each stack can be in a functional, partially failed or critically failed state, each state having different failure rates and effects on system behaviour. The main purpose of the method is to understand the effect of stack reliability, critical failure predictability and operating strategy on system reliability and availability. An example configuration consisting of 5 × 5 stacks (a series of 5 sets of 5 parallel stacks) is analysed with respect to stack reliability requirements as a function of the predictability of critical failures and the Weibull shape factor of the failure rate distributions.
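A deliberately simplified sketch of the 5 × 5 example configuration: each stack is treated as binary (working or failed) with a Weibull time to failure, each parallel group of five is assumed to need at least a chosen number of working stacks, and the five groups are placed in series. The paper's own method uses a richer multi-state model (functional, partially failed, critically failed), so this is only an approximation; all parameter values are hypothetical.

```python
import numpy as np
from math import comb

def stack_reliability(t, eta, beta):
    """Weibull survival probability of a single stack at time t."""
    return np.exp(-(t / eta) ** beta)

def group_reliability(r, n=5, k_min=3):
    """Probability that at least k_min of n identical parallel stacks survive."""
    return sum(comb(n, k) * r**k * (1 - r)**(n - k) for k in range(k_min, n + 1))

def system_reliability(t, eta, beta, groups=5, n=5, k_min=3):
    """Series arrangement of `groups` parallel groups of n stacks each."""
    r = stack_reliability(t, eta, beta)
    return group_reliability(r, n, k_min) ** groups

# Hypothetical parameters: characteristic life 40,000 h, Weibull shape factor 2
for hours in (5_000, 20_000, 40_000):
    print(hours, f"{system_reliability(hours, eta=40_000, beta=2.0):.3f}")
```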
Latent class analysis of diagnostic science assessment data using Bayesian networks
NASA Astrophysics Data System (ADS)
Steedle, Jeffrey Thomas
2008-10-01
Diagnostic science assessments seek to draw inferences about student understanding by eliciting evidence about the mental models that underlie students' reasoning about physical systems. Measurement techniques for analyzing data from such assessments embody one of two contrasting assessment programs: learning progressions and facet-based assessments. Learning progressions assume that students have coherent theories that they apply systematically across different problem contexts. In contrast, the facet approach makes no such assumption, so students should not be expected to reason systematically across different problem contexts. A systematic comparison of these two approaches is of great practical value to assessment programs such as the National Assessment of Educational Progress as they seek to incorporate small clusters of related items in their tests for the purpose of measuring depth of understanding. This dissertation describes an investigation comparing learning progression and facet models. Data comprised student responses to small clusters of multiple-choice diagnostic science items focusing on narrow aspects of understanding of Newtonian mechanics. Latent class analysis was employed using Bayesian networks in order to model the relationship between students' science understanding and item responses. Separate models reflecting the assumptions of the learning progression and facet approaches were fit to the data. The technical qualities of inferences about student understanding resulting from the two models were compared in order to determine if either modeling approach was more appropriate. Specifically, models were compared on model-data fit, diagnostic reliability, diagnostic certainty, and predictive accuracy. In addition, the effects of test length were evaluated for both models in order to inform the number of items required to obtain adequately reliable latent class diagnoses. Lastly, changes in student understanding over time were studied with a longitudinal model in order to provide educators and curriculum developers with a sense of how students advance in understanding over the course of instruction. Results indicated that expected student response patterns rarely reflected the assumptions of the learning progression approach. That is, students tended not to systematically apply a coherent set of ideas across different problem contexts. Even those students expected to express scientifically-accurate understanding had substantial probabilities of reporting certain problematic ideas. The learning progression models failed to make as many substantively-meaningful distinctions among students as the facet models. In statistical comparisons, model-data fit was better for the facet model, but the models were quite comparable on all other statistical criteria. Studying the effects of test length revealed that approximately 8 items are needed to obtain adequate diagnostic certainty, but more items are needed to obtain adequate diagnostic reliability. The longitudinal analysis demonstrated that students either advance in their understanding (i.e., switch to the more advanced latent class) over a short period of instruction or stay at the same level. There was no significant relationship between the probability of changing latent classes and time between testing occasions. 
In all, this study is valuable because it provides evidence to inform decisions about modeling and reporting on student understanding, it assesses the quality of measurement available from short clusters of diagnostic multiple-choice items, and it provides educators with knowledge of the paths that students may take as they advance from novice to expert understanding over the course of instruction.
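A minimal sketch of a latent class model of the kind compared above, fitted by expectation-maximisation to binary item responses; this generic two-class EM stands in for, and is simpler than, the Bayesian-network formulation used in the dissertation. The responses are simulated.

```python
import numpy as np

def fit_latent_classes(X, n_classes=2, n_iter=200, seed=0):
    """EM for a latent class model with binary items.
    X is (students x items) of 0/1 responses; returns class priors,
    per-class item-endorsement probabilities, and posterior memberships."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    pi = np.full(n_classes, 1.0 / n_classes)              # class priors
    theta = rng.uniform(0.25, 0.75, size=(n_classes, m))  # P(correct | class)
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each student
        log_lik = (X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
                   + np.log(pi))
        log_lik -= log_lik.max(axis=1, keepdims=True)
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update priors and item probabilities (lightly smoothed)
        pi = resp.mean(axis=0)
        theta = (resp.T @ X + 1e-3) / (resp.sum(axis=0)[:, None] + 2e-3)
    return pi, theta, resp

# Simulated data: 221 students, 8 items, two underlying classes
rng = np.random.default_rng(6)
truth = rng.integers(0, 2, 221)
probs = np.where(truth[:, None] == 1, 0.8, 0.3)
X = (rng.random((221, 8)) < probs).astype(float)
pi, theta, resp = fit_latent_classes(X)
print("class priors:", np.round(pi, 2))
```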
A Web-Based System for Bayesian Benchmark Dose Estimation.
Shao, Kan; Shapiro, Andrew J
2018-01-11
Benchmark dose (BMD) modeling is an important step in human health risk assessment and is used as the default approach to identify the point of departure for risk assessment. A probabilistic framework for dose-response assessment has been proposed and advocated by various institutions and organizations; therefore, a reliable tool is needed to provide distributional estimates of the BMD and other important quantities in dose-response assessment. We developed an online system for Bayesian BMD (BBMD) estimation and compared results from this software with the U.S. Environmental Protection Agency's (EPA's) Benchmark Dose Software (BMDS). The system is built on a Bayesian framework featuring Markov chain Monte Carlo (MCMC) sampling for model parameter estimation and BMD calculation, which makes the BBMD system fundamentally different from the currently prevailing BMD software packages. In addition to estimating traditional BMDs for dichotomous and continuous data, the system can also compute model-averaged BMD estimates. A total of 518 dichotomous and 108 continuous data sets extracted from the U.S. EPA's Integrated Risk Information System (IRIS) database (and similar databases) were used as test data to compare the estimates from the BBMD and BMDS programs. The results suggest that the BBMD system may outperform the BMDS program in a number of respects, including fewer failed BMD and BMDL calculations. The BBMD system is a useful alternative tool for estimating BMDs, with additional functionality for BMD analysis based on recent research. Most importantly, the BBMD can incorporate prior information to make dose-response modeling more reliable and can provide distributional estimates of important quantities in dose-response assessment, which greatly facilitates the current trend toward probabilistic risk assessment. https://doi.org/10.1289/EHP1289.
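A stripped-down illustration of the Bayesian BMD idea (not the BBMD system's actual models, priors, or sampler): a quantal-linear dose-response model with a background term is fitted to dichotomous data with a random-walk Metropolis sampler, and the posterior distribution of the BMD for 10% extra risk is summarised. The dose-response data are hypothetical.

```python
import numpy as np

# Hypothetical dichotomous dose-response data: dose, number affected, group size
doses = np.array([0.0, 10.0, 50.0, 150.0, 400.0])
affected = np.array([1, 2, 6, 14, 23])
n_per_group = np.array([25, 25, 25, 25, 25])
BMR = 0.10  # benchmark response (extra risk)

def log_post(g, b):
    """Log-posterior for the quantal-linear model p(d) = g + (1-g)(1 - exp(-b d))
    with flat priors on g in (0, 1) and b > 0."""
    if not (0.0 < g < 1.0) or b <= 0.0:
        return -np.inf
    p = g + (1.0 - g) * (1.0 - np.exp(-b * doses))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return np.sum(affected * np.log(p) + (n_per_group - affected) * np.log(1 - p))

# Random-walk Metropolis sampling of (g, b)
rng = np.random.default_rng(7)
g, b = 0.05, 0.005
lp = log_post(g, b)
samples = []
for step in range(20_000):
    g_new, b_new = g + rng.normal(0, 0.02), b + rng.normal(0, 0.0008)
    lp_new = log_post(g_new, b_new)
    if np.log(rng.random()) < lp_new - lp:
        g, b, lp = g_new, b_new, lp_new
    if step >= 5_000:                       # discard burn-in
        samples.append(b)

# Extra risk 1 - exp(-b * BMD) = BMR  =>  BMD = -ln(1 - BMR) / b
bmd = -np.log(1.0 - BMR) / np.array(samples)
print(f"posterior median BMD = {np.median(bmd):.1f}, "
      f"BMDL (5th percentile of the posterior) = {np.percentile(bmd, 5):.1f}")
```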
MicroRNA Biomarkers of Toxicity in Biological Matrices
Biomarker measurements that reliably correlate with tissue injury and can be measured from sampling accessible biofluids offer enormous benefits in terms of cost, time, and convenience when assessing environmental and drug-induced toxicity in model systems or human cohorts. Micro...
Bibler Zaidi, Nikki L; Santen, Sally A; Purkiss, Joel A; Teener, Carol A; Gay, Steven E
2016-11-01
Most medical schools have either retained a traditional admissions interview or fully adopted an innovative, multisampling format (e.g., the multiple mini-interview), despite there being advantages and disadvantages associated with each format. The University of Michigan Medical School (UMMS) sought to maximize the strengths associated with both interview formats after recognizing that combining the two approaches had the potential to capture additional, unique information about an applicant. In September 2014, the UMMS implemented a hybrid interview model with six 6-minute short-form interviews (highly structured, scenario-based encounters) and two 30-minute semistructured long-form interviews. Five core skills were assessed across both interview formats. Overall, applicants and admissions committee members reported favorable reactions to the hybrid model, supporting its continued use. The generalizability coefficients for the six-station short-form and the two-interview long-form formats were estimated to be 0.470 and 0.176, respectively. Different skills were more reliably assessed by different interview formats. Scores from each format appeared to operate independently, as evidenced by moderate to low correlations (r = 0.100-0.403) for the same skills measured across the two formats; however, after correcting for attenuation, these correlations were much higher. The hybrid model will be revised and optimized to capture the skills most reliably assessed by each format. Future analysis will examine validity by determining whether short-form and long-form interview scores accurately measure the skills they are intended to assess. Additionally, data collected from both formats will be used to establish baselines for entering students' competencies.
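A minimal sketch of the attenuation correction mentioned above: the observed correlation between two scores is divided by the square root of the product of their reliabilities. The reliabilities are the reported generalizability coefficients; the observed correlations are treated as illustrative values within the reported range.

```python
def disattenuate(r_observed, rel_x, rel_y):
    """Correct an observed correlation for measurement unreliability:
    r_true ~= r_xy / sqrt(rel_x * rel_y)."""
    return r_observed / (rel_x * rel_y) ** 0.5

# Generalizability coefficients reported for the two interview formats
rel_short, rel_long = 0.470, 0.176
for r in (0.100, 0.250, 0.403):          # illustrative observed correlations
    # Note: when reliabilities are low the corrected value can exceed 1;
    # in practice such estimates are capped at 1.
    print(r, "->", round(disattenuate(r, rel_short, rel_long), 2))
```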