Field reliability of competency and sanity opinions: A systematic review and meta-analysis.
Guarnera, Lucy A; Murrie, Daniel C
2017-06-01
We know surprisingly little about the interrater reliability of forensic psychological opinions, even though courts and other authorities have long called for known error rates for scientific procedures admitted as courtroom testimony. This is particularly true for opinions produced during routine practice in the field, even for some of the most common types of forensic evaluations: evaluations of adjudicative competency and legal sanity. To address this gap, we used meta-analytic procedures and study space methodology to systematically review studies that examined the interrater reliability, particularly the field reliability, of competency and sanity opinions. Of 59 identified studies, 9 addressed the field reliability of competency opinions and 8 addressed the field reliability of sanity opinions. These studies presented a wide range of reliability estimates; pairwise percentage agreements ranged from 57% to 100% and kappas ranged from .28 to 1.0. Meta-analytic combinations of reliability estimates obtained by independent evaluators returned estimates of κ = .49 (95% CI: .40-.58) for competency opinions and κ = .41 (95% CI: .29-.53) for sanity opinions. This wide range of reliability estimates underscores the extent to which different evaluation contexts tend to produce different reliability rates. Unfortunately, our study space analysis illustrates that available field reliability studies typically provide little information about contextual variables crucial to understanding their findings. Given these concerns, we offer suggestions for improving research on the field reliability of competency and sanity opinions, as well as suggestions for improving reliability rates themselves. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
2013-01-01
Summary of background data: Recent smartphones, such as the iPhone, are often equipped with an accelerometer and magnetometer, which, through software applications, can perform various inclinometric functions. Although these applications are intended for recreational use, they have the potential to measure and quantify range of motion. The purpose of this study was to estimate the intra- and inter-rater reliability as well as the criterion validity of the clinometer and compass applications of the iPhone in the assessment of cervical range of motion in healthy participants. Methods: The sample consisted of 28 healthy participants. Two examiners measured the cervical range of motion of each participant twice using the iPhone (for the estimation of intra- and inter-rater reliability) and once with the CROM (for the estimation of criterion validity). Estimates of reliability and validity were then established using the intraclass correlation coefficient (ICC). Results: We observed a moderate intra-rater reliability for each movement (ICC = 0.65-0.85) but a poor inter-rater reliability (ICC < 0.60). For the criterion validity, the ICCs are moderate (>0.50) to good (>0.65) for movements of flexion, extension, lateral flexions and right rotation, but poor (<0.50) for left rotation. Conclusion: We found good intra-rater reliability and lower inter-rater reliability. When compared to the gold standard, these applications showed moderate to good validity. However, before using the iPhone as an outcome measure in clinical settings, studies should be done on patients presenting with cervical problems. PMID:23829201
Simulation analyses of space use: Home range estimates, variability, and sample size
Bekoff, Marc; Mech, L. David
1984-01-01
Simulations of space use by animals were run to determine the relationship among home range area estimates, variability, and sample size (number of locations). As sample size increased, home range size increased asymptotically, whereas variability decreased among mean home range area estimates generated by multiple simulations for the same sample size. Our results suggest that field workers should ascertain between 100 and 200 locations in order to estimate home range area reliably. In some cases, this suggested guideline is higher than values found in the few published studies in which the relationship between home range area and number of locations is addressed. Sampling differences for small species occupying relatively small home ranges indicate that fewer locations may be sufficient to allow for a reliable estimate of home range. Intraspecific variability in social status (group member, loner, resident, transient), age, sex, reproductive condition, and food resources also has to be considered, as do season, habitat, and differences in sampling and analytical methods. Comparative data still are needed.
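The relationship described in this abstract between number of locations, estimated area, and variability can be illustrated with a small simulation. The sketch below is not the authors' simulation. It assumes, purely for illustration, that locations are drawn uniformly from a unit-square home range and that area is estimated with a minimum convex polygon; the function names `convex_hull`, `polygon_area`, and `simulate` are hypothetical.

```python
import random
import statistics


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def simulate(sample_sizes, n_runs=200, seed=1):
    """Return {n: (mean area, coefficient of variation)} over n_runs simulations."""
    rng = random.Random(seed)
    results = {}
    for n in sample_sizes:
        areas = []
        for _ in range(n_runs):
            # Assumption: uniform use of a unit-square home range (true area = 1).
            locations = [(rng.random(), rng.random()) for _ in range(n)]
            areas.append(polygon_area(convex_hull(locations)))
        mean_area = statistics.mean(areas)
        results[n] = (mean_area, statistics.stdev(areas) / mean_area)
    return results


if __name__ == "__main__":
    for n, (mean_area, cv) in simulate([10, 25, 50, 100, 200]).items():
        print(f"n={n:4d}  mean area={mean_area:.3f}  CV={cv:.3f}")
```

Under these assumptions the mean estimated area rises toward the true area as the number of locations grows, while the spread among runs shrinks, which is the qualitative pattern the abstract reports. Real animal movement is neither uniform nor independent between fixes, so the specific numbers carry no ecological meaning.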
Proposed Reliability/Cost Model
NASA Technical Reports Server (NTRS)
Delionback, L. M.
1982-01-01
New technique estimates cost of improvement in reliability for complex system. Model format/approach is dependent upon use of subsystem cost-estimating relationships (CER's) in devising cost-effective policy. Proposed methodology should have application in broad range of engineering management decisions.
Lucas, Nicholas; Macaskill, Petra; Irwig, Les; Moran, Robert; Bogduk, Nikolai
2009-01-01
Trigger points are promoted as an important cause of musculoskeletal pain. There is no accepted reference standard for the diagnosis of trigger points, and data on the reliability of physical examination for trigger points are conflicting. To systematically review the literature on the reliability of physical examination for the diagnosis of trigger points. MEDLINE, EMBASE, and other sources were searched for articles reporting the reliability of physical examination for trigger points. Included studies were evaluated for their quality and applicability, and reliability estimates were extracted and reported. Nine studies were eligible for inclusion. None satisfied all quality and applicability criteria. No study specifically reported reliability for the identification of the location of active trigger points in the muscles of symptomatic participants. Reliability estimates varied widely for each diagnostic sign, for each muscle, and across each study. Reliability estimates were generally higher for subjective signs such as tenderness (kappa range, 0.22-1.0) and pain reproduction (kappa range, 0.57-1.00), and lower for objective signs such as the taut band (kappa range, -0.08-0.75) and local twitch response (kappa range, -0.05-0.57). No study to date has reported the reliability of trigger point diagnosis according to the currently proposed criteria. On the basis of the limited number of studies available, and significant problems with their design, reporting, statistical integrity, and clinical applicability, physical examination cannot currently be recommended as a reliable test for the diagnosis of trigger points. The reliability of trigger point diagnosis needs to be further investigated with studies of high quality that use current diagnostic criteria in clinically relevant patients.
Interval Estimation of Revision Effect on Scale Reliability via Covariance Structure Modeling
ERIC Educational Resources Information Center
Raykov, Tenko
2009-01-01
A didactic discussion of a procedure for interval estimation of change in scale reliability due to revision is provided, which is developed within the framework of covariance structure modeling. The method yields ranges of plausible values for the population gain or loss in reliability of unidimensional composites, which results from deletion or…
NASA Astrophysics Data System (ADS)
Kim, K.-h.; Oh, T.-s.; Park, K.-r.; Lee, J. H.; Ghim, Y.-c.
2017-11-01
One factor determining the reliability of measurements of electron temperature using a Thomson scattering (TS) system is the transmittance of the optical bandpass filters in polychromators. We investigate the system performance as a function of electron temperature to determine the reliable range of measurements for a given set of optical bandpass filters. We show that such reliability, i.e., both bias and random errors, can be obtained by building a forward model of the KSTAR TS system to generate synthetic TS data with the prescribed electron temperature and density profiles. The prescribed profiles are compared with the estimated ones to quantify both bias and random errors.
Revised techniques for estimating peak discharges from channel width in Montana
Parrett, Charles; Hull, J.A.; Omang, R.J.
1987-01-01
This study was conducted to develop new estimating equations based on channel width and the updated flood frequency curves of previous investigations. Simple regression equations for estimating peak discharges with recurrence intervals of 2, 5, 10, 25, 50, and 100 years were developed for seven regions in Montana. The standard errors of estimate for the equations that use active channel width as the independent variable ranged from 30% to 87%. The standard errors of estimate for the equations that use bankfull width as the independent variable ranged from 34% to 92%. The smallest standard errors generally occurred in the prediction equations for the 2-yr flood, 5-yr flood, and 10-yr flood, and the largest standard errors occurred in the prediction equations for the 100-yr flood. The equations that use active channel width and the equations that use bankfull width were determined to be about equally reliable in five regions. In the West Region, the equations that use bankfull width were slightly more reliable than those based on active channel width, whereas in the East-Central Region the equations that use active channel width were slightly more reliable than those based on bankfull width. Compared with similar equations previously developed, the standard errors of estimate for the new equations are substantially smaller in three regions and substantially larger in two regions. Limitations on the use of the estimating equations include: (1) The equations are based on stable conditions of channel geometry and prevailing water and sediment discharge; (2) The measurement of channel width requires a site visit, preferably by a person with experience in the method, and involves appreciable measurement errors; (3) Reliability of results from the equations for channel widths beyond the range of definition is unknown.
In spite of the limitations, the estimating equations derived in this study are considered to be as reliable as estimating equations based on basin and climatic variables. Because the two types of estimating equations are independent, results from each can be weighted inversely proportional to their variances, and averaged. The weighted average estimate has a variance less than either individual estimate. (Author's abstract)
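The inverse-variance weighting mentioned at the end of this abstract can be sketched as follows. This is a minimal illustration, assuming the estimates are independent and unbiased and that their variances are expressed in the same (squared) units as the estimates; the function name and the numbers in the usage example are hypothetical and are not taken from the report.

```python
def inverse_variance_average(estimates, variances):
    """Combine independent estimates, weighting each by 1 / variance.

    Returns (combined estimate, variance of the combined estimate).
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one positive variance per estimate")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    # For independent estimates the combined variance is 1 / sum(1 / var_i),
    # which is always smaller than the smallest individual variance.
    return combined, 1.0 / total_weight


if __name__ == "__main__":
    # Hypothetical peak-discharge estimates from two independent equations.
    channel_width_estimate, channel_width_variance = 100.0, 400.0
    basin_estimate, basin_variance = 140.0, 1600.0
    combined, variance = inverse_variance_average(
        [channel_width_estimate, basin_estimate],
        [channel_width_variance, basin_variance],
    )
    print(combined, variance)
```

The more precise estimate dominates the average, and the combined variance falls below both inputs, which matches the statement that the weighted average has a variance less than either individual estimate.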
Reliable estimation of orbit errors in spaceborne SAR interferometry. The network approach
NASA Astrophysics Data System (ADS)
Bähr, Hermann; Hanssen, Ramon F.
2012-12-01
An approach to improve orbital state vectors by orbit error estimates derived from residual phase patterns in synthetic aperture radar interferograms is presented. For individual interferograms, an error representation by two parameters is motivated: the baseline error in cross-range and the rate of change of the baseline error in range. For their estimation, two alternatives are proposed: a least squares approach that requires prior unwrapping and a less reliable gridsearch method handling the wrapped phase. In both cases, reliability is enhanced by mutual control of error estimates in an overdetermined network of linearly dependent interferometric combinations of images. Thus, systematic biases, e.g., due to unwrapping errors, can be detected and iteratively eliminated. Regularising the solution by a minimum-norm condition results in quasi-absolute orbit errors that refer to particular images. For the 31 images of a sample ENVISAT dataset, orbit corrections with a mutual consistency on the millimetre level have been inferred from 163 interferograms. The method itself qualifies by reliability and rigorous geometric modelling of the orbital error signal but does not consider interfering large scale deformation effects. However, a separation may be feasible in a combined processing with persistent scatterer approaches or by temporal filtering of the estimates.
Stenneberg, Martijn S; Busstra, Harm; Eskes, Michel; van Trijffel, Emiel; Cattrysse, Erik; Scholten-Peeters, Gwendolijne G M; de Bie, Rob A
2018-04-01
There is a lack of valid, reliable, and feasible instruments for measuring planar active cervical range of motion (aCROM) and associated 3D coupling motions in patients with neck pain. Smartphones have advanced sensors and appear to be suitable for these measurements. To estimate the concurrent validity and interrater reliability of a new iPhone application for assessing planar aCROM and associated 3D coupling motions in patients with neck pain, using an electromagnetic tracking device as a reference test. Cross-sectional study. Two samples of neck pain patients were recruited; 30 patients for the validity study and 26 patients for the reliability study. Validity was estimated using intraclass correlation coefficients (ICCs), and by calculating 95% limits of agreement (LoA). To estimate interrater reliability, ICCs were calculated. Cervical 3D coupling motions were analyzed by calculating the cross-correlation coefficients and ratio between the main motions and coupled motions for both instruments. ICCs for concurrent validity and interrater reliability ranged from 0.90 to 0.99. The width of the 95% LoA ranged from about 5° for right lateral bending to 11° for total rotation. No significant differences were found between both devices for associated coupling motion analysis. The iPhone application appears to be a useful discriminative tool for the measurement of planar aCROM and associated coupling motions in patients with neck pain. It fulfills the need for a valid, reliable, and feasible instrument in clinical practice and research. Therapists and researchers should consider measurement error when interpreting scores. Copyright © 2017 Elsevier Ltd. All rights reserved.
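The 95% limits of agreement (LoA) used in this abstract are conventionally computed from paired differences between two instruments. The sketch below uses the usual Bland-Altman form (mean difference plus or minus 1.96 standard deviations of the differences); whether the authors applied exactly this variant is an assumption, and the readings in the usage example are invented.

```python
import statistics


def limits_of_agreement(method_a, method_b):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)          # systematic difference
    spread = statistics.stdev(differences)       # sample SD of the differences
    return bias - 1.96 * spread, bias, bias + 1.96 * spread


if __name__ == "__main__":
    # Hypothetical range-of-motion readings in degrees from two devices.
    phone = [50.0, 52.0, 48.0, 55.0, 60.0]
    reference = [49.0, 53.0, 47.0, 56.0, 58.0]
    lower, bias, upper = limits_of_agreement(phone, reference)
    print(f"bias={bias:.2f}  LoA=({lower:.2f}, {upper:.2f})  width={upper - lower:.2f}")
```

The width of the interval, `upper - lower`, is the quantity that corresponds to the "width of the 95% LoA" reported in degrees.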
Parrett, Charles; Johnson, D.R.; Hull, J.A.
1989-01-01
Estimates of streamflow characteristics (monthly mean flow that is exceeded 90, 80, 50, and 20 percent of the time for all years of record and mean monthly flow) were made and are presented in tabular form for 312 sites in the Missouri River basin in Montana. Short-term gaged records were extended to the base period of water years 1937-86, and were used to estimate monthly streamflow characteristics at 100 sites. Data from 47 gaged sites were used in regression analysis relating the streamflow characteristics to basin characteristics and to active-channel width. The basin-characteristics equations, with standard errors of 35% to 97%, were used to estimate streamflow characteristics at 179 ungaged sites. The channel-width equations, with standard errors of 36% to 103%, were used to estimate characteristics at 138 ungaged sites. Streamflow measurements were correlated with concurrent streamflows at nearby gaged sites to estimate streamflow characteristics at 139 ungaged sites. In a test using 20 pairs of gages, the standard errors ranged from 31% to 111%. At 139 ungaged sites, the estimates from two or more of the methods were weighted and combined in accordance with the variance of individual methods. When estimates from three methods were combined, the standard errors ranged from 24% to 63%. A drainage-area-ratio adjustment method was used to estimate monthly streamflow characteristics at seven ungaged sites. The reliability of the drainage-area-ratio adjustment method was estimated to be about equal to that of the basin-characteristics method. The estimates were checked for reliability. Estimates of monthly streamflow characteristics from gaged records were considered to be most reliable, and estimates at sites with actual flow records from 1937-86 were considered to be completely reliable (zero error). Weighted-average estimates were considered to be the most reliable estimates made at ungaged sites. (USGS)
Between-day reliability of the trapezius muscle H-reflex and M-wave.
Vangsgaard, Steffen; Hansen, Ernst A; Madeleine, Pascal
2015-12-01
The aim of this study was to investigate the between-day reliability of the trapezius muscle H-reflex and M-wave. Sixteen healthy subjects were studied on 2 consecutive days. Trapezius muscle H-reflexes were evoked by electrical stimulation of the C3/4 cervical nerves; M-waves were evoked by electrical stimulation of the accessory nerve. Relative reliability was estimated by intraclass correlation coefficients (ICC2,1). Absolute reliability was estimated by computing the standard error of measurement (SEM) and the smallest real difference (SRD). Bland-Altman plots were constructed to detect any systematic bias. Variables showed substantial to excellent relative reliability (ICC = 0.70-0.99). The relative SEM ranged from 1.4% to 34.8%; relative SRD ranged from 3.8% to 96.5%. No systematic bias was present in the data. The amplitude and latency of the trapezius muscle H-reflex and M-wave in healthy young subjects can be measured reliably across days. © 2015 Wiley Periodicals, Inc.
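The abstract does not state which formulas were used for SEM and SRD. A common convention, assumed here, is SEM = SD × sqrt(1 − ICC) and SRD = 1.96 × sqrt(2) × SEM. The sketch below applies that convention to invented numbers; the function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM under the common convention SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1 here")
    return sd * math.sqrt(1.0 - icc)


def smallest_real_difference(sem):
    """SRD at the 95% level: 1.96 * sqrt(2) * SEM (two measurements compared)."""
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical between-subject SD and ICC for one variable.
    sd, icc, mean = 10.0, 0.91, 50.0
    sem = standard_error_of_measurement(sd, icc)
    srd = smallest_real_difference(sem)
    # Relative values express SEM and SRD as a percentage of the mean.
    print(f"SEM={sem:.2f} ({100 * sem / mean:.1f}%)  SRD={srd:.2f} ({100 * srd / mean:.1f}%)")
```

Under this convention the SRD is always about 2.77 times the SEM, which is consistent with the reported ranges (1.4% to 34.8% for SEM, 3.8% to 96.5% for SRD).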
Ilahi, Omer A; Mansfield, David J; Urrea, Luis H; Qadeer, Ali A
2014-10-01
To assess interobserver and intraobserver agreement of estimating anterior cruciate ligament (ACL) femoral tunnel positioning arthroscopically using circular and linear (noncircular) estimation methods and to determine whether overlay template visual aids improve agreement. Standardized intraoperative pictures of femoral tunnel pilot holes (taken with a 30° arthroscope through an anterolateral portal at 90° of knee flexion with horizontal being parallel to the tibial surface) in 27 patients undergoing single-bundle ACL reconstruction were presented to 3 fellowship-trained arthroscopists on 2 separate occasions. On both viewings, each surgeon estimated the femoral tunnel pilot hole location to the nearest half-hour mark using a whole clock face and half clock face, to the nearest 15° using a whole compass and half compass, in the top or bottom half of a linear quadrant, and in the top or bottom half of a linear trisector. Evaluations were performed first without and then with an overlay template of each estimation method. The average difference among reviewers was quite similar for all 4 circular methods with the use of visual aids. Without overlay template visual aids, pair-wise κ statistic values for interobserver agreement ranged from -0.14 to 0.56 for the whole clock face and from 0.16 to 0.42 for the half clock face. With overlay visual guides, interobserver agreement ranged from 0.29 to 0.63 for the whole clock face and from 0.17 to 0.66 for the half clock face. The quadrant method's interobserver agreement ranged from 0.22 to 0.60, and that of the trisection method ranged from 0.17 to 0.57. Neither linear estimation method's reliability uniformly improved with the use of overlay templates. Intraobserver agreement without overlay templates ranged from 0.17 to 0.49 for the whole clock face, 0.11 to 0.47 for the half clock face, 0.01 to 0.66 for the quadrant method, and 0.20 to 0.57 for the trisection method. 
Use of overlay templates did not uniformly improve intraobserver agreement for any estimation method. There does not appear to be any advantage of using a half clock face or compass for estimating femoral tunnel position compared with a whole clock-face analogy. Visual reference aids appear to improve interobserver agreement (reliability) of circular analogies. The linear quadrant appears to be the most reliable method (fair to moderate agreement) for estimating femoral tunnel position without a visual aid for reference, but even better reliability, ranging from fair to good agreement, may be obtained by using the whole clock-face analogy with a visual aid. Increasing femoral tunnel position reliability may improve outcomes of ACL reconstruction surgery. Copyright © 2014 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Nitschke, J E; Nattrass, C L; Disler, P B; Chou, M J; Ooi, K T
1999-02-01
Repeated measures design for intra- and interrater reliability. To determine the intra- and interrater reliability of the lumbar spine range of motion measured with a dual inclinometer, and the thoracolumbar spine range of motion measured with a long-arm goniometer, as recommended in the American Medical Association Guides. The American Medical Association Guides (2nd and 4th editions) recommend using measurements of thoracolumbar and lumbar range of movement, respectively, to estimate the percentage of permanent impairment in patients with chronic low back pain. However, the reliability of this method of estimating impairment has not been determined. In all, 34 subjects participated in the study, 21 women with a mean age of 40.1 years (SD, +/- 11.1) and 13 men with a mean age of 47.7 years (SD, +/- 12.1). Measures of thoracolumbar flexion, extension, lateral flexion, and rotation were obtained with a long-arm goniometer. Lumbar flexion, extension, and lateral flexion were measured with a dual inclinometer. Measurements were taken by two examiners on one occasion and by one examiner on two occasions approximately 1 week apart. The results showed poor intra- and interrater reliability for all measurements taken with both instruments. Measurement error expressed in degrees showed that measurements taken by different raters exhibited systematic as well as random differences. As a result, subjects measured by two different examiners on the same day, with either instrument, could give impairment ratings ranging between 0% and 18% of the whole person (excluding rotation), in which percentage impairment is calculated using the average range of motion and the average systematic and random error in degrees for the group for each movement (flexion, extension, and lateral flexion). The poor reliability of the American Medical Association Guides' spinal range of motion model can result in marked variation in the percentage of whole-body impairment. 
These findings have implications for compensation bodies in Australia and other countries that use the American Medical Association Guides' procedure to estimate impairment in chronic low back pain patients.
Is there a single best estimator? Selection of home range estimators using area-under-the-curve
Walter, W. David; Onorato, Dave P.; Fischer, Justin W.
2015-01-01
Comparisons of fit of home range contours with locations collected would suggest that use of VHF technology is not as accurate as GPS technology to estimate size of home range for large mammals. Estimators of home range collected with GPS technology performed better than those estimated with VHF technology regardless of estimator used. Furthermore, estimators that incorporate a temporal component (third-generation estimators) appeared to be the most reliable regardless of whether kernel-based or Brownian bridge-based algorithms were used and in comparison to first- and second-generation estimators. We defined third-generation estimators of home range as any estimator that incorporates time, space, animal-specific parameters, and habitat. Such estimators would include movement-based kernel density, Brownian bridge movement models, and dynamic Brownian bridge movement models among others that have yet to be evaluated.
ERIC Educational Resources Information Center
Worrell, Frank C.; Mello, Zena R.
2007-01-01
In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
Global remote sensing of water-chlorophyll ratio in terrestrial plant leaves.
Kushida, Keiji
2012-10-01
I evaluated the use of global remote sensing techniques for estimating plant leaf chlorophyll a + b (C_ab; μg cm⁻²) and water (C_w; mg cm⁻²) concentrations as well as the ratio of C_w/C_ab with the PROSAIL model under possible distributions for leaf and soil spectra, leaf area index (LAI), canopy geometric structure, and leaf size. First, I estimated LAI from the normalized difference vegetation index. I found that, at LAI values <2, C_ab, C_w, and C_w/C_ab could not be reliably estimated. At LAI values >2, C_ab and C_w could be estimated for only restricted ranges of the canopy structure; however, the ratio of C_w/C_ab could be reliably estimated for a variety of possible canopy structures with coefficients of determination (R²) ranging from 0.56 to 0.90. The remote estimation of the C_w/C_ab ratio from satellites offers information on plant condition at a global scale.
Schäfer, Axel; Lüdtke, Kerstin; Breuel, Franziska; Gerloff, Nikolas; Knust, Maren; Kollitsch, Christian; Laukart, Alex; Matej, Laura; Müller, Antje; Schöttker-Königer, Thomas; Hall, Toby
2018-08-01
Headache is a common and costly health problem. Although pathogenesis of headache is heterogeneous, one reported contributing factor is dysfunction of the upper cervical spine. The flexion rotation test (FRT) is a commonly used diagnostic test to detect upper cervical movement impairment. The aim of this cross-sectional study was to investigate concurrent validity of detecting high cervical ROM impairment during the FRT by comparing measurements established by an ultrasound-based system (gold standard) with eyeball estimation. A secondary aim was to investigate intra-rater reliability of FRT ROM eyeball estimation. The examiner (6 years' experience) was blinded to the data from the ultrasound-based device and to the symptoms of the patients. The FRT result (positive or negative) was based on visual estimation of range of rotation less than 34° to either side. Concurrently, range of rotation was evaluated using the ultrasound-based device. A total of 43 subjects with headache (79% female), with a mean age of 35.05 years (SD 13.26), were included. According to the International Headache Society Classification, 23 subjects had migraine, 4 tension-type headache, and 16 multiple headache forms. Sensitivity and specificity were 0.96 and 0.89 for combined rotation, indicating good concurrent reliability. The area under the ROC curve was 0.95 (95% CI 0.91-0.98) for rotation to both sides. Intra-rater reliability for eyeball estimation was excellent with Fleiss Kappa 0.79 for right rotation and left rotation. The results of this study indicate that the FRT is a valid and reliable test to detect impairment of upper cervical ROM in patients with headache.
Storm surge and tidal range energy
NASA Astrophysics Data System (ADS)
Lewis, Matthew; Angeloudis, Athanasios; Robins, Peter; Evans, Paul; Neill, Simon
2017-04-01
The need to reduce carbon-based energy sources whilst increasing renewable energy forms has led to concerns of intermittency within a national electricity supply strategy. The regular rise and fall of the tide makes prediction almost entirely deterministic compared to other stochastic renewable energy forms; therefore, tidal range energy is often stated as a predictable and firm renewable energy source. Storm surge is the term used for the non-astronomical forcing of tidal elevation, and is synonymous with coastal flooding because positive storm surges can elevate water-levels above the height of coastal flood defences. We hypothesise that storm surges will affect the reliability of the tidal range energy resource, with negative surge events reducing the tidal range and, conversely, positive surge events increasing the available resource. Moreover, tide-surge interaction, which makes positive storm surges more likely to occur on a flooding tide, will reduce the annual tidal range energy resource estimate. Water-level data (2000-2012) at nine UK tide gauges, where the mean tidal amplitude is above 2.5 m and thus suitable for tidal-range energy development (e.g. Bristol Channel), were used to predict tidal range power with a 0D modelling approach. Storm surge affected the annual resource estimate by between -5% and +3%, due to inter-annual variability. Instantaneous power output was significantly affected (Normalised Root Mean Squared Error: 3%-8%, Scatter Index: 15%-41%) with spatial variability and variability due to operational strategy. We therefore find that storm surge affects the theoretical reliability of tidal range power, such that a prediction system may be required for any future electricity generation scenario that includes large amounts of tidal-range energy; however, annual resource estimation from astronomical tides alone appears sufficient for resource estimation.
Future work should investigate water-level uncertainties on the reliability and predictability of tidal range energy with 2D hydrodynamic models.
An interrater reliability study of the Braden scale in two nursing homes.
Kottner, Jan; Dassen, Theo
2008-10-01
Adequate risk assessment is essential in pressure ulcer prevention. Assessment scales were designed to support practitioners in identifying persons at pressure ulcer risk. The Braden scale is one of the most extensively studied risk assessment instruments, although the majority of studies focused on validity rather than reliability. The first aim was to measure the interrater reliability of the Braden scale and its individual items. The second aim was to study different statistical approaches regarding interrater reliability estimation. An interrater reliability study was conducted in two German nursing homes. Residents (n = 152) from 8 units were assessed twice. The raters were trained nurses with a work experience ranging from 0.5 to 30 years. Data were analysed using an overall percentage of agreement, weighted and unweighted kappa and the intraclass correlation coefficient. Differences between nurses rating the overall Braden score ranged from 0 up to 9 points. Interrater reliability expressed by the intraclass correlation coefficient ranged from 0.73 (95% CI 0.26 - 0.91) to 0.95 (95% CI 0.87 - 0.98). Calculated intraclass correlation coefficients for individual items ranged from 0.06 (95% CI -0.31 to 0.48) to 0.97 (95% CI 0.93-0.99) with the lowest values being measured for the items "sensory perception" and "nutrition". There was no association between work experience and the level of interrater reliability. With two exceptions, simple kappa-values were always lower than weighted kappa-values and intraclass correlation coefficients. Although the calculated interrater reliability coefficients for the total Braden score were high in some cases, several clinically relevant differences occurred between the nurses. Due to interrater reliability being very low for the items "sensory perception" and "nutrition", it is doubtful if their assessment contributes to any valid results. 
Weighted kappa or the intraclass correlation coefficient is the most appropriate interrater reliability estimate.
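The difference between simple (unweighted) and weighted kappa noted in this abstract can be seen in a small example. The sketch below computes Cohen's kappa for two raters with optional linear weights; the choice of linear weights, the function name, and the ratings are assumptions for illustration and are not taken from the study.

```python
def cohen_kappa(rater_1, rater_2, categories, weighted=False):
    """Cohen's kappa for two raters over ordered categories.

    With weighted=True, disagreements are penalised in proportion to how many
    categories apart the two ratings are (linear weights).
    """
    k = len(categories)
    if k < 2 or len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("need paired ratings and at least two categories")
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_1)

    # Joint proportions of (rater 1 category, rater 2 category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_1, rater_2):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed_disagreement = sum(
        penalty(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    chance_disagreement = sum(
        penalty(i, j) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    return 1.0 - observed_disagreement / chance_disagreement


if __name__ == "__main__":
    # Hypothetical item scores (1-4) given by two nurses to eight residents.
    nurse_a = [1, 2, 3, 4, 2, 3, 1, 4]
    nurse_b = [1, 3, 3, 4, 2, 2, 2, 4]
    print(cohen_kappa(nurse_a, nurse_b, [1, 2, 3, 4]))
    print(cohen_kappa(nurse_a, nurse_b, [1, 2, 3, 4], weighted=True))
```

In this example every disagreement is only one category apart, so the weighted coefficient is higher than the unweighted one. That mirrors the pattern reported above, where simple kappa values were usually lower than weighted kappa values, because unweighted kappa treats a near miss the same as a large discrepancy.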
How reliable is apparent age at death on cadavers?
Amadasi, Alberto; Merusi, Nicolò; Cattaneo, Cristina
2015-07-01
The assessment of age at death for identification purposes is a frequent and tough challenge for forensic pathologists and anthropologists. Too frequently, visual assessment of age is performed on well-preserved corpses, a method considered subjective and full of pitfalls, but whose level of inadequacy no one has yet tested or proven. This study consisted in the visual estimation of the age of 100 cadavers performed by a total of 37 observers among those usually attending the dissection room. Cadavers were of Caucasian ethnicity, well preserved, belonging to individuals who died of natural death. All the evaluations were performed prior to autopsy. Observers assessed the age with ranges of 5 and 10 years, indicating also the body part they mainly observed for each case. Globally, the 5-year range had an accuracy of 35%, increasing to 69% with the 10-year range. The highest accuracy was in the 31-60 age category (74.7% with the 10-year range), and the skin seemed to be the most reliable age parameter (71.5% of accuracy when observed), while the face was considered most frequently, in 92.4% of cases. A simple formula with the general "mean of averages" in the range given by the observers and related standard deviations was then developed; the average values with standard deviations of 4.62 lead to age estimation with ranges of some 20 years that seem to be fairly reliable and suitable, sometimes in alignment with classic anthropological methods, in the age estimation of well-preserved corpses.
Bryant, Jessica V; Zeng, Xingyuan; Hong, Xiaojiang; Chatterjee, Helen J; Turvey, Samuel T
2017-03-01
Conservation management requires an evidence-based approach, as uninformed decisions can signify the difference between species recovery and loss. The Hainan gibbon, the world's rarest ape, reportedly exploits the largest home range of any gibbon species, with these apparently large spatial requirements potentially limiting population recovery. However, previous home range assessments rarely reported survey methods, effort, or analytical approaches, hindering critical evaluation of estimate reliability. For extremely rare species where data collection is challenging, it also is unclear what impact such limitations have on estimating home range requirements. We re-evaluated Hainan gibbon spatial ecology using 75 hr of observations from 35 contact days over 93 field-days across dry (November 2010-February 2011) and wet (June 2011-September 2011) seasons. We calculated home range area for three social groups (N = 21 individuals) across the sampling period, seasonal estimates for one group (based on 24 days of observation; 12 days per season), and between-group home range overlap using multiple approaches (Minimum Convex Polygon, Kernel Density Estimation, Local Convex Hull, Brownian Bridge Movement Model), and assessed estimate reliability and representativeness using three approaches (Incremental Area Analysis, spatial concordance, and exclusion of expected holes). We estimated a yearly home range of 1-2 km², with 1.49 km² closest to the median of all estimates. Although Hainan gibbon spatial requirements are relatively large for gibbons, our new estimates are smaller than previous estimates used to explain the species' limited recovery, suggesting that habitat availability may be less important in limiting population growth. We argue that other ecological, genetic, and/or anthropogenic factors are more likely to constrain Hainan gibbon recovery, and conservation attention should focus on elucidating and managing these factors.
Re-evaluation reveals Hainan gibbon home range as c. 1-2 km². Hainan gibbon home range is, therefore, similar to other Nomascus gibbons. Limited data for extremely rare species does not necessarily prevent derivation of robust home range estimates. © 2016 Wiley Periodicals, Inc.
Uncertainties in obtaining high reliability from stress-strength models
NASA Technical Reports Server (NTRS)
Neal, Donald M.; Matthews, William T.; Vangel, Mark G.
1992-01-01
There has been recent interest in determining high statistical reliability in risk assessment of aircraft components. The potential consequences of incorrectly assuming a particular statistical distribution for stress or strength data used in obtaining the high reliability values are identified. The reliability is defined as the probability of the strength being greater than the stress over the range of stress values. This method is often referred to as the stress-strength model. A sensitivity analysis was performed involving a comparison of reliability results in order to evaluate the effects of assuming specific statistical distributions. Both known population distributions, and those that differed slightly from the known, were considered. Results showed substantial differences in reliability estimates even for almost nondetectable differences in the assumed distributions. These differences represent a potential problem in using the stress-strength model for high reliability computations, since in practice it is impossible to ever know the exact (population) distribution. An alternative reliability computation procedure is examined involving determination of a lower bound on the reliability values using extreme value distributions. This procedure reduces the possibility of obtaining nonconservative reliability estimates. Results indicated the method can provide conservative bounds when computing high reliability.
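A minimal sketch of the stress-strength computation described above, under the assumption (made here only for illustration) that stress and strength are independent and normally distributed. The function name and all numbers are hypothetical. The report varies the assumed distribution itself; this sketch only varies one parameter slightly, as a simpler stand-in for that kind of sensitivity.

```python
from statistics import NormalDist


def stress_strength_reliability(mu_strength, sd_strength, mu_stress, sd_stress):
    """P(strength > stress) for independent normal strength and stress."""
    margin = mu_strength - mu_stress
    spread = (sd_strength ** 2 + sd_stress ** 2) ** 0.5
    # strength - stress is normal with mean `margin` and s.d. `spread`.
    return NormalDist().cdf(margin / spread)


# Hypothetical values: a 10% change in the strength standard deviation.
r_base = stress_strength_reliability(100.0, 5.0, 70.0, 5.0)
r_wider = stress_strength_reliability(100.0, 5.5, 70.0, 5.0)
failure_ratio = (1.0 - r_wider) / (1.0 - r_base)
```

Both reliabilities are very close to 1, so the comparison is easier to read on the failure-probability scale, which is what `failure_ratio` reports.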
Reneman, M F; Roelofs, M; Schiphorst Preuper, H R
2017-07-01
To analyze test-retest reliability and agreement, and to explore the safety of neck functional capacity evaluation (Neck-FCE) tests in patients with chronic multifactorial neck pain. Test-retest; 2 FCE sessions were held with a 2-week interval. University-based outpatient rehabilitation center. Individuals (N=18; 14 women) with a mean age of 34 years. Not applicable. The Neck-FCE protocol consists of 6 tests: lifting waist to overhead (kg), 2-handed carrying (kg), overhead working (s), bending and overhead reaching (s), and repetitive side reaching (left and right) (s). Intraclass correlation coefficients (ICCs) and limits of agreement (LoA) were calculated. ICC point estimates between .75 and .90 were considered as good, and >.90 were considered as excellent reliability. ICC point estimates ranged between .39 and .96. Ratios of the LoA ranged between 32.0% and 56.5%. Mean ± SD numeric rating scale pain scores in the neck and shoulder 24 hours after the test were 6.7±2.6 and 6.3±3.0, respectively. Based on ICC point estimates and 95% confidence intervals, 3 tests had excellent reliability and 3 had poor reliability. LoA were substantial in all 6 tests. Safety was confirmed. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
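The limits of agreement mentioned above can be sketched as follows, assuming the conventional definition (mean difference plus or minus 1.96 standard deviations of the differences). The function name and data are hypothetical, and the abstract's "ratio of the LoA" is not reproduced because its exact definition is not given.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Lower and upper 95% limits of agreement for paired test-retest scores."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)                   # systematic change between sessions
    half_width = 1.96 * stdev(diffs)     # sample s.d. of the differences
    return bias - half_width, bias + half_width


# Hypothetical lifting results (kg) from two sessions.
session_1 = [10, 12, 14, 16]
session_2 = [11, 12, 15, 18]
lower, upper = limits_of_agreement(session_1, session_2)
```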
Examining the reliability of ADAS-Cog change scores.
Grochowalski, Joseph H; Liu, Ying; Siedlecki, Karen L
2016-09-01
The purpose of this study was to estimate and examine ways to improve the reliability of change scores on the Alzheimer's Disease Assessment Scale, Cognitive Subtest (ADAS-Cog). The sample, provided by the Alzheimer's Disease Neuroimaging Initiative, included individuals with Alzheimer's disease (AD) (n = 153) and individuals with mild cognitive impairment (MCI) (n = 352). All participants were administered the ADAS-Cog at baseline and 1 year, and change scores were calculated as the difference in scores over the 1-year period. Three types of change score reliabilities were estimated using multivariate generalizability. Two methods to increase change score reliability were evaluated: reweighting the subtests of the scale and adding more subtests. Reliability of ADAS-Cog change scores over 1 year was low for both the AD sample (ranging from .53 to .64) and the MCI sample (.39 to .61). Reweighting the change scores from the AD sample improved reliability (.68 to .76), but lengthening provided no useful improvement for either sample. The MCI change scores had low reliability, even with reweighting and adding additional subtests. The ADAS-Cog scores had low reliability for measuring change. Researchers using the ADAS-Cog should estimate and report reliability for their use of the change scores. The ADAS-Cog change scores are not recommended for assessment of meaningful clinical change.
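For intuition on why change scores tend to be less reliable than the scores they are built from, here is a sketch of the classical test theory formula for the reliability of a difference score. This is not the multivariate generalizability analysis used in the study; the function name and inputs are hypothetical.

```python
def change_score_reliability(sd_1, sd_2, rel_1, rel_2, corr_12):
    """Classical reliability of the difference (score 2 - score 1)."""
    covariance = sd_1 * sd_2 * corr_12
    true_variance = sd_1 ** 2 * rel_1 + sd_2 ** 2 * rel_2 - 2.0 * covariance
    observed_variance = sd_1 ** 2 + sd_2 ** 2 - 2.0 * covariance
    return true_variance / observed_variance


# Hypothetical case: each occasion has reliability .90, but the two
# occasions correlate .80, leaving little reliable variance in the change.
rel_change = change_score_reliability(1.0, 1.0, 0.90, 0.90, 0.80)
```

With equal variances and equal reliabilities the expression reduces to (reliability - correlation) / (1 - correlation), so highly correlated occasions yield unreliable change scores even when each occasion is measured well.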
On decentralized estimation. [for large linear systems]
NASA Technical Reports Server (NTRS)
Siljak, D. D.; Vukcevic, M. B.
1978-01-01
A multilevel scheme is proposed to construct decentralized estimators for large linear systems. The scheme is numerically attractive since only observability tests of low-order subsystems are required. Equally important is the fact that the constructed estimators are reliable under structural perturbations and can tolerate a wide range of nonlinearities in coupling among the subsystems.
ERIC Educational Resources Information Center
Van Norman, Ethan R.; Christ, Theodore J.; Zopluoglu, Cengiz
2013-01-01
This study examined the effect of baseline estimation on the quality of trend estimates derived from Curriculum Based Measurement of Oral Reading (CBM-R) progress monitoring data. The authors used a linear mixed effects regression (LMER) model to simulate progress monitoring data for schedules ranging from 6-20 weeks for datasets with high and low…
Reliability of self-reported antisocial personality disorder symptoms among substance abusers.
Cottler, L B; Compton, W M; Ridenour, T A; Ben Abdallah, A; Gallagher, T
1998-02-01
It is estimated that from 20 to 60% of substance abusers meet criteria for Antisocial Personality Disorder (APD). An accurate and reliable diagnosis is important because persons meeting criteria for APD, by the nature of their disorder, are less likely to change behaviors and more likely to relapse to both substance abuse and high risk behaviors. To understand more about the reliability of the disorder and symptoms of APD, the Diagnostic Interview Schedule Version III-R (DIS) was administered to 453 substance abusers ascertained from treatment programs and from the general population (St Louis Epidemiological Catchment Area (ECA) follow-up study). Estimates of the 1 week, test-retest reliability for the childhood conduct disorder criterion, the adult antisocial behavior criterion, and APD diagnosis fell in the good agreement range, as measured by kappa. The internal consistency of these DIS symptoms was adequate to acceptable. Individual DIS criteria designed to measure childhood conduct disorder ranged from fair to good for most items; reliability was slightly higher for the adult antisocial behavior symptom items. Finally, self-reported 'liars' were no more unreliable in their reports of their behaviors than 'non-liars'.
Comparing capacity value estimation techniques for photovoltaic solar power
Madaeni, Seyed Hossein; Sioshansi, Ramteen; Denholm, Paul
2012-09-28
In this paper, we estimate the capacity value of photovoltaic (PV) solar plants in the western U.S. Our results show that PV plants have capacity values that range between 52% and 93%, depending on location and sun-tracking capability. We further compare more robust but data- and computationally-intense reliability-based estimation techniques with simpler approximation methods. We show that if implemented properly, these techniques provide accurate approximations of reliability-based methods. Overall, methods that are based on the weighted capacity factor of the plant provide the most accurate estimate. As a result, we also examine the sensitivity of PV capacity value to the inclusion of sun-tracking systems.
Mejia, Amanda F; Nebel, Mary Beth; Barber, Anita D; Choe, Ann S; Pekar, James J; Caffo, Brian S; Lindquist, Martin A
2018-05-15
Reliability of subject-level resting-state functional connectivity (FC) is determined in part by the statistical techniques employed in its estimation. Methods that pool information across subjects to inform estimation of subject-level effects (e.g., Bayesian approaches) have been shown to enhance reliability of subject-level FC. However, fully Bayesian approaches are computationally demanding, while empirical Bayesian approaches typically rely on using repeated measures to estimate the variance components in the model. Here, we avoid the need for repeated measures by proposing a novel measurement error model for FC describing the different sources of variance and error, which we use to perform empirical Bayes shrinkage of subject-level FC towards the group average. In addition, since the traditional intra-class correlation coefficient (ICC) is inappropriate for biased estimates, we propose a new reliability measure denoted the mean squared error intra-class correlation coefficient (ICC_MSE) to properly assess the reliability of the resulting (biased) estimates. We apply the proposed techniques to test-retest resting-state fMRI data on 461 subjects from the Human Connectome Project to estimate connectivity between 100 regions identified through independent components analysis (ICA). We consider both correlation and partial correlation as the measure of FC and assess the benefit of shrinkage for each measure, as well as the effects of scan duration. We find that shrinkage estimates of subject-level FC exhibit substantially greater reliability than traditional estimates across various scan durations, even for the most reliable connections and regardless of connectivity measure. Additionally, we find partial correlation reliability to be highly sensitive to the choice of penalty term, and to be generally worse than that of full correlations except for certain connections and a narrow range of penalty values.
This suggests that the penalty needs to be chosen carefully when using partial correlations. Copyright © 2018. Published by Elsevier Inc.
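The general idea of shrinking subject-level estimates toward the group average can be sketched as below. This is a generic shrinkage formula with the variance components supplied by the caller (an assumption for illustration); it does not reproduce the paper's measurement error model for estimating those components, and the function name is hypothetical.

```python
from statistics import mean


def shrink_toward_group_mean(estimates, within_var, between_var):
    """Pull each subject's estimate toward the group mean.

    within_var:  assumed noise variance of a single subject's estimate
    between_var: assumed variance of true values across subjects
    """
    group_mean = mean(estimates)
    # Noisier subject-level estimates (large within_var) are shrunk more.
    weight = within_var / (within_var + between_var)
    return [weight * group_mean + (1.0 - weight) * x for x in estimates]


# Hypothetical connectivity estimates for two subjects, equal variances.
shrunk = shrink_toward_group_mean([0.2, 0.8], within_var=1.0, between_var=1.0)
```

Because the shrunk values are deliberately biased toward the group mean, a reliability measure that accounts for bias is needed to evaluate them, which is the motivation the abstract gives for its modified ICC.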
Jodice, Patrick G.R.; Garman, S.L.; Collopy, Michael W.
2001-01-01
Marbled Murrelets (Brachyramphus marmoratus) are threatened seabirds that nest in coastal old-growth coniferous forests throughout much of their breeding range. Currently, observer-based audio-visual surveys are conducted at inland forest sites during the breeding season primarily to determine nesting distribution and breeding status and are being used to estimate temporal or spatial trends in murrelet detections. Our goal was to assess the feasibility of using audio-visual survey data for such monitoring. We used an intensive field-based survey effort to record daily murrelet detections at seven survey stations in the Oregon Coast Range. We then used computer-aided resampling techniques to assess the effectiveness of twelve survey strategies with varying scheduling and a sampling intensity of 4-14 surveys per breeding season to estimate known means and SDs of murrelet detections. Most survey strategies we tested failed to provide estimates of detection means and SDs that were within ±20% of actual means and SDs. Estimates of daily detections were, however, frequently estimated to within ±50% of field data with sampling efforts of 14 days/breeding season. Additional resampling analyses with statistically generated detection data indicated that the temporal variability in detection data had a great effect on the reliability of the mean and SD estimates calculated from the twelve survey strategies, while the value of the mean had little effect. Effectiveness at estimating multi-year trends in detection data was similarly poor, indicating that audio-visual surveys might be reliably used to estimate annual declines in murrelet detections of the order of 50% per year.
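The resampling idea described above can be sketched as follows: draw a small number of survey days from a full season of daily counts many times, and record how often the subsample mean lands within a tolerance of the full-season mean. The sketch assumes simple random sampling without replacement, not the twelve scheduling strategies of the study; the function name, seed, and counts are hypothetical.

```python
import random
from statistics import mean


def fraction_within_tolerance(daily_counts, n_surveys, tolerance,
                              n_draws=2000, seed=0):
    """Share of random survey schedules whose mean is within
    +/- tolerance (as a proportion) of the full-season mean."""
    rng = random.Random(seed)            # fixed seed for repeatability
    true_mean = mean(daily_counts)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(daily_counts, n_surveys)
        if abs(mean(sample) - true_mean) <= tolerance * true_mean:
            hits += 1
    return hits / n_draws


# Hypothetical season of 30 daily detection counts.
season = list(range(1, 31))
share_4 = fraction_within_tolerance(season, n_surveys=4, tolerance=0.20)
share_14 = fraction_within_tolerance(season, n_surveys=14, tolerance=0.20)
```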
NASA Technical Reports Server (NTRS)
Juhasz, A. J.; Bloomfield, H. S.
1985-01-01
A combinatorial reliability approach is used to identify potential dynamic power conversion systems for space mission applications. A reliability and mass analysis is also performed, specifically for a 100 kWe nuclear Brayton power conversion system with parallel redundancy. Although this study is done for a reactor outlet temperature of 1100K, preliminary system mass estimates are also included for reactor outlet temperatures ranging up to 1500 K.
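As a rough illustration of how parallel redundancy enters a combinatorial reliability calculation, the sketch below computes the probability that at least k of n units are working. It assumes independent, identical units, which the report does not state; the function name and numbers are hypothetical.

```python
from math import comb


def k_of_n_reliability(k, n, unit_reliability):
    """Probability that at least k of n independent, identical units work."""
    r = unit_reliability
    return sum(comb(n, j) * r ** j * (1.0 - r) ** (n - j)
               for j in range(k, n + 1))


# Hypothetical conversion units with reliability 0.9 each.
single = k_of_n_reliability(1, 1, 0.9)      # no redundancy
one_of_two = k_of_n_reliability(1, 2, 0.9)  # two units in parallel, one needed
```

The redundancy gain comes at the cost of extra mass, which is why a study of this kind pairs the reliability analysis with a mass analysis.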
NASA Astrophysics Data System (ADS)
Kamiaka, Shoya; Benomar, Othman; Suto, Yasushi
2018-05-01
Advances in asteroseismology of solar-like stars now provide a unique method to estimate the stellar inclination i⋆. This enables evaluation of the spin-orbit angle of transiting planetary systems, in a complementary fashion to the Rossiter-McLaughlin effect, a well-established method to estimate the projected spin-orbit angle λ. Although the asteroseismic method has been broadly applied to the Kepler data, its reliability has yet to be assessed intensively. In this work, we evaluate the accuracy of i⋆ from asteroseismology of solar-like stars using 3000 simulated power spectra. We find that the low signal-to-noise ratio of the power spectra induces a systematic under-estimate (over-estimate) bias for stars with high (low) inclinations. We derive analytical criteria for the reliable asteroseismic estimate, which indicate that reliable measurements are possible in the range of 20° ≲ i⋆ ≲ 80° only for stars with high signal-to-noise ratio. We also analyse and measure the stellar inclination of 94 Kepler main-sequence solar-like stars, among which 33 are planetary hosts. According to our reliability criteria, a third of them (9 with planets, 22 without) have accurate stellar inclination. Comparison of our asteroseismic estimate of v sin i⋆ against spectroscopic measurements indicates that the latter suffers from a large uncertainty possibly due to the modeling of macro-turbulence, especially for stars with projected rotation speed v sin i⋆ ≲ 5 km/s. This reinforces earlier claims, and the stellar inclination estimated from the combination of measurements from spectroscopy and photometric variation for slowly rotating stars needs to be interpreted with caution.
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, the failures may arise from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is one of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation allows us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we have investigated the application of Bayesian estimation to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed using Bayesian and maximum likelihood analyses. The simulation results show that a change in the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. With perfect information on the prior distribution, the Bayesian estimation methods are better than those of maximum likelihood. The sensitivity analyses show some amount of sensitivity to shifts of the prior locations.
They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
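For the competing-risk setting described above, a minimal sketch of the system reliability under independent Weibull causes of failure: the system survives to time t only if no cause has produced a failure, so the cause-specific survival functions multiply. Parameter values and the function name are hypothetical, and the Bayesian and maximum likelihood estimation steps of the paper are not reproduced.

```python
import math


def system_reliability(t, causes):
    """Survival probability at time t with independent Weibull failure causes.

    causes: list of (shape, scale) pairs, one per cause of failure.
    """
    # Each cause contributes a cumulative hazard (t / scale) ** shape;
    # independence means the cumulative hazards add.
    total_hazard = sum((t / scale) ** shape for shape, scale in causes)
    return math.exp(-total_hazard)


# Hypothetical system with a wear-out cause and a random-failure cause.
causes = [(2.0, 1000.0), (1.0, 5000.0)]
r_500 = system_reliability(500.0, causes)
```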
Pneumothorax size measurements on digital chest radiographs: Intra- and inter- rater reliability.
Thelle, Andreas; Gjerdevik, Miriam; Grydeland, Thomas; Skorge, Trude D; Wentzel-Larsen, Tore; Bakke, Per S
2015-10-01
Detailed and reliable methods may be important for discussions on the importance of pneumothorax size in clinical decision-making. Rhea's method is widely used to estimate pneumothorax size in percent based on chest X-rays (CXRs) from three measure points. Choi's addendum is used for anteroposterior projections. The aim of this study was to examine the intrarater and interrater reliability of the Rhea and Choi method using digital CXR on ward-based PACS monitors. Three physicians examined a retrospective series of 80 digital CXRs showing pneumothorax, using Rhea and Choi's method, then repeated the measurements in a random order two weeks later. We used the analysis of variance technique by Eliasziw et al. to assess the intrarater and interrater reliability in altogether 480 estimations of pneumothorax size. Estimated pneumothorax sizes ranged between 5% and 100%. The intrarater reliability coefficient was 0.98 (95% one-sided lower-limit confidence interval 0.96), and the interrater reliability coefficient was 0.95 (95% one-sided lower-limit confidence interval 0.93). This study has shown that the Rhea and Choi method for calculating pneumothorax size has high intrarater and interrater reliability. These results are valid across gender, side of pneumothorax and whether the patient is diagnosed with primary or secondary pneumothorax. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Wei, Meifen; Alvarez, Alvin N.; Ku, Tsun-Yao; Russell, Daniel W.; Bonett, Douglas G.
2010-01-01
Four studies were conducted to develop and validate the Coping With Discrimination Scale (CDS). In Study 1, an exploratory factor analysis (N = 328) identified 5 factors: Education/Advocacy, Internalization, Drug and Alcohol Use, Resistance, and Detachment, with internal consistency reliability estimates ranging from 0.72 to 0.90. In Study 2, a…
Determination of output factors for small proton therapy fields.
Fontenot, Jonas D; Newhauser, Wayne D; Bloch, Charles; White, R Allen; Titt, Uwe; Starkschall, George
2007-02-01
Current protocols for the measurement of proton dose focus on measurements under reference conditions; methods for measuring dose under patient-specific conditions have not been standardized. In particular, it is unclear whether dose in patient-specific fields can be determined more reliably with or without the presence of the patient-specific range compensator. The aim of this study was to quantitatively assess the reliability of two methods for measuring dose per monitor unit (D/MU) values for small-field treatment portals: one with the range compensator and one without the range compensator. A Monte Carlo model of the Proton Therapy Center-Houston double-scattering nozzle was created, and estimates of D/MU values were obtained from 14 simulated treatments of a simple geometric patient model. Field-specific D/MU calibration measurements were simulated with a dosimeter in a water phantom with and without the range compensator. D/MU values from the simulated calibration measurements were compared with D/MU values from the corresponding treatment simulation in the patient model. To evaluate the reliability of the calibration measurements, six metrics and four figures of merit were defined to characterize accuracy, uncertainty, the standard deviations of accuracy and uncertainty, worst agreement, and maximum uncertainty. Measuring D/MU without the range compensator provided superior results for five of the six metrics and for all four figures of merit. The two techniques yielded different results primarily because of high-dose gradient regions introduced into the water phantom when the range compensator was present. Estimated uncertainties (approximately 1 mm) in the position of the dosimeter in these regions resulted in large uncertainties and high variability in D/MU values. When the range compensator was absent, these gradients were minimized and D/MU values were less sensitive to dosimeter positioning errors.
We conclude that measuring D/MU without the range compensator present provides more reliable results than measuring it with the range compensator in place.
Anthropogenic range contractions bias species climate change forecasts
NASA Astrophysics Data System (ADS)
Faurby, Søren; Araújo, Miguel B.
2018-03-01
Forecasts of species range shifts under climate change most often rely on ecological niche models, in which characterizations of climate suitability are highly contingent on the species range data used. If ranges are far from equilibrium under current environmental conditions, for instance owing to local extinctions in otherwise suitable areas, modelled environmental suitability can be truncated, leading to biased estimates of the effects of climate change. Here we examine the impact of such biases on estimated risks from climate change by comparing models of the distribution of North American mammals based on current ranges with ranges accounting for historical information on species ranges. We find that estimated future diversity, almost everywhere, except in coastal Alaska, is drastically underestimated unless the full historical distribution of the species is included in the models. Consequently forecasts of climate change impacts on biodiversity for many clades are unlikely to be reliable without acknowledging anthropogenic influences on contemporary ranges.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uresk, D.W.; Gilbert, R.O.; Rickard, W.H.
Big sagebrush (Artemisia tridentata) was subjected to a double sampling procedure to obtain reliable phytomass estimates for leaves, flowering stalks, live wood, dead wood, various combinations of the preceding, and total phytomass. Coefficients of determination (R²) between the independent variable and various phytomass categories ranged from 0.45 to 0.93. Total phytomass was approximately 69 ± 16 (± S.E.) g/m². Reductions in the variance of the phytomass estimates ranged from 33 percent to 80 percent using double sampling assuming optimum allocation. (auth)
Estimating respiratory rate from FBG optical sensors by using signal quality measurement.
Yongwei Zhu; Maniyeri, Jayachandran; Fook, Victor Foo Siang; Haihong Zhang
2015-08-01
Non-intrusiveness is one of the advantages of in-bed optical sensor devices for monitoring vital signs, including heart rate and respiratory rate. Estimating respiratory rate reliably using such sensors, however, is challenging, due to body movement, signal variation across subjects or body positions, etc. This paper presents a method for reliable respiratory rate estimation for FBG optical sensors by introducing signal quality estimation. The method estimates the quality of the signal waveform by detecting regularly repetitive patterns using the proposed spectrum and cepstrum analysis. Multiple window sizes are used to cater for a wide range of target respiratory rates. Furthermore, the readings of multiple sensors are fused to derive a final respiratory rate. Experiments with 12 subjects and 2 body positions were conducted using the polysomnography belt signal as ground truth. The results demonstrated the effectiveness of the method.
Reliability analysis in the Office of Safety, Environmental, and Mission Assurance (OSEMA)
NASA Astrophysics Data System (ADS)
Kauffmann, Paul J.
1994-12-01
The technical personnel in the SEMA office are working to provide the highest degree of value-added activities to their support of the NASA Langley Research Center mission. Management perceives that reliability analysis tools and an understanding of a comprehensive systems approach to reliability will be a foundation of this change process. Since the office is involved in a broad range of activities supporting space mission projects and operating activities (such as wind tunnels and facilities), it was not clear what reliability tools the office should be familiar with and how these tools could serve as a flexible knowledge base for organizational growth. Interviews and discussions with the office personnel (both technicians and engineers) revealed that job responsibilities ranged from incoming inspection to component or system analysis to safety and risk. It was apparent that a broad base in applied probability and reliability along with tools for practical application was required by the office. A series of ten class sessions with a duration of two hours each was organized and scheduled. Hand-out materials were developed and practical examples based on the type of work performed by the office personnel were included. Topics covered were: Reliability Systems - a broad system oriented approach to reliability; Probability Distributions - discrete and continuous distributions; Sampling and Confidence Intervals - random sampling and sampling plans; Data Analysis and Estimation - Model selection and parameter estimates; and Reliability Tools - block diagrams, fault trees, event trees, FMEA. In the future, this information will be used to review and assess existing equipment and processes from a reliability system perspective. An analysis of incoming materials sampling plans was also completed. This study looked at the issues associated with Mil Std 105 and changes for a zero defect acceptance sampling plan.
Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S
2007-01-01
Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. 
We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078
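The two agreement statistics reported above can be illustrated with a minimal Python sketch. It computes percent agreement and a weighted kappa for two raters who score the same items on an ordinal scale coded 0..k-1. The linear weighting scheme and the function names are assumptions made for illustration; the abstract does not state which weights the authors used.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Share of items (in percent) on which both raters gave the same category."""
    matches = sum(a == b for a, b in zip(rater_1, rater_2))
    return 100.0 * matches / len(rater_1)


def weighted_kappa(rater_1, rater_2, n_categories):
    """Linearly weighted kappa for ordinal categories coded 0..n_categories-1.

    Assumption: linear agreement weights w_ij = 1 - |i - j| / (k - 1).
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_1)
    k = n_categories

    def weight(i, j):
        return 1.0 - abs(i - j) / (k - 1)

    # Observed weighted agreement across all rated items.
    observed = sum(weight(a, b) for a, b in zip(rater_1, rater_2)) / n

    # Weighted agreement expected by chance from the two raters' marginals.
    marg_1, marg_2 = Counter(rater_1), Counter(rater_2)
    expected = sum(
        weight(i, j) * marg_1[i] * marg_2[j] for i in range(k) for j in range(k)
    ) / (n * n)

    return (observed - expected) / (1.0 - expected)
```

With only two categories the linear weights reduce to exact-match scoring, so the function then returns the ordinary unweighted Cohen's kappa.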
Stationary echo canceling in velocity estimation by time-domain cross-correlation.
Jensen, J A
1993-01-01
The application of stationary echo canceling to ultrasonic estimation of blood velocities using time-domain cross-correlation is investigated. Expressions are derived that show the influence from the echo canceler on the signals that enter the cross-correlation estimator. It is demonstrated that the filtration results in a velocity-dependent degradation of the signal-to-noise ratio. An analytic expression is given for the degradation for a realistic pulse. The probability of correct detection at low signal-to-noise ratios is influenced by signal-to-noise ratio, transducer bandwidth, center frequency, number of samples in the range gate, and number of A-lines employed in the estimation. Quantitative results calculated by a simple simulation program are given for the variation in probability from these parameters. An index reflecting the reliability of the estimate at hand can be calculated from the actual cross-correlation estimate by a simple formula and used in rejecting poor estimates or in displaying the reliability of the velocity estimated.
Junkes, Monica C.; Fraiz, Fabian C.; Sardenberg, Fernanda; Lee, Jessica Y.; Paiva, Saul M.; Ferreira, Fernanda M.
2015-01-01
Objective The aim of the present study was to translate, perform the cross-cultural adaptation of the Rapid Estimate of Adult Literacy in Dentistry to Brazilian-Portuguese language and test the reliability and validity of this version. Methods After translation and cross-cultural adaptation, interviews were conducted with 258 parents/caregivers of children in treatment at the pediatric dentistry clinics and health units in Curitiba, Brazil. To test the instrument's validity, the scores of Brazilian Rapid Estimate of Adult Literacy in Dentistry (BREALD-30) were compared based on occupation, monthly household income, educational attainment, general literacy, use of dental services and three dental outcomes. Results The BREALD-30 demonstrated good internal reliability. Cronbach’s alpha ranged from 0.88 to 0.89 when words were deleted individually. The analysis of test-retest reliability revealed excellent reproducibility (intraclass correlation coefficient = 0.983 and Kappa coefficient ranging from moderate to nearly perfect). In the bivariate analysis, BREALD-30 scores were significantly correlated with the level of general literacy (rs = 0.593) and income (rs = 0.327) and significantly associated with occupation, educational attainment, use of dental services, self-rated oral health and the respondent’s perception regarding his/her child's oral health. However, only the association between the BREALD-30 score and the respondent’s perception regarding his/her child's oral health remained significant in the multivariate analysis. Conclusion The BREALD-30 demonstrated satisfactory psychometric properties and is therefore applicable to adults in Brazil. PMID:26158724
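The internal-consistency figures quoted above (Cronbach's alpha, and alpha when items are deleted one at a time) follow a standard formula that can be sketched in a few lines of Python. The function names and the data layout (one list of scores per item) are assumptions for illustration only; this is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of per-item score lists, each holding one score per respondent.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn (needs at least 3 items)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]
```

A narrow spread of the leave-one-out values, such as the 0.88 to 0.89 range reported in the abstract, suggests that no single item drives the overall consistency.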
Validity and Reliability of a New Instrument to Measure Cancer-Related Fatigue in Adolescents
Hinds, Pamela S.; Hockenberry, Marilyn; Tong, Xin; Rai, Shesh N.; Gattuso, Jamie S.; McCarthy, Kathleen; Pui, Ching-Hon; Srivastava, Deo Kumar
2008-01-01
Adolescents undergoing treatment for cancer rate fatigue as their most prevalent and intense cancer- and treatment-related effect. Parents and staff rate it similarly. Despite its reported prevalence, intensity, and distressing effects, cancer-related fatigue in adolescents is not routinely assessed during or after cancer treatment. We contend that the insufficient clinical attention is primarily due to the lack of a reliable and valid self-report instrument with which adolescent cancer-related fatigue can be measured. Our aim was to determine the reliability and construct validity of a new instrument and its ability to measure change in fatigue over time. Initial testing involved 64 adolescents undergoing curative treatment of cancer who completed the Fatigue Scale-Adolescent (FS-A) at two to four key points in treatment in one of four studies. Internal consistency estimates ranged from 0.67 to 0.95. Validity estimates involving the FS-A with the parent version ranged from 0.13 to 0.76; estimates involving the staff version and the Reynolds Depression Scale were 0.27 and 0.87 respectively. Additional validity findings included significant fatigue differences between anemic and non-anemic patients (P = 0.042) and the emergence of four factors in an exploratory factor analysis. Findings further indicate that the FS-A can be used to measure change over time (t = 2.55, P <0.01). In summary, the FS-A has moderate to strong reliability and impressive validity coefficients for a new research instrument. PMID:17629669
García-Ramos, Amador; Haff, Guy Gregory; Pestaña-Melero, Francisco Luis; Pérez-Castilla, Alejandro; Rojas, Francisco Javier; Balsalobre-Fernández, Carlos; Jaric, Slobodan
2017-09-05
This study compared the concurrent validity and reliability of previously proposed generalized group equations for estimating the bench press (BP) one-repetition maximum (1RM) with the individualized load-velocity relationship modelled with a two-point method. Thirty men (BP 1RM relative to body mass: 1.08 ± 0.18 kg·kg⁻¹) performed two incremental loading tests in the concentric-only BP exercise and another two in the eccentric-concentric BP exercise to assess their actual 1RM and load-velocity relationships. A high velocity (≈ 1 m·s⁻¹) and a low velocity (≈ 0.5 m·s⁻¹) were selected from their load-velocity relationships to estimate the 1RM from generalized group equations and through an individual linear model obtained from the two velocities. The directly measured 1RM was highly correlated with all predicted 1RMs (r range: 0.847-0.977). The generalized group equations systematically underestimated the actual 1RM when predicted from the concentric-only BP (P <0.001; effect size [ES] range: 0.15-0.94), but overestimated it when predicted from the eccentric-concentric BP (P <0.001; ES range: 0.36-0.98). Conversely, a low systematic bias (range: -2.3-0.5 kg) and random errors (range: 3.0-3.8 kg), no heteroscedasticity of errors (r² range: 0.053-0.082), and trivial ES (range: -0.17-0.04) were observed when the prediction was based on the two-point method. Although all examined methods reported the 1RM with high reliability (CV≤5.1%; ICC≥0.89), the direct method was the most reliable (CV<2.0%; ICC≥0.98). The quick, fatigue-free, and practical two-point method was able to predict the BP 1RM with high reliability and practically perfect validity, and therefore we recommend its use over generalized group equations.
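The idea behind a two-point load-velocity estimate can be sketched as follows: draw a straight line through two (velocity, load) measurements and read off the load at the velocity expected for a 1RM attempt. The function name and the `velocity_at_1rm` parameter are hypothetical, and the abstract does not give the 1RM velocity the authors used, so the caller must supply it.

```python
def two_point_1rm(fast_point, slow_point, velocity_at_1rm):
    """Estimate 1RM (kg) from two (velocity in m/s, load in kg) measurements.

    Assumption: load is a linear function of mean velocity, and
    velocity_at_1rm is a value supplied by the user (not taken from the abstract).
    """
    v_fast, load_fast = fast_point
    v_slow, load_slow = slow_point
    # Slope of the individual load-velocity line (kg per m/s, normally negative).
    slope = (load_slow - load_fast) / (v_slow - v_fast)
    # Extrapolate the line to the velocity expected at the 1RM.
    return load_slow + slope * (velocity_at_1rm - v_slow)
```

For example, loads of 40 kg at 1.0 m/s and 70 kg at 0.5 m/s give a slope of -60 kg per m/s, so an assumed 1RM velocity of 0.2 m/s yields an estimate of about 88 kg.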
ERIC Educational Resources Information Center
Forde, David R.; Baron, Stephen W.; Scher, Christine D.; Stein, Murray B.
2012-01-01
This study examines the psychometric properties of the Childhood Trauma Questionnaire short form (CTQ-SF) with street youth who have run away or been expelled from their homes (N = 397). Internal reliability coefficients for the five clinical scales ranged from 0.65 to 0.95. Confirmatory Factor Analysis (CFA) was used to test the five-factor…
Joint Estimation of Source Range and Depth Using a Bottom-Deployed Vertical Line Array in Deep Water
Li, Hui; Yang, Kunde; Duan, Rui; Lei, Zhixiong
2017-01-01
This paper presents a joint estimation method of source range and depth using a bottom-deployed vertical line array (VLA). The method utilizes the information on the arrival angle of direct (D) path in space domain and the interference characteristic of D and surface-reflected (SR) paths in frequency domain. The former is related to a ray tracing technique to backpropagate the rays and produces an ambiguity surface of source range. The latter utilizes Lloyd’s mirror principle to obtain an ambiguity surface of source depth. The acoustic transmission duct is the well-known reliable acoustic path (RAP). The ambiguity surface of the combined estimation is a dimensionless ad hoc function. Numerical efficiency and experimental verification show that the proposed method is a good candidate for initial coarse estimation of source position. PMID:28590442
Accuracy and Reliability of the Klales et al. (2012) Morphoscopic Pelvic Sexing Method.
Lesciotto, Kate M; Doershuk, Lily J
2018-01-01
Klales et al. (2012) devised an ordinal scoring system for the morphoscopic pelvic traits described by Phenice (1969) and used for sex estimation of skeletal remains. The aim of this study was to test the accuracy and reliability of the Klales method using a large sample from the Hamann-Todd collection (n = 279). Two observers were blinded to sex, ancestry, and age and used the Klales et al. method to estimate the sex of each individual. Sex was correctly estimated for females with over 95% accuracy; however, the male allocation accuracy was approximately 50%. Weighted Cohen's kappa and intraclass correlation coefficient analysis for evaluating intra- and interobserver error showed moderate to substantial agreement for all traits. Although each trait can be reliably scored using the Klales method, low accuracy rates and high sex bias indicate better trait descriptions and visual guides are necessary to more accurately reflect the range of morphological variation. © 2017 American Academy of Forensic Sciences.
Zimowski, Michele; Moye, Jack; Dugoni, Bernard; Heim Viox, Melissa; Cohen, Hildie; Winfrey, Krishna
2017-02-01
The current study assessed whether home-based data collection by trained data collectors can produce high-quality physical measurement data in young children. The study assessed the quality of intra-examiner measurements of blood pressure, pulse rate and anthropometric dimensions using intra-examiner reliability and intra-examiner technical error of measurement (TEM). Non-clinical, primarily private homes of National Children's Study participants in twenty-two study locations across the USA. Children in four age groups: 5-7 months (n 91), 11-16 months (n 393), 23-28 months (n 1410) and 35-40 months (n 800). Absolute TEM ranged in value from 0·09 to 16·21, varying widely by age group and measure, as expected. Relative TEM spanned from 0·27 to 13·71 across age groups and physical measures. Reliabilities for anthropometric measurements by age group and measure ranged from 0·46 to >0·99 with most exceeding 0·90, suggesting that the large majority of anthropometric measures can be collected in a home-based setting on young children by trained data collectors. Reliabilities for blood pressure and pulse rate measurements by age group ranged from 0·21 to 0·74, implying these are less reliably measured with young children when taken in the data collection context described here. Reliability estimates >0·95 for weight, length, height, and thigh, waist and head circumference, and >0·90 for triceps and subscapular skinfolds, indicate that these measures can be collected in the field by trained data collectors without compromising data quality. These estimates can be used for interim evaluations of data collector training and measurement protocols.
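The intra-examiner statistics named above can be sketched with the commonly used anthropometric formulas: absolute TEM as the square root of the summed squared differences between paired measurements divided by 2n, relative TEM as a percentage of the mean, and reliability as 1 - TEM²/SD². Whether the study used exactly these definitions, in particular the pooled sample variance used here for SD², is an assumption, as are the function names.

```python
import math
from statistics import mean, variance


def absolute_tem(first, second):
    """Technical error of measurement for paired repeat measurements."""
    n = len(first)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * n))


def relative_tem(first, second):
    """TEM expressed as a percentage of the mean of all measurements."""
    return 100.0 * absolute_tem(first, second) / mean(list(first) + list(second))


def reliability_coefficient(first, second):
    """R = 1 - TEM^2 / SD^2, with SD^2 taken as the pooled sample variance (assumption)."""
    pooled_variance = variance(list(first) + list(second))
    return 1.0 - absolute_tem(first, second) ** 2 / pooled_variance
```

Because absolute TEM carries the units of the measurement, values for different measures (for example weight versus skinfolds) are not directly comparable, which is why relative TEM and the reliability coefficient are reported alongside it.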
Reliability and validity of a short form household food security scale in a Caribbean community.
Gulliford, Martin C; Mahabir, Deepak; Rocke, Brian
2004-06-16
We evaluated the reliability and validity of the short form household food security scale in a different setting from the one in which it was developed. The scale was interview administered to 531 subjects from 286 households in north central Trinidad in Trinidad and Tobago, West Indies. We evaluated the six items by fitting item response theory models to estimate item thresholds, estimating agreement among respondents in the same households and estimating the slope index of income-related inequality (SII) after adjusting for age, sex and ethnicity. Item-score correlations ranged from 0.52 to 0.79 and Cronbach's alpha was 0.87. Item responses gave within-household correlation coefficients ranging from 0.70 to 0.78. Estimated item thresholds (standard errors) from the Rasch model ranged from -2.027 (0.063) for the 'balanced meal' item to 2.251 (0.116) for the 'hungry' item. The 'balanced meal' item had the lowest threshold in each ethnic group even though there was evidence of differential functioning for this item by ethnicity. Relative thresholds of other items were generally consistent with US data. Estimation of the SII, comparing those at the bottom with those at the top of the income scale, gave relative odds for an affirmative response of 3.77 (95% confidence interval 1.40 to 10.2) for the lowest severity item, and 20.8 (2.67 to 162.5) for highest severity item. Food insecurity was associated with reduced consumption of green vegetables after additionally adjusting for income and education (0.52, 0.28 to 0.96). The household food security scale gives reliable and valid responses in this setting. Differing relative item thresholds compared with US data do not require alteration to the cut-points for classification of 'food insecurity without hunger' or 'food insecurity with hunger'. The data provide further evidence that re-evaluation of the 'balanced meal' item is required.
de la Cámara, Miguel Ángel; Higueras-Fresnillo, Sara; Martinez-Gomez, David; Veiga, Oscar L
2018-05-29
The inter-day reliability of the Intelligent Device for Energy Expenditure and Activity (IDEEA) has not been studied to date. The study purpose was to examine the inter-day variability and reliability of data collected with the IDEEA on two consecutive days, as well as to predict the number of days needed to provide a reliable estimate of several movement behaviors (walking and climbing stairs), non-movement behaviors (lying, reclining, sitting) and standing in older adults. The sample included 126 older adults (74 women) who wore the IDEEA for 48 h. Results showed low variability between the two days, and reliability ranged from moderate (ICC=0.34) to high (ICC=0.80) for most of the movement and non-movement behaviors analyzed. The Bland-Altman plots showed moderate-to-high agreement between days, and the Spearman-Brown formula estimated that between 1.2 and 9.1 days of monitoring with the IDEEA are needed to achieve ICCs≥0.70 in older adults, for sitting and climbing stairs, respectively.
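The Spearman-Brown prophecy formula mentioned above relates the reliability of a single monitoring day to the reliability of an average over several days. A minimal Python sketch follows; the function names are illustrative, and the abstract does not say which ICC form the authors entered into the formula, so the single-day ICC here is simply an input.

```python
def spearman_brown(single_day_icc, n_days):
    """Predicted reliability of the mean of n_days of monitoring."""
    return n_days * single_day_icc / (1 + (n_days - 1) * single_day_icc)


def days_needed(single_day_icc, target_icc=0.70):
    """Number of monitoring days needed to reach target_icc (may be fractional)."""
    return target_icc * (1 - single_day_icc) / (single_day_icc * (1 - target_icc))
```

For instance, a behavior with a single-day ICC of 0.50 would need about 2.3 days to reach an ICC of 0.70; in practice the result is rounded up to whole days.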
Test-retest reliability and practice effects of a rapid screen of mild traumatic brain injury.
De Monte, Veronica Eileen; Geffen, Gina Malke; Kwapil, Karleigh
2005-07-01
Test-retest reliabilities and practice effects of measures from the Rapid Screen of Concussion (RSC), in addition to the Digit Symbol Substitution Test (Digit Symbol), were examined. Twenty-five male participants were tested three times, with testing sessions scheduled a week apart. The test-retest reliability estimates for most measures were reasonably good, ranging from .79 to .97. An exception was the delayed word recall test, which had a reliability estimate of .66 for the first retest and .59 for the second retest. Practice effects were evident from Time 1 to Time 2 on the sentence comprehension and delayed recall subtests of the RSC, Digit Symbol and a composite score. There was also a practice effect of the same magnitude from Time 2 to Time 3 on Digit Symbol, delayed recall and the composite score. Statistics on measures for both the first and second retest intervals, with associated practice effects, are presented to enable the calculation of reliable change indices (RCI). The RCI may be used to assess any improvement in cognitive functioning after mild traumatic brain injury.
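A reliable change index of the kind referred to above is usually computed from the baseline standard deviation, the test-retest reliability and, optionally, the mean practice effect. The sketch below uses the widely cited Jacobson-Truax form with a practice-effect correction; treating this as the variant the authors intend is an assumption, and the function and parameter names are illustrative.

```python
import math


def standard_error_of_difference(sd_baseline, reliability):
    """SE of the difference between two scores, derived from the SEM."""
    sem = sd_baseline * math.sqrt(1 - reliability)  # standard error of measurement
    return math.sqrt(2) * sem


def reliable_change_index(score_1, score_2, sd_baseline, reliability,
                          practice_effect=0.0):
    """Change between two test sessions in units of SE of the difference.

    practice_effect is the mean retest gain expected without any true change.
    Values beyond about +/-1.96 are conventionally read as reliable change.
    """
    change = score_2 - score_1 - practice_effect
    return change / standard_error_of_difference(sd_baseline, reliability)
```

Lower test-retest reliability inflates the standard error of the difference, so a measure such as the delayed word recall test (.59 to .66) requires a larger raw change before the index crosses the threshold.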
The hockey-stick method to estimate evening dim light melatonin onset (DLMO) in humans.
Danilenko, Konstantin V; Verevkin, Evgeniy G; Antyufeev, Viktor S; Wirz-Justice, Anna; Cajochen, Christian
2014-04-01
The onset of melatonin secretion in the evening is the most reliable and most widely used index of circadian timing in humans. Saliva (or plasma) is usually sampled every 0.5-1 hours under dim-light conditions in the evening 5-6 hours before usual bedtime to assess the dim-light melatonin onset (DLMO). For many years, attempts have been made to find a reliable objective determination of melatonin onset time either by fixed or dynamic threshold approaches. The here-developed hockey-stick algorithm, used as an interactive computer-based approach, fits the evening melatonin profile by a piecewise linear-parabolic function represented as a straight line switching to the branch of a parabola. The switch point is considered to reliably estimate melatonin rise time. We applied the hockey-stick method to 109 half-hourly melatonin profiles to assess the DLMOs and compared these estimates to visual ratings from three experts in the field. The DLMOs of 103 profiles were considered to be clearly quantifiable. The hockey-stick DLMO estimates were on average 4 minutes earlier than the experts' estimates, with a range of -27 to +13 minutes; in 47% of the cases the difference fell within ±5 minutes, in 98% within -20 to +13 minutes. The raters' and hockey-stick estimates showed poor accordance with DLMOs defined by threshold methods. Thus, the hockey-stick algorithm is a reliable objective method to estimate melatonin rise time, which does not depend on a threshold value and is free from errors arising from differences in subjective circadian phase estimates. The method is available as a computerized program that can be easily used in research settings and clinical practice either for salivary or plasma melatonin values.
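The general idea of a piecewise linear-parabolic fit can be sketched in plain Python. The model below is an assumption made for illustration: a straight line up to a switch point t0, with a parabolic term c·(t − t0)² added after it, fitted by least squares for each candidate t0 taken from the sampling times. It is not the authors' interactive program, and every function name is hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_hockey_stick(times, values):
    """Return (switch_time, [intercept, slope, curvature]) with the smallest SSE.

    Candidate switch points are the sampling times, leaving at least two
    samples on each side so that the fit stays identifiable.
    """
    best = None
    for t0 in sorted(set(times))[1:-2]:
        rows = [[1.0, t, max(t - t0, 0.0) ** 2] for t in times]
        # Normal equations (A^T A) beta = A^T y for the three coefficients.
        ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
        coef = solve_linear(ata, aty)
        sse = sum(
            (y - sum(c * x for c, x in zip(coef, r))) ** 2
            for r, y in zip(rows, values)
        )
        if best is None or sse < best[0]:
            best = (sse, t0, coef)
    return best[1], best[2]
```

Restricting the switch point to the sampling grid keeps the sketch short; a finer search between samples would be needed to reach the minute-level resolution discussed in the abstract.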
Estimation of density of mongooses with capture-recapture and distance sampling
Corn, J.L.; Conroy, M.J.
1998-01-01
We captured mongooses (Herpestes javanicus) in live traps arranged in trapping webs in Antigua, West Indies, and used capture-recapture and distance sampling to estimate density. Distance estimation and program DISTANCE were used to provide estimates of density from the trapping-web data. Mean density based on trapping webs was 9.5 mongooses/ha (range, 5.9-10.2/ha); estimates had coefficients of variation ranging from 29.82-31.58% (x̄ = 30.46%). Mark-recapture models were used to estimate abundance, which was converted to density using estimates of effective trap area. Tests of model assumptions provided by CAPTURE indicated pronounced heterogeneity in capture probabilities and some indication of behavioral response and variation over time. Mean estimated density was 1.80 mongooses/ha (range, 1.37-2.15/ha) with estimated coefficients of variation of 4.68-11.92% (x̄ = 7.46%). Estimates of density based on mark-recapture data depended heavily on assumptions about animal home ranges; variances of densities also may be underestimated, leading to unrealistically narrow confidence intervals. Estimates based on trap webs require fewer assumptions, and estimated variances may be a more realistic representation of sampling variation. Because trap webs are established easily and provide adequate data for estimation in a few sample occasions, the method should be efficient and reliable for estimating densities of mongooses.
Bouman, Zita; Hendriks, Marc P H; Van Der Veld, William M; Aldenkamp, Albert P; Kessels, Roy P C
2016-06-01
The reliability and validity of three short forms of the Dutch version of the Wechsler Memory Scale-Fourth Edition (WMS-IV-NL) were evaluated in a mixed clinical sample of 235 patients. The short forms were based on the WMS-IV Flexible Approach, that is, a 3-subtest combination (Older Adult Battery for Adults) and two 2-subtest combinations (Logical Memory and Visual Reproduction and Logical Memory and Designs), which can be used to estimate the Immediate, Delayed, Auditory and Visual Memory Indices. All short forms showed good reliability coefficients. As expected, for adults (16-69 years old) the 3-subtest short form was consistently more accurate (predictive accuracy ranged from 73% to 100%) than both 2-subtest short forms (range = 61%-80%). Furthermore, for older adults (65-90 years old), the predictive accuracy of the 2-subtest short form ranged from 75% to 100%. These results suggest that caution is warranted when using the WMS-IV-NL Flexible Approach short forms to estimate all four indices. © The Author(s) 2015.
Fernández-Calderón, Fermín; Díaz-Batanero, Carmen; Rojas-Tejada, Antonio J; Castellanos-Ryan, Natalie; Lozano-Rojas, Óscar M
2017-07-14
The identification of different personality risk profiles for substance misuse is useful in preventing substance-related problems. This study aims to test the psychometric properties of a new version of the Substance Use Risk Profile Scale (SURPS) for Spanish college students. Cross-sectional study with 455 undergraduate students from four Spanish universities. A new version of the SURPS, adapted to the Spanish population, was administered with the Beck Hopelessness Scale, the UPPS-P Impulsive Behavior Scale, the State-Trait Anxiety Inventory (STAI) and the Alcohol Use Disorders Identification Test (AUDIT). Internal consistency reliability ranged between 0.652 and 0.806 for the four SURPS subscales, while reliability estimated by split-half coefficients varied from 0.686 to 0.829. The estimated test-retest reliability ranged between 0.733 and 0.868. The expected four-factor structure of the original scale was replicated. As evidence of convergent validity, we found that the SURPS subscales were significantly associated with other conceptually-relevant personality scales and significantly associated with alcohol use measures in theoretically-expected ways. This SURPS version may be a useful instrument for measuring personality traits related to vulnerability to substance use and misuse when targeting personality with preventive interventions.
Interplanetary laser ranging - an emerging technology for planetary science missions
NASA Astrophysics Data System (ADS)
Dirkx, D.; Vermeersen, L. L. A.
2012-09-01
Interplanetary laser ranging (ILR) is an emerging technology for very high accuracy distance determination between Earth-based stations and spacecraft or landers at interplanetary distances. It has evolved from laser ranging to Earth-orbiting satellites, modified with active laser transceiver systems at both ends of the link instead of the passive space-based retroreflectors. It has been estimated that this technology can be used for mm- to cm-level accuracy range determination at interplanetary distances [2, 7]. Work is being performed in the ESPaCE project [6] to evaluate in detail the potential and limitations of this technology by means of bottom-up laser link simulation, allowing for a reliable performance estimate from mission architecture and hardware characteristics.
NASA Astrophysics Data System (ADS)
Verkade, J. S.; Brown, J. D.; Davids, F.; Reggiani, P.; Weerts, A. H.
2017-12-01
Two statistical post-processing approaches for estimation of predictive hydrological uncertainty are compared: (i) 'dressing' of a deterministic forecast by adding a single, combined estimate of both hydrological and meteorological uncertainty and (ii) 'dressing' of an ensemble streamflow forecast by adding an estimate of hydrological uncertainty to each individual streamflow ensemble member. Both approaches aim to produce an estimate of the 'total uncertainty' that captures both the meteorological and hydrological uncertainties. They differ in the degree to which they make use of statistical post-processing techniques. In the 'lumped' approach, both sources of uncertainty are lumped by post-processing deterministic forecasts using their verifying observations. In the 'source-specific' approach, the meteorological uncertainties are estimated by an ensemble of weather forecasts. These ensemble members are routed through a hydrological model and a realization of the probability distribution of hydrological uncertainties (only) is then added to each ensemble member to arrive at an estimate of the total uncertainty. The techniques are applied to one location in the Meuse basin and three locations in the Rhine basin. Resulting forecasts are assessed for their reliability and sharpness, as well as compared in terms of multiple verification scores including the relative mean error, Brier Skill Score, Mean Continuous Ranked Probability Skill Score, Relative Operating Characteristic Score and Relative Economic Value. The dressed deterministic forecasts are generally more reliable than the dressed ensemble forecasts, but the latter are sharper. On balance, however, they show similar quality across a range of verification metrics, with the dressed ensembles coming out slightly better. Some additional analyses are suggested. 
Notably, these include statistical post-processing of the meteorological forecasts in order to increase their reliability, thus increasing the reliability of the streamflow forecasts produced with ensemble meteorological forcings.
Validation of the Regicor Short Physical Activity Questionnaire for the Adult Population
Molina, Luis; Sarmiento, Manuel; Peñafiel, Judith; Donaire, David; Garcia-Aymerich, Judith; Gomez, Miquel; Ble, Mireia; Ruiz, Sonia; Frances, Albert; Schröder, Helmut; Marrugat, Jaume; Elosua, Roberto
2017-01-01
Objective To develop and validate a short questionnaire to estimate physical activity (PA) practice and sedentary behavior for the adult population. Methods The short questionnaire was developed using data from a cross-sectional population-based survey (n = 6352) that included the Minnesota leisure-time PA questionnaire. Activities that explained a significant proportion of the variability of population PA practice were identified. Validation of the short questionnaire included a cross-sectional component to assess validity with respect to the data collected by accelerometers and a longitudinal component to assess reliability and sensitivity to detect changes (n = 114, aged 35 to 74 years). Results Six types of activities that accounted for 87% of population variability in PA estimated with the Minnesota questionnaire were selected. The short questionnaire estimates energy expenditure in total PA and by intensity (light, moderate, vigorous), and includes 2 questions about sedentary behavior and a question about occupational PA. The short questionnaire showed high reliability, with intraclass correlation coefficients ranging from 0.79 to 0.95. The Spearman correlation coefficients between estimated energy expenditure obtained with the questionnaire and the number of steps detected by the accelerometer were as follows: 0.36 for total PA, 0.40 for moderate intensity, and 0.26 for vigorous intensity. The questionnaire was sensitive to changes in moderate and vigorous PA (correlation coefficients ranging from 0.26 to 0.34). Conclusion The REGICOR short questionnaire is reliable, valid, and sensitive to changes in moderate and vigorous PA. This questionnaire could be used in daily clinical practice and epidemiological studies. PMID:28085886
Bosakova, Lucia; Kolarcik, Peter; Bobakova, Daniela; Sulcova, Martina; Van Dijk, Jitse P; Reijneveld, Sijmen A; Geckova, Andrea Madarasova
2016-04-01
Participation in organized activities is related to a range of positive outcomes, but the way such participation is measured has not been scrutinized. Test-retest reliability, an important indicator of a scale's reliability, has rarely been assessed and is entirely lacking for "The scale of participation in organized activities". This test-retest study is based on the Health Behaviour in School-aged Children study and is consistent with its methodology. We obtained data from 353 Czech (51.9 % boys) and 227 Slovak (52.9 % boys) primary school pupils, grades five and nine, who participated in this study in 2013. We used Cohen's kappa statistic and single measures of the intraclass correlation coefficient to estimate the test-retest reliability of all selected items in the sample, stratified by gender, age and country. We mostly observed a large correlation between the test and retest in all of the examined variables (κ ranged from 0.46 to 0.68). Test-retest reliability of the sum score of individual items showed substantial agreement (ICC = 0.64). The scale of participation in organized activities has an acceptable level of agreement, indicating good reliability.
NASA Technical Reports Server (NTRS)
French, V. (Principal Investigator)
1982-01-01
An evaluation was made of Thompson-Type models which use trend terms (as a surrogate for technology), meteorological variables based on monthly average temperature, and total precipitation to forecast and estimate corn yields in Iowa, Illinois, and Indiana. Pooled and unpooled Thompson-type models were compared. Neither was found to be consistently superior to the other. Yield reliability indicators show that the models are of limited use for large area yield estimation. The models are objective and consistent with scientific knowledge. Timely yield forecasts and estimates can be made during the growing season by using normals or long range weather forecasts. The models are not costly to operate and are easy to use and understand. The model standard errors of prediction do not provide a useful current measure of modeled yield reliability.
Tuck, L.K.; Pearson, Daniel K.; Cannon, M.R.; Dutton, DeAnn M.
2013-01-01
The Tongue River Member of the Tertiary Fort Union Formation is the primary source of groundwater in the Northern Cheyenne Indian Reservation in southeastern Montana. Coal beds within this formation generally contain the most laterally extensive aquifers in much of the reservation. The U.S. Geological Survey, in cooperation with the Northern Cheyenne Tribe, conducted a study to estimate the volume of water in five coal aquifers. This report presents estimates of the volume of water in five coal aquifers in the eastern and southern parts of the Northern Cheyenne Indian Reservation: the Canyon, Wall, Pawnee, Knobloch, and Flowers-Goodale coal beds in the Tongue River Member of the Tertiary Fort Union Formation. Only conservative estimates of the volume of water in these coal aquifers are presented. The volume of water in the Canyon coal was estimated to range from about 10,400 acre-feet (75 percent saturated) to 3,450 acre-feet (25 percent saturated). The volume of water in the Wall coal was estimated to range from about 14,200 acre-feet (100 percent saturated) to 3,560 acre-feet (25 percent saturated). The volume of water in the Pawnee coal was estimated to range from about 9,440 acre-feet (100 percent saturated) to 2,360 acre-feet (25 percent saturated). The volume of water in the Knobloch coal was estimated to range from about 38,700 acre-feet (100 percent saturated) to 9,680 acre-feet (25 percent saturated). The volume of water in the Flowers-Goodale coal was estimated to be about 35,800 acre-feet (100 percent saturated). Sufficient data are needed to accurately characterize coal-bed horizontal and vertical variability, which is highly complex both locally and regionally. Where data points are widely spaced, the reliability of estimates of the volume of coal beds is decreased. Additionally, reliable estimates of the volume of water in coal aquifers depend heavily on data about water levels and data about coal-aquifer characteristics. 
Because the data needed to define the volume of water were sparse, only conservative estimates of the volume of water in the five coal aquifers are presented in this report. These estimates need to be used with caution and mindfulness of the uncertainty associated with them.
Terada, Tasuku; Loehr, Sarah; Guigard, Emmanuel; McCargar, Linda J; Bell, Gordon J; Senior, Peter; Boulé, Normand G
2014-08-01
This study determined the test-retest reliability of a continuous glucose monitoring system (CGMS) (iPro™2; Medtronic, Northridge, CA) under standardized conditions in individuals with type 2 diabetes (T2D). Fourteen individuals with T2D spent two nonconsecutive days in a calorimetry unit. On both days, meals, medication, and exercise were standardized. Glucose concentrations were measured continuously by CGMS, from which daily mean glucose concentration (GLU(mean)), time spent in hyperglycemia (t(>10.0 mmol/L)), and meal, exercise, and nocturnal mean glucose concentrations, as well as glycemic variability (SD(w), percentage coefficient of variation [%cv(w)], mean amplitude of glycemic excursions [MAGEc, MAGE(ave), and MAGE(abs.gos)], and continuous overlapping net glycemic action [CONGA(n)]) were estimated. Absolute and relative reliabilities were investigated using coefficient of variation (CV) and intraclass correlation, respectively. Relative reliability ranged from 0.77 to 0.95 (P<0.05) for GLU(mean) and meal, exercise, and nocturnal glycemia with CV ranging from 3.9% to 11.7%. Despite significant relative reliability (R=0.93; P<0.01), t(>10.0 mmol/L) showed a larger CV (54.7%). Among the different glycemic variability measures, a significant between-day difference was observed in MAGEc, MAGE(ave), CONGA6, and CONGA12. The remaining measures (i.e., SD(w), %cv(w), MAGE(abs.gos), and CONGA1-4) indicated no between-day differences and significant relative reliability. In individuals with T2D, CGMS-estimated glycemic profiles were characterized by high relative and absolute reliability for both daily and shorter-term measurements as represented by GLU(mean) and meal, exercise, and nocturnal glycemia. Among the different methods to calculate glycemic variability, our results showed SD(w), %cv(w), MAGE(abs.gos), and CONGA(n) with n ≤ 4 were reliable measures. These results suggest the usefulness of CGMS in clinical trials utilizing repeated measures.
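The absolute-reliability figures above are coefficients of variation between two test days. The abstract does not state which CV formula was used, so the following is only a minimal sketch of one common choice, the root-mean-square within-subject CV for paired measurements. The function name and the glucose values are hypothetical.

```python
from math import sqrt
from statistics import mean


def within_subject_cv(day1, day2):
    """Root-mean-square within-subject CV (%) for paired day-1/day-2 values.

    Assumption: this is one common definition of test-retest CV; the
    source abstract does not say which definition its authors used.
    """
    if len(day1) != len(day2) or not day1:
        raise ValueError("need two equally long, non-empty sequences")
    squared_cvs = []
    for a, b in zip(day1, day2):
        subject_mean = (a + b) / 2
        subject_sd = abs(a - b) / sqrt(2)  # sample SD of exactly two values
        squared_cvs.append((subject_sd / subject_mean) ** 2)
    return 100 * sqrt(mean(squared_cvs))


# Hypothetical daily mean glucose (mmol/L) for four people on two days.
day_one = [7.8, 9.1, 6.5, 8.4]
day_two = [8.1, 8.7, 6.9, 8.0]
print(f"within-subject CV: {within_subject_cv(day_one, day_two):.1f}%")
```

A measure can combine a high intraclass correlation with a large CV, as reported above for time in hyperglycemia, because the correlation reflects how well individuals keep their rank order while the CV reflects the size of day-to-day change relative to the mean.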
Manual and automatic locomotion scoring systems in dairy cows: a review.
Schlageter-Tello, Andrés; Bokkers, Eddie A M; Koerkamp, Peter W G Groot; Van Hertem, Tom; Viazzi, Stefano; Romanini, Carlos E B; Halachmi, Ilan; Bahr, Claudia; Berckmans, Daniël; Lokhorst, Kees
2014-09-01
The objective of this review was to describe, compare and evaluate agreement, reliability, and validity of manual and automatic locomotion scoring systems (MLSSs and ALSSs, respectively) used in dairy cattle lameness research. There are many different types of MLSSs and ALSSs. Twenty-five MLSSs were found in 244 articles. MLSSs use different types of scale (ordinal or continuous) and different gait and posture traits need to be observed. The most used MLSS (used in 28% of the references) is based on asymmetric gait, reluctance to bear weight, and arched back, and is scored on a five-level scale. Fifteen ALSSs were found that could be categorized according to three approaches: (a) the kinetic approach measures forces involved in locomotion, (b) the kinematic approach measures time and distance of variables associated to limb movement and some specific posture variables, and (c) the indirect approach uses behavioural variables or production variables as indicators for impaired locomotion. Agreement and reliability estimates were scarcely reported in articles related to MLSSs. When reported, inappropriate statistical methods such as PABAK and Pearson and Spearman correlation coefficients were commonly used. Some of the most frequently used MLSSs were poorly evaluated for agreement and reliability. Agreement and reliability estimates for the original four-, five- or nine-level MLSS, expressed in percentage of agreement, kappa and weighted kappa, showed large ranges among and sometimes also within articles. After the transformation into a two-level scale, agreement and reliability estimates showed acceptable estimates (percentage of agreement ≥ 75%; kappa and weighted kappa ≥ 0.6), but still estimates showed a large variation between articles. Agreement and reliability estimates for ALSSs were not reported in any article. Several ALSSs use MLSSs as a reference for model calibration and validation. 
However, varying agreement and reliability estimates of MLSSs make a clear definition of a lameness case difficult, and thus affect the validity of ALSSs. MLSSs and ALSSs showed limited validity for hoof lesion detection and pain assessment. The utilization of MLSSs and ALSSs should aim to the prevention and efficient management of conditions that induce impaired locomotion. Long-term studies comparing MLSSs and ALSSs while applying various strategies to detect and control unfavourable conditions leading to impaired locomotion are required to determine the usefulness of MLSSs and ALSSs for securing optimal production and animal welfare in practice. Copyright © 2014 Elsevier B.V. All rights reserved.
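Percentage of agreement and kappa, the two statistics the review reports most often, can be computed directly from two observers' scores. The sketch below is illustrative only: the function names are hypothetical, and the two-level scores (1 = lame, 0 = not lame) are invented rather than taken from any reviewed article.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Percentage of cases in which both raters gave the same score."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two equally long, non-empty sequences")
    return 100 * sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def cohens_kappa(rater1, rater2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater1)
    p_observed = percent_agreement(rater1, rater2) / 100
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(
        counts1[c] * counts2[c] for c in set(counts1) | set(counts2)
    ) / n**2
    if p_expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical two-level locomotion scores for ten cows.
observer_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
observer_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(percent_agreement(observer_a, observer_b))
print(round(cohens_kappa(observer_a, observer_b), 2))
```

Because kappa subtracts chance agreement, the same percentage of agreement can correspond to quite different kappa values depending on how common lameness is in the scored herd.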
Tadakamadla, Santosh Kumar; Quadri, Mir Faeq Ali; Pakpour, Amir H; Zailai, Abdulaziz M; Sayed, Mohammed E; Mashyakhy, Mohammed; Inamdar, Aadil S; Tadakamadla, Jyothi
2014-09-29
To evaluate the reliability and validity of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-30) in Saudi Arabia. A convenience sample of 200 subjects was approached, of which 177 agreed to participate giving a response rate of 88.5%. Rapid Estimate of Adult Literacy in Dentistry (REALD-99) was translated into Arabic to prepare the longer and shorter versions of Arabic Rapid Estimate of Adult Literacy in Dentistry (AREALD-99 and AREALD-30). Each participant was provided with AREALD-99, which also includes words from AREALD-30. A questionnaire containing socio-behavioral information and Arabic Oral Health Impact Profile (A-OHIP-14) was also administered. Reliability of the AREALD-30 was assessed by re-administering it to 20 subjects after two weeks. Convergent and predictive validity of AREALD-30 was evaluated by its correlations with AREALD-99 and self-perceived oral health status, dental visiting habits and A-OHIP-14 respectively. Discriminant validity was assessed in relation to the educational level while construct validity was evaluated by confirmatory factor analysis (CFA). Reliability of AREALD-30 was excellent with intraclass correlation coefficient of 0.99. It exhibited good convergent and discriminant validity but poor predictive validity. CFA showed presence of two factors and infit mean-square statistics for AREALD-30 were all within the desired range of 0.50 - 2.0 in Rasch analysis. AREALD-30 showed excellent reliability, good convergent and concurrent validity, but failed to predict the differences between the subjects categorized based on their oral health outcomes.
Jaeschke, Lina; Steinbrecher, Astrid; Jeran, Stephanie; Konigorski, Stefan; Pischon, Tobias
2018-04-20
24 h-accelerometry is now used to objectively assess physical activity (PA) in many observational studies like the German National Cohort; however, PA variability, observational time needed to estimate habitual PA, and reliability are unclear. We assessed 24 h-PA of 50 participants using triaxial accelerometers (ActiGraph GT3X+) over 2 weeks. Variability of overall PA and different PA intensities (time in inactivity and in low intensity, moderate, vigorous, and very vigorous PA) between days of assessment or days of the week was quantified using linear mixed-effects and random effects models. We calculated the required number of days to estimate PA, and calculated PA reliability using intraclass correlation coefficients. Between- and within-person variance accounted for 34.4-45.5% and 54.5-65.6%, respectively, of total variance in overall PA and PA intensities over the 2 weeks. Overall PA and times in low intensity, moderate, and vigorous PA decreased slightly over the first 3 days of assessment. Overall PA (p = 0.03), time in inactivity (p = 0.003), in low intensity PA (p = 0.001), in moderate PA (p = 0.02), and in vigorous PA (p = 0.04) slightly differed between days of the week, being highest on Wednesday and Friday and lowest on Sunday and Monday, with apparent differences between Saturday and Sunday. In nested random models, the day of the week accounted for < 19% of total variance in the PA parameters. On average, the required number of days to estimate habitual PA was around 1 week, being 7 for overall PA and ranging from 6 to 9 for the PA intensities. Week-to-week reliability was good (intraclass correlation coefficients, range, 0.68-0.82). Individual PA, as assessed using 24 h-accelerometry, is highly variable between days, but the day of assessment or the day of the week explain only small parts of this variance. Our data indicate that 1 week of assessment is necessary for reliable estimation of habitual PA.
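The abstract does not say how the required number of days was derived or which target reliability was used. One common approach, shown here purely as a sketch, applies the Spearman-Brown prophecy formula to the single-day intraclass correlation, that is, the between-person share of total variance. The function names and the default target of 0.8 are assumptions, not details from the study.

```python
def single_day_icc(between_var, within_var):
    """Share of total variance that lies between persons (single-day ICC)."""
    return between_var / (between_var + within_var)


def days_needed(icc_single, target=0.8):
    """Spearman-Brown estimate of days needed to reach a target reliability.

    Assumption: target=0.8 is a conventional choice, not a value taken
    from the abstract. The result is a float; in practice round it up.
    """
    if not 0 < icc_single < 1 or not 0 < target < 1:
        raise ValueError("icc_single and target must lie strictly between 0 and 1")
    return (target / (1 - target)) * ((1 - icc_single) / icc_single)


# Between-person variance shares at the two ends of the reported range.
for share in (0.344, 0.455):
    icc = single_day_icc(share, 1 - share)
    print(f"single-day ICC {icc:.3f}: about {days_needed(icc):.1f} days")
```

Under these assumptions the lower the between-person share of variance, the more days are needed, which is in line with the abstract's conclusion that roughly one week of monitoring is required.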
Link-state-estimation-based transmission power control in wireless body area networks.
Kim, Seungku; Eom, Doo-Seop
2014-07-01
This paper presents a novel transmission power control protocol to extend the lifetime of sensor nodes and to increase the link reliability in wireless body area networks (WBANs). We first experimentally investigate the properties of the link states using the received signal strength indicator (RSSI). We then propose a practical transmission power control protocol based on both short- and long-term link-state estimations. Both the short- and long-term link-state estimations enable the transceiver to adapt the transmission power level and target the RSSI threshold range, respectively, to simultaneously satisfy the requirements of energy efficiency and link reliability. Finally, the performance of the proposed protocol is experimentally evaluated in two experimental scenarios-body posture change and dynamic body motion-and compared with the typical WBAN transmission power control protocols, a real-time reactive scheme, and a dynamic postural position inference mechanism. From the experimental results, it is found that the proposed protocol increases the lifetime of the sensor nodes by a maximum of 9.86% and enhances the link reliability by reducing the packet loss by a maximum of 3.02%.
van Duijvenbode, Neomi; Didden, Robert; van den Hazel, Teunis; Engels, Rutger C M E
2016-01-01
To investigate the reliability and validity of a Wechsler Abbreviated Scale of Intelligence-based Wechsler Adult Intelligence Scale - third edition (WAIS-III) short form (SF) in a sample of individuals with mild to borderline intellectual disability (MBID) (N = 117; M(IQ) = 71.34; SD(IQ) = 8.00, range: 52-85). A full WAIS-III was administered as a standard procedure in the diagnostic process. The results indicate an excellent reliability (r = 0.96) and a strong, positive correlation with the full WAIS-III (r = 0.89). The SF correctly identified ID in general and the correct IQ category more specifically in the majority of cases (97.4% and 86.3% of cases, respectively). In addition, 82.1% of the full scale IQ (FSIQ) estimates fell within the 95% confidence interval of the original score. We conclude that the SF is a reliable and valid measure to estimate FSIQ. It can be used in clinical and research settings when global estimates of intelligence are sufficient.
Lee, Chin-Pang; Chiu, Yu-Wen; Chu, Chun-Lin; Chen, Yu; Jiang, Kun-Hao; Chen, Jiun-Liang; Chen, Ching-Yen
2016-12-01
The aging males' symptoms (AMS) scale is an instrument used to determine the health-related quality of life in adult and elderly men. The purpose of this study was to synthesize internal consistency (Cronbach's alpha) and test-retest reliability for the AMS scale and its three subscales. Of the 123 studies reviewed, 12 provided alpha coefficients which were then used in the meta-analyses of internal consistency. Seven of the 12 included studies provided test-retest coefficients, and these were used in the meta-analyses of test-retest reliability. The AMS scale had excellent internal consistency [α = 0.89 (95% CI 0.88-0.90)]; the mean alpha estimates across the AMS subscales ranged from 0.79 to 0.82. The AMS scale also had good test-retest reliability [r = 0.85 (95% CI 0.82-0.88)]; the test-retest reliability coefficients of the AMS subscales ranged from 0.76 to 0.83. There was significant heterogeneity among the included studies. The AMS scale and the three subscales had fairly good internal consistency and test-retest reliability. Future psychometric studies of the AMS scale should report important characteristics of the participants, details of item scores, and test-retest reliability.
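Each alpha coefficient pooled in this meta-analysis is a Cronbach's alpha computed from item-level responses in a primary study. As a minimal sketch of that underlying calculation (not of the meta-analytic pooling itself), the standard formula can be written as follows; the function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    n_items = len(responses[0])
    if len(responses) < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    item_variances = [variance(item) for item in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to three items scored 1-5.
answers = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
print(round(cronbach_alpha(answers), 2))
```

Alpha rises with the number of items as well as with their intercorrelation, which is one reason a full scale can show higher internal consistency than its shorter subscales.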
Davison, Kirsten K.; Austin, S. Bryn; Giles, Catherine; Cradock, Angie L.; Lee, Rebekka M.; Gortmaker, Steven L.
2017-01-01
Interest in evaluating and improving children’s diets in afterschool settings has grown, necessitating the development of feasible yet valid measures for capturing children’s intake in such settings. This study’s purpose was to test the criterion validity and cost of three unobtrusive visual estimation methods compared to a plate-weighing method: direct on-site observation using a 4-category rating scale and off-site rating of digital photographs taken on-site using 4- and 10-category scales. Participants were 111 children in grades 1–6 attending four afterschool programs in Boston, MA in December 2011. Researchers observed and photographed 174 total snack meals consumed across two days at each program. Visual estimates of consumption were compared to weighed estimates (the criterion measure) using intra-class correlations. All three methods were highly correlated with the criterion measure, ranging from 0.92–0.94 for total calories consumed, 0.86–0.94 for consumption of pre-packaged beverages, 0.90–0.93 for consumption of fruits/vegetables, and 0.92–0.96 for consumption of grains. For water, which was not pre-portioned, coefficients ranged from 0.47–0.52. The photographic methods also demonstrated excellent inter-rater reliability: 0.84–0.92 for the 4-point and 0.92–0.95 for the 10-point scale. The costs of the methods for estimating intake ranged from $0.62 per observation for the on-site direct visual method to $0.95 per observation for the criterion measure. This study demonstrates that feasible, inexpensive methods can validly and reliably measure children’s dietary intake in afterschool settings. Improving precision in measures of children’s dietary intake can reduce the likelihood of spurious or null findings in future studies. PMID:25596895
Helsel, Dennis R.; Gilliom, Robert J.
1986-01-01
Estimates of distributional parameters (mean, standard deviation, median, interquartile range) are often desired for data sets containing censored observations. Eight methods for estimating these parameters have been evaluated by R. J. Gilliom and D. R. Helsel (this issue) using Monte Carlo simulations. To verify those findings, the same methods are now applied to actual water quality data. The best method (lowest root-mean-squared error (rmse)) over all parameters, sample sizes, and censoring levels is log probability regression (LR), the method found best in the Monte Carlo simulations. Best methods for estimating moment or percentile parameters separately are also identical to the simulations. Reliability of these estimates can be expressed as confidence intervals using rmse and bias values taken from the simulation results. Finally, a new simulation study shows that best methods for estimating uncensored sample statistics from censored data sets are identical to those for estimating population parameters. Thus this study and the companion study by Gilliom and Helsel form the basis for making the best possible estimates of either population parameters or sample statistics from censored water quality data, and for assessments of their reliability.
NASA Astrophysics Data System (ADS)
Saini, K. K.; Sehgal, R. K.; Sethi, B. L.
2008-10-01
In this paper, the major reliability estimators are analyzed and their comparative results are discussed. Their strengths and weaknesses are evaluated in this case study. Each of the reliability estimators has certain advantages and disadvantages. Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation. However, it requires multiple raters or observers. As an alternative, you could look at the correlation of ratings of the same single observer repeated on two different occasions. Each of the reliability estimators will give a different value for reliability. In general, the test-retest and inter-rater reliability estimates will be lower in value than the parallel forms and internal consistency ones because they involve measuring at different times or with different raters. Reliability estimates are often used in statistical analyses of quasi-experimental designs.
Weafer, Jessica; Baggott, Matthew J; de Wit, Harriet
2013-12-01
Behavioral measures of impulsivity are widely used in substance abuse research, yet relatively little attention has been devoted to establishing their psychometric properties, especially their reliability over repeated administration. The current study examined the test-retest reliability of a battery of standardized behavioral impulsivity tasks, including measures of impulsive choice (i.e., delay discounting, probability discounting, and the Balloon Analogue Risk Task), impulsive action (i.e., the stop signal task, the go/no-go task, and commission errors on the continuous performance task), and inattention (i.e., attention lapses on a simple reaction time task and omission errors on the continuous performance task). Healthy adults (n = 128) performed the battery on two separate occasions. Reliability estimates for the individual tasks ranged from moderate to high, with Pearson correlations within the specific impulsivity domains as follows: impulsive choice (r range: .76-.89, ps < .001); impulsive action (r range: .65-.73, ps < .001); and inattention (r range: .38-.42, ps < .001). Additionally, the influence of day-to-day fluctuations in mood, as measured by the Profile of Mood States, was assessed in relation to variability in performance on each of the behavioral tasks. Change in performance on the delay discounting task was significantly associated with change in positive mood and arousal. No other behavioral measures were significantly associated with mood. In sum, the current analysis demonstrates that behavioral measures of impulsivity are reliable measures and thus can be confidently used to assess various facets of impulsivity as intermediate phenotypes for drug abuse.
Lower Bounds to the Reliabilities of Factor Score Estimators.
Hessen, David J
2016-10-06
Under the general common factor model, the reliabilities of factor score estimators might be of more interest than the reliability of the total score (the unweighted sum of item scores). In this paper, lower bounds to the reliabilities of Thurstone's factor score estimators, Bartlett's factor score estimators, and McDonald's factor score estimators are derived and conditions are given under which these lower bounds are equal. The relative performance of the derived lower bounds is studied using classic example data sets. The results show that estimates of the lower bounds to the reliabilities of Thurstone's factor score estimators are greater than or equal to the estimates of the lower bounds to the reliabilities of Bartlett's and McDonald's factor score estimators.
Palta, Mari; Chen, Han-Yang; Kaplan, Robert M.; Feeny, David; Cherepanov, Dasha; Fryback, Dennis
2011-01-01
Background Standard errors of measurement (SEMs) of health related quality of life (HRQoL) indexes are not well characterized. SEM is needed to estimate responsiveness statistics and provides guidance on using indexes on the individual and group level. SEM is also a component of reliability. Purpose To estimate SEM of five HRQoL indexes. Design The National Health Measurement Study (NHMS) was a population based telephone survey. The Clinical Outcomes and Measurement of Health Study (COMHS) provided repeated measures 1 and 6 months post cataract surgery. Subjects 3844 randomly selected adults from the non-institutionalized population 35 to 89 years old in the contiguous United States and 265 cataract patients. Measurements The SF6-36v2™, QWB-SA, EQ-5D, HUI2 and HUI3 were included. An item-response theory (IRT) approach captured joint variation in indexes into a composite construct of health (theta). We estimated: (1) the test-retest standard deviation (SEM-TR) from COMHS, (2) the structural standard deviation (SEM-S) around the composite construct from NHMS and (3) corresponding reliability coefficients. Results SEM-TR was 0.068 (SF-6D), 0.087 (QWB-SA), 0.093 (EQ-5D), 0.100 (HUI2) and 0.134 (HUI3), while SEM-S was 0.071, 0.094, 0.084, 0.074 and 0.117, respectively. These translate into reliability coefficients for SF-6D: 0.66 (COMHS) and 0.71 (NHMS), for QWB: 0.59 and 0.64, for EQ-5D: 0.61 and 0.70 for HUI2: 0.64 and 0.80, and for HUI3: 0.75 and 0.77, respectively. The SEM varied considerably across levels of health, especially for HUI2, HUI3 and EQ-5D, and was strongly influenced by ceiling effects. Limitations Repeated measures were five months apart and estimated theta contain measurement error. Conclusions The two types of SEM are similar and substantial for all the indexes, and vary across the range of health. PMID:20935280
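The step from a standard error of measurement to a reliability coefficient can be illustrated with the classical test theory relation reliability = 1 − SEM²/SD². The abstract does not report the score standard deviations or the exact computation the authors used, so this is only a sketch: the function names are hypothetical and the SD of 0.12 below is an invented value.

```python
from math import sqrt


def reliability_from_sem(sem, sd):
    """Classical test theory: reliability = 1 - SEM^2 / SD^2."""
    if sd <= 0 or sem < 0:
        raise ValueError("sd must be positive and sem non-negative")
    return 1 - (sem / sd) ** 2


def sem_from_reliability(reliability, sd):
    """Inverse relation: SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1 or sd <= 0:
        raise ValueError("reliability must be in [0, 1] and sd positive")
    return sd * sqrt(1 - reliability)


# SEM-TR of 0.068 is the SF-6D value from the abstract; the SD of 0.12
# is a hypothetical population SD chosen only for illustration.
print(round(reliability_from_sem(0.068, 0.12), 2))
```

Because the same SEM yields a lower reliability in a more homogeneous sample (smaller SD), reliability coefficients from different populations are not directly comparable even when the measurement error itself is unchanged.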
Hess, G.W.; Bohman, L.R.
1996-01-01
Techniques for estimating monthly mean streamflow at gaged sites and monthly streamflow duration characteristics at ungaged sites in central Nevada were developed using streamflow records at six gaged sites and basin physical and climatic characteristics. Streamflow data at gaged sites were related by regression techniques to concurrent flows at nearby gaging stations so that monthly mean streamflows for periods of missing or no record can be estimated for gaged sites in central Nevada. The standard error of estimate for relations at these sites ranged from 12 to 196 percent. Also, monthly streamflow data for selected percent exceedence levels were used in regression analyses with basin and climatic variables to determine relations for ungaged basins for annual and monthly percent exceedence levels. Analyses indicate that the drainage area and percent of drainage area at altitudes greater than 10,000 feet are the most significant variables. For the annual percent exceedence, the standard error of estimate of the relations for ungaged sites ranged from 51 to 96 percent and standard error of prediction for ungaged sites ranged from 96 to 249 percent. For the monthly percent exceedence values, the standard error of estimate of the relations ranged from 31 to 168 percent, and the standard error of prediction ranged from 115 to 3,124 percent. Reliability and limitations of the estimating methods are described.
Chesson, Harrell W; Forhan, Sara E; Gottlieb, Sami L; Markowitz, Lauri E
2008-08-18
We estimated the health and economic benefits of preventing recurrent respiratory papillomatosis (RRP) through quadrivalent human papillomavirus (HPV) vaccination. We applied a simple mathematical model to estimate the averted costs and quality-adjusted life years (QALYs) saved by preventing RRP in children whose mothers had been vaccinated at age 12 years. Under base case assumptions, the prevention of RRP would avert an estimated USD 31 (range: USD 2-178) in medical costs (2006 US dollars) and save 0.00016 QALYs (range: 0.00001-0.00152) per 12-year-old girl vaccinated. Including the benefits of RRP reduced the estimated cost per QALY gained by HPV vaccination by roughly 14-21% in the base case and by <2% to >100% in the sensitivity analyses. More precise estimates of the incidence of RRP are needed, however, to quantify this impact more reliably.
Applications of computerized adaptive testing (CAT) to the assessment of headache impact.
Ware, John E; Kosinski, Mark; Bjorner, Jakob B; Bayliss, Martha S; Batenhorst, Alice; Dahlöf, Carl G H; Tepper, Stewart; Dowson, Andrew
2003-12-01
To evaluate the feasibility of computerized adaptive testing (CAT) and the reliability and validity of CAT-based estimates of headache impact scores in comparison with 'static' surveys. Responses to the 54-item Headache Impact Test (HIT) were re-analyzed for recent headache sufferers (n = 1016) who completed telephone interviews during the National Survey of Headache Impact (NSHI). Item response theory (IRT) calibrations and the computerized dynamic health assessment (DYNHA) software were used to simulate CAT assessments by selecting the most informative items for each person and estimating impact scores according to pre-set precision standards (CAT-HIT). Results were compared with IRT estimates based on all items (total-HIT), computerized 6-item dynamic estimates (CAT-HIT-6), and a developmental version of a 'static' 6-item form (HIT-6-D). Analyses focused on: respondent burden (survey length and administration time), score distributions ('ceiling' and 'floor' effects), reliability and standard errors, and clinical validity (diagnosis, level of severity). A random sample (n = 245) was re-assessed to test responsiveness. A second study (n = 1103) compared actual CAT surveys and an improved 'static' HIT-6 among current headache sufferers sampled on the Internet. Respondents completed measures from the first study and the generic SF-8 Health Survey; some (n = 540) were re-tested on the Internet after 2 weeks. In the first study, simulated CAT-HIT and total-HIT scores were highly correlated (r = 0.92) without 'ceiling' or 'floor' effects and with a substantial reduction (90.8%) in respondent burden. Six of the 54 items accounted for the great majority of item administrations (3603/5028, 77.6%). 
CAT-HIT reliability estimates were very high (0.975-0.992) in the range where 95% of respondents scored, and relative validity (RV) coefficients were high for diagnosis (RV = 0.87) and severity (RV = 0.89); patient-level classifications were accurate 91.3% for a diagnosis of migraine. For all three criteria of change, CAT-HIT scores were more responsive than all other measures. In the second study, estimates of respondent burden, item usage, reliability and clinical validity were replicated. The test-retest reliability of CAT-HIT was 0.79 and alternate forms coefficients ranged from 0.85 to 0.91. All correlations with the generic SF-8 were negative. CAT-based administrations of headache impact items achieved very large reductions in respondent burden without compromising validity for purposes of patient screening or monitoring changes in headache impact over time. IRT models and CAT-based dynamic health assessments warrant testing among patients with other conditions.
Smartphone assessment of knee flexion compared to radiographic standards.
Dietz, Matthew J; Sprando, Daniel; Hanselman, Andrew E; Regier, Michael D; Frye, Benjamin M
2017-03-01
Measuring knee range of motion (ROM) is an important assessment for the outcomes of total knee arthroplasty. Recent technological advances have led to the development and use of accelerometer-based smartphone applications to measure knee ROM. The purpose of this study was to develop, standardize, and validate methods of utilizing smartphone accelerometer technology compared to radiographic standards, visual estimation, and goniometric evaluation. Participants used visual estimation, a long-arm goniometer, and a smartphone accelerometer to determine range of motion of a cadaveric lower extremity; these results were compared to radiographs taken at the same angles. The optimal smartphone position was determined to be on top of the leg at the distal femur and proximal tibia location. Between methods, it was found that the smartphone and goniometer were comparably reliable in measuring knee flexion (ICC=0.94; 95% CI: 0.91-0.96). Visual estimation was found to be the least reliable method of measurement. The results suggested that the smartphone accelerometer was non-inferior when compared to the other measurement techniques, demonstrated similar deviations from radiographic standards, and did not appear to be influenced by the person performing the measurements or the girth of the extremity. Copyright © 2016 Elsevier B.V. All rights reserved.
NWS Operational Requirements for Ensemble-Based Hydrologic Forecasts
NASA Astrophysics Data System (ADS)
Hartman, R. K.
2008-12-01
Ensemble-based hydrologic forecasts have been developed and issued by National Weather Service (NWS) staff at River Forecast Centers (RFCs) for many years. Used principally for long-range water supply forecasts, these ensembles have traditionally considered only the uncertainty associated with weather and climate. As technology and societal expectations of resource managers increase, the use of and desire for risk-based decision support tools have also increased. These tools require forecast information that includes reliable uncertainty estimates across all time and space domains. The development of reliable uncertainty estimates associated with hydrologic forecasts is being actively pursued within the United States and internationally. This presentation will describe the challenges, components, and requirements for operational hydrologic ensemble-based forecasts from the perspective of a NOAA/NWS River Forecast Center.
Evaluation of wind field statistics near and inside clouds using a coherent Doppler lidar
NASA Astrophysics Data System (ADS)
Lottman, Brian Todd
1998-09-01
This work proposes advanced techniques for measuring the spatial wind field statistics near and inside clouds using a vertically pointing solid state coherent Doppler lidar on a fixed ground based platform. The coherent Doppler lidar is an ideal instrument for high spatial and temporal resolution velocity estimates. The basic parameters of lidar are discussed, including a complete statistical description of the Doppler lidar signal. This description is extended to cases with simple functional forms for aerosol backscatter and velocity. An estimate for the mean velocity over a sensing volume is produced by estimating the mean spectra. There are many traditional spectral estimators, which are useful for conditions with slowly varying velocity and backscatter. A new class of estimators (novel) is introduced that produces reliable velocity estimates for conditions with large variations in aerosol backscatter and velocity with range, such as cloud conditions. Performance of traditional and novel estimators is computed for a variety of deterministic atmospheric conditions using computer simulated data. Wind field statistics are produced for actual data for a cloud deck, and for multi-layer clouds. Unique results include detection of possible spectral signatures for rain, estimates for the structure function inside a cloud deck, reliable velocity estimation techniques near and inside thin clouds, and estimates for simple wind field statistics between cloud layers.
Jeong, Ju Ri; Ko, Young Jun; Ha, Hyun Geun; Lee, Wan Hee
2016-03-01
This study aimed to establish the inter-rater and intrarater reliability of the rehabilitative ultrasonographic imaging (RUSI) technique for muscle thickness measurement of the rhomboid major at rest and with the shoulder abducted to 90°. Twenty-four young adults (eight men, 16 women; right-handed; mean age [±SD], 24·4 years [±2·6]) with no history of neck, shoulder, or arm pain were recruited. Rhomboid major muscle images were obtained in the resting position and with the shoulder in 90° abduction using an ultrasonography system with a 7·5-MHz linear transducer. In these two positions, the examiners found the site at which the transducer could be placed. Two examiners obtained the images of all participants in three test sessions at random. Intraclass correlation coefficients (ICC) were used to estimate reliability. All ICCs (95% CI) were >0·75, ranging from 0·93 to 0·98, which indicates good reliability. The ICCs for inter-rater reliability ranged from 0·75 to 0·94. For the absolute value of the difference in the intra-examiner reliability between the right and left ratios, the ICCs ranged from 0·58 to 0·91. In this study, the intra- and interexaminer reliability of muscle thickness measurements of the rhomboid major were good. Therefore, we suggest that muscle thickness measurements of the rhomboid major obtained with the RUSI technique would be useful for clinical rehabilitative assessment. © 2014 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
Pomeroy, Emma; Macintosh, Alison; Wells, Jonathan C K; Cole, Tim J; Stock, Jay T
2018-05-01
Estimating body mass from skeletal dimensions is widely practiced, but methods for estimating its components (lean and fat mass) are poorly developed. The ability to estimate these characteristics would offer new insights into the evolution of body composition and its variation relative to past and present health. This study investigates the potential of long bone cross-sectional properties as predictors of body, lean, and fat mass. Humerus, femur and tibia midshaft cross-sectional properties were measured by peripheral quantitative computed tomography in a sample of young adult women (n = 105) characterized by a range of activity levels. Body composition was estimated from bioimpedance analysis. Lean mass correlated most strongly with both upper and lower limb bone properties (r values up to 0.74), while fat mass showed weak correlations (r ≤ 0.29). Estimation equations generated from tibial midshaft properties indicated that lean mass could be estimated relatively reliably, with some improvement using logged data and including bone length in the models (minimum standard error of estimate = 8.9%). Body mass prediction was less reliable and fat mass only poorly predicted (standard errors of estimate ≥11.9% and >33%, respectively). Lean mass can be predicted more reliably than body mass from limb bone cross-sectional properties. The results highlight the potential for studying evolutionary trends in lean mass from skeletal remains, and have implications for understanding the relationship between bone morphology and body mass or composition. © 2018 The Authors. American Journal of Physical Anthropology Published by Wiley Periodicals, Inc.
Reliability of tree-height measurements in northern hardwood stands
Dale S. Solomon; Richard J. Nolet
1968-01-01
No significant differences were found between the heights of standing hardwood trees estimated with a Haga altimeter and actual heights measured after the trees had been felled. Differences ranged from +10 feet to -12 feet, and the mean difference for all trees was 0.1 foot.
Reliability of the method of levels for determining cutaneous temperature sensitivity
NASA Astrophysics Data System (ADS)
Jakovljević, Miroljub; Mekjavić, Igor B.
2012-09-01
Determination of the thermal thresholds is used clinically for evaluation of peripheral nervous system function. The aim of this study was to evaluate reliability of the method of levels performed with a new, low cost device for determining cutaneous temperature sensitivity. Nineteen male subjects were included in the study. Thermal thresholds were tested on the right side at the volar surface of mid-forearm, lateral surface of mid-upper arm and front area of mid-thigh. Thermal testing was carried out by the method of levels with an initial temperature step of 2°C. Variability of thermal thresholds was expressed by means of the ratio between the second and the first testing, coefficient of variation (CV), coefficient of repeatability (CR), intraclass correlation coefficient (ICC), mean difference between sessions (S1-S2diff), standard error of measurement (SEM) and minimally detectable change (MDC). There were no statistically significant changes between sessions for warm or cold thresholds, or between warm and cold thresholds. Within-subject CVs were acceptable. The CR estimates for warm thresholds ranged from 0.74°C to 1.06°C and from 0.67°C to 1.07°C for cold thresholds. The ICC values for intra-rater reliability ranged from 0.41 to 0.72 for warm thresholds and from 0.67 to 0.84 for cold thresholds. S1-S2diff ranged from -0.15°C to 0.07°C for warm thresholds, and from -0.08°C to 0.07°C for cold thresholds. SEM ranged from 0.26°C to 0.38°C for warm thresholds, and from 0.23°C to 0.38°C for cold thresholds. Estimated MDC values were between 0.60°C and 0.88°C for warm thresholds, and 0.53°C and 0.88°C for cold thresholds. The method of levels for determining cutaneous temperature sensitivity has acceptable reliability.
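The relation between the SEM and MDC values reported above can be illustrated with a short sketch. It assumes the commonly used formulas SEM = SD × sqrt(1 − ICC) and MDC = z × sqrt(2) × SEM. The abstract does not state which confidence level was used; z = 1.645 (90%) is an assumption, chosen here only because it is consistent with the reported warm-threshold range. The function names are illustrative.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)

def mdc_from_sem(sem, z=1.645):
    """Minimally detectable change; z = 1.645 (90% level) is an assumption."""
    return z * math.sqrt(2.0) * sem

# SEM limits reported in the abstract for warm thresholds (degrees C)
for sem in (0.26, 0.38):
    print(f"SEM = {sem:.2f} -> MDC = {mdc_from_sem(sem):.2f}")
```

Under this assumption, the warm-threshold SEM limits of 0.26°C and 0.38°C map to roughly 0.60°C and 0.88°C, which matches the MDC range given in the abstract.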
Amano, Nobuko; Nakamura, Tomiyo
2018-02-01
The visual estimation method is commonly used in hospitals and other care facilities to evaluate food intake through estimation of plate waste. In Japan, no previous studies have investigated the validity and reliability of this method under the routine conditions of a hospital setting. The present study aimed to evaluate the validity and reliability of the visual estimation method, in long-term inpatients with different levels of eating disability caused by Alzheimer's disease. The patients were provided different therapeutic diets presented in various food types. This study was performed between February and April 2013, and 82 patients with Alzheimer's disease were included. Plate waste was evaluated for the 3 main daily meals, for a total of 21 days, 7 consecutive days during each of the 3 months, yielding a total of 4851 meals, from which 3984 were included. Plate waste was measured by the nurses through the visual estimation method, and by the hospital's registered dietitians through the actual measurement method. The actual measurement method was first validated to serve as a reference, and the level of agreement between both methods was then determined. The month, time of day, type of food provided, and patients' physical characteristics were considered for analysis. For the 3984 meals included in the analysis, the level of agreement between the measurement methods was 78.4%. Disagreement of measurements consisted of 3.8% of underestimation and 17.8% of overestimation. Cronbach's α (0.60, P < 0.001) indicated that the reliability of the visual estimation method was within the acceptable range. The visual estimation method was found to be a valid and reliable method for estimating food intake in patients with different levels of eating impairment. The successful implementation and use of the method depends upon adequate training and motivation of the nurses and care staff involved. Copyright © 2017 European Society for Clinical Nutrition and Metabolism. Published by Elsevier Ltd. All rights reserved.
Pfau, Maximilian; Lindner, Moritz; Müller, Philipp L; Birtel, Johannes; Finger, Robert P; Harmening, Wolf M; Fleckenstein, Monika; Holz, Frank G; Schmitz-Valckenberg, Steffen
2017-05-01
To determine the effective dynamic range (EDR), retest reliability, and number of discriminable steps (DS) for mesopic and dark-adapted two-color fundus-controlled perimetry (FCP) using the S-MAIA (Scotopic-Macular Integrity Assessment) "micro-perimeter." In this prospective cross-sectional study, each of the 52 eyes of 52 subjects with various macular diseases (mean age 62.0 ± 16.9 years; range, 19.1-90.1 years) underwent duplicate mesopic (achromatic stimuli, 400-800 nm), dark-adapted cyan (505 nm), and dark-adapted red (627 nm) FCP using a grid of 61 stimuli covering 18° of the central retina. The EDR, the number of DS, and the retest reliability for point-wise sensitivity (PWS) were analyzed. The effects of fixation stability, sensitivity, and age on retest reliability were examined using mixed-effects models. The EDR was 10 to 30 dB with five DS for mesopic and 4 to 17 dB with four DS for dark-adapted cyan and red testing. PWS retest reliability was good among all three types of retinal sensitivity assessments (coefficient of repeatability ±5.79, ±4.72, and ±4.77 dB, respectively) and did not depend on fixation stability or age. PWS had no effect on retest variability in dark-adapted cyan and dark-adapted red testing but had a minor effect in mesopic testing. Combined mesopic and dark-adapted two-color FCP allows for reliable topographic testing of cone and rod function in patients with various macular diseases with and without foveal fixation. Retest reliability is homogeneous across eccentricities and various degrees of scotoma depth, including zones at risk for disease progression. These reliability estimates can serve for the design of future clinical trials.
Estimation of stature using lower limb measurements in Sudanese Arabs.
Ahmed, Altayeb Abdalla
2013-07-01
The estimation of stature from body parts is one of the most vital parts of personal identification in medico-legal autopsies, especially when mutilated and amputated limbs or body parts are found. The aim of this study was to assess the reliability and accuracy of using lower limb measurements for stature estimations. The stature, tibial length, bimalleolar breadth, foot length and foot breadth of 160 right-handed Sudanese Arab subjects, 80 men and 80 women (25-30 years old), were measured. The reliability of measurement acquisition was tested prior to the primary data collection. The data were analysed using basic univariate analysis and linear and multiple regression analyses. The results showed acceptable standards of measurement errors and reliability. Sex differences were significant for all of the measurements. There was a positive correlation coefficient between lower-limb dimensions and stature (P-value < 0.01). The best predictors were tibial length and foot length. The stature prediction accuracy ranged from ± 2.75-5.40 cm, which is comparable to the established skeletal standards for the lower limbs. This study provides new forensic standards for stature estimation using the lower limb measurements of Sudanese Arabs. Copyright © 2013 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Aly, Sharif S; Zhao, Jianyang; Li, Ben; Jiang, Jiming
2014-01-01
The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data is traditionally transformed so that linear mixed model (LMM) based ICC can be estimated. A common transformation used is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate which includes fixed effects using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM, even when the negative binomial data were logarithm- and square-root-transformed. A second comparison that targeted a wider range of ICC values showed that the mean of estimated ICC closely approximated the true ICC.
De Cock, N; Van Camp, J; Kolsteren, P; Lachat, C; Huybregts, L; Maes, L; Deforche, B; Verstraeten, R; Vangeel, J; Beullens, K; Eggermont, S; Van Lippevelde, W
2017-04-01
A short, reliable and valid tool to measure snack and beverage consumption in adolescents, taking into account the correct definitions, would benefit both epidemiological and intervention research. The present study aimed to develop a short quantitative beverage and snack food frequency questionnaire (FFQ) and to assess the reliability and validity of this FFQ against three 24-h recalls. Reliability was assessed by comparing estimates of the FFQ administered 14 days apart (FFQ1 and FFQ2) in a convenience sample of 179 adolescents [60.3% male; mean (SD) 14.7 (0.9) years]. Validity was assessed by comparing FFQ1 with three telephone-administered 24-h recalls in a convenience sample of 99 adolescents [52.5% male, mean (SD) 14.8 (0.9) years]. Reliability and validity were assessed using Bland-Altman plots, classification agreements and correlation coefficients for the amount and frequency of consumption of unhealthy snacks, healthy snacks, unhealthy beverages, healthy beverages, and for the healthy snack and beverage ratios. Small mean differences (FFQ1 versus FFQ2) were observed for reliability, ranking ability ranged from fair to substantial, and Spearman coefficients fell within normal ranges. For the validity, mean differences (FFQ1 versus recalls) were small for beverage intake but large for snack intake, except for the healthy snack ratio. Ranking ability ranged from slight to moderate, and Spearman coefficients fell within normal ranges. Reliability and validity of the FFQ for all outcomes were found to be acceptable at a group level for epidemiological purposes, whereas for intervention purposes only the healthy snack and beverage ratios were found to be acceptable at a group level. © 2016 The British Dietetic Association Ltd.
A Timing Estimation Method Based-on Skewness Analysis in Vehicular Wireless Networks.
Cui, Xuerong; Li, Juan; Wu, Chunlei; Liu, Jian-Hang
2015-11-13
Vehicle positioning technology has drawn more and more attention in vehicular wireless networks to reduce transportation time and traffic accidents. Nowadays, global navigation satellite systems (GNSS) are widely used in land vehicle positioning, but most of them lack precision and reliability in situations where their signals are blocked. Positioning systems based on short-range wireless communication are another effective approach that can be used in vehicle positioning or vehicle ranging. IEEE 802.11p is a new real-time short-range wireless communication standard for vehicles, so a new method is proposed to estimate the time delay or ranges between vehicles based on the IEEE 802.11p standard. The method includes three main steps: cross-correlation between the received signal and the short preamble, summing up the correlated results in groups, and finding the maximum peak using a dynamic threshold based on the skewness analysis. With the range between each vehicle or road-side infrastructure, the position of neighboring vehicles can be estimated correctly. Simulation results for the International Telecommunications Union (ITU) vehicular multipath channel show that the proposed method provides better precision than some well-known timing estimation techniques, especially in low signal-to-noise ratio (SNR) environments.
Qualitative Meta-Analysis on the Hospital Task: Implications for Research
ERIC Educational Resources Information Center
Noll, Jennifer; Sharma, Sashi
2014-01-01
The "law of large numbers" indicates that as sample size increases, sample statistics become less variable and more closely estimate their corresponding population parameters. Different research studies investigating how people consider sample size when evaluating the reliability of a sample statistic have found a wide range of…
Hu, B; Lin, L F; Zhuang, M Q; Yuan, Z Y; Li, S Y; Yang, Y J; Lu, M; Yu, S Z; Jin, L; Ye, W M; Wang, X F
2015-09-01
To examine the test-retest reliabilities and relative validities of the Chinese version of short International Physical Activity Questionnaire (IPAQ-S-C), the Global Physical Activity Questionnaire (GPAQ-C), and the Total Energy Expenditure Questionnaire (TEEQ-C) in a population-based prospective study, the Taizhou Longitudinal Study (TZLS). A longitudinal comparative study. A total of 205 participants (male: 38.54%) aged 30-70 years completed three questionnaires twice (day one and day nine) and a physical activity log (PA-log) over seven consecutive days. The test-retest reliabilities were evaluated using intra-class correlation coefficients (ICCs) and the relative validities were estimated by comparing the data from physical activity questionnaires (PAQs) and PA-log. Good reliabilities were observed between the repeated PAQs. The ICCs ranged from 0.51 to 0.80 for IPAQ-C, 0.67 to 0.85 for GPAQ-C, and 0.74 to 0.94 for TEEQ-C, respectively. Energy expenditure of most PA domains estimated by the three PAQs correlated moderately with the results recorded by PA-log, except for the walking domain of IPAQ-S-C. The partial correlation coefficients between the PAQs and PA-log ranged from 0.44 to 0.58 for IPAQ-S-C, 0.26 to 0.52 for GPAQ-C, and 0.41 to 0.72 for TEEQ-C, respectively. Bland-Altman plots showed acceptable agreement between the three PAQs and PA-log. The three PAQs, especially TEEQ-C, were relatively reliable and valid for assessment of physical activity and could be used in TZLS. Copyright © 2015 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
Hung, Man; Baumhauer, Judith F; Latt, L Daniel; Saltzman, Charles L; SooHoo, Nelson F; Hunt, Kenneth J
2013-11-01
In 2012, the American Orthopaedic Foot & Ankle Society(®) established a national network for collecting and sharing data on treatment outcomes and improving patient care. One of the network's initiatives is to explore the use of computerized adaptive tests (CATs) for patient-level outcome reporting. We determined whether the CAT from the NIH Patient Reported Outcome Measurement Information System(®) (PROMIS(®)) Physical Function (PF) item bank provides efficient, reliable, valid, precise, and adequately covered point estimates of patients' physical function. After informed consent, 288 patients with a mean age of 51 years (range, 18-81 years) undergoing surgery for common foot and ankle problems completed a web-based questionnaire. Efficiency was determined by time for test administration. Reliability was assessed with person and item reliability estimates. Validity evaluation included content validity from expert review and construct validity measured against the PROMIS(®) Pain CAT and patient responses based on tradeoff perceptions. Precision was assessed by standard error of measurement (SEM) across patients' physical function levels. Instrument coverage was based on a person-item map. Average time of test administration was 47 seconds. Reliability was 0.96 for person and 0.99 for item. Construct validity against the Pain CAT had an r value of -0.657 (p < 0.001). Precision had an SEM of less than 3.3 (equivalent to a Cronbach's alpha of ≥ 0.90) across a broad range of function. Concerning coverage, the ceiling effect was 0.32% and there was no floor effect. The PROMIS(®) PF CAT appears to be an excellent method for measuring outcomes for patients with foot and ankle surgery. Further validation of the PROMIS(®) item banks may ultimately provide a valid and reliable tool for measuring patient-reported outcomes after injuries and treatment.
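The stated equivalence between an SEM of about 3.3 and a reliability of about 0.90 can be checked with a small sketch. It assumes the classical relation reliability = 1 − (SEM / SD)² and that scores are on the PROMIS T-score metric with a standard deviation of 10; the abstract does not spell out either assumption, and the function name is illustrative.

```python
def reliability_from_sem(sem, sd=10.0):
    """Classical test theory: reliability = 1 - (SEM / SD)^2.

    sd = 10 assumes a T-score metric (mean 50, SD 10).
    """
    return 1.0 - (sem / sd) ** 2

print(round(reliability_from_sem(3.3), 3))
```

Under these assumptions an SEM of 3.3 corresponds to a reliability of roughly 0.89, so an SEM below that value sits at or near the 0.90 level cited in the abstract.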
Çelik, Derya; Can, Canan; Aslan, Yasemin; Ceylan, Hasan Huseyin; Bilsel, Kerem; Ozdincler, Arzu Razak
2014-01-01
The Harris Hip Score (HHS) was developed to assess function and pain from the perspective of patients with hip pathologies. The purpose of this study was to translate and culturally adapt the HHS into Turkish, and thereby determine the reliability and validity of the translated version. The HHS was translated into Turkish in accordance with the stages recommended by Beaton. The measurement properties of the HHS were tested in 80 patients (52 males; mean age 51 years, range 21-75 years) suffering from different hip pathologies. The test-retest reliability was tested in 58 patients (28 males; mean age 52 years, range 30-73 years) after an interval of seven days. Cronbach's alpha was used to assess internal consistency and the intra-class correlation coefficient (ICC) was used to estimate the test-retest reliability. Patients were asked to answer the Oxford Hip Score (OHS), the Western Ontario and McMaster Universities Arthritis Index (WOMAC), the VAS and the Short Form-36 (SF-36) for the estimation of validity. The Turkish version of the HHS showed sufficient internal consistency (Cronbach's alpha, 0.70) and test-retest reliability (ICC = 0.91). The correlation coefficients between the HHS, the WOMAC and the OHS were 0.64 and 0.89 respectively. The highest correlations between the HHS and SF-36 were with the physical function scale (r = 0.72), and the lowest correlations were with the mental function scale (r = 0.10). We observed no floor or ceiling effects. The Turkish version of the HHS has sufficient reliability and validity to measure patient-reported outcome for Turkish-speaking individuals with a variety of hip disorders.
Assessment of the Maximal Split-Half Coefficient to Estimate Reliability
ERIC Educational Resources Information Center
Thompson, Barry L.; Green, Samuel B.; Yang, Yanyun
2010-01-01
The maximal split-half coefficient is computed by calculating all possible split-half reliability estimates for a scale and then choosing the maximal value as the reliability estimate. Osburn compared the maximal split-half coefficient with 10 other internal consistency estimates of reliability and concluded that it yielded the most consistently…
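The computation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes an even number of items split into equal halves, applies the Spearman-Brown correction 2r / (1 + r) to each split, and uses a small hypothetical score matrix. All function names and data are made up for the example.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def maximal_split_half(scores):
    """Return (reliability, half_a, half_b) for the best equal-size split.

    scores: list of rows, one per respondent, one column per item.
    Item 0 is always kept in half_a so each split is counted once.
    """
    k = len(scores[0])
    best = None
    for rest in combinations(range(1, k), k // 2 - 1):
        half_a = (0,) + rest
        half_b = tuple(i for i in range(k) if i not in half_a)
        a = [sum(row[i] for i in half_a) for row in scores]
        b = [sum(row[i] for i in half_b) for row in scores]
        r = pearson(a, b)
        rel = 2 * r / (1 + r)  # Spearman-Brown step-up to full length
        if best is None or rel > best[0]:
            best = (rel, half_a, half_b)
    return best

# Hypothetical data: 5 respondents, 4 items
data = [
    [4, 5, 3, 4],
    [2, 3, 2, 1],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
rel, half_a, half_b = maximal_split_half(data)
print(half_a, half_b, round(rel, 3))
```

Because the maximum is taken over every split, the number of candidate splits grows quickly with scale length; the exhaustive loop here is only practical for short scales.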
ERIC Educational Resources Information Center
Morgan, Grant B.; Zhu, Min; Johnson, Robert L.; Hodge, Kari J.
2014-01-01
Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of…
Nanidis, Theodore G; Ridha, Hyder; Jallali, Navid
2014-10-01
Estimation of the volume of abdominal tissue is desirable when planning autologous abdominal based breast reconstruction. However, this can be difficult clinically. The aim of this study was to develop a simple, yet reliable method of calculating the deep inferior epigastric artery perforator flap weight using the routine preoperative computed tomography angiogram (CTA) scan. Our mathematical formula is based on the shape of a DIEP flap resembling that of an isosceles triangular prism. Thus its volume can be calculated with a standard mathematical formula. Using bony landmarks three measurements were acquired from the CTA scan to calculate the flap weight. This was then compared to the actual flap weight harvested in both a retrospective feasibility and prospective study. In the retrospective group 17 DIEP flaps in 17 patients were analyzed. Average predicted flap weight was 667 g (range 293-1254). The average actual flap weight was 657 g (range 300-1290) giving an average percentage error of 6.8% (p-value for weight difference 0.53). In the prospective group 15 DIEP flaps in 15 patients were analyzed. Average predicted flap weight was 618 g (range 320-925). The average actual flap weight was 624 g (range 356-970) giving an average percentage error of 6.38% (p-value for weight difference 0.57). This formula is a quick, reliable and accurate way of estimating the volume of abdominal tissue using the preoperative CTA scan. Copyright © 2014 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
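The geometric idea above can be sketched with the standard volume formula for a triangular prism, V = 1/2 × base × height × length. The abstract does not say which three CTA measurements were used or how volume was converted to weight, so the parameter names, the example dimensions, and the tissue density below are all hypothetical placeholders rather than values from the study.

```python
def triangular_prism_volume(base_cm, height_cm, thickness_cm):
    """Volume (cm^3) of a prism with a triangular face of the given base
    and height, extruded over the given thickness."""
    return 0.5 * base_cm * height_cm * thickness_cm

def estimated_flap_weight(base_cm, height_cm, thickness_cm, density_g_per_cm3):
    """Weight (g) = volume x density; the density must be supplied by the
    caller because the abstract does not report one."""
    return triangular_prism_volume(base_cm, height_cm, thickness_cm) * density_g_per_cm3

# Hypothetical measurements and a placeholder density of 1.0 g/cm^3
print(estimated_flap_weight(30.0, 12.0, 3.0, 1.0))
```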
W5″ Test: A simple method for measuring mean power output in the bench press exercise.
Tous-Fajardo, Julio; Moras, Gerard; Rodríguez-Jiménez, Sergio; Gonzalo-Skok, Oliver; Busquets, Albert; Mujika, Iñigo
2016-11-01
The aims of the present study were to assess the validity and reliability of a novel simple test [Five Seconds Power Test (W5″ Test)] for estimating the mean power output during the bench press exercise at different loads, and its sensitivity to detect training-induced changes. Thirty trained young men completed as many repetitions as possible in a time of ≈5 s at 25%, 45%, 65% and 85% of one-repetition maximum (1RM) in two test sessions separated by four days. The number of repetitions, linear displacement of the bar and time needed to complete the test were recorded by two independent testers, and a linear encoder was used as the criterion measure. For each load, the mean power output was calculated in the W5″ Test as mechanical work per time unit and compared with that obtained from the linear encoder. Subsequently, 20 additional subjects (10 training group vs. 10 control group) were assessed before and after completing a seven-week training programme designed to improve maximal power. Results showed that both assessment methods correlated highly in estimating mean power output at different loads (r range: 0.86-0.94; p < .01) and detecting training-induced changes (R(2): 0.78). Good to excellent intra-tester (intraclass correlation coefficient (ICC) range: 0.81-0.97) and excellent inter-tester (ICC range: 0.96-0.99; coefficient of variation range: 2.4-4.1%) reliability was found for all loads. The W5″ Test was shown to be a valid, reliable and sensitive method for measuring mean power output during the bench press exercise in subjects who have previous resistance training experience.
Blais, Julie; Forth, Adelle E; Hare, Robert D
2017-06-01
The goal of the current study was to assess the interrater reliability of the Psychopathy Checklist-Revised (PCL-R) among a large sample of trained raters (N = 280). All raters completed PCL-R training at some point between 1989 and 2012 and subsequently provided complete coding for the same 6 practice cases. Overall, 3 major conclusions can be drawn from the results: (a) reliability of individual PCL-R items largely fell below any appropriate standards while the estimates for Total PCL-R scores and factor scores were good (but not excellent); (b) the cases representing individuals with high psychopathy scores showed better reliability than did the cases of individuals in the moderate to low PCL-R score range; and (c) there was a high degree of variability among raters; however, rater specific differences had no consistent effect on scoring the PCL-R. Therefore, despite low reliability estimates for individual items, Total scores and factor scores can be reliably scored among trained raters. We temper these conclusions by noting that scoring standardized videotaped case studies does not allow the rater to interact directly with the offender. Real-world PCL-R assessments typically involve a face-to-face interview and much more extensive collateral information. We offer recommendations for new web-based training procedures. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Ward, Dianne S; Mazzucca, Stephanie; McWilliams, Christina; Hales, Derek
2015-09-26
Early care and education (ECE) centers are important settings influencing young children's diet and physical activity (PA) behaviors. To better understand their impact on diet and PA behaviors as well as to evaluate public health programs aimed at ECE settings, we developed and tested the Environment and Policy Assessment and Observation - Self-Report (EPAO-SR), a self-administered version of the previously validated, researcher-administered EPAO. Development of the EPAO-SR instrument included modification of items from the EPAO, community advisory group and expert review, and cognitive interviews with center directors and classroom teachers. Reliability and validity data were collected across 4 days in 3-5 year old classrooms in 50 ECE centers in North Carolina. Center teachers and directors completed relevant portions of the EPAO-SR on multiple days according to a standardized protocol, and trained data collectors completed the EPAO for 4 days in the centers. Reliability and validity statistics calculated included percent agreement, kappa, correlation coefficients, coefficients of variation, deviations, mean differences, and intraclass correlation coefficients (ICC), depending on the response option of the item. Data demonstrated a range of reliability and validity evidence for the EPAO-SR instrument. Reporting from directors and classroom teachers was consistent and similar to the observational data. Items that produced strongest reliability and validity estimates included beverages served, outside time, and physical activity equipment, while items such as whole grains served and amount of teacher-led PA had lower reliability (observation and self-report) and validity estimates. To overcome lower reliability and validity estimates, some items need administration on multiple days. This study demonstrated appropriate reliability and validity evidence for use of the EPAO-SR in the field. 
The self-administered EPAO-SR is an advancement of the measurement of ECE settings and can be used by researchers and practitioners to assess the nutrition and physical activity environments of ECE settings.
Development of modelling algorithm of technological systems by statistical tests
NASA Astrophysics Data System (ADS)
Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.
2018-03-01
The paper tackles the problem of economic assessment of design efficiency regarding various technological systems at the stage of their operation. The modelling algorithm of a technological system, developed using statistical tests and taking the reliability index into account, allows estimating the level of machinery technical excellence and defining the efficiency of design reliability against its performance. Economic feasibility of its application shall be determined on the basis of service quality of a technological system with further forecasting of volumes and the range of spare parts supply.
Developing a Danish version of the "Impact on Participation and Autonomy Questionnaire".
Ghaziani, Emma; Krogh, Anne Grethe; Lund, Hans
2013-05-01
To translate the "Impact on Participation and Autonomy Questionnaire" into Danish (IPAQ-DK), and estimate its internal consistency and test-retest reliability in order to promote participation-based interventions and research. Translation and two successive reliability assessments through test-retest. 137 adults with varying degrees of impairment; of these, 67 participated in the final reliability assessment. The translation followed guidelines set forth by the "European Group for Quality of Life Assessment and Health Measurement". Internal consistency for subscales was estimated by Cronbach's alpha. Weighted kappa coefficients and intraclass correlation coefficients were calculated to assess the test-retest reliability at item and subscale level, respectively. A preliminary reliability assessment revealed residual issues regarding the translation and cultural adaptation of the instrument. The revised version (IPAQ-DK) was subsequently subjected to a similar assessment demonstrating Cronbach's alpha values from 0.698 to 0.817. Weighted kappa ranged from 0.370 to 0.880; 78% of these values were higher than 0.600. The intraclass correlation coefficient covered values from 0.701 to 0.818. IPAQ-DK is a useful instrument for identifying person-perceived participation restrictions and satisfaction with participation. Further studies of IPAQ-DK's floor/ceiling effects and responsiveness to change are recommended, as is an investigation of whether certain items need further linguistic improvement.
Validity and reliability of Optojump photoelectric cells for estimating vertical jump height.
Glatthorn, Julia F; Gouge, Sylvain; Nussbaumer, Silvio; Stauffacher, Simone; Impellizzeri, Franco M; Maffiuletti, Nicola A
2011-02-01
Vertical jump is one of the most prevalent acts performed in several sport activities. It is therefore important to ensure that the measurements of vertical jump height made as a part of research or athlete support work have adequate validity and reliability. The aim of this study was to evaluate concurrent validity and reliability of the Optojump photocell system (Microgate, Bolzano, Italy) with force plate measurements for estimating vertical jump height. Twenty subjects were asked to perform maximal squat jumps and countermovement jumps, and flight time-derived jump heights obtained by the force plate were compared with those provided by Optojump, to examine its concurrent (criterion-related) validity (study 1). Twenty other subjects completed the same jump series on 2 different occasions (separated by 1 week), and jump heights of session 1 were compared with session 2, to investigate test-retest reliability of the Optojump system (study 2). Intraclass correlation coefficients (ICCs) for validity were very high (0.997-0.998), even if a systematic difference was consistently observed between force plate and Optojump (-1.06 cm; p < 0.001). Test-retest reliability of the Optojump system was excellent, with ICCs ranging from 0.982 to 0.989, low coefficients of variation (2.7%), and low random errors (±2.81 cm). The Optojump photocell system demonstrated strong concurrent validity and excellent test-retest reliability for the estimation of vertical jump height. We propose the following equation that allows force plate and Optojump results to be used interchangeably: force plate jump height (cm) = 1.02 × Optojump jump height + 0.29. In conclusion, the use of Optojump photoelectric cells is legitimate for field-based assessments of vertical jump height.
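The conversion equation reported in this abstract can be written as a small helper. This is a minimal sketch: the function names are hypothetical, and the inverse is only the algebraic rearrangement of the published equation, not a separately validated formula.

```python
def optojump_to_force_plate_cm(optojump_height_cm: float) -> float:
    """Apply the abstract's equation: force plate = 1.02 * Optojump + 0.29 (cm)."""
    return 1.02 * optojump_height_cm + 0.29


def force_plate_to_optojump_cm(force_plate_height_cm: float) -> float:
    """Algebraic inverse of the equation above (an assumption, not from the abstract)."""
    return (force_plate_height_cm - 0.29) / 1.02


if __name__ == "__main__":
    height = 30.0  # illustrative Optojump reading in cm
    print(f"{height:.2f} cm (Optojump) ~ {optojump_to_force_plate_cm(height):.2f} cm (force plate)")
```

Both instruments must report height in centimetres for the intercept of 0.29 to be meaningful.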
The reliability of a quality appraisal tool for studies of diagnostic reliability (QAREL).
Lucas, Nicholas; Macaskill, Petra; Irwig, Les; Moran, Robert; Rickards, Luke; Turner, Robin; Bogduk, Nikolai
2013-09-09
The aim of this project was to investigate the reliability of a new 11-item quality appraisal tool for studies of diagnostic reliability (QAREL). The tool was tested on studies reporting the reliability of any physical examination procedure. The reliability of physical examination is a challenging area to study given the complex testing procedures, the range of tests, and lack of procedural standardisation. Three reviewers used QAREL to independently rate 29 articles, comprising 30 studies, published during 2007. The articles were identified from a search of relevant databases using the following string: "Reproducibility of results (MeSH) OR reliability (t.w.) AND Physical examination (MeSH) OR physical examination (t.w.)." A total of 415 articles were retrieved and screened for inclusion. The reviewers undertook an independent trial assessment prior to data collection, followed by a general discussion about how to score each item. At no time did the reviewers discuss individual papers. Reliability was assessed for each item using multi-rater kappa (κ). Multi-rater reliability estimates ranged from κ = 0.27 to 0.92 across all items. Six items were recorded with good reliability (κ > 0.60), three with moderate reliability (κ = 0.41 - 0.60), and two with fair reliability (κ = 0.21 - 0.40). Raters found it difficult to agree about the spectrum of patients included in a study (Item 1) and the correct application and interpretation of the test (Item 10). In this study, we found that QAREL was a reliable assessment tool for studies of diagnostic reliability when raters agreed upon criteria for the interpretation of each item. Nine out of 11 items had good or moderate reliability, and two items achieved fair reliability. The heterogeneity in the tests included in this study may have resulted in an underestimation of the reliability of these two items. We discuss these and other factors that could affect our results and make recommendations for the use of QAREL.
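The abstract does not say which multi-rater kappa was used. As an illustration only, the sketch below computes Fleiss' kappa, one common multi-rater statistic, for items that are each rated by the same number of raters. The function name and the example ratings are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one category label per rater."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    counts = [Counter(row) for row in ratings]
    categories = set().union(*counts)
    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_i = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall proportion of each category.
    p_j = [sum(c[k] for c in counts) / (n_subjects * n_raters) for k in categories]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Three hypothetical reviewers scoring four articles on one yes/no item.
    item_scores = [["yes", "yes", "yes"], ["yes", "yes", "no"],
                   ["no", "no", "no"], ["no", "yes", "no"]]
    print(round(fleiss_kappa(item_scores), 3))
```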
Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A
2010-11-01
The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p < 0.001) with a systematic bias of approximately 7 cm, even though random errors were low (2.7 cm) and intraclass correlation coefficients (ICCs) were high (>0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p < 0.001), with high random errors (>12 cm), high limits of agreement ratios (>36%), and low ICCs (<0.75), that is, poor validity. As regards reliability, Myotest-T showed high ICCs (range: 0.92-0.96), whereas Myotest-V showed low ICCs (range: 0.56-0.89), and high random errors (>9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
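The two calculation methods named in this abstract correspond to standard projectile-motion formulas. The sketch below shows those textbook formulas only. It is not the Myotest firmware, the function names are hypothetical, and g = 9.81 m/s² is an assumed constant.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def height_from_flight_time(flight_time_s: float) -> float:
    """Jump height in metres from flight time: h = g * t^2 / 8."""
    return G * flight_time_s ** 2 / 8.0


def height_from_takeoff_velocity(takeoff_velocity_ms: float) -> float:
    """Jump height in metres from vertical takeoff velocity: h = v^2 / (2 g)."""
    return takeoff_velocity_ms ** 2 / (2.0 * G)


if __name__ == "__main__":
    t = 0.50  # illustrative flight time in seconds
    v = G * t / 2.0  # takeoff velocity implied by that flight time
    print(height_from_flight_time(t), height_from_takeoff_velocity(v))
```

For an ideal jump the two formulas agree, because v = g·t/2. In practice each depends on a different measured quantity, so measurement error affects them differently.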
Rossettini, Giacomo; Rondoni, Angie; Lovato, Tommaso; Strobe, Marco; Verzè, Elisa; Vicentini, Marco; Testa, Marco
2016-06-03
Passive Intervertebral Movements (PIVMs) are commonly used to assess and treat patients with nonspecific neck pain. Only very few studies have investigated 3D movements until now. This study assessed intra- and inter-rater reliability of three-dimensional (3D) cervical PIVMs performed by physical therapy students in patients with nonspecific neck pain. Thirty-one patients, mean age 47.2 ± 7.2 years, were independently evaluated by 2 physical therapy students. The raters (A and B) assessed mobility, end-feel and pain provocation performing bilaterally the 3D cervical segmental side-bending test (3D CSSB) from levels C2-C3 to C6-C7. Percentage agreement (raw, positive and negative), Cohen's kappa (95% CI), prevalence index and bias index were calculated to estimate intra- and inter-rater reliability. Intra-rater reliability showed kappa values ranging between fair and substantial (k 0.29-0.80) for pain provocation, mobility and end-feel, with percentage agreements between 61% and 90%. Inter-rater reliability presented kappa values ranging between fair and substantial (k 0.22-0.62) for pain provocation, mobility and end-feel, with percentage agreements between 61% and 80%. Intra-rater reliability of 3D PIVMs was superior to inter-rater reliability in patients with nonspecific neck pain. The most repeatable evaluation parameter was pain. However, the overall poor reliability suggests avoiding the use of these techniques alone to examine patients and measure their outcome. Further studies are needed to investigate the reliability of PIVMs in combination with other assessment procedures in symptomatic patients.
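The agreement statistics listed in this abstract can all be derived from one 2 × 2 table of paired ratings. The sketch below assumes each finding is dichotomised (for example positive or negative for pain provocation), which the abstract does not confirm. The cell names a, b, c, d and the function name are hypothetical.

```python
def two_rater_summary(a: int, b: int, c: int, d: int) -> dict:
    """Agreement statistics for two raters and a binary finding.

    a: both raters positive     b: rater A positive, rater B negative
    c: A negative, B positive   d: both raters negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")

    def ratio(num, den):
        return num / den if den else float("nan")

    p_o = (a + d) / n  # raw (observed) agreement
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2  # chance agreement
    return {
        "raw_agreement": p_o,
        "positive_agreement": ratio(2 * a, 2 * a + b + c),
        "negative_agreement": ratio(2 * d, 2 * d + b + c),
        "kappa": ratio(p_o - p_e, 1 - p_e),
        "prevalence_index": abs(a - d) / n,
        "bias_index": abs(b - c) / n,
    }


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    for name, value in two_rater_summary(20, 5, 10, 15).items():
        print(f"{name}: {value:.3f}")
```

A high prevalence index or bias index signals that kappa may be misleadingly low or high for the same raw agreement, which is why the three are usually reported together.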
Lehotkay, R; Saraswathi Devi, T; Raju, M V R; Bada, P K; Nuti, S; Kempf, N; Carminati, G Galli
2015-03-01
In this study, carried out in collaboration with the department of psychology and parapsychology of Andhra University, the Aberrant Behavior Checklist-Community (ABC-C) was validated in Telugu, the official language of Andhra Pradesh, one of India's 28 states. To assess the factor validity and reliability of this Telugu version, 120 participants with moderate to profound intellectual disability (94 men and 26 women, mean age 25.2, SD 7.1) were rated by the staff of the Lebenshilfe Institution for Mentally Handicapped in Visakhapatnam, Andhra Pradesh, India. Rating data were analysed with a confirmatory factor analysis. The internal consistency was estimated by Cronbach's alpha. To confirm the test-retest reliability, 50 participants were rated twice with an interval of 4 weeks, and 50 were rated by pairs of raters to assess inter-rater reliability. Confirmatory factor analysis revealed that the root mean square error of approximation (RMSEA) was equal to 0.06, the comparative fit index (CFI) was equal to 0.77, and the Tucker Lewis index (TLI) was equal to 0.77, which indicated that the model with five correlated factors had a good fit. Coefficient alpha ranged from 0.85 to 0.92 across the five subscales. Spearman's rank correlation coefficients for inter-rater reliability tests ranged from 0.65 to 0.75, and the correlations for test-retest reliability ranged from 0.58 to 0.76. All reliability coefficients were statistically significant (P < 0.01). The Telugu version of the ABC-C showed factor validity and reliability comparable to the original English version and appears to be useful for assessing behaviour disorders in Indian people with intellectual disabilities. © 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Hu, Zhi-Jun; He, Jian; Zhao, Feng-Dong; Fang, Xiang-Qian; Zhou, Li-Na; Fan, Shun-Wu
2011-06-01
A reliability study was conducted. To estimate the intra- and intermeasurement errors in the measurements of functional cross-sectional area (FCSA), density, and T2 signal intensity of paraspinal muscles using computed tomography (CT) scan and magnetic resonance imaging (MRI). CT scan and MRI have been used widely to measure the cross-sectional area and degeneration of the back muscles in spine and muscle research, but there is still no systematic study analyzing the reliability of these measurements. This study measured the FCSA and fatty infiltration (density on CT scan and T2 signal intensity on MRI) of the paraspinal muscles at L3-L4, L4-L5, and L5-S1 in 29 patients with chronic low back pain. Two experienced musculoskeletal radiologists and one senior spine surgeon traced the region of interest twice within 3 weeks for measurement of the intra- and interobserver reliability. The intraclass correlation coefficients (ICCs) of the intra-reliability ranged from fair to excellent for FCSA, and good to excellent for fatty infiltration. The ICCs of the inter-reliability ranged from fair to excellent for FCSA, and good to excellent for fatty infiltration. There were no significant differences between CT scan and MRI in reliability results, except in the relative standard error of fatty infiltration measurement. The ICCs of the FCSA measurement between CT scan and MRI ranged from poor to good. The reliabilities of the CT scan and MRI for measuring the FCSA and fatty infiltration of the atrophied lumbar paraspinal muscles were acceptable. Using a uniform single-image method was reliable for evaluating a single paraspinal muscle. The authors recommend MRI rather than CT scan for measuring the FCSA and fatty infiltration of paraspinal muscles.
Toward accurate and precise estimates of lion density.
Elliot, Nicholas B; Gopalaswamy, Arjun M
2017-08-01
Reliable estimates of animal density are fundamental to understanding ecological processes and population dynamics. Furthermore, their accuracy is vital to conservation because wildlife authorities rely on estimates to make decisions. However, it is notoriously difficult to accurately estimate density for wide-ranging carnivores that occur at low densities. In recent years, significant progress has been made in density estimation of Asian carnivores, but the methods have not been widely adapted to African carnivores, such as lions (Panthera leo). Although abundance indices for lions may produce poor inferences, they continue to be used to estimate density and inform management and policy. We used sighting data from a 3-month survey and adapted a Bayesian spatially explicit capture-recapture (SECR) model to estimate spatial lion density in the Maasai Mara National Reserve and surrounding conservancies in Kenya. Our unstructured spatial capture-recapture sampling design incorporated search effort to explicitly estimate detection probability and density on a fine spatial scale, making our approach robust in the context of varying detection probabilities. Overall posterior mean lion density was estimated to be 17.08 (posterior SD 1.310) lions >1 year old/100 km², and the sex ratio was estimated at 2.2 females to 1 male. Our modeling framework and narrow posterior SD demonstrate that SECR methods can produce statistically rigorous and precise estimates of population parameters, and we argue that they should be favored over less reliable abundance indices. Furthermore, our approach is flexible enough to incorporate different data types, which enables robust population estimates over relatively short survey periods in a variety of systems. Trend analyses are essential to guide conservation decisions but are frequently based on surveys of differing reliability. We therefore call for a unified framework to assess lion numbers in key populations to improve management and policy decisions. © 2016 Society for Conservation Biology.
Höller, Yvonne; Uhl, Andreas; Bathke, Arne; Thomschewski, Aljoscha; Butz, Kevin; Nardone, Raffaele; Fell, Jürgen; Trinka, Eugen
2017-01-01
Measures of interaction (connectivity) of the EEG are at the forefront of current neuroscientific research. Unfortunately, test-retest reliability can be very low, depending on the measure and its estimation, the EEG-frequency of interest, the length of the signal, and the population under investigation. In addition, artifacts can hamper the continuity of the EEG signal, and in some clinical situations it is impractical to exclude artifacts. We aimed to examine factors that moderate test-retest reliability of measures of interaction. The study involved 40 patients with a range of neurological diseases and memory impairments (age median: 60; range 21–76; 40% female; 22 mild cognitive impairment, 5 subjective cognitive complaints, 13 temporal lobe epilepsy), and 20 healthy controls (age median: 61.5; range 23–74; 70% female). We calculated 14 measures of interaction based on the multivariate autoregressive model from two EEG-recordings separated by 2 weeks. We characterized test-retest reliability by correlating the measures between the two EEG-recordings for variations of data length, data discontinuity, artifact exclusion, model order, and frequency over all combinations of channels and all frequencies, individually for each subject, yielding a correlation coefficient for each participant. Excluding artifacts had strong effects on reliability of some measures, such as classical, real valued coherence (~0.1 before, ~0.9 after artifact exclusion). Full frequency directed transfer function was highly reliable and robust against artifacts. Variation of data length decreased reliability in relation to poor adjustment of model order and signal length. Variation of discontinuity had no effect, but reliabilities were different between model orders, frequency ranges, and patient groups depending on the measure. Pathology did not interact with variation of signal length or discontinuity. 
Our results emphasize the importance of documenting reliability, which may vary considerably between measures of interaction. We recommend careful selection of measures of interaction in accordance with the properties of the data. When only short data segments are available and when the signal length varies strongly across subjects after exclusion of artifacts, reliability becomes an issue. Finally, measures which show high reliability irrespective of the presence of artifacts could be extremely useful in clinical situations when exclusion of artifacts is impractical. PMID:28912704
Reliability of reservoir firm yield determined from the historical drought of record
Archfield, S.A.; Vogel, R.M.
2005-01-01
The firm yield of a reservoir is typically defined as the maximum yield that could have been delivered without failure during the historical drought of record. In the future, reservoirs will experience droughts that are either more or less severe than the historical drought of record. The question addressed here is what the reliability of such systems will be when operated at the firm yield. To address this question, we examine the reliability of 25 hypothetical reservoirs sited across five locations in the central and western United States. These locations provided a continuous 756-month streamflow record spanning the same time interval. The firm yield of each reservoir was estimated from the historical drought of record at each location. To determine the steady-state monthly reliability of each firm-yield estimate, 12,000-month synthetic records were generated using the moving-blocks bootstrap method. Bootstrapping was repeated 100 times for each reservoir to obtain an average steady-state monthly reliability R, the number of months the reservoir did not fail divided by the total months. Values of R were greater than 0.99 for 60 percent of the study reservoirs; the other 40 percent ranged from 0.95 to 0.98. Estimates of R were highly correlated with both the level of development (ratio of firm yield to average streamflow) and average lag-1 monthly autocorrelation. Together these two predictors explained 92 percent of the variability in R, with the level of development alone explaining 85 percent of the variability. Copyright ASCE 2005.
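The procedure described in this abstract (resample the record in blocks, operate the reservoir at a fixed yield, count non-failure months) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The simple storage mass balance, the block length of 24 months, the choice to start with a full reservoir, and all function names are hypothetical.

```python
import random


def moving_blocks_bootstrap(series, length, block_len, rng):
    """Build a synthetic record by concatenating randomly chosen overlapping blocks."""
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    out = []
    while len(out) < length:
        start = rng.randrange(n - block_len + 1)
        out.extend(series[start:start + block_len])
    return out[:length]


def monthly_reliability(inflows, capacity, demand, initial_storage=None):
    """Fraction of months in which the full demand could be delivered."""
    storage = capacity if initial_storage is None else initial_storage
    months_ok = 0
    for inflow in inflows:
        available = storage + inflow
        if available >= demand:
            months_ok += 1
            storage = min(capacity, available - demand)  # excess water spills
        else:
            storage = 0.0  # failure month: the reservoir is drawn down to empty
    return months_ok / len(inflows)


def mean_bootstrap_reliability(historical, capacity, demand,
                               n_reps=100, length=12000, block_len=24, seed=0):
    """Average reliability R over repeated synthetic records."""
    rng = random.Random(seed)
    values = [
        monthly_reliability(
            moving_blocks_bootstrap(historical, length, block_len, rng), capacity, demand)
        for _ in range(n_reps)
    ]
    return sum(values) / n_reps
```

Resampling whole blocks rather than single months keeps short-range persistence in the flows, which matters here because the abstract reports that R correlates with lag-1 monthly autocorrelation.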
Reliability of tanoak volume equations when applied to different areas
Norman H. Pillsbury; Philip M. McDonald; Victor Simon
1995-01-01
Tree volume equations for tanoak (Lithocarpus densiflorus) were developed for seven stands throughout its natural range and compared by a volume prediction and a parameter difference method. The objective was to test if volume estimates from a species growing in a local, relatively uniform habitat could be applied more widely. Results indicated...
NASA Astrophysics Data System (ADS)
Marks, D. G.; Kormos, P.; Johnson, M.; Bormann, K. J.; Hedrick, A. R.; Havens, S.; Robertson, M.; Painter, T. H.
2017-12-01
Lidar-derived snow depths, when combined with modeled or estimated snow density, can provide reliable estimates of the distribution of SWE over large mountain areas. Application of this approach is transforming western snow hydrology. We present a comprehensive approach toward modeling bulk snow density that is reliable over a vast range of weather and snow conditions. The method is applied and evaluated over mountainous regions of California, Idaho, Oregon and Colorado in the western US. Simulated and measured snow density are compared at fourteen validation sites across the western US where measurements of snow mass (SWE) and depth are co-located. Fitting statistics for ten sites from three mountain catchments (two in Idaho, one in California) show an average Nash-Sutcliffe model efficiency coefficient of 0.83, and mean bias of 4 kg m⁻³. Results illustrate issues associated with monitoring snow depth and SWE and show the effectiveness of the model, with a small mean bias across a range of snow and climate conditions in the west.
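The quantities in this abstract follow standard definitions, sketched below. Bulk density is SWE divided by depth, and the fit statistics are the Nash-Sutcliffe efficiency and the mean bias of simulated minus measured values. The function names and sample numbers are hypothetical, and the sign convention for bias is an assumption.

```python
def bulk_density_kg_m3(swe_mm: float, depth_m: float) -> float:
    """Bulk snow density; 1 mm of SWE equals 1 kg of water per square metre."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return swe_mm / depth_m


def nash_sutcliffe(observed, simulated) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    denom = sum((o - mean_obs) ** 2 for o in observed)
    if denom == 0:
        raise ValueError("NSE is undefined when the observations do not vary")
    return 1.0 - sum((o - s) ** 2 for o, s in zip(observed, simulated)) / denom


def mean_bias(observed, simulated) -> float:
    """Mean of simulated minus observed (positive means the model overestimates)."""
    return sum(s - o for o, s in zip(observed, simulated)) / len(observed)


if __name__ == "__main__":
    measured = [300.0, 350.0, 400.0]  # illustrative densities in kg m^-3
    modelled = [310.0, 350.0, 410.0]
    print(nash_sutcliffe(measured, modelled), mean_bias(measured, modelled))
```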
Clark, S; Rose, D J
2001-04-01
To establish reliability estimates of the 75% Limits of Stability Test (75% LOS test) when administered to community-dwelling older adults with a history of falls. Generalizability theory was used to estimate both the relative contribution of identified error sources to the total measurement error and generalizability coefficients. A random effects repeated-measures analysis of variance (ANOVA) was used to assess consistency of LOS test movement variables across both days and targets. A motor control research laboratory in a university setting. Fifty community-dwelling older adults with 2 or more falls in the previous year. Spatial and temporal measures of dynamic balance derived from the 75% LOS test included average movement velocity, maximum center of gravity (COG) excursion, end-point COG excursion, and directional control. Estimated generalizability coefficients for 2 testing days ranged from .58 to .87. Total variance in LOS test measures attributable to inconsistencies in day-to-day test performance (Day and Subject x Day facets) ranged from 2.5% to 8.4%. The ANOVA results indicated that no significant differences were observed in the LOS test variables across the 2 testing days. The 75% LOS test administered to older adult fallers on 2 consecutive days provides consistent and reliable measures of dynamic balance.
Evapotranspiration from areas of native vegetation in west-central Florida
Bidlake, W.R.; Woodham, W.M.; Lopez, Miguel Angel
1996-01-01
The micrometeorological methods of energy-balance Bowen ratio and eddy correlation probably are suitable for determining evapotranspiration from unforested sites, but the aerodynamic effects of tall tree canopies need to be considered when the methods are used for forested sites. Potential evapotranspiration methods might not yield reliable estimates of evapotranspiration for all areas of native vegetation. Estimates of annual evapotranspiration ranged from 970 millimeters for a cypress swamp site to 1,060 millimeters for a pine flatwood site.
Petrowski, Katja; Kliem, Sören; Sadler, Michael; Meuret, Alicia E; Ritz, Thomas; Brähler, Elmar
2018-02-06
Demands placed on individuals in occupational and social settings, as well as imbalances in personal traits and resources, can lead to chronic stress. The Trier Inventory for Chronic Stress (TICS) measures chronic stress while incorporating domain-specific aspects, and has been found to be a highly reliable and valid research tool. The aims of the present study were to confirm the factorial structure of the German version of the TICS in an English translation of the instrument (TICS-E) and to report its psychometric properties. A random route sample of healthy participants (N = 483) aged 18-30 years completed the TICS-E. The robust maximum likelihood estimation with a mean-adjusted chi-square test statistic was applied due to the sample's significant deviation from the multivariate normal distribution. Goodness of fit, absolute model fit, and relative model fit were assessed by means of the root mean square error of approximation (RMSEA), the Comparative Fit Index (CFI) and the Tucker Lewis Index (TLI). Reliability estimates (Cronbach's α and adjusted split-half reliability) ranged from .84 to .92. Item-scale correlations ranged from .50 to .85. Measures of fit showed values of .052 for RMSEA (CI = 0.50-.054) and .067 for SRMR for absolute model fit, and values of .846 (TLI) and .855 (CFI) for relative model fit. Factor loadings ranged from .55 to .91. The psychometric properties and factor structure of the TICS-E are comparable to the German version of the TICS. The instrument therefore meets quality standards for an adequate measurement of chronic stress.
Anhang Price, Rebecca; Stucky, Brian; Parast, Layla; Elliott, Marc N; Haas, Ann; Bradley, Melissa; Teno, Joan M
2018-03-20
Increasingly, dying patients and their families have a choice of hospice providers. Care quality varies considerably across providers; informing consumers of these differences may help to improve their selection of hospices. To develop and evaluate standardized survey measures of hospice care experiences for the purpose of comparing and publicly reporting hospice performance. We assessed item performance and constructed composite measures by factor analysis, evaluating item-scale correlations and estimating reliability. To assess key drivers of overall experiences, we regressed overall rating and willingness to recommend the hospice on each composite. Data submitted by 2500 hospices participating in national implementation of the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Hospice Survey for April through September 2015. Composite measures of Hospice Team Communication, Getting Timely Care, Treating Family Member with Respect, Getting Emotional and Religious Support, Getting Help for Symptoms, and Getting Hospice Care Training. Cronbach's alpha estimates for the composite measures range from 0.61 to 0.85; hospice-level reliability for the measures ranges from 0.67 to 0.81 assuming 200 completed surveys per hospice. Together, the composites are responsible for 48% of the variance in caregivers' overall ratings of hospices. Hospice Team Communication is the strongest predictor of overall rating of care. Our analyses provide evidence of the reliability and validity of CAHPS Hospice Survey measure scores. Results also highlight important opportunities to improve the quality of hospice care, particularly with regard to addressing symptoms of anxiety and sadness, discussing side effects of pain medicine, and keeping family informed of the patient's condition.
Deltombe, T; Jamart, J; Recloux, S; Legrand, C; Vandenbroeck, N; Theys, S; Hanson, P
2007-03-01
We conducted a reliability comparison study to determine the intrarater and interrater reliability and the limits of agreement of the volume estimated by circumferential measurements using the frustum sign method and the disk model method, by water displacement volumetry, and by infrared optoelectronic volumetry in the assessment of upper limb lymphedema. Thirty women with lymphedema following axillary lymph node dissection for breast cancer were enrolled. In each patient, the volumes of the upper limbs were estimated by three physical therapists using circumference measurements, water displacement and optoelectronic volumetry. One of the physical therapists performed each measure twice. Intraclass correlation coefficients (ICCs), relative differences, and limits of agreement were determined. Intrarater and interrater reliability ICCs ranged from 0.94 to 1. Intrarater relative differences were 1.9% for the disk model method, 3.2% for the frustum sign model method, 2.9% for water displacement volumetry, and 1.5% for optoelectronic volumetry. Intrarater reliability was always better than interrater reliability, except for the optoelectronic method. Intrarater and interrater limits of agreement were calculated for each technique. The disk model method and optoelectronic volumetry had better reliability than the frustum sign method and water displacement volumetry, which is usually considered to be the gold standard. In terms of low cost, simplicity, and reliability, we recommend the disk model method as the method of choice in clinical practice. Since intrarater reliability was always better than interrater reliability (except for optoelectronic volumetry), patients should ideally always be evaluated by the same therapist. Additionally, the limits of agreement must be taken into account when determining the response of a patient to treatment.
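The two circumference-based methods compared in this abstract rest on simple solid geometry. The sketch below uses the usual truncated-cone and stacked-cylinder formulas. The measurement interval, the function names, and the sample circumferences are assumptions, since the abstract does not give the measurement protocol.

```python
import math


def frustum_volume_ml(circumferences_cm, step_cm):
    """Sum of truncated cones between consecutive circumferences.

    Each segment: V = h * (C1^2 + C1*C2 + C2^2) / (12 * pi), in cm^3 (= mL).
    """
    if len(circumferences_cm) < 2:
        raise ValueError("at least two circumferences are needed")
    pairs = zip(circumferences_cm, circumferences_cm[1:])
    return sum(step_cm * (c1 * c1 + c1 * c2 + c2 * c2) / (12.0 * math.pi)
               for c1, c2 in pairs)


def disk_volume_ml(circumferences_cm, step_cm):
    """Sum of cylinders, one per circumference: V = C^2 * h / (4 * pi)."""
    return sum(c * c * step_cm / (4.0 * math.pi) for c in circumferences_cm)


if __name__ == "__main__":
    arm = [17.0, 20.5, 24.0, 26.5, 29.0]  # illustrative circumferences in cm
    print(round(frustum_volume_ml(arm, 4.0)), round(disk_volume_ml(arm, 4.0)))
```

Note that the disk model counts one cylinder per measurement while the frustum model counts one segment per pair of measurements, so the two totals cover slightly different limb lengths unless the protocol accounts for this.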
Effects of intrinsic aging and photodamage on skin dyspigmentation: an explorative study
NASA Astrophysics Data System (ADS)
Dobos, Gabor; Trojahn, Carina; D'Alessandro, Brian; Patwardhan, Sachin; Canfield, Douglas; Blume-Peytavi, Ulrike; Kottner, Jan
2016-06-01
Photoaging is associated with increasing pigmentary heterogeneity and darkening of skin color. However, little is known about age-related changes in skin pigmentation on sun-protected areas. The aim of this explorative study was to measure skin color and dyspigmentation using image processing and to evaluate the reliability of these parameters. Twenty-four volunteers of three age-groups were included in this explorative study. Measurements were conducted at sun-exposed and sun-protected areas. Overall skin-color estimates were similar among age groups. The hyper- and hypopigmentation indices differed significantly by age groups and their correlations with age ranged between 0.61 and 0.74. Dorsal forearm skin differed from the other investigational areas (p<0.001). We observed an increase in dyspigmentation at all skin areas, including sun-protected skin areas, already in young adulthood. Associations between age and dyspigmentation estimates were higher compared to color parameters. All color and dyspigmentation estimates showed high reliability. Dyspigmentation parameters seem to be better biomarkers for UV damage than the overall color measurements.
Sensitivity of wildlife habitat models to uncertainties in GIS data
NASA Technical Reports Server (NTRS)
Stoms, David M.; Davis, Frank W.; Cogan, Christopher B.
1992-01-01
Decision makers need to know the reliability of output products from GIS analysis. For many GIS applications, it is not possible to compare these products to an independent measure of 'truth'. Sensitivity analysis offers an alternative means of estimating reliability. In this paper, we present a GIS-based statistical procedure for estimating the sensitivity of wildlife habitat models to uncertainties in input data and model assumptions. The approach is demonstrated in an analysis of habitat associations derived from a GIS database for the endangered California condor. Alternative data sets were generated to compare results over a reasonable range of assumptions about several sources of uncertainty. Sensitivity analysis indicated that condor habitat associations are relatively robust, and the results have increased our confidence in our initial findings. Uncertainties and methods described in the paper have general relevance for many GIS applications.
White dwarf stars and the age of the Galactic disk
NASA Technical Reports Server (NTRS)
Wood, M. A.
1990-01-01
The history of the Galaxy is written in its oldest stars, the white dwarf (WD) stars. Significant limits can be placed on both the Galactic age and star formation history. A wide range of input WD model sequences is used to derive the current limits to the age estimates suggested by fitting to the observed falloff in the WD luminosity function. The results suggest that the star formation rate over the history of the Galaxy has been relatively constant, and that the disk age lies in the range 6-12 billion years, depending upon the assumed structure of WD stars, and in particular on the core composition and surface helium layer mass. Using plausible mixed C/O core input models, the estimates for the disk age range from 8 to 10.5 Gyr, i.e., substantially younger than most age estimates for the halo globular clusters. After speculating on the significance of the results, expected observational and theoretical refinements which will further enhance the reliability of the method are discussed.
Complete Bouguer gravity map of the Medicine Lake Quadrangle, California
Finn, C.
1981-01-01
A mathematical technique, called kriging, was programmed for a computer to interpolate hydrologic data based on a network of measured values in west-central Kansas. The computer program generated estimated values at the center of each 1-mile section in the Western Kansas Groundwater Management District No. 1 and facilitated contouring of selected values that are needed in the effective management of ground water for irrigation. The kriging technique produced objective and reproducible maps that illustrated hydrologic conditions in the Ogallala aquifer, the principal source of water in west-central Kansas. Maps of the aquifer, which use a 3-year average, included the 1978-80 water-table altitudes, which ranged from about 2,580 to 3,720 feet; the 1978-80 saturated thicknesses, which ranged from about 0 to 250 feet; and the percentage changes in saturated thickness from 1950 to 1978-80, which ranged from about a 50-percent increase to a 100-percent decrease. A map showing errors of estimate also was provided as a measure of reliability for the 1978-80 water-table altitudes. Errors of estimate ranged from 2 to 24 feet. (USGS)
Reliability Problems of the Datum: Solutions for Questionnaire Responses.
ERIC Educational Resources Information Center
Bastick, Tony
Questionnaires often ask for estimates, and these estimates are given with different reliabilities. It is difficult to know the different reliabilities of single estimates and to take these into account in subsequent analyses. This paper contains a practical example to show that not taking the reliability of different responses into account can…
Lincoln estimates of mallard (Anas platyrhynchos) abundance in North America.
Alisauskas, Ray T; Arnold, Todd W; Leafloor, James O; Otis, David L; Sedinger, James S
2014-01-01
Estimates of range-wide abundance, harvest, and harvest rate are fundamental for sound inferences about the role of exploitation in the dynamics of free-ranging wildlife populations, but reliability of existing survey methods for abundance estimation is rarely assessed using alternative approaches. North American mallard populations have been surveyed each spring since 1955 using internationally coordinated aerial surveys, but population size can also be estimated with Lincoln's method using banding and harvest data. We estimated late summer population size of adult and juvenile male and female mallards in western, midcontinent, and eastern North America using Lincoln's method of dividing (i) total estimated harvest by estimated harvest rate, calculated as (ii) direct band recovery rate divided by the (iii) band reporting rate. Our goal was to compare estimates based on Lincoln's method with traditional estimates based on aerial surveys. Lincoln estimates of adult males and females alive in the period June-September were 4.0 (range: 2.5-5.9), 1.8 (range: 0.6-3.0), and 1.8 (range: 1.3-2.7) times larger than respective aerial survey estimates for the western, midcontinent, and eastern mallard populations, and the two population estimates were only modestly correlated with each other (western: r = 0.70, 1993-2011; midcontinent: r = 0.54, 1961-2011; eastern: r = 0.50, 1993-2011). Higher Lincoln estimates are predictable given that the geographic scope of inference from Lincoln estimates is the entire population range, whereas sampling frames for aerial surveys are incomplete. Although each estimation method has a number of important potential biases, our review suggests that underestimation of total population size by aerial surveys is the most likely explanation.
In addition to providing measures of total abundance, Lincoln's method provides estimates of fecundity and population sex ratio and could be used in integrated population models to provide greater insights about population dynamics and management of North American mallards and most other harvested species.
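The arithmetic of Lincoln's method as described in this abstract (harvest divided by harvest rate, where harvest rate is direct recovery rate divided by reporting rate) can be sketched as follows. This is a minimal illustration only: the function name, argument names, and the input numbers are hypothetical and are not taken from the study, and the sketch ignores the variance estimation a real analysis would need.

```python
def lincoln_abundance(total_harvest, direct_recovery_rate, reporting_rate):
    """Lincoln-style abundance estimate.

    harvest_rate = direct_recovery_rate / reporting_rate
    abundance    = total_harvest / harvest_rate
    All names here are illustrative, not from the paper.
    """
    if total_harvest < 0:
        raise ValueError("total_harvest must be non-negative")
    if direct_recovery_rate <= 0:
        raise ValueError("direct_recovery_rate must be positive")
    if not 0 < reporting_rate <= 1:
        raise ValueError("reporting_rate must be in (0, 1]")
    harvest_rate = direct_recovery_rate / reporting_rate
    return total_harvest / harvest_rate


# Hypothetical inputs: 1,000,000 birds harvested, 8% of banded birds
# recovered directly, 80% of recovered bands reported.
n_hat = lincoln_abundance(1_000_000, 0.08, 0.8)
print(round(n_hat))
```

With these made-up inputs the implied harvest rate is 0.10, so the estimate is ten times the harvest.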
Use of Internal Consistency Coefficients for Estimating Reliability of Experimental Tasks Scores
Green, Samuel B.; Yang, Yanyun; Alt, Mary; Brinkley, Shara; Gray, Shelley; Hogan, Tiffany; Cowan, Nelson
2017-01-01
Reliabilities of scores for experimental tasks are likely to differ from one study to another to the extent that the task stimuli change, the number of trials varies, the type of individuals taking the task changes, the administration conditions are altered, or the focal task variable differs. Given reliabilities vary as a function of the design of these tasks and the characteristics of the individuals taking them, making inferences about the reliability of scores in an ongoing study based on reliability estimates from prior studies is precarious. Thus, it would be advantageous to estimate reliability based on data from the ongoing study. We argue that internal consistency estimates of reliability are underutilized for experimental task data and in many applications could provide this information using a single administration of a task. We discuss different methods for computing internal consistency estimates with a generalized coefficient alpha and the conditions under which these estimates are accurate. We illustrate use of these coefficients using data for three different tasks. PMID:26546100
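As a rough illustration of estimating internal consistency from a single administration, the sketch below computes the standard coefficient alpha, treating trials (or parcels of trials) as items. It is not the generalized coefficient alpha the abstract refers to, whose details are not given here; the data layout, function name, and example scores are assumptions made for illustration.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Standard coefficient (Cronbach's) alpha.

    scores: list of rows, one row per participant, each row holding
    that participant's k item (or trial-parcel) scores.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    if len(scores) < 2:
        raise ValueError("need at least two participants")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each row needs the same number (>= 2) of items")
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 4 participants x 3 trial parcels.
example = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 5]]
print(round(coefficient_alpha(example), 3))
```

Computing alpha on the data at hand, rather than borrowing a value from an earlier study, is the practice the abstract argues for.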
Wartmann, Flurina M; Purves, Ross S; van Schaik, Carel P
2010-04-01
Quantification of the spatial needs of individuals and populations is vitally important for management and conservation. Geographic information systems (GIS) have recently become important analytical tools in wildlife biology, improving our ability to understand animal movement patterns, especially when very large data sets are collected. This study aims at combining the field of GIS with primatology to model and analyse space-use patterns of wild orang-utans. Home ranges of female orang-utans in the Tuanan Mawas forest reserve in Central Kalimantan, Indonesia were modelled with kernel density estimation methods. Kernel results were compared with minimum convex polygon estimates, and were found to perform better, because they were less sensitive to sample size and produced more reliable estimates. Furthermore, daily travel paths were calculated from 970 complete follow days. Annual ranges for the resident females were approximately 200 ha and remained stable over several years; total home range size was estimated to be 275 ha. On average, each female shared a third of her home range with each neighbouring female. Orang-utan females in Tuanan built their night nest on average 414 m away from the morning nest, whereas average daily travel path length was 777 m. A significant effect of fruit availability on day path length was found. Sexually active females covered longer distances per day and may also temporarily expand their ranges.
Guillaume, François; Fritz, Sébastien; Boichard, Didier; Druet, Tom
2008-01-01
The efficiency of the French marker-assisted selection (MAS) was estimated by a simulation study. The data files of two different time periods were used: April 2004 and 2006. The simulation method used the structure of the existing French MAS: same pedigree, same marker genotypes and same animals with records. The program simulated breeding values and new records based on this existing structure and knowledge on the QTL used in MAS (variance and frequency). Reliabilities of genetic values of young animals (less than one year old) obtained with and without marker information were compared to assess the efficiency of MAS for evaluation of milk, fat and protein yields and fat and protein contents. Mean gains of reliability ranged from 0.015 to 0.094 and from 0.038 to 0.114 in 2004 and 2006, respectively. The larger number of animals genotyped and the use of a new set of genetic markers can explain the improvement of MAS reliability from 2004 to 2006. This improvement was also observed by analysis of information content for young candidates. The gain of MAS reliability with respect to classical selection was larger for sons of sires with genotyped progeny daughters with records. Finally, it was shown that when superiority of MAS over classical selection was estimated with daughter yield deviations obtained after progeny test instead of true breeding values, the gain was underestimated. PMID:18096117
Methods for Identifying Object Class, Type, and Orientation in the Presence of Uncertainty
1990-08-01
on Range Finding Techniques for Computer Vision," IEEE Trans. on Pattern Analysis and Machine Intelligence PAMI-5 (2), pp 129-139 March 1983. 15. Yang... Artificial Intelligence Applications, pp 199-205, December 1984. 16. Flynn, P.J. and Jain, A.K.," On Reliable Curvature Estimation, " Proceedings of the
Computer-Aided Reliability Estimation
NASA Technical Reports Server (NTRS)
Bavuso, S. J.; Stiffler, J. J.; Bryant, L. A.; Petersen, P. L.
1986-01-01
CARE III (Computer-Aided Reliability Estimation, Third Generation) helps estimate reliability of complex, redundant, fault-tolerant systems. Program specifically designed for evaluation of fault-tolerant avionics systems. However, CARE III general enough for use in evaluation of other systems as well.
Methods for estimating flood frequency in Montana based on data through water year 1998
Parrett, Charles; Johnson, Dave R.
2004-01-01
Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. 
The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.
Hu, Yinhuan; Zhang, Zixia; Xie, Jinzhu; Wang, Guanping
2017-02-01
The objective of this study is to describe the development of the Outpatient Experience Questionnaire (OPEQ) and to assess the validity and reliability of the scale. Literature review, patient interviews, Delphi method and cross-sectional validation survey. Six comprehensive public hospitals in China. The survey was carried out on a sample of 600 outpatients. Acceptability of the questionnaire was assessed according to the overall response rate, item non-response rate and the average completion time. Correlation coefficients and confirmatory factor analysis were used to test construct validity. Delphi method was used to assess the content validity of the questionnaire. Cronbach's coefficient alpha and split-half reliability coefficient were used to estimate the internal reliability of the questionnaire. The overall response rate was 97.2% and the item non-response rate ranged from 0% to 0.3%. The mean completion time was 6 min. The Spearman correlations of item-total score ranged from 0.466 to 0.765. The results of confirmatory factor analysis showed that all items had factor loadings above 0.40 and the dimension intercorrelation ranged from 0.449 to 0.773; the goodness of fit of the questionnaire was reasonable. The overall authority grade of expert consultation was 0.80 and Kendall's coefficient of concordance W was 0.186. The Cronbach's alpha coefficients of the six dimensions ranged from 0.708 to 0.895, and the split-half reliability coefficient (Spearman-Brown coefficient) was 0.969. The OPEQ is a promising instrument covering the most important aspects which influence outpatient experiences of comprehensive public hospitals in China. It has good evidence for acceptability, validity and reliability. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Clayson, Peter E; Miller, Gregory A
2017-01-01
Generalizability theory (G theory) provides a flexible, multifaceted approach to estimating score reliability. G theory's approach to estimating score reliability has important advantages over classical test theory that are relevant for research using event-related brain potentials (ERPs). For example, G theory does not require parallel forms (i.e., equal means, variances, and covariances), can handle unbalanced designs, and provides a single reliability estimate for designs with multiple sources of error. This monograph provides a detailed description of the conceptual framework of G theory using examples relevant to ERP researchers, presents the algorithms needed to estimate ERP score reliability, and provides a detailed walkthrough of newly-developed software, the ERP Reliability Analysis (ERA) Toolbox, that calculates score reliability using G theory. The ERA Toolbox is open-source, Matlab software that uses G theory to estimate the contribution of the number of trials retained for averaging, group, and/or event types on ERP score reliability. The toolbox facilitates the rigorous evaluation of psychometric properties of ERP scores recommended elsewhere in this special issue. Copyright © 2016 Elsevier B.V. All rights reserved.
Reliability considerations for the total strain range version of strainrange partitioning
NASA Technical Reports Server (NTRS)
Wirsching, P. H.; Wu, Y. T.
1984-01-01
A proposed total strainrange version of strainrange partitioning (SRP) to enhance the manner in which SRP is applied to life prediction is considered with emphasis on how advanced reliability technology can be applied to perform risk analysis and to derive safety check expressions. Uncertainties existing in the design factors associated with life prediction of a component which experiences the combined effects of creep and fatigue can be identified. Examples illustrate how reliability analyses of such a component can be performed when all design factors in the SRP model are random variables reflecting these uncertainties. The Rackwitz-Fiessler and Wu algorithms are used and estimates of the safety index and the probability of failure are demonstrated for a SRP problem. Methods of analysis of creep-fatigue data with emphasis on procedures for producing synoptic statistics are presented. An attempt to demonstrate the importance of the contribution of the uncertainties associated with small sample sizes (fatigue data) to risk estimates is discussed. The procedure for deriving a safety check expression for possible use in a design criteria document is presented.
Patterns and determinants of mammal species occurrence in India
Karanth, K.K.; Nichols, J.D.; Hines, J.E.; Karanth, K.U.; Christensen, N.L.
2009-01-01
1. Many Indian mammals face range contraction and extinction, but assessments of their population status are hindered by the lack of reliable distribution data and range maps. 2. We estimated the current geographical ranges of 20 species of large mammals by applying occupancy models to data from country-wide experts. We modelled species in relation to ecological and social covariates (protected areas, landscape characteristics and human influences) based on a priori hypotheses about plausible determinants of mammalian distribution patterns. 3. We demonstrated that failure to incorporate detection probability in distribution survey methods underestimated habitat occupancy for all species. 4. Protected areas were important for the distribution of 16 species. However, for many species much of their current range remains unprotected. The availability of evergreen forests was important for the occurrence of 14 species, temperate forests for six species, deciduous forests for 15 species and higher altitude habitats for two species. Low human population density was critical for the occurrence of five species, while culturally based tolerance was important for the occurrence of nine other species. 5. Rhino Rhinoceros unicornis, gaur Bos gaurus and elephant Elephas maximus showed the most restricted ranges among herbivores, and sun bear Helarctos malayanus, brown bear Ursus arctos and tiger Panthera tigris were most restricted among carnivores. While cultural tolerance has helped the survival of some mammals, legal protection has been critically associated with occurrence of most species. 6. Synthesis and applications. Extent of range is an important determinant of species conservation status. Understanding the relationship of species occurrence with ecological and socio-cultural covariates is important for identification and management of key conservation areas.
The combination of occupancy models with field data from country-wide experts enables reliable estimation of species range and habitat associations for conservation at regional scales. © 2009 British Ecological Society.
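Why ignoring detection probability underestimates occupancy can be shown with a short calculation. The sketch below assumes the simplest textbook setup (constant occupancy probability psi, constant per-survey detection probability p, and k independent surveys per site); the abstract does not describe the model structure actually used, and all names and numbers are hypothetical.

```python
def detection_over_surveys(p, k):
    """Probability of detecting a species at least once in k surveys
    of an occupied site, assuming independent surveys with constant p."""
    if not 0 <= p <= 1 or k < 1:
        raise ValueError("need 0 <= p <= 1 and k >= 1")
    return 1 - (1 - p) ** k


def expected_naive_occupancy(psi, p, k):
    """Expected share of sites with at least one detection.
    This 'naive' occupancy equals psi only when detection is certain."""
    if not 0 <= psi <= 1:
        raise ValueError("need 0 <= psi <= 1")
    return psi * detection_over_surveys(p, k)


# Hypothetical values: 60% of sites occupied, 30% detection per survey,
# three surveys per site.
naive = expected_naive_occupancy(psi=0.6, p=0.3, k=3)
print(round(naive, 4))
```

Under these assumed values the naive estimate falls well below the true occupancy of 0.6, which is the direction of bias the abstract reports.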
Rapid Detection of Small Movements with GNSS Doppler Observables
NASA Astrophysics Data System (ADS)
Hohensinn, Roland; Geiger, Alain
2017-04-01
High-alpine terrain reacts very sensitively to varying environmental conditions. As an example, increasing temperatures cause thawing of permafrost areas. This, in turn, causes an increasing threat by natural hazards like debris flow (e.g. rock glaciers) or rockfalls. The Institute of Geodesy and Photogrammetry is contributing to alpine mass-movement monitoring systems in different project areas in the Swiss Alps. A main focus lies on providing geodetic mass-movement information derived from GNSS static solutions on a daily and a sub-daily basis, obtained with low-cost and autonomous GNSS stations. Another focus is set on rapidly providing reliable geodetic information in real-time i.e. for an integration in early warning systems. One way to achieve this is the estimation of accurate station velocities from observations of range rates, which can be obtained as Doppler observables from time derivatives of carrier phase measurements. The key for this method lies in a precise modeling of prominent effects contributing to the observed range rates, which are satellite velocity, atmospheric delay rates and relativistic effects. A suitable observation model is then devised, which accounts for these predictions. The observation model, combined with a simple kinematic movement model forms the basis for the parameter estimation. Based on the estimated station velocities, movements are then detected using a statistical test. To improve the reliability of the estimated parameters, another spotlight is set on an on-line quality control procedure. We will present the basic algorithms as well as results from first tests which were carried out with a low-cost GPS L1 phase receiver. With a u-blox module and a sampling rate of 5 Hz, accuracies on the mm/s level can be obtained and velocities down to 1 cm/s can be detected. Reliable and accurate station velocities and movement information can be provided within seconds.
A Note on Structural Equation Modeling Estimates of Reliability
ERIC Educational Resources Information Center
Yang, Yanyun; Green, Samuel B.
2010-01-01
Reliability can be estimated using structural equation modeling (SEM). Two potential problems with this approach are that estimates may be unstable with small sample sizes and biased with misspecified models. A Monte Carlo study was conducted to investigate the quality of SEM estimates of reliability by themselves and relative to coefficient…
Estimating Measures of Pass-Fail Reliability from Parallel Half-Tests.
ERIC Educational Resources Information Center
Woodruff, David J.; Sawyer, Richard L.
Two methods for estimating measures of pass-fail reliability are derived, by which both theta and kappa may be estimated from a single test administration. The methods require only a single test administration and are computationally simple. Both are based on the Spearman-Brown formula for estimating stepped-up reliability. The non-distributional…
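The Spearman-Brown step that the abstract names as the basis of both methods can be sketched as below. Only the stepped-up reliability formula is shown; the procedures for estimating theta and kappa themselves are not described in this truncated abstract and are not reproduced. The function name and the example correlation are hypothetical.

```python
def spearman_brown(r, factor=2.0):
    """Spearman-Brown prophecy formula.

    r:      reliability (e.g. correlation between two parallel half-tests)
    factor: how many times longer the target test is than the one
            that produced r (2.0 steps a half-test up to full length)
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    denominator = 1 + (factor - 1) * r
    if denominator == 0:
        raise ValueError("formula undefined for this r and factor")
    return factor * r / denominator


# Hypothetical half-test correlation of 0.60 stepped up to full length.
print(spearman_brown(0.60))
```

The same function with a factor other than 2 gives the projected reliability of a lengthened or shortened test.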
Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients
ERIC Educational Resources Information Center
Andersson, Björn; Xin, Tao
2018-01-01
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Reliability Correction for Functional Connectivity: Theory and Implementation
Mueller, Sophia; Wang, Danhong; Fox, Michael D.; Pan, Ruiqi; Lu, Jie; Li, Kuncheng; Sun, Wei; Buckner, Randy L.; Liu, Hesheng
2016-01-01
Network properties can be estimated using functional connectivity MRI (fcMRI). However, regional variation of the fMRI signal causes systematic biases in network estimates including correlation attenuation in regions of low measurement reliability. Here we computed the spatial distribution of fcMRI reliability using longitudinal fcMRI datasets and demonstrated how pre-estimated reliability maps can correct for correlation attenuation. As a test case of reliability-based attenuation correction we estimated properties of the default network, where reliability was significantly lower than average in the medial temporal lobe and higher in the posterior medial cortex, heterogeneity that impacts estimation of the network. Accounting for this bias using attenuation correction revealed that the medial temporal lobe’s contribution to the default network is typically underestimated. To render this approach useful to a greater number of datasets, we demonstrate that test-retest reliability maps derived from repeated runs within a single scanning session can be used as a surrogate for multi-session reliability mapping. Using data segments with different scan lengths between 1 and 30 min, we found that test-retest reliability of connectivity estimates increases with scan length while the spatial distribution of reliability is relatively stable even at short scan lengths. Finally, analyses of tertiary data revealed that reliability distribution is influenced by age, neuropsychiatric status and scanner type, suggesting that reliability correction may be especially important when studying between-group differences. Collectively, these results illustrate that reliability-based attenuation correction is an easily implemented strategy that mitigates certain features of fMRI signal nonuniformity. PMID:26493163
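The idea of correcting a correlation for attenuation can be illustrated with the classical Spearman formula, which divides the observed correlation by the square root of the product of the two measures' reliabilities. Whether the study's implementation matches this form exactly is an assumption; the function name and the example values are hypothetical.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation:
    r_corrected = r_observed / sqrt(reliability_x * reliability_y)

    With noisy reliability estimates the result can exceed 1 in
    magnitude; this sketch returns it unclipped.
    """
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must be in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical example: observed connectivity r = 0.30 between a
# low-reliability region (0.50) and a higher-reliability region (0.72).
print(round(disattenuate(0.30, 0.50, 0.72), 3))
```

Because the divisor shrinks as reliability drops, connections involving low-reliability regions receive the largest upward correction, consistent with the abstract's point about underestimated contributions.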
Reliability of a structured interview for admission to an emergency medicine residency program.
Blouin, Danielle
2010-10-01
Interviews are most important in resident selection. Structured interviews are more reliable than unstructured ones. We sought to measure the interrater reliability of a newly designed structured interview during the selection process to an Emergency Medicine residency program. The critical incident technique was used to extract the desired dimensions of performance. The interview tool consisted of 7 clinical scenarios and 1 global rating. Three trained interviewers marked each candidate on all scenarios without discussing candidates' responses. Interitem consistency and estimates of variance were computed. Twenty-eight candidates were interviewed. The generalizability coefficient was 0.67. Removing the central tendency ratings increased the coefficient to 0.74. Coefficients of interitem consistency ranged from 0.64 to 0.74. The structured interview tool provided good although suboptimal interrater reliability. Increasing the number of scenarios improves reliability as does applying differential weights to the rating scale anchors. The latter would also facilitate the identification of those candidates with extreme ratings.
Bergeron, Lise; Smolla, Nicole; Berthiaume, Claude; Renaud, Johanne; Breton, Jean-Jacques; St-Georges, Marie; Morin, Pauline; Zavaglia, Elissa; Labelle, Réal
2017-03-01
The Dominic Interactive for Adolescents-Revised (DIA-R) is a multimedia self-report screen for 9 mental disorders, borderline personality traits, and suicidality defined by the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). This study aimed to examine the reliability and the validity of this instrument. French- and English-speaking adolescents aged 12 to 15 years (N = 447) were recruited from schools and clinical settings in Montreal and were evaluated twice. The internal consistency was estimated by Cronbach alpha coefficients and the test-retest reliability by intraclass correlation coefficients. Cutoff points on the DIA-R scales were determined by using clinically relevant measures for defining external validation criteria: the Schedule for Affective Disorders and Schizophrenia for School-Aged Children, the Beck Hopelessness Scale, and the Abbreviated-Diagnostic Interview for Borderlines. Receiver operating characteristic (ROC) analyses provided accuracy estimates (area under the ROC curve, sensitivity, specificity, likelihood ratio) to evaluate the ability of the DIA-R scales to predict external criteria. For most of the DIA-R scales, reliability coefficients were excellent or moderate. High or moderate accuracy estimates from ROC analyses demonstrated the ability of the DIA-R thresholds to predict psychopathological conditions. These thresholds were generally able to discriminate between clinical and school subsamples. However, the validity of the obsessions/compulsions scale was too low. Findings clearly support the reliability and the validity of the DIA-R. This instrument may be useful to assess a wide range of adolescents' mental health problems in the continuum of services. This conclusion applies to all scales, except the obsessions/compulsions one.
TU, Frank F.; EPSTEIN, Aliza E.; POZOLO, Kristen E.; SEXTON, Debra L.; MELNYK, Alexandra I.; HELLMAN, Kevin M.
2012-01-01
Objective Catheterization to measure bladder sensitivity is aversive and hinders human participation in visceral sensory research. Therefore, we sought to characterize the reliability of sonographically-estimated female bladder sensory thresholds. To demonstrate this technique’s usefulness, we examined the effects of self-reported dysmenorrhea on bladder pain thresholds. Methods Bladder sensory threshold volumes were determined during provoked natural diuresis in 49 healthy women (mean age 24 ± 8) using three-dimensional ultrasound. Cystometric thresholds (Vfs – first sensation, Vfu – first urge, Vmt – maximum tolerance) were quantified and related to bladder urgency and pain. We estimated reliability (one-week retest and interrater). Self-reported menstrual pain was examined in relationship to bladder pain, urgency and volume thresholds. Results Average bladder sensory thresholds (mLs) were Vfs (160±100), Vfu (310±130), and Vmt (500±180). Interrater reliability ranged from 0.97–0.99. One-week retest reliability was Vmt = 0.76 (95% CI 0.64–0.88), Vfs = 0.62 (95% CI 0.44–0.80), and Vfu = 0.63, (95% CI 0.47–0.80). Bladder filling rate correlated with all thresholds (r = 0.53–0.64, p < 0.0001). Women with moderate to severe dysmenorrhea pain had increased bladder pain and urgency at Vfs and increased pain at Vfu (p’s < 0.05). In contrast, dysmenorrhea pain was unrelated to bladder capacity. Discussion Sonographic estimates of bladder sensory thresholds were reproducible and reliable. In these healthy volunteers, dysmenorrhea was associated with increased bladder pain and urgency during filling but unrelated to capacity. Plausibly, dysmenorrhea sufferers may exhibit enhanced visceral mechanosensitivity, increasing their risk to develop chronic bladder pain syndromes. PMID:23370073
Bosquet, Laurent; Porta-Benache, Jeremy; Blais, Jérôme
2010-01-01
The aim of this study was to assess the validity and accuracy of a commercial linear encoder (Musclelab, Ergotest, Norway) to estimate Bench press 1 repetition maximum (1RM) from the force - velocity relationship. Twenty-seven physical education students and teachers (5 women and 22 men) with a heterogeneous history of strength training participated in this study. They performed a 1 RM test and a force - velocity test using a Bench press lifting task in a random order. Mean 1 RM was 61.8 ± 15.3 kg (range: 34 to 100 kg), while 1 RM estimated by the Musclelab’s software from the force-velocity relationship was 56.4 ± 14.0 kg (range: 33 to 91 kg). Actual and estimated 1 RM were very highly correlated (r = 0.93, p<0.001) but largely different (Bias: 5.4 ± 5.7 kg, p < 0.001, ES = 1.37). The 95% limits of agreement were ±11.2 kg, which represented ±18% of actual 1 RM. It was concluded that 1 RM estimated from the force-velocity relationship was a good measure for monitoring training induced adaptations, but also that it was not accurate enough to prescribe training intensities. Additional studies are required to determine whether accuracy is affected by age, sex or initial level. Key points Some commercial devices allow 1 RM to be estimated from the force-velocity relationship. These estimations are valid. However, their accuracy is not high enough to be of practical help for training intensity prescription. Day-to-day reliability of force and velocity measured by the linear encoder has been shown to be very high, but the specific reliability of 1 RM estimated from the force-velocity relationship has to be determined before drawing conclusions about the usefulness of this approach in the monitoring of training induced adaptations. PMID:24149641
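The limits-of-agreement figures quoted in this abstract can be reproduced from its summary statistics, assuming the "± 5.7 kg" attached to the bias is the standard deviation of the differences and that the limits follow the usual Bland-Altman form (bias ± 1.96 SD). Both points are assumptions, although the arithmetic below is consistent with the reported ±11.2 kg and ±18%. The function name is hypothetical.

```python
def limits_of_agreement(bias, sd_diff, z=1.96):
    """Bland-Altman style 95% limits of agreement.

    Returns (lower, upper, half_width), where half_width = z * sd_diff.
    Assumes differences are roughly normally distributed.
    """
    if sd_diff < 0:
        raise ValueError("sd_diff must be non-negative")
    half_width = z * sd_diff
    return bias - half_width, bias + half_width, half_width


# Summary values taken from the abstract.
bias_kg, sd_kg, mean_1rm_kg = 5.4, 5.7, 61.8
lower, upper, half_width = limits_of_agreement(bias_kg, sd_kg)
percent_of_mean = 100 * half_width / mean_1rm_kg
print(round(half_width, 1), round(percent_of_mean))
```

The abstract reports the half-width around the bias rather than the lower and upper limits themselves; under the stated assumption the function returns all three.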
Body mass estimates of hominin fossils and the evolution of human body size.
Grabowski, Mark; Hatala, Kevin G; Jungers, William L; Richmond, Brian G
2015-08-01
Body size directly influences an animal's place in the natural world, including its energy requirements, home range size, relative brain size, locomotion, diet, life history, and behavior. Thus, an understanding of the biology of extinct organisms, including species in our own lineage, requires accurate estimates of body size. Since the last major review of hominin body size based on postcranial morphology over 20 years ago, new fossils have been discovered, species attributions have been clarified, and methods improved. Here, we present the most comprehensive and thoroughly vetted set of individual fossil hominin body mass predictions to date, and estimation equations based on a large (n = 220) sample of modern humans of known body masses. We also present species averages based exclusively on fossils with reliable taxonomic attributions, estimates of species averages by sex, and a metric for levels of sexual dimorphism. Finally, we identify individual traits that appear to be the most reliable for mass estimation for each fossil species, for use when only one measurement is available for a fossil. Our results show that many early hominins were generally smaller-bodied than previously thought, an outcome likely due to larger estimates in previous studies resulting from the use of large-bodied modern human reference samples. Current evidence indicates that modern human-like large size first appeared by at least 3-3.5 Ma in some Australopithecus afarensis individuals. Our results challenge an evolutionary model arguing that body size increased from Australopithecus to early Homo. Instead, we show that there is no reliable evidence that the body size of non-erectus early Homo differed from that of australopiths, and confirm that Homo erectus evolved larger average body size than earlier hominins. Copyright © 2015 Elsevier Ltd. All rights reserved.
Falaggis, Konstantinos; Towers, David P; Towers, Catherine E
2012-09-20
Multiwavelength interferometry (MWI) is a well established technique in the field of optical metrology. Previously, we have reported a theoretical analysis of the method of excess fractions that describes the mutual dependence of unambiguous measurement range, reliability, and the measurement wavelengths. In this paper, wavelength selection strategies are introduced that are built on the theoretical description and maximize the reliability in the calculated fringe order for a given measurement range, number of wavelengths, and level of phase noise. Practical implementation issues for an MWI interferometer are analyzed theoretically. It is shown that dispersion compensation is best implemented by use of reference measurements around absolute zero in the interferometer. Furthermore, the effects of wavelength uncertainty allow the ultimate performance of an MWI interferometer to be estimated.
Mbizah, Moreangels M; Steenkamp, Gerhard; Groom, Rosemary J
2016-01-01
African wild dogs (Lycaon pictus) are endangered and their population continues to decline throughout their range. Given their conservation status, more research focused on their population dynamics, population growth and age specific mortality is needed and this requires reliable estimates of age and age of mortality. Various age determination methods from teeth and skull measurements have been applied in numerous studies and it is fundamental to test the validity of these methods and their applicability to different species. In this study we assessed the accuracy of estimating chronological age and age class of African wild dogs, from dental age measured by (i) counting cementum annuli (ii) pulp cavity/tooth width ratio, (iii) tooth wear (measured by tooth crown height) (iv) tooth wear (measured by tooth crown width/crown height ratio) (v) tooth weight and (vi) skull measurements (length, width and height). A sample of 29 African wild dog skulls, from opportunistically located carcasses was analysed. Linear and ordinal regression analysis was done to investigate the performance of each of the six age determination methods in predicting wild dog chronological age and age class. Counting cementum annuli was the most accurate method for estimating chronological age of wild dogs with a 79% predictive capacity, while pulp cavity/tooth width ratio was also a reliable method with a 68% predictive capacity. Counting cementum annuli and pulp cavity/tooth width ratio were again the most accurate methods for separating wild dogs into three age classes (6-24 months; 25-60 months and > 60 months), with a McFadden's Pseudo-R2 of 0.705 and 0.412 respectively. The use of the cementum annuli method is recommended when estimating age of wild dogs since it is the most reliable method. However, its use is limited as it requires tooth extraction and shipping, is time consuming and expensive, and is not applicable to living individuals. 
Pulp cavity/tooth width ratio is a moderately reliable method for estimating both chronological age and age class. This method gives a balance between accuracy, cost and practicability, therefore it is recommended when precise age estimations are not paramount.
Steenkamp, Gerhard; Groom, Rosemary J.
2016-01-01
African wild dogs (Lycaon pictus) are endangered and their population continues to decline throughout their range. Given their conservation status, more research focused on their population dynamics, population growth and age specific mortality is needed and this requires reliable estimates of age and age of mortality. Various age determination methods from teeth and skull measurements have been applied in numerous studies and it is fundamental to test the validity of these methods and their applicability to different species. In this study we assessed the accuracy of estimating chronological age and age class of African wild dogs, from dental age measured by (i) counting cementum annuli (ii) pulp cavity/tooth width ratio, (iii) tooth wear (measured by tooth crown height) (iv) tooth wear (measured by tooth crown width/crown height ratio) (v) tooth weight and (vi) skull measurements (length, width and height). A sample of 29 African wild dog skulls, from opportunistically located carcasses was analysed. Linear and ordinal regression analysis was done to investigate the performance of each of the six age determination methods in predicting wild dog chronological age and age class. Counting cementum annuli was the most accurate method for estimating chronological age of wild dogs with a 79% predictive capacity, while pulp cavity/tooth width ratio was also a reliable method with a 68% predictive capacity. Counting cementum annuli and pulp cavity/tooth width ratio were again the most accurate methods for separating wild dogs into three age classes (6–24 months; 25–60 months and > 60 months), with a McFadden’s Pseudo-R2 of 0.705 and 0.412 respectively. The use of the cementum annuli method is recommended when estimating age of wild dogs since it is the most reliable method. However, its use is limited as it requires tooth extraction and shipping, is time consuming and expensive, and is not applicable to living individuals. 
Pulp cavity/tooth width ratio is a moderately reliable method for estimating both chronological age and age class. This method gives a balance between accuracy, cost and practicability, therefore it is recommended when precise age estimations are not paramount. PMID:27732663
Kenny, Sarah J; Palacios-Derflingher, Luz; Owoeye, Oluwatoyosi B A; Whittaker, Jackie L; Emery, Carolyn A
2018-03-15
Critical appraisal of research investigating risk factors for musculoskeletal injury in dancers suggests high quality reliability studies are lacking. The purpose of this study was to determine between-day reliability of pre-participation screening (PPS) components in pre-professional ballet and contemporary dancers. Thirty-eight dancers (35 female, 3 male; median age: 18 years; range: 11 to 30 years) participated. Screening components (Athletic Coping Skills Inventory-28, body mass index, percent total body fat, total bone mineral density, Foot Posture Index-6, hip and ankle range of motion, three lumbopelvic control tasks, unipedal dynamic balance, and the Y-Balance Test) were conducted one week apart. Intra-class correlation coefficients (ICCs: 95% confidence intervals), standard error of measurement, minimal detectable change (MDC), Bland-Altman methods of agreement [95% limits of agreement (LOA)], Cohen's kappa coefficients, standard error, and percent agreements were calculated. Depending on the screening component, ICC estimates ranged from 0.51 to 0.98, kappa coefficients varied between -0.09 and 0.47, and percent agreement spanned 71% to 95%. Wide 95% LOA were demonstrated by Foot Posture Index-6 (right: -6.06, 7.31), passive hip external rotation (right: -9.89, 16.54), and passive supine turnout (left: -15.36, 17.58). The PPS components examined demonstrated moderate to excellent relative reliability with mean between-day differences less than MDC, or sufficient percent agreement, across all assessments. However, due to wide 95% limits of agreement, the Foot Posture Index-6 and passive hip range of motion are not recommended for screening injury risk in pre-professional dancers.
Net trophic transfer efficiency of PCBs to Lake Michigan coho salmon from their prey
Madenjian, Charles P.; Elliott, Robert F.; Schmidt, Larry J.; DeSorcie, Timothy J.; Hesselberg, Robert J.; Quintal, Richard T.; Begnoche, Linda J.; Bouchard, Patrick M.; Holey, Mark E.
1998-01-01
Most of the polychlorinated biphenyl (PCB) body burden accumulated by coho salmon (Oncorhynchus kisutch) from the Laurentian Great Lakes is from their food. We used diet information, PCB determinations in both coho salmon and their prey, and bioenergetics modeling to estimate the efficiency with which Lake Michigan coho salmon retain PCBs from their food. Our estimate was the most reliable estimate to date because (a) the coho salmon and prey fish sampled during our study were sampled in spring, summer, and fall from various locations throughout the lake, (b) detailed measurements were made on the PCB concentrations of both coho salmon and prey fish over wide ranges in fish size, and (c) coho salmon diet was analyzed in detail from April through November over a wide range of salmon size from numerous locations throughout the lake. We estimated that coho salmon from Lake Michigan retain 50% of the PCBs that are contained within their food.
Chung, Chia-Fang; Xu, Kaiyuan; Dong, Yi; Schenk, Jeanette M.; Cain, Kevin; Munson, Sean; Heitkemper, Margaret M.
2017-01-01
There are currently no standardized methods for identifying trigger food(s) from irritable bowel syndrome (IBS) food and symptom journals. The primary aim of this study was to assess the inter-rater reliability of providers’ interpretations of IBS journals. A second aim was to describe whether these interpretations varied for each patient. Eight providers reviewed 17 IBS journals and rated how likely key food groups (fermentable oligo-di-monosaccharides and polyols, high-calorie, gluten, caffeine, high-fiber) were to trigger IBS symptoms for each patient. Agreement of trigger food ratings was calculated using Krippendorff’s α-reliability estimate. Providers were also asked to write down recommendations they would give to each patient. Estimates of agreement of trigger food likelihood ratings were poor (average α = 0.07). Most providers gave similar trigger food likelihood ratings for over half the food groups. Four providers gave the exact same written recommendation(s) (range 3–7) to over half the patients. Inter-rater reliability of provider interpretations of IBS food and symptom journals was poor. Providers favored certain trigger food likelihood ratings and written recommendations. This supports the need for a more standardized method for interpreting these journals and/or more rigorous techniques to accurately identify personalized IBS food triggers. PMID:29113044
Validity and reliability of the Self-Reported Physical Fitness (SRFit) survey.
Keith, NiCole R; Clark, Daniel O; Stump, Timothy E; Miller, Douglas K; Callahan, Christopher M
2014-05-01
An accurate physical fitness survey could be useful in research and clinical care. To estimate the validity and reliability of a Self-Reported Fitness (SRFit) survey; an instrument that estimates muscular fitness, flexibility, cardiovascular endurance, BMI, and body composition (BC) in adults ≥ 40 years of age. 201 participants completed the SF-36 Physical Function Subscale, International Physical Activity Questionnaire (IPAQ), Older Adults' Desire for Physical Competence Scale (Rejeski), the SRFit survey, and the Rikli and Jones Senior Fitness Test. BC, height and weight were measured. SRFit survey items described BC, BMI, and Senior Fitness Test movements. Correlations between the Senior Fitness Test and the SRFit survey assessed concurrent validity. Cronbach's Alpha measured internal consistency within each SRFit domain. SRFit domain scores were compared with SF-36, IPAQ, and Rejeski survey scores to assess construct validity. Intraclass correlations evaluated test-retest reliability. Correlations between SRFit and the Senior Fitness Test domains ranged from 0.35 to 0.79. Cronbach's Alpha scores were .75 to .85. Correlations between SRFit and other survey scores were -0.23 to 0.72 and in the expected direction. Intraclass correlation coefficients were 0.79 to 0.93. All P-values were 0.001. Initial evaluation supports the SRFit survey's validity and reliability.
Clayson, Peter E; Miller, Gregory A
2017-01-01
Failing to consider psychometric issues related to reliability and validity, differential deficits, and statistical power potentially undermines the conclusions of a study. In research using event-related brain potentials (ERPs), numerous contextual factors (population sampled, task, data recording, analysis pipeline, etc.) can impact the reliability of ERP scores. The present review considers the contextual factors that influence ERP score reliability and the downstream effects that reliability has on statistical analyses. Given the context-dependent nature of ERPs, it is recommended that ERP score reliability be formally assessed on a study-by-study basis. Recommended guidelines for ERP studies include 1) reporting the threshold of acceptable reliability and reliability estimates for observed scores, 2) specifying the approach used to estimate reliability, and 3) justifying how trial-count minima were chosen. A reliability threshold for internal consistency of at least 0.70 is recommended, and a threshold of 0.80 is preferred. The review also advocates the use of generalizability theory for estimating score dependability (the generalizability theory analog to reliability) as an improvement on classical test theory reliability estimates, suggesting that the latter is less well suited to ERP research. To facilitate the calculation and reporting of dependability estimates, an open-source Matlab program, the ERP Reliability Analysis Toolbox, is presented. Copyright © 2016 Elsevier B.V. All rights reserved.
Simultaneous calibration of ensemble river flow predictions over an entire range of lead times
NASA Astrophysics Data System (ADS)
Hemri, S.; Fundel, F.; Zappa, M.
2013-10-01
Probabilistic estimates of future water levels and river discharge are usually simulated with hydrologic models using ensemble weather forecasts as main inputs. As hydrologic models are imperfect and the meteorological ensembles tend to be biased and underdispersed, the ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, in order to achieve both reliable and sharp predictions statistical postprocessing is required. In this work Bayesian model averaging (BMA) is applied to statistically postprocess ensemble runoff raw forecasts for a catchment in Switzerland, at lead times ranging from 1 to 240 h. The raw forecasts have been obtained using deterministic and ensemble forcing meteorological models with different forecast lead time ranges. First, BMA is applied based on mixtures of univariate normal distributions, subject to the assumption of independence between distinct lead times. Then, the independence assumption is relaxed in order to estimate multivariate runoff forecasts over the entire range of lead times simultaneously, based on a BMA version that uses multivariate normal distributions. Since river runoff is a highly skewed variable, Box-Cox transformations are applied in order to achieve approximate normality. Both univariate and multivariate BMA approaches are able to generate well calibrated probabilistic forecasts that are considerably sharper than climatological forecasts. Additionally, multivariate BMA provides a promising approach for incorporating temporal dependencies into the postprocessed forecasts. Its major advantage against univariate BMA is an increase in reliability when the forecast system is changing due to model availability.
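The Box-Cox step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the standard one-parameter Box-Cox transform and its inverse, not the authors' implementation; the runoff values and the choice of lambda = 0.2 are hypothetical and are not taken from the study.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform for x > 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


# Hypothetical, strongly right-skewed runoff values (m^3/s) and a hypothetical lambda.
runoff = [2.0, 3.5, 4.0, 6.0, 55.0]
lam = 0.2

transformed = [box_cox(q, lam) for q in runoff]
restored = [inverse_box_cox(y, lam) for y in transformed]
```

In a postprocessing workflow of the kind described, the statistical model would be fitted on the transformed scale, where normal distributions are a more reasonable approximation, and forecasts would be mapped back with the inverse transform.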
Reliability of fish size estimates obtained from multibeam imaging sonar
Hightower, Joseph E.; Magowan, Kevin J.; Brown, Lori M.; Fox, Dewayne A.
2013-01-01
Multibeam imaging sonars have considerable potential for use in fisheries surveys because the video-like images are easy to interpret, and they contain information about fish size, shape, and swimming behavior, as well as characteristics of occupied habitats. We examined images obtained using a dual-frequency identification sonar (DIDSON) multibeam sonar for Atlantic sturgeon Acipenser oxyrinchus oxyrinchus, striped bass Morone saxatilis, white perch M. americana, and channel catfish Ictalurus punctatus of known size (20–141 cm) to determine the reliability of length estimates. For ranges up to 11 m, percent measurement error (sonar estimate – total length)/total length × 100 varied by species but was not related to the fish's range or aspect angle (orientation relative to the sonar beam). Least-square mean percent error was significantly different from 0.0 for Atlantic sturgeon (x̄ = −8.34, SE = 2.39) and white perch (x̄ = 14.48, SE = 3.99) but not striped bass (x̄ = 3.71, SE = 2.58) or channel catfish (x̄ = 3.97, SE = 5.16). Underestimating lengths of Atlantic sturgeon may be due to difficulty in detecting the snout or the longer dorsal lobe of the heterocercal tail. White perch was the smallest species tested, and it had the largest percent measurement errors (both positive and negative) and the lowest percentage of images classified as good or acceptable. Automated length estimates for the four species using Echoview software varied with position in the view-field. Estimates tended to be low at more extreme azimuthal angles (fish's angle off-axis within the view-field), but mean and maximum estimates were highly correlated with total length. Software estimates also were biased by fish images partially outside the view-field and when acoustic crosstalk occurred (when a fish perpendicular to the sonar and at relatively close range is detected in the side lobes of adjacent beams). 
These sources of bias are apparent when files are processed manually and can be filtered out when producing automated software estimates. Multibeam sonar estimates of fish size should be useful for research and management if these potential sources of bias and imprecision are addressed.
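The percent measurement error defined in this abstract, (sonar estimate − total length)/total length × 100, can be written as a small helper. The function name and the example fish below are hypothetical and serve only to show the sign convention: negative values mean the sonar underestimated the fish's length.

```python
def percent_measurement_error(sonar_estimate_cm, total_length_cm):
    """Percent error of a sonar length estimate relative to the known total length."""
    if total_length_cm <= 0:
        raise ValueError("total length must be positive")
    return (sonar_estimate_cm - total_length_cm) / total_length_cm * 100.0


# Hypothetical example: a 100 cm fish measured as 92 cm on the sonar image.
error = percent_measurement_error(92.0, 100.0)
print(f"{error:.1f}%")  # prints: -8.0%
```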
A particle swarm model for estimating reliability and scheduling system maintenance
NASA Astrophysics Data System (ADS)
Puzis, Rami; Shirtz, Dov; Elovici, Yuval
2016-05-01
Modifying data and information system components may introduce new errors and deteriorate the reliability of the system. Reliability can be efficiently regained with reliability centred maintenance, which requires reliability estimation for maintenance scheduling. A variant of the particle swarm model is used to estimate reliability of systems implemented according to the model view controller paradigm. Simulations based on data collected from an online system of a large financial institute are used to compare three component-level maintenance policies. Results show that appropriately scheduled component-level maintenance greatly reduces the cost of upholding an acceptable level of reliability by reducing the need in system-wide maintenance.
Development of a hybrid pollution index for heavy metals in marine and estuarine sediments.
Brady, James P; Ayoko, Godwin A; Martens, Wayde N; Goonetilleke, Ashantha
2015-05-01
Heavy metal pollution of sediments is a growing concern in most parts of the world, and numerous studies focussed on identifying contaminated sediments by using a range of digestion methods and pollution indices to estimate sediment contamination have been described in the literature. The current work provides a critical review of the more commonly used sediment digestion methods and identifies that weak acid digestion is more likely to provide guidance on elements that are likely to be bioavailable than other traditional methods of digestion. This work also reviews common pollution indices and identifies the Nemerow Pollution Index as the most appropriate method for establishing overall sediment quality. Consequently, a modified Pollution Index that can lead to a more reliable understanding of whole sediment quality is proposed. This modified pollution index is then tested against a number of existing studies and demonstrated to give a reliable and rapid estimate of sediment contamination and quality.
Palta, Mari; Chen, Han-Yang; Kaplan, Robert M; Feeny, David; Cherepanov, Dasha; Fryback, Dennis G
2011-01-01
Standard errors of measurement (SEMs) of health-related quality of life (HRQoL) indexes are not well characterized. SEM is needed to estimate responsiveness statistics, and is a component of reliability. To estimate the SEM of 5 HRQoL indexes. The National Health Measurement Study (NHMS) was a population-based survey. The Clinical Outcomes and Measurement of Health Study (COMHS) provided repeated measures. A total of 3844 randomly selected adults from the noninstitutionalized population aged 35 to 89 y in the contiguous United States and 265 cataract patients. The SF6-36v2™, QWB-SA, EQ-5D, HUI2, and HUI3 were included. An item-response theory approach captured joint variation in indexes into a composite construct of health (theta). The authors estimated 1) the test-retest standard deviation (SEM-TR) from COMHS, 2) the structural standard deviation (SEM-S) around theta from NHMS, and 3) reliability coefficients. SEM-TR was 0.068 (SF-6D), 0.087 (QWB-SA), 0.093 (EQ-5D), 0.100 (HUI2), and 0.134 (HUI3), whereas SEM-S was 0.071, 0.094, 0.084, 0.074, and 0.117, respectively. These yield reliability coefficients 0.66 (COMHS) and 0.71 (NHMS) for SF-6D, 0.59 and 0.64 for QWB-SA, 0.61 and 0.70 for EQ-5D, 0.64 and 0.80 for HUI2, and 0.75 and 0.77 for HUI3, respectively. The SEM varied across levels of health, especially for HUI2, HUI3, and EQ-5D, and was influenced by ceiling effects. Limitations. Repeated measures were 5 mo apart, and estimated theta contained measurement error. The 2 types of SEM are similar and substantial for all the indexes and vary across health.
ERIC Educational Resources Information Center
Lee, Guemin; Park, In-Yong
2012-01-01
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Intersession reliability of self-selected and narrow stance balance testing in older adults.
Riemann, Bryan L; Piersol, Kelsey
2017-10-01
Despite the common practice of using force platforms to assess balance of older adults, few investigations have examined the reliability of postural screening tests in this population. We sought to determine the test-retest reliability of self-selected and narrow stance balance testing with eyes open and eyes closed in healthy older adults. Thirty older adults (>65 years) completed 45 s trials of eyes open and eyes closed stability tests using self-selected and narrow stances on two separate days (1.9 ± .7 days). Average medial-lateral center of pressure velocity was computed. The ICC results ranged from .74 to .86, and no significant systematic changes (P < .05) occurred between the testing sessions for any of the tests. The standard error of measurement ranged from 15.9 to 23.6%. Reliability estimates were similar between the two stances and visual conditions assessed. Slightly higher coefficients were identified for the self-selected stances compared to the narrow stances under both visual conditions; however, there were negligible differences between the sessions. The within subject session-to-session variability provides a basis for further research to consider differences between fallers and non-fallers. Reliability for eyes open and closed balance testing using self-selected and narrow stances in older adults was established which should provide a foundation for the development of fall risk screening tests.
Groschen, George E.
1985-01-01
Two simulations of the projected pumping (a low estimate, as much as 46.2 cubic feet per second during 2011-20, and a high estimate, as much as 60.0 cubic feet per second during the same period) indicate that no further regional water-quality deterioration is likely to occur. Many important properties and conditions are estimated from poor or insufficient field data, and possible ranges of these properties and conditions are tested. In spite of the errors and data deficiencies, the results are based on the best estimates currently available. The reliability of the conclusions rests on the adequacy of the data and the demonstrated sensitivity of the model results to errors in estimates of these properties.
ERIC Educational Resources Information Center
Black, Ryan A.; Yang, Yanyun; Beitra, Danette; McCaffrey, Stacey
2015-01-01
Estimation of composite reliability within a hierarchical modeling framework has recently become of particular interest given the growing recognition that the underlying assumptions of coefficient alpha are often untenable. Unfortunately, coefficient alpha remains the prominent estimate of reliability when estimating total scores from a scale with…
Methods to assess geological CO2 storage capacity: Status and best practice
Heidug, Wolf; Brennan, Sean T.; Holloway, Sam; Warwick, Peter D.; McCoy, Sean; Yoshimura, Tsukasa
2013-01-01
To understand the emission reduction potential of carbon capture and storage (CCS), decision makers need to understand the amount of CO2 that can be safely stored in the subsurface and the geographical distribution of storage resources. Estimates of storage resources need to be made using reliable and consistent methods. Previous estimates of CO2 storage potential for a range of countries and regions have been based on a variety of methodologies resulting in a correspondingly wide range of estimates. Consequently, there has been uncertainty about which of the methodologies were most appropriate in given settings, and whether the estimates produced by these methods were useful to policy makers trying to determine the appropriate role of CCS. In 2011, the IEA convened two workshops which brought together experts from six national survey organisations to review CO2 storage assessment methodologies and make recommendations on how to harmonise CO2 storage estimates worldwide. This report presents the findings of these workshops and an internationally shared guideline for quantifying CO2 storage resources.
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.
ERIC Educational Resources Information Center
Wood, Terry M.; Safrit, Margaret J.
1987-01-01
A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
Systematic effects in LOD from SLR observations
NASA Astrophysics Data System (ADS)
Bloßfeld, Mathis; Gerstl, Michael; Hugentobler, Urs; Angermann, Detlef; Müller, Horst
2014-09-01
Besides the estimation of station coordinates and the Earth’s gravity field, laser ranging observations to near-Earth satellites can be used to determine the rotation of the Earth. One parameter of this rotation is ΔLOD (excess Length Of Day), which describes the excess revolution time of the Earth w.r.t. 86,400 s. Due to correlations among the different parameter groups, it is difficult to obtain reliable estimates for all parameters. In the official ΔLOD products of the International Earth Rotation and Reference Systems Service (IERS), the ΔLOD information determined from laser ranging observations is excluded from the processing. In this paper, we study in detail the existing correlations between ΔLOD, the orbital node Ω, the even zonal gravity field coefficients, cross-track empirical accelerations and relativistic accelerations caused by the Lense-Thirring and deSitter effect, using first order Gaussian perturbation equations. Using different a priori gravity field models, we found discrepancies of up to 1.0 ms for polar orbits at an altitude of 500 km and up to 40.0 ms if the gravity field coefficients are estimated using only observations to LAGEOS 1. If observations to LAGEOS 2 are included, reliable ΔLOD estimates can be achieved. Nevertheless, an impact of the a priori gravity field even on the multi-satellite ΔLOD estimates can be clearly identified. Furthermore, we investigate the effect of empirical cross-track accelerations and the effect of relativistic accelerations of near-Earth satellites on ΔLOD. A total effect of 0.0088 ms is caused by unmodeled Lense-Thirring and deSitter terms. The partial derivatives of these accelerations w.r.t. the position and velocity of the satellite cause very small variations (0.1 μs) on ΔLOD.
Learmonth, Yvonne C; Dlugonski, Deirdre D; Pilutti, Lara A; Sandroff, Brian M; Motl, Robert W
2013-11-01
Assessing walking impairment in those with multiple sclerosis (MS) is common; however, little is known about the reliability, precision and clinically important change of walking outcomes. The purpose of this study was to determine the reliability, precision and clinically important change of the Timed 25-Foot Walk (T25FW), Six-Minute Walk (6MW), Multiple Sclerosis Walking Scale-12 (MSWS-12) and accelerometry. Data were collected from 82 persons with MS at two time points, six months apart. Analyses were undertaken for the whole sample and stratified based on disability level and usage of walking aids. Intraclass correlation coefficient (ICC) analyses established reliability; standard error of measurement (SEM) and coefficient of variation (CV) determined precision; and minimal detectable change (MDC) defined clinically important change. All outcome measures were reliable with precision and MDC varying between measures in the whole sample: T25FW: ICC=0.991; SEM=1 s; CV=6.2%; MDC=2.7 s (36%), 6MW: ICC=0.959; SEM=32 m; CV=6.2%; MDC=88 m (20%), MSWS-12: ICC=0.927; SEM=8; CV=27%; MDC=22 (53%), accelerometry counts/day: ICC=0.883; SEM=28450; CV=17%; MDC=78860 (52%), accelerometry steps/day: ICC=0.907; SEM=726; CV=16%; MDC=2011 (45%). Variation in these estimates was seen based on disability level and walking aid. The reliability of these outcomes is good and falls within acceptable ranges. Precision and clinically important change estimates provide guidelines for interpreting these outcomes in clinical and research settings.
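The relationship between the SEM and MDC values reported above can be checked with a short sketch. It assumes the commonly used formula MDC95 = 1.96 × √2 × SEM, which the abstract does not state; the reported MDC values are close to what this formula gives, and the small differences are consistent with the SEM values having been rounded for publication.

```python
import math


def mdc95(sem):
    """Minimal detectable change at 95% confidence (assumed formula: 1.96 * sqrt(2) * SEM)."""
    return 1.96 * math.sqrt(2) * sem


# (SEM, reported MDC) pairs taken from the abstract.
reported = {
    "T25FW (s)": (1, 2.7),
    "6MW (m)": (32, 88),
    "MSWS-12": (8, 22),
    "accelerometry counts/day": (28450, 78860),
    "accelerometry steps/day": (726, 2011),
}

for measure, (sem, mdc_reported) in reported.items():
    print(f"{measure}: computed MDC = {mdc95(sem):.1f}, reported MDC = {mdc_reported}")
```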
Analysis of methods to estimate spring flows in a karst aquifer
Sepulveda, N.
2009-01-01
Hydraulically and statistically based methods were analyzed to identify the most reliable method to predict spring flows in a karst aquifer. Measured water levels at nearby observation wells, measured spring pool altitudes, and the distance between observation wells and the spring pool were the parameters used to match measured spring flows. Measured spring flows at six Upper Floridan aquifer springs in central Florida were used to assess the reliability of these methods to predict spring flows. Hydraulically based methods involved the application of the Theis, Hantush-Jacob, and Darcy-Weisbach equations, whereas the statistically based methods were the multiple linear regressions and the technology of artificial neural networks (ANNs). Root mean square errors between measured and predicted spring flows using the Darcy-Weisbach method ranged between 5% and 15% of the measured flows, lower than the 7% to 27% range for the Theis or Hantush-Jacob methods. Flows at all springs were estimated to be turbulent based on the Reynolds number derived from the Darcy-Weisbach equation for conduit flow. The multiple linear regression and the Darcy-Weisbach methods had similar spring flow prediction capabilities. The ANNs provided the lowest residuals between measured and predicted spring flows, ranging from 1.6% to 5.3% of the measured flows. The model prediction efficiency criteria also indicated that the ANNs were the most accurate method predicting spring flows in a karst aquifer. © 2008 National Ground Water Association.
Automatic Certification of Kalman Filters for Reliable Code Generation
NASA Technical Reports Server (NTRS)
Denney, Ewen; Fischer, Bernd; Schumann, Johann; Richardson, Julian
2005-01-01
AUTOFILTER is a tool for automatically deriving Kalman filter code from high-level declarative specifications of state estimation problems. It can generate code with a range of algorithmic characteristics and for several target platforms. The tool has been designed with reliability of the generated code in mind and is able to automatically certify that the code it generates is free from various error classes. Since documentation is an important part of software assurance, AUTOFILTER can also automatically generate various human-readable documents, containing both design and safety related information. We discuss how these features address software assurance standards such as DO-178B.
Modern psychometrics for assessing achievement goal orientation: a Rasch analysis.
Muis, Krista R; Winne, Philip H; Edwards, Ordene V
2009-09-01
A program of research is needed that assesses the psychometric properties of instruments designed to quantify students' achievement goal orientations to clarify inconsistencies across previous studies and to provide a stronger basis for future research. We conducted traditional psychometric and modern Rasch-model analyses of the Achievement Goals Questionnaire (AGQ, Elliot & McGregor, 2001) and the Patterns of Adaptive Learning Scale (PALS, Midgley et al., 2000) to provide an in-depth analysis of the two most popular instruments in educational psychology. For Study 1, 217 undergraduate students enrolled in educational psychology courses participated. Thirty-four were male and 181 were female (two did not respond). Participants completed the AGQ in the context of their educational psychology class. For Study 2, 126 undergraduate students enrolled in educational psychology courses participated. Thirty were male and 95 were female (one did not respond). Participants completed the PALS in the context of their educational psychology class. Traditional psychometric assessments of the AGQ and PALS replicated previous studies. For both, reliability estimates ranged from good to very good for raw subscale scores and fit for the models of goal orientations was good. Based on traditional psychometrics, the AGQ and PALS are valid and reliable indicators of achievement goals. Rasch analyses revealed that estimates of reliability for items were very good but respondent ability estimates varied from poor to good for both the AGQ and PALS. These findings indicate that items validly and reliably reflect a group's aggregate goal orientation, but using either instrument to characterize an individual's goal orientation is hazardous.
NASA Astrophysics Data System (ADS)
Nair, S. P.; Righetti, R.
2015-05-01
Recent elastography techniques focus on imaging information on properties of materials which can be modeled as viscoelastic or poroelastic. These techniques often require the fitting of temporal strain data, acquired from either a creep or stress-relaxation experiment, to a mathematical model using least square error (LSE) parameter estimation. It is known that the strain versus time relationship for tissues undergoing creep compression is non-linear. In non-linear cases, devising a measure of estimate reliability can be challenging. In this article, we have developed and tested a method to provide a measure of reliability for non-linear LSE parameter estimates, which we call Resimulation of Noise (RoN). RoN provides a measure of reliability by estimating the spread of parameter estimates from a single experiment realization. We have tested RoN specifically for the case of axial strain time constant parameter estimation in poroelastic media. Our tests show that the RoN estimated precision has a linear relationship to the actual precision of the LSE estimator. We have also compared results from the RoN derived measure of reliability against a commonly used reliability measure: the correlation coefficient (CorrCoeff). Our results show that CorrCoeff is a poor measure of estimate reliability for non-linear LSE parameter estimation. While the RoN is specifically tested only for axial strain time constant imaging, a general algorithm is provided for use in all LSE parameter estimation.
Doble, Brett; Wordsworth, Sarah; Rogers, Chris A; Welbourn, Richard; Byrne, James; Blazeby, Jane M
2017-08-01
This review aims to evaluate the current literature on the procedural costs of bariatric surgery for the treatment of severe obesity. Using a published framework for the conduct of micro-costing studies for surgical interventions, existing cost estimates from the literature are assessed for their accuracy, reliability and comprehensiveness based on their consideration of seven 'important' cost components. MEDLINE, PubMed, key journals and reference lists of included studies were searched up to January 2017. Eligible studies had to report per-case, total procedural costs for any type of bariatric surgery broken down into two or more individual cost components. A total of 998 citations were screened, of which 13 studies were included for analysis. Included studies were mainly conducted from a US hospital perspective, assessed either gastric bypass or adjustable gastric banding procedures and considered a range of different cost components. The mean total procedural costs for all included studies was US$14,389 (range, US$7423 to US$33,541). No study considered all of the recommended 'important' cost components and estimation methods were poorly reported. The accuracy, reliability and comprehensiveness of the existing cost estimates are, therefore, questionable. There is a need for a comparative cost analysis of the different approaches to bariatric surgery, with the most appropriate costing approach identified to be micro-costing methods. Such an analysis will not only be useful in estimating the relative cost-effectiveness of different surgeries but will also ensure appropriate reimbursement and budgeting by healthcare payers to ensure barriers to access this effective treatment by severely obese patients are minimised.
Zulkifly, Nuranis-Raihan; Wahab, Roswanira Abd; Layang, Elizabeth; Ismail, Dzulkiflee; Desa, Wan Nur Syuhaila Mat; Hisham, Salina; Mahat, Naji A
2018-01-01
Handprints and dismembered hands are commonly found during crime scene investigations and disaster victim identifications, respectively. It has been indicated that the accuracy of handprint and hand measurements for estimating stature may be population specific. Since Iban is the largest ethnic population in Sarawak, Malaysia and because the application of anthropometry of hand and handprint within this population as well as other populations within the Southeast Asian countries remains unreported, the present study, which investigated the reliability and accuracy of these two anthropometric aspects, acquires forensic significance. Upon measuring the height, 21 measurements were recorded on each hand and the corresponding handprint of 50 male and 52 female consenting adult Iban subjects. Using univariate statistics as well as simple and multiple regression analyses, interpretation of the measurements examined here was attempted. Results revealed that lengths of hand and handprint are the more reliable traits for estimating stature in both the male and female Iban subjects (p < 0.05) with correlation strength ranging from 0.60 to 0.76. Comparable to the established skeletal standards for hand, the stature prediction accuracy using hand and handprint measurements investigated in this research ranged between 4.29 and 5.78 cm. Hence, this research provided the first forensic standard for estimation of stature among the Iban population in Sarawak that may prove useful for crime scene investigations and disaster victim identifications in Malaysia. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Albin, Thomas J; Vink, Peter
2014-11-01
Designers and ergonomists may occasionally be limited to using tables of percentiles of anthropometric data to model users. Design models that add or subtract percentiles produce unreliable estimates of the proportion of users accommodated, in part because they assume a perfect correlation between variables. Percentile data do not allow the use of more reliable modeling methods such as Principal Component Analysis. A better method is needed. A new method for modeling with limited data is described. First, using measures of central tendency (median or mean) of the range of possible correlation values to estimate the combined variance is shown to reduce error compared to combining percentiles. Second, use of the Chebyshev inequality allows the designer to more reliably estimate the percent accommodation when the distributions of the underlying anthropometric data are unknown than does combining percentiles. This paper describes a modeling method that is more accurate than combining percentiles when only limited data are available. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.
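The two ingredients named in this abstract can be illustrated with a short Python sketch. It is not the authors' procedure: the abstract gives no formulas, so the sketch assumes the standard variance of a sum of two correlated dimensions and the standard Chebyshev bound. Taking ρ = 0 as the midpoint of the full range [−1, 1] is likewise an assumption; the paper may use a narrower range of possible correlations. All function names are hypothetical.

```python
import math


def combined_sd(sd_a: float, sd_b: float, rho: float) -> float:
    """SD of the sum of two dimensions with correlation rho.

    Uses Var(A + B) = sd_a^2 + sd_b^2 + 2 * rho * sd_a * sd_b.
    Adding percentiles implicitly corresponds to rho = 1.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return math.sqrt(sd_a ** 2 + sd_b ** 2 + 2.0 * rho * sd_a * sd_b)


def chebyshev_min_coverage(k: float) -> float:
    """Lower bound on the share of users within k SDs of the mean.

    Chebyshev: P(|X - mean| >= k*sd) <= 1/k^2 for any distribution
    with finite variance, so coverage is at least 1 - 1/k^2.
    """
    if k <= 1.0:
        return 0.0  # the bound is uninformative for k <= 1
    return 1.0 - 1.0 / (k * k)


if __name__ == "__main__":
    # Hypothetical SDs (in cm) for two body dimensions that are summed.
    print(combined_sd(3.0, 4.0, rho=1.0))  # perfect-correlation assumption
    print(combined_sd(3.0, 4.0, rho=0.0))  # midpoint-of-range assumption
    print(chebyshev_min_coverage(2.0))     # guaranteed share within 2 SDs
```

The Chebyshev bound is conservative: it holds without any distributional assumption, which is why it suits the case where only percentile tables are available.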
Ubhi, Harveen Kaur; Michie, Susan; Kotz, Daniel; van Schayck, Onno C P; Selladurai, Abiram; West, Robert
2016-09-01
The aim of this study was to assess whether or not behaviour change techniques (BCTs) as well as engagement and ease-of-use features used in smartphone applications (apps) to aid smoking cessation can be identified reliably. Apps were coded for presence of potentially effective BCTs, and engagement and ease-of-use features. Inter-rater reliability for this coding was assessed. Inter-rater agreement for identifying presence of potentially effective BCTs ranged from 66.8 to 95.1 % with 'prevalence and bias adjusted kappas' (PABAK) ranging from 0.35 to 0.90 (p < 0.001). The intra-class correlation coefficients between the two coders for scores denoting the proportions of (a) a set of engagement features and (b) a set of ease-of-use features, which were included, were 0.77 and 0.75, respectively (p < 0.001). Prevalence estimates for BCTs ranged from <10 % for medication advice to >50 % for rewarding abstinence. The average proportions of specified engagement and ease-of-use features included in the apps were 69 and 83 %, respectively. The study found that it is possible to identify potentially effective BCTs, and engagement and ease-of-use features in smoking cessation apps with fair to high inter-rater reliability.
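For readers unfamiliar with the prevalence and bias adjusted kappa, the following Python sketch shows the standard definition, PABAK = (k × Po − 1) / (k − 1), where Po is the observed proportion of agreement and k the number of rating categories. The abstract does not give the formula, so treating the coding as a two-category (present/absent) decision is an assumption, and the function name is illustrative.

```python
def pabak(observed_agreement: float, n_categories: int = 2) -> float:
    """Prevalence and bias adjusted kappa.

    PABAK = (k * Po - 1) / (k - 1); for two categories this reduces
    to 2 * Po - 1.
    """
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if not 0.0 <= observed_agreement <= 1.0:
        raise ValueError("observed_agreement must lie in [0, 1]")
    return (n_categories * observed_agreement - 1.0) / (n_categories - 1.0)


if __name__ == "__main__":
    # 95.1% agreement is the upper end of the range in the abstract.
    print(round(pabak(0.951), 3))
```

Under the two-category assumption, 95.1% agreement corresponds to a PABAK of about 0.90, in line with the upper end of the reported range.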
Calculating system reliability with SRFYDO
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morzinski, Jerome; Anderson - Cook, Christine M; Klamann, Richard M
2010-01-01
SRFYDO is a process for estimating reliability of complex systems. Using information from all applicable sources, including full-system (flight) data, component test data, and expert (engineering) judgment, SRFYDO produces reliability estimates and predictions. It is appropriate for series systems with possibly several versions of the system which share some common components. It models reliability as a function of age and up to 2 other lifecycle (usage) covariates. Initial output from its Exploratory Data Analysis mode consists of plots and numerical summaries so that the user can check data entry and model assumptions, and help determine a final form for the system model. The System Reliability mode runs a complete reliability calculation using Bayesian methodology. This mode produces results that estimate reliability at the component, sub-system, and system level. The results include estimates of uncertainty, and can predict reliability at some not-too-distant time in the future. This paper presents an overview of the underlying statistical model for the analysis, discusses model assumptions, and demonstrates usage of SRFYDO.
NASA Astrophysics Data System (ADS)
Yu, Z. P.; Yue, Z. F.; Liu, W.
2018-05-01
With the development of artificial intelligence, more and more reliability experts have noticed the role of subjective information in the reliability design of complex systems. Therefore, based on a certain amount of experimental data and expert judgments, we have divided reliability estimation based on a distribution hypothesis into a cognition process and a reliability calculation. To illustrate this modification, we have taken information fusion based on intuitional fuzzy belief functions as the diagnosis model of the cognition process, and completed the reliability estimation for the opening function of a cabin door affected by the imprecise judgment corresponding to the distribution hypothesis.
Are Validity and Reliability "Relevant" in Qualitative Evaluation Research?
ERIC Educational Resources Information Center
Goodwin, Laura D.; Goodwin, William L.
1984-01-01
The views of prominent qualitative methodologists on the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations are summarized. A case is made for the relevance of validity and reliability estimation. Definitions of validity and reliability for qualitative measurement are presented…
A Group Contribution Method for Estimating Cetane and Octane Numbers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kubic, William Louis
Much of the research on advanced biofuels is devoted to the study of novel chemical pathways for converting nonfood biomass into liquid fuels that can be blended with existing transportation fuels. Many compounds under consideration are not found in the existing fuel supplies. Often, the physical properties needed to assess the viability of a potential biofuel are not available. The only reliable information available may be the molecular structure. Group contribution methods for estimating physical properties from molecular structure have been used for more than 60 years. The most common application is estimation of thermodynamic properties. More recently, group contribution methods have been developed for estimating rate dependent properties including cetane and octane numbers. Often, published group contribution methods are limited in terms of types of functional groups and range of applicability. In this study, a new, broadly-applicable group contribution method based on an artificial neural network was developed to estimate cetane number, research octane number, and motor octane number of hydrocarbons and oxygenated hydrocarbons. The new method is more accurate over a greater range of molecular weights and structural complexity than existing group contribution methods for estimating cetane and octane numbers.
A General Approach for Estimating Scale Score Reliability for Panel Survey Data
ERIC Educational Resources Information Center
Biemer, Paul P.; Christ, Sharon L.; Wiesen, Christopher A.
2009-01-01
Scale score measures are ubiquitous in the psychological literature and can be used as both dependent and independent variables in data analysis. Poor reliability of scale score measures leads to inflated standard errors and/or biased estimates, particularly in multivariate analysis. Reliability estimation is usually an integral step to assess…
ERIC Educational Resources Information Center
Md Desa, Zairul Nor Deana
2012-01-01
In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores, to obtain subscores reliability, and subscores classification. Both the compensatory and partially compensatory MIRT…
Reliability and precision of pellet-group counts for estimating landscape-level deer density
David S. deCalesta
2013-01-01
This study provides hitherto unavailable methodology for reliably and precisely estimating deer density within forested landscapes, enabling quantitative rather than qualitative deer management. Reliability and precision of the deer pellet-group technique were evaluated in 1 small and 2 large forested landscapes. Density estimates, adjusted to reflect deer harvest and...
Method matters: Understanding diagnostic reliability in DSM-IV and DSM-5.
Chmielewski, Michael; Clark, Lee Anna; Bagby, R Michael; Watson, David
2015-08-01
Diagnostic reliability is essential for the science and practice of psychology, in part because reliability is necessary for validity. Recently, the DSM-5 field trials documented lower diagnostic reliability than past field trials and the general research literature, resulting in substantial criticism of the DSM-5 diagnostic criteria. Rather than indicating specific problems with DSM-5, however, the field trials may have revealed long-standing diagnostic issues that have been hidden due to a reliance on audio/video recordings for estimating reliability. We estimated the reliability of DSM-IV diagnoses using both the standard audio-recording method and the test-retest method used in the DSM-5 field trials, in which different clinicians conduct separate interviews. Psychiatric patients (N = 339) were diagnosed using the SCID-I/P; 218 were diagnosed a second time by an independent interviewer. Diagnostic reliability using the audio-recording method (N = 49) was "good" to "excellent" (M κ = .80) and comparable to the DSM-IV field trials estimates. Reliability using the test-retest method (N = 218) was "poor" to "fair" (M κ = .47) and similar to DSM-5 field-trials' estimates. Despite low test-retest diagnostic reliability, self-reported symptoms were highly stable. Moreover, there was no association between change in self-report and change in diagnostic status. These results demonstrate the influence of method on estimates of diagnostic reliability. (c) 2015 APA, all rights reserved).
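The κ values quoted above are chance-corrected agreement coefficients. As a reminder of what such a number measures, the following Python sketch computes Cohen's kappa for two raters from the standard definition κ = (Po − Pe) / (1 − Pe). The abstract does not specify which kappa variant was used, so this is only an illustration, and the diagnoses in the usage example are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    Po is the observed proportion of agreement; Pe is the agreement
    expected by chance from each rater's marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical present (1) / absent (0) diagnoses for four patients.
    print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this toy example the raters agree on three of four cases (75%), but half of that agreement is expected by chance, so kappa is considerably lower than the raw agreement.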
Combining facial dynamics with appearance for age estimation.
Dibeklioglu, Hamdi; Alnajar, Fares; Ali Salah, Albert; Gevers, Theo
2015-06-01
Estimating the age of a human from the captured images of his/her face is a challenging problem. In general, the existing approaches to this problem use appearance features only. In this paper, we show that in addition to appearance information, facial dynamics can be leveraged in age estimation. We propose a method to extract and use dynamic features for age estimation, using a person's smile. Our approach is tested on a large, gender-balanced database with 400 subjects, with an age range between 8 and 76. In addition, we introduce a new database on posed disgust expressions with 324 subjects in the same age range, and evaluate the reliability of the proposed approach when used with another expression. State-of-the-art appearance-based age estimation methods from the literature are implemented as baseline. We demonstrate that for each of these methods, the addition of the proposed dynamic features results in statistically significant improvement. We further propose a novel hierarchical age estimation architecture based on adaptive age grouping. We test our approach extensively, including an exploration of spontaneous versus posed smile dynamics, and gender-specific age estimation. We show that using spontaneity information reduces the mean absolute error by up to 21%, advancing the state of the art for facial age estimation.
Terry, Leann; Kelley, Ken
2012-11-01
Composite measures play an important role in psychology and related disciplines. Composite measures almost always have error. Correspondingly, it is important to understand the reliability of the scores from any particular composite measure. However, the point estimates of the reliability of composite measures are fallible and thus all such point estimates should be accompanied by a confidence interval. When confidence intervals are wide, there is much uncertainty in the population value of the reliability coefficient. Given the importance of reporting confidence intervals for estimates of reliability, coupled with the undesirability of wide confidence intervals, we develop methods that allow researchers to plan sample size in order to obtain narrow confidence intervals for population reliability coefficients. We first discuss composite reliability coefficients and then provide a discussion on confidence interval formation for the corresponding population value. Using the accuracy in parameter estimation approach, we develop two methods to obtain accurate estimates of reliability by planning sample size. The first method provides a way to plan sample size so that the expected confidence interval width for the population reliability coefficient is sufficiently narrow. The second method ensures that the confidence interval width will be sufficiently narrow with some desired degree of assurance (e.g., 99% assurance that the 95% confidence interval for the population reliability coefficient will be less than W units wide). The effectiveness of our methods was verified with Monte Carlo simulation studies. We demonstrate how to easily implement the methods with easy-to-use and freely available software. ©2011 The British Psychological Society.
The Revised School Culture Elements Questionnaire: Gender and Grade Level Invariant?
ERIC Educational Resources Information Center
DeVaney, Thomas A.; Adams, Nan B.; Hill-Winstead, Flo; Trahan, Mitzi P.
2012-01-01
The purpose of this research was to examine the psychometric properties of the RSCEQ with respect to invariance across gender and grade level, using a sample of 901 teachers from 44 schools in southeast Louisiana. Reliability estimates were consistent with previous research and ranged from 0.81 to 0.90 on the actual and 0.83 to 0.92 on the…
A Measure for the Reliability of a Rating Scale Based on Longitudinal Clinical Trial Data
ERIC Educational Resources Information Center
Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert
2007-01-01
A new measure for reliability of a rating scale is introduced, based on the classical definition of reliability, as the ratio of the true score variance and the total variance. Clinical trial data can be employed to estimate the reliability of the scale in use, whenever repeated measurements are taken. The reliability is estimated from the…
Paoli, Carly J.; Hays, Ron D.; Taylor-Stokes, Gavin; Piercy, James; Gitlin, Matthew
2014-01-01
Background and objectives The US Centers for Medicare and Medicaid Services (CMS) End Stage Renal Disease Prospective Payment System and Quality Incentive Program requires that dialysis centers meet predefined criteria for quality of patient care to ensure future funding. The CMS selected the Consumer Assessment of Healthcare Providers and Systems In-Center Hemodialysis (CAHPS-ICH) survey for the assessment of patient experience of care. This analysis evaluated the psychometric properties of the CAHPS-ICH survey in a sample of hemodialysis patients. Design, setting, participants, & measurements Data were drawn from the Adelphi CKD Disease Specific Program (a retrospective, cross-sectional survey of nephrologists and patients). Selected United States–based nephrologists treating patients receiving hemodialysis completed patient record forms and provided information on their dialysis center. Patients (n=404) completed the CAHPS-ICH survey (comprising 58 questions) providing six scores for the assessment of patient experience of care. CAHPS-ICH item-scale convergence, discrimination, and reliability were evaluated for multi-item scales. Floor and ceiling effects were estimated for all six scores. Patient (demographics, dialysis history, vascular access method) and facility characteristics (size, ratio of patients-to-physicians, nurses, and technicians) associated with the CAHPS-ICH scores were also evaluated. Results Item-scale correlations and internal consistency reliability estimates provided support for the nephrologists’ communication (range, 0.16–0.71; α=0.81) and quality of care (range, 0.16–0.76; α=0.90) composites. However, the patient information composite had low internal consistency reliability (α=0.55). 
Provider-to-patient ratios (range, 2.37 for facilities with >36 patients per physician to 2.8 for those with <8 patients per physician) and time spent in the waiting room (3.44 for >15 minutes of waiting time to 3.75 for 5 to <10 minutes) were characteristics most consistently related to patients’ perceptions of dialysis care. Conclusions CAHPS-ICH is a potentially valuable and informative tool for the evaluation of patients’ experiences with dialysis care. Additional studies are needed to estimate clinically meaningful differences between care providers. PMID:24832092
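The α values reported for the composites are internal consistency reliability estimates. Assuming they are Cronbach's alpha (the abstract does not name the coefficient), the following Python sketch shows the standard computation, α = k/(k − 1) × (1 − Σ item variances / variance of total score). The function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a multi-item scale.

    `scores` is a list of respondents, each a list of k item scores.
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each respondent needs the same k >= 2 items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical responses of three patients to a two-item composite.
    print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```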
Back to the future: estimating pre-injury brain volume in patients with traumatic brain injury.
Ross, David E; Ochs, Alfred L; D Zannoni, Megan; Seabaugh, Jan M
2014-11-15
A recent meta-analysis by Hedman et al. allows for accurate estimation of brain volume changes throughout the life span. Additionally, Tate et al. showed that intracranial volume at a later point in life can be used to estimate reliably brain volume at an earlier point in life. These advancements were combined to create a model which allowed the estimation of brain volume just prior to injury in a group of patients with mild or moderate traumatic brain injury (TBI). This volume estimation model was used in combination with actual measurements of brain volume to test hypotheses about progressive brain volume changes in the patients. Twenty six patients with mild or moderate TBI were compared to 20 normal control subjects. NeuroQuant® was used to measure brain MRI volume. Brain volume after the injury (from MRI scans performed at t1 and t2) was compared to brain volume just before the injury (volume estimation at t0) using longitudinal designs. Groups were compared with respect to volume changes in whole brain parenchyma (WBP) and its 3 major subdivisions: cortical gray matter (GM), cerebral white matter (CWM) and subcortical nuclei+infratentorial regions (SCN+IFT). Using the normal control data, the volume estimation model was tested by comparing measured brain volume to estimated brain volume; reliability ranged from good to excellent. During the initial phase after injury (t0-t1), the TBI patients had abnormally rapid atrophy of WBP and CWM, and abnormally rapid enlargement of SCN+IFT. Rates of volume change during t0-t1 correlated with cross-sectional measures of volume change at t1, supporting the internal reliability of the volume estimation model. A logistic regression analysis using the volume change data produced a function which perfectly predicted group membership (TBI patients vs. normal control subjects). During the first few months after injury, patients with mild or moderate TBI have rapid atrophy of WBP and CWM, and rapid enlargement of SCN+IFT. 
The magnitude and pattern of the changes in volume may allow for the eventual development of diagnostic tools based on the volume estimation approach. Copyright © 2014 Elsevier Inc. All rights reserved.
Hukkerikar, Amol Shivajirao; Kalakul, Sawitree; Sarup, Bent; Young, Douglas M; Sin, Gürkan; Gani, Rafiqul
2012-11-26
The aim of this work is to develop group-contribution(+) (GC(+)) method (combined group-contribution (GC) method and atom connectivity index (CI) method) based property models to provide reliable estimations of environment-related properties of organic chemicals together with uncertainties of estimated property values. For this purpose, a systematic methodology for property modeling and uncertainty analysis is used. The methodology includes a parameter estimation step to determine parameters of property models and an uncertainty analysis step to establish statistical information about the quality of parameter estimation, such as the parameter covariance, the standard errors in predicted properties, and the confidence intervals. For parameter estimation, large data sets of experimentally measured property values of a wide range of chemicals (hydrocarbons, oxygenated chemicals, nitrogenated chemicals, polyfunctional chemicals, etc.) taken from the database of the US Environmental Protection Agency (EPA) and from the database of USEtox are used. For property modeling and uncertainty analysis, the Marrero and Gani GC method and atom connectivity index method have been considered. In total, 22 environment-related properties, which include the fathead minnow 96-h LC(50), Daphnia magna 48-h LC(50), oral rat LD(50), aqueous solubility, bioconcentration factor, permissible exposure limit (OSHA-TWA), photochemical oxidation potential, global warming potential, ozone depletion potential, acidification potential, emission to urban air (carcinogenic and noncarcinogenic), emission to continental rural air (carcinogenic and noncarcinogenic), emission to continental fresh water (carcinogenic and noncarcinogenic), emission to continental seawater (carcinogenic and noncarcinogenic), emission to continental natural soil (carcinogenic and noncarcinogenic), and emission to continental agricultural soil (carcinogenic and noncarcinogenic) have been modeled and analyzed. 
The application of the developed property models for the estimation of environment-related properties and uncertainties of the estimated property values is highlighted through an illustrative example. The developed property models provide reliable estimates of environment-related properties needed to perform process synthesis, design, and analysis of sustainable chemical processes and allow one to evaluate the effect of uncertainties of estimated property values on the calculated performance of processes giving useful insights into quality and reliability of the design of sustainable processes.
Survival of European mouflon (Artiodactyla: Bovidae) in Hawai'i based on tooth cementum lines
Hess, S.C.; Stephens, R.M.; Thompson, T.L.; Danner, R.M.; Kawakami, B.
2011-01-01
Reliable techniques for estimating age of ungulates are necessary to determine population parameters such as age structure and survival. Techniques that rely on dentition, horn, and facial patterns have limited utility for European mouflon sheep (Ovis gmelini musimon), but tooth cementum lines may offer a useful alternative. Cementum lines may not be reliable outside temperate regions, however, because lack of seasonality in diet may affect annulus formation. We evaluated the utility of tooth cementum lines for estimating age of mouflon in Hawai'i in comparison to dentition. Cementum lines were present in mouflon from Mauna Loa, island of Hawai'i, but were less distinct than in North American sheep. The two age-estimation methods provided similar estimates for individuals aged ≤3 yr by dentition (the maximum age estimable by dentition), with exact matches in 51% (18/35) of individuals, and an average difference of 0.8 yr (range 0–4). Estimates of age from cementum lines were higher than those from dentition in 40% (14/35) and lower in 9% (3/35) of individuals. Discrepancies in age estimates between techniques and between paired tooth samples estimated by cementum lines were related to certainty categories assigned by the clarity of cementum lines, reinforcing the importance of collecting a sufficient number of samples to compensate for samples of lower quality, which in our experience, comprised approximately 22% of teeth. Cementum lines appear to provide relatively accurate age estimates for mouflon in Hawai'i, allow estimating age beyond 3 yr, and they offer more precise estimates than tooth eruption patterns. After constructing an age distribution, we estimated annual survival with a log-linear model to be 0.596 (95% CI 0.554–0.642) for this heavily controlled population. © 2011 by University of Hawai'i Press.
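One common way to obtain annual survival from an age distribution with a log-linear model is a catch-curve regression: regress the natural log of the number of animals in each age class on age, and take the exponential of the slope. The abstract does not say which log-linear formulation the authors used, so the sketch below is only an illustration under that assumption (constant survival, stable age structure). The function name and the counts are hypothetical and are not data from the study.

```python
import math

def annual_survival_from_age_counts(counts):
    """Estimate constant annual survival from counts at ages 0, 1, 2, ...

    Fits ln(count) = a + b * age by ordinary least squares and returns
    exp(b). Assumes constant survival and a stable age structure.
    """
    ages = list(range(len(counts)))
    logs = [math.log(c) for c in counts]  # every count must be > 0
    age_mean = sum(ages) / len(ages)
    log_mean = sum(logs) / len(logs)
    sxy = sum((a - age_mean) * (y - log_mean) for a, y in zip(ages, logs))
    sxx = sum((a - age_mean) ** 2 for a in ages)
    return math.exp(sxy / sxx)

# Hypothetical counts (each age class is 60% of the previous one).
example_counts = [1000, 600, 360, 216]
survival = annual_survival_from_age_counts(example_counts)
```

With real data the counts would come from the constructed age distribution, and a confidence interval would follow from the standard error of the fitted slope.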
How reliable an indicator of inflammation is myeloperoxidase activity?
Faith, Minnie; Sukumaran, Abitha; Pulimood, Anna B; Jacob, Molly
2008-10-01
Myeloperoxidase (MPO) and interleukin-6 (IL-6) are often used as markers of inflammation. The aim of this study was to ascertain whether MPO activity is as reliable as IL-6 as an indicator of inflammation. Inflammation was induced in mice, using either turpentine or indomethacin. Duodenal tissue was removed from these animals at various time periods ranging from 6 h to 7 days later. Concentrations of IL-6 and MPO activity were estimated in the tissue. Histopathological examination was also carried out at some of the time periods to determine the presence of neutrophil infiltration in turpentine-treated mice. Concentrations of IL-6 and MPO activity were significantly higher in tissue that had been treated with the agents used, at all the time periods studied, when compared with corresponding control tissue. Fold-increases in MPO activity were higher than fold-increases in IL-6. Concentrations of the 2 parameters showed significant positive correlation. Histopathological examination did not show significantly higher numbers of neutrophils infiltrating the tissue in response to turpentine, at the time periods studied. Estimation of MPO activity is a reliable indicator of inflammation, being more sensitive than histopathological examination of tissue and as good as measurement of IL-6 concentrations.
Mohd Din, F H; Hoe, Victor C W; Chan, C K; Muslan, M A
2015-05-01
The Pain Catastrophizing Scale (PCS) is designed to assess negative thoughts in response to pain. It is composed of three domains: helplessness, rumination, and magnification. We report on the translation, adaptation, and validation of scores on a Malay-speaking version of the PCS, the PCS-MY. Guidelines for the process of cross-cultural adaptations of assessment measures were implemented. A sample of 303 young military recruits participated in the study. Factor structure, reliability, and validity of scores on the PCS-MY were examined. Convergent validity was investigated with the Positive and Negative Affect Scale, Short-form 12 version 2, and Ryff's Psychological Well-being Scale. Most participants were men, ranging in age from 19 to 26. The reliability of the PCS-MY scores was adequate (α = 0.90; mean inter-item correlation = 0.43). Confirmatory factor analysis showed that a modified version of the PCS-MY provided best fit estimates to the sample data. The PCS-MY total score was negatively correlated with mental well-being and positively correlated with negative affect (all ps < 0.001). The PCS-MY was demonstrated to have adequate reliability and validity estimates in the study sample.
Bottema-Beutel, Kristen; Lloyd, Blair; Carter, Erik W; Asmus, Jennifer M
2014-11-01
Attaining reliable estimates of observational measures can be challenging in school and classroom settings, as behavior can be influenced by multiple contextual factors. Generalizability (G) studies can enable researchers to estimate the reliability of observational data, and decision (D) studies can inform how many observation sessions are necessary to achieve a criterion level of reliability. We conducted G and D studies using observational data from a randomized control trial focusing on social and academic participation of students with severe disabilities in inclusive secondary classrooms. Results highlight the importance of anchoring observational decisions to reliability estimates from existing or pilot data sets. We outline steps for conducting G and D studies and address options when reliability estimates are lower than desired.
Gilliom, Robert J.; Helsel, Dennis R.
1986-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
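The log-probability regression idea described above can be sketched in a few lines: regress the logarithms of the uncensored concentrations on their normal scores, then use the fitted line to fill in the censored observations before computing summary statistics. The sketch assumes a single detection limit, so that all censored values occupy the lowest ranks, and it uses Blom plotting positions to obtain the z scores. Both are assumptions made here for illustration; the abstract does not specify them. The function name and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def log_probability_regression(uncensored, n_censored):
    """Mean and standard deviation of a left-censored sample.

    uncensored: concentrations above the detection limit (at least two).
    n_censored: number of observations reported below the limit.
    """
    n = len(uncensored) + n_censored
    # Blom plotting positions (an assumption); censored values take ranks 1..n_censored.
    z = [NormalDist().inv_cdf((rank - 0.375) / (n + 0.25)) for rank in range(1, n + 1)]
    z_cens, z_obs = z[:n_censored], z[n_censored:]
    observed = sorted(uncensored)
    logs = [math.log(x) for x in observed]
    # Least squares regression of log concentration on z score.
    z_mean, log_mean = mean(z_obs), mean(logs)
    slope = (sum((a - z_mean) * (b - log_mean) for a, b in zip(z_obs, logs))
             / sum((a - z_mean) ** 2 for a in z_obs))
    intercept = log_mean - slope * z_mean
    # Censored observations are placed on the fitted lognormal line.
    filled = [math.exp(intercept + slope * zi) for zi in z_cens]
    combined = filled + observed
    return mean(combined), stdev(combined)

# Hypothetical sample: 5 detected concentrations and 3 non-detects.
detected = [1.2, 1.9, 2.5, 4.0, 7.3]
est_mean, est_sd = log_probability_regression(detected, 3)
```

The median and interquartile range can be taken from the same combined list; the abstract reports that a lognormal maximum likelihood method performed better for those two statistics.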
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1986-02-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
Estimating monotonic rates from biological data using local linear regression.
Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R
2017-03-01
Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
NASA Astrophysics Data System (ADS)
Xu, Shaoping; Zeng, Xiaoxia; Jiang, Yinnan; Tang, Yiling
2018-01-01
We proposed a noniterative principal component analysis (PCA)-based noise level estimation (NLE) algorithm that addresses the problem of estimating the noise level with a two-step scheme. First, we randomly extracted a number of raw patches from a given noisy image and took the smallest eigenvalue of the covariance matrix of the raw patches as the preliminary estimation of the noise level. Next, the final estimation was directly obtained with a nonlinear mapping (rectification) function that was trained on some representative noisy images corrupted with different known noise levels. Compared with the state-of-the-art NLE algorithms, the experimental results show that the proposed NLE algorithm can reliably infer the noise level and has robust performance over a wide range of image contents and noise levels, showing a good compromise between speed and accuracy in general.
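The first step of the scheme (smallest eigenvalue of the patch covariance matrix) can be illustrated with a deliberately tiny toy. The sketch below uses 1x2 pixel patches taken on a regular grid, so the covariance matrix is 2x2 and its smaller eigenvalue has a closed form. The patch size, the grid sampling, the synthetic image and the function name are all simplifying assumptions made here, not the authors' settings, and the second step (the trained rectification function) is not reproduced. Under additive i.i.d. Gaussian noise, the square root of the eigenvalue is comparable to the noise standard deviation.

```python
import math
import random

def smallest_patch_eigenvalue(image):
    """Smallest eigenvalue of the 2x2 covariance matrix of 1x2 pixel patches."""
    # Non-overlapping horizontal pixel pairs serve as the patches.
    patches = [(row[j], row[j + 1]) for row in image for j in range(0, len(row) - 1, 2)]
    n = len(patches)
    m0 = sum(p[0] for p in patches) / n
    m1 = sum(p[1] for p in patches) / n
    a = sum((p[0] - m0) ** 2 for p in patches) / (n - 1)
    d = sum((p[1] - m1) ** 2 for p in patches) / (n - 1)
    b = sum((p[0] - m0) * (p[1] - m1) for p in patches) / (n - 1)
    # Closed-form smaller eigenvalue of the symmetric matrix [[a, b], [b, d]].
    return (a + d) / 2 - math.sqrt(((a - d) / 2) ** 2 + b * b)

# Synthetic test image: each row is constant, plus Gaussian noise of known level.
rng = random.Random(0)
true_sigma = 5.0
image = [[10.0 * r + rng.gauss(0.0, true_sigma) for _ in range(64)] for r in range(64)]
preliminary_sigma = math.sqrt(smallest_patch_eigenvalue(image))
```

On natural images the patches are larger and the smallest eigenvalue is biased by image texture, which is the motivation for the rectification step described in the abstract.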
Ratnayake, M; Obertová, Z; Dose, M; Gabriel, P; Bröker, H M; Brauckmann, M; Barkus, A; Rizgeliene, R; Tutkuviene, J; Ritz-Timme, S; Marasciuolo, L; Gibelli, D; Cattaneo, C
2014-09-01
In cases of suspected child pornography, the age of the victim represents a crucial factor for legal prosecution. The conventional methods for age estimation provide unreliable age estimates, particularly if teenage victims are concerned. In this pilot study, the potential of age estimation for screening purposes is explored for juvenile faces. In addition to a visual approach, an automated procedure is introduced, which has the ability to rapidly scan through large numbers of suspicious image data in order to trace juvenile faces. Age estimations were performed by experts, non-experts and the Demonstrator of a developed software on frontal facial images of 50 females aged 10-19 years from Germany, Italy, and Lithuania. To test the accuracy, the mean absolute error (MAE) between the estimates and the real ages was calculated for each examiner and the Demonstrator. The Demonstrator achieved the lowest MAE (1.47 years) for the 50 test images. Decreased image quality had no significant impact on the performance and classification results. The experts delivered slightly less accurate MAE (1.63 years). Throughout the tested age range, both the manual and the automated approach led to reliable age estimates within the limits of natural biological variability. The visual analysis of the face produces reasonably accurate age estimates up to the age of 18 years, which is the legally relevant age threshold for victims in cases of pedo-pornography. This approach can be applied in conjunction with the conventional methods for a preliminary age estimation of juveniles depicted on images.
Postmortem time estimation using body temperature and a finite-element computer model.
den Hartog, Emiel A; Lotens, Wouter A
2004-09-01
In the Netherlands most murder victims are found 2-24 h after the crime. During this period, body temperature decrease is the most reliable method to estimate the postmortem time (PMT). Recently, two murder cases were analysed in which currently available methods did not provide a sufficiently reliable estimate of the PMT. In both cases a study was performed to verify the statements of suspects. For this purpose a finite-element computer model was developed that simulates a human torso and its clothing. With this model, changes to the body and the environment can also be modelled; this was very relevant in one of the cases, as the body had been in the presence of a small fire. In both cases it was possible to falsify the statements of the suspects by improving the accuracy of the PMT estimate. The estimated PMT in both cases was within the range of Henssge's model. The standard deviation of the PMT estimate was 35 min in the first case and 45 min in the second case, compared to 168 min (2.8 h) in Henssge's model. In conclusion, the model as presented here can have additional value for improving the accuracy of the PMT estimate. In contrast to the simple model of Henssge, the current model allows for increased accuracy when more detailed information is available. Moreover, the sensitivity of the predicted PMT for uncertainty in the circumstances can be studied, which is crucial to the confidence of the judge in the results.
ERIC Educational Resources Information Center
Green, Samuel B.; Yang, Yanyun
2009-01-01
A method is presented for estimating reliability using structural equation modeling (SEM) that allows for nonlinearity between factors and item scores. Assuming the focus is on consistency of summed item scores, this method for estimating reliability is preferred to those based on linear SEM models and to the most commonly reported estimate of…
A study of fault prediction and reliability assessment in the SEL environment
NASA Technical Reports Server (NTRS)
Basili, Victor R.; Patnaik, Debabrata
1986-01-01
An empirical study on estimation and prediction of faults, prediction of fault detection and correction effort, and reliability assessment in the Software Engineering Laboratory environment (SEL) is presented. Fault estimation using empirical relationships and fault prediction using curve fitting method are investigated. Relationships between debugging efforts (fault detection and correction effort) in different test phases are provided, in order to make an early estimate of future debugging effort. This study concludes with the fault analysis, application of a reliability model, and analysis of a normalized metric for reliability assessment and reliability monitoring during development of software.
Shahian, David M; He, Xia; Jacobs, Jeffrey P; Kurlansky, Paul A; Badhwar, Vinay; Cleveland, Joseph C; Fazzalari, Frank L; Filardo, Giovanni; Normand, Sharon-Lise T; Furnary, Anthony P; Magee, Mitchell J; Rankin, J Scott; Welke, Karl F; Han, Jane; O'Brien, Sean M
2015-10-01
Previous composite performance measures of The Society of Thoracic Surgeons (STS) were estimated at the STS participant level, typically a hospital or group practice. The STS Quality Measurement Task Force has now developed a multiprocedural, multidimensional composite measure suitable for estimating the performance of individual surgeons. The development sample from the STS National Database included 621,489 isolated coronary artery bypass grafting procedures, isolated aortic valve replacement, aortic valve replacement plus coronary artery bypass grafting, mitral, or mitral plus coronary artery bypass grafting procedures performed by 2,286 surgeons between July 1, 2011, and June 30, 2014. Each surgeon's composite score combined their aggregate risk-adjusted mortality and major morbidity rates (each weighted inversely by their standard deviations) and reflected the proportion of case types they performed. Model parameters were estimated in a Bayesian framework. Composite star ratings were examined using 90%, 95%, or 98% Bayesian credible intervals. Measure reliability was estimated using various 3-year case thresholds. The final composite measure was defined as 0.81 × (1 minus risk-standardized mortality rate) + 0.19 × (1 minus risk-standardized complication rate). Risk-adjusted mortality (median, 2.3%; interquartile range, 1.7% to 3.0%), morbidity (median, 13.7%; interquartile range, 10.8% to 17.1%), and composite scores (median, 95.4%; interquartile range, 94.4% to 96.3%) varied substantially across surgeons. Using 98% Bayesian credible intervals, there were 207 1-star (lower performance) surgeons (9.1%), 1,701 2-star (as-expected performance) surgeons (74.4%), and 378 3-star (higher performance) surgeons (16.5%). With an eligibility threshold of 100 cases over 3 years, measure reliability was 0.81. The STS has developed a multiprocedural composite measure suitable for evaluating performance at the individual surgeon level. 
Copyright © 2015 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
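The final composite definition quoted in the abstract is a simple weighted sum, which can be written out directly. The sketch below plugs in the reported median risk-adjusted mortality (2.3%) and morbidity (13.7%) purely as illustrative inputs; treating those medians as a surgeon's risk-standardized rates is an assumption made here, and the function name is hypothetical.

```python
def sts_composite(mortality_rate, complication_rate):
    """Composite = 0.81 * (1 - mortality) + 0.19 * (1 - complication)."""
    return 0.81 * (1.0 - mortality_rate) + 0.19 * (1.0 - complication_rate)

# Illustrative inputs taken from the reported medians (rates as fractions).
score = sts_composite(0.023, 0.137)
```

The result is close to 0.955, in line with the reported median composite score of 95.4%, although the composite of two medians need not equal the median of the composites.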
Age estimation by amino acid racemization in human teeth.
Ohtani, Susumu; Yamamoto, Toshiharu
2010-11-01
When an unidentified body is found, it is essential to establish the personal identity of the body in addition to investigating the cause of death. Identification is one of the most important functions of forensic dentistry. Fingerprint, dental, and DNA analysis can be used to accurately identify a body. However, if no information is available for identification, age estimation can contribute to the resolution of a case. The authors have been using aspartic acid racemization rates in dentin (D-aspartic acid/L-aspartic acid: D/L Asp) as an index for age estimation and have obtained satisfactory results. We report five cases of age estimation using the racemization method. In all five cases, estimated ages were accurate within a range ±3 years. We conclude that the racemization method is a reliable and practical method for estimating age. © 2010 American Academy of Forensic Sciences.
Tracking reliability for space cabin-borne equipment in development by Crow model.
Chen, J D; Jiao, S J; Sun, H L
2001-12-01
Objective. To study and track the reliability growth of manned spaceflight cabin-borne equipment in the course of its development. Method. A new technique of reliability growth estimation and prediction, which is composed of the Crow model and test data conversion (TDC) method was used. Result. The estimation and prediction value of the reliability growth conformed to its expectations. Conclusion. The method could dynamically estimate and predict the reliability of the equipment by making full use of various test information in the course of its development. It offered not only a possibility of tracking the equipment reliability growth, but also the reference for quality control in manned spaceflight cabin-borne equipment design and development process.
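For readers unfamiliar with the Crow (Crow-AMSAA) model, the sketch below shows the standard maximum likelihood estimates for a time-truncated test, where the expected number of failures by time t is lambda * t**beta. It covers only the Crow model itself; the test data conversion (TDC) step mentioned in the abstract is not described there and is not reproduced. The function name and the failure times are hypothetical.

```python
import math

def crow_amsaa_mle(failure_times, total_time):
    """Time-truncated MLE for the Crow-AMSAA power-law model.

    Returns (beta, lam, mtbf_now), where mtbf_now is the instantaneous
    mean time between failures at total_time. beta < 1 indicates growth.
    """
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    mtbf_now = 1.0 / (lam * beta * total_time ** (beta - 1.0))
    return beta, lam, mtbf_now

# Hypothetical cumulative failure times (hours) in a 200-hour development test.
times = [5.0, 20.0, 60.0, 150.0]
beta, lam, mtbf_now = crow_amsaa_mle(times, 200.0)
```

Because the gaps between failures lengthen in this example, the fitted beta is below 1, which is the signature of reliability growth in this model.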
A Bayesian Account of Visual-Vestibular Interactions in the Rod-and-Frame Task.
Alberts, Bart B G T; de Brouwer, Anouk J; Selen, Luc P J; Medendorp, W Pieter
2016-01-01
Panoramic visual cues, as generated by the objects in the environment, provide the brain with important information about gravity direction. To derive an optimal, i.e., Bayesian, estimate of gravity direction, the brain must combine panoramic information with gravity information detected by the vestibular system. Here, we examined the individual sensory contributions to this estimate psychometrically. We asked human subjects to judge the orientation (clockwise or counterclockwise relative to gravity) of a briefly flashed luminous rod, presented within an oriented square frame (rod-in-frame). Vestibular contributions were manipulated by tilting the subject's head, whereas visual contributions were manipulated by changing the viewing distance of the rod and frame. Results show a cyclical modulation of the frame-induced bias in perceived verticality across a 90° range of frame orientations. The magnitude of this bias decreased significantly with larger viewing distance, as if visual reliability was reduced. Biases increased significantly when the head was tilted, as if vestibular reliability was reduced. A Bayesian optimal integration model, with distinct vertical and horizontal panoramic weights, a gain factor to allow for visual reliability changes, and ocular counterroll in response to head tilt, provided a good fit to the data. We conclude that subjects flexibly weigh visual panoramic and vestibular information based on their orientation-dependent reliability, resulting in the observed verticality biases and the associated response variabilities.
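The core of reliability-based weighting can be shown with the textbook two-cue case, in which each cue is weighted by its inverse variance. This is a simplified sketch, not the authors' full model, which additionally includes distinct vertical and horizontal panoramic weights, a gain factor and ocular counterroll. The function name, parameter names and numbers are hypothetical.

```python
def combine_cues(visual_estimate, visual_sd, vestibular_estimate, vestibular_sd):
    """Inverse-variance weighted combination of two independent Gaussian cues."""
    w_vis = 1.0 / visual_sd ** 2
    w_ves = 1.0 / vestibular_sd ** 2
    combined = (w_vis * visual_estimate + w_ves * vestibular_estimate) / (w_vis + w_ves)
    combined_sd = (1.0 / (w_vis + w_ves)) ** 0.5
    return combined, combined_sd

# Hypothetical case: the frame suggests 10 deg of tilt, the vestibular cue 0 deg.
equal_reliability, _ = combine_cues(10.0, 2.0, 0.0, 2.0)
noisier_vision, _ = combine_cues(10.0, 4.0, 0.0, 2.0)
```

Making the visual cue noisier pulls the combined estimate toward the vestibular cue, which mirrors the reported drop in frame-induced bias at larger viewing distances.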
Sedentary Behavior in Preschoolers: How Many Days of Accelerometer Monitoring Is Needed?
Byun, Wonwoo; Beets, Michael W.; Pate, Russell R.
2015-01-01
The reliability of accelerometry for measuring sedentary behavior in preschoolers has not been determined, thus we determined how many days of accelerometry monitoring are necessary to reliably estimate daily time spent in sedentary behavior in preschoolers. In total, 191 and 150 preschoolers (three to five years) wore ActiGraph accelerometers (15-s epoch) during the in-school (≥4 days) and the total-day (≥6 days) period respectively. Accelerometry data were summarized as time spent in sedentary behavior (min/h) using three different cutpoints developed for preschool-age children (<37.5, <200, and <373 counts/15 s). The intraclass correlations (ICCs) and Spearman-Brown prophecy formula were used to estimate the reliability of accelerometer for measuring sedentary behavior. Across different cutpoints, the ICCs ranged from 0.81 to 0.92 for in-school sedentary behavior, and from 0.75 to 0.81 for total-day sedentary behavior, respectively. To achieve an ICC of ≥0.8, two to four days or six to nine days of monitoring were needed for in-school sedentary behavior and total-day sedentary behavior, respectively. These findings provide important guidance for future research on sedentary behavior in preschool children using accelerometry. Understanding the reliability of accelerometry will facilitate the conduct of research designed to inform policies and practices aimed at reducing sedentary behavior in preschool children. PMID:26492261
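The Spearman-Brown prophecy formula relates the reliability of a single monitoring day to the reliability of an average over several days, and it can be inverted to ask how many days are needed for a target value. The sketch below shows both directions. The single-day ICC used in the example is hypothetical, since the abstract does not report single-day values, and the function names are chosen here for illustration.

```python
def spearman_brown(single_day_reliability, n_days):
    """Predicted reliability of the mean over n_days of monitoring."""
    r = single_day_reliability
    return n_days * r / (1.0 + (n_days - 1) * r)

def days_needed(single_day_reliability, target_reliability):
    """Days required to reach the target reliability (round up in practice)."""
    r, target = single_day_reliability, target_reliability
    return target * (1.0 - r) / (r * (1.0 - target))

hypothetical_single_day_icc = 0.5
reliability_4_days = spearman_brown(hypothetical_single_day_icc, 4)
required_days = days_needed(hypothetical_single_day_icc, 0.8)
```

The second function returns a fractional number of days; rounding up gives the monitoring period to plan for.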
The reliability of the Hendrich Fall Risk Model in a geriatric hospital.
Heinze, Cornelia; Halfens, Ruud; Dassen, Theo
2008-12-01
Aims and objectives. The purpose of this study was to test the interrater reliability of the Hendrich Fall Risk Model, an instrument to identify patients in a hospital setting with a high risk of falling. Background. Falls are a serious problem in older patients. Valid and reliable fall risk assessment tools are required to identify high-risk patients and to take adequate preventive measures. Methods. Seventy older patients were independently and simultaneously assessed by six pairs of raters made up of nursing staff members. Consensus estimates were calculated using simple percentage agreement and consistency estimates using Spearman's rho and intra class coefficient. Results. Percentage agreement ranged from 0.70 to 0.92 between the six pairs of raters. Spearman's rho coefficients were between 0.54 and 0.80 and the intra class coefficients were between 0.46 and 0.92. Conclusions. Whereas some pairs of raters obtained considerable interobserver agreement and internal consistency, the others did not. Therefore, it is concluded that the Hendrich Fall Risk Model is not a reliable instrument. The use of more unambiguous operationalized items is preferred. Relevance to clinical practice. In practice, well operationalized fall risk assessment tools are necessary. Observer agreement should always be investigated after introducing a standardized measurement tool. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.
A Bayesian Account of Visual–Vestibular Interactions in the Rod-and-Frame Task
de Brouwer, Anouk J.; Medendorp, W. Pieter
2016-01-01
Panoramic visual cues, as generated by the objects in the environment, provide the brain with important information about gravity direction. To derive an optimal, i.e., Bayesian, estimate of gravity direction, the brain must combine panoramic information with gravity information detected by the vestibular system. Here, we examined the individual sensory contributions to this estimate psychometrically. We asked human subjects to judge the orientation (clockwise or counterclockwise relative to gravity) of a briefly flashed luminous rod, presented within an oriented square frame (rod-in-frame). Vestibular contributions were manipulated by tilting the subject’s head, whereas visual contributions were manipulated by changing the viewing distance of the rod and frame. Results show a cyclical modulation of the frame-induced bias in perceived verticality across a 90° range of frame orientations. The magnitude of this bias decreased significantly with larger viewing distance, as if visual reliability was reduced. Biases increased significantly when the head was tilted, as if vestibular reliability was reduced. A Bayesian optimal integration model, with distinct vertical and horizontal panoramic weights, a gain factor to allow for visual reliability changes, and ocular counterroll in response to head tilt, provided a good fit to the data. We conclude that subjects flexibly weigh visual panoramic and vestibular information based on their orientation-dependent reliability, resulting in the observed verticality biases and the associated response variabilities. PMID:27844055
Validity and Reliability of a New Device (WIMU®) for Measuring Hamstring Muscle Extensibility.
Muyor, José M
2017-09-01
The aims of the current study were 1) to evaluate the validity of the WIMU® system for measuring hamstring muscle extensibility in the passive straight leg raise (PSLR) test using an inclinometer for the criterion and 2) to determine the test-retest reliability of the WIMU® system to measure hamstring muscle extensibility during the PSLR test. 55 subjects were evaluated on 2 separate occasions. Data from a Unilever inclinometer and WIMU® system were collected simultaneously. Intraclass correlation coefficients (ICCs) for the validity were very high (0.983-1); a very low systematic bias (-0.21° and -0.42°), random error (0.05° and 0.04°) and standard error of the estimate (0.43° and 0.34°) were observed (left and right leg, respectively) between the 2 devices (inclinometer and the WIMU® system). The R² between the devices was 0.999 (p<0.001) in both the left and right legs. The test-retest reliability of the WIMU® system was excellent, with ICCs ranging from 0.972-0.995, low coefficients of variation (0.01%), and a low standard error of the estimate (0.19-0.31°). The WIMU® system showed strong concurrent validity and excellent test-retest reliability for the evaluation of hamstring muscle extensibility in the PSLR test. © Georg Thieme Verlag KG Stuttgart · New York.
Effect of Surge Current Testing on Reliability of Solid Tantalum Capacitors
NASA Technical Reports Server (NTRS)
Teverovsky, Alexander
2008-01-01
Tantalum capacitors manufactured per military specifications are established reliability components and have less than 0.001% of failures per 1000 hours for grades D or S, thus positioning these parts among electronic components with the highest reliability characteristics. Still, failures of tantalum capacitors do happen, and when they occur they might have catastrophic consequences for the system. To reduce this risk, further development of a screening and qualification system with special attention to the possible deficiencies in the existing procedures is necessary. The purpose of this work is to evaluate the effect of surge current stress testing on reliability of the parts at both steady-state and multiple surge current stress conditions. In order to reveal possible degradation and precipitate more failures, various part types were tested and stressed in the range of voltage and temperature conditions exceeding the specified limits. A model to estimate the probability of post-surge current testing-screening failures and measures to improve the effectiveness of the screening process have been suggested.
Quiroz, Viviana; Reinero, Daniela; Hernández, Patricia; Contreras, Johanna; Vernal, Rolando; Carvajal, Paola
2017-01-01
This study aimed to develop and assess the content validity and reliability of a cognitively adapted self-report questionnaire designed for surveillance of gingivitis in adolescents. Ten predetermined self-report questions evaluating early signs and symptoms of gingivitis were preliminarily assessed by a panel of clinical experts. Eight questions were selected and cognitively tested in 20 adolescents aged 12 to 18 years from Santiago de Chile. The questionnaire was then administered to 178 Chilean adolescents. Internal consistency was measured using Cronbach's alpha and temporal stability was calculated using the Kappa index. A reliable final self-report questionnaire consisting of 5 questions was obtained, with a total Cronbach's alpha of 0.73 and a Kappa index ranging from 0.41 to 0.77 between the different questions. The proposed questionnaire is reliable, with an acceptable internal consistency and a temporal stability from moderate to substantial, and it is promising for estimating the prevalence of gingivitis in adolescents.
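Cronbach's alpha, the internal consistency statistic used here, compares the sum of the item variances with the variance of the total score. The sketch below computes it from raw item scores using the usual formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of totals). The function name and the response data are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list per item, one score per respondent."""
    k = len(item_scores)
    # Total score for each respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1.0 - item_var_sum / variance(totals))

# Hypothetical yes/no (1/0) answers from six respondents to three questions.
answers = [
    [1, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 0, 1],
]
alpha = cronbach_alpha(answers)
```

Alpha approaches 1 when the items rise and fall together across respondents and drops toward 0 when they vary independently.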
The Yale-Brown Obsessive Compulsive Scale: A Reliability Generalization Meta-Analysis.
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Maria; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-10-01
The Yale-Brown Obsessive Compulsive Scale (Y-BOCS) is the most frequently applied test to assess obsessive compulsive symptoms. We conducted a reliability generalization meta-analysis on the Y-BOCS to estimate the average reliability, examine the variability among the reliability estimates, search for moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the Y-BOCS. We included studies where the Y-BOCS was applied to a sample of adults and a reliability estimate was reported. Out of the 11,490 references located, 144 studies met the selection criteria. For the total scale, the mean reliability was 0.866 for coefficients alpha, 0.848 for test-retest correlations, and 0.922 for intraclass correlations. The moderator analyses led to a predictive model where the standard deviation of the total test and the target population (clinical vs. nonclinical) explained 38.6% of the total variability among coefficients alpha. Finally, clinical implications of the results are discussed. © The Author(s) 2014.
Estimating numbers of greater prairie-chickens using mark-resight techniques
Clifton, A.M.; Krementz, D.G.
2006-01-01
Current monitoring efforts for greater prairie-chicken (Tympanuchus cupido pinnatus) populations indicate that populations are declining across their range. Monitoring the population status of greater prairie-chickens is based on traditional lek surveys (TLS) that provide an index without considering detectability. Estimators, such as immigration-emigration joint maximum-likelihood estimator from a hypergeometric distribution (IEJHE), can account for detectability and provide reliable population estimates based on resightings. We evaluated the use of mark-resight methods using radiotelemetry to estimate population size and density of greater prairie-chickens on 2 sites at a tallgrass prairie in the Flint Hills of Kansas, USA. We used average distances traveled from lek of capture to estimate density. Population estimates and confidence intervals at the 2 sites were 54 (CI 50-59) on 52.9 km 2 and 87 (CI 82-94) on 73.6 km2. The TLS performed at the same sites resulted in population ranges of 7-34 and 36-63 and always produced a lower population index than the mark-resight population estimate with a larger range. Mark-resight simulations with varying male:female ratios of marks indicated that this ratio was important in designing a population study on prairie-chickens. Confidence intervals for estimates when no marks were placed on females at the 2 sites (CI 46-50, 76-84) did not overlap confidence intervals when 40% of marks were placed on females (CI 54-64, 91-109). Population estimates derived using this mark-resight technique were apparently more accurate than traditional methods and would be more effective in detecting changes in prairie-chicken populations. Our technique could improve prairie-chicken management by providing wildlife biologists and land managers with a tool to estimate the population size and trends of lekking bird species, such as greater prairie-chickens.
Influences on and Limitations of Classical Test Theory Reliability Estimates.
ERIC Educational Resources Information Center
Arnold, Margery E.
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Reliability Generalization of the Psychopathy Checklist Applied in Youthful Samples
ERIC Educational Resources Information Center
Campbell, Justin S.; Pulos, Steven; Hogan, Mike; Murry, Francie
2005-01-01
This study examines the average reliability of Hare Psychopathy Checklists (PCLs) adapted for use in samples of youthful offenders (aged 12 to 21 years). Two forms of reliability are examined: 18 alpha estimates of internal consistency and 18 intraclass correlation (two or more raters) estimates of interrater reliability. The results, an average…
Reliability reporting across studies using the Buss Durkee Hostility Inventory.
Vassar, Matt; Hale, William
2009-01-01
Empirical research on anger and hostility has pervaded the academic literature for more than 50 years. Accurate measurement of anger/hostility and subsequent interpretation of results requires that the instruments yield strong psychometric properties. For consistent measurement, reliability estimates must be calculated with each administration, because changes in sample characteristics may alter the scale's ability to generate reliable scores. Therefore, the present study was designed to address reliability reporting practices for a widely used anger assessment, the Buss Durkee Hostility Inventory (BDHI). Of the 250 published articles reviewed, 11.2% calculated and presented reliability estimates for the data at hand, 6.8% cited estimates from a previous study, and 77.1% made no mention of score reliability. Mean alpha estimates of scores for BDHI subscales generally fell below acceptable standards. Additionally, no detectable pattern was found between reporting practices and publication year or journal prestige. Areas for future research are also discussed.
Estimating the production, consumption and export of cannabis: The Dutch case.
van der Giessen, Mark; van Ooyen-Houben, Marianne M J; Moolenaar, Debora E G
2016-05-01
Quantifying an illegal phenomenon like a drug market is inherently complex due to its hidden nature and the limited availability of reliable information. This article presents findings from a recent estimate of the production, consumption and export of Dutch cannabis and discusses the opportunities provided by, and limitations of, mathematical models for estimating the illegal cannabis market. The data collection consisted of a comprehensive literature study, secondary analyses on data from available registrations (2012-2014) and previous studies, and expert opinion. The cannabis market was quantified with several mathematical models. The data analysis included a Monte Carlo simulation to come to a 95% interval estimate (IE) and a sensitivity analysis to identify the most influential indicators. The annual production of Dutch cannabis was estimated to be between 171 and 965 tons (95% IE of 271-613 tons). The consumption was estimated to be between 28 and 119 tons, depending on the inclusion or exclusion of non-residents (95% IE of 51-78 tons or 32-49 tons respectively). The export was estimated to be between 53 and 937 tons (95% IE of 206-549 tons or 231-573 tons, respectively). Mathematical models are valuable tools for the systematic assessment of the size of illegal markets and determining the uncertainty inherent in the estimates. The estimates required the use of many assumptions and the availability of reliable indicators was limited. This uncertainty is reflected in the wide ranges of the estimates. The estimates are sensitive to 10 of the 45 indicators. These 10 account for 86-93% of the variation found. Further research should focus on improving the variables and the independence of the mathematical models. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Carvallo, Claire; Özdemir, Özden; Dunlop, David J.
2004-01-01
We measured palaeodirections and palaeointensities by the Thellier method on 93 samples from three of the Emperor seamounts: 20 from Detroit seamount (81 Ma), 48 from Nintoku seamount (56 Ma) and 25 from Koko seamount (48 Ma). Reliable palaeodirections obtained from three lava flows on Nintoku seamount give an average palaeolatitude of 32.7°, which is different from the present-day latitude of Hawaii and supports the hypothesis of a moving hotspot. According to the selection criteria traditionally used in palaeointensity determination, 17 samples give a reliable result. The samples show a very wide variety in unblocking temperatures, revealing an important variation in titanium content and the oxidation state of titanomagnetites. In order to assess the reliability of the palaeofield recording in the accepted samples, we carried out measurements of saturation isothermal remanent magnetization at low temperature and thermomagnetic curves. We found Curie temperatures varying from 250 to 580 °C, not only between seamounts but even within one lava flow. Thermomagnetic curves enabled us to identify titanomaghemite in several lava flows. After rejecting the results from samples showing evidence of maghemitization, only four samples, all from Nintoku seamount, can be considered as truly reliable. The palaeointensity values range between 34.2 and 36.9 μT. The low virtual axial dipole moment (VADM) values calculated from the palaeofield values are consistent with the most reliable VADM estimates in this time range and they are very close to the average VADM in the 0.3-300 Ma time interval.
Temporal Stability of the Dutch Version of the Wechsler Memory Scale-Fourth Edition (WMS-IV-NL).
Bouman, Zita; Hendriks, Marc P H; Aldenkamp, Albert P; Kessels, Roy P C
2015-01-01
The Wechsler Memory Scale-Fourth Edition (WMS-IV) is one of the most widely used memory batteries. We examined the test-retest reliability, practice effects, and standardized regression-based (SRB) change norms for the Dutch version of the WMS-IV (WMS-IV-NL) after both short and long retest intervals. The WMS-IV-NL was administered twice after either a short (M = 8.48 weeks, SD = 3.40 weeks, range = 3-16) or a long (M = 17.87 months, SD = 3.48, range = 12-24) retest interval in a sample of 234 healthy participants (M = 59.55 years, range = 16-90; 118 completed the Adult Battery; and 116 completed the Older Adult Battery). The test-retest reliability estimates varied across indexes. They were adequate to good after a short retest interval (ranging from .74 to .86), with the exception of the Visual Working Memory Index (r = .59), yet generally lower after a long retest interval (ranging from .56 to .77). Practice effects were only observed after a short retest interval (overall group mean gains up to 11 points), whereas no significant change in performance was found after a long retest interval. Furthermore, practice effect-adjusted SRB change norms were calculated for all WMS-IV-NL index scores. Overall, this study shows that the test-retest reliability of the WMS-IV-NL varied across indexes. Practice effects were observed after a short retest interval, but no evidence was found for practice effects after a long retest interval from one to two years. Finally, the SRB change norms were provided for the WMS-IV-NL.
USDA-ARS's Scientific Manuscript database
Errors in rater estimates of plant disease severity occur, and standard area diagrams (SADs) help improve accuracy and reliability. The effect of diagram number in a SAD set on accuracy and reliability is unknown. The objective of this study was to compare estimates of pecan scab severity made witho...
Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.
ERIC Educational Resources Information Center
Feldt, Leonard S.
2002-01-01
Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
ERIC Educational Resources Information Center
Saupe, Joe L.; Eimers, Mardy T.
2013-01-01
The purpose of this paper is to explore differences in the reliabilities of cumulative college grade point averages (GPAs), estimated for unweighted and weighted, one-semester, 1-year, 2-year, and 4-year GPAs. Using cumulative GPAs for a freshman class at a major university, we estimate internal consistency (coefficient alpha) reliabilities for…
Assessment of the impact of the scanner-related factors on brain morphometry analysis with Brainvisa
2011-01-01
Background Brain morphometry is extensively used in cross-sectional studies. However, the difference in the estimated values of the morphometric measures between patients and healthy subjects may be small and hence overshadowed by the scanner-related variability, especially with multicentre and longitudinal studies. It is important therefore to investigate the variability and reliability of morphometric measurements between different scanners and different sessions of the same scanner. Methods We assessed the variability and reliability for the grey matter, white matter, cerebrospinal fluid and cerebral hemisphere volumes as well as the global sulcal index, sulcal surface and mean geodesic depth using Brainvisa. We used datasets obtained across multiple MR scanners at 1.5 T and 3 T from the same groups of 13 and 11 healthy volunteers, respectively. For each morphometric measure, we conducted ANOVA analysis and verified whether the estimated values were significantly different across different scanners or different sessions of the same scanner. The between-centre and between-visit reliabilities were estimated from their contribution to the total variance, using a random-effects ANOVA model. To estimate the main processes responsible for low reliability, the results of brain segmentation were compared to those obtained using FAST within FSL. Results In a considerable number of cases, the main effects of both centre and visit factors were found to be significant. Moreover, both between-centre and between-visit reliabilities ranged from poor to excellent for most morphometric measures. A comparison between segmentation using Brainvisa and FAST revealed that FAST improved the reliabilities for most cases, suggesting that morphometry could benefit from improving the bias correction. However, the results were still significantly different across different scanners or different visits. 
Conclusions Our results confirm that for morphometry analysis with the current version of Brainvisa using data from multicentre or longitudinal studies, the scanner-related variability must be taken into account and where possible should be corrected for. We also suggest providing some flexibility to Brainvisa for a step-by-step analysis of the robustness of this package in terms of reproducibility of the results by allowing the bias corrected images to be imported from other packages and bias correction step be skipped, for example. PMID:22189342
Reliable Change and Outcome Trajectories Across Levels of Care in a Mental Health System for Youth.
Jackson, David S; Keir, Scott S; Sender, Max; Mueller, Charles W
2017-01-01
Knowledge of mental health treatment outcome trajectories across various service types can be valuable for both system- and client-level decision-making. Using longitudinal youth functional impairment scores across 2807 treatment episodes, this study examined outcome trajectories and estimated the number of months required for reliable change across nine major services (or levels of care). Results indicate logarithmic improvement trajectories for a majority of levels of care and significant differences in time until improvement ranging from 4 to 12 months. Findings can guide system-level policies on lengths of treatment and service authorizations and provide expected treatment response data for client-level treatment decisions.
How social influence can undermine the wisdom of crowd effect.
Lorenz, Jan; Rauhut, Heiko; Schweitzer, Frank; Helbing, Dirk
2011-05-31
Social groups can be remarkably smart and knowledgeable when their averaged judgements are compared with the judgements of individuals. Already Galton [Galton F (1907) Nature 75:7] found evidence that the median estimate of a group can be more accurate than estimates of experts. This wisdom of crowd effect was recently supported by examples from stock markets, political elections, and quiz shows [Surowiecki J (2004) The Wisdom of Crowds]. In contrast, we demonstrate by experimental evidence (N = 144) that even mild social influence can undermine the wisdom of crowd effect in simple estimation tasks. In the experiment, subjects could reconsider their response to factual questions after having received average or full information of the responses of other subjects. We compare subjects' convergence of estimates and improvements in accuracy over five consecutive estimation periods with a control condition, in which no information about others' responses was provided. Although groups are initially "wise," knowledge about estimates of others narrows the diversity of opinions to such an extent that it undermines the wisdom of crowd effect in three different ways. The "social influence effect" diminishes the diversity of the crowd without improvements of its collective error. The "range reduction effect" moves the position of the truth to peripheral regions of the range of estimates so that the crowd becomes less reliable in providing expertise for external observers. The "confidence effect" boosts individuals' confidence after convergence of their estimates despite lack of improved accuracy. Examples of the revealed mechanism range from misled elites to the recent global financial crisis.
Estimation of distributional parameters for censored trace-level water-quality data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1984-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.
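The log-probability regression idea described in this abstract can be sketched as follows for the simplest case of a single detection limit. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, and the Blom plotting positions used to assign z scores are an assumption, since the abstract does not say which plotting positions were used.

```python
from math import exp, log
from statistics import NormalDist, mean


def log_probability_regression(uncensored, n_censored):
    """Fit a lognormal line to uncensored data and fill in the censored tail.

    uncensored: concentrations reported at or above the detection limit.
    n_censored: number of observations reported only as below that limit.
    Returns (intercept, slope, imputed_values) on the log scale.
    """
    n = len(uncensored) + n_censored
    # z score for every rank; Blom plotting positions are an assumption here.
    z = [NormalDist().inv_cdf((rank - 0.375) / (n + 0.25)) for rank in range(1, n + 1)]
    # Censored values occupy the lowest ranks, uncensored values the rest.
    z_cens, z_unc = z[:n_censored], z[n_censored:]
    y = [log(c) for c in sorted(uncensored)]
    # Ordinary least squares of log concentration on z score.
    z_bar, y_bar = mean(z_unc), mean(y)
    slope = (sum((a - z_bar) * (b - y_bar) for a, b in zip(z_unc, y))
             / sum((a - z_bar) ** 2 for a in z_unc))
    intercept = y_bar - slope * z_bar
    # Extrapolate the fitted line to the z scores of the censored ranks.
    imputed = [exp(intercept + slope * a) for a in z_cens]
    return intercept, slope, imputed
```

The mean and standard deviation would then be computed from the imputed values combined with the uncensored observations, so the censored observations contribute through their ranks rather than through an arbitrary substitution such as half the detection limit.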
Tsai, Alexander C.; Scott, Jennifer A.; Hung, Kristin J.; Zhu, Jennifer Q.; Matthews, Lynn T.; Psaros, Christina; Tomlinson, Mark
2013-01-01
Background A major barrier to improving perinatal mental health in Africa is the lack of locally validated tools for identifying probable cases of perinatal depression or for measuring changes in depression symptom severity. We systematically reviewed the evidence on the reliability and validity of instruments to assess perinatal depression in African settings. Methods and Findings Of 1,027 records identified through searching 7 electronic databases, we reviewed 126 full-text reports. We included 25 unique studies, which were disseminated in 26 journal articles and 1 doctoral dissertation. These enrolled 12,544 women living in nine different North and sub-Saharan African countries. Only three studies (12%) used instruments developed specifically for use in a given cultural setting. Most studies provided evidence of criterion-related validity (20 [80%]) or reliability (15 [60%]), while fewer studies provided evidence of construct validity, content validity, or internal structure. The Edinburgh postnatal depression scale (EPDS), assessed in 16 studies (64%), was the most frequently used instrument in our sample. Ten studies estimated the internal consistency of the EPDS (median estimated coefficient alpha, 0.84; interquartile range, 0.71-0.87). For the 14 studies that estimated sensitivity and specificity for the EPDS, we constructed 2 x 2 tables for each cut-off score. Using a bivariate random-effects model, we estimated a pooled sensitivity of 0.94 (95% confidence interval [CI], 0.68-0.99) and a pooled specificity of 0.77 (95% CI, 0.59-0.88) at a cut-off score of ≥9, with higher cut-off scores yielding greater specificity at the cost of lower sensitivity. Conclusions The EPDS can reliably and validly measure perinatal depression symptom severity or screen for probable postnatal depression in African countries, but more validation studies on other instruments are needed. 
In addition, more qualitative research is needed to adequately characterize local understandings of perinatal depression-like syndromes in different African contexts. PMID:24340036
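The trade-off between sensitivity and specificity across cut-off scores, noted in the abstract above, follows directly from how the 2 x 2 table is built. A minimal sketch with hypothetical screening scores and reference diagnoses (none of these values come from the review, and the function names are illustrative only):

```python
def two_by_two(scores, has_condition, cutoff):
    """Count (tp, fp, fn, tn), calling a score >= cutoff a positive screen."""
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, has_condition):
        positive = score >= cutoff
        if positive and truth:
            tp += 1
        elif positive and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores with reference-standard diagnoses.
scores = [12, 9, 4, 10, 3, 8]
reference = [True, True, False, False, False, True]
for cutoff in (9, 11):
    print(cutoff, sensitivity_specificity(*two_by_two(scores, reference, cutoff)))
```

In this toy data, raising the cut-off from 9 to 11 removes the one false positive but misses an additional true case, which is the pattern the abstract describes. The pooled estimates in the review come from a bivariate random-effects model, which this sketch does not attempt.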
Reliability of the cervical vertebrae maturation (CVM) method.
Predko-Engel, A; Kaminek, M; Langova, K; Kowalski, P; Fudalej, P S
2015-01-01
To assess the reliability of the cervical vertebrae maturation method (CVM). Skeletal maturity estimation can influence the manner and time of orthodontic treatment. The CVM method evaluates skeletal growth on the basis of the changes in the morphology of cervical vertebrae C2, C3, C4 during growth. These vertebrae are visible on a lateral cephalogram, so the method does not require an additional radiograph. In this website-based study, 10 orthodontists with a long clinical practice (3 routinely using the method - "Routine user - RU" and 7 with less experience in the CVM method - "Non-Routine user - nonRU") twice rated cervical vertebrae maturation with the CVM method on 50 cropped scans of lateral cephalograms of children in circumpubertal age (for boys: 11.5 to 15.5 years; for girls: 10 to 14 years). Kappa statistics (with lower limits of 95% confidence intervals (CI)) and proportion of complete agreement on staging were used to evaluate intra- and inter-assessor agreement. The mean weighted kappa for intra-assessor agreement was 0.44 (range: 0.30-0.64; range of lower limits of 95% CI: 0.12-0.48) and for inter-assessor agreement was 0.28 (range: -0.01-0.58; range of lower limits of 95% CI: -0.14-0.42). The mean proportion of identical scores assigned by the same assessor was 55.2 % (range: 44-74 %) and for different pairs of assessors was 42 % (range: 16-68 %). The reliability of the CVM method is questionable and if orthodontic treatment should be initiated relative to the maximum growth, the use of additional biologic indicators should be considered (Tab. 4, Fig. 1, Ref. 24).
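As background to the weighted kappa values reported in this abstract, here is a minimal sketch of Cohen's weighted kappa for two sets of ordinal stage ratings. Linear disagreement weights are an assumption (the abstract does not state the weighting scheme), and the function name and example ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linearly weighted Cohen's kappa; stages are coded 0 .. n_categories - 1."""
    n = len(ratings_a)
    k = n_categories
    # Observed joint proportions of (stage from A, stage from B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Disagreement is weighted by how many stages apart the two ratings are.
    dis_obs = sum(abs(i - j) * observed[i][j] for i in range(k) for j in range(k))
    dis_exp = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - dis_obs / dis_exp


# Hypothetical stagings of four radiographs on a three-stage scale.
print(round(weighted_kappa([0, 1, 2, 2], [0, 2, 2, 1], 3), 3))
```

Unlike the raw proportion of identical scores, kappa subtracts the agreement expected by chance from the marginal stage frequencies, and the weighting penalises a two-stage disagreement more than a one-stage one.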
Accuracy and reliability of pulp/tooth area ratio in upper canines by peri-apical X-rays.
Azevedo, A C; Michel-Crosato, E; Biazevic, M G H; Galić, I; Merelli, V; De Luca, S; Cameriere, R
2014-11-01
Due to the real need for careful staff training in age assessment, in order to improve capacity, consistency and competence, new research on the reliability and repeatability of methods frequently used in age assessment is required. The aim of this study was twofold: first, to test the accuracy of this method for age estimation; second, to obtain data on the reliability of this technique. A sample of 81 peri-apical radiographs of upper canines (44 men and 37 women), aged between 19 and 74 years, was used; the teeth were taken from the osteological collection of Sassari (Sardinia, Italy). Three blinded observers used the technique in order to perform the age estimation. The mean real age of the 81 observations was 37.21 (CI95% 34.37; 40.05), and estimated ages ranged from 36.65 to 38.99 (CI95%-Ex1 35.42; 41.28; CI95%-Ex2 33.89; 39.41; CI95%-Ex3 35.92; 42.06). The module differences found by the three observers were 3.43, 4.24 and 4.45, respectively for Ex1×Ex2, Ex1×Ex3 and Ex2×Ex3. The module differences observed among real and observed ages were 2.55 (CI95% 1.90; 3.20), 2.22 (CI95% 1.65; 2.78) and 4.39 (CI95% 3.80; 5.75), respectively for Ex1, Ex2 and Ex3. No differences were observed among measurements. This technique can be reproduced and repeated after proper training, since high reliability and accuracy were found. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Tsai, Alexander C.
2014-01-01
OBJECTIVES To systematically review the reliability and validity of instruments used to screen for major depressive disorder or assess depression symptom severity among persons with HIV in sub-Saharan Africa. DESIGN Systematic review and meta-analysis. METHODS A systematic evidence search protocol was applied to seven bibliographic databases. Studies examining the reliability and/or validity of depression assessment tools were selected for inclusion if they were based on data collected from HIV-positive adults in any African member state of the United Nations. Random-effects meta-analysis was employed to calculate pooled estimates of depression prevalence. In a subgroup of studies of criterion-related validity, the bivariate random-effects model was used to calculate pooled estimates of sensitivity and specificity. RESULTS Of 1,117 records initially identified, I included 13 studies of 5,373 persons with HIV in 7 sub-Saharan African countries. Reported estimates of Cronbach’s alpha ranged from 0.63–0.95, and analyses of internal structure generally confirmed the existence of a depression-like construct accounting for a substantial portion of variance. The pooled prevalence of probable depression was 29.5% (95% CI, 20.5–39.4), while the pooled prevalence of major depressive disorder was 13.9% (95% CI, 9.7–18.6). The Center for Epidemiologic Studies-Depression scale was the most frequently studied instrument, with a pooled sensitivity of 0.82 (95% CI, 0.73–0.87) for detecting major depressive disorder. CONCLUSIONS Depression screening instruments yielded relatively high false positive rates. Overall, few studies described the reliability and/or validity of depression instruments in sub-Saharan Africa. PMID:24853307
General Aviation Aircraft Reliability Study
NASA Technical Reports Server (NTRS)
Pettit, Duane; Turnbull, Andrew; Roelant, Henk A. (Technical Monitor)
2001-01-01
This reliability study was performed in order to provide the aviation community with an estimate of Complex General Aviation (GA) Aircraft System reliability. To successfully improve the safety and reliability for the next generation of GA aircraft, a study of current GA aircraft attributes was prudent. This was accomplished by benchmarking the reliability of operational Complex GA Aircraft Systems. Specifically, Complex GA Aircraft System reliability was estimated using data obtained from the logbooks of a random sample of the Complex GA Aircraft population.
Mortsiefer, Achim; Immecke, Janine; Rotthoff, Thomas; Karger, André; Schmelzer, Regine; Raski, Bianca; Schmitten, Jürgen In der; Altiner, Attila; Pentzek, Michael
2014-06-01
To evaluate the summative assessment (OSCE) of a communication training programme for dealing with challenging doctor-patient encounters in the 4th study year. Our OSCE consists of 4 stations (breaking bad news, guilt and shame, aggressive patients, shared decision making), using a 4-item global rating (GR) instrument. We calculated reliability coefficients for different levels, discriminability of single items and interrater reliability. Validity was estimated by gender differences and accordance between GR and a checklist. In a pooled sample of 456 students in 3 OSCEs over 3 terms, total reliability was α=0.64, reliability coefficients for single stations were >0.80, and discriminability in 3 of 4 stations was within the range of 0.4-0.7. Except for one station, interrater reliability was moderate to strong. Reliability on item level was poor and pointed to some problems with the use of the GR. The application of the GR on regular undergraduate medical education shows moderate reliability in need of improvement and some traits of validity. Ongoing development and evaluation is needed with particular regard to the training of the examiners. Our CoMeD-OSCE proved suitable for the summative assessment of communication skills in challenging doctor-patient encounters. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Lane, Ginny G.; White, Amy E.; Henson, Robin K.
2002-01-01
Conducted a reliability generalizability study on the Coopersmith Self-Esteem Inventory (CSEI; S. Coopersmith, 1967) to examine the variability of reliability estimates across studies and to identify study characteristics that may predict this variability. Results show that reliability for CSEI scores can vary considerably, especially at the…
Online estimation of lithium-ion battery capacity using sparse Bayesian learning
NASA Astrophysics Data System (ADS)
Hu, Chao; Jain, Gaurav; Schmidt, Craig; Strief, Carrie; Sullivan, Melani
2015-09-01
Lithium-ion (Li-ion) rechargeable batteries are used as one of the major energy storage components for implantable medical devices. Reliability of Li-ion batteries used in these devices has been recognized as of high importance from a broad range of stakeholders, including medical device manufacturers, regulatory agencies, patients and physicians. To ensure a Li-ion battery operates reliably, it is important to develop health monitoring techniques that accurately estimate the capacity of the battery throughout its life-time. This paper presents a sparse Bayesian learning method that utilizes the charge voltage and current measurements to estimate the capacity of a Li-ion battery used in an implantable medical device. Relevance Vector Machine (RVM) is employed as a probabilistic kernel regression method to learn the complex dependency of the battery capacity on the characteristic features that are extracted from the charge voltage and current measurements. Owing to the sparsity property of RVM, the proposed method generates a reduced-scale regression model that consumes only a small fraction of the CPU time required by a full-scale model, which makes online capacity estimation computationally efficient. 10 years' continuous cycling data and post-explant cycling data obtained from Li-ion prismatic cells are used to verify the performance of the proposed method.
Ruiz Ayma, Gabriel; Olalla Kerstupp, Alina; Macías Duarte, Alberto; Guzmán Velasco, Antonio; González Rojas, José I
2016-08-26
The western burrowing owl (Athene cunicularia hypugaea) occurs throughout western North America in various habitats such as desert, short-grass prairie and shrub-steppe, among others, where the main threat for this species is habitat loss. Range-wide declines have prompted a need for reliable estimates of its populations in Mexico, where the size of resident and migratory populations remain unknown. Our objective was to estimate the abundance and density of breeding western burrowing owl populations in Mexican prairie dog (Cynomys mexicanus) colonies in two sites located within the Chihuahuan Desert ecoregion in the states of Nuevo Leon and San Luis Potosi, Mexico. Line transect surveys were conducted from February to April of 2010 and 2011. Fifty 60 ha transects were analyzed using distance sampling to estimate owl and Mexican prairie dog populations. We estimated a population of 2026 owls (95 % CI 1756-2336) in 2010 and 2015 owls (95 % CI 1573-2317) in 2011 across 50 Mexican prairie dog colonies (20,529 ha). The results represent the first systematic attempt to provide reliable evidence related to the size of the adult owl populations, within the largest and best preserved Mexican prairie dog colonies in Mexico.
NASA Astrophysics Data System (ADS)
He, Ling-Yun; Qian, Wen-Bin
2012-07-01
A correct or precise estimation of the Hurst exponent is one of the fundamentally important problems in the financial economics literature. There are three widely used tools to estimate the Hurst exponent, the canonical rescaled range (R/S), the variance rescaled statistic (V/S) and the Modified rescaled range (Modified R/S). To clarify their performance, we compare them by Monte Carlo simulations; we generate many time-series of a fractal Brownian motion, of a Weierstrass-Mandelbrot cosine fractal function and of a fractionally integrated process, whose theoretical Hurst exponents are known, to compare the Hurst exponents estimated by the three methods. To better understand their pragmatic performance, we further apply all of these methods empirically in real-world applications. Our results imply it is not appropriate to conclude simply which method is better as V/S performs better when the analyzed market is anti-persistent while R/S seems to be a reliable tool used in persistent market.
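To make the canonical rescaled range (R/S) estimator named in this abstract concrete, here is a minimal sketch of the classical procedure: compute R/S on non-overlapping windows of several sizes, then take the slope of log(R/S) against log(window size) as the Hurst exponent. The function names, window sizes, and test series are hypothetical choices, and the sketch does not cover the V/S or Modified R/S variants.

```python
import random
from math import log, sqrt


def rescaled_range(chunk):
    """R/S of one window: range of cumulative mean-adjusted sums over the std."""
    m = sum(chunk) / len(chunk)
    cum = lo = hi = 0.0  # the cumulative sum always ends at 0, so 0 is in range
    for x in chunk:
        cum += x - m
        lo, hi = min(lo, cum), max(hi, cum)
    s = sqrt(sum((x - m) ** 2 for x in chunk) / len(chunk))
    if s == 0:
        raise ValueError("constant window has no rescaled range")
    return (hi - lo) / s


def hurst_rs(series, window_sizes):
    """Slope of log(mean R/S) on log(window size) over non-overlapping windows."""
    xs, ys = [], []
    for n in window_sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        rs = [rescaled_range(c) for c in chunks]
        xs.append(log(n))
        ys.append(log(sum(rs) / len(rs)))
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))


# Hypothetical input: uncorrelated Gaussian noise, theoretical H = 0.5.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(4096)]
print(hurst_rs(noise, [16, 32, 64, 128, 256]))
```

On short windows the classical estimator is known to be biased somewhat above 0.5 for uncorrelated noise, which is one reason comparisons against series with known exponents, as in the abstract, are informative.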
On Algorithms for Generating Computationally Simple Piecewise Linear Classifiers
1989-05-01
suffers. - Waveform classification, e.g. speech recognition, seismic analysis (i.e. discrimination between earthquakes and nuclear explosions), target...assuming Gaussian distributions (B-G) d) Bayes classifier with probability densities estimated with the k-N-N method (B- kNN ) e) The -arest neighbour...range of classifiers are chosen including a fast, easy computable and often used classifier (B-G), reliable and complex classifiers (B- kNN and NNR
Forde, David R; Baron, Stephen W; Scher, Christine D; Stein, Murray B
2012-01-01
This study examines the psychometric properties of the Childhood Trauma Questionnaire short form (CTQ-SF) with street youth who have run away or been expelled from their homes (N = 397). Internal reliability coefficients for the five clinical scales ranged from .65 to .95. Confirmatory Factor Analysis (CFA) was used to test the five-factor structure of the scales yielding acceptable fit for the total sample. Additional multigroup analyses were performed to consider items by gender. Results provided only evidence of weak factorial invariance. Constrained models showed invariance in configuration, factor loadings, and factor covariances but failed for equality of intercepts. Mean trauma scores for street youth tended to fall in the moderate to severe range on all abuse/neglect clinical scales. Females reported higher levels of abuse and neglect. Prevalence of child maltreatment of individual forms was very high with 98% of street youth reporting one or more forms; 27.4% of males and 48.9% of females reported all five forms. Results of this study support the viability of the CTQ-SF for screening maltreatment in a highly vulnerable street population. Caution is recommended when comparing prevalence estimates for male and female street youth given the failure of the strong factorial multigroup model.
NASA Astrophysics Data System (ADS)
Sandoval-Soto, L.; Stanimirov, M.; von Hobe, M.; Schmitt, V.; Valdes, J.; Wild, A.; Kesselmeier, J.
2005-01-01
COS uptake by trees, as observed under dark/light changes and under application of the plant hormone abscisic acid, exhibited a strong correlation with the CO2 assimilation rate and the stomatal conductance. As the uptake of COS occurred exclusively through the stomata we compared experimentally derived and re-evaluated deposition velocities (Vd for COS and CO2). We show that Vd of COS is generally significantly larger than that of CO2. We therefore introduced this attribute into a new global estimate of COS fluxes into vegetation. The global COS uptake by vegetation as estimated by the new model ranges between 0.69-1.40 Tg a-1, based on the Net Primary Productivity (NPP). Taking into account Gross Primary Productivity (GPP) the deposition estimate ranges between 1.37-2.81 Tg a-1 (0.73-1.50 Tg S a-1). We believe that in order to obtain accurate and reliable global NPP-based estimates for the COS flux into vegetation, the different deposition velocities of COS and CO2 must be taken into account.
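The sulfur-equivalent range quoted in parentheses can be reproduced from the COS range by scaling with the ratio of molar masses. The short check below is a sketch using standard atomic masses; the variable and function names are illustrative.

```python
# Standard atomic masses in g/mol: C 12.011, O 15.999, S 32.06.
M_S = 32.06
M_COS = 12.011 + 15.999 + 32.06  # about 60.07 g/mol


def cos_to_sulfur(tg_cos):
    """Convert a mass of COS (Tg) to the mass of sulfur it contains (Tg S)."""
    return tg_cos * M_S / M_COS


gpp_range_cos = (1.37, 2.81)  # Tg COS per year, from the abstract
gpp_range_s = tuple(round(cos_to_sulfur(v), 2) for v in gpp_range_cos)
print(gpp_range_s)  # (0.73, 1.5)
```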
Occurrence and distribution of Indian primates
Karanth, K.K.; Nichols, J.D.; Hines, J.E.
2010-01-01
Global and regional species conservation efforts are hindered by poor distribution data and range maps. Many Indian primates face extinction, but assessments of population status are hindered by lack of reliable distribution data. We estimated the current occurrence and distribution of 15 Indian primates by applying occupancy models to field data from a country-wide survey of local experts. We modeled species occurrence in relation to ecological and social covariates (protected areas, landscape characteristics, and human influences), which we believe are critical to determining species occurrence in India. We found evidence that protected areas positively influence occurrence of seven species and for some species are their only refuge. We found evergreen forests to be more critical for some primates along with temperate and deciduous forests. Elevation negatively influenced occurrence of three species. Lower human population density was positively associated with occurrence of five species, and higher cultural tolerance was positively associated with occurrence of three species. We find that 11 primates occupy less than 15% of the total land area of India. Vulnerable primates with restricted ranges are Golden langur, Arunachal macaque, Pig-tailed macaque, stump-tailed macaque, Phayre's leaf monkey, Nilgiri langur and Lion-tailed macaque. Only Hanuman langur and rhesus macaque are widely distributed. We find occupancy modeling to be useful in determining species ranges, and in agreement with current species ranking and IUCN status. In landscapes where monitoring efforts require optimizing cost, effort and time, we used ecological and social covariates to reliably estimate species occurrence and focus species conservation efforts. © Elsevier Ltd.
DNA content analysis allows discrimination between Trypanosoma cruzi and Trypanosoma rangeli.
Naves, Lucila Langoni; da Silva, Marcos Vinícius; Fajardo, Emanuella Francisco; da Silva, Raíssa Bernardes; De Vito, Fernanda Bernadelli; Rodrigues, Virmondes; Lages-Silva, Eliane; Ramírez, Luis Eduardo; Pedrosa, André Luiz
2017-01-01
Trypanosoma cruzi, a human protozoan parasite, is the causative agent of Chagas disease. Currently the species is divided into six taxonomic groups. The genome of the CL Brener clone has been estimated to be 106.4-110.7 Mb, and DNA content analyses revealed that it is a diploid hybrid clone. Trypanosoma rangeli is a hemoflagellate that has the same reservoirs and vectors as T. cruzi; however, it is non-pathogenic to vertebrate hosts. The haploid genome of T. rangeli was previously estimated to be 24 Mb. The parasitic strains of T. rangeli are divided into KP1(+) and KP1(-). Thus, the objective of this study was to investigate the DNA content in different strains of T. cruzi and T. rangeli by flow cytometry. All T. cruzi and T. rangeli strains yielded cell cycle profiles with clearly identifiable G1-0 (2n) and G2-M (4n) peaks. T. cruzi and T. rangeli genome sizes were estimated using the clone CL Brener and the Leishmania major CC1 as reference cell lines because their genome sequences have been previously determined. The DNA content of T. cruzi strains ranged from 87.41 to 108.16 Mb, and the DNA content of T. rangeli strains ranged from 63.25 Mb to 68.66 Mb. No differences in DNA content were observed between KP1(+) and KP1(-) T. rangeli strains. Cultures containing mixtures of the epimastigote forms of T. cruzi and T. rangeli strains resulted in cell cycle profiles with distinct G1 peaks for strains of each species. These results demonstrate that DNA content analysis by flow cytometry is a reliable technique for discrimination between T. cruzi and T. rangeli isolated from different hosts.
Ahluwalia, Indu B; Helms, Kristen; Morrow, Brian
2013-01-01
We investigated the reliability and validity of three self-reported indicators from the Pregnancy Risk Assessment Monitoring System (PRAMS) survey. We used 2008 PRAMS (n=15,646) data from 12 states that had implemented the 2003 revised U.S. Certificate of Live Birth. We estimated reliability by kappa coefficient and validity by sensitivity and specificity using the birth certificate data as the reference for the following: prenatal participation in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC); Medicaid payment for delivery; and breastfeeding initiation. These indicators were examined across several demographic subgroups. The reliability was high for all three measures: 0.81 for WIC participation, 0.67 for Medicaid payment of delivery, and 0.72 for breastfeeding initiation. The validity of PRAMS indicators was also high: WIC participation (sensitivity = 90.8%, specificity = 90.6%), Medicaid payment for delivery (sensitivity = 82.4%, specificity = 85.6%), and breastfeeding initiation (sensitivity = 94.3%, specificity = 76.0%). The prevalence estimates were higher on PRAMS than the birth certificate for each of the indicators except Medicaid-paid delivery among non-Hispanic black women. Kappa values within most subgroups remained in the moderate range (0.40-0.80). Sensitivity and specificity values were lower for Hispanic women who responded to the PRAMS survey in Spanish and for breastfeeding initiation among women who delivered very low birthweight and very preterm infants. The validity and reliability of the PRAMS data for measures assessed were high. Our findings support the use of PRAMS data for epidemiological surveillance, research, and planning.
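The validity measures reported above follow from a 2x2 cross-tabulation of the survey answer against the reference record. The sketch below shows the arithmetic on hypothetical counts (the abstract reports only the resulting percentages, not the underlying tables); the function name and the counts are assumptions.

```python
def validity_summary(tp, fp, fn, tn):
    """Validity of a self-reported indicator against a reference record.

    tp: yes on survey and yes on reference    fp: yes on survey, no on reference
    fn: no on survey, yes on reference        tn: no on both
    """
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "survey_prevalence": (tp + fp) / n,
        "reference_prevalence": (tp + fn) / n,
    }


# Hypothetical counts for illustration only; not PRAMS data.
summary = validity_summary(tp=450, fp=60, fn=50, tn=440)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

When false positives outnumber false negatives, as in these made-up counts, the survey prevalence exceeds the reference prevalence, which is the pattern the abstract describes for most indicators.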
Operations and Modeling Analysis
NASA Technical Reports Server (NTRS)
Ebeling, Charles
2005-01-01
The Reliability and Maintainability Analysis Tool (RMAT) provides NASA the capability to estimate reliability and maintainability (R&M) parameters and operational support requirements for proposed space vehicles based upon relationships established from both aircraft and Shuttle R&M data. RMAT has matured both in its underlying database and in its level of sophistication in extrapolating this historical data to satisfy proposed mission requirements, maintenance concepts and policies, and type of vehicle (i.e. ranging from aircraft-like to shuttle-like). However, a companion analysis tool, the Logistics Cost Model (LCM), has not reached the same level of maturity as RMAT due, in large part, to nonexistent or outdated cost estimating relationships and underlying cost databases, and its almost exclusive dependence on Shuttle operations and logistics cost input parameters. As a result, the full capability of the RMAT/LCM suite of analysis tools to take a conceptual vehicle and derive its operations and support requirements along with the resulting operating and support costs has not been realized.
Wavelet analysis for wind fields estimation.
Leite, Gladeston C; Ushizima, Daniela M; Medeiros, Fátima N S; de Lima, Gilson G
2010-01-01
Wind field analysis from synthetic aperture radar images allows the estimation of wind direction and speed based on image descriptors. In this paper, we propose a framework to automate wind direction retrieval based on wavelet decomposition associated with spectral processing. We extend existing undecimated wavelet transform approaches by including à trous with B(3) spline scaling function, in addition to other wavelet bases such as Gabor and Mexican-hat. The purpose is to extract more reliable directional information when wind speed values range from 5 to 10 ms(-1). Using C-band empirical models, associated with the estimated directional information, we calculate local wind speed values and compare our results with QuikSCAT scatterometer data. The proposed approach has potential application in the evaluation of oil spills and wind farms.
Estimation of respiratory rhythm during night sleep using a bio-radar
NASA Astrophysics Data System (ADS)
Tataraidze, Alexander; Anishchenko, Lesya; Alekhin, Maksim; Korostovtseva, Lyudmila; Sviryaev, Yurii
2014-05-01
An assessment of bio-radiolocation monitoring of respiratory rhythm during sleep is given. Full-night respiratory inductance plethysmography (RIP) and bio-radiolocation (BRL) records were collected simultaneously in a sleep laboratory. Polysomnography data from 5 subjects without sleep breathing disorders were used. A multi-frequency bioradar with step frequency modulation was applied. It has 8 operating frequencies ranging from 3.6 to 4.0 GHz. BRL data are recorded in two quadratures. Respiratory cycles were detected in time domain. Obtained data was used for the evaluation of correlation between BRL and RIP respiration rate estimates. Strong correlation between corresponding time series was revealed. BRL method is reliably implemented for estimation of respiratory rhythm and respiratory rate variability during full night sleep.
Evaluating the reliability of an injury prevention screening tool: Test-retest study.
Gittelman, Michael A; Kincaid, Madeline; Denny, Sarah; Wervey Arnold, Melissa; FitzGerald, Michael; Carle, Adam C; Mara, Constance A
2016-10-01
A standardized injury prevention (IP) screening tool can identify family risks and allow pediatricians to address behaviors. To assess behavior changes on later screens, the tool must be reliable for an individual and ideally between household members. Little research has examined the reliability of safety screening tool questions. This study utilized test-retest reliability of parent responses on an existing IP questionnaire and also compared responses between household parents. Investigators recruited parents of children 0 to 1 year of age during admission to a tertiary care children's hospital. When both parents were present, one was chosen as the "primary" respondent. Primary respondents completed the 30-question IP screening tool after consent, and they were re-screened approximately 4 hours later to test individual reliability. The "second" parent, when present, only completed the tool once. All participants received a 10-dollar gift card. Cohen's Kappa was used to estimate test-retest reliability and inter-rater agreement. Standard test-retest criteria consider Kappa values: 0.0 to 0.40 poor to fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1.00 as almost perfect reliability. One hundred five families participated, with five lost to follow-up. Thirty-two (30.5%) parent dyads completed the tool. Primary respondents were generally mothers (88%) and Caucasian (72%). Test-retest of the primary respondents showed their responses to be almost perfect; average 0.82 (SD = 0.13, range 0.49-1.00). Seventeen questions had almost perfect test-retest reliability and 11 had substantial reliability. However, inter-rater agreement between household members for 12 objective questions showed little agreement between responses; inter-rater agreement averaged 0.35 (SD = 0.34, range -0.19-1.00). One question had almost perfect inter-rater agreement and two had substantial inter-rater agreement. 
The IP screening tool used by a single individual had excellent test-retest reliability for nearly all questions. However, when a reporter changes from pre- to postintervention, differences may reflect poor reliability or different subjective experiences rather than true change.
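Cohen's kappa and the interpretation bands quoted in the abstract can be sketched as follows. The function names and the example answers are hypothetical, and treating negative kappa values as "poor to fair" is an assumption of this sketch, since the quoted bands start at 0.0.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length lists of categorical responses."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    freq_first, freq_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement from the product of the two marginal proportions.
    expected = sum(freq_first[c] / n * freq_second[c] / n for c in categories)
    if expected == 1.0:
        return 1.0  # both lists use one identical category; kappa is degenerate
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Bands as quoted in the abstract; values below 0 are lumped in (assumption)."""
    if kappa <= 0.40:
        return "poor to fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical test and retest answers to one yes/no screening question.
test_answers = ["y", "y", "n", "n"]
retest_answers = ["y", "y", "n", "y"]
kappa = cohen_kappa(test_answers, retest_answers)
print(kappa, interpret_kappa(kappa))  # 0.5 moderate
```

Applied to the averages reported above, the same bands give "almost perfect" for 0.82 and "poor to fair" for 0.35.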
Inter-Observer Reliability of DSM-5 Substance Use Disorders*
Denis, Cécile M.; Gelernter, Joel; Hart, Amy B.; Kranzler, Henry R.
2015-01-01
Aims Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence of the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Methods Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Results Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. Conclusions For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. PMID:26048641
Salido-Vallejo, R; Ruano, J; Garnacho-Saucedo, G; Godoy-Gijón, E; Llorca, D; Gómez-Fernández, C; Moreno-Giménez, J C
2014-12-01
Tuberous sclerosis complex (TSC) is an autosomal dominant neurocutaneous disorder characterized by the development of multisystem hamartomatous tumours. Topical sirolimus has recently been suggested as a potential treatment for TSC-associated facial angiofibroma (FA). We aimed to validate a reproducible scale created for assessing clinical severity and treatment response in these patients. We developed a new tool, the Facial Angiofibroma Severity Index (FASI), to evaluate the grade of erythema and the size and extent of FAs. In total, 30 different photographs of patients with TSC were shown to 56 dermatologists at each evaluation. Three evaluations using the same photographs but in a different random order were performed 1 week apart. Test and retest reliability and interobserver reproducibility were determined. There was good agreement between the investigators. Inter-rater reliability showed strong correlations (> 0.98; range 0.97-0.99) with inter-rater correlation coefficients (ICCs) for the FASI. The global estimated kappa coefficient for the degree of intra-rater agreement (test-retest) was 0.94 (range 0.91-0.97). The FASI is a valid and reliable tool for measuring the clinical severity of TSC-associated FAs, which can be applied in clinical practice to evaluate the response to treatment in these patients. © 2014 British Association of Dermatologists.
Parts and Components Reliability Assessment: A Cost Effective Approach
NASA Technical Reports Server (NTRS)
Lee, Lydia
2009-01-01
System reliability assessment is a methodology which incorporates reliability analyses performed at parts and components level, such as Reliability Prediction, Failure Modes and Effects Analysis (FMEA) and Fault Tree Analysis (FTA), to assess risks, perform design tradeoffs, and therefore ensure effective productivity and/or mission success. The system reliability is used to optimize the product design to accommodate today's mandated budget, manpower, and schedule constraints. Standard-based reliability assessment is an effective approach consisting of reliability predictions together with other reliability analyses for electronic, electrical, and electro-mechanical (EEE) complex parts and components of large systems based on failure rate estimates published by the United States (U.S.) military or commercial standards and handbooks. Many of these standards are globally accepted and recognized. The reliability assessment is especially useful during the initial stages, when the system design is still in development and hard failure data is not yet available or manufacturers are not contractually obliged by their customers to publish the reliability estimates/predictions for their parts and components. This paper presents a methodology to assess system reliability using parts and components reliability estimates to ensure effective productivity and/or mission success efficiently, at low cost, and on a tight schedule.
Wimberley, Catriona J; Fischer, Kristina; Reilhac, Anthonin; Pichler, Bernd J; Gregoire, Marie Claude
2014-10-01
The partial saturation approach (PSA) is a simple, single injection experimental protocol that will estimate both B(avail) and appK(D) without the use of blood sampling. This makes it ideal for use in longitudinal studies of neurodegenerative diseases in the rodent. The aim of this study was to increase the range and applicability of the PSA by developing a data driven strategy for determining reliable regional estimates of receptor density (B(avail)) and in vivo affinity (1/appK(D)), and validate the strategy using a simulation model. The data driven method uses a time window guided by the dynamic equilibrium state of the system as opposed to using a static time window. To test the method, simulations of partial saturation experiments were generated and validated against experimental data. The experimental conditions simulated included a range of receptor occupancy levels and three different B(avail) and appK(D) values to mimic disease states. The effect of using a reference region and typical PET noise on the stability and accuracy of the estimates was also investigated. The investigations showed that the parameter estimates in a simulated healthy mouse, using the data driven method were within 10±30% of the simulated input for the range of occupancy levels simulated. Throughout all experimental conditions simulated, the accuracy and robustness of the estimates using the data driven method were much improved upon the typical method of using a static time window, especially at low receptor occupancy levels. Introducing a reference region caused a bias of approximately 10% over the range of occupancy levels. Based on extensive simulated experimental conditions, it was shown the data driven method provides accurate and precise estimates of B(avail) and appK(D) for a broader range of conditions compared to the original method. Copyright © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Kuschenerus, Mieke; Cullen, Robert
2016-08-01
To ensure reliability and precision of wave height estimates for future satellite altimetry missions such as Sentinel 6, reliable parameter retrieval algorithms that can extract significant wave heights up to 20 m have to be established. The retrieved parameters, i.e. the retrieval methods, need to be validated extensively on a wide range of possible significant wave heights. Although current missions require wave height retrievals up to 20 m, there is little evidence of systematic validation of parameter retrieval methods for sea states with wave heights above 10 m. This paper provides a definition of a set of simulated sea states with significant wave height up to 20 m that allow simulation of radar altimeter response echoes for extreme sea states in SAR and low resolution mode. The simulated radar responses are used to derive significant wave height estimates, which can be compared with the initial models, allowing precision estimations of the applied parameter retrieval methods. Thus we establish a validation method for significant wave height retrieval for sea states causing high significant wave heights, to allow improved understanding and planning of future satellite altimetry mission validation.
Almeida, Gustavo J; Irrgang, James J; Fitzgerald, G Kelley; Jakicic, John M; Piva, Sara R
2016-06-01
Few instruments that measure physical activity (PA) can accurately quantify PA performed at light and moderate intensities, which is particularly relevant in older adults. The evidence of their reliability in free-living conditions is limited. The study objectives were: (1) to determine the test-retest reliability of the Actigraph (ACT), SenseWear Armband (SWA), and Community Healthy Activities Model Program for Seniors (CHAMPS) questionnaire in assessing free-living PA at light and moderate intensities in people after total knee arthroplasty; (2) to compare the reliability of the 3 instruments relative to each other; and (3) to determine the reliability of commonly used monitoring time frames (24 hours, waking hours, and 10 hours from awakening). A one-group, repeated-measures design was used. Participants wore the activity monitors for 2 weeks, and the CHAMPS questionnaire was completed at the end of each week. Test-retest reliability was determined by using the intraclass correlation coefficient (ICC [2,k]) to compare PA measures from one week with those from the other week. Data from 28 participants who reported similar PA during the 2 weeks were included in the analysis. The mean age of these participants was 69 years (SD=8), and 75% of them were women. Reliability ranged from moderate to excellent for the ACT (ICC=.75-.86) and was excellent for the SWA (ICC=.93-.95) and the CHAMPS questionnaire (ICC=.86-.92). The 95% confidence intervals (95% CI) of the ICCs from the SWA were the only ones within the excellent reliability range (.85-.98). The CHAMPS questionnaire showed systematic bias, with less PA being reported in week 2. The reliability of PA measures in the waking-hour time frame was comparable to that in the 24-hour time frame and reflected most PA performed during this period. Reliability may be lower for time intervals longer than 1 week. All PA measures showed good reliability. 
The reliability of the ACT was lower than those of the SWA and the CHAMPS questionnaire. The SWA provided more precise reliability estimates. Wearing PA monitors during waking hours provided sufficiently reliable measures and can reduce the burden on people wearing them. © 2016 American Physical Therapy Association.
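The ICC(2,k) named above is, in the Shrout and Fleiss scheme, the two-way random-effects, absolute-agreement coefficient for the average of k measurements. A minimal sketch of its computation from the two-way ANOVA mean squares is given below; the function name and the toy activity scores are assumptions, not study data.

```python
def icc_2k(scores):
    """ICC(2,k) for a table with one row per participant and one column per week.

    ICC(2,k) = (MSR - MSE) / (MSR + (MSC - MSE) / n), where MSR, MSC and MSE are
    the between-participant, between-week and residual mean squares.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)


# Toy data: week 2 is a constant 1 unit higher than week 1 for every participant.
shifted = [[1, 2], [2, 3], [3, 4]]
identical = [[1, 1], [2, 2], [3, 3]]
print(round(icc_2k(shifted), 3), round(icc_2k(identical), 3))  # 0.8 1.0
```

Because this form measures absolute agreement, a uniform shift between weeks lowers the coefficient even when the ranking of participants is unchanged, which is relevant to the systematic week-2 bias reported for the CHAMPS questionnaire.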
The reliability of the Glasgow Coma Scale: a systematic review.
Reith, Florence C M; Van den Brande, Ruben; Synnot, Anneliese; Gruen, Russell; Maas, Andrew I R
2016-01-01
The Glasgow Coma Scale (GCS) provides a structured method for assessment of the level of consciousness. Its derived sum score is applied in research and adopted in intensive care unit scoring systems. Controversy exists on the reliability of the GCS. The aim of this systematic review was to summarize evidence on the reliability of the GCS. A literature search was undertaken in MEDLINE, EMBASE and CINAHL. Observational studies that assessed the reliability of the GCS, expressed by a statistical measure, were included. Methodological quality was evaluated with the consensus-based standards for the selection of health measurement instruments checklist and its influence on results considered. Reliability estimates were synthesized narratively. We identified 52 relevant studies that showed significant heterogeneity in the type of reliability estimates used, patients studied, setting and characteristics of observers. Methodological quality was good (n = 7), fair (n = 18) or poor (n = 27). In good quality studies, kappa values were ≥0.6 in 85%, and all intraclass correlation coefficients indicated excellent reliability. Poor quality studies showed lower reliability estimates. Reliability for the GCS components was higher than for the sum score. Factors that may influence reliability include education and training, the level of consciousness and type of stimuli used. Only 13% of studies were of good quality and inconsistency in reported reliability estimates was found. Although the reliability was adequate in good quality studies, further improvement is desirable. From a methodological perspective, the quality of reliability studies needs to be improved. From a clinical perspective, a renewed focus on training/education and standardization of assessment is required.
Test Assembly Implications for Providing Reliable and Valid Subscores
ERIC Educational Resources Information Center
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J.
2017-01-01
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Compound estimation procedures in reliability
NASA Technical Reports Server (NTRS)
Barnes, Ron
1990-01-01
At NASA, components and subsystems of components in the Space Shuttle and Space Station generally go through a number of redesign stages. While data on failures for various design stages are sometimes available, the classical procedures for evaluating reliability only utilize the failure data on the present design stage of the component or subsystem. Often, few or no failures have been recorded on the present design stage. Previously, Bayesian estimators for the reliability of a single component, conditioned on the failure data for the present design, were developed. These new estimators permit NASA to evaluate the reliability, even when few or no failures have been recorded. Point estimates for the latter evaluation were not possible with the classical procedures. Since different design stages of a component (or subsystem) generally have a good deal in common, the development of new statistical procedures for evaluating the reliability, which consider the entire failure record for all design stages, has great intuitive appeal. A typical subsystem consists of a number of different components and each component has evolved through a number of redesign stages. The present investigations considered compound estimation procedures and related models. Such models permit the statistical consideration of all design stages of each component and thus incorporate all the available failure data to obtain estimates for the reliability of the present version of the component (or subsystem). A number of models were considered to estimate the reliability of a component conditioned on its total failure history from two design stages. It was determined that reliability estimators for the present design stage, conditioned on the complete failure history for two design stages have lower risk than the corresponding estimators conditioned only on the most recent design failure data. 
Several models were explored and preliminary models involving bivariate Poisson distribution and the Consael Process (a bivariate Poisson process) were developed. Possible shortcomings of the models are noted. An example is given to illustrate the procedures. These investigations are ongoing with the aim of developing estimators that extend to components (and subsystems) with three or more design stages.
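To illustrate why borrowing failure history from an earlier design stage yields a usable point estimate even with zero failures on the present design, the sketch below uses a simple Gamma-Poisson conjugate update. This is a stand-in chosen for brevity, not the bivariate Poisson or Consael process models of the report; the discount weight, the failure counts and the exposure hours are all hypothetical.

```python
def posterior_gamma(prior_shape, prior_rate, failures, exposure):
    """Conjugate update for a Poisson failure rate with a Gamma(shape, rate) prior."""
    return prior_shape + failures, prior_rate + exposure


def expected_reliability(shape, rate, mission_time):
    """E[exp(-lambda * t)] when lambda ~ Gamma(shape, rate)."""
    return (rate / (rate + mission_time)) ** shape


# Hypothetical earlier design stage: 2 failures in 500 h, discounted by 0.5
# because the redesign is only partly comparable (the weight is an assumption).
weight = 0.5
prior_shape, prior_rate = weight * 2, weight * 500.0

# Hypothetical present design stage: no failures in 250 h of operation.
shape, rate = posterior_gamma(prior_shape, prior_rate, failures=0, exposure=250.0)
reliability_100h = expected_reliability(shape, rate, mission_time=100.0)
print(shape, rate, round(reliability_100h, 4))  # 1.0 500.0 0.8333
```

A classical maximum-likelihood rate estimate from the present stage alone would be zero here, which is the situation the abstract describes as yielding no usable point estimate.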
Reliability and Validity Assessment of a Linear Position Transducer
Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.
2015-01-01
The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300
Lin, Huan-Ting; Okumura, Takashi; Yatsuda, Yukinori; Ito, Satoru; Nakauchi, Hiromitsu; Otsu, Makoto
2016-01-01
Stable gene transfer into target cell populations via integrating viral vectors is widely used in stem cell gene therapy (SCGT). Accurate vector copy number (VCN) estimation has become increasingly important. However, existing methods of estimation such as real-time quantitative PCR are more restricted in practicality, especially during clinical trials, given the limited availability of sample materials from patients. This study demonstrates the application of an emerging technology called droplet digital PCR (ddPCR) in estimating VCN states in the context of SCGT. Induced pluripotent stem cells (iPSCs) derived from a patient with X-linked chronic granulomatous disease were used as clonable target cells for transduction with alpharetroviral vectors harboring codon-optimized CYBB cDNA. Precise primer–probe design followed by multiplex analysis conferred assay specificity. Accurate estimation of per-cell VCN values was possible without reliance on a reference standard curve. Sensitivity was high and the dynamic range of detection was wide. Assay reliability was validated by observation of consistent, reproducible, and distinct VCN clustering patterns for clones of transduced iPSCs with varying numbers of transgene copies. Taken together, use of ddPCR appears to offer a practical and robust approach to VCN estimation with a wide range of clinical and research applications. PMID:27763786
A New Estimate of North American Mountain Snow Accumulation From Regional Climate Model Simulations
NASA Astrophysics Data System (ADS)
Wrzesien, Melissa L.; Durand, Michael T.; Pavelsky, Tamlin M.; Kapnick, Sarah B.; Zhang, Yu; Guo, Junyi; Shum, C. K.
2018-02-01
Despite the importance of mountain snowpack to understanding the water and energy cycles in North America's montane regions, no reliable mountain snow climatology exists for the entire continent. We present a new estimate of mountain snow water equivalent (SWE) for North America from regional climate model simulations. Climatological peak SWE in North American mountains is 1,006 km³, 2.94 times larger than previous estimates from reanalyses. By combining this mountain SWE value with the best available global product in nonmountain areas, we estimate peak North American SWE of 1,684 km³, 55% greater than previous estimates. In our simulations, the date of maximum SWE varies widely by mountain range, from early March to mid-April. Though mountains comprise 24% of the continent's land area, we estimate that they contain 60% of North American SWE. This new estimate is a suitable benchmark for continental- and global-scale water and energy budget studies.
Halford, K.J.; Mayer, G.C.
2000-01-01
Ground water discharge and recharge frequently have been estimated with hydrograph-separation techniques, but the critical assumptions of the techniques have not been investigated. The critical assumptions are that the hydraulic characteristics of the contributing aquifer (recession index) can be estimated from stream-discharge records; that periods of exclusively ground water discharge can be reliably identified; and that stream-discharge peaks approximate the magnitude and timing of recharge events. The first assumption was tested by estimating the recession index from stream-discharge hydrographs, ground water hydrographs, and hydraulic diffusivity estimates from aquifer tests in basins throughout the eastern United States and Montana. The recession index frequently could not be estimated reliably from stream-discharge records alone because many of the estimates of the recession index were greater than 1000 days. The ratio of stream discharge during baseflow periods was two to 36 times greater than the maximum expected range of ground water discharge at 12 of the 13 field sites. The identification of the ground water component of stream-discharge records was ambiguous because drainage from bank-storage, wetlands, surface water bodies, soils, and snowpacks frequently exceeded ground water discharge and also decreased exponentially during recession periods. The timing and magnitude of recharge events could not be ascertained from stream-discharge records at any of the sites investigated because recharge events were not directly correlated with stream peaks. When used alone, the recession-curve-displacement method and other hydrograph-separation techniques are poor tools for estimating ground water discharge or recharge because the major assumptions of the methods are commonly and grossly violated. Multiple, alternative methods of estimating ground water discharge and recharge should be used because of the uncertainty associated with any one technique.
Artificial Intelligence Estimation of Carotid-Femoral Pulse Wave Velocity using Carotid Waveform.
Tavallali, Peyman; Razavi, Marianne; Pahlevan, Niema M
2018-01-17
In this article, we offer an artificial intelligence method to estimate the carotid-femoral Pulse Wave Velocity (PWV) non-invasively from one uncalibrated carotid waveform measured by tonometry and a few routine clinical variables. Since the signal processing inputs to this machine learning algorithm are sensor agnostic, the presented method can accompany any medical instrument that provides a calibrated or uncalibrated carotid pressure waveform. Our results show that, for an unseen held-back test set population in the age range of 20 to 69, our model can estimate PWV with a Root-Mean-Square Error (RMSE) of 1.12 m/sec compared to the reference method. The results indicate that this model is a reliable surrogate of PWV. Our study also showed that estimated PWV was significantly associated with an increased risk of CVDs.
Test Reliability at the Individual Level
Hu, Yueqin; Nesselroade, John R.; Erbacher, Monica K.; Boker, Steven M.; Burt, S. Alexandra; Keel, Pamela K.; Neale, Michael C.; Sisk, Cheryl L.; Klump, Kelly
2016-01-01
Reliability has a long history as one of the key psychometric properties of a test. However, a given test might not measure people equally reliably. Test scores from some individuals may have considerably greater error than others. This study proposed two approaches using intraindividual variation to estimate test reliability for each person. A simulation study suggested that the parallel tests approach and the structural equation modeling approach recovered the simulated reliability coefficients. Then in an empirical study, where forty-five females were measured daily on the Positive and Negative Affect Schedule (PANAS) for 45 consecutive days, separate estimates of reliability were generated for each person. Results showed that reliability estimates of the PANAS varied substantially from person to person. The methods provided in this article apply to tests measuring changeable attributes and require repeated measures across time on each individual. This article also provides a set of parallel forms of PANAS. PMID:28936107
Wang-Hsu, Elizabeth; Smith, Susan S
2017-01-10
Falls are a common cause of injuries and hospital admissions in older adults. Balance limitation is a potentially modifiable factor contributing to falls. The Balance Evaluation Systems Test (BESTest), a clinical balance measure, categorizes balance into 6 underlying subsystems. Each of the subsystems is scored individually and summed to obtain a total score. The reliability of the BESTest and its individual subsystems has been reported in patients with various neurological disorders and cancer survivors. However, the reliability and minimal detectable change (MDC) of the BESTest with community-dwelling older adults have not been reported. The purposes of our study were to (1) determine the interrater and test-retest reliability of the BESTest total and subsystem scores; and (2) estimate the MDC of the BESTest and its individual subsystem scores with community-dwelling older adults. We used a prospective cohort methodological design. Community-dwelling older adults (N = 70; aged 70-94 years; mean = 85.0 [5.5] years) were recruited from a senior independent living community. Trained testers (N = 3) administered the BESTest. All participants were tested with the BESTest by the same tester initially and then retested 7 to 14 days later. With 32 of the participants, a second tester concurrently scored the retest for interrater reliability. Testers were blinded to each other's scores. Intraclass correlation coefficients [ICC(2,1)] were used to determine the interrater and test-retest reliability. Test-retest reliability was also analyzed using method error and the associated coefficients of variation (CVME). MDC was calculated using standard error of measurement. Interrater reliability (N = 32) of the BESTest total score was ICC(2, 1) = 0.97 (95% confidence interval [CI], 0.94-0.99). The ICCs for the individual subsystem scores ranged from 0.85 to 0.94. Test-retest reliability (N = 70) of the BESTest total score was ICC(2,1) = 0.93 (95% CI, 0.89-0.96). 
ICCs for the individual subsystem scores ranged from 0.72 to 0.89. The CVME (N = 70) of the BESTest total score was 4.1%. The CVME for the subsystem scores ranged from 5.0% to 10.7%. MDC (N = 70) for the BESTest total score at the 95% CI was 7.6%, or 8.2 points. MDC at the 95% CI for subsystem scores ranged from 11.7% to 19.0% (2.1-3.4 points). Results demonstrated generally good to excellent interrater and test-retest reliability in both the BESTest total and subsystem scores with community-dwelling older adults. The BESTest total and individual subsystem scores demonstrate good to excellent interrater and test-retest reliability with community-dwelling older adults. A change of 7.6% (8.2 points) or more in the BESTest total score and a change ranging from 11.7% to 19.0% (2.1-3.4 points) in the subsystem scores are suggested for clinicians to be 95% confident of true change when evaluating change in this population.
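The relationship between reliability, standard error of measurement (SEM), and minimal detectable change can be illustrated with a short sketch. It assumes the conventional formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the abstract does not state which formulas or which standard deviation the authors used, so the SD below is a hypothetical value chosen only for illustration.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd: float, icc: float) -> float:
    """Minimal detectable change at 95% confidence (conventional formula)."""
    return 1.96 * math.sqrt(2.0) * sem(sd, icc)


if __name__ == "__main__":
    icc_test_retest = 0.93   # test-retest ICC(2,1) reported in the abstract
    hypothetical_sd = 11.0   # ASSUMPTION: not reported in the abstract
    print(f"SEM   = {sem(hypothetical_sd, icc_test_retest):.2f} points")
    print(f"MDC95 = {mdc95(hypothetical_sd, icc_test_retest):.2f} points")
```

Under these assumptions a lower ICC yields a larger MDC for the same SD, which is consistent with the subsystem scores (lower ICCs) showing larger percentage MDCs than the total score.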
Estimation of the oxalate content of foods and daily oxalate intake
NASA Technical Reports Server (NTRS)
Holmes, R. P.; Kennedy, M.
2000-01-01
BACKGROUND: The amount of oxalate ingested may be an important risk factor in the development of idiopathic calcium oxalate nephrolithiasis. Reliable food tables listing the oxalate content of foods are currently not available. The aim of this research was to develop an accurate and reliable method to measure the food content of oxalate. METHODS: Capillary electrophoresis (CE) and ion chromatography (IC) were compared as direct techniques for the estimation of the oxalate content of foods. Foods were thoroughly homogenized in acid, heat extracted, and clarified by centrifugation and filtration before dilution in water for analysis. Five individuals consuming self-selected diets maintained food records for three days to determine their mean daily oxalate intakes. RESULTS: Both techniques were capable of adequately measuring the oxalate in foods with a significant oxalate content. With foods of very low oxalate content (<1.8 mg/100 g), IC was more reliable than CE. The mean daily intake of oxalate by the five individuals tested was 152 +/- 83 mg, ranging from 44 to 352 mg/day. CONCLUSIONS: CE appears to be the method of choice over IC for estimating the oxalate content of foods with a medium (>10 mg/100 g) to high oxalate content due to a faster analysis time and lower running costs, whereas IC may be better suited for the analysis of foods with a low oxalate content. Accurate estimates of the oxalate content of foods should permit the role of dietary oxalate in urinary oxalate excretion and stone formation to be clarified. Other factors, apart from the amount of oxalate ingested, appear to exert a major influence over the amount of oxalate excreted in the urine.
Macionis, Valdas
2013-01-09
Diagrammatic recording of finger joint angles by using two criss-crossed paper strips can be a quick substitute to the standard goniometry. As a preliminary step toward clinical validation of the diagrammatic technique, the current study employed healthy subjects and non-professional raters to explore whether reliability estimates of the diagrammatic goniometry are comparable with those of the standard procedure. The study included two procedurally different parts, which were replicated by assigning 24 medical students to act interchangeably as 12 subjects and 12 raters. A larger component of the study was designed to compare goniometers side-by-side in measurement of finger joint angles varying from subject to subject. In the rest of the study, the instruments were compared by parallel evaluations of joint angles similar for all subjects in a situation of simulated change of joint range of motion over time. The subjects used special guides to position the joints of their left ring finger at varying angles of flexion and extension. The obtained diagrams of joint angles were converted to numerical values by computerized measurements. The statistical approaches included calculation of appropriate intraclass correlation coefficients, standard errors of measurements, proportions of measurement differences of 5 or less degrees, and significant differences between paired observations. Reliability estimates were similar for both goniometers. Intra-rater and inter-rater intraclass correlation coefficients ranged from 0.69 to 0.93. The corresponding standard errors of measurements ranged from 2.4 to 4.9 degrees. Repeated measurements of a considerable number of raters fell within clinically non-meaningful 5 degrees of each other in proportions comparable with a criterion value of 0.95. Data collected with both instruments could be similarly interpreted in a simulated situation of change of joint range of motion over time. 
The paper goniometer and the standard goniometer can be used interchangeably by non-professional raters for evaluation of normal finger joints. The obtained results warrant further research to assess clinical performance of the paper strip technique.
2013-01-01
Background Diagrammatic recording of finger joint angles by using two criss-crossed paper strips can be a quick substitute to the standard goniometry. As a preliminary step toward clinical validation of the diagrammatic technique, the current study employed healthy subjects and non-professional raters to explore whether reliability estimates of the diagrammatic goniometry are comparable with those of the standard procedure. Methods The study included two procedurally different parts, which were replicated by assigning 24 medical students to act interchangeably as 12 subjects and 12 raters. A larger component of the study was designed to compare goniometers side-by-side in measurement of finger joint angles varying from subject to subject. In the rest of the study, the instruments were compared by parallel evaluations of joint angles similar for all subjects in a situation of simulated change of joint range of motion over time. The subjects used special guides to position the joints of their left ring finger at varying angles of flexion and extension. The obtained diagrams of joint angles were converted to numerical values by computerized measurements. The statistical approaches included calculation of appropriate intraclass correlation coefficients, standard errors of measurements, proportions of measurement differences of 5 or less degrees, and significant differences between paired observations. Results Reliability estimates were similar for both goniometers. Intra-rater and inter-rater intraclass correlation coefficients ranged from 0.69 to 0.93. The corresponding standard errors of measurements ranged from 2.4 to 4.9 degrees. Repeated measurements of a considerable number of raters fell within clinically non-meaningful 5 degrees of each other in proportions comparable with a criterion value of 0.95. Data collected with both instruments could be similarly interpreted in a simulated situation of change of joint range of motion over time. 
Conclusions The paper goniometer and the standard goniometer can be used interchangeably by non-professional raters for evaluation of normal finger joints. The obtained results warrant further research to assess clinical performance of the paper strip technique. PMID:23302419
Seo, Joohyun; Pietrangelo, Sabino J; Sodini, Charles G; Lee, Hae-Seung
2018-05-01
This paper details unfocused imaging using single-element ultrasound transducers for motion tolerant arterial blood pressure (ABP) waveform estimation. The ABP waveform is estimated based on pulse wave velocity and arterial pulsation through Doppler and M-mode ultrasound. This paper discusses approaches to mitigate the effect of increased clutter due to unfocused imaging on blood flow and diameter waveform estimation. An intensity reduction model (IRM) estimator is described to track the change of diameter, which outperforms a complex cross-correlation model (C3M) estimator in low contrast environments. An adaptive clutter filtering approach is also presented, which reduces the increased Doppler angle estimation error due to unfocused imaging. Experimental results in a flow phantom demonstrate that flow velocity and diameter waveforms can be reliably measured with wide lateral offsets of the transducer position. The distension waveform estimated from human carotid M-mode imaging using the IRM estimator shows physiological baseline fluctuations and 0.6-mm pulsatile diameter change on average, which is within the expected physiological range. These results show the feasibility of this low cost and portable ABP waveform estimation device.
Coal resources of the Sonda coal field, Sindh Province, Pakistan
Thomas, R.E.; Riaz, Khan M.; Ahmed, Khan S.
1993-01-01
Approximately 4.7 billion t of original coal resources, ranging from lignite A to subbituminous C in rank, are estimated to be present in the Sonda coal field. These resources occur in 10 coal zones in the Bara Formation of Paleocene age. The Bara Formation does not crop out in the area covered by this report. Thin discontinuous coal beds also occur in the Sonhari Member of the Laki Formation, of Paleocene and Eocene age, but they are unimportant as a resource of the Sonda coal field. The coal resource assessment was based on 56 exploratory drill holes that were completed in the Sonda field between April 1986 and February 1988. The Sonda coal field is split into two, roughly equal, areas by the southwestward flowing Indus River, a major barrier to the logistics of communications between the two halves. As a result the two halves, called the Sonda East and Sonda West areas, were evaluated at different times by slightly different techniques; but, because the geology is consistent between the two areas, the results of both evaluations have been summarized in this report. The resource estimates for the Sonda East area, approximately 1,700 million t, were based on the thickest coal bed in each zone at each drill hole. This method gives a conservative estimate of the total amount of coal in the Sonda East area. The resource estimates for the Sonda West area, approximately 3,000 million t, were based on cumulative coal bed thicknesses within each coal zone, resulting in a more liberal estimate. In both cases, minimum parameters for qualifying coal were a thickness of 30 cm or greater and no more than 50% ash; partings thicker than 1 cm were excluded. The three most important coal zones in the Sonda field are the Inayatabad, the Middle Sonda and the Lower Sonda. Together, these three coal zones contain 50% of the total resources. 
Isopachs were constructed for the thickest coal beds in these three coal zones and indicate large variations in thickness over relatively small distances. Coal beds in the Sonda coal field were difficult to correlate because of poor core recovery in some intervals and abrupt lateral thinning and thickening. Most coal zones are separated by 5-10 m of interburden, although in some places the interburden between zones is over 100 m thick. More closely spaced drill holes should clarify and significantly improve coal zone correlations in the Bara Formation. Coal resources in the Sonda coal field were calculated for three reliability categories: measured, indicated, and inferred. The most reliable estimates are those for the measured category. Measured coal resources are approximately 91 million t, or about 2% of the total resource; indicated resources are 681 million t, or about 14% of the total; and inferred resources, the least reliable resource category, are 3,931 million t, or 84% of the total resources. The distribution of resources by reliability category is due to the relatively wide spacing (approximately 5 km) between core holes. Analyses of 90 coal samples, on an as-received basis, indicate average ash and sulfur contents of 13.7% and 3.6%, respectively, and a range in rank from lignite A to subbituminous C. Calorific values for these samples range from 6,000 to 8,000 Btu/lb (1 Btu = 1,055 J; 1 lb = 0.4536 kg). © 1993.
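The shares quoted for the three reliability categories can be checked with simple arithmetic. The sketch below uses only the tonnages given in the abstract; the variable names are illustrative and not taken from the report.

```python
# Coal resources by reliability category, in million tonnes (from the abstract).
resources_mt = {
    "measured": 91,
    "indicated": 681,
    "inferred": 3931,
}

total_mt = sum(resources_mt.values())

# Share of each category as a percentage of the summed total.
shares_pct = {name: 100.0 * mt / total_mt for name, mt in resources_mt.items()}

if __name__ == "__main__":
    print(f"total: {total_mt} million t")
    for name, pct in shares_pct.items():
        print(f"{name:>9}: {pct:.1f}%")
```

The three categories sum to 4,703 million t, which matches the "approximately 4.7 billion t" stated at the start of the abstract, and the rounded shares agree with the reported 2%, 14%, and 84%.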
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?
Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A.
2017-01-01
Abstract Study Objectives: To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test–retest reliability of sleep diary estimates of school night sleep across 12 weeks. Methods: Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test–retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Results: Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights were sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test–retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. Conclusion: We recommend that at least five weekday nights of sleep diary entries be made when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. PMID:28199718
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?
Short, Michelle A; Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A
2017-03-01
To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test-retest reliability of sleep diary estimates of school night sleep across 12 weeks. Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test-retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights were sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test-retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. We recommend that at least five weekday nights of sleep diary entries be made when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved. For permissions, please e-mail journals.permissions@oup.com.
García Bengoechea, Enrique; Sabiston, Catherine M; Wilson, Philip M
2017-01-01
The aim of this study was to provide initial evidence of validity and reliability of scores derived from the Activity Context in Youth Sport Questionnaire (ACYSQ), an instrument designed to offer a comprehensive assessment of the activities adolescents take part in during sport practices. Two studies were designed for the purposes of item development and selection, and to provide evidence of structural and criterion validity of ACYSQ scores, respectively (N = 334; M age = 14.93, SD = 1.76 years). Confirmatory factor analysis (CFA) supported the adequacy of a 20-item ACYSQ measurement model, which was invariant across gender, and comprised the following dimensions: (1) stimulation; (2) usefulness-value; (3) authenticity; (4) repetition-boredom; and (5) ineffectiveness. Internal consistency reliability estimates and composite reliability estimates for ACYSQ subscale scores ranged from 0.72 to 0.91. In regression analyses, stimulation predicted enjoyment and perceived competence, ineffectiveness was significantly associated with perceived competence and authenticity emerged as a predictor of commitment in sport. These findings indicate that the ACYSQ displays adequate psychometric properties and the use of the instrument may be useful for studying selected activity-based features of the practice environment and their motivational consequences in youth sport.
A Meta-Analysis of Reliability Coefficients in Second Language Research
ERIC Educational Resources Information Center
Plonsky, Luke; Derrick, Deirdre J.
2016-01-01
Ensuring internal validity in quantitative research requires, among other conditions, reliable instrumentation. Unfortunately, however, second language (L2) researchers often fail to report and even more often fail to interpret reliability estimates beyond generic benchmarks for acceptability. As a means to guide interpretations of such estimates,…
Reliability Estimates for Undergraduate Grade Point Average
ERIC Educational Resources Information Center
Westrick, Paul A.
2017-01-01
Undergraduate grade point average (GPA) is a commonly employed measure in educational research, serving as a criterion or as a predictor depending on the research question. Over the decades, researchers have used a variety of reliability coefficients to estimate the reliability of undergraduate GPA, which suggests that there has been no consensus…
Reliability of Test Scores in Nonparametric Item Response Theory.
ERIC Educational Resources Information Center
Sijtsma, Klaas; Molenaar, Ivo W.
1987-01-01
Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)
IRT-Estimated Reliability for Tests Containing Mixed Item Formats
ERIC Educational Resources Information Center
Shu, Lianghua; Schwarz, Richard D.
2014-01-01
As a global measure of precision, item response theory (IRT) estimated reliability is derived for four coefficients (Cronbach's α, Feldt-Raju, stratified α, and marginal reliability). Models with different underlying assumptions concerning test-part similarity are discussed. A detailed computational example is presented for the targeted…
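Of the four coefficients named above, Cronbach's α is the simplest to compute directly from raw item scores. The following is a minimal sketch of the classical sample-based coefficient, not the IRT-estimated version derived in the article; the function name and the score matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Classical Cronbach's alpha.

    `items` holds one list of scores per item, each with one entry per
    examinee (all lists the same length, at least two items).
    """
    k = len(items)
    # Total score per examinee: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - sum_item_variances / variance(totals))


if __name__ == "__main__":
    # Hypothetical scores: 3 items answered by 5 examinees.
    scores = [
        [2, 4, 3, 5, 1],
        [3, 5, 3, 4, 2],
        [2, 5, 4, 5, 1],
    ]
    print(f"alpha = {cronbach_alpha(scores):.3f}")
```

Sample variances (n − 1 denominator) are used for both the items and the totals; using population variances throughout would give the same ratio.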
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2012-01-01
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Ma; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-01-01
The Yale-Brown Obsessive-Compulsive Scale for children and adolescents (CY-BOCS) is a frequently applied test to assess obsessive-compulsive symptoms. We conducted a reliability generalization meta-analysis on the CY-BOCS to estimate the average reliability, search for reliability moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the CY-BOCS scores. A total of 47 studies reporting a reliability coefficient with the data at hand were included in the meta-analysis. The results showed good reliability and a large variability associated with the standard deviation of total scores and sample size.
A Vision System For A Mars Rover
NASA Astrophysics Data System (ADS)
Wilcox, Brian H.; Gennery, Donald B.; Mishkin, Andrew H.; Cooper, Brian K.; Lawton, Teri B.; Lay, N. Keith; Katzmann, Steven P.
1987-01-01
A Mars rover must be able to sense its local environment with sufficient resolution and accuracy to avoid local obstacles and hazards while moving a significant distance each day. Power efficiency and reliability are extremely important considerations, making stereo correlation an attractive method of range sensing compared to laser scanning, if the computational load and correspondence errors can be handled. Techniques for treatment of these problems, including the use of more than two cameras to reduce correspondence errors and possibly to limit the computational burden of stereo processing, have been tested at JPL. Once a reliable range map is obtained, it must be transformed to a plan view and compared to a stored terrain database, in order to refine the estimated position of the rover and to improve the database. The slope and roughness of each terrain region are computed, which form the basis for a traversability map allowing local path planning. Ongoing research and field testing of such a system are described.
A vision system for a Mars rover
NASA Technical Reports Server (NTRS)
Wilcox, Brian H.; Gennery, Donald B.; Mishkin, Andrew H.; Cooper, Brian K.; Lawton, Teri B.; Lay, N. Keith; Katzmann, Steven P.
1988-01-01
A Mars rover must be able to sense its local environment with sufficient resolution and accuracy to avoid local obstacles and hazards while moving a significant distance each day. Power efficiency and reliability are extremely important considerations, making stereo correlation an attractive method of range sensing compared to laser scanning, if the computational load and correspondence errors can be handled. Techniques for treatment of these problems, including the use of more than two cameras to reduce correspondence errors and possibly to limit the computational burden of stereo processing, have been tested at JPL. Once a reliable range map is obtained, it must be transformed to a plan view and compared to a stored terrain database, in order to refine the estimated position of the rover and to improve the database. The slope and roughness of each terrain region are computed, which form the basis for a traversability map allowing local path planning. Ongoing research and field testing of such a system are described.
Predictors of validity and reliability of a physical activity record in adolescents
2013-01-01
Background Poor to moderate validity of self-reported physical activity instruments is commonly observed in young people in low- and middle-income countries. However, the reasons for such low validity have not been examined in detail. We tested the validity of a self-administered daily physical activity record in adolescents and assessed if personal characteristics or the convenience level of reporting physical activity modified the validity estimates. Methods The study comprised a total of 302 adolescents from an urban and rural area in Ecuador. Validity was evaluated by comparing the record with accelerometer recordings for seven consecutive days. Test-retest reliability was examined by comparing registrations from two records administered three weeks apart. Time spent on sedentary (SED), low (LPA), moderate (MPA) and vigorous (VPA) intensity physical activity was estimated. Bland Altman plots were used to evaluate measurement agreement. We assessed if age, sex, urban or rural setting, anthropometry and convenience of completing the record explained differences in validity estimates using a linear mixed model. Results Although the record provided higher estimates for SED and VPA and lower estimates for LPA and MPA compared to the accelerometer, it showed an overall fair measurement agreement for validity. There was modest reliability for assessing physical activity in each intensity level. Validity was associated with adolescents’ personal characteristics: sex (SED: P = 0.007; LPA: P = 0.001; VPA: P = 0.009) and setting (LPA: P = 0.000; MPA: P = 0.047). Reliability was associated with the convenience of completing the physical activity record for LPA (low convenience: P = 0.014; high convenience: P = 0.045). Conclusions The physical activity record provided acceptable estimates for reliability and validity on a group level. 
Sex and setting were associated with validity estimates, whereas convenience to fill out the record was associated with better reliability estimates for LPA. This tendency of improved reliability estimates for adolescents reporting higher convenience merits further consideration. PMID:24289296
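The Bland Altman agreement analysis mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `bland_altman`, the conventional 1.96 SD limits, and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev

def bland_altman(record, accelerometer):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    if len(record) != len(accelerometer) or len(record) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    # Difference between the self-reported record and the accelerometer
    diffs = [r - a for r, a in zip(record, accelerometer)]
    bias = mean(diffs)      # mean difference (systematic over/under-reporting)
    sd = stdev(diffs)       # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical sedentary minutes per day for five adolescents (not study data)
record_min = [600, 620, 580, 640, 610]
accel_min = [560, 600, 590, 600, 570]
bias, lower, upper = bland_altman(record_min, accel_min)
print(f"bias={bias:.1f} min, limits of agreement=({lower:.1f}, {upper:.1f})")
```

A positive bias here would mean the record reports more time than the accelerometer, which is the direction the abstract describes for SED and VPA.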
Estimating irrigation water use in the humid eastern United States
Levin, Sara B.; Zarriello, Phillip J.
2013-01-01
Accurate accounting of irrigation water use is an important part of the U.S. Geological Survey National Water-Use Information Program and the WaterSMART initiative to help maintain sustainable water resources in the Nation. Irrigation water use in the humid eastern United States is not well characterized because of inadequate reporting and wide variability associated with climate, soils, crops, and farming practices. To better understand irrigation water use in the eastern United States, two types of predictive models were developed and compared by using metered irrigation water-use data for corn, cotton, peanut, and soybean crops in Georgia and turf farms in Rhode Island. Reliable metered irrigation data were limited to these areas. The first predictive model that was developed uses logistic regression to predict the occurrence of irrigation on the basis of antecedent climate conditions. Logistic regression equations were developed for corn, cotton, peanut, and soybean crops by using weekly irrigation water-use data from 36 metered sites in Georgia in 2009 and 2010 and turf farms in Rhode Island from 2000 to 2004. For the weeks when irrigation was predicted to take place, the irrigation water-use volume was estimated by multiplying the average metered irrigation application rate by the irrigated acreage for a given crop. The second predictive model that was developed is a crop-water-demand model that uses a daily soil water balance to estimate the water needs of a crop on a given day based on climate, soil, and plant properties. Crop-water-demand models were developed independently of reported irrigation water-use practices and relied on knowledge of plant properties that are available in the literature. Both modeling approaches require accurate accounting of irrigated area and crop type to estimate total irrigation water use. 
Water-use estimates from both modeling methods were compared to the metered irrigation data from Rhode Island and Georgia that were used to develop the models as well as two independent validation datasets from Georgia and Virginia that were not used in model development. Irrigation water-use estimates from the logistic regression method more closely matched mean reported irrigation rates than estimates from the crop-water-demand model when compared to the irrigation data used to develop the equations. The root mean squared errors (RMSEs) for the logistic regression estimates of mean annual irrigation ranged from 0.3 to 2.0 inches (in.) for the five crop types; RMSEs for the crop-water-demand models ranged from 1.4 to 3.9 in. However, when the models were applied and compared to the independent validation datasets from southwest Georgia from 2010, and from Virginia from 1999 to 2007, the crop-water-demand model estimates were as good as or better at predicting the mean irrigation volume than the logistic regression models for most crop types. RMSEs for logistic regression estimates of mean annual irrigation ranged from 1.0 to 7.0 in. for validation data from Georgia and from 1.8 to 4.9 in. for validation data from Virginia; RMSEs for crop-water-demand model estimates ranged from 2.1 to 5.8 in. for Georgia data and from 2.0 to 3.9 in. for Virginia data. In general, regression-based models performed better in areas that had quality daily or weekly irrigation data from which the regression equations were developed; however, the regression models were less reliable than the crop-water-demand models when applied outside the area for which they were developed. In most eastern coastal states that do not have quality irrigation data, the crop-water-demand model can be used more reliably. The development of predictive models of irrigation water use in this study was hindered by a lack of quality irrigation data. 
Many mid-Atlantic and New England states do not require irrigation water use to be reported. A survey of irrigation data from 14 eastern coastal states from Maine to Georgia indicated that, with the exception of the data in Georgia, irrigation data in the states that do require reporting commonly did not contain requisite ancillary information such as irrigated area or crop type, lacked precision, or were at an aggregated temporal scale making them unsuitable for use in the development of predictive models. Confidence in the reliability of either modeling method is affected by uncertainty in the reported data from which the models were developed or validated. Only through additional collection of quality data and further study can the accuracy and uncertainty of irrigation water-use estimates be improved in the humid eastern United States.
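The first modeling approach described above (logistic regression for irrigation occurrence, then average application rate times irrigated acreage) and the RMSE comparison can be sketched roughly as below. The coefficients, the 0.5 decision threshold, the function names, and the sample numbers are hypothetical placeholders; the report's fitted equations are not reproduced here.

```python
import math

def irrigation_probability(intercept, coefs, climate):
    """Logistic model: probability that irrigation occurs in a given week."""
    z = intercept + sum(c * x for c, x in zip(coefs, climate))
    return 1.0 / (1.0 + math.exp(-z))

def weekly_volume_acre_inches(prob, mean_rate_in, acres, threshold=0.5):
    """Volume = mean metered application rate (in.) x irrigated area (acres),
    counted only for weeks in which irrigation is predicted to occur."""
    return mean_rate_in * acres if prob >= threshold else 0.0

def rmse(estimates, observations):
    """Root mean squared error between estimated and metered values."""
    if len(estimates) != len(observations) or not estimates:
        raise ValueError("need two non-empty series of equal length")
    sq = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical example: one climate predictor (e.g. a dryness index)
p = irrigation_probability(intercept=-1.0, coefs=[2.0], climate=[0.9])
volume = weekly_volume_acre_inches(p, mean_rate_in=0.8, acres=100)
print(round(p, 3), volume)
```

Both this approach and the crop-water-demand model depend on the irrigated area and crop type being known, as the abstract notes.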
Estimation of sex from the lower limb measurements of Sudanese adults.
Ahmed, Altayeb Abdalla
2013-06-10
The sex estimation from mutilated and amputated limbs or body parts is one of the most vital steps in person identification in medical-legal autopsies. Sex estimation from lower limb anthropometric measurements has demonstrated a high degree of expected accuracy in a limited range of the global population. The aims of this study were to assess the degree of the sexual dimorphism in lower limb measurements and the accuracy of utilization of these measurements for estimation of sex in a contemporary adult Sudanese population. The tibial length, bimalleolar breadth, foot length, and foot breadth of 240 right-handed Sudanese Arab subjects (120 males and 120 females) aged between 25 and 30 years were measured following international anthropometric standards. Demarking points, sexual dimorphism indices and discriminant functions were developed from 200 subjects (100 males and 100 females) who comprised the study group. All variables were sexually dimorphic. The bimalleolar breadth and foot breadth significantly contributed to sex estimation. Leg dimensions showed a higher accuracy for sex estimation than foot dimensions. Cross-validated sex classification accuracy ranged between 78% and 89.5%. The reliability of these standards was assessed in a test sample of 20 males and 20 females, and the results showed accuracy between 75% and 90%. This study provides new forensic standards for sex estimation from lower limb measurements of Sudanese adults. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
van Heesch, Peter N; Struijk, Pieter C; Laudy, Jaqueline A M; Steegers, Eric A P; Wildschut, Hajo I J
2010-05-01
To establish how different methods of estimating gestational age (GA) affect the reliability of first-trimester screening for Down syndrome. Retrospective single-center study of 100 women with a viable singleton pregnancy who had first-trimester screening. We calculated multiples of the median (MoM) for maternal-serum free beta human chorionic gonadotropin (free beta-hCG) and pregnancy associated plasma protein-A (PAPP-A), derived from either last menstrual period (LMP) or ultrasound-dating scans. In women with a regular cycle, LMP-derived estimates of GA were two days longer (range -11 to 18) than crown-rump length (CRL)-derived estimates of GA, whereas this discrepancy was more pronounced in women who reported having an irregular cycle, i.e., six days (range -7 to 32). Except for PAPP-A in the regular-cycle group, all differences were significant. Consequently, risk estimates are affected by the mode of estimating GA. In fact, LMP-based estimates revealed ten "screen-positive" cases compared to five "screen-positive" cases where GA was derived from dating scans. Provided fixed values for nuchal translucency are applied, dating scans reduce the number of screen-positive findings on the basis of biochemical screening. We recommend implementation of guidelines for Down syndrome screening based on CRL-dependent rather than LMP-dependent parameters of GA.
NASA Astrophysics Data System (ADS)
Liu, Yiming; Shi, Yimin; Bai, Xuchao; Zhan, Pei
2018-01-01
In this paper, we study estimation of the reliability of a multicomponent system, the N-M-cold-standby redundancy system, based on a progressive Type-II censored sample. The system has N subsystems consisting of M statistically independent strength components; only one of these subsystems works under the impact of stresses at a time, and the others remain as standbys. Whenever the working subsystem fails, one of the standbys takes its place. The system fails when all subsystems have failed. The underlying distributions of random strength and stress are both assumed to belong to the generalized half-logistic family, with different shape parameters. The reliability of the system is estimated using both classical and Bayesian statistical inference. The uniformly minimum variance unbiased estimator and the maximum likelihood estimator of the system reliability are derived. Under a squared error loss function, the exact expression of the Bayes estimator for the reliability of the system is developed using the Gauss hypergeometric function. The asymptotic confidence interval and corresponding coverage probabilities are derived based on both the Fisher and the observed information matrices. The approximate highest probability density credible interval is constructed using the Monte Carlo method. Monte Carlo simulations are performed to compare the performance of the proposed reliability estimators. A real data set is also analyzed to illustrate the findings.
NASA Astrophysics Data System (ADS)
Winiarek, Victor; Bocquet, Marc; Duhanyan, Nora; Roustan, Yelva; Saunier, Olivier; Mathieu, Anne
2014-01-01
Inverse modelling techniques can be used to estimate the amount of radionuclides and the temporal profile of the source term released into the atmosphere during the accident at the Fukushima Daiichi nuclear power plant in March 2011. In Winiarek et al. (2012b), the lower bounds of the caesium-137 and iodine-131 source terms were estimated with such techniques, using activity concentration measurements. The importance of an objective assessment of prior errors (the observation errors and the background errors) was emphasised for a reliable inversion. In such a critical context, where the meteorological conditions can make the source term partly unobservable and where only a few observations are available, such prior estimation techniques are mandatory, the retrieved source term being very sensitive to this estimation. We propose to extend the use of these techniques to the estimation of prior errors when assimilating observations from several data sets. The aim is to compute an estimate of the caesium-137 source term jointly using all available data about this radionuclide, such as activity concentrations in the air, but also daily fallout measurements and total cumulated fallout measurements. It is crucial to properly and simultaneously estimate the background errors and the prior errors relative to each data set. A proper estimation of prior errors is also a necessary condition to reliably estimate the a posteriori uncertainty of the estimated source term. Using such techniques, we retrieve a total released quantity of caesium-137 in the interval 11.6-19.3 PBq with an estimated standard deviation range of 15-20% depending on the method and the data sets. The “blind” time intervals of the source term have also been strongly mitigated compared to the first estimations with only activity concentration data.
Hall, Justin M; Azar, Frederick M; Miller, Robert H; Smith, Richard; Throckmorton, Thomas W
2014-09-01
We compared accuracy and reliability of a traditional method of measurement (most cephalad vertebral spinous process that can be reached by a patient with the extended thumb) to estimates made with the shoulder in abduction to determine if there were differences between the two methods. Six physicians with fellowship training in sports medicine or shoulder surgery estimated measurements in 48 healthy volunteers. Three were randomly chosen to make estimates of both internal rotation measurements for each volunteer. An independent observer made objective measurements on lateral scoliosis films (spinous process method) or with a goniometer (abduction method). Examiners were blinded to objective measurements as well as to previous estimates. Intraclass coefficients for interobserver reliability for the traditional method averaged 0.75, indicating good agreement among observers. The difference in vertebral level estimated by the examiner and the actual radiographic level averaged 1.8 levels. The intraclass coefficient for interobserver reliability for the abduction method averaged 0.81 for all examiners, indicating near-perfect agreement. Confidence intervals indicated that estimates were an average of 8° different from the objective goniometer measurements. Pearson correlation coefficients of intraobserver reliability for the abduction method averaged 0.94, indicating near-perfect agreement within observers. Confidence intervals demonstrated repeated estimates between 5° and 10° of the original. Internal rotation estimates made with the shoulder abducted demonstrated interobserver reliability superior to that of spinous process estimates, and reproducibility was high. On the basis of this finding, we now take glenohumeral internal rotation measurements with the shoulder in abduction and use a goniometer to maximize accuracy and objectivity. Copyright © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
O'Donnell, Matthew J.; Horton, Gregg E.; Letcher, Benjamin H.
2010-01-01
Portable passive integrated transponder (PIT) tag antenna systems can be valuable in providing reliable estimates of the abundance of tagged Atlantic salmon Salmo salar in small streams under a wide range of conditions. We developed and employed PIT tag antenna wand techniques in two controlled experiments and an additional case study to examine the factors that influenced our ability to estimate population size. We used Pollock's robust-design capture–mark–recapture model to obtain estimates of the probability of first detection (p), the probability of redetection (c), and abundance (N) in the two controlled experiments. First, we conducted an experiment in which tags were hidden in fixed locations. Although p and c varied among the three observers and among the three passes that each observer conducted, the estimates of N were identical to the true values and did not vary among observers. In the second experiment using free-swimming tagged fish, p and c varied among passes and time of day. Additionally, estimates of N varied between day and night and among age-classes but were within 10% of the true population size. In the case study, we used the Cormack–Jolly–Seber model to examine the variation in p, and we compared counts of tagged fish found with the antenna wand with counts collected via electrofishing. In that study, we found that although p varied for age-classes, sample dates, and time of day, antenna and electrofishing estimates of N were similar, indicating that population size can be reliably estimated via PIT tag antenna wands. However, factors such as the observer, time of day, age of fish, and stream discharge can influence the initial and subsequent detection probabilities.
Dutch population specific sex estimation formulae using the proximal femur.
Colman, K L; Janssen, M C L; Stull, K E; van Rijn, R R; Oostra, R J; de Boer, H H; van der Merwe, A E
2018-05-01
Sex estimation techniques are frequently applied in forensic anthropological analyses of unidentified human skeletal remains. While morphological sex estimation methods are able to endure population differences, the classification accuracy of metric sex estimation methods is population-specific. No metric sex estimation method currently exists for the Dutch population. The purpose of this study is to create Dutch population-specific sex estimation formulae by means of osteometric analyses of the proximal femur. Since the Netherlands lacks a representative contemporary skeletal reference population, 2D plane reconstructions, derived from clinical computed tomography (CT) data, were used as an alternative source for a representative reference sample. The first part of this study assesses the intra- and inter-observer error, or reliability, of twelve measurements of the proximal femur. The technical error of measurement (TEM) and relative TEM (%TEM) were calculated using 26 dry adult femora. In addition, the agreement, or accuracy, between the dry bone and CT-based measurements was determined by percent agreement. Only reliable and accurate measurements were retained for the logistic regression sex estimation formulae; a training set (n=86) was used to create the models while an independent testing set (n=28) was used to validate the models. Due to high levels of multicollinearity, only single-variable models were created. Cross-validated classification accuracies ranged from 86% to 92%. The high cross-validated classification accuracies indicate that the developed formulae can contribute to the biological profile, and specifically to sex estimation, of unidentified human skeletal remains in the Netherlands. Furthermore, the results indicate that clinical CT data can be a valuable alternative source of data when representative skeletal collections are unavailable. Copyright © 2017 Elsevier B.V. All rights reserved.
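The TEM and %TEM statistics named in this abstract can be illustrated with the common two-session formula, TEM = sqrt(sum of squared differences / 2n), with %TEM expressing TEM as a percentage of the grand mean. This is a sketch under that assumption; the abstract does not state which variant the authors used, and the measurements below are invented.

```python
import math

def tem(first, second):
    """Technical error of measurement for two repeated measurement sessions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    n = len(first)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * n))

def relative_tem(first, second):
    """%TEM: TEM as a percentage of the mean of all measurements."""
    grand_mean = (sum(first) + sum(second)) / (2 * len(first))
    return 100.0 * tem(first, second) / grand_mean

# Hypothetical repeated femoral head diameters in mm (not study data)
session_1 = [40.0, 42.0, 44.0]
session_2 = [41.0, 42.0, 43.0]
print(round(tem(session_1, session_2), 3), round(relative_tem(session_1, session_2), 2))
```

Lower values indicate better repeatability; the same functions apply to intra-observer error (one observer, two sessions) and inter-observer error (two observers).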
Evaluation of a word recognition instrument to test health literacy in dentistry: the REALD-99.
Richman, Julia A; Lee, Jessica Y; Rozier, R Gary; Gong, Debra A; Pahel, Bhavna T; Vann, William F
2007-01-01
This study aims to evaluate a dental health literacy word recognition instrument. Based on a reading recognition test used in medicine, the Rapid Estimate of Adult Literacy in Medicine (REALM), we developed the Rapid Estimate of Adult Literacy in Dentistry (REALD-99). Parents of pediatric dental patients were recruited from local dental clinics and asked to read aloud words in both REALM and REALD-99. REALD-99 scores had a possible range of 0 (low literacy) to 99 (high literacy); REALM scores ranged from 0 to 66. Outcome measures included parents' perceived oral health for themselves and of their children, and oral health-related quality of life of the parent as measured by the short-form Oral Health Impact Profile (OHIP-14). To determine the validity, we tested bivariate correlations between REALM and REALD-99, REALM and perceived dental outcomes, and REALD-99 and perceived dental outcomes. We used ordinary least squares regression and logit models to further examine the relationship between REALD-99 and dental outcomes. We determined internal reliability using Cronbach's alpha. One hundred two parents of children were interviewed. The average REALD-99 and REALM-66 scores were high (84 and 62, respectively). REALD-99 was positively correlated with REALM (PCC = 0.80). REALM was not related to dental outcomes. REALD-99 was associated with parents' OHIP-14 score in multivariate analysis. REALD-99 had good reliability (Cronbach's alpha = 0.86). REALD-99 has promise for measuring dental health literacy because it demonstrated good reliability and is quick and easy to administer. Additional studies are needed to examine the validity of REALD-99 using objective clinical oral health measures and more proximal outcomes such as behavior and compliance to specific health instructions.
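The internal reliability statistic reported above, Cronbach's alpha, follows the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below applies it to made-up 0/1 word-recognition scores; the function name and data are illustrative assumptions, not REALD-99 results.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, one column per item (e.g. 1 = word read correctly)."""
    k = len(scores[0])                      # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*scores))              # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses from four parents on three words (not study data)
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```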
Bowman, Gene L.; Shannon, Jackilen; Ho, Emily; Traber, Maret G.; Frei, Balz; Oken, Barry S.; Kaye, Jeffery A.; Quinn, Joseph F.
2010-01-01
Introduction There is great interest in nutritional strategies for the prevention of age-related cognitive decline, yet the best methods for nutritional assessment in populations at risk for dementia are still evolving. Our study objective was to test the reliability and validity of two common nutritional assessments (plasma nutrient biomarkers and Food Frequency Questionnaire) in people at risk for dementia. Methods Thirty-eight elders, half with amnestic Mild Cognitive Impairment and half with intact cognition, were recruited. Nutritional assessments were collected together at baseline and again at 1 month. Intraclass and Pearson correlation coefficients quantified reliability and validity. Results Twenty-six nutrients were examined and reliability was very good or better for 77% (20/26, ICC ≥ .75) of the plasma nutrient biomarkers and for 88% of the FFQ estimates. Twelve of the plasma nutrient estimates were as reliable as the commonly measured plasma cholesterol (ICC = .92). FFQ and plasma long-chain fatty acids (docosahexaenoic acid, r = .39; eicosapentaenoic acid, r = .39) and carotenoids (α-carotene, r = .49; lutein + zeaxanthin, r = .48; β-carotene, r = .43; β-cryptoxanthin, r = .41) were correlated, but no other FFQ estimates correlated with respective nutrient biomarkers. Correlations between FFQ and plasma fatty acids and carotenoids were significantly stronger after removing subjects with MCI. Conclusion The reliability and validity of plasma and FFQ nutrient estimates vary according to the nutrient of interest. Memory deficit attenuates FFQ estimate validity and inflates FFQ estimate reliability. Many plasma nutrient biomarkers have very good reliability over 1 month regardless of memory state. This method can circumvent sources of error seen in other less direct methods of nutritional assessment. PMID:20856100
Skin Friction at Very High Reynolds Numbers in the National Transonic Facility
NASA Technical Reports Server (NTRS)
Watson, Ralph D.; Anders, John B.; Hall, Robert M.
2006-01-01
Skin friction coefficients were derived from measurements using standard measurement technologies on an axisymmetric cylinder in the NASA Langley National Transonic Facility (NTF) at Mach numbers from 0.2 to 0.85. The pressure gradient was nominally zero, the wall temperature was nominally adiabatic, and the ratio of boundary layer thickness to model diameter within the measurement region was 0.10 to 0.14, varying with distance along the model. Reynolds numbers based on momentum thicknesses ranged from 37,000 to 605,000. The measurements approximately doubled the range of available data for flat plate skin friction coefficients. Three different techniques were used to measure surface shear. The maximum error of Preston tube measurements was estimated to be 2.5 percent, while that of Clauser derived measurements was estimated to be approximately 5 percent. Direct measurements by skin friction balance proved to be subject to large errors and were not considered reliable.
Ocean Data Assimilation in Support of Climate Applications: Status and Perspectives.
Stammer, D; Balmaseda, M; Heimbach, P; Köhl, A; Weaver, A
2016-01-01
Ocean data assimilation brings together observations with known dynamics encapsulated in a circulation model to describe the time-varying ocean circulation. Its applications are manifold, ranging from marine and ecosystem forecasting to climate prediction and studies of the carbon cycle. Here, we address only climate applications, which range from improving our understanding of ocean circulation to estimating initial or boundary conditions and model parameters for ocean and climate forecasts. Because of differences in underlying methodologies, data assimilation products must be used judiciously and selected according to the specific purpose, as not all related inferences would be equally reliable. Further advances are expected from improved models and methods for estimating and representing error information in data assimilation systems. Ultimately, data assimilation into coupled climate system components is needed to support ocean and climate services. However, maintaining the infrastructure and expertise for sustained data assimilation remains challenging.
Factor structure of a standards-based inventory of competencies in social work with groups.
Macgowan, Mark J; Dillon, Frank R; Spadola, Christine E
2018-01-01
This study extends previous findings on a measure of competencies based on Standards for Social Work Practice with Groups. The Inventory of Competencies in Social Work with Groups (ICSWG) measures confidence in performing the Standards. This study examines the latent structure of the Inventory, while illuminating the underlying structure of the Standards. A multinational sample of 586 persons completed the ICSWG. Exploratory factor analysis (EFA), reliability estimates, standard error of measurement estimates, and a range of validity tests were conducted. The EFA yielded a six-factor solution consisting of core values, mutuality/connectivity, collaboration, and three phases of group development (planning, beginnings/middles, endings). The alphas were .98 for the scale and ranged from .85 to .95 for the subscales. Correlations between the subscales and validators supported evidence of construct validity. The findings suggest key group work domains that should be taught and practiced in social work with groups.
Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chassin, David P.; Posse, Christian
2005-09-15
The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabási-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using other methods and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.
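For readers unfamiliar with the Barabási-Albert model, its preferential-attachment growth rule can be sketched in a few lines. This only illustrates how a scale-free topology is generated; it does not reproduce the authors' failure propagation model or their reliability index, and the function name, seed, and sizes are assumptions.

```python
import random
from collections import Counter

def barabasi_albert_edges(n, m, seed=0):
    """Grow an n-node graph; each new node attaches to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    if not 1 <= m < n:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))   # the first new node links to the m seed nodes
    repeated = []              # each node listed once per incident edge
    for source in range(m, n):
        edges.extend((source, t) for t in targets)
        repeated.extend(targets)
        repeated.extend([source] * m)
        chosen = set()
        while len(chosen) < m:  # degree-proportional sampling without repeats
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges

edges = barabasi_albert_edges(n=200, m=2)
degree = Counter(node for edge in edges for node in edge)
print(len(edges), max(degree.values()))
```

A few highly connected hubs typically emerge, which is the property that makes such models of interest for studying how failures spread through a grid.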
Evaluating North American Electric Grid Reliability Using the Barabasi-Albert Network Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chassin, David P.; Posse, Christian
2005-09-15
The reliability of electric transmission systems is examined using a scale-free model of network topology and failure propagation. The topologies of the North American eastern and western electric grids are analyzed to estimate their reliability based on the Barabasi-Albert network model. A commonly used power system reliability index is computed using a simple failure propagation model. The results are compared to the values of power system reliability indices previously obtained using standard power engineering methods, and they suggest that scale-free network models are usable to estimate aggregate electric grid reliability.
Chang, Pyung Hun; Kang, Sang Hoon
2010-05-30
The basic assumption of stochastic human arm impedance estimation methods is that the human arm and robot behave linearly for small perturbations. In the present work, we have identified the degree of influence of nonlinear friction in robot joints on stochastic human arm impedance estimation. Internal model based impedance control (IMBIC) is then proposed as a means to make the estimation accurate by compensating for the nonlinear friction. From simulations with a nonlinear LuGre friction model, it is observed that the reliability and accuracy of the estimation are severely degraded with nonlinear friction: below 2 Hz, multiple and partial coherence functions are far less than unity; estimated magnitudes and phases deviate severely from those of a real human arm throughout the frequency range of interest; and the accuracy is not enhanced by increasing the magnitude of the force perturbations. In contrast, the combined use of stochastic estimation and IMBIC provides accurate estimation results even with large friction: the multiple coherence functions are larger than 0.9 throughout the frequency range of interest and the estimated magnitudes and phases match well those of a real human arm. Furthermore, the performance of the suggested method is independent of human arm and robot posture, and of human arm impedance. Therefore, the IMBIC will be useful in measuring human arm impedance with a conventional robot, as well as in designing a spatial impedance measuring robot, which requires gearing. (c) 2010 Elsevier B.V. All rights reserved.
Reliability of TMS phosphene threshold estimation: Toward a standardized protocol.
Mazzi, Chiara; Savazzi, Silvia; Abrahamyan, Arman; Ruzzoli, Manuela
Phosphenes induced by transcranial magnetic stimulation (TMS) are a subjectively described visual phenomenon employed in basic and clinical research as an index of the excitability of retinotopically organized areas in the brain. Phosphene threshold estimation is a preliminary step in many TMS experiments in visual cognition for setting the appropriate level of TMS doses; however, the lack of a direct comparison of the available methods for phosphene threshold estimation leaves unresolved the reliability of those methods in setting TMS doses. The present work aims to fill this gap. We compared the most common methods for phosphene threshold calculation, namely the Method of Constant Stimuli (MOCS), the Modified Binary Search (MOBS) and the Rapid Estimation of Phosphene Threshold (REPT). In two experiments we tested the reliability of phosphene threshold (PT) estimation under each of the three methods, considering the day of administration, participants' expertise in phosphene perception and the sensitivity of each method to the initial values used for the threshold calculation. We found that MOCS and REPT have comparable reliability when estimating phosphene thresholds, while MOBS estimations appear less stable. Based on our results, researchers and clinicians can estimate phosphene threshold according to MOCS or REPT equally reliably, depending on their specific investigation goals. We suggest several important factors for consideration when calculating phosphene thresholds and describe strategies to adopt in experimental procedures. Copyright © 2017 Elsevier Inc. All rights reserved.
Data-driven coarse graining in action: Modeling and prediction of complex systems
NASA Astrophysics Data System (ADS)
Krumscheid, S.; Pradas, M.; Pavliotis, G. A.; Kalliadasis, S.
2015-10-01
In many physical, technological, social, and economic applications, one is commonly faced with the task of estimating statistical properties, such as mean first passage times of a temporal continuous process, from empirical data (experimental observations). Typically, however, an accurate and reliable estimation of such properties directly from the data alone is not possible as the time series is often too short, or the particular phenomenon of interest is only rarely observed. We propose here a theoretical-computational framework which provides us with a systematic and rational estimation of statistical quantities of a given temporal process, such as waiting times between subsequent bursts of activity in intermittent signals. Our framework is illustrated with applications from real-world data sets, ranging from marine biology to paleoclimatic data.
Wavelet Analysis for Wind Fields Estimation
Leite, Gladeston C.; Ushizima, Daniela M.; Medeiros, Fátima N. S.; de Lima, Gilson G.
2010-01-01
Wind field analysis from synthetic aperture radar images allows the estimation of wind direction and speed based on image descriptors. In this paper, we propose a framework to automate wind direction retrieval based on wavelet decomposition associated with spectral processing. We extend existing undecimated wavelet transform approaches by including à trous with a B3 spline scaling function, in addition to other wavelet bases such as Gabor and Mexican-hat. The purpose is to extract more reliable directional information when wind speed values range from 5 to 10 m s−1. Using C-band empirical models, associated with the estimated directional information, we calculate local wind speed values and compare our results with QuikSCAT scatterometer data. The proposed approach has potential application in the evaluation of oil spills and wind farms. PMID:22219699
Zhu, Junya; Li, Liping; Zhao, Hailei; Han, Guangshu; Wu, Albert W; Weingart, Saul N
2014-10-01
Existing patient safety climate instruments, most of which have been developed in the USA, may not accurately reflect the conditions in the healthcare systems of other countries. To develop and evaluate a patient safety climate instrument for healthcare workers in Chinese hospitals. Based on a review of existing instruments, expert panel review, focus groups and cognitive interviews, we developed items relevant to patient safety climate in Chinese hospitals. The draft instrument was distributed to 1700 hospital workers from 54 units in six hospitals in five Chinese cities between July and October 2011, and 1464 completed surveys were received. We performed exploratory and confirmatory factor analyses and estimated internal consistency reliability, within-unit agreement, between-unit variation, unit-mean reliability, correlation between multi-item composites, and association between the composites and two single items of perceived safety. The final instrument included 34 items organised into nine composites: institutional commitment to safety, unit management support for safety, organisational learning, safety system, adequacy of safety arrangements, error reporting, communication and peer support, teamwork and staffing. All composites had acceptable unit-mean reliabilities (≥0.74) and within-unit agreement (Rwg ≥0.71), and exhibited significant between-unit variation with intraclass correlation coefficients ranging from 9% to 21%. Internal consistency reliabilities ranged from 0.59 to 0.88 and were ≥0.70 for eight of the nine composites. Correlations between composites ranged from 0.27 to 0.73. All composites were positively and significantly associated with the two perceived safety items. The Chinese Hospital Survey on Patient Safety Climate demonstrates adequate dimensionality, reliability and validity. The integration of qualitative and quantitative methods is essential to produce an instrument that is culturally appropriate for Chinese hospitals. 
Published by the BMJ Publishing Group Limited.
Testing comparison models of DASS-12 and its reliability among adolescents in Malaysia.
Osman, Zubaidah Jamil; Mukhtar, Firdaus; Hashim, Hairul Anuar; Abdul Latiff, Latiffah; Mohd Sidik, Sherina; Awang, Hamidin; Ibrahim, Normala; Abdul Rahman, Hejar; Ismail, Siti Irma Fadhilah; Ibrahim, Faisal; Tajik, Esra; Othman, Norlijah
2014-10-01
The 21-item Depression, Anxiety and Stress Scale (DASS-21) is frequently used in non-clinical research to measure mental health factors among adults. However, previous studies have concluded that the 21 items are not stable for use among the adolescent population. Thus, the aims of this study are to examine the structure of the factors and to report on the reliability of the refined version of the DASS that consists of 12 items. A total of 2850 students (aged 13 to 17 years old) from three major ethnic groups in Malaysia completed the DASS-21. The study was conducted at 10 randomly selected secondary schools in the northern state of Peninsular Malaysia. The study population comprised secondary school students (Forms 1, 2 and 4) from the selected schools. Based on the results of the EFA stage, 12 items were included in a final CFA to test the fit of the model. Using maximum likelihood procedures to estimate the model, the selected fit indices indicated a close model fit (χ²=132.94, df=57, p=.000; CFI=.96; RMR=.02; RMSEA=.04). Moreover, significant loadings of all the unstandardized regression weights implied an acceptable convergent validity. Besides the convergent validity of the items, discriminant validity of the subscales was also evident from the moderate latent factor inter-correlations, which ranged from .62 to .75. The subscale reliability was further estimated using Cronbach's alpha, and adequate reliability of the subscales was obtained (Total=.76; Depression=.68; Anxiety=.53; Stress=.52). The new version of the 12-item DASS for adolescents in Malaysia (DASS-12) is reliable and has a stable factor structure, and thus it is a useful instrument for distinguishing between depression, anxiety and stress. Copyright © 2014 Elsevier Inc. All rights reserved.
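The subscale reliabilities reported above are Cronbach's alpha values. As a rough illustration of how such a coefficient is computed, the following minimal sketch applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small respondents-by-items table. The function name `cronbach_alpha` and the example scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer all items")

    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the summed (total) score across respondents.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up data: three respondents, two items.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))
```

With only a few items per subscale, as in a 12-item instrument split across three subscales, alpha tends to be lower than for longer scales, which is one reason subscale values are usually read alongside the total-scale value.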
Olsen, J. Pat; Fellows, Robert P.; Rivera-Mindt, Monica; Morgello, Susan; Byrd, Desiree A.
2015-01-01
The Wide Range Achievement Test, 3rd edition, Reading-Recognition subtest (WRAT-3 RR) is an established measure of premorbid ability. However, its long-term reliability is not well documented, particularly in diverse populations with CNS-relevant disease. Objective: We examined test-retest reliability of the WRAT-3 RR over time in an HIV+ sample of predominantly racial/ethnic minority adults. Method: Participants (N = 88) completed a comprehensive neuropsychological battery, including the WRAT-3 RR, on at least two separate study visits. Intraclass correlation coefficients (ICCs) were computed using scores from baseline and follow-up assessments to determine the test-retest reliability of the WRAT-3 RR across racial/ethnic groups and changes in medical (immunological) and clinical (neurocognitive) factors. Additionally, Fisher’s Z tests were used to determine the significance of the differences between ICCs. Results: The average test-retest interval was 58.7 months (SD=36.4). The overall WRAT-3 RR test-retest reliability was high (r = .97, p < .001), and remained robust across all demographic, medical, and clinical variables (all r’s > .92). Intraclass correlation coefficients did not differ significantly between the subgroups tested (all Fisher’s Z p’s > .05). Conclusions: Overall, this study supports the appropriateness of word-reading tests, such as the WRAT-3 RR, for use as stable premorbid IQ estimates among ethnically diverse groups. Moreover, this study supports the reliability of this measure in the context of change in health and neurocognitive status, and in lengthy inter-test intervals. These findings offer strong rationale for reading as a “hold” test, even in the presence of a chronic, variable disease such as HIV. PMID:26689235
Processes and Procedures for Estimating Score Reliability and Precision
ERIC Educational Resources Information Center
Bardhoshi, Gerta; Erford, Bradley T.
2017-01-01
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
A Latent Class Approach to Estimating Test-Score Reliability
ERIC Educational Resources Information Center
van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas
2011-01-01
This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
ERIC Educational Resources Information Center
Gadermann, Anne M.; Guhn, Martin; Zumbo, Bruno D.
2012-01-01
This paper provides a conceptual, empirical, and practical guide for estimating ordinal reliability coefficients for ordinal item response data (also referred to as Likert, Likert-type, ordered categorical, or rating scale item responses). Conventionally, reliability coefficients, such as Cronbach's alpha, are calculated using a Pearson…
NASA Astrophysics Data System (ADS)
Su, Hailin; Li, Hengde; Wang, Shi; Wang, Yangfan; Bao, Zhenmin
2017-02-01
Genomic selection is increasingly popular in animal and plant breeding industries around the world, as it can be applied early in life without impacting selection candidates. The objective of this study was to bring the advantages of genomic selection to scallop breeding. Two different genomic selection tools, MixP and gsbay, were applied to the genomic evaluation of simulated data and Zhikong scallop (Chlamys farreri) field data. The results were compared with those of the genomic best linear unbiased prediction (GBLUP) method, which has been applied widely. Our results showed that both MixP and gsbay could accurately estimate single-nucleotide polymorphism (SNP) marker effects, and thereby could be applied to the analysis of genomic estimated breeding values (GEBV). In simulated data from different scenarios, the accuracy of GEBV ranged from 0.20 to 0.78 with MixP, from 0.21 to 0.67 with gsbay, and from 0.21 to 0.61 with GBLUP. Estimations made by MixP and gsbay were expected to be more reliable than those made by GBLUP. Predictions made by gsbay were more robust, while computation with MixP was much faster, especially in dealing with large-scale data. These results suggested that the algorithms implemented by MixP and gsbay are both feasible for carrying out genomic selection in scallop breeding, and that more genotype data will be necessary to produce genomic estimated breeding values with a higher accuracy for the industry.
Investigation of spectral analysis techniques for randomly sampled velocimetry data
NASA Technical Reports Server (NTRS)
Sree, Dave
1993-01-01
It is well known that laser velocimetry (LV) generates individual realization velocity data that are randomly or unevenly sampled in time. Spectral analysis of such data to obtain the turbulence spectra, and hence turbulence scales information, requires special techniques. The 'slotting' technique of Mayo et al, also described by Roberts and Ajmani, and the 'Direct Transform' method of Gaster and Roberts are well known in the LV community. The slotting technique is faster than the direct transform method in computation. There are practical limitations, however, as to how high a frequency an accurate estimate can be made for a given mean sampling rate. These high frequency estimates are important in obtaining the microscale information of turbulence structure. It was found from previous studies that reliable spectral estimates can be made up to about the mean sampling frequency (mean data rate) or less. If the data were evenly sampled, the frequency range would be half the sampling frequency (i.e. up to the Nyquist frequency); otherwise, an aliasing problem would occur. The mean data rate and the sample size (total number of points) basically limit the frequency range. Also, there are large variabilities or errors associated with the high frequency estimates from randomly sampled signals. Roberts and Ajmani proposed certain pre-filtering techniques to reduce these variabilities, but at the cost of low frequency estimates. The prefiltering acts as a high-pass filter. Further, Shapiro and Silverman showed theoretically that, for Poisson sampled signals, it is possible to obtain alias-free spectral estimates far beyond the mean sampling frequency. But the question is, how far?
During his tenure under the 1993 NASA-ASEE Summer Faculty Fellowship Program, the author found from his studies on spectral analysis techniques for randomly sampled signals that the spectral estimates can be enhanced or improved up to about 4-5 times the mean sampling frequency by using a suitable prefiltering technique. However, this increased bandwidth comes at the cost of the lower frequency estimates. The studies further showed that large data sets of the order of 100,000 points or more, high data rates, and Poisson sampling are crucial for obtaining reliable spectral estimates from randomly sampled data, such as LV data. Some of the results of the current study are presented.
Inter-observer reliability of DSM-5 substance use disorders.
Denis, Cécile M; Gelernter, Joel; Hart, Amy B; Kranzler, Henry R
2015-08-01
Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence concerning the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
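The two agreement statistics named above, percent agreement and the kappa coefficient, can be illustrated with a short sketch. It assumes the unweighted two-rater Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each rater's marginal frequencies; the abstract does not state which kappa variant was used. The function names and the ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of subjects on which two raters give the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters (illustrative helper)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)

    # Chance agreement from each rater's marginal label counts.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical label for everyone;
        # kappa is undefined, so report perfect agreement by convention.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Made-up lifetime diagnoses (1 = disorder present, 0 = absent) for ten subjects.
interviewer_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
interviewer_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(percent_agreement(interviewer_1, interviewer_2))
print(round(cohen_kappa(interviewer_1, interviewer_2), 2))
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance alone, which is why both figures are usually reported together.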
Reliability of hospital cost profiles in inpatient surgery.
Grenda, Tyler R; Krell, Robert W; Dimick, Justin B
2016-02-01
With increased policy emphasis on shifting risk from payers to providers through mechanisms such as bundled payments and accountable care organizations, hospitals are increasingly in need of metrics to understand their costs relative to peers. However, it is unclear whether Medicare payments for surgery can reliably compare hospital costs. We used national Medicare data to assess patients undergoing colectomy, pancreatectomy, and open incisional hernia repair from 2009 to 2010 (n = 339,882 patients). We first calculated risk-adjusted hospital total episode payments for each procedure. We then used hierarchical modeling techniques to estimate the reliability of total episode payments for each procedure and explored the impact of hospital caseload on payment reliability. Finally, we quantified the number of hospitals meeting published reliability benchmarks. Mean risk-adjusted total episode payments ranged from $13,262 (standard deviation [SD] $14,523) for incisional hernia repair to $25,055 (SD $22,549) for pancreatectomy. The reliability of hospital episode payments varied widely across procedures and depended on sample size. For example, mean episode payment reliability for colectomy (mean caseload, 157) was 0.80 (SD 0.18), whereas for pancreatectomy (mean caseload, 13) the mean reliability was 0.45 (SD 0.27). Many hospitals met published reliability benchmarks for each procedure. For example, 90% of hospitals met reliability benchmarks for colectomy, 40% for pancreatectomy, and 66% for incisional hernia repair. Episode payments for inpatient surgery are a reliable measure of hospital costs for commonly performed procedures, but are less reliable for lower volume operations. These findings suggest that hospital cost profiles based on Medicare claims data may be used to benchmark efficiency, especially for more common procedures. Copyright © 2016 Elsevier Inc. All rights reserved.
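The dependence of reliability on caseload described above can be sketched with the signal-to-noise definition commonly used in hierarchical profiling: reliability = between-hospital variance / (between-hospital variance + within-hospital variance / caseload). Whether the authors used exactly this form is an assumption, and the variance values below are invented for illustration only; the caseloads 13 and 157 are the mean caseloads quoted in the abstract.

```python
def profile_reliability(between_var, within_var, caseload):
    """Signal-to-noise reliability of a hospital-level mean (illustrative).

    between_var: variance of true hospital means (signal)
    within_var:  patient-level variance within a hospital (noise)
    caseload:    number of cases contributing to the hospital's mean
    """
    if caseload <= 0:
        raise ValueError("caseload must be positive")
    if between_var < 0 or within_var < 0:
        raise ValueError("variances must be non-negative")
    if between_var == 0 and within_var == 0:
        raise ValueError("at least one variance must be positive")
    return between_var / (between_var + within_var / caseload)


# Hypothetical variance components, in arbitrary squared-dollar units.
between, within = 1.0, 40.0
for n in (13, 157):
    print(n, round(profile_reliability(between, within, n), 2))
```

Under this definition, reliability rises toward 1 as caseload grows, which is consistent with the abstract's finding that low-volume procedures yield less reliable payment profiles.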
The modal surface interpolation method for damage localization
NASA Astrophysics Data System (ADS)
Pina Limongelli, Maria
2017-05-01
The Interpolation Method (IM) has been previously proposed and successfully applied for damage localization in plate-like structures. The method is based on the detection of localized reductions of smoothness in the Operational Deformed Shapes (ODSs) of the structure. The IM can be applied to any type of structure provided the ODSs are estimated accurately in the original and in the damaged configurations. If the latter condition is not met, for example when the structure is subjected to unknown inputs or when the structural responses are strongly corrupted by noise, both false and missing alarms occur when the IM is applied to localize a concentrated damage. In order to overcome these drawbacks, a modification of the method is investigated herein. An ODS is the deformed shape of a structure subjected to a harmonic excitation: at resonances the ODSs are dominated by the relevant mode shapes. The effect of noise at resonance is usually lower than at other frequency values, hence the relevant ODSs are estimated with higher reliability. Several methods have been proposed to reliably estimate modal shapes in the case of unknown input. These two circumstances can be exploited to improve the reliability of the IM. In order to reduce or eliminate the drawbacks related to the estimation of the ODSs in the case of noisy signals, this paper investigates a modified version of the method based on a damage feature calculated considering the interpolation error relevant only to the modal shapes and not to all the operational shapes in the significant frequency range. The results of the IM in its current version (with the interpolation error calculated by summing the contributions of all the operational shapes) are compared with those of the newly proposed version (with the estimation of the interpolation error limited to the modal shapes).
Internal Consistency, Retest Reliability, and their Implications For Personality Scale Validity
McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio
2010-01-01
We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
The influence of liquidity on informational efficiency: The case of the Thai Stock Market
NASA Astrophysics Data System (ADS)
Bariviera, Aurelio Fernández
2011-11-01
The presence of long-range memory in financial time series is a puzzling fact that challenges the established financial theory. We study the effect of liquidity on the efficiency (measured by the Hurst exponent) of the Thai Stock Market. We find that: (i) the R/S method could generate spurious long-range dependence, whereas the DFA method gives more reliable estimates of the Hurst exponent, and (ii) there is a weak relationship between market capitalization and the efficiency of the market, and the latter is not significantly affected by the presence of foreign investors.
Au, Lewis; Turner, Natalie; Wong, Hui-Li; Field, Kathryn; Lee, Belinda; Boadle, David; Cooray, Prasad; Karikios, Deme; Kosmider, Suzanne; Lipton, Lara; Nott, Louise; Parente, Phillip; Tie, Jeanne; Tran, Ben; Wong, Rachel; Yip, Desmond; Shapiro, Jeremy; Gibbs, Peter
2018-04-01
Current efforts to understand patient management in clinical practice are largely based on clinician surveys with uncertain reliability. The TRACC (Treatment of Recurrent and Advanced Colorectal Cancer) database is a multisite registry collecting comprehensive treatment and outcome data on consecutive metastatic colorectal cancer (mCRC) patients at multiple sites across Australia. This study aims to determine the accuracy of oncologists' impressions of real-world practice by comparing clinicians' estimates to data captured by TRACC. Nineteen medical oncologists from nine hospitals contributing data to TRACC completed a 34-question survey regarding their impression of the management and outcomes of mCRC at their own practice and other hospitals contributing to the database. Responses were then compared with TRACC data to determine how closely their impressions reflected actual practice. Data on 1300 patients with mCRC were available. Median clinician estimated frequency of KRAS testing within 6 months of diagnosis was 80% (range: 20-100%); the TRACC documented rate was 43%. Clinicians generally overestimated the rates of first-line treatment, particularly in patients over 75 years. The estimate for bevacizumab in first line was 60% (35-80%) versus 49% in TRACC. The estimated rate for liver resection varied substantially (5-35%), and the estimated median (27%) was inconsistent with the TRACC rate (12%). Oncologists generally felt their practice was similar to other hospitals. Oncologists' estimates of current clinical practice varied and were discordant with the TRACC database, often with a tendency to overestimate interventions. Clinician surveys alone do not reliably capture contemporary clinical practices in mCRC. © 2017 John Wiley & Sons Australia, Ltd.
Predicting Cost/Reliability/Maintainability of Advanced General Aviation Avionics Equipment
NASA Technical Reports Server (NTRS)
Davis, M. R.; Kamins, M.; Mooz, W. E.
1978-01-01
A methodology is provided for assisting NASA in estimating the cost, reliability, and maintenance (CRM) requirements for general avionics equipment operating in the 1980's. Practical problems of predicting these factors are examined. The usefulness and shortcomings of different approaches for modeling cost and reliability estimates are discussed together with special problems caused by the lack of historical data on the cost of maintaining general aviation avionics. Suggestions are offered on how NASA might proceed in assessing cost reliability CRM implications in the absence of reliable generalized predictive models.
The Trojan Lifetime Champions Health Survey: development, validity, and reliability.
Sorenson, Shawn C; Romano, Russell; Scholefield, Robin M; Schroeder, E Todd; Azen, Stanley P; Salem, George J
2015-04-01
Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Descriptive laboratory study. A large National Collegiate Athletic Association Division I university. A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. 
We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite, competitive athletic population. These data suggest that the TLC Health Survey is a valid and reliable instrument for assessing lifetime and recent health, exercise, and HRQL, among elite competitive athletes. Generalizability of the instrument may be enhanced by additional, larger-scale studies in diverse populations.
Constraining uncertainties in water supply reliability in a tropical data scarce basin
NASA Astrophysics Data System (ADS)
Kaune, Alexander; Werner, Micha; Rodriguez, Erasmo; de Fraiture, Charlotte
2015-04-01
Assessing the water supply reliability in river basins is essential for adequate planning and development of irrigated agriculture and urban water systems. In many cases hydrological models are applied to determine the surface water availability in river basins. However, surface water availability and variability are often not appropriately quantified due to epistemic uncertainties, leading to water supply insecurity. The objective of this research is to determine the water supply reliability in order to support planning and development of irrigated agriculture in a tropical, data scarce environment. The approach proposed uses a simple hydrological model, but explicitly includes model parameter uncertainty. A transboundary river basin in the tropical region of Colombia and Venezuela with an area of approximately 2100 km² was selected as a case study. The Budyko hydrological framework was extended to consider climatological input variability and model parameter uncertainty, and through this the surface water reliability to satisfy the irrigation and urban demand was estimated. This provides a spatial estimate of the water supply reliability across the basin. For the middle basin the reliability was found to be less than 30% for most of the months when the water is extracted from an upstream source. Conversely, the monthly water supply reliability was high (r>98%) in the lower basin irrigation areas when water was withdrawn from a source located further downstream. Including model parameter uncertainty provides a complete estimate of the water supply reliability, but that estimate is influenced by the uncertainty in the model. Reducing the uncertainty in the model through improved data and perhaps improved model structure will improve the estimate of the water supply reliability, allowing better planning of irrigated agriculture and dependable water allocation decisions.
Cost Estimation of Software Development and the Implications for the Program Manager
1992-06-01
Software Lifecycle Model (SLIM), the Jensen System-4 model, the Software Productivity, Quality, and Reliability Estimator (SPQR/20), the Constructive...function models in current use are the Software Productivity, Quality, and Reliability Estimator (SPQR/20) and the Software Architecture Sizing and...Estimator (SPQR/20) was developed by T. Capers Jones of Software Productivity Research, Inc., in 1985. The model is intended to estimate the outcome
Point Cloud Based Relative Pose Estimation of a Satellite in Close Range
Liu, Lujiang; Zhao, Gaopeng; Bo, Yuming
2016-01-01
Determination of the relative pose of satellites is essential in space rendezvous operations and on-orbit servicing missions. The key problems are the adoption of a suitable sensor on board a chaser and efficient techniques for pose estimation. This paper aims to estimate the pose of a target satellite in close range on the basis of its known model by using point cloud data generated by a flash LIDAR sensor. A novel model based pose estimation method is proposed; it includes a fast and reliable initial pose acquisition method based on global optimal searching by processing the dense point cloud data directly, and a pose tracking method based on the Iterative Closest Point algorithm. Also, a simulation system is presented in this paper in order to evaluate the performance of the sensor and generate simulated sensor point cloud data. It also provides the true pose of the test target so that the pose estimation error can be quantified. To investigate the effectiveness of the proposed approach and the achievable pose accuracy, numerical simulation experiments are performed; results demonstrate the algorithm's capability of operating on point clouds directly and handling large pose variations. Also, a field testing experiment is conducted and results show that the proposed method is effective. PMID:27271633
NASA Astrophysics Data System (ADS)
Luo, Shezhou; Wang, Cheng; Xi, Xiaohuan; Pan, Feifei; Qian, Mingjie; Peng, Dailiang; Nie, Sheng; Qin, Haiming; Lin, Yi
2017-06-01
Wetland biomass is essential for monitoring the stability and productivity of wetland ecosystems. Conventional field methods to measure or estimate wetland biomass are accurate and reliable, but expensive, time consuming and labor intensive. This research explored the potential for estimating wetland reed biomass using a combination of airborne discrete-return Light Detection and Ranging (LiDAR) and hyperspectral data. To derive the optimal predictor variables of reed biomass, a range of LiDAR and hyperspectral metrics at different spatial scales were regressed against the field-observed biomasses. The results showed that the LiDAR-derived H_p99 (99th percentile of the LiDAR height) and hyperspectral-calculated modified soil-adjusted vegetation index (MSAVI) were the best metrics for estimating reed biomass using the single regression model. Although the LiDAR data yielded a higher estimation accuracy compared to the hyperspectral data, the combination of LiDAR and hyperspectral data produced a more accurate prediction model for reed biomass (R² = 0.648, RMSE = 167.546 g/m², RMSEr = 20.71%) than LiDAR data alone. Thus, combining LiDAR data with hyperspectral data has a great potential for improving the accuracy of aboveground biomass estimation.
Social Costs of Gambling in the Czech Republic 2012.
Winkler, Petr; Bejdová, Markéta; Csémy, Ladislav; Weissová, Aneta
2017-12-01
Evidence about social costs of gambling is scarce and the methodology for their calculation has been subject to strong criticism. We aimed to estimate social costs of gambling in the Czech Republic in 2012. This retrospective, prevalence-based cost-of-illness study builds on the revised methodology of the Australian Productivity Commission. Social costs of gambling were estimated by combining epidemiological and economic data. Prevalence data on negative consequences of gambling were taken from existing national epidemiological studies. Economic data were taken from various national and international sources. Only the consequences of problem and pathological gambling were taken into account. In 2012, the social costs of gambling in the Czech Republic were estimated to range between 541,619 and 619,608 thousand EUR. While personal and family costs accounted for 63% of all social costs, direct medical costs were estimated to account for only 0.25 to 0.28% of all social costs. This is the first study to estimate social costs of gambling in any of the Central and East European countries. It builds upon solid evidence about the prevalence of gambling-related problems in the Czech Republic and satisfactorily reliable economic data. However, there are a number of limitations stemming from the assumptions that were made, which suggest that the methodology for the calculation of the social costs of gambling needs further development.
Oscillating-flow regenerator test rig: Woven screen and metal felt results
NASA Technical Reports Server (NTRS)
Gedeon, D.; Wood, J. G.
1992-01-01
We present correlating expressions, in terms of Reynolds or Peclet numbers, for friction factors, Nusselt numbers, enhanced axial conduction ratios, and overall heat flux ratios in four porous regenerator samples representative of Stirling cycle regenerators: two woven screen samples and two random wire samples. Error estimates and comparison of data with others suggest our correlations are reliable, but we need to test more samples over a range of porosities before our results will become generally useful.
NASA Astrophysics Data System (ADS)
Katake, Anup; Choi, Heeyoul
2010-01-01
To enable autonomous air-to-air refueling of manned and unmanned vehicles, a robust, high-speed relative navigation sensor capable of providing high-accuracy 3DOF information in diverse operating conditions is required. To help address this problem, StarVision Technologies Inc. has been developing a compact, high update rate (100Hz), wide field-of-view (90deg) direction and range estimation imaging sensor called VisNAV 100. The sensor is fully autonomous, requiring no communication from the tanker aircraft, and contains high-reliability embedded avionics to provide range, azimuth, elevation (a 3 degrees of freedom, or 3DOF, solution) and closing speed relative to the tanker aircraft. The sensor is capable of providing 3DOF with an error of 1% in range and 0.1deg in azimuth/elevation up to a range of 30m, and 1 deg error in direction for ranges up to 200m, at 100Hz update rates. In this paper we will discuss the algorithms that were developed in-house to enable robust beacon pattern detection, outlier rejection and 3DOF estimation in adverse conditions and present the results of several outdoor tests. Results from the long range single beacon detection tests will also be discussed.
APPLICATION OF TRAVEL TIME RELIABILITY FOR PERFORMANCE ORIENTED OPERATIONAL PLANNING OF EXPRESSWAYS
NASA Astrophysics Data System (ADS)
Mehran, Babak; Nakamura, Hideki
Evaluation of impacts of congestion improvement schemes on travel time reliability is very significant for road authorities since travel time reliability represents operational performance of expressway segments. In this paper, a methodology is presented to estimate travel time reliability prior to implementation of congestion relief schemes based on travel time variation modeling as a function of demand, capacity, weather conditions and road accidents. For subject expressway segments, traffic conditions are modeled over a whole year considering demand and capacity as random variables. Patterns of demand and capacity are generated for each five minute interval by applying Monte-Carlo simulation technique, and accidents are randomly generated based on a model that links accident rate to traffic conditions. A whole year analysis is performed by comparing demand and available capacity for each scenario and queue length is estimated through shockwave analysis for each time interval. Travel times are estimated from refined speed-flow relationships developed for intercity expressways and buffer time index is estimated consequently as a measure of travel time reliability. For validation, estimated reliability indices are compared with measured values from empirical data, and it is shown that the proposed method is suitable for operational evaluation and planning purposes.
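The buffer time index mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition, (95th percentile travel time - mean travel time) / mean travel time; the abstract does not state which definition the authors used. The function names and sample travel times are hypothetical.

```python
import statistics


def percentile(values, q):
    """Percentile by linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def buffer_time_index(travel_times, q=95.0):
    """Extra time, as a fraction of the mean, needed to arrive on time in q% of trips."""
    mean_tt = statistics.fmean(travel_times)
    return (percentile(travel_times, q) - mean_tt) / mean_tt


# Hypothetical travel times (minutes) for one segment over five intervals.
bti = buffer_time_index([10.0, 10.0, 10.0, 10.0, 20.0])
```

A value of 0.5 would mean that a traveler should budget 50% more than the mean travel time; a segment with constant travel time has an index of 0.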
NASA Technical Reports Server (NTRS)
Unal, Resit; Morris, W. Douglas; White, Nancy H.; Lepsch, Roger A.; Brown, Richard W.
2000-01-01
This paper describes the development of parametric models for estimating operational reliability and maintainability (R&M) characteristics for reusable vehicle concepts, based on vehicle size and technology support level. An R&M analysis tool (RMAT) and response surface methods are utilized to build parametric approximation models for rapidly estimating operational R&M characteristics such as mission completion reliability. These models, which approximate RMAT, can then be utilized for fast analysis of operational requirements, for lifecycle cost estimating and for multidisciplinary design optimization.
NASA Astrophysics Data System (ADS)
Mohammed, Amal A.; Abraheem, Sudad K.; Fezaa Al-Obedy, Nadia J.
2018-05-01
This paper is concerned with the Burr type XII distribution. The maximum likelihood and Bayes methods of estimation are used to estimate the unknown scale parameter (α). Al-Bayyati's loss function and a suggested loss function are used to find the reliability with the least loss. The reliability function is then expanded in terms of a set of power functions. Matlab (ver. 9) is used for the computations, and some examples are given.
Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle.
van Binsbergen, Rianne; Calus, Mario P L; Bink, Marco C A M; van Eeuwijk, Fred A; Schrooten, Chris; Veerkamp, Roel F
2015-09-17
In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data. Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training. Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed. Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. 
To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.
Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy
Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József
2015-01-01
Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575
NASA Astrophysics Data System (ADS)
Zhuang, Chao; Zhou, Zhifang; Illman, Walter A.; Guo, Qiaona; Wang, Jinguo
2017-09-01
The classical aquitard-drainage model COMPAC has been modified to simulate the compaction process of a heterogeneous aquitard consisting of multiple sub-units (Multi-COMPAC). By coupling Multi-COMPAC with the parameter estimation code PEST++, the vertical hydraulic conductivity (K_v) and elastic (S_ske) and inelastic (S_skp) skeletal specific-storage values of each sub-unit can be estimated using observed long-term multi-extensometer and groundwater level data. The approach was first tested through a synthetic case with known parameters. Results of the synthetic case revealed that it was possible to accurately estimate the three parameters for each sub-unit. Next, the methodology was applied to a field site located in Changzhou city, China. Based on the detailed stratigraphic information and extensometer data, the aquitard of interest was subdivided into three sub-units. Parameters K_v, S_ske and S_skp of each sub-unit were estimated simultaneously and then were compared with laboratory results and with bulk values and geologic data from previous studies, demonstrating the reliability of parameter estimates. Estimated S_skp values ranged within the magnitude of 10^-4 m^-1, while K_v ranged over 10^-10 to 10^-8 m/s, suggesting moderately high heterogeneity of the aquitard. However, the elastic deformation of the third sub-unit, consisting of soft plastic silty clay, is masked by delayed drainage, and the inverse procedure leads to large uncertainty in the S_ske estimate for this sub-unit.
Al-Abassi, Abdulla Ahmed; Al Saadi, Azan Saleh; Ahmed, Faisal
2018-06-19
Intra-abdominal pressure (IAP) can be measured by several indirect methods; however, the urinary bladder is largely preferred. The aim of this study was to compare intra-bladder pressure (IBP) at different levels of IAP and assess its reliability as an indirect method for IAP measurement. We compared IBP with IAP in twenty-one patients undergoing laparoscopic cholecystectomy under general anesthesia. Measurements were recorded at increasing levels of insufflation pressure up to approximately 22 mmHg. Pearson's correlation coefficient was calculated to establish the relationship between the two pressure measurements and Bland-Altman analysis was used to assess the limits of agreement between the two methods of measurement. The urinary bladder pressures reflected the pressures in the abdominal cavity well. Pearson's correlation coefficient showed a good correlation between the two measurement techniques (r = 0.966, p < 0.0001) and Bland-Altman analysis indicated that the 95% limits of agreement between the two methods ranged from -2.83 to 2.64. This range is accepted both clinically and according to the recommendations of the World Society of Abdominal Compartment Syndrome (WSACS). Our study showed that IBP measurement is a simple, minimally invasive method that may reliably estimate IAP in patients placed in the supine position. Measurements for pressures higher than 12 mmHg may be less reliable. When applied clinically, this should alert the clinician to take safety measures to avoid abdominal compartment syndrome (ACS).
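The Bland-Altman limits of agreement reported above can be illustrated with a minimal sketch. It assumes the standard form, mean difference ± 1.96 × standard deviation of the paired differences; the function name and the pressure readings are hypothetical and are not data from the study.

```python
import statistics


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences a - b; the limits are
    bias -/+ z * sample standard deviation of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.fmean(diffs)
    spread = z * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired readings in mmHg (bladder vs. insufflator), not study data.
bias, lower, upper = bland_altman_limits([10.0, 12.0, 15.0, 20.0],
                                         [9.0, 13.0, 15.0, 19.0])
```

Unlike a correlation coefficient, the limits express disagreement in the measurement's own units, which is why they can be compared against a clinically acceptable range.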
Loeding, B L; Greenan, J P
1998-12-01
The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Reliability of the Raven Coloured Progressive Matrices for Anglo and for Mexican-American Children.
ERIC Educational Resources Information Center
Valencia, Richard R.
1984-01-01
Investigated the internal consistency reliability estimates of the Raven Coloured Progressive Matrices (CPM) for 96 Anglo and Mexican American third-grade boys from low socioeconomic status background. The results showed that the reliability estimates of the CPM for the two ethnic groups were acceptably high and extremely similar in magnitude.…
Racemization of aspartic acid in root dentin as a tool for age estimation in a Kuwaiti population.
Elfawal, Mohamed Amin; Alqattan, Sahib Issa; Ghallab, Noha Ayman
2015-01-01
Estimation of age is one of the most significant tasks in forensic practice. Amino acid racemization is considered one of the most reliable and accurate methods of age estimation and aspartic acid shows a high racemization reaction rate. The present study has investigated the application of aspartic acid racemization in age estimation in a Kuwaiti population using root dentin from a total of 89 upper first premolar teeth. The D/L ratio of aspartic acid was obtained by HPLC technique in a test group of 50 subjects and a linear regression line was established between aspartic acid racemization and age. The correlation coefficient (r) was 0.97, and the standard error of estimation was ±1.26 years. The racemization age "t" of each subject was calculated by applying the following formula: ln [(1 + D/L)/(1 - D/L)] = 0.003181 t + (-0.01591). When the proposed formula "estimated age t = (ln [(1 + D/L)/(1 - D/L)] + 0.01591)/0.003181" was applied to a validation group of 39 subjects, the range of error was less than one year in 82.1% of the cases and the standard error of estimation was ±1.12. The current work has established a reasonably significant correlation of the D-/L-aspartic acid ratio with age, and proposed an apparently reliable formula for calculating the age in Kuwaiti populations through aspartic acid racemization. Further research is required to find out whether similar findings are applicable to other ethnic populations. © The Author(s) 2014
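As a worked illustration, solving the regression line quoted above for t gives t = (ln[(1 + D/L)/(1 - D/L)] + 0.01591)/0.003181. The sketch below evaluates that expression; the slope and intercept are the values given in the abstract, while the function name and the input check are assumptions added for illustration.

```python
import math

SLOPE = 0.003181       # per year, from the regression line in the abstract
INTERCEPT = -0.01591   # from the regression line in the abstract


def racemization_age(dl_ratio):
    """Estimate age in years from the D/L aspartic acid ratio.

    Inverts ln[(1 + D/L) / (1 - D/L)] = SLOPE * t + INTERCEPT for t.
    """
    if not 0.0 <= dl_ratio < 1.0:
        raise ValueError("D/L ratio must lie in [0, 1)")
    y = math.log((1.0 + dl_ratio) / (1.0 - dl_ratio))
    return (y - INTERCEPT) / SLOPE


# D/L ratio that the regression line itself would predict for a 40-year-old.
dl_at_40 = math.tanh((SLOPE * 40.0 + INTERCEPT) / 2.0)
age = racemization_age(dl_at_40)
```

The round trip through the forward regression line recovers 40 years, which confirms that the intercept must be added before dividing by the slope.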
Younes, Magdy; Kuna, Samuel T; Pack, Allan I; Walsh, James K; Kushida, Clete A; Staley, Bethany; Pien, Grace W
2018-02-15
The American Academy of Sleep Medicine has published manuals for scoring polysomnograms that recommend time spent in non-rapid eye movement sleep stages (stage N1, N2, and N3 sleep) be reported. Given the well-established large interrater variability in scoring stage N1 and N3 sleep, we determined the range of time in stage N1 and N3 sleep scored by a large number of technologists when compared to reasonably estimated true values. Polysomnograms of 70 females were scored by 10 highly trained sleep technologists, two each from five different academic sleep laboratories. Range and confidence interval (CI = difference between the 5th and 95th percentiles) of the 10 times spent in stage N1 and N3 sleep assigned in each polysomnogram were determined. Average values of times spent in stage N1 and N3 sleep generated by the 10 technologists in each polysomnogram were considered representative of the true values for the individual polysomnogram. Accuracy of different technologists in estimating delta wave duration was determined by comparing their scores to digitally determined durations. The CI range of the ten N1 scores was 4 to 39 percent of total sleep time (% TST) in different polysomnograms (mean CI ± standard deviation = 11.1 ± 7.1 % TST). Corresponding range for N3 was 1 to 28 % TST (14.4 ± 6.1 % TST). For stage N1 and N3 sleep, very low or very high values were reported for virtually all polysomnograms by different technologists. Technologists varied widely in their assignment of stage N3 sleep, scoring that stage when the digitally determined time of delta waves ranged from 3 to 17 seconds. Manual scoring of non-rapid eye movement sleep stages is highly unreliable among highly trained, experienced technologists. Measures of sleep continuity and depth that are reliable and clinically relevant should be a focus of clinical research. © 2018 American Academy of Sleep Medicine
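The per-recording spread described above (the difference between the 5th and 95th percentiles of the ten technologists' scores) can be sketched as follows. The interpolation rule for percentiles is an assumption, since the abstract does not specify one, and the function name and scores are hypothetical.

```python
import statistics


def percentile_spread(scores):
    """Return the 95th minus the 5th percentile of one recording's scores.

    Uses statistics.quantiles with 20 equal-probability groups, so the first
    cut point is the 5th percentile and the last is the 95th ("inclusive"
    interpolation is an assumed choice).
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    cuts = statistics.quantiles(scores, n=20, method="inclusive")
    return cuts[-1] - cuts[0]


# Hypothetical stage N1 times (% TST) assigned by ten scorers to one recording.
spread = percentile_spread([4.0, 6.5, 7.0, 8.0, 9.5, 10.0, 12.0, 13.5, 15.0, 21.0])
```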
Wang, Chaoyuan; Li, Baoming; Zhang, Guoqiang; Rom, Hans Benny; Strøom, Jan S
2006-09-01
Laboratory experiments were carried out in a wind tunnel with a model of a slurry pit to investigate the characteristics of ammonia emission from dairy cattle buildings with slatted floor designs. Ammonia emission at different temperatures and air velocities over the floor surface above the slurry pit was measured with uniform feces spreading and urine sprinkling on the surface daily. The data were used to improve a model for estimation of ammonia emission from dairy cattle buildings. Estimates from the updated emission model were compared with measured data from five naturally ventilated dairy cattle buildings. The overall measured ammonia emission rates were in the range of 11-88 g per cow per day at air temperatures of 2.3-22.4 degrees C. Ammonia emission rates estimated by the model were in the range of 19-107 g per cow per day for the surveyed buildings. The average ammonia emission estimated by the model was 11% higher than the mean measured value. The results show that the predicted emission patterns generally agree with the measured ones, but the predictions show less variation. The model performance may be improved if the influence of animal activity and management strategy on ammonia emission could be estimated and more reliable data on air velocities in the buildings could be obtained.
Estimate of net trophic transfer efficiency of PCBs to Lake Michigan lake trout from their prey
Madenjian, Charles P.; Hesselberg, Robert J.; DeSorcie, Timothy J.; Schmidt, Larry J.; Stedman, Ralph M.; Quintal, Richard T.; Begnoche, Linda J.; Passino-Reader, Dora R.
1998-01-01
Most of the polychlorinated biphenyl (PCB) body burden accumulated by lake trout (Salvelinus namaycush) from the Laurentian Great Lakes is from their food. We used diet information, PCB determinations in both lake trout and their prey, and bioenergetics modeling to estimate the efficiency with which Lake Michigan lake trout retain PCBs from their food. Our estimates were the most reliable estimates to date because (a) the lake trout and prey fish sampled during our study were all from the same vicinity of the lake, (b) detailed measurements were made on the PCB concentrations of both lake trout and prey fish over wide ranges in fish size, and (c) lake trout diet was analyzed in detail over a wide range of lake trout size. Our estimates of net trophic transfer efficiency of PCBs to lake trout from their prey averaged from 0.73 to 0.89 for lake trout between the ages of 5 and 10 years old. There was no evidence of an upward or downward trend in our estimates of net trophic transfer efficiency for lake trout between the ages of 5 and 10 years old, and therefore this efficiency appeared to be constant over the duration of the lake trout's adult life in the lake. On the basis of our estimates, lake trout retained 80% of the PCBs that are contained within their food.
Validity and Reliability of Assessing Body Composition Using a Mobile Application.
Macdonald, Elizabeth Z; Vehrs, Pat R; Fellingham, Gilbert W; Eggett, Dennis; George, James D; Hager, Ronald
2017-12-01
The purpose of this study was to determine the validity and reliability of the LeanScreen (LS) mobile application that estimates percent body fat (%BF) using estimates of circumferences from photographs. The %BF of 148 weight-stable adults was estimated once using dual-energy x-ray absorptiometry (DXA). Each of two administrators assessed the %BF of each subject twice using the LS app and manually measured circumferences. A mixed-model ANOVA and Bland-Altman analyses were used to compare the estimates of %BF obtained from each method. Interrater and intrarater reliability values were determined using multiple measurements taken by each of the two administrators. The LS app and manually measured circumferences significantly underestimated (P < 0.05) the %BF determined using DXA by an average of -3.26 and -4.82 %BF, respectively. The LS app (6.99 %BF) and manually measured circumferences (6.76 %BF) had large limits of agreement. All interrater and intrarater reliability coefficients of estimates of %BF using the LS app and manually measured circumferences exceeded 0.99. The estimates of %BF from manually measured circumferences and the LS app were highly reliable. However, these field measures are not currently recommended for the assessment of body composition because of significant bias and large limits of agreement.
Language evolution and human history: what a difference a date makes.
Gray, Russell D; Atkinson, Quentin D; Greenhill, Simon J
2011-04-12
Historical inference is at its most powerful when independent lines of evidence can be integrated into a coherent account. Dating linguistic and cultural lineages can potentially play a vital role in the integration of evidence from linguistics, anthropology, archaeology and genetics. Unfortunately, although the comparative method in historical linguistics can provide a relative chronology, it cannot provide absolute date estimates and an alternative approach, called glottochronology, is fundamentally flawed. In this paper we outline how computational phylogenetic methods can reliably estimate language divergence dates and thus help resolve long-standing debates about human prehistory ranging from the origin of the Indo-European language family to the peopling of the Pacific.
NASA Technical Reports Server (NTRS)
Rignot, Eric J.; Zimmermann, Reiner; Oren, Ram
1995-01-01
In the tropical rain forests of Manu, in Peru, where forest biomass ranges from 4 kg/sq m in young forest succession up to 100 kg/sq m in old, undisturbed floodplain stands, the P-band polarimetric radar data gathered in June of 1993 by the AIRSAR (Airborne Synthetic Aperture Radar) instrument separate most major vegetation formations and also perform better than expected in estimating woody biomass. The worldwide need for large scale, updated biomass estimates, achieved with a uniformly applied method, as well as reliable maps of land cover, justifies a more in-depth exploration of long wavelength imaging radar applications for tropical forests inventories.
Language evolution and human history: what a difference a date makes
Gray, Russell D.; Atkinson, Quentin D.; Greenhill, Simon J.
2011-01-01
Historical inference is at its most powerful when independent lines of evidence can be integrated into a coherent account. Dating linguistic and cultural lineages can potentially play a vital role in the integration of evidence from linguistics, anthropology, archaeology and genetics. Unfortunately, although the comparative method in historical linguistics can provide a relative chronology, it cannot provide absolute date estimates and an alternative approach, called glottochronology, is fundamentally flawed. In this paper we outline how computational phylogenetic methods can reliably estimate language divergence dates and thus help resolve long-standing debates about human prehistory ranging from the origin of the Indo-European language family to the peopling of the Pacific. PMID:21357231
Digital Processing Of Young's Fringes In Speckle Photography
NASA Astrophysics Data System (ADS)
Chen, D. J.; Chiang, F. P.
1989-01-01
A new technique for fully automatic diffraction fringe measurement in point-wise speckle photograph analysis is presented in this paper. The fringe orientation and spacing are initially estimated with the help of a 1-D FFT. A 2-D convolution filter is then applied to enhance the estimated image. A fringe pattern with a high signal-to-noise ratio (SNR) is achieved, which makes precise determination of the displacement components feasible. The halo effect is also optimally eliminated in a new way. The computation time compares favorably with those of the 2-D autocorrelation method and the iterative 2-D FFT method. High reliability and accurate determination of displacement components are achieved over a wide range of fringe density.
Sensitivity and systematics of calorimetric neutrino mass experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nucciotti, A.; Cremonesi, O.; Ferri, E.
2009-12-16
A large calorimetric neutrino mass experiment using thermal detectors is expected to play a crucial role in the challenge of directly assessing the neutrino mass. We discuss and compare here two approaches for the estimation of the experimental sensitivity of such an experiment. The first method uses an analytic formulation and readily yields a close estimate over a wide range of experimental configurations. The second method is based on a Monte Carlo technique and is more precise and reliable. The Monte Carlo approach is then exploited to study some sources of systematic uncertainties peculiar to calorimetric experiments. Finally, the tools are applied to investigate the optimal experimental configuration of the MARE project.
Salicylate-induced changes in auditory thresholds of adolescent and adult rats.
Brennan, J F; Brown, C A; Jastreboff, P J
1996-01-01
Shifts in auditory intensity thresholds after salicylate administration were examined in postweanling and adult pigmented rats at frequencies ranging from 1 to 35 kHz. A total of 132 subjects from both age levels were tested under two-way active avoidance or one-way active avoidance paradigms. Estimated thresholds were inferred from behavioral responses to presentations of descending and ascending series of intensities for each test frequency value. Reliable threshold estimates were found under both avoidance conditioning methods, and compared to controls, subjects at both age levels showed threshold shifts at selective higher frequency values after salicylate injection, and the extent of shifts was related to salicylate dose level.
Zhang, Rui; Yao, Enjian; Yang, Yang
2017-01-01
Introducing electric vehicles (EVs) into urban transportation networks places higher requirements on travel time reliability and charging reliability. Specifically, it is believed that travel time reliability is a key factor influencing travelers’ route choice. Meanwhile, due to the limited cruising range, EV drivers need to know the energy required for the whole trip in order to decide whether and where to charge (i.e., charging reliability). Since EV energy consumption is highly related to travel speed, network uncertainty significantly affects travel time and charging demand estimation. Considering the network uncertainty resulting from link degradation, which influences the distribution of travel demand on the transportation network and the energy demand on the power network, this paper aims to develop a reliability-based network equilibrium framework for accommodating degradable road conditions with the addition of EVs. First, based on the link travel time distribution, the mean and variance of route travel time and of the monetary expenses related to energy consumption are derived. The charging time distribution of EVs with charging demand is also estimated. Then, a nested structure is used to deal with the difference in route choice behavior caused by the different degrees of uncertainty between routes with and without degradable links. Given the expected generalized travel cost and a psychological safety margin, a traffic assignment model with the addition of EVs is formulated. Subsequently, a heuristic solution algorithm is developed to solve the proposed model. Finally, the effects of travelers’ risk attitude, network degradation degree, and EV penetration rate on network performance are illustrated through an example network. 
The numerical results show that the difference of travelers’ risk attitudes does have impact on the route choice, and the widespread adoption of EVs can cut down the total system travel cost effectively when the transportation network is more reliable. PMID:28886167
Gill, Tiffany K; Tucker, Graeme R; Avery, Jodie C; Shanahan, E Michael; Menz, Hylton B; Taylor, Anne W; Adams, Robert J; Hill, Catherine L
2016-02-24
Case definition has long been an issue for comparability of results obtained for musculoskeletal pain prevalence, however the test-retest reliability of questions used to determine joint pain prevalence has not been examined. The objective of this study was to determine question reliability and the impact of question wording, ordering and the time between questions on responses. A Computer Assisted Telephone Interviewing (CATI) survey was used to re-administer questions collected as part of a population-based longitudinal cohort study. On two different occasions questions were asked of the same sample of 203 community dwelling respondents (which were initially randomly selected) aged 18 years and over at two time points 14 to 27 days apart (average 15 days). Reliability of the questions was assessed using Cohen's kappa (κ) and intraclass correlation coefficient (ICC) and whether question wording and period effects existed was assessed using a crossover design. The self-reported prevalence of doctor diagnosed arthritis demonstrated excellent reliability (κ = 0.84 and κ = 0.79 for questionnaires 1 and 2 respectively). The reliability of questions relating to musculoskeletal pain and/or stiffness ranged from moderate to excellent for both types of questions, that is, those related to ever having joint pain on most days for at least a month (κ = 0.52 to κ = 0.95) and having pain and/or stiffness on most days for the last month (κ = 0.52 to κ = 0.90). However there was an effect of question wording on the results obtained for hand, foot and back pain and/or stiffness indicating that the area of pain may influence prevalence estimates. Joint pain and stiffness questions are reliable and can be used to determine prevalence. However, question wording and pain area may impact on estimates with issues such as pain perception and effect on activities playing a possible role in the recall of musculoskeletal pain.
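The kappa statistic used in the abstract above can be illustrated with a minimal sketch. The answers below are hypothetical, not data from the study, and the function name `cohens_kappa` is introduced here only for illustration. It compares the observed agreement between two interview occasions with the agreement expected by chance from the marginal answer frequencies.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two paired lists of categorical answers."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    n = len(first)
    # Proportion of respondents who gave the same answer on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal answer frequencies.
    counts_1, counts_2 = Counter(first), Counter(second)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1.0:
        return 1.0  # only one category was ever used on both occasions
    return (observed - expected) / (1.0 - expected)


# Hypothetical test-retest answers to a yes/no joint pain question.
interview_1 = ["yes", "yes", "no", "no", "no", "yes", "no", "no", "yes", "no"]
interview_2 = ["yes", "no", "no", "no", "no", "yes", "no", "no", "yes", "no"]
kappa = cohens_kappa(interview_1, interview_2)
print(round(kappa, 3))
```

In this made-up sample nine of ten answers agree (0.90), chance agreement is 0.54, and kappa is therefore 0.36 / 0.46, or about 0.78.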
Pailian, Hrag; Halberda, Justin
2015-04-01
We investigated the psychometric properties of the one-shot change detection task for estimating visual working memory (VWM) storage capacity-and also introduced and tested an alternative flicker change detection task for estimating these limits. In three experiments, we found that the one-shot whole-display task returns estimates of VWM storage capacity (K) that are unreliable across set sizes-suggesting that the whole-display task is measuring different things at different set sizes. In two additional experiments, we found that the one-shot single-probe variant shows improvements in the reliability and consistency of K estimates. In another additional experiment, we found that a one-shot whole-display-with-click task (requiring target localization) also showed improvements in reliability and consistency. The latter results suggest that the one-shot task can return reliable and consistent estimates of VWM storage capacity (K), and they highlight the possibility that the requirement to localize the changed target is what engenders this enhancement. Through a final series of four experiments, we introduced and tested an alternative flicker change detection method that also requires the observer to localize the changing target and that generates, from response times, an estimate of VWM storage capacity (K). We found that estimates of K from the flicker task correlated with estimates from the traditional one-shot task and also had high reliability and consistency. We highlight the flicker method's ability to estimate executive functions as well as VWM storage capacity, and discuss the potential for measuring multiple abilities with the one-shot and flicker tasks.
Zhang, Man; Castaneda, Benjamin; Wu, Zhe; Nigwekar, Priya; Joseph, Jean V.; Rubens, Deborah J.; Parker, Kevin J.
2007-01-01
Biomechanical properties of soft tissues are important for a wide range of medical applications, such as surgical simulation and planning and detection of lesions by elasticity imaging modalities. Currently, the data in the literature is limited and conflicting. Furthermore, to assess the biomechanical properties of living tissue in vivo, reliable imaging-based estimators must be developed and verified. For these reasons we developed and compared two independent quantitative methods – crawling wave estimator (CRE) and mechanical measurement (MM) for soft tissue characterization. The CRE method images shear wave interference patterns from which the shear wave velocity can be determined and hence the Young’s modulus can be obtained. The MM method provides the complex Young’s modulus of the soft tissue from which both elastic and viscous behavior can be extracted. This article presents the systematic comparison between these two techniques on the measurement of gelatin phantom, veal liver, thermal-treated veal liver, and human prostate. It was observed that the Young’s moduli of liver and prostate tissues slightly increase with frequency. The experimental results of the two methods are highly congruent, suggesting CRE and MM methods can be reliably used to investigate viscoelastic properties of other soft tissues, with CRE having the advantages of operating in nearly real time and in situ. PMID:17604902
Sanaei Nezhad, Faezeh; Anton, Adriana; Michou, Emilia; Jung, JeYoung; Parkes, Laura M.
2017-01-01
γ‐Aminobutyric acid (GABA) and glutamate (Glu), major neurotransmitters in the brain, are recycled through glutamine (Gln). All three metabolites can be measured by magnetic resonance spectroscopy in vivo, although GABA measurement at 3 T requires an extra editing acquisition, such as Mescher–Garwood point‐resolved spectroscopy (MEGA‐PRESS). In a GABA‐edited MEGA‐PRESS spectrum, Glu and Gln co‐edit with GABA, providing the possibility to measure all three in one acquisition. In this study, we investigated the reliability of the composite Glu + Gln (Glx) peak estimation and the possibility of Glu and Gln separation in GABA‐edited MEGA‐PRESS spectra. The data acquired in vivo were used to develop a quality assessment framework which identified MEGA‐PRESS spectra in which Glu and Gln could be estimated reliably. Phantoms containing Glu, Gln, GABA and N‐acetylaspartate (NAA) at different concentrations were scanned using GABA‐edited MEGA‐PRESS at 3 T. Fifty‐six sets of spectra in five brain regions were acquired from 36 healthy volunteers. Based on the Glu/Gln ratio, data were classified as either within or outside the physiological range. A peak‐by‐peak quality assessment was performed on all data to investigate whether quality metrics can discriminate between these two classes of spectra. The quality metrics were as follows: the GABA signal‐to‐noise ratio, the NAA linewidth and the Glx Cramer–Rao lower bound (CRLB). The Glu and Gln concentrations were estimated with precision across all phantoms with a linear relationship between the measured and true concentrations: R 1 = 0.95 for Glu and R 1 = 0.91 for Gln. A quality assessment framework was set based on the criteria necessary for a good GABA‐edited MEGA‐PRESS spectrum. Simultaneous criteria of NAA linewidth <8 Hz and Glx CRLB <16% were defined as optimum features for reliable Glu and Gln quantification. Glu and Gln can be reliably quantified from GABA‐edited MEGA‐PRESS acquisitions. 
However, this reliability should be controlled using the quality assessment methods suggested in this work. PMID:29130590
Spatial Interpolation of Rain-field Dynamic Time-Space Evolution in Hong Kong
NASA Astrophysics Data System (ADS)
Liu, P.; Tung, Y. K.
2017-12-01
Accurate and reliable measurement and prediction of spatial and temporal distribution of rain-field over a wide range of scales are important topics in hydrologic investigations. In this study, geostatistical treatment of precipitation field is adopted. To estimate the rainfall intensity over a study domain with the sample values and the spatial structure from the radar data, the cumulative distribution functions (CDFs) at all unsampled locations were estimated. Indicator Kriging (IK) was used to estimate the exceedance probabilities for different pre-selected cutoff levels and a procedure was implemented for interpolating CDF values between the thresholds that were derived from the IK. Different interpolation schemes of the CDF were proposed and their influences on the performance were also investigated. The performance measures and visual comparison between the observed rain-field and the IK-based estimation suggested that the proposed method can provide fine results of estimation of indicator variables and is capable of producing realistic image.
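The step of interpolating CDF values between indicator-kriging thresholds might look like the following sketch. It assumes plain linear interpolation and a simple running-maximum correction for order-relation violations. Both are illustrative choices rather than the schemes compared by the authors, and the cutoffs and probabilities are hypothetical.

```python
from bisect import bisect_right


def cdf_from_exceedance(exceedance):
    """Turn exceedance probabilities P(Z > cutoff) into a valid CDF.

    Values are clipped to [0, 1] and forced to be non-decreasing with a
    running maximum (an assumed, simple order-relation correction).
    """
    cdf, running_max = [], 0.0
    for p in exceedance:
        value = min(1.0, max(0.0, 1.0 - p))
        running_max = max(running_max, value)
        cdf.append(running_max)
    return cdf


def interpolate_cdf(cutoffs, cdf, z):
    """Linearly interpolate the CDF at rainfall intensity z."""
    if z <= cutoffs[0]:
        return cdf[0]
    if z >= cutoffs[-1]:
        return cdf[-1]
    hi = bisect_right(cutoffs, z)  # first cutoff strictly above z
    lo = hi - 1
    weight = (z - cutoffs[lo]) / (cutoffs[hi] - cutoffs[lo])
    return cdf[lo] + weight * (cdf[hi] - cdf[lo])


# Hypothetical cutoffs (mm/h) and kriged exceedance probabilities at one
# unsampled location; the third value violates the expected ordering.
cutoffs = [1.0, 5.0, 10.0, 20.0]
exceedance = [0.70, 0.40, 0.45, 0.05]
cdf = cdf_from_exceedance(exceedance)
p_below_15 = interpolate_cdf(cutoffs, cdf, 15.0)
print(cdf, p_below_15)
```

Other interpolation schemes (for example, tail models beyond the last cutoff) would replace only `interpolate_cdf`; the correction step stays the same.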
A semi-automatic method for left ventricle volume estimate: an in vivo validation study
NASA Technical Reports Server (NTRS)
Corsi, C.; Lamberti, C.; Sarti, A.; Saracino, G.; Shiota, T.; Thomas, J. D.
2001-01-01
This study aims to validate left ventricular (LV) volume estimates obtained by processing volumetric data with a segmentation model based on the level set technique. The validation was performed by comparing real-time volumetric echo data (RT3DE) and magnetic resonance (MRI) data. A validation protocol was defined and applied to twenty-four estimates (range 61-467 ml) obtained from normal and pathologic subjects who underwent both RT3DE and MRI. A statistical analysis was performed on each estimate and on clinical parameters such as stroke volume (SV) and ejection fraction (EF). Assuming MRI estimates (x) as a reference, an excellent correlation was found with volume measured by utilizing the segmentation procedure (y) (y=0.89x + 13.78, r=0.98). The mean error on SV was 8 ml and the mean error on EF was 2%. This study demonstrated that the segmentation technique is reliably applicable on human hearts in clinical practice.
Comparing top-down and bottom-up costing approaches for economic evaluation within social welfare.
Olsson, Tina M
2011-10-01
This study compares two approaches to the estimation of social welfare intervention costs: one "top-down" and the other "bottom-up" for a group of social welfare clients with severe problem behavior participating in a randomized trial. Intervention costs ranging over a two-year period were compared by intervention category (foster care placement, institutional placement, mentorship services, individual support services and structured support services), estimation method (price, micro costing, average cost) and treatment group (intervention, control). Analyses are based upon 2007 costs for 156 individuals receiving 404 interventions. Overall, both approaches were found to produce reliable estimates of intervention costs at the group level but not at the individual level. As choice of approach can greatly impact the estimate of mean difference, adjustment based on estimation approach should be incorporated into sensitivity analyses. Analysts must take care in assessing the purpose and perspective of the analysis when choosing a costing approach for use within economic evaluation.
A Laboratory Study on the Reliability Estimations of the Mini-CEX
ERIC Educational Resources Information Center
de Lima, Alberto Alves; Conde, Diego; Costabel, Juan; Corso, Juan; Van der Vleuten, Cees
2013-01-01
Reliability estimations of workplace-based assessments with the mini-CEX are typically based on real-life data. Estimations are based on the assumption of local independence: the object of the measurement should not be influenced by the measurement itself and samples should be completely independent. This is difficult to achieve. Furthermore, the…
Comparability and Reliability Considerations of Adequate Yearly Progress
ERIC Educational Resources Information Center
Maier, Kimberly S.; Maiti, Tapabrata; Dass, Sarat C.; Lim, Chae Young
2012-01-01
The purpose of this study is to develop an estimate of Adequate Yearly Progress (AYP) that will allow for reliable and valid comparisons among student subgroups, schools, and districts. A shrinkage-type estimator of AYP using the Bayesian framework is described. Using simulated data, the performance of the Bayes estimator will be compared to…
Sample Size for Estimation of G and Phi Coefficients in Generalizability Theory
ERIC Educational Resources Information Center
Atilgan, Hakan
2013-01-01
Problem Statement: Reliability, which refers to the degree to which measurement results are free from measurement errors, as well as its estimation, is an important issue in psychometrics. Several methods for estimating reliability have been suggested by various theories in the field of psychometrics. One of these theories is the generalizability…
Software For Computing Reliability Of Other Software
NASA Technical Reports Server (NTRS)
Nikora, Allen; Antczak, Thomas M.; Lyu, Michael
1995-01-01
Computer Aided Software Reliability Estimation (CASRE) computer program developed for use in measuring reliability of other software. Easier for non-specialists in reliability to use than many other currently available programs developed for same purpose. CASRE incorporates mathematical modeling capabilities of public-domain Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) computer program and runs in Windows software environment. Provides menu-driven command interface; enabling and disabling of menu options guides user through (1) selection of set of failure data, (2) execution of mathematical model, and (3) analysis of results from model. Written in C language.
Observer variability in estimating numbers: An experiment
Erwin, R.M.
1982-01-01
Census estimates of bird populations provide an essential framework for a host of research and management questions. However, with some exceptions, the reliability of numerical estimates and the factors influencing them have received insufficient attention. Independent of the problems associated with habitat type, weather conditions, cryptic coloration, etc., estimates may vary widely due only to intrinsic differences in observers' abilities to estimate numbers. Lessons learned in the field of perceptual psychology may be usefully applied to 'real world' problems in field ornithology. Based largely on dot discrimination tests in the laboratory, it was found that numerical abundance, density of objects, spatial configuration, color, background, and other variables influence individual accuracy in estimating numbers. The primary purpose of the present experiment was to assess the effects of observer, prior experience, and numerical range on accuracy in estimating numbers of waterfowl from black-and-white photographs. By using photographs of animals rather than black dots, I felt the results could be applied more meaningfully to field situations. Further, reinforcement was provided throughout some experiments to examine the influence of training on accuracy.
Multi-particle three-dimensional coordinate estimation in real-time optical manipulation
NASA Astrophysics Data System (ADS)
Dam, J. S.; Perch-Nielsen, I.; Palima, D.; Gluckstad, J.
2009-11-01
We have previously shown how stereoscopic images can be obtained in our three-dimensional optical micromanipulation system [J. S. Dam et al, Opt. Express 16, 7244 (2008)]. Here, we present an extension and application of this principle to automatically gather the three-dimensional coordinates for all trapped particles with high tracking range and high reliability without requiring user calibration. Through deconvolving of the red, green, and blue colour planes to correct for bleeding between colour planes, we show that we can extend the system to also utilize green illumination, in addition to the blue and red. Applying the green colour as on-axis illumination yields redundant information for enhanced error correction, which is used to verify the gathered data, resulting in reliable coordinates as well as producing visually attractive images.
Hatch, Christine E; Fisher, Andrew T.; Revenaugh, Justin S.; Constantz, Jim; Ruehl, Chris
2006-01-01
We present a method for determining streambed seepage rates using time series thermal data. The new method is based on quantifying changes in phase and amplitude of temperature variations between pairs of subsurface sensors. For a reasonable range of streambed thermal properties and sensor spacings the time series method should allow reliable estimation of seepage rates for a range of at least ±10 m d−1 (±1.2 × 10−4 m s−1), with amplitude variations being most sensitive at low flow rates and phase variations retaining sensitivity out to much higher rates. Compared to forward modeling, the new method requires less observational data and less setup and data handling and is faster, particularly when interpreting many long data sets. The time series method is insensitive to streambed scour and sedimentation, which allows for application under a wide range of flow conditions and allows time series estimation of variable streambed hydraulic conductivity. This new approach should facilitate wider use of thermal methods and improve understanding of the complex spatial and temporal dynamics of surface water–groundwater interactions.
REVERBERATION AND PHOTOIONIZATION ESTIMATES OF THE BROAD-LINE REGION RADIUS IN LOW-z QUASARS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Negrete, C. Alenka; Dultzin, Deborah; Marziani, Paola
2013-07-01
Black hole mass estimation in quasars, especially at high redshift, involves the use of single-epoch spectra with signal-to-noise ratio and resolution that permit accurate measurement of the width of a broad line assumed to be a reliable virial estimator. Coupled with an estimate of the radius of the broad-line region (BLR) this yields the black hole mass M_BH. The radius of the BLR may be inferred from an extrapolation of the correlation between source luminosity and reverberation-derived r_BLR measures (the so-called Kaspi relation involving about 60 low-z sources). We are exploring a different method for estimating r_BLR directly from inferred physical conditions in the BLR of each source. We report here on a comparison of r_BLR estimates that come from our method and from reverberation mapping. Our "photoionization" method employs diagnostic line intensity ratios in the rest-frame range 1400-2000 Å (Al III λ1860/Si III] λ1892, C IV λ1549/Al III λ1860) that enable derivation of the product of density and ionization parameter with the BLR distance derived from the definition of the ionization parameter. We find good agreement between our estimates of the density, ionization parameter, and r_BLR and those from reverberation mapping. We suggest empirical corrections to improve the agreement between individual photoionization-derived r_BLR values and those obtained from reverberation mapping. The results in this paper can be exploited to estimate M_BH for large samples of high-z quasars using an appropriate virial broadening estimator. We show that the width of the UV intermediate emission lines are consistent with the width of Hβ, thereby providing a reliable virial broadening estimator that can be measured in large samples of high-z quasars.
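How a BLR radius and a line width combine into a black hole mass can be sketched with the standard virial relation M_BH = f r_BLR (ΔV)^2 / G. The scale factor f = 1, the input values, and the function name below are illustrative assumptions; the abstract does not give the calibration the authors use.

```python
G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0          # speed of light, m/s
SOLAR_MASS = 1.989e30      # kg
SECONDS_PER_DAY = 86_400.0


def virial_mass_solar(r_blr_light_days, line_width_km_s, f=1.0):
    """Virial black hole mass in solar masses.

    r_blr_light_days: BLR radius in light-days.
    line_width_km_s:  broad-line velocity width in km/s.
    f:                dimensionless scale factor (assumed to be 1 here).
    """
    r_m = r_blr_light_days * SECONDS_PER_DAY * C   # radius in metres
    v_m_s = line_width_km_s * 1.0e3                # width in m/s
    return f * r_m * v_m_s ** 2 / (G * SOLAR_MASS)


# Hypothetical source: r_BLR = 10 light-days, line width = 3000 km/s.
mass = virial_mass_solar(10.0, 3000.0)
print(f"{mass:.2e} solar masses")
```

Because the mass scales linearly with r_BLR and quadratically with the line width, errors in the width measurement dominate the uncertainty of such an estimate.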
Clarke, Diana E; Narrow, William E; Regier, Darrel A; Kuramoto, S Janet; Kupfer, David J; Kuhl, Emily A; Greiner, Lisa; Kraemer, Helena C
2013-01-01
This article discusses the design, sampling strategy, implementation, and data analytic processes of the DSM-5 Field Trials. The DSM-5 Field Trials were conducted by using a test-retest reliability design with a stratified sampling approach across six adult and four pediatric sites in the United States and one adult site in Canada. A stratified random sampling approach was used to enhance precision in the estimation of the reliability coefficients. A web-based research electronic data capture system was used for simultaneous data collection from patients and clinicians across sites and for centralized data management. Weighted descriptive analyses, intraclass kappa and intraclass correlation coefficients for stratified samples, and receiver operating curves were computed. The DSM-5 Field Trials capitalized on advances since DSM-III and DSM-IV in statistical measures of reliability (i.e., intraclass kappa for stratified samples) and other recently developed measures to determine confidence intervals around kappa estimates. Diagnostic interviews using DSM-5 criteria were conducted by 279 clinicians of varied disciplines who received training comparable to what would be available to any clinician after publication of DSM-5. Overall, 2,246 patients with various diagnoses and levels of comorbidity were enrolled, of which over 86% were seen for two diagnostic interviews. A range of reliability coefficients were observed for the categorical diagnoses and dimensional measures. Multisite field trials and training comparable to what would be available to any clinician after publication of DSM-5 provided “real-world” testing of DSM-5 proposed diagnoses.
Reliability of the Language ENvironment Analysis system (LENA™) in European French.
Canault, Mélanie; Le Normand, Marie-Thérèse; Foudil, Samy; Loundon, Natalie; Thai-Van, Hung
2016-09-01
In this study, we examined the accuracy of the Language ENvironment Analysis (LENA) system in European French. LENA is a digital recording device with software that facilitates the collection and analysis of audio recordings from young children, providing automated measures of the speech overheard and produced by the child. Eighteen native French-speaking children, who were divided into six age groups ranging from 3 to 48 months old, were recorded about 10-16 h per day, three days a week. A total of 324 samples (six 10-min chunks of recordings) were selected and then transcribed according to the CHAT format. Simple and mixed linear models between the LENA and human adult word count (AWC) and child vocalization count (CVC) estimates were performed, to determine to what extent the automatic and the human methods agreed. Both the AWC and CVC estimates were very reliable (r = .64 and .71, respectively) for the 324 samples. When controlling the random factors of participants and recordings, 1 h was sufficient to obtain a reliable sample. It was, however, found that two age groups (7-12 months and 13-18 months) had a significant effect on the AWC data and that the second day of recording had a significant effect on the CVC data. When noise-related factors were added to the model, only a significant effect of signal-to-noise ratio was found on the AWC data. All of these findings and their clinical implications are discussed, providing strong support for the reliability of LENA in French.
Bidulescu, Aurelian; Chambless, Lloyd E; Siega-Riz, Anna Maria; Zeisel, Steven H; Heiss, Gerardo
2009-02-20
The repeatability of a risk factor measurement affects the ability to accurately ascertain its association with a specific outcome. Choline is involved in methylation of homocysteine, a putative risk factor for cardiovascular disease, to methionine through a betaine-dependent pathway (one-carbon metabolism). It is unknown whether dietary intake of choline meets the recommended Adequate Intake (AI) proposed for choline (550 mg/day for men and 425 mg/day for women). The Estimated Average Requirement (EAR) remains to be established in population settings. Our objectives were to ascertain the reliability of choline and related nutrients (folate and methionine) intakes assessed with a brief food frequency questionnaire (FFQ) and to estimate dietary intake of choline and betaine in a bi-ethnic population. We estimated the FFQ dietary instrument reliability for the Atherosclerosis Risk in Communities (ARIC) study and the measurement error for choline and related nutrients from a stratified random sample of the ARIC study participants at the second visit, 1990-92 (N = 1,004). In ARIC, a population-based cohort of 15,792 men and women aged 45-64 years (1987-89) recruited at four locales in the U.S., diet was assessed in 15,706 baseline study participants using a version of the Willett 61-item FFQ, expanded to include some ethnic foods. Intraindividual variability for choline, folate and methionine were estimated using mixed models regression. Measurement error was substantial for the nutrients considered. The reliability coefficients were 0.50 for choline (0.50 for choline plus betaine), 0.53 for folate, 0.48 for methionine and 0.43 for total energy intake. In the ARIC population, the median and the 75th percentile of dietary choline intake were 284 mg/day and 367 mg/day, respectively. 94% of men and 89% of women had an intake of choline below that proposed as AI. African Americans had a lower dietary intake of choline in both genders. 
The three-year reliability of reported dietary intake was similar for choline and related nutrients, in the range as that published in the literature for other micronutrients. Using a brief FFQ to estimate intake, the majority of individuals in the ARIC cohort had an intake of choline below the values proposed as AI.
Bidulescu, Aurelian; Chambless, Lloyd E; Siega-Riz, Anna Maria; Zeisel, Steven H; Heiss, Gerardo
2009-01-01
Background The repeatability of a risk factor measurement affects the ability to accurately ascertain its association with a specific outcome. Choline is involved in methylation of homocysteine, a putative risk factor for cardiovascular disease, to methionine through a betaine-dependent pathway (one-carbon metabolism). It is unknown whether dietary intake of choline meets the recommended Adequate Intake (AI) proposed for choline (550 mg/day for men and 425 mg/day for women). The Estimated Average Requirement (EAR) remains to be established in population settings. Our objectives were to ascertain the reliability of choline and related nutrients (folate and methionine) intakes assessed with a brief food frequency questionnaire (FFQ) and to estimate dietary intake of choline and betaine in a bi-ethnic population. Methods We estimated the FFQ dietary instrument reliability for the Atherosclerosis Risk in Communities (ARIC) study and the measurement error for choline and related nutrients from a stratified random sample of the ARIC study participants at the second visit, 1990–92 (N = 1,004). In ARIC, a population-based cohort of 15,792 men and women aged 45–64 years (1987–89) recruited at four locales in the U.S., diet was assessed in 15,706 baseline study participants using a version of the Willett 61-item FFQ, expanded to include some ethnic foods. Intraindividual variability for choline, folate and methionine were estimated using mixed models regression. Results Measurement error was substantial for the nutrients considered. The reliability coefficients were 0.50 for choline (0.50 for choline plus betaine), 0.53 for folate, 0.48 for methionine and 0.43 for total energy intake. In the ARIC population, the median and the 75th percentile of dietary choline intake were 284 mg/day and 367 mg/day, respectively. 94% of men and 89% of women had an intake of choline below that proposed as AI. African Americans had a lower dietary intake of choline in both genders. 
Conclusion The three-year reliability of reported dietary intake was similar for choline and related nutrients, in the range as that published in the literature for other micronutrients. Using a brief FFQ to estimate intake, the majority of individuals in the ARIC cohort had an intake of choline below the values proposed as AI. PMID:19232103
Bazzo, Stefania; Battistella, Giuseppe; Riscica, Patrizia; Moino, Giuliana; Dal Pozzo, Giuseppe; Bottarel, Mery; Geromel, Mariasole; Czerwinsky, Loredana
2015-01-01
Alcohol consumption during pregnancy can result in a range of harmful effects on the developing foetus and newborn, called Fetal Alcohol Spectrum Disorders (FASD). Identifying pregnant women who use alcohol makes it possible to provide information, support and treatment for the women and surveillance of their children. The AUDIT-C (the shortened consumption version of the Alcohol Use Disorders Identification Test) is used for investigating risky drinking in different populations, and has also been applied to estimate alcohol use and risky drinking in antenatal clinics. The aim of the study was to investigate the reliability of a self-report Italian version of the AUDIT-C questionnaire to detect alcohol consumption during pregnancy, regardless of its use as a screening tool. The questionnaire was filled in by two independent consecutive series of pregnant women at the 38th gestation week visit in the two birth locations of the Local Health Authority of Treviso (Italy), during the years 2010 and 2011 (n=220 and n=239). Reliability analysis was performed using internal consistency, item-total score correlations, and inter-item correlations. The "discriminatory power" of the test was also evaluated. Results. Overall, about one third of women recalled alcohol consumption at least once during the current pregnancy. The questionnaire had an internal consistency of 0.565 for the group of the year 2010, of 0.516 for the year 2011, and of 0.542 for the overall group. The highest item-total correlation coefficient was 0.687 and the highest inter-item correlation coefficient was 0.675. As for the discriminatory power of the questionnaire, the highest Ferguson's delta coefficient was 0.623. These findings suggest that the Italian self-report version of the AUDIT-C possesses unsatisfactory reliability to estimate alcohol consumption during pregnancy when used as a self-report questionnaire in an obstetric setting.
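The abstract above does not name its internal consistency coefficient. Assuming it is Cronbach's alpha, which is the most common choice but is an assumption here, the calculation for a three-item questionnaire can be sketched as follows. The scores are hypothetical and are not data from the study.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    # Variance of the total score across respondents.
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores on three items for six respondents.
scores = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [2, 1, 1],
    [2, 2, 1],
    [3, 1, 2],
]
alpha = cronbach_alpha(scores)
print(alpha)
```

With only three items, alpha is sensitive to each inter-item correlation, which is one reason short scales such as this often show modest internal consistency.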
Measuring sleep habits without using a diary: the sleep timing questionnaire
NASA Technical Reports Server (NTRS)
Monk, Timothy H.; Buysse, Daniel J.; Kennedy, Kathy S.; Pods, Jaime M.; DeGrazia, Jean M.; Miewald, Jean M.
2003-01-01
STUDY OBJECTIVES: To develop a single-administration instrument yielding equivalent measures of sleep to those obtained from a formal (2-week) sleep diary. DESIGN & SETTING: A single-administration Sleep Timing Questionnaire (STQ) is described (and reproduced in the Appendix). Test-retest reliability was examined in 40 subjects who were given the STQ on two occasions separated by less than 1 year. Convergent validity was measured both by comparing STQ-derived measures with objective measures derived from wrist actigraphy (n=23) and by comparing STQ-derived measures with other subjective measures derived from a detailed 2-week sleep diary in two nonoverlapping samples (n=101, 93). Correlations of STQ measures with age and morningness-eveningness (chronotype) were also examined. SUBJECTS: The analyses used sample sizes of 40, 23, 101, and 93 (both genders, overall age range 20y-89y). Most subjects were healthy volunteers; some Study 4 subjects were patients (enrolled in research protocols). RESULTS: Test-retest reliability for the STQ was demonstrated for estimates of bedtime (r = 0.705, p < 0.001) and waketime (r = 0.826, p < 0.001). Convergent validity using wrist actigraphy was demonstrated by correlations of 0.592 (p < 0.005) for bedtime, and of 0.769 (p < 0.001) for waketime. Diary studies indicated STQ bedtime and waketime data to be highly correlated (at about 0.8) with those obtained from a formal 2-week sleep diary. The STQ also provided data on estimated sleep latency and wake after sleep onset (WASO), which correlated reliably (at about 0.7) with average nightly ratings of these variables from a 2-week sleep diary. Mean estimated values of sleep latency and WASO from the two instruments were within 1 minute of each other. STQ-derived bedtimes and waketimes correlated with both age and chronotype in the expected direction (older subjects earlier, morning types earlier). 
CONCLUSION: The STQ may be a reliable valid measure of sleep timing that could provide a time-efficient alternative to traditional sleep diaries.
Reducing random measurement error in assessing postural load on the back in epidemiologic surveys.
Burdorf, A
1995-02-01
The goal of this study was to design strategies to assess postural load on the back in occupational epidemiology by taking into account the reliability of measurement methods and the variability of exposure among the workers under study. Intermethod reliability studies were evaluated to estimate the systematic bias (accuracy) and random measurement error (precision) of various methods to assess postural load on the back. Intramethod reliability studies were reviewed to estimate random variability of back load over time. Intermethod surveys have shown that questionnaires have a moderate reliability for gross activities such as sitting, whereas duration of trunk flexion and rotation should be assessed by observation methods or inclinometers. Intramethod surveys indicate that exposure variability can markedly affect the reliability of estimates of back load if the estimates are based upon a single measurement over a certain time period. Equations have been presented to evaluate various study designs according to the reliability of the measurement method, the optimum allocation of the number of repeated measurements per subject, and the number of subjects in the study. Prior to a large epidemiologic study, an exposure-oriented survey should be conducted to evaluate the performance of measurement instruments and to estimate sources of variability for back load. The strategy for assessing back load can be optimized by balancing the number of workers under study and the number of repeated measurements per worker.
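The trade-off between repeated measurements per worker and measurement reliability can be illustrated with the standard variance-components relation. This is a minimal sketch and not necessarily the set of equations presented in the paper: it assumes a between-worker variance and a within-worker (day-to-day) variance are already known, and the function names are hypothetical.

```python
def reliability_of_mean(var_between: float, var_within: float, k: int) -> float:
    """Reliability of a worker's mean exposure over k repeated measurements.

    Assumes the usual one-way random-effects model: averaging k measurements
    divides the within-worker variance by k.
    """
    if k < 1 or var_between <= 0 or var_within < 0:
        raise ValueError("need k >= 1, var_between > 0, var_within >= 0")
    return var_between / (var_between + var_within / k)


def measurements_needed(var_between: float, var_within: float, target: float) -> int:
    """Smallest k for which the mean of k measurements reaches the target reliability."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    k = 1
    while reliability_of_mean(var_between, var_within, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    # Illustrative variances only; these are not values from the study.
    print(reliability_of_mean(1.0, 2.3, 1))
    print(measurements_needed(1.0, 2.3, 0.8))
```

Under this model, a large within-worker variance relative to the between-worker variance means a single measurement is a poor summary of exposure, which is consistent with the abstract's point about single-measurement estimates.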
The relationship between cost estimates reliability and BIM adoption: SEM analysis
NASA Astrophysics Data System (ADS)
Ismail, N. A. A.; Idris, N. H.; Ramli, H.; Rooshdi, R. R. Raja Muhammad; Sahamir, S. R.
2018-02-01
This paper presents the use of a Structural Equation Modelling (SEM) approach in analysing the effects of Building Information Modelling (BIM) technology adoption in improving the reliability of cost estimates. Based on the questionnaire survey results, SEM analysis using the SPSS-AMOS application examined the relationships between BIM-improved information and cost estimate reliability factors, leading to BIM technology adoption. Six hypotheses were established prior to SEM analysis employing two types of SEM models, namely the Confirmatory Factor Analysis (CFA) model and the full structural model. The SEM models were then validated through the assessment of their uni-dimensionality, validity, reliability, and fitness index, in line with the hypotheses tested. The final SEM model fit measures are: P-value=0.000, RMSEA=0.079<0.08, GFI=0.824, CFI=0.962>0.90, TLI=0.956>0.90, NFI=0.935>0.90 and ChiSq/df=2.259, indicating that the overall index values achieved the required level of model fitness. The model supports all the hypotheses evaluated, confirming that all relationships among the constructs are positive and significant. Ultimately, the analysis verified that most of the respondents foresee a better understanding of project input information through BIM visualization, its reliable database and coordinated data, in developing more reliable cost estimates. They also expect BIM adoption to accelerate their cost-estimating tasks.
Reliability of Space-Shuttle Pressure Vessels with Random Batch Effects
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Kulkarni, Pandurang M.
2000-01-01
In this article we revisit the problem of estimating the joint reliability against failure by stress rupture of a group of fiber-wrapped pressure vessels used on Space-Shuttle missions. The available test data were obtained from an experiment conducted at the U.S. Department of Energy Lawrence Livermore Laboratory (LLL) in which scaled-down vessels were subjected to life testing at four accelerated levels of pressure. We estimate the reliability assuming that both the Shuttle and LLL vessels were chosen at random in a two-stage process from an infinite population with spools of fiber as the primary sampling unit. Two main objectives of this work are: (1) to obtain practical estimates of reliability taking into account random spool effects and (2) to obtain a realistic assessment of estimation accuracy under the random model. Here, reliability is calculated in terms of a 'system' of 22 fiber-wrapped pressure vessels, taking into account typical pressures and exposure times experienced by Shuttle vessels. Comparisons are made with previous studies. The main conclusion of this study is that, although point estimates of reliability are still in the 'comfort zone,' it is advisable to plan for replacement of the pressure vessels well before the expected Lifetime of 100 missions per Shuttle Orbiter. Under a random-spool model, there is simply not enough information in the LLL data to provide reasonable assurance that such replacement would not be necessary.
Luján, Manel; Sogo, Ana; Pomares, Xavier; Monsó, Eduard; Sales, Bernat; Blanch, Lluís
2013-05-01
New home ventilators are able to provide clinicians data of interest through built-in software. Monitoring of tidal volume (VT) is a key point in the assessment of the efficacy of home mechanical ventilation. To assess the reliability of the VT provided by 5 ventilators in a bench test. Five commercial ventilators from 4 different manufacturers were tested in pressure support mode with the help of a breathing simulator under different conditions of mechanical respiratory pattern, inflation pressure, and intentional leakage. Values provided by the built-in software of each ventilator were compared breath to breath with the VT monitored through an external pneumotachograph. Ten breaths for each condition were compared for every tested situation. All tested ventilators underestimated VT (ranges of -21.7 mL to -83.5 mL, which corresponded to -3.6% to -14.7% of the externally measured VT). A direct relationship between leak and underestimation was found in 4 ventilators, with higher underestimations of the VT when the leakage increased, ranging between -2.27% and -5.42% for each 10 L/min increase in the leakage. A ventilator that included an algorithm that computes the pressure loss through the tube as a function of the flow exiting the ventilator had the minimal effect of leaks on the estimation of VT (0.3%). In 3 ventilators the underestimation was also influenced by mechanical pattern (lower underestimation with restrictive, and higher with obstructive). The inclusion of algorithms that calculate the pressure loss as a function of the flow exiting the ventilator in commercial models may increase the reliability of VT estimation.
Equations for estimating Clark Unit-hydrograph parameters for small rural watersheds in Illinois
Straub, Timothy D.; Melching, Charles S.; Kocher, Kyle E.
2000-01-01
Simulation of the measured discharge hydrographs for the verification storms utilizing TC and R obtained from the estimation equations yielded good results. The error in peak discharge for 21 of the 29 verification storms was less than 25 percent, and the error in time-to-peak discharge for 18 of the 29 verification storms also was less than 25 percent. Therefore, applying the estimation equations to determine TC and R for design-storm simulation may result in reliable design hydrographs, as long as the physical characteristics of the watersheds under consideration are within the range of those characteristics for the watersheds in this study [area: 0.02-2.3 mi2, main-channel length: 0.17-3.4 miles, main-channel slope: 10.5-229 feet per mile, and insignificant percentage of impervious cover].
Depth inpainting by tensor voting.
Kulkarni, Mandar; Rajagopalan, Ambasamudram N
2013-06-01
Depth maps captured by range scanning devices or by using optical cameras often suffer from missing regions due to occlusions, reflectivity, limited scanning area, sensor imperfections, etc. In this paper, we propose a fast and reliable algorithm for depth map inpainting using the tensor voting (TV) framework. For less complex missing regions, local edge and depth information is utilized for synthesizing missing values. The depth variations are modeled by local planes using 3D TV, and missing values are estimated using plane equations. For large and complex missing regions, we collect and evaluate depth estimates from self-similar (training) datasets. We align the depth maps of the training set with the target (defective) depth map and evaluate the goodness of depth estimates among candidate values using 3D TV. We demonstrate the effectiveness of the proposed approaches on real as well as synthetic data.
Forecasting overhaul or replacement intervals based on estimated system failure intensity
NASA Astrophysics Data System (ADS)
Gannon, James M.
1994-12-01
System reliability can be expressed in terms of the pattern of failure events over time. Assuming a nonhomogeneous Poisson process and Weibull intensity function for complex repairable system failures, the degree of system deterioration can be approximated. Maximum likelihood estimators (MLE's) for the system Rate of Occurrence of Failure (ROCOF) function are presented. Evaluating the integral of the ROCOF over annual usage intervals yields the expected number of annual system failures. By associating a cost of failure with the expected number of failures, budget and program policy decisions can be made based on expected future maintenance costs. Monte Carlo simulation is used to estimate the range and the distribution of the net present value and internal rate of return of alternative cash flows based on the distributions of the cost inputs and confidence intervals of the MLE's.
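A minimal sketch of the calculation described above, assuming the power-law (Weibull) intensity u(t) = lambda * beta * t^(beta - 1) and a time-truncated observation window. The closed-form estimators used here are the standard ones for that model; the function names and the failure times are hypothetical and are not taken from the report.

```python
import math


def fit_power_law_nhpp(failure_times, end_time):
    """Maximum likelihood estimates (lam, beta) for a time-truncated power-law NHPP.

    failure_times: cumulative system ages at each failure, all in (0, end_time].
    end_time: system age at which observation stopped.
    """
    n = len(failure_times)
    if n == 0 or any(t <= 0 or t > end_time for t in failure_times):
        raise ValueError("need at least one failure time in (0, end_time]")
    log_sum = sum(math.log(end_time / t) for t in failure_times)
    if log_sum == 0:
        raise ValueError("all failures at end_time; beta is not estimable")
    beta = n / log_sum
    lam = n / end_time ** beta
    return lam, beta


def expected_failures(lam, beta, start, stop):
    """Integral of the ROCOF over (start, stop]: lam * (stop^beta - start^beta)."""
    return lam * (stop ** beta - start ** beta)


if __name__ == "__main__":
    # Hypothetical failure ages in operating hours, observed up to 1000 h.
    ages = [100, 300, 500, 650, 800, 900]
    lam, beta = fit_power_law_nhpp(ages, 1000)
    print("beta > 1 suggests deterioration:", beta)
    print("expected failures in next 200 h:", expected_failures(lam, beta, 1000, 1200))
```

Multiplying the expected number of failures in each usage interval by a cost per failure gives the kind of maintenance cost forecast the abstract refers to; the Monte Carlo step over cost distributions is not shown here.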
Psychometrics Matter in Health Behavior: A Long-term Reliability Generalization Study.
Pickett, Andrew C; Valdez, Danny; Barry, Adam E
2017-09-01
Despite numerous calls for increased understanding and reporting of reliability estimates, social science research, including the field of health behavior, has been slow to respond and adopt such practices. Therefore, we offer a brief overview of reliability and common reporting errors; we then perform analyses to examine and demonstrate the variability of reliability estimates by sample and over time. Using meta-analytic reliability generalization, we examined the variability of coefficient alpha scores for a well-designed, consistent, nationwide health study, covering a span of nearly 40 years. For each year and sample, reliability varied. Furthermore, reliability was predicted by a sample characteristic that differed among age groups within each administration. We demonstrated that reliability is influenced by the methods and individuals from which a given sample is drawn. Our work echoes previous calls that psychometric properties, particularly reliability of scores, are important and must be considered and reported before drawing statistical conclusions.
Efficient high-rate satellite clock estimation for PPP ambiguity resolution using carrier-ranges.
Chen, Hua; Jiang, Weiping; Ge, Maorong; Wickert, Jens; Schuh, Harald
2014-11-25
To capture the short-term clock variation of GNSS satellites, clock corrections must be estimated and updated at a high rate for Precise Point Positioning (PPP). This estimation is already very time-consuming for the GPS constellation alone, as a great number of ambiguities need to be estimated simultaneously. However, on the one hand better estimates are expected by including more stations, and on the other hand satellites from different GNSS systems must be processed jointly for a reliable multi-GNSS positioning service. To alleviate the heavy computational burden, epoch-differenced observations are always employed, in which ambiguities are eliminated. Because the epoch-differenced method can only derive temporal clock changes, which then have to be aligned to the absolute clocks in a rather complicated way, this paper proposes an efficient method for high-rate clock estimation using the concept of "carrier-range" realized by means of PPP with integer ambiguity resolution. Processing procedures are developed for both post-processing and real-time processing. The experimental validation shows that the computation time could be reduced to about one sixth of that of the existing methods for post-processing and less than 1 s for processing a single epoch of a network with about 200 stations in real-time mode after all ambiguities are fixed. This confirms that the proposed processing strategy will enable the high-rate clock estimation for future multi-GNSS networks in post-processing and possibly also in real-time mode.
Jette, Alan M.; McDonough, Christine M.; Haley, Stephen M.; Ni, Pengsheng; Olarsch, Sippy; Latham, Nancy; Hambleton, Ronald K.; Felson, David; Kim, Young-jo; Hunter, David
2012-01-01
Objective To develop and evaluate a prototype measure (OA-DISABILITY-CAT) for osteoarthritis research using Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies. Study Design and Setting We constructed an item bank consisting of 33 activities commonly affected by lower extremity (LE) osteoarthritis. A sample of 323 adults with LE osteoarthritis reported their degree of limitation in performing everyday activities and completed the Health Assessment Questionnaire-II (HAQ-II). We used confirmatory factor analyses to assess scale unidimensionality and IRT methods to calibrate the items and examine the fit of the data. Using CAT simulation analyses, we examined the performance of OA-DISABILITY-CATs of different lengths compared to the full item bank and the HAQ-II. Results One distinct disability domain was identified. The 10-item OA-DISABILITY-CAT demonstrated a high degree of accuracy compared with the full item bank (r=0.99). The item bank and the HAQ-II scales covered a similar estimated scoring range. In terms of reliability, 95% of OA-DISABILITY reliability estimates were over 0.83 versus 0.60 for the HAQ-II. Except at the highest scores the 10-item OA-DISABILITY-CAT demonstrated superior precision to the HAQ-II. Conclusion The prototype OA-DISABILITY-CAT demonstrated promising measurement properties compared to the HAQ-II, and is recommended for use in LE osteoarthritis research. PMID:19216052
Stress drop estimates and hypocenter relocations of induced earthquakes near Fox Creek, Alberta
NASA Astrophysics Data System (ADS)
Clerc, F.; Harrington, R. M.; Liu, Y.; Gu, Y. J.
2016-12-01
This study investigates the physical differences between induced and naturally occurring earthquakes using a sequence of events potentially induced by hydraulic fracturing near Fox Creek, Alberta. We perform precise estimations of static stress drop to determine if the range of values is low compared to values estimated for naturally occurring events, as has been suggested by previous studies. Starting with the Natural Resources Canada earthquake catalog and using waveform data from regional networks, we use a spectral ratio method to calculate the static stress drop values of a group of relocated earthquakes occurring in close proximity to hydraulic fracturing wells from December 2013 to June 2015. The spectral ratio method allows us to precisely constrain the corner frequencies of the amplitude spectra by eliminating the path and site effects of co-located event pairs. Our estimated stress drop values range from 0.1 - 149 MPa over the full range of observed magnitudes, Mw 1.5-4, which are on the high side of the typical reported range of tectonic events, but consistent with other regional studies [Zhang et al., 2016; Wang et al., 2016]. Stress drop values range from 11 to 93 MPa and appear to be scale invariant over the magnitude range Mw 3 - 4, and are less well constrained at lower magnitudes due to noise and bandwidth limitations. We observe no correlation between event stress drop and hypocenter depth or distance from the wells. Relocated hypocenters cluster around corresponding injection wells and form fine-scale lineations, suggesting the presence and orientation of fault planes. We conclude that neither the range of stress drops nor their scaling with respect to magnitude can be used to conclusively discriminate induced and tectonic earthquakes, as stress drop values may be greatly affected by the regional setting. Instead, the double-difference relocations may be a more reliable indicator of induced seismicity.
Validation of the one pass measure for motivational interviewing competence.
McMaster, Fiona; Resnicow, Ken
2015-04-01
This paper examines the psychometric properties of the OnePass coding system: a new, user-friendly tool for evaluating practitioner competence in motivational interviewing (MI). We provide data on reliability and validity with the current gold-standard: Motivational Interviewing Treatment Integrity tool (MITI). We compared scores from 27 videotaped MI sessions performed by student counselors trained in MI and simulated patients using both OnePass and MITI, with three different raters for each tool. Reliability was estimated using intra-class coefficients (ICCs), and validity was assessed using Pearson's r. OnePass had high levels of inter-rater reliability with 19/23 items found from substantial to almost perfect agreement. Taking the pair of scores with the highest inter-rater reliability on the MITI, the concurrent validity between the two measures ranged from moderate to high. Validity was highest for evocation, autonomy, direction and empathy. OnePass appears to have good inter-rater reliability while capturing similar dimensions of MI as the MITI. Despite the moderate concurrent validity with the MITI, the OnePass shows promise in evaluating both traditional and novel interpretations of MI. OnePass may be a useful tool for developing and improving practitioner competence in MI where access to MITI coders is limited. Copyright © 2015. Published by Elsevier Ireland Ltd.
Reliability analysis of structural ceramic components using a three-parameter Weibull distribution
NASA Technical Reports Server (NTRS)
Duffy, Stephen F.; Powers, Lynn M.; Starlinger, Alois
1992-01-01
Described here are nonlinear regression estimators for the three-parameter Weibull distribution. Issues relating to the bias and invariance associated with these estimators are examined numerically using Monte Carlo simulation methods. The estimators were used to extract parameters from sintered silicon nitride failure data. A reliability analysis was performed on a turbopump blade utilizing the three-parameter Weibull distribution and the estimates from the sintered silicon nitride data.
Test-retest reliability of cardinal plane isokinetic hip torque and EMG.
Claiborne, Tina L; Timmons, Mark K; Pincivero, Danny M
2009-10-01
The objective of the present study was to establish test-retest reliability of isokinetic hip torque and prime mover electromyogram (EMG) through the three cardinal planes of motion. Thirteen healthy young adults participated in two experimental sessions, separated by approximately one week. During each session, isokinetic hip torque was evaluated on the Biodex Isokinetic Dynamometer at a velocity of 60 deg/s. Subjects performed three maximal-effort concentric and eccentric contractions, separately, for right and left hip abduction/adduction, flexion/extension, and internal/external rotation. Surface EMGs were sampled from the gluteus maximus, gluteus medius, adductor, medial and lateral hamstring, and rectus femoris muscles during all contractions. Intraclass correlation coefficients (ICC - 2,1) and standard errors of measurement (SEM) were calculated for peak torque for each movement direction and contraction mode, while ICCs were only computed for the EMG data. Motions that demonstrated high torque reliability included concentric hip abduction (right and left), flexion (right and left), extension (right) and internal rotation (right and left), and eccentric hip abduction (left), adduction (left), flexion (right), and extension (right and left) (ICC range=0.81-0.91). Motions with moderate torque reliability included concentric hip adduction (right), extension (left), internal rotation (left), and external rotation (right), and eccentric hip abduction and adduction (right), flexion (left), internal rotation (right and left), and external rotation (right and left) (ICC range=0.49-0.79). The majority of the EMG sampled muscles (n=12 and n=11 for concentric and eccentric contractions, respectively) demonstrated high reliability (ICC=0.81-0.95). Instances of low, or unacceptable, EMG reliability values occurred for the medial hamstring muscle of the left leg (both contraction modes) and the adductor muscle of the right leg during eccentric internal rotation. 
The major finding revealed high and moderate levels of between-day reliability of isokinetic hip peak torque and prime mover EMG. It is recommended that the day-to-day variability estimates concomitant with acceptable levels of reliability be considered when attempting to objectify intervention effects on hip muscle performance.
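For readers unfamiliar with the two statistics named above, the following sketch computes ICC(2,1) from a subjects-by-sessions table using the two-way random-effects mean squares of Shrout and Fleiss, and a standard error of measurement as SD * sqrt(1 - ICC). The SEM formula is one common convention and is an assumption here, since the abstract does not state which was used. The test data are the textbook Shrout and Fleiss example, not hip torque values.

```python
import math


def icc_2_1(table):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    table: list of rows, one per subject, each holding k repeated scores.
    """
    n, k = len(table), len(table[0])
    if n < 2 or k < 2 or any(len(row) != k for row in table):
        raise ValueError("need a rectangular table with n >= 2 rows and k >= 2 columns")
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC); sd is the between-subject standard deviation."""
    return sd * math.sqrt(1.0 - icc)


if __name__ == "__main__":
    shrout_fleiss = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    print(round(icc_2_1(shrout_fleiss), 2))
```

Because ICC(2,1) penalizes systematic shifts between sessions as well as random error, a consistent day-to-day offset in torque would lower it even when subjects keep the same rank order.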
to do so, and (5) three distinct versions of the problem of estimating component reliability from system failure-time data are treated, each resulting in consistent estimators with asymptotically normal distributions.
Duncan, Laura; Comeau, Jinette; Wang, Li; Vitoroulis, Irene; Boyle, Michael H; Bennett, Kathryn
2018-02-19
A better understanding of factors contributing to the observed variability in estimates of test-retest reliability in published studies on standardized diagnostic interviews (SDI) is needed. The objectives of this systematic review and meta-analysis were to estimate the pooled test-retest reliability for parent and youth assessments of seven common disorders, and to examine sources of between-study heterogeneity in reliability. Following a systematic review of the literature, multilevel random effects meta-analyses were used to analyse 202 reliability estimates (Cohen's kappa = κ) from 31 eligible studies and 5,369 assessments of 3,344 children and youth. Pooled reliability was moderate at κ = .58 (95% CI 0.53-0.63) and between-study heterogeneity was substantial (Q = 2,063 (df = 201), p < .001 and I² = 79%). In subgroup analysis, reliability varied across informants for specific types of psychiatric disorder (κ = .53-.69 for parent vs. κ = .39-.68 for youth) with estimates significantly higher for parents on attention deficit hyperactivity disorder, oppositional defiant disorder and the broad groupings of externalizing and any disorder. Reliability was also significantly higher in studies with indicators of poor or fair study methodology quality (sample size <50, retest interval <7 days). Our findings raise important questions about the meaningfulness of published evidence on the test-retest reliability of SDIs and the usefulness of these tools in both clinical and research contexts. Potential remedies include the introduction of standardized study and reporting requirements for reliability studies, and exploration of other approaches to assessing and classifying child and adolescent psychiatric disorder. © 2018 Association for Child and Adolescent Mental Health.
Kilgus, Stephen P; Riley-Tillman, T Chris; Stichter, Janine P; Schoemann, Alexander M; Bellesheim, Katie
2016-09-01
The purpose of this investigation was to evaluate the reliability of Direct Behavior Ratings-Social Competence (DBR-SC) ratings. Participants included 60 students identified as possessing deficits in social competence, as well as their 23 classroom teachers. Teachers used DBR-SC to complete ratings of 5 student behaviors within the general education setting on a daily basis across approximately 5 months. During this time, each student was assigned to 1 of 2 intervention conditions, including the Social Competence Intervention-Adolescent (SCI-A) and a business-as-usual (BAU) intervention. Ratings were collected across 3 intervention phases, including pre-, mid-, and postintervention. Results suggested DBR-SC ratings were highly consistent across time within each student, with reliability coefficients predominantly falling in the .80 and .90 ranges. Findings further indicated such levels of reliability could be achieved with only a small number of ratings, with estimates varying between 2 and 10 data points. Group comparison analyses further suggested the reliability of DBR-SC ratings increased over time, such that student behavior became more consistent throughout the intervention period. Furthermore, analyses revealed that for 2 of the 5 DBR-SC behavior targets, the increase in reliability over time was moderated by intervention grouping, with students receiving SCI-A demonstrating greater increases in reliability relative to those in the BAU group. Limitations of the investigation as well as directions for future research are discussed herein. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
ERIC Educational Resources Information Center
Oakland, Thomas
New strategies for evaluating criterion-referenced measures (CRM) are discussed. These strategies examine the following issues: (1) the use of norm-referenced measures (NRM) as CRM and then estimating the reliability and validity of such measures in terms of variance from an arbitrarily specified criterion score, (2) estimation of the…
A Note on the Reliability Coefficients for Item Response Model-Based Ability Estimates
ERIC Educational Resources Information Center
Kim, Seonghoon
2012-01-01
Assuming item parameters on a test are known constants, the reliability coefficient for item response theory (IRT) ability estimates is defined for a population of examinees in two different ways: as (a) the product-moment correlation between ability estimates on two parallel forms of a test and (b) the squared correlation between the true…
Wu, Joseph T.; Ho, Andrew; Ma, Edward S. K.; Lee, Cheuk Kwong; Chu, Daniel K. W.; Ho, Po-Lai; Hung, Ivan F. N.; Ho, Lai Ming; Lin, Che Kit; Tsang, Thomas; Lo, Su-Vui; Lau, Yu-Lung; Leung, Gabriel M.
2011-01-01
Background In an emerging influenza pandemic, estimating severity (the probability of a severe outcome, such as hospitalization, if infected) is a public health priority. As many influenza infections are subclinical, sero-surveillance is needed to allow reliable real-time estimates of infection attack rate (IAR) and severity. Methods and Findings We tested 14,766 sera collected during the first wave of the 2009 pandemic in Hong Kong using viral microneutralization. We estimated IAR and infection-hospitalization probability (IHP) from the serial cross-sectional serologic data and hospitalization data. Had our serologic data been available weekly in real time, we would have obtained reliable IHP estimates 1 wk after, 1–2 wk before, and 3 wk after epidemic peak for individuals aged 5–14 y, 15–29 y, and 30–59 y. The ratio of IAR to pre-existing seroprevalence, which decreased with age, was a major determinant for the timeliness of reliable estimates. If we began sero-surveillance 3 wk after community transmission was confirmed, with 150, 350, and 500 specimens per week for individuals aged 5–14 y, 15–19 y, and 20–29 y, respectively, we would have obtained reliable IHP estimates for these age groups 4 wk before the peak. For 30–59 y olds, even 800 specimens per week would not have generated reliable estimates until the peak because the ratio of IAR to pre-existing seroprevalence for this age group was low. The performance of serial cross-sectional sero-surveillance substantially deteriorates if test specificity is not near 100% or pre-existing seroprevalence is not near zero. These potential limitations could be mitigated by choosing a higher titer cutoff for seropositivity. If the epidemic doubling time is longer than 6 d, then serial cross-sectional sero-surveillance with 300 specimens per week would yield reliable estimates when IAR reaches around 6%–10%. 
Conclusions Serial cross-sectional serologic data together with clinical surveillance data can allow reliable real-time estimates of IAR and severity in an emerging pandemic. Sero-surveillance for pandemics should be considered. PMID:21990967
Smile line assessment comparing quantitative measurement and visual estimation.
Van der Geld, Pieter; Oosterveld, Paul; Schols, Jan; Kuijpers-Jagtman, Anne Marie
2011-02-01
Esthetic analysis of dynamic functions such as spontaneous smiling is feasible by using digital videography and computer measurement for lip line height and tooth display. Because quantitative measurements are time-consuming, digital videography and semiquantitative (visual) estimation according to a standard categorization are more practical for regular diagnostics. Our objective in this study was to compare 2 semiquantitative methods with quantitative measurements for reliability and agreement. The faces of 122 male participants were individually registered by using digital videography. Spontaneous and posed smiles were captured. On the records, maxillary lip line heights and tooth display were digitally measured on each tooth and also visually estimated according to 3-grade and 4-grade scales. Two raters were involved. An error analysis was performed. Reliability was established with kappa statistics. Interexaminer and intraexaminer reliability values were high, with median kappa values from 0.79 to 0.88. Agreement of the 3-grade scale estimation with quantitative measurement showed higher median kappa values (0.76) than the 4-grade scale estimation (0.66). Differentiating high and gummy smile lines (4-grade scale) resulted in greater inaccuracies. The estimation of a high, average, or low smile line for each tooth showed high reliability close to quantitative measurements. Smile line analysis can be performed reliably with a 3-grade scale (visual) semiquantitative estimation. For a more comprehensive diagnosis, additional measuring is proposed, especially in patients with disproportional gingival display. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
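The kappa statistic used above corrects raw agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's kappa for two raters follows; the function name and the smile-line labels in the usage example are illustrative, and the study may have used a weighted variant that is not shown here.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical 3-grade smile-line ratings from two examiners.
    rater_1 = ["low", "average", "high", "average", "low", "high"]
    rater_2 = ["low", "average", "high", "high", "low", "high"]
    print(cohen_kappa(rater_1, rater_2))
```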
A method of bias correction for maximal reliability with dichotomous measures.
Penev, Spiridon; Raykov, Tenko
2010-02-01
This paper is concerned with the reliability of weighted combinations of a given set of dichotomous measures. Maximal reliability for such measures has been discussed in the past, but the pertinent estimator exhibits a considerable bias and mean squared error for moderate sample sizes. We examine this bias, propose a procedure for bias correction, and develop a more accurate asymptotic confidence interval for the resulting estimator. In most empirically relevant cases, the bias correction and mean squared error correction can be performed simultaneously. We propose an approximate (asymptotic) confidence interval for the maximal reliability coefficient, discuss the implementation of this estimator, and investigate the mean squared error of the associated asymptotic approximation. We illustrate the proposed methods using a numerical example.
Practical Issues in Implementing Software Reliability Measurement
NASA Technical Reports Server (NTRS)
Nikora, Allen P.; Schneidewind, Norman F.; Everett, William W.; Munson, John C.; Vouk, Mladen A.; Musa, John D.
1999-01-01
Many ways of estimating software systems' reliability, or reliability-related quantities, have been developed over the past several years. Of particular interest are methods that can be used to estimate a software system's fault content prior to test, or to discriminate between components that are fault-prone and those that are not. The results of these methods can be used to: 1) More accurately focus scarce fault identification resources on those portions of a software system most in need of it. 2) Estimate and forecast the risk of exposure to residual faults in a software system during operation, and develop risk and safety criteria to guide the release of a software system to fielded use. 3) Estimate the efficiency of test suites in detecting residual faults. 4) Estimate the stability of the software maintenance process.
Sun, Wei; Chou, Chih-Ping; Stacy, Alan W; Ma, Huiyan; Unger, Jennifer; Gallaher, Peggy
2007-02-01
Cronbach's α is widely used in social science research to estimate the internal consistency of reliability of a measurement scale. However, when items are not strictly parallel, the Cronbach's α coefficient provides a lower-bound estimate of true reliability, and this estimate may be further biased downward when items are dichotomous. The estimation of standardized Cronbach's α for a scale with dichotomous items can be improved by using the upper bound of coefficient phi. SAS and SPSS macros have been developed in this article to obtain standardized Cronbach's α via this method. The simulation analysis showed that Cronbach's α from upper-bound phi might be appropriate for estimating the real reliability when standardized Cronbach's α is problematic.
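For orientation, the two textbook coefficients discussed in this abstract can be sketched as follows. This is a minimal illustration, not the article's SAS/SPSS macros: it computes raw alpha from item variances and standardized alpha from the mean inter-item Pearson correlation (which equals phi for dichotomous items). The upper-bound-phi correction proposed by the authors is not reproduced, and the function names and sample data are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Raw Cronbach's alpha; rows are respondents, columns are items."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1)      # variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)  # variance of the sum score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

def standardized_alpha(items):
    """Standardized alpha from the mean off-diagonal inter-item correlation."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    r = np.corrcoef(x, rowvar=False)
    mean_r = (r.sum() - k) / (k * (k - 1))      # drop the k diagonal ones
    return k * mean_r / (1.0 + (k - 1) * mean_r)

# Hypothetical dichotomous responses (6 respondents, 3 items).
responses = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1], [1, 1, 1]]
print(cronbach_alpha(responses), standardized_alpha(responses))
```

With dichotomous items the attainable phi is bounded by the item marginals, which is one reason the standardized coefficient can understate reliability.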
NASA Astrophysics Data System (ADS)
Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.
2005-05-01
A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination - we find reliability-based reweighting but not statistically optimal cue combination.
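The statistically optimal benchmark mentioned here is usually written as an inverse-variance weighted average. The sketch below assumes independent, unbiased cue estimates with known variances; the function name and the numbers are illustrative and not taken from the study.

```python
def combine_cues(estimates, variances):
    """Minimum-variance linear combination of independent, unbiased cues.

    Each cue's reliability is the inverse of its variance, and each weight
    is that cue's share of the total reliability.
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_variance = 1.0 / total  # never larger than the best single cue
    return combined, weights, combined_variance

# Hypothetical slant estimates (degrees) from a visual and a haptic cue.
slant, weights, variance = combine_cues([30.0, 40.0], [4.0, 12.0])
print(slant, weights, variance)
```

Observers whose empirical weights shift with cue reliability but whose combined variance stays above `combined_variance` would match the pattern the abstract reports: reweighting without full optimality.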
Timme, M; Timme, W H; Olze, A; Ottow, C; Ribbecke, S; Pfeiffer, H; Dettmeyer, R; Schmeling, A
2017-03-01
There is a need for dental age estimation methods after completion of the third molar mineralization. Degenerative dental characteristics appear to be suitable for forensic age diagnostics beyond the 18th year of life. In 2012, Olze et al. investigated the criteria studied by Gustafson using orthopantomograms. The objective of this study was to prove the applicability and reliability of this method with a large cohort and a wide age range, including older individuals. For this purpose, 2346 orthopantomograms of 1167 female and 1179 male Germans aged 15 to 70 years were reviewed. The characteristics of secondary dentin formation, cementum apposition, periodontal recession and attrition were evaluated in all the mandibular premolars. The correlation of the individual characteristics with the chronological age was examined by means of a stepwise multiple regression analysis, in which the chronological age formed the dependent variable. Following those results, R² values amounted to 0.73 to 0.8; the standard error of estimate was 6.8 to 8.2 years. Fundamentally, the recommendation for conducting age estimations in the living by these methods can be shared. The values for the quality of the regression are, however, not precise enough for a reliable age estimation around regular retirement date ages. More precise regression formulae for the age group of 15 to 40 years of life are separately presented in this study. Further research should investigate the influence of ethnicity, dietary habits and modern health care on the degenerative characteristics in question.
Hogan, Thomas J
2012-05-01
The objective was to review recent economic evaluations of influenza vaccination by injection in the US, assess their evidence, and conclude on their collective findings. The literature was searched for economic evaluations of influenza vaccination injection in healthy working adults in the US published since 1995. Ten evaluations described in nine papers were identified. These were synopsized and their results evaluated, the basic structure of all evaluations was ascertained, and sensitivity of outcomes to changes in parameter values were explored using a decision model. Areas to improve economic evaluations were noted. Eight of nine evaluations with credible economic outcomes were favourable to vaccination, representing a statistically significant result compared with a proportion of 50% that would be expected if vaccination and no vaccination were economically equivalent. Evaluations shared a basic structure, but differed considerably with respect to cost components, assumptions, methods, and parameter estimates. Sensitivity analysis indicated that changes in parameter values within the feasible range, individually or simultaneously, could reverse economic outcomes. Given stated misgivings, the methods of estimating influenza reduction ascribed to vaccination must be researched to confirm that they produce accurate and reliable estimates. Research is also needed to improve estimates of the costs per case of influenza illness and the costs of vaccination. Based on their assumptions, the reviewed papers collectively appear to support the economic benefits of influenza vaccination of healthy adults. Yet the underlying assumptions, methods and parameter estimates themselves warrant further research to confirm they are accurate, reliable and appropriate to economic evaluation purposes.
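The comparison of "eight of nine" against a 50% proportion can be illustrated with an exact binomial tail probability. The abstract does not say which test the author used, so the exact one-sided binomial test below is an assumption, and the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Eight favourable evaluations out of nine, under a null proportion of 0.5.
one_sided = binomial_upper_tail(8, 9)
print(one_sided, 2 * one_sided)  # one-sided and doubled (two-sided) p-values
```

The tail is 10/512, so the result falls below the conventional 0.05 threshold under either the one-sided or the doubled p-value.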
Measuring eating disorder attitudes and behaviors: a reliability generalization study
2014-01-01
Background Although score reliability is a sample-dependent characteristic, researchers often only report reliability estimates from previous studies as justification for employing particular questionnaires in their research. The present study followed reliability generalization procedures to determine the mean score reliability of the Eating Disorder Inventory and its most commonly employed subscales (Drive for Thinness, Bulimia, and Body Dissatisfaction) and the Eating Attitudes Test as a way to better identify those characteristics that might impact score reliability. Methods Published studies that used these measures were coded based on their reporting of reliability information and additional study characteristics that might influence score reliability. Results Score reliability estimates were included in 26.15% of studies using the EDI and 36.28% of studies using the EAT. Mean Cronbach’s alphas for the EDI (total score = .91; subscales = .75 to .89), EAT-40 (total score = .81) and EAT-26 (total score = .86; subscales = .56 to .80) suggested variability in estimated internal consistency. Whereas some EDI subscales exhibited higher score reliability in clinical eating disorder samples than in nonclinical samples, other subscales did not exhibit these differences. Score reliability information for the EAT was primarily reported for nonclinical samples, making it difficult to characterize the effect of type of sample on these measures. However, there was a tendency for mean score reliability to be higher in the adult (vs. adolescent) samples and in female (vs. male) samples. Conclusions Overall, this study highlights the importance of assessing and reporting internal consistency during every test administration because reliability is affected by characteristics of the participants being examined. PMID:24764530
Mehmandoust, Babak; Sanjari, Ehsan; Vatani, Mostafa
2013-01-01
The heat of vaporization of a pure substance at its normal boiling temperature is a very important property in many chemical processes. In this work, a new empirical method was developed to predict vaporization enthalpy of pure substances. This equation is a function of normal boiling temperature, critical temperature, and critical pressure. The presented model is simple to use and provides an improvement over the existing equations for 452 pure substances in wide boiling range. The results showed that the proposed correlation is more accurate than the literature methods for pure substances in a wide boiling range (20.3–722 K). PMID:25685493
Mehmandoust, Babak; Sanjari, Ehsan; Vatani, Mostafa
2014-03-01
The heat of vaporization of a pure substance at its normal boiling temperature is a very important property in many chemical processes. In this work, a new empirical method was developed to predict vaporization enthalpy of pure substances. This equation is a function of normal boiling temperature, critical temperature, and critical pressure. The presented model is simple to use and provides an improvement over the existing equations for 452 pure substances in wide boiling range. The results showed that the proposed correlation is more accurate than the literature methods for pure substances in a wide boiling range (20.3-722 K).
Attia, A; Dhahbi, W; Chaouachi, A; Padulo, J; Wong, D P; Chamari, K
2017-03-01
Common methods to estimate vertical jump height (VJH) are based on the measurements of flight time (FT) or vertical reaction force. This study aimed to assess the measurement errors when estimating the VJH with flight time using photocell devices in comparison with the gold standard jump height measured by a force plate (FP). The second purpose was to determine the intrinsic reliability of the Optojump photoelectric cells in estimating VJH. For this aim, 20 subjects (age: 22.50±1.24 years) performed maximal vertical jumps in three modalities in randomized order: the squat jump (SJ), counter-movement jump (CMJ), and CMJ with arm swing (CMJarm). Each trial was simultaneously recorded by the FP and Optojump devices. High intra-class correlation coefficients (ICCs) for validity (0.98-0.99) and low limits of agreement (less than 1.4 cm) were found; even so, a systematic difference in jump height was consistently observed between the FT and double-integration-of-force methods (-31% to -27%; p<0.001), with a large effect size (Cohen's d >1.2). Intra-session reliability of Optojump was excellent, with ICCs ranging from 0.98 to 0.99, low coefficients of variation (3.98%), and low standard errors of measurement (0.8 cm). It was concluded that there was a high correlation between the two methods to estimate the vertical jump height, but the FT method cannot replace the gold standard, due to the large systematic bias. According to our results, the equations of each of the three jump modalities were presented in order to obtain a better estimation of the jump height.
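The flight-time approach rests on a standard projectile relation: if takeoff and landing occur at the same centre-of-mass height, the jump height is h = g·t²/8. The sketch below shows only that textbook relation; the study's modality-specific correction equations are not given in the abstract and are not reproduced, and the function name is hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed standard value)

def jump_height_from_flight_time(flight_time_s, g=G):
    """Jump height in metres from flight time, assuming the body is in the
    same posture at takeoff and landing (the usual FT-method assumption)."""
    return g * flight_time_s**2 / 8.0

# Hypothetical flight time of 0.50 s.
print(round(jump_height_from_flight_time(0.50) * 100, 1), "cm")
```

Because the relation ignores posture changes between takeoff and landing, the two methods can correlate highly and still disagree by a consistent offset, which is the pattern the abstract describes.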
Radar QPE for hydrological design: Intensity-Duration-Frequency curves
NASA Astrophysics Data System (ADS)
Marra, Francesco; Morin, Efrat
2015-04-01
Intensity-duration-frequency (IDF) curves are widely used in flood risk management since they provide an easy link between the characteristics of a rainfall event and the probability of its occurrence. They are estimated by analyzing the extreme values of rainfall records, usually based on rain gauge data. This point-based approach raises two issues: first, hydrological design applications generally need IDF information for the entire catchment rather than a point; second, the representativeness of point measurements decreases with the distance from the measurement location, especially in regions characterized by steep climatological gradients. Weather radar, providing high resolution distributed rainfall estimates over wide areas, has the potential to overcome these issues. Two objections usually restrain this approach: (i) the short length of data records and (ii) the reliability of quantitative precipitation estimation (QPE) of the extremes. This work explores the potential use of weather radar estimates for the identification of IDF curves by means of a long radar archive and a combined physical and quantitative adjustment of radar estimates. Shacham weather radar, located in the eastern Mediterranean area (Tel Aviv, Israel), has archived data since 1990, providing rainfall estimates for 23 years over a region characterized by strong climatological gradients. Radar QPE is obtained by correcting the effects of pointing errors, ground echoes, beam blockage, attenuation and vertical variations of reflectivity. Quantitative accuracy is then ensured with a range-dependent bias adjustment technique, and the reliability of radar QPE is assessed by comparison with gauge measurements. IDF curves are derived from the radar data using the annual extremes method and compared with gauge-based curves. 
Results from 14 study cases will be presented focusing on the effects of record length and QPE accuracy, exploring the potential application of radar IDF curves for ungauged locations and providing insights on the use of radar QPE for hydrological design studies.
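An annual-extremes IDF point for one duration can be sketched as follows. The abstract names the annual extremes method but not the fitted distribution, so the Gumbel distribution and the method-of-moments fit used here are assumptions. The intensities are made-up illustrative values, not radar or gauge data from the study.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel_moments(annual_maxima):
    """Method-of-moments Gumbel fit: returns (location mu, scale beta)."""
    beta = statistics.stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(annual_maxima) - EULER_GAMMA * beta
    return mu, beta

def return_level(mu, beta, return_period_years):
    """Intensity exceeded on average once every `return_period_years`."""
    p_non_exceed = 1.0 - 1.0 / return_period_years
    return mu - beta * math.log(-math.log(p_non_exceed))

# Hypothetical annual maximum 1-hour intensities (mm/h), illustration only.
annual_maxima = [22.0, 31.5, 18.4, 40.2, 27.9, 35.1, 24.6, 29.3, 45.0, 20.7]
mu, beta = fit_gumbel_moments(annual_maxima)
for period in (2, 10, 50):
    print(period, round(return_level(mu, beta, period), 1))
```

Repeating the fit for several durations gives the family of curves. With only a couple of decades of record, the long-return-period levels are extrapolations, which is the record-length objection the abstract raises.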
Attia, A; Chaouachi, A; Padulo, J; Wong, DP; Chamari, K
2016-01-01
Common methods to estimate vertical jump height (VJH) are based on the measurements of flight time (FT) or vertical reaction force. This study aimed to assess the measurement errors when estimating the VJH with flight time using photocell devices in comparison with the gold standard jump height measured by a force plate (FP). The second purpose was to determine the intrinsic reliability of the Optojump photoelectric cells in estimating VJH. For this aim, 20 subjects (age: 22.50±1.24 years) performed maximal vertical jumps in three modalities in randomized order: the squat jump (SJ), counter-movement jump (CMJ), and CMJ with arm swing (CMJarm). Each trial was simultaneously recorded by the FP and Optojump devices. High intra-class correlation coefficients (ICCs) for validity (0.98-0.99) and low limits of agreement (less than 1.4 cm) were found; even so, a systematic difference in jump height was consistently observed between the FT and double-integration-of-force methods (-31% to -27%; p<0.001), with a large effect size (Cohen’s d>1.2). Intra-session reliability of Optojump was excellent, with ICCs ranging from 0.98 to 0.99, low coefficients of variation (3.98%), and low standard errors of measurement (0.8 cm). It was concluded that there was a high correlation between the two methods to estimate the vertical jump height, but the FT method cannot replace the gold standard, due to the large systematic bias. According to our results, the equations of each of the three jump modalities were presented in order to obtain a better estimation of the jump height. PMID:28416900
Gottschalk, Hilton P; Bastrom, Tracey P; Edmonds, Eric W
2013-01-01
Standard elbow radiographs (AP and lateral views) are not accurate enough to measure true displacement of medial epicondyle fractures of the humerus. The amount of perceived displacement has been used to determine treatment options. This study assesses the utility of internal oblique radiographs for measurement of true displacement in these fractures. A medial epicondyle fracture was created in a cadaveric specimen. Displacement of the fragment (mm) was set at 5, 10, and 15 in line with the vector of the flexor pronator mass. The fragment was sutured temporarily in place. Radiographs were obtained at 0 (AP), 15, 30, 45, 60, 75, and 90 degrees (lateral) of internal rotation, with the elbow in set positions of flexion. This was done with and without radio-opaque markers placed on the fragment and fracture bed. The 45 and 60 degrees internal oblique radiographs were then presented to 5 separate reviewers (of different levels of training) to evaluate intraobserver and interobserver agreement. Change in elbow position did not affect the perceived displacement (P=0.82) with excellent intraobserver reliability (intraclass correlation coefficient range, 0.979 to 0.988) and interobserver agreement of 0.953. The intraclass correlation coefficient for intraobserver reliability on 45 degrees internal oblique films for all groups ranged from 0.985 to 0.998, with interobserver agreement of 0.953. For predicting displacement, the observers were 60% accurate in predicting the true displacement on the 45 degrees internal oblique films and only 35% accurate using the 60 degrees internal oblique view. Standardizing to a 45 degrees internal oblique radiograph of the elbow (regardless of elbow flexion) can augment the treating surgeon's ability to determine true displacement. At this degree of rotation, the measured number can be multiplied by 1.4 to better estimate displacement. 
The addition of a 45 degrees internal oblique radiograph in medial humeral epicondyle fractures has good intraobserver and interobserver reliability to more accurately estimate the true displacement of these fractures. Diagnostic study, Level II (Development of diagnostic study with universally applied reference "gold" standard).
Ruschel, Caroline; Haupenthal, Alessandro; Jacomel, Gabriel Fernandes; Fontana, Heiliane de Brito; Santos, Daniela Pacheco dos; Scoz, Robson Dias; Roesler, Helio
2015-05-20
Isometric muscle strength of knee extensors has been assessed for estimating performance, evaluating progress during physical training, and investigating the relationship between isometric and dynamic/functional performance. To assess the validity and reliability of an adapted leg-extension machine for measuring isometric knee extensor force. Validity (concurrent approach) and reliability (test and test-retest approach) study. University laboratory. 70 healthy men and women aged between 20 and 30 y (39 in the validity study and 31 in the reliability study). Intraclass correlation coefficient (ICC) values calculated for the maximum voluntary isometric torque of knee extensors at 30°, 60°, and 90°, measured with the prototype and with an isokinetic dynamometer (ICC2,1, validity study) and measured with the prototype in test and retest sessions, scheduled from 48 h to 72 h apart (ICC1,1, reliability study). In the validity analysis, the prototype showed good agreement for measurements at 30° (ICC2,1 = .75, SEM = 18.2 Nm) and excellent agreement for measurements at 60° (ICC2,1 = .93, SEM = 9.6 Nm) and at 90° (ICC2,1 = .94, SEM = 8.9 Nm). Regarding the reliability analysis, between-days' ICC1,1 were good to excellent, ranging from .88 to .93. Standard error of measurement and minimal detectable difference based on test-retest ranged from 11.7 Nm to 18.1 Nm and 32.5 Nm to 50.1 Nm, respectively, for the 3 analyzed knee angles. The analysis of validity and repeatability of the prototype for measuring isometric muscle strength has shown to be good or excellent, depending on the knee joint angle analyzed. The new instrument, which presents a relative low cost and easiness of transportation when compared with an isokinetic dynamometer, is valid and provides consistent data concerning isometric strength of knee extensors and, for this reason, can be used for practical, clinical, and research purposes.
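The standard error of measurement (SEM) and minimal detectable difference (MDD) reported here are commonly linked by MDD95 = 1.96 × √2 × SEM, with SEM = SD × √(1 − ICC). The abstract does not state which formulas the authors used, so these are assumptions; they do, however, roughly reproduce the reported MDD range from the reported SEM range. Function names are hypothetical.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from between-subject SD and ICC."""
    return sd * math.sqrt(1.0 - icc)

def mdd95(sem):
    """Minimal detectable difference at 95% confidence for a test-retest pair."""
    return 1.96 * math.sqrt(2.0) * sem

# SEM bounds reported in the abstract (Nm).
for sem in (11.7, 18.1):
    print(sem, round(mdd95(sem), 1))
```

A change between sessions smaller than the MDD cannot be distinguished from measurement noise at the chosen confidence level.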
Ruiz, Jonatan R; Ortega, Francisco B; Castro-Piñero, Jose
2014-11-30
We investigated the criterion-related validity and the reliability of the 1/4 mile run-walk test (MRWT) in children and adolescents. A total of 86 children (n=42 girls) completed a maximal graded treadmill test using a gas analyzer and the 1/4MRW test. We investigated the test-retest reliability of the 1/4MRWT in a different group of children and adolescents (n=995, n=418 girls). The 1/4MRWT time, sex, and BMI significantly contributed to predict measured VO2peak (R² = 0.32). There was no systematic bias in the cross-validation group (P>0.1). The root mean sum of squared errors (RMSE) and the percentage error were 6.9 ml/kg/min and 17.7%, respectively, and the accurate prediction (i.e. the percentage of estimations within ±4.5 ml/kg/min of VO2peak) was 48.8%. The reliability analysis showed that the mean inter-trial difference ranged from 0.6 seconds in children aged 6-11 years to 1.3 seconds in adolescents aged 12-17 years (all P. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
Butler, Emily E; Saville, Christopher W N; Ward, Robert; Ramsey, Richard
2017-01-01
The human face cues a range of important fitness information, which guides mate selection towards desirable others. Given humans' high investment in the central nervous system (CNS), cues to CNS function should be especially important in social selection. We tested if facial attractiveness preferences are sensitive to the reliability of human nervous system function. Several decades of research suggest an operational measure for CNS reliability is reaction time variability, which is measured by standard deviation of reaction times across trials. Across two experiments, we show that low reaction time variability is associated with facial attractiveness. Moreover, variability in performance made a unique contribution to attractiveness judgements above and beyond both physical health and sex-typicality judgements, which have previously been associated with perceptions of attractiveness. In a third experiment, we empirically estimated the distribution of attractiveness preferences expected by chance and show that the size and direction of our results in Experiments 1 and 2 are statistically unlikely without reference to reaction time variability. We conclude that an operating characteristic of the human nervous system, reliability of information processing, is signalled to others through facial appearance. Copyright © 2016 Elsevier B.V. All rights reserved.
Green, Eric P; Tuli, Hawa; Kwobah, Edith; Menya, D; Chesire, Irene; Schmidt, Christina
2018-03-01
Routine screening for perinatal depression is not common in most primary health care settings. The U.S. Preventive Services Task Force only recently updated their recommendation on depression screening to specifically recommend screening during the pre- and postpartum periods. While practitioners in high-income countries can respond to this new recommendation by implementing one of several existing depression screening tools developed in Western contexts, such as the Edinburgh Postnatal Depression Scale (EPDS) or the Patient Health Questionnaire-9 (PHQ-9), these tools lack strong evidence of cross-cultural equivalence, validity for case finding, and precision in measuring response to treatment in developing countries. Thus, there is a critical need to develop and validate new screening tools for perinatal depression that can be used by lay health workers, primary health care personnel, and patients. Working in rural Kenya, we used free listing, card sorting, and item analysis methods to develop a locally-relevant screening tool that blended Western psychiatric concepts with local idioms of distress. We conducted a validation study with a random sample of 193 pregnant women and new mothers to test the diagnostic accuracy of this scale along with the EPDS and PHQ-9. The sensitivity/specificity of the EPDS and PHQ-9 was estimated to be 0.70/0.72 and 0.70/0.73, respectively. This compared to sensitivity/specificity of 0.90/0.90 for a new 9-item locally-developed tool called the Perinatal Depression Screening (PDEPS). Across these three tools, internal consistency reliability ranged from 0.77 to 0.81 and test-retest reliability ranged from 0.57 to 0.67. The prevalence of depression ranges from 5.2% to 6.2% depending on the clinical reference standard. The EPDS and PHQ-9 are valid and reliable screening tools for perinatal depression in rural Western Kenya, but the PDEPS may be a more useful alternative. 
At less than 10%, the prevalence of depression in this region appears to be lower than other published estimates for African and other low-income countries. Copyright © 2017 Elsevier B.V. All rights reserved.
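Because prevalence is low, sensitivity and specificity alone do not show how often a positive screen is a true case. The sketch below applies Bayes' rule to the figures reported in the abstract; the predictive values it prints are derived illustrations, not results reported by the study, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Reported accuracy figures, evaluated at the lower prevalence bound (5.2%).
for name, sens, spec in (("EPDS", 0.70, 0.72), ("PHQ-9", 0.70, 0.73), ("PDEPS", 0.90, 0.90)):
    ppv, npv = predictive_values(sens, spec, 0.052)
    print(name, round(ppv, 2), round(npv, 2))
```

At this prevalence, even a tool with 0.90/0.90 accuracy yields more false positives than true positives, so a positive screen would normally be followed by a confirmatory assessment.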
Characterization and Local Emission Sources for Ammonia in an Urban Environment.
Galán Madruga, D; Fernández Patier, R; Sintes Puertas, M A; Romero García, M D; Cristóbal López, A
2018-04-01
Ammonia levels were evaluated in the urban environment of Madrid City, Spain. A total of 110 samplers were distributed throughout the city. Vehicle traffic density, garbage containers and sewers were identified as local emission sources of ammonia. The average ammonia concentrations were 4.66 ± 2.14 µg/m³ (0.39-11.23 µg/m³ range) in the winter and 5.30 ± 1.81 µg/m³ (2.33-11.08 µg/m³ range) in the summer. Spatial and seasonal variations of ammonia levels were evaluated. Hotspots were located in the south and center of Madrid City in both winter and summer seasons, with lower ammonia concentrations located in the north (winter) and in the west and east (summer). The number of representative points that were needed to establish a reliable air quality monitoring network for ammonia was determined using a combined clustering and kriging approach. The results indicated that 40 samplers were sufficient to provide a reliable estimate for Madrid City.
A Closed-Form Error Model of Straight Lines for Improved Data Association and Sensor Fusing
2018-01-01
Linear regression is a basic tool in mobile robotics, since it enables accurate estimation of straight lines from range-bearing scans or in digital images, which is a prerequisite for reliable data association and sensor fusing in the context of feature-based SLAM. This paper discusses, extends and compares existing algorithms for line fitting applicable also in the case of strong covariances between the coordinates at each single data point, which must not be neglected if range-bearing sensors are used. Besides, in particular, the determination of the covariance matrix is considered, which is required for stochastic modeling. The main contribution is a new error model of straight lines in closed form for calculating quickly and reliably the covariance matrix dependent on just a few comprehensible and easily-obtainable parameters. The model can be applied widely in any case when a line is fitted from a number of distinct points also without a priori knowledge of the specific measurement noise. By means of extensive simulations, the performance and robustness of the new model in comparison to existing approaches is shown. PMID:29673205
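As background for the line-fitting step, an unweighted orthogonal (total least squares) fit in Hessian normal form, x·cos α + y·sin α = r, can be sketched as below. This is the plain textbook fit, assuming identical isotropic noise on every point. It is not the paper's closed-form error model and it ignores the per-point covariances the paper emphasizes; function names are hypothetical.

```python
import math

def fit_line_hessian(points):
    """Orthogonal least-squares line through 2-D points.

    Returns (alpha, r): the angle of the line normal and the signed
    distance of the line from the origin.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    alpha = 0.5 * math.atan2(-2.0 * sxy, syy - sxx)
    r = mean_x * math.cos(alpha) + mean_y * math.sin(alpha)
    return alpha, r

def signed_distance(point, alpha, r):
    """Signed perpendicular distance of a point from the fitted line."""
    x, y = point
    return x * math.cos(alpha) + y * math.sin(alpha) - r

# Hypothetical scan points lying near y = 2x + 1.
scan = [(0.0, 1.02), (1.0, 2.97), (2.0, 5.01), (3.0, 7.0)]
alpha, r = fit_line_hessian(scan)
print(alpha, r, [round(signed_distance(p, alpha, r), 3) for p in scan])
```

The residuals returned by `signed_distance` are the quantities whose spread a covariance model for (α, r) has to propagate.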
Development of a pneumatic tensioning device for gap measurement during total knee arthroplasty.
Kwak, Dai-Soon; Kong, Chae-Gwan; Han, Seung-Ho; Kim, Dong-Hyun; In, Yong
2012-09-01
Despite the importance of soft tissue balancing during total knee arthroplasty (TKA), all estimating techniques are dependent on a surgeon's manual distraction force or subjective feeling based on experience. We developed a new device for dynamic gap balancing, which can offer constant load to the gap between the femur and tibia, using pneumatic pressure during range of motion. To determine the amount of distraction force for the new device, 3 experienced surgeons' manual distraction force was measured using a conventional spreader. A new device called the consistent load pneumatic tensor was developed on the basis of the biomechanical tests. Reliability testing for the new device was performed using 5 cadaveric knees by the same surgeons. Intraclass correlation coefficients (ICCs) were calculated. The distraction force applied to the new pneumatic tensioning device was determined to be 150 N. The interobserver reliability was very good for the newly tested spreader device with ICCs between 0.828 and 0.881. The new pneumatic tensioning device can enable us to properly evaluate the soft tissue balance throughout the range of motion during TKA with acceptable reproducibility.
A mathematical model of diurnal variations in human plasma melatonin levels
NASA Technical Reports Server (NTRS)
Brown, E. N.; Choe, Y.; Shanahan, T. L.; Czeisler, C. A.
1997-01-01
Studies in animals and humans suggest that the diurnal pattern in plasma melatonin levels is due to the hormone's rates of synthesis, circulatory infusion and clearance, circadian control of synthesis onset and offset, environmental lighting conditions, and error in the melatonin immunoassay. A two-dimensional linear differential equation model of the hormone is formulated and is used to analyze plasma melatonin levels in 18 normal healthy male subjects during a constant routine. Recently developed Bayesian statistical procedures are used to incorporate correctly the magnitude of the immunoassay error into the analysis. The estimated parameters [median (range)] were clearance half-life of 23.67 (14.79-59.93) min, synthesis onset time of 2206 (1940-0029), synthesis offset time of 0621 (0246-0817), and maximum N-acetyltransferase activity of 7.17 (2.34-17.93) pmol·l⁻¹·min⁻¹. All were in good agreement with values from previous reports. The difference between synthesis offset time and the phase of the core temperature minimum was 1 h 15 min (-4 h 38 min to 2 h 43 min). The correlation between synthesis onset and the dim light melatonin onset was 0.93. Our model provides a more physiologically plausible estimate of the melatonin synthesis onset time than that given by the dim light melatonin onset and the first reliable means of estimating the phase of synthesis offset. Our analysis shows that the circadian and pharmacokinetics parameters of melatonin can be reliably estimated from a single model.
Reliable evaluation of the quantal determinants of synaptic efficacy using Bayesian analysis
Beato, M.
2013-01-01
Communication between neurones in the central nervous system depends on synaptic transmission. The efficacy of synapses is determined by pre- and postsynaptic factors that can be characterized using quantal parameters such as the probability of neurotransmitter release, number of release sites, and quantal size. Existing methods of estimating the quantal parameters based on multiple probability fluctuation analysis (MPFA) are limited by their requirement for long recordings to acquire substantial data sets. We therefore devised an algorithm, termed Bayesian Quantal Analysis (BQA), that can yield accurate estimates of the quantal parameters from data sets of as small a size as 60 observations for each of only 2 conditions of release probability. Computer simulations are used to compare its performance in accuracy with that of MPFA, while varying the number of observations and the simulated range in release probability. We challenge BQA with realistic complexities characteristic of complex synapses, such as increases in the intra- or intersite variances, and heterogeneity in release probabilities. Finally, we validate the method using experimental data obtained from electrophysiological recordings to show that the effect of an antagonist on postsynaptic receptors is correctly characterized by BQA by a specific reduction in the estimates of quantal size. Since BQA routinely yields reliable estimates of the quantal parameters from small data sets, it is ideally suited to identify the locus of synaptic plasticity for experiments in which repeated manipulations of the recording environment are unfeasible. PMID:23076101
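The quantal parameters named here are tied together by the simple binomial release model on which variance-mean (MPFA-style) analysis rests: with N sites, release probability p and quantal size q, the mean response is N·p·q and its variance is N·p·(1 − p)·q². The sketch below shows only that relation, with no intra- or intersite variability and hypothetical function names; it is not the BQA algorithm.

```python
def binomial_quantal_moments(n_sites, p_release, quantal_size):
    """Mean and variance of the response under a simple binomial model."""
    mean = n_sites * p_release * quantal_size
    variance = n_sites * p_release * (1.0 - p_release) * quantal_size**2
    return mean, variance

def parabola_variance(mean, n_sites, quantal_size):
    """Variance-mean parabola: variance = q * mean - mean^2 / N."""
    return quantal_size * mean - mean**2 / n_sites

# Hypothetical synapse: 5 release sites, quantal size 10 pA, two conditions.
for p in (0.2, 0.6):
    mean, variance = binomial_quantal_moments(5, p, 10.0)
    print(p, mean, variance, parabola_variance(mean, 5, 10.0))
```

Fitting the parabola needs variance estimates at several release probabilities, which is why conventional analysis requires long recordings; the abstract reports that BQA needs only two conditions.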
Bijjaragi, Shobha C; Sangle, Varsha A; Saraswathi, F K; Patil, Veerendra S; Ashwini Rani, S R; Bapure, Sunil K
2015-01-01
Estimation of age is a procedure adopted by anthropologists, archeologists and forensic scientists. Different methods have been undertaken. However, none of them has met the standard set by Demirjian's method since 1973. Various researchers have applied this method, in both original and modified form (Chaillet and Demirjian in 2004), in different ethnic groups, and the results obtained were not satisfactory. To determine the applicability and accuracy of the modified Demirjian's method of dental age estimation (AE) in 8-18 year old Tibetan young adults, and to evaluate the interrelationship between dental and chronological age and the intra- and interobserver reliability. Clinical setting and computerized design. A total of 300 Tibetan young adults with an age range from 8 to 18 years were recruited in the study. Digital panoramic radiographs (DPRs) were evaluated as per the modified Demirjian's method (2004). Pearson correlation, paired t-test, linear regression analysis. Inter- and intraobserver reliability revealed a strong agreement. A positive and strong association was found between chronological age and estimated dental age (r = 0.839) with P < 0.01. The modified Demirjian method (2004) overestimated the age by 0.04 years (2.04 months) in Tibetan young adults. Results suggest that the modified Demirjian method of AE is not suitable for Tibetan young adults. Further studies with a larger sample size and comparison with different methods of AE in a given population would be an interesting area for future research.
Solid Fuel Use for Household Cooking: Country and Regional Estimates for 1980–2010
Bonjour, Sophie; Adair-Rohani, Heather; Wolf, Jennyfer; Bruce, Nigel G.; Mehta, Sumi; Lahiff, Maureen; Rehfuess, Eva A.; Mishra, Vinod; Smith, Kirk R.
2013-01-01
Background: Exposure to household air pollution from cooking with solid fuels in simple stoves is a major health risk. Modeling reliable estimates of solid fuel use is needed for monitoring trends and informing policy. Objectives: In order to revise the disease burden attributed to household air pollution for the Global Burden of Disease 2010 project and for international reporting purposes, we estimated annual trends in the world population using solid fuels. Methods: We developed a multilevel model based on national survey data on primary cooking fuel. Results: The proportion of households relying mainly on solid fuels for cooking has decreased from 62% (95% CI: 58, 66%) to 41% (95% CI: 37, 44%) between 1980 and 2010. Yet because of population growth, the actual number of persons exposed has remained stable at around 2.8 billion during three decades. Solid fuel use is most prevalent in Africa and Southeast Asia where > 60% of households cook with solid fuels. In other regions, primary solid fuel use ranges from 46% in the Western Pacific, to 35% in the Eastern Mediterranean and < 20% in the Americas and Europe. Conclusion: Multilevel modeling is a suitable technique for deriving reliable solid-fuel use estimates. Worldwide, the proportion of households cooking mainly with solid fuels is decreasing. The absolute number of persons using solid fuels, however, has remained steady globally and is increasing in some regions. Surveys require enhancement to better capture the health implications of new technologies and multiple fuel use. PMID:23674502
A new framework for estimating return levels using regional frequency analysis
NASA Astrophysics Data System (ADS)
Winter, Hugo; Bernardara, Pietro; Clegg, Georgina
2017-04-01
We propose a new framework for incorporating more spatial and temporal information into the estimation of extreme return levels. Currently, most studies use extreme value models applied to data from a single site; an approach which is inefficient statistically and leads to return level estimates that are less physically realistic. We aim to highlight the benefits that could be obtained by using methodology based upon regional frequency analysis as opposed to classic single site extreme value analysis. This motivates a shift in thinking, which permits the evaluation of local and regional effects and makes use of the wide variety of data that are now available on high temporal and spatial resolutions. The recent winter storms over the UK during the winters of 2013-14 and 2015-16, which have caused wide-ranging disruption and damaged important infrastructure, provide the main motivation for the current work. One of the most impactful natural hazards is flooding, which is often initiated by extreme precipitation. In this presentation, we focus on extreme rainfall, but shall discuss other meteorological variables alongside potentially damaging hazard combinations. To understand the risks posed by extreme precipitation, we need reliable statistical models which can be used to estimate quantities such as the T-year return level, i.e. the level which is expected to be exceeded once every T-years. Extreme value theory provides the main collection of statistical models that can be used to estimate the risks posed by extreme precipitation events. Broadly, at a single site, a statistical model is fitted to exceedances of a high threshold and the model is used to extrapolate to levels beyond the range of the observed data. However, when we have data at many sites over a spatial domain, fitting a separate model for each separate site makes little sense and it would be better if we could incorporate all this information to improve the reliability of return level estimates. 
Here, we use the regional frequency analysis approach to define homogeneous regions which are affected by the same storms. Extreme value models are then fitted to the data pooled from across a region. We find that this approach leads to more spatially consistent return level estimates with reduced uncertainty bounds.
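The T-year return level described above can be illustrated with the standard closed form for a generalized Pareto model fitted to threshold exceedances. This is a sketch under stated assumptions, not the authors' regional framework: the parameter names are illustrative, and the fitted threshold, scale, shape and exceedance probability are taken as given.

```python
import math


def gpd_return_level(threshold, scale, shape, exceed_prob, obs_per_year, t_years):
    """Level expected to be exceeded once every t_years under a GPD tail model.

    exceed_prob is the estimated probability that a single observation
    exceeds the threshold; obs_per_year is the number of observations per year.
    """
    m = t_years * obs_per_year  # number of observations in t_years
    if abs(shape) < 1e-12:
        # Limiting case of an exponential tail (shape parameter equal to zero)
        return threshold + scale * math.log(m * exceed_prob)
    return threshold + (scale / shape) * ((m * exceed_prob) ** shape - 1.0)
```

In a pooled regional analysis the same formula would apply, with the parameters estimated from data combined across a homogeneous region rather than from a single site.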
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Methods and Costs to Achieve Ultra Reliable Life Support
NASA Technical Reports Server (NTRS)
Jones, Harry W.
2012-01-01
A published Mars mission is used to explore the methods and costs to achieve ultra reliable life support. The Mars mission and its recycling life support design are described. The life support systems were made triply redundant, implying that each individual system will have fairly good reliability. Ultra reliable life support is needed for Mars and other long, distant missions. Current systems apparently have insufficient reliability. The life cycle cost of the Mars life support system is estimated. Reliability can be increased by improving the intrinsic system reliability, adding spare parts, or by providing technically diverse redundant systems. The costs of these approaches are estimated. Adding spares is least costly but may be defeated by common cause failures. Using two technically diverse systems is effective but doubles the life cycle cost. Achieving ultra reliability is worth its high cost because the penalty for failure is very high.
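The remark that redundancy or spares may be defeated by common cause failures can be made concrete with a small calculation. The sketch below assumes independent unit failures plus a single common cause event that disables every unit at once; this simplified model and all names and numbers are illustrative assumptions, not taken from the report.

```python
def redundant_reliability(unit_reliability, n_units, common_cause_prob=0.0):
    """Reliability of n redundant units where any one working unit suffices.

    Units are assumed to fail independently, except for a common cause
    event (probability common_cause_prob) that disables all units at once.
    """
    # The system survives independent failures unless every unit fails
    independent = 1.0 - (1.0 - unit_reliability) ** n_units
    return (1.0 - common_cause_prob) * independent
```

With a hypothetical unit reliability of 0.9, triple redundancy gives 0.999 when failures are independent, but a 1% common cause probability caps the result near 0.989 regardless of how many identical units are added.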
Reliability of digital reactor protection system based on extenics.
Zhao, Jing; He, Ya-Nan; Gu, Peng-Fei; Chen, Wei-Hua; Gao, Feng
2016-01-01
After the Fukushima nuclear accident, the safety of nuclear power plants (NPPs) has become a widespread concern. The reliability of the reactor protection system (RPS) is directly related to the safety of NPPs; however, it is difficult to accurately evaluate the reliability of a digital RPS. Methods based on probability estimation carry uncertainties, and they cannot reflect the reliability status of the RPS dynamically or support maintenance and troubleshooting. In this paper, a quantitative reliability analysis method based on extenics is proposed for the safety-critical digital RPS, by which the relationship between the reliability and the response time of the RPS is constructed. The reliability of the RPS for a CPR1000 NPP is modeled and analyzed by the proposed method as an example. The results show that the proposed method is capable of estimating the RPS reliability effectively and of supporting maintenance and troubleshooting of the digital RPS.
Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W.; Imel, Zac E.; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C.
2014-01-01
The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. PMID:25242192
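Utterance-level agreement between two coders is often summarized with a chance-corrected statistic. The abstract does not say which three methods were used, so the sketch below shows Cohen's kappa purely as one common choice; the function name and input layout are assumptions.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labeled the same utterances in order."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of utterances")
    n = len(codes_a)
    # Proportion of utterances on which the two coders agree
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Agreement expected by chance from each coder's marginal code frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1.0 - expected)
```

Session tallies, by contrast, compare only the per-session counts of each code, so two coders can match on tallies while disagreeing on which utterances received the code.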
Lord, Sarah Peregrine; Can, Doğan; Yi, Michael; Marin, Rebeca; Dunn, Christopher W; Imel, Zac E; Georgiou, Panayiotis; Narayanan, Shrikanth; Steyvers, Mark; Atkins, David C
2015-02-01
The current paper presents novel methods for collecting MISC data and accurately assessing reliability of behavior codes at the level of the utterance. The MISC 2.1 was used to rate MI interviews from five randomized trials targeting alcohol and drug use. Sessions were coded at the utterance-level. Utterance-based coding reliability was estimated using three methods and compared to traditional reliability estimates of session tallies. Session-level reliability was generally higher compared to reliability using utterance-based codes, suggesting that typical methods for MISC reliability may be biased. These novel methods in MI fidelity data collection and reliability assessment provided rich data for therapist feedback and further analyses. Beyond implications for fidelity coding, utterance-level coding schemes may elucidate important elements in the counselor-client interaction that could inform theories of change and the practice of MI. Copyright © 2015 Elsevier Inc. All rights reserved.
Low-flow characteristics of streams in the lower Wisconsin River basin
Gebert, W.A.
1978-01-01
Low-flow characteristics estimated for the lower Wisconsin River basin have a high degree of reliability when compared with other basins in Wisconsin. Reliable estimates appear to be related to the relatively uniform geologic features in the basin.
Autonomous navigation system based on GPS and magnetometer data
NASA Technical Reports Server (NTRS)
Julie, Thienel K. (Inventor); Richard, Harman R. (Inventor); Bar-Itzhack, Itzhack Y. (Inventor)
2004-01-01
This invention is drawn to an autonomous navigation system using Global Positioning System (GPS) and magnetometers for low Earth orbit satellites. As a magnetometer is reliable and always provides information on spacecraft attitude, rate, and orbit, the magnetometer-GPS configuration solves the GPS initialization problem, decreasing the convergence time for the navigation estimate and improving the overall accuracy. Eventually the magnetometer-GPS configuration enables the system to avoid a costly and inherently less reliable gyro for rate estimation. Being autonomous, this invention would provide for black-box spacecraft navigation, producing attitude, orbit, and rate estimates with high accuracy and reliability without any ground input.
Modeling the erythemal surface diffuse irradiance fraction for Badajoz, Spain
NASA Astrophysics Data System (ADS)
Sanchez, Guadalupe; Serrano, Antonio; Cancillo, María Luisa
2017-10-01
Despite its important role on the human health and numerous biological processes, the diffuse component of the erythemal ultraviolet irradiance (UVER) is scarcely measured at standard radiometric stations and therefore needs to be estimated. This study proposes and compares 10 empirical models to estimate the UVER diffuse fraction. These models are inspired from mathematical expressions originally used to estimate total diffuse fraction, but, in this study, they are applied to the UVER case and tested against experimental measurements. In addition to adapting to the UVER range the various independent variables involved in these models, the total ozone column has been added in order to account for its strong impact on the attenuation of ultraviolet radiation. The proposed models are fitted to experimental measurements and validated against an independent subset. The best-performing model (RAU3) is based on a model proposed by Ruiz-Arias et al. (2010) and shows values of r2 equal to 0.91 and relative root-mean-square error (rRMSE) equal to 6.1 %. The performance achieved by this entirely empirical model is better than those obtained by previous semi-empirical approaches and therefore needs no additional information from other physically based models. This study expands on previous research to the ultraviolet range and provides reliable empirical models to accurately estimate the UVER diffuse fraction.
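The relative root-mean-square error quoted above can be computed as follows. Definitions of rRMSE vary between studies; this sketch assumes normalization by the mean of the observed values and expression as a percentage, which may differ from the authors' exact definition.

```python
import math


def relative_rmse(observed, predicted):
    """RMSE of the predictions as a percentage of the mean observed value."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equally long")
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)
```

Computing the statistic on a validation subset that was held out from fitting, as the abstract describes, keeps it from being flattered by overfitting.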
The Challenges of Credible Thermal Protection System Reliability Quantification
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2013-01-01
The paper discusses several of the challenges associated with developing a credible reliability estimate for a human-rated crew capsule thermal protection system. The process of developing such a credible estimate is subject to the quantification, modeling and propagation of numerous uncertainties within a probabilistic analysis. The development of specific investment recommendations, to improve the reliability prediction, among various potential testing and programmatic options is then accomplished through Bayesian analysis.
Anderson, Donald D; Segal, Neil A; Kern, Andrew M; Nevitt, Michael C; Torner, James C; Lynch, John A
2012-01-01
Recent findings suggest that contact stress is a potent predictor of subsequent symptomatic osteoarthritis development in the knee. However, much larger numbers of knees (likely on the order of hundreds, if not thousands) need to be reliably analyzed to achieve the statistical power necessary to clarify this relationship. This study assessed the reliability of new semiautomated computational methods for estimating contact stress in knees from large population-based cohorts. Ten knees of subjects from the Multicenter Osteoarthritis Study were included. Bone surfaces were manually segmented from sequential 1.0 Tesla magnetic resonance imaging slices by three individuals on two nonconsecutive days. Four individuals then registered the resulting bone surfaces to corresponding bone edges on weight-bearing radiographs, using a semi-automated algorithm. Discrete element analysis methods were used to estimate contact stress distributions for each knee. Segmentation and registration reliabilities (day-to-day and interrater) for peak and mean medial and lateral tibiofemoral contact stress were assessed with Shrout-Fleiss intraclass correlation coefficients (ICCs). The segmentation and registration steps of the modeling approach were found to have excellent day-to-day (ICC 0.93-0.99) and good inter-rater reliability (0.84-0.97). This approach for estimating compartment-specific tibiofemoral contact stress appears to be sufficiently reliable for use in large population-based cohorts.
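For readers unfamiliar with Shrout-Fleiss ICCs, the sketch below computes one of the standard forms, ICC(2,1): two-way random effects, single rater, absolute agreement. The abstract does not state which form was used, so this choice and the function name are assumptions.

```python
def icc_2_1(ratings):
    """Shrout-Fleiss ICC(2,1) from a subjects-by-raters table.

    ratings is a list of rows, one per subject; each row holds one score
    per rater, and every subject is scored by every rater.
    """
    n = len(ratings)     # number of subjects
    k = len(ratings[0])  # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, and residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because this form penalizes systematic offsets between raters, it is typically lower than consistency-type ICCs computed on the same data.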
Khodr, Zeina G.; Sak, Mark A.; Pfeiffer, Ruth M.; Duric, Nebojsa; Littrup, Peter; Bey-Knight, Lisa; Ali, Haythem; Vallieres, Patricia; Sherman, Mark E.; Gierach, Gretchen L.
2015-01-01
Purpose: High breast density, as measured by mammography, is associated with increased breast cancer risk, but standard methods of assessment have limitations including 2D representation of breast tissue, distortion due to breast compression, and use of ionizing radiation. Ultrasound tomography (UST) is a novel imaging method that averts these limitations and uses sound speed measures rather than x-ray imaging to estimate breast density. The authors evaluated the reproducibility of measures of speed of sound and changes in this parameter using UST. Methods: One experienced and five newly trained raters measured sound speed in serial UST scans for 22 women (two scans per person) to assess inter-rater reliability. Intrarater reliability was assessed for four raters. A random effects model was used to calculate the percent variation in sound speed and change in sound speed attributable to subject, scan, rater, and repeat reads. The authors estimated the intraclass correlation coefficients (ICCs) for these measures based on data from the authors’ experienced rater. Results: Median (range) time between baseline and follow-up UST scans was five (1–13) months. Contributions of factors to sound speed variance were differences between subjects (86.0%), baseline versus follow-up scans (7.5%), inter-rater evaluations (1.1%), and intrarater reproducibility (∼0%). When evaluating change in sound speed between scans, 2.7% and ∼0% of variation were attributed to inter- and intrarater variation, respectively. For the experienced rater’s repeat reads, agreement for sound speed was excellent (ICC = 93.4%) and for change in sound speed substantial (ICC = 70.4%), indicating very good reproducibility of these measures. Conclusions: UST provided highly reproducible sound speed measurements, which reflect breast density, suggesting that UST has utility in sensitively assessing change in density. PMID:26429241
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khodr, Zeina G.; Pfeiffer, Ruth M.; Gierach, Gretchen L., E-mail: GierachG@mail.nih.gov
Purpose: High breast density, as measured by mammography, is associated with increased breast cancer risk, but standard methods of assessment have limitations including 2D representation of breast tissue, distortion due to breast compression, and use of ionizing radiation. Ultrasound tomography (UST) is a novel imaging method that averts these limitations and uses sound speed measures rather than x-ray imaging to estimate breast density. The authors evaluated the reproducibility of measures of speed of sound and changes in this parameter using UST. Methods: One experienced and five newly trained raters measured sound speed in serial UST scans for 22 women (two scans per person) to assess inter-rater reliability. Intrarater reliability was assessed for four raters. A random effects model was used to calculate the percent variation in sound speed and change in sound speed attributable to subject, scan, rater, and repeat reads. The authors estimated the intraclass correlation coefficients (ICCs) for these measures based on data from the authors’ experienced rater. Results: Median (range) time between baseline and follow-up UST scans was five (1–13) months. Contributions of factors to sound speed variance were differences between subjects (86.0%), baseline versus follow-up scans (7.5%), inter-rater evaluations (1.1%), and intrarater reproducibility (∼0%). When evaluating change in sound speed between scans, 2.7% and ∼0% of variation were attributed to inter- and intrarater variation, respectively. For the experienced rater’s repeat reads, agreement for sound speed was excellent (ICC = 93.4%) and for change in sound speed substantial (ICC = 70.4%), indicating very good reproducibility of these measures. Conclusions: UST provided highly reproducible sound speed measurements, which reflect breast density, suggesting that UST has utility in sensitively assessing change in density.
Development of a Method to Observe Preschoolers' Packed Lunches in Early Care and Education Centers.
Sweitzer, Sara J; Byrd-Williams, Courtney E; Ranjit, Nalini; Romo-Palafox, Maria Jose; Briley, Margaret E; Roberts-Gray, Cynthia R; Hoelscher, Deanna M
2015-08-01
As early childhood education (ECE) centers become a more common setting for nutrition interventions, a variety of data collection methods are required, based on the center foodservice. ECE centers that require parents to send in meals and/or snacks from home present a unique challenge for accurate nutrition estimation and data collection. We present an observational methodology for recording the contents and temperature of preschool-aged children's lunchboxes and data to support a 2-day vs a 3-day collection period. Lunchbox observers were trained in visual estimation of foods based on Child and Adult Care Food Program and MyPlate servings and household recommended measures. Trainees weighed and measured foods commonly found in preschool-aged children's lunchboxes and practiced recording accurate descriptions and food temperatures. Training included test assessments of whole-grain bread products, mixed dishes such as macaroni and cheese, and a variety of sandwich preparations. Validity of the estimation method was tested by comparing estimated to actual amounts for several distinct food types. Reliability was assessed by computing the intraclass correlation coefficient for each observer as well as an interrater reliability coefficient across observers. To compare 2- and 3-day observations, 2 of the 3 days of observations were randomly selected for each child and analyzed as a separate dataset. Linear model estimated mean and standard error of whole grains, fruits and vegetables, and amounts of energy, carbohydrates, protein, total fat, saturated fat, dietary fiber, thiamin, riboflavin, niacin, vitamins A and C, calcium, iron, sodium, and dietary fiber per lunch were compared across the 2- and 3-day observation datasets. 
The mean estimated amounts across 11 observers were statistically indistinguishable from the measured portion size for each of the 41 test foods, implying that the visual estimation measurement method was valid: intraobserver intraclass correlation coefficients ranged from 0.951 (95% CI 0.91 to 0.97) to 1.0. Across observers, the interrater reliability correlation coefficient was estimated at 0.979 (95% CI 0.957 to 0.993). Comparison of servings of fruits, vegetables, and whole grains showed no significant differences for serving size or mean energy and nutrient content between 2- and 3-day lunch observations. The methodology is a valid and reliable option for use in research and practice that requires observing and assessing the contents and portion sizes of food items in preschool-aged children's lunchboxes in an ECE setting. The use of visual observation and estimation with Child and Adult Care Food Program and MyPlate serving sizes and household measures over 2 random days of data collection enables food handling to be minimized while obtaining an accurate record of the variety and quantities of foods that young children are exposed to at lunch time. Copyright © 2015 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
Stamey, Timothy C.
1998-01-01
Simple and reliable methods for estimating hourly streamflow are needed for the calibration and verification of a Chattahoochee River basin model between Buford Dam and Franklin, Ga. The river basin model is being developed by Georgia Department of Natural Resources, Environmental Protection Division, as part of their Chattahoochee River Modeling Project. Concurrent streamflow data collected at 19 continuous-record, and 31 partial-record streamflow stations, were used in ordinary least-squares linear regression analyses to define estimating equations, and in verifying drainage-area prorations. The resulting regression or drainage-area ratio estimating equations were used to compute hourly streamflow at the partial-record stations. The coefficients of determination (r-squared values) for the regression estimating equations ranged from 0.90 to 0.99. Observed and estimated hourly and daily streamflow data were computed for May 1, 1995, through October 31, 1995. Comparisons of observed and estimated daily streamflow data for 12 continuous-record tributary stations, that had available streamflow data for all or part of the period from May 1, 1995, to October 31, 1995, indicate that the mean error of estimate for the daily streamflow was about 25 percent.
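The two estimating approaches named above, ordinary least-squares regression and drainage-area ratio proration, can be sketched as follows. The sketch assumes untransformed flows and a simple one-predictor regression; the report may have used transformed data, and all function names here are illustrative.

```python
def drainage_area_ratio_flow(index_flow, index_area, target_area):
    """Prorate flow at an index station to a target site by drainage area."""
    return index_flow * (target_area / index_area)


def ols_fit(x, y):
    """Least-squares line y = slope * x + intercept, with r-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor fit, r-squared equals the squared correlation
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

In use, x would hold concurrent flows at a continuous-record station and y the flows measured at a partial-record station; the fitted line then converts the continuous hourly record into estimates at the partial-record site.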
Silvestro, Daniele; Tejedor, Marcelo F; Serrano-Serrano, Martha L; Loiseau, Oriane; Rossier, Victor; Rolland, Jonathan; Zizka, Alexander; Höhna, Sebastian; Antonelli, Alexandre; Salamin, Nicolas
2018-06-20
New World monkeys (platyrrhines) are one of the most diverse groups of primates, occupying today a wide range of ecosystems in the American tropics and exhibiting large variations in ecology, morphology, and behavior. Although the relationships among the almost 200 living species are relatively well understood, we lack robust estimates of the timing of origin, ancestral morphology, and geographic range evolution of the clade. Here we integrate paleontological and molecular evidence to assess the evolutionary dynamics of extinct and extant platyrrhines. We develop novel analytical frameworks to infer the evolution of body mass, changes in latitudinal ranges through time, and species diversification rates using a phylogenetic tree of living and fossil taxa. Our results show that platyrrhines originated 5-10 million years earlier than previously assumed, dating back to the Middle Eocene. The estimated ancestral platyrrhine was small - weighing 0.4 kg - and matched the size of their presumed African ancestors. As the three platyrrhine families diverged, we recover a rapid change in body mass range. During the Miocene Climatic Optimum, fossil diversity peaked and platyrrhines reached their widest latitudinal range, expanding as far South as Patagonia, favored by warm and humid climate and the lower elevation of the Andes. Finally, global cooling and aridification after the middle Miocene triggered a geographic contraction of New World monkeys and increased their extinction rates. These results unveil the full evolutionary trajectory of an iconic and ecologically important radiation of monkeys and showcase the necessity of integrating fossil and molecular data for reliably estimating evolutionary rates and trends.
Survey on Ranging Sensors and Cooperative Techniques for Relative Positioning of Vehicles
de Ponte Müller, Fabian
2017-01-01
Future driver assistance systems will rely on accurate, reliable and continuous knowledge on the position of other road participants, including pedestrians, bicycles and other vehicles. The usual approach to tackle this requirement is to use on-board ranging sensors inside the vehicle. Radar, laser scanners or vision-based systems are able to detect objects in their line-of-sight. In contrast to these non-cooperative ranging sensors, cooperative approaches follow a strategy in which other road participants actively support the estimation of the relative position. The limitations of on-board ranging sensors regarding their detection range and angle of view and the facility of blockage can be approached by using a cooperative approach based on vehicle-to-vehicle communication. The fusion of both, cooperative and non-cooperative strategies, seems to offer the largest benefits regarding accuracy, availability and robustness. This survey offers the reader a comprehensive review on different techniques for vehicle relative positioning. The reader will learn the important performance indicators when it comes to relative positioning of vehicles, the different technologies that are both commercially available and currently under research, their expected performance and their intrinsic limitations. Moreover, the latest research in the area of vision-based systems for vehicle detection, as well as the latest work on GNSS-based vehicle localization and vehicular communication for relative positioning of vehicles, are reviewed. The survey also includes the research work on the fusion of cooperative and non-cooperative approaches to increase the reliability and the availability. PMID:28146129
A low cost, simple, portable instrument for the measurement of infra-red reflectance of paints
NASA Astrophysics Data System (ADS)
Marson, F.
1982-05-01
The construction and design of a low cost, simple, portable infra-red reflectometer which can be used to estimate the reflectance of paint films in the 800 nm region is described. The infra-red reflectances of a range of lustreless, semigloss and gloss olive drab camouflage paints determined using this instrument are compared to those obtained using modified commercial equipment and to the reflectances measured at 800 nm using a Cary model 17 spectrophotometer. The new reflectometer was shown to be superior to the modified commercial instrument currently specified in Australian government paint specifications and to be capable of estimating the reflectance of olive drab paints to within about one per cent of the Cary derived reflectance values. The reflectance values for a range of 24 experimental coatings made with pigments of varying absorption in the infra-red region are used to illustrate the effect of the instrument's spectral response and the necessity of establishing a reliable working standard.
Mayo, Ann M
2015-01-01
It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
Ren, Pengyu; Li, Bowen; Dong, Shiyao; Chen, Lin; Zhang, Yuelin
2018-01-01
Although many mathematical methods have been used to analyze neural activity under sinusoidal stimulation within the linear response range of the vestibular system, the reliability of these methods has not been reported, especially in the nonlinear response range. Here we chose a nonlinear least-squares algorithm (NLSA) with a sinusoidal model to analyze the neural response of semicircular canal neurons (SCNs) during sinusoidal rotational stimulation (SRS) over a nonlinear response range. Our aim was to obtain a reliable mathematical method for data analysis under SRS in the vestibular system. Our data indicated that the reliability of this method across an entire SCN population was quite satisfactory. However, the reliability depended strongly and negatively on the neural discharge regularity. In addition, stimulation parameters were vital factors influencing the reliability: the frequency had a significant negative effect, whereas the amplitude had a conspicuous positive effect. Thus, NLSA with a sinusoidal model proved to be a reliable mathematical tool for analyzing neural response activity under SRS in the vestibular system, and it was more suitable for stimulation with low frequency but high amplitude, suggesting that this method can be used in the nonlinear response range. This method overcomes the restriction on neural activity analysis in the nonlinear response range and provides a solid foundation for future study of the nonlinear response range in the vestibular system.
Li, Bowen; Dong, Shiyao; Chen, Lin; Zhang, Yuelin
2018-01-01
Although many mathematical methods have been used to analyze neural activity under sinusoidal stimulation within the linear response range of the vestibular system, the reliability of these methods has not been reported, especially in the nonlinear response range. Here we chose a nonlinear least-squares algorithm (NLSA) with a sinusoidal model to analyze the neural response of semicircular canal neurons (SCNs) during sinusoidal rotational stimulation (SRS) over a nonlinear response range. Our aim was to obtain a reliable mathematical method for data analysis under SRS in the vestibular system. Our data indicated that the reliability of this method across an entire SCN population was quite satisfactory. However, the reliability depended strongly and negatively on the neural discharge regularity. In addition, stimulation parameters were vital factors influencing the reliability: the frequency had a significant negative effect, whereas the amplitude had a conspicuous positive effect. Thus, NLSA with a sinusoidal model proved to be a reliable mathematical tool for analyzing neural response activity under SRS in the vestibular system, and it was more suitable for stimulation with low frequency but high amplitude, suggesting that this method can be used in the nonlinear response range. This method overcomes the restriction on neural activity analysis in the nonlinear response range and provides a solid foundation for future study of the nonlinear response range in the vestibular system. PMID:29304173
Hassani, Lale; Dehdari, Tahereh; Hajizadeh, Ebrahim; Shojaeizadeh, Davoud; Abedini, Mehrandokht; Nedjat, Saharnaz
2014-01-01
Given that there are many Iranian women who have never had a Pap smear, this study was designed to develop and validate a measurement tool based on the Protection Motivation Theory to assess factors influencing Iranian women's intention to undergo their first Pap test. In this psychometric research, to determine the Content Validity Index (CVI) and the Content Validity Ratio (CVR), a panel of experts (n=10) reviewed scale items. Reliability was estimated through the Intraclass Correlation Coefficient (n=30) and internal consistency (n=240). Also, factor analysis (exploratory and confirmatory) was performed on the data of the sample women who had never had a Pap smear test (n=240). A 26-item questionnaire was developed. The CVI and CVR scores of the scale were 0.89 and 0.90, respectively. Exploratory factor analysis yielded a 26-item questionnaire with seven factors (perceived vulnerability and severity, fear, response costs, response efficacy, self-efficacy, and protection motivation (or intention)) that jointly accounted for 72.76% of the observed variance. Confirmatory factor analysis indicated a good fit for the data. Internal consistency (range 0.70-0.93) and test-retest reliability (range 0.72-0.96) of sub-scales were acceptable. This study showed that the designed instrument was a valid and reliable tool for measuring the factors influencing women's intention to undergo their first Pap test.
User's guide to the Reliability Estimation System Testbed (REST)
NASA Technical Reports Server (NTRS)
Nicol, David M.; Palumbo, Daniel L.; Rifkin, Adam
1992-01-01
The Reliability Estimation System Testbed is an X-window based reliability modeling tool that was created to explore the use of the Reliability Modeling Language (RML). RML was defined to support several reliability analysis techniques including modularization, graphical representation, Failure Mode Effects Simulation (FMES), and parallel processing. These techniques are most useful in modeling large systems. Using modularization, an analyst can create reliability models for individual system components. The modules can be tested separately and then combined to compute the total system reliability. Because a one-to-one relationship can be established between system components and the reliability modules, a graphical user interface may be used to describe the system model. RML was designed to permit message passing between modules. This feature enables reliability modeling based on a run time simulation of the system wide effects of a component's failure modes. The use of failure modes effects simulation enhances the analyst's ability to correctly express system behavior when using the modularization approach to reliability modeling. To alleviate the computation bottleneck often found in large reliability models, REST was designed to take advantage of parallel processing on hypercube processors.
NASA Astrophysics Data System (ADS)
Castellarin, A.; Montanari, A.; Brath, A.
2002-12-01
The study derives Regional Depth-Duration-Frequency (RDDF) equations for a wide region of northern-central Italy (37,200 km²) by following an adaptation of the approach originally proposed by Alila [WRR, 36(7), 2000]. The proposed RDDF equations have a rather simple structure and allow an estimation of the design storm, defined as the rainfall depth expected for a given storm duration and recurrence interval, in any location of the study area for storm durations from 1 to 24 hours and for recurrence intervals up to 100 years. The reliability of the proposed RDDF equations represents the main concern of the study and it is assessed at two different levels. The first level considers the gauged sites and compares estimates of the design storm obtained with the RDDF equations with at-site estimates based upon the observed annual maximum series of rainfall depth and with design storm estimates resulting from a regional estimator recently developed for the study area through a Hierarchical Regional Approach (HRA) [Gabriele and Arnell, WRR, 27(6), 1991]. The second level performs a reliability assessment of the RDDF equations for ungauged sites by means of a jack-knife procedure. Using the HRA estimator as a reference term, the jack-knife procedure assesses the reliability of design storm estimates provided by the RDDF equations for a given location when dealing with the complete absence of pluviometric information. The results of the analysis show that the proposed RDDF equations represent practical and effective computational means for producing a first guess of the design storm at the available raingauges and reliable design storm estimates for ungauged locations. The first author gratefully acknowledges D.H. Burn for sponsoring the submission of the present abstract.
Pateras, Konstantinos; Nikolakopoulos, Stavros; Mavridis, Dimitris; Roes, Kit C B
2018-03-01
When a meta-analysis consists of a few small trials that report zero events, accounting for heterogeneity in the (interval) estimation of the overall effect is challenging. Typically, we predefine meta-analytical methods to be employed. In practice, data poses restrictions that lead to deviations from the pre-planned analysis, such as the presence of zero events in at least one study arm. We aim to explore heterogeneity estimators behaviour in estimating the overall effect across different levels of sparsity of events. We performed a simulation study that consists of two evaluations. We considered an overall comparison of estimators unconditional on the number of observed zero cells and an additional one by conditioning on the number of observed zero cells. Estimators that performed modestly robust when (interval) estimating the overall treatment effect across a range of heterogeneity assumptions were the Sidik-Jonkman, Hartung-Makambi and improved Paul-Mandel. The relative performance of estimators did not materially differ between making a predefined or data-driven choice. Our investigations confirmed that heterogeneity in such settings cannot be estimated reliably. Estimators whose performance depends strongly on the presence of heterogeneity should be avoided. The choice of estimator does not need to depend on whether or not zero cells are observed.
Downs, Stephen; Marquez, Jodie; Chiarelli, Pauline
2013-06-01
What is the intra-rater and inter-rater relative reliability of the Berg Balance Scale? What is the absolute reliability of the Berg Balance Scale? Does the absolute reliability of the Berg Balance Scale vary across the scale? Systematic review with meta-analysis of reliability studies. Any clinical population that has undergone assessment with the Berg Balance Scale. Relative intra-rater reliability, relative inter-rater reliability, and absolute reliability. Eleven studies involving 668 participants were included in the review. The relative intrarater reliability of the Berg Balance Scale was high, with a pooled estimate of 0.98 (95% CI 0.97 to 0.99). Relative inter-rater reliability was also high, with a pooled estimate of 0.97 (95% CI 0.96 to 0.98). A ceiling effect of the Berg Balance Scale was evident for some participants. In the analysis of absolute reliability, all of the relevant studies had an average score of 20 or above on the 0 to 56 point Berg Balance Scale. The absolute reliability across this part of the scale, as measured by the minimal detectable change with 95% confidence, varied between 2.8 points and 6.6 points. The Berg Balance Scale has a higher absolute reliability when close to 56 points due to the ceiling effect. We identified no data that estimated the absolute reliability of the Berg Balance Scale among participants with a mean score below 20 out of 56. The Berg Balance Scale has acceptable reliability, although it might not detect modest, clinically important changes in balance in individual subjects. The review was only able to comment on the absolute reliability of the Berg Balance Scale among people with moderately poor to normal balance. Copyright © 2013 Australian Physiotherapy Association. Published by .. All rights reserved.
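Minimal detectable change with 95% confidence is commonly derived from the standard error of measurement (SEM) as MDC95 = 1.96 × √2 × SEM, with SEM = SD × √(1 − ICC). The sketch below only illustrates that arithmetic. It assumes this common formula (the abstract does not state which formula the included studies used), and the function names and input values are hypothetical, not data from the review.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence (assumed common formula)."""
    # sqrt(2) accounts for measurement error in both of the two scores being compared.
    return 1.96 * math.sqrt(2.0) * sem_from_icc(sd, icc)


# Hypothetical inputs: SD of 5 scale points and a reliability of 0.98.
example_mdc = mdc95(sd=5.0, icc=0.98)
print(f"MDC95 = {example_mdc:.2f} points")
```

A change in an individual's score smaller than the returned value cannot be distinguished from measurement error at the 95% level under these assumptions.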
Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S
2015-01-16
Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.
Wagner, Brian J.; Gorelick, Steven M.
1986-01-01
A simulation nonlinear multiple-regression methodology for estimating parameters that characterize the transport of contaminants is developed and demonstrated. Finite difference contaminant transport simulation is combined with a nonlinear weighted least squares multiple-regression procedure. The technique provides optimal parameter estimates and gives statistics for assessing the reliability of these estimates under certain general assumptions about the distributions of the random measurement errors. Monte Carlo analysis is used to estimate parameter reliability for a hypothetical homogeneous soil column for which concentration data contain large random measurement errors. The value of data collected spatially versus data collected temporally was investigated for estimation of velocity, dispersion coefficient, effective porosity, first-order decay rate, and zero-order production. The use of spatial data gave estimates that were 2–3 times more reliable than estimates based on temporal data for all parameters except velocity. Comparison of estimated linear and nonlinear confidence intervals based upon Monte Carlo analysis showed that the linear approximation is poor for dispersion coefficient and zero-order production coefficient when data are collected over time. In addition, examples demonstrate transport parameter estimation for two real one-dimensional systems. First, the longitudinal dispersivity and effective porosity of an unsaturated soil are estimated using laboratory column data. We compare the reliability of estimates based upon data from individual laboratory experiments versus estimates based upon pooled data from several experiments. Second, the simulation nonlinear regression procedure is extended to include an additional governing equation that describes delayed storage during contaminant transport. The model is applied to analyze the trends, variability, and interrelationship of parameters in a mountain stream in northern California.
Component Analysis of Errors on PERSIANN Precipitation Estimates over Urmia Lake Basin, IRAN
NASA Astrophysics Data System (ADS)
Ghajarnia, N.; Daneshkar Arasteh, P.; Liaghat, A. M.; Araghinejad, S.
2016-12-01
In this study, the PERSIANN daily dataset is evaluated from 2000 to 2011 in 69 pixels over the Urmia Lake basin in northwestern Iran. Different analytical approaches and indexes are used to examine PERSIANN's precision in detecting and estimating rainfall rate. The residuals are decomposed into Hit, Miss and FA estimation biases, while the continuous decomposition into systematic and random error components is also analyzed seasonally and categorically. A new interpretation of estimation accuracy, named "reliability on PERSIANN estimations", is introduced, while the behavior of existing categorical/statistical measures and error components is also analyzed seasonally over different rainfall rate categories. This study yields new insights into the nature of PERSIANN errors over the Urmia Lake basin, a semi-arid region in the Middle East, including the following:
- The analyzed contingency table indexes indicate better detection precision during spring and fall.
- A relatively constant level of error is generally observed among different categories. The range of precipitation estimates at different rainfall rate categories is nearly invariant, a sign of systematic error.
- A low level of reliability is observed in PERSIANN estimations at different categories, mostly associated with a high level of FA error. However, as the rate of precipitation increases, the ability and precision of PERSIANN in rainfall detection also increase.
- The systematic and random error decomposition in this area shows that PERSIANN has more difficulty in modeling the system and pattern of rainfall rather than having bias due to rainfall uncertainties. The level of systematic error also increases considerably in heavier rainfalls.
It is also important to note that PERSIANN error characteristics vary by season with the conditions and rainfall patterns of that season, which shows the need for seasonally different approaches to calibrating this product. Overall, we believe that the analysis of different error components performed in this study can substantially help further local studies on post-calibration and bias reduction of PERSIANN estimations.
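The abstract does not name the contingency table indexes it analyzes. As a minimal sketch, assuming the commonly used probability of detection (POD), false alarm ratio (FAR) and critical success index (CSI), the indexes can be computed from counts of hits, misses and false alarms (FA). The function name and the counts are hypothetical, not values from the study.

```python
def contingency_indexes(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from rain/no-rain detection counts."""
    if hits + misses == 0 or hits + false_alarms == 0:
        raise ValueError("need at least one observed and one estimated rain event")
    pod = hits / (hits + misses)                  # share of observed rain events detected
    far = false_alarms / (hits + false_alarms)    # share of estimated rain events that were false
    csi = hits / (hits + misses + false_alarms)   # hits relative to all rain events, observed or estimated
    return pod, far, csi


# Hypothetical counts for one season and one rainfall rate category.
pod, far, csi = contingency_indexes(hits=50, misses=25, false_alarms=25)
print(f"POD={pod:.2f} FAR={far:.2f} CSI={csi:.2f}")
```

Computing the three indexes separately per season and per rainfall category would mirror the seasonal and categorical comparison the abstract describes.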
An empirical Bayes approach for the Poisson life distribution.
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1973-01-01
A smooth empirical Bayes estimator is derived for the intensity parameter (hazard rate) in the Poisson distribution as used in life testing. The reliability function is also estimated either by using the empirical Bayes estimate of the parameter, or by obtaining the expectation of the reliability function. The behavior of the empirical Bayes procedure is studied through Monte Carlo simulation in which estimates of mean-squared errors of the empirical Bayes estimators are compared with those of conventional estimators such as minimum variance unbiased or maximum likelihood. Results indicate a significant reduction in mean-squared error of the empirical Bayes estimators over the conventional variety.
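To make the idea of empirical Bayes shrinkage concrete, the sketch below fits a gamma prior to observed Poisson counts by the method of moments and returns the posterior mean intensity for each unit. This is a swapped-in parametric gamma-Poisson illustration, not the smooth empirical Bayes estimator derived in the report. It assumes one unit of exposure per observation, and the function names and counts are hypothetical.

```python
from statistics import mean, variance


def fit_gamma_prior(counts):
    """Method-of-moments estimates (shape, rate) of a gamma prior on the Poisson intensity."""
    m = mean(counts)
    v = variance(counts)  # sample variance with n - 1 in the denominator
    if v <= m:
        raise ValueError("no overdispersion: the moment estimate of the prior is undefined")
    rate = m / (v - m)    # marginal variance of the counts is m + m / rate
    shape = m * rate      # prior mean shape / rate equals m
    return shape, rate


def eb_intensity(count, shape, rate):
    """Posterior mean intensity for one unit observed over one unit of exposure."""
    return (shape + count) / (rate + 1.0)


counts = [0, 2, 4, 6, 8]  # hypothetical failure counts from past tests
shape, rate = fit_gamma_prior(counts)
estimates = [eb_intensity(c, shape, rate) for c in counts]
print([round(e, 2) for e in estimates])
```

Each estimate lies between the raw count and the overall mean, which is the shrinkage that tends to reduce mean-squared error relative to using each count on its own.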
Bedekar, Nilima; Suryawanshi, Mayuri; Rairikar, Savita; Sancheti, Parag; Shyam, Ashok
2014-01-01
Evaluation of range of motion (ROM) is an integral part of the assessment of the musculoskeletal system. It is required in health, fitness and pathological conditions, and it is also used as an objective outcome measure. Several methods are described to check spinal flexion range of motion. Different methods for measuring spine ranges have their advantages and disadvantages. Hence, a new device was introduced in this study using the dual inclinometer method to measure lumbar spine flexion range of motion (ROM). The aim was to determine the intra- and inter-rater reliability of a mobile device goniometer in measuring lumbar flexion range of motion. An iPod mobile device with goniometer software was used. The part being measured, i.e., the back of the subject, was suitably exposed. The subject stood with feet shoulder-width apart. The spinous processes of the second sacral vertebra (S2) and T12 were located; these were used as the reference points, and readings were taken. Three readings were taken for each of inter-rater and intra-rater reliability. Sufficient rest was given between each flexion movement. Intra-rater reliability using ICC was r=0.920 and inter-rater r=0.812 at CI 95%. Validity r=0.95. The mobile device goniometer has high intra-rater reliability. The inter-rater reliability was moderate. This device can be used to assess range of motion of spine flexion, representing uni-planar movement.
Cade, B.S.; Terrell, J.W.; Neely, B.C.
2011-01-01
Increasing our understanding of how environmental factors affect fish body condition and improving its utility as a metric of aquatic system health require reliable estimates of spatial variation in condition (weight at length). We used three statistical approaches that varied in how they accounted for heterogeneity in allometric growth to estimate differences in body condition of blue suckers Cycleptus elongatus across 19 large-river locations in the central USA. Quantile regression of an expanded allometric growth model provided the most comprehensive estimates, including variation in exponents within and among locations (range = 2.88–4.24). Blue suckers from more-southerly locations had the largest exponents. Mixed-effects mean regression of a similar expanded allometric growth model allowed exponents to vary among locations (range = 3.03–3.60). Mean relative weights compared across selected intervals of total length (TL = 510–594 and 594–692 mm) in a multiplicative model involved the implicit assumption that allometric exponents within and among locations were similar to the exponent (3.46) for the standard weight equation. Proportionate differences in the quantiles of weight at length for adult blue suckers (TL = 510, 594, 644, and 692 mm) compared with their average across locations ranged from 1.08 to 1.30 for southern locations (Texas, Mississippi) and from 0.84 to 1.00 for northern locations (Montana, North Dakota); proportionate differences for mean weight ranged from 1.13 to 1.17 and from 0.87 to 0.95, respectively, and those for mean relative weight ranged from 1.10 to 1.18 and from 0.86 to 0.98, respectively. Weights for fish at longer lengths varied by 600–700 g within a location and by as much as 2,000 g among southern and northern locations. 
Estimates for the Wabash River, Indiana (0.96–1.07 times the average; greatest increases for lower weights at shorter TLs), and for the Missouri River from Blair, Nebraska, to Sioux City, Iowa (0.90–1.00 times the average; greatest decreases for lower weights at longer TLs), were examined in detail to explain the additional information provided by quantile estimates.
Vision based object pose estimation for mobile robots
NASA Technical Reports Server (NTRS)
Wu, Annie; Bidlack, Clint; Katkere, Arun; Feague, Roy; Weymouth, Terry
1994-01-01
Mobile robot navigation using visual sensors requires that a robot be able to detect landmarks and obtain pose information from a camera image. This paper presents a vision system for finding man-made markers of known size and calculating the pose of these markers. The algorithm detects and identifies the markers using a weighted pattern matching template. Geometric constraints are then used to calculate the position of the markers relative to the robot. The selection of geometric constraints comes from the typical pose of most man-made signs, such as the sign standing vertical and the dimensions of known size. This system has been tested successfully on a wide range of real images. Marker detection is reliable, even in cluttered environments, and under certain marker orientations, estimation of the orientation has proven accurate to within 2 degrees, and distance estimation to within 0.3 meters.
Real-time moving horizon estimation for a vibrating active cantilever
NASA Astrophysics Data System (ADS)
Abdollahpouri, Mohammad; Takács, Gergely; Rohaľ-Ilkiv, Boris
2017-03-01
Vibrating structures may be subject to changes throughout their operating lifetime due to a range of environmental and technical factors. These variations can be considered as parameter changes in the dynamic model of the structure, while their online estimates can be utilized in adaptive control strategies, or in structural health monitoring. This paper implements the moving horizon estimation (MHE) algorithm on a low-cost embedded computing device that is jointly observing the dynamic states and parameter variations of an active cantilever beam in real time. The practical behavior of this algorithm has been investigated in various experimental scenarios. It has been found, that for the given field of application, moving horizon estimation converges faster than the extended Kalman filter; moreover, it handles atypical measurement noise, sensor errors or other extreme changes, reliably. Despite its improved performance, the experiments demonstrate that the disadvantage of solving the nonlinear optimization problem in MHE is that it naturally leads to an increase in computational effort.
Crawford, John R; Garthwaite, Paul H; Lawrie, Caroline J; Henry, Julie D; MacDonald, Marie A; Sutherland, Jane; Sinha, Priyanka
2009-06-01
A series of recent papers have reported normative data from the general adult population for commonly used self-report mood scales. To bring together and supplement these data in order to provide a convenient means of obtaining percentile norms for the mood scales. A computer program was developed that provides point and interval estimates of the percentile rank corresponding to raw scores on the various self-report scales. The program can be used to obtain point and interval estimates of the percentile rank of an individual's raw scores on the DASS, DASS-21, HADS, PANAS, and sAD mood scales, based on normative sample sizes ranging from 758 to 3822. The interval estimates can be obtained using either classical or Bayesian methods as preferred. The computer program (which can be downloaded at www.abdn.ac.uk/~psy086/dept/MoodScore.htm) provides a convenient and reliable means of supplementing existing cut-off scores for self-report mood scales.
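A point estimate of a percentile rank can be sketched as follows, assuming the common mid-rank definition (the percentage of the normative sample scoring below the raw score, plus half of those scoring exactly the same). The abstract does not spell out the formula used by the program, the classical and Bayesian interval estimates it offers are not reproduced here, and the function name and normative scores are hypothetical.

```python
def percentile_rank(score, norms):
    """Mid-rank percentile of a raw score within a normative sample."""
    if not norms:
        raise ValueError("normative sample is empty")
    below = sum(1 for x in norms if x < score)
    equal = sum(1 for x in norms if x == score)
    # Ties are split evenly between "below" and "above" the score.
    return 100.0 * (below + 0.5 * equal) / len(norms)


norms = [1, 2, 2, 3, 4]  # hypothetical normative raw scores
rank = percentile_rank(2, norms)
print(f"percentile rank = {rank:.1f}")
```

With real normative samples of several hundred to several thousand people, as in the abstract, the same calculation applies unchanged.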
Chromosome Aberrations in Astronauts
NASA Technical Reports Server (NTRS)
George, Kerry A.; Durante, M.; Cucinotta, Francis A.
2007-01-01
A review of currently available data on in vivo induced chromosome damage in the blood lymphocytes of astronauts proves that, after protracted exposure of a few months or more to space radiation, cytogenetic biodosimetry analyses of blood collected within a week or two of return from space provides a reliable estimate of equivalent radiation dose and risk. Recent studies indicate that biodosimetry estimates from single spaceflights lie within the range expected from physical dosimetry and biophysical models, but very large uncertainties are associated with single individual measurements and the total sample population remains low. Retrospective doses may be more difficult to estimate because of the fairly rapid time-dependent loss of "stable" aberrations in blood lymphocytes. Also, biodosimetry estimates from individuals who participate in multiple missions, or very long (interplanetary) missions, may be complicated by an adaptive response to space radiation and/or changes in lymphocyte survival and repopulation. A discussion of published data is presented and specific issues related to space radiation biodosimetry protocols are discussed.
Krishan, Kewal; Chatterjee, Preetika M; Kanchan, Tanuj; Kaur, Sandeep; Baryah, Neha; Singh, R K
2016-04-01
Sex estimation is considered one of the essential parameters in forensic anthropology casework, and requires foremost consideration in the examination of skeletal remains. Forensic anthropologists frequently employ morphologic and metric methods for sex estimation of human remains. These methods remain essential in the identification process in spite of the advent and accomplishment of molecular techniques. A steady increase in the use of imaging techniques in forensic anthropology research has facilitated the derivation and revision of the available population data. These methods, however, are less reliable owing to high variance and indistinct landmark details. The present review discusses the reliability and reproducibility of various analytical approaches (morphological, metric, molecular and radiographic methods) in sex estimation of skeletal remains. Numerous studies have shown a higher reliability and reproducibility of measurements taken directly on the bones and hence, such direct methods of sex estimation are considered to be more reliable than the other methods. The geometric morphometric (GM) method and the Diagnose Sexuelle Probabiliste (DSP) method are emerging as valid and widely used techniques in forensic anthropology in terms of accuracy and reliability. Besides, the newer 3D methods are shown to exhibit specific sexual dimorphism patterns not readily revealed by traditional methods. Development of newer and better methodologies for sex estimation as well as re-evaluation of the existing ones will continue in the endeavour of forensic researchers for more accurate results. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Iranian Health Literacy Questionnaire (IHLQ): An Instrument for Measuring Health Literacy in Iran.
Haghdoost, Ali Akbar; Rakhshani, Fatemeh; Aarabi, Mohsen; Montazeri, Ali; Tavousi, Mahmoud; Solimanian, Atoosa; Sarbandi, Fatemeh; Namdar, Hosein; Iranpour, Abedin
2015-06-01
Promoting Health Literacy (HL) is considered an important goal in the strategic plans of many countries. In spite of the necessity for access to valid, reliable and native HL instruments, such instruments are scarce in the Persian language. Moreover, there is no good estimation of HL status in Iran. The aim of this study was to provide a valid, reliable and native instrument to measure and monitor community HL in Iran and also, to provide an estimation of HL status in two Iranian provinces. By applying multistage cluster sampling, 1080 respondents (540 of each gender) were recruited from the Kerman and Mazandaran provinces of Iran, from February to June 2014, to participate in this cross-sectional study. The development of the Iranian Health Literacy Questionnaire (IHLQ) was initiated with a comprehensive review of the literature. Then, face, content and construct validity as well as reliability were determined. Internal consistency and test-retest reliability (ICC) of the factors were in the range of 0.71 to 0.96 and 0.73 to 0.86, respectively. To assess construct validity, Exploratory Factor Analysis (EFA; Kaiser-Meyer-Olkin (KMO) = 0.95 and Bartlett's test result of 3.017 with P < 0.001) with varimax rotation was used. An optimal reduced solution, including 36 items and seven factors, was found in EFA. Five of the factors identified were reading/comprehension skills, individual empowerment, communication/decision-making skills, social empowerment and health knowledge. It was concluded that the IHLQ might be a practical and useful tool for investigating HL among Persian language speakers around the world. Since HL is dynamic and its instruments should be regularly revised, further studies are recommended to assess HL with application of IHLQ to detect its potential imperfections.
Reliability and factorial validity of flexibility tests for team sports.
Sporis, Goran; Vucetic, Vlatko; Jovanovic, Mario; Jukic, Igor; Omrcen, Darija
2011-04-01
The main goal of this method paper was to evaluate the reliability and factorial validity of flexibility tests used in soccer, and to carry out a cross-validation study on 2 other team sports using handball and basketball players. The second aim was to compare the validity of the different tests and evaluate the flexibility of soccer players; the third was to determine the positional differences between attackers, defenders, and midfielders in all flexibility tests. One hundred and fifty (n = 150) elite male junior soccer players, members of the First Croatian Junior League Teams, and 60 (n = 60) handball and 60 (n = 60) basketball players, also members of the First Croatian Junior League Teams and tested for the purpose of cross-validation, volunteered to participate in the study. The SAR and V-SAR had the greatest AVR and ICC. The within-subjects variation ranged from 0.3 to 3.8%. The lowest value of CV was found between the LSPL and LSPR. Low to moderate statistically significant correlation coefficients were found among all the measured flexibility tests. It was observed that the greatest correlations existed between the SAR and V-SAR (r = 0.65) and between the LLSR and LLSL (r = 0.56). Statistically significant correlations were also observed between the BLPL and BLPR (r = 0.62). The principal components factor analysis of 9 flexibility tests resulted in the extraction of 3 significant components. The results of this study have the following implications for the assessment of flexibility in soccer: (a) all flexibility tests used in this study have acceptable between- and within-subjects reliability and they can be used to estimate the flexibility of soccer players; (b) the LSPL and LSPR tests are the most reliable and valid flexibility tests for the estimation of flexibility of professional soccer players.
Ward, Kenneth D.; Hunt, Kami Mays; Berg, Melanie Burstyne; Slawson, Deborah A.; Vukadinovich, Christopher M.; McClanahan, Barbara S.; Clemens, Linda H.
2016-01-01
Calcium intake often is inadequate in female collegiate athletes, increasing the risk for training injuries and future osteoporosis. Thus, a brief and accurate assessment tool to quickly measure calcium intake in athletes is needed. We evaluated the reliability and validity, compared to 6 days of diet records (DRs), of the Rapid Assessment Method (RAM), a self-administered calcium checklist (14). Seventy-six female collegiate athletes (mean age = 18.8 yrs, range = 17–21; 97% Caucasian) were recruited from basketball, cross-country, field hockey, soccer, and volleyball teams. Athletes completed a RAM at the start of the training season to assess calcium intake during the past week. Two weeks later, a second RAM was completed to assess reliability, and athletes began 6 days of diet records (DRs) collection. At completion of DRs, athletes completed a final RAM, corresponding to the same time period as DRs, to assess agreement between the 2 instruments. The RAM demonstrated adequate test-retest reliability over 2 weeks (n = 56; Intraclass correlation [ICC] = .54, p < .0001) and adequate agreement with DRs (n = 34; ICC = .41, p = .0067). Calcium intake was below recommended levels, and mean estimates did not differ significantly on the RAM (823 ± 387 mg/d) and DRs (822 ± 330 mg/d; p = .988). Adequacy of calcium intake from both DRs and the RAM was classified as “inadequate” (<1000 mg/d) and “adequate” (≥1000 mg/d). Agreement between the RAM and DRs for adequacy classification was fair (ICC = .30, p = .042), with the RAM identifying 84% of athletes judged to have inadequate calcium intake based on DRs. The RAM briefly and accurately estimates calcium intake in female collegiate athletes compared to DRs. PMID:15118194
Voskuil, Vicki R.; Pierce, Steven J.; Robbins, Lorraine B.
2017-01-01
Aims: This study compared the psychometric properties of two self-efficacy instruments related to physical activity. Factorial validity, cross-group and longitudinal invariance, and composite reliability were examined. Methods: Secondary analysis was conducted on data from a group randomized controlled trial investigating the effect of a 17-week intervention on increasing moderate to vigorous physical activity among 5th–8th grade girls (N = 1,012). Participants completed a 6-item Physical Activity Self-Efficacy Scale (PASE) and a 7-item Self-Efficacy for Exercise Behaviors Scale (SEEB) at baseline and post-intervention. Confirmatory factor analyses for intervention and control groups were conducted with Mplus Version 7.4 using robust weighted least squares estimation. Model fit was evaluated with the chi-square index, comparative fit index, and root mean square error of approximation. Composite reliability for latent factors with ordinal indicators was computed from Mplus output using SAS 9.3. Results: Mean age of the girls was 12.2 years (SD = 0.96). One-third of the girls were obese. Girls represented a diverse sample with over 50% indicating black race and an additional 19% identifying as mixed or other race. Both instruments demonstrated configural invariance for simultaneous analysis of cross-group and longitudinal invariance based on alternative fit indices. However, simultaneous metric invariance was not met for the PASE or the SEEB instruments. Partial metric invariance for the simultaneous analysis was achieved for the PASE with one factor loading identified as non-invariant. Partial metric invariance was not met for the SEEB. Longitudinal scalar invariance was achieved for both instruments in the control group but not the intervention group. Composite reliability for the PASE ranged from 0.772 to 0.842. Reliability for the SEEB ranged from 0.719 to 0.800 indicating higher reliability for the PASE. 
Reliability was more stable over time in the control group for both instruments. Conclusions: Results suggest that the intervention influenced how girls responded to indicator items. Neither of the instruments achieved simultaneous metric invariance making it difficult to assess mean differences in PA self-efficacy between groups. PMID:28824487
Saloheimo, T; González, S A; Erkkola, M; Milauskas, D M; Meisel, J D; Champagne, C M; Tudor-Locke, C; Sarmiento, O; Katzmarzyk, P T; Fogelholm, M
2015-01-01
Objective: The main aim of this study was to assess the reliability and validity of a food frequency questionnaire with 23 food groups (I-FFQ) among a sample of 9–11-year-old children from three different countries that differ on economical development and income distribution, and to assess differences between country sites. Furthermore, we assessed factors associated with I-FFQ's performance. Methods: This was an ancillary study of the International Study of Childhood Obesity, Lifestyle and the Environment. Reliability (n=321) and validity (n=282) components of this study had the same participants. Participation rates were 95% and 70%, respectively. Participants completed two I-FFQs with a mean interval of 4.9 weeks to assess reliability. A 3-day pre-coded food diary (PFD) was used as the reference method in the validity analyses. Wilcoxon signed-rank tests, intraclass correlation coefficients and cross-classifications were used to assess the reliability of I-FFQ. Spearman correlation coefficients, percentage difference and cross-classifications were used to assess the validity of I-FFQ. A logistic regression model was used to assess the relation of selected variables with the estimate of validity. Analyses based on information in the PFDs were performed to assess how participants interpreted food groups. Results: Reliability correlation coefficients ranged from 0.37 to 0.78 and gross misclassification for all food groups was <5%. Validity correlation coefficients were below 0.5 for 22/23 food groups, and they differed among country sites. For validity, gross misclassification was <5% for 22/23 food groups. Over- or underestimation did not appear for 19/23 food groups. Logistic regression showed that country of participation and parental education were associated (P⩽0.05) with the validity of I-FFQ. Analyses of children's interpretation of food groups suggested that the meaning of most food groups was understood by the children. 
Conclusion: I-FFQ is a moderately reliable method and its validity ranged from low to moderate, depending on food group and country site. PMID:27152180
The Trojan Lifetime Champions Health Survey: Development, Validity, and Reliability
Sorenson, Shawn C.; Romano, Russell; Scholefield, Robin M.; Schroeder, E. Todd; Azen, Stanley P.; Salem, George J.
2015-01-01
Context Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. Objective To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Design Descriptive laboratory study. Setting A large National Collegiate Athletic Association Division I university. Patients or Other Participants A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Intervention(s) Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Main Outcome Measure(s) Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Results Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. 
We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite, competitive athletic population. Conclusions These data suggest that the TLC Health Survey is a valid and reliable instrument for assessing lifetime and recent health, exercise, and HRQL, among elite competitive athletes. Generalizability of the instrument may be enhanced by additional, larger-scale studies in diverse populations. PMID:25611315
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hill, J.R.; Heger, A.S.; Koen, B.V.
1984-04-01
This report is the result of a preliminary feasibility study of the applicability of Stein and related parametric empirical Bayes (PEB) estimators to the Nuclear Plant Reliability Data System (NPRDS). A new estimator is derived for the means of several independent Poisson distributions with different sampling times. This estimator is applied to data from NPRDS in an attempt to improve failure rate estimation. Theoretical and Monte Carlo results indicate that the new PEB estimator can perform significantly better than the standard maximum likelihood estimator if the estimation of the individual means can be combined through the loss function or through a parametric class of prior distributions.
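The idea of combining the estimation of several Poisson means can be illustrated with a minimal sketch. It assumes a gamma prior on the failure rates fitted by the method of moments; this is a textbook gamma-Poisson shrinkage estimator, not the estimator derived in the report, and the function name `peb_poisson_rates` is hypothetical.

```python
def peb_poisson_rates(counts, times):
    """Shrink per-unit failure rates toward the pooled rate.

    Assumed model (illustrative only): counts[i] ~ Poisson(rate_i * times[i]),
    rate_i ~ Gamma(alpha, beta), with alpha and beta fitted by moments.
    """
    if len(counts) != len(times) or not counts:
        raise ValueError("counts and times must be non-empty and equal in length")
    pooled = sum(counts) / sum(times)
    mle = [x / t for x, t in zip(counts, times)]
    # Between-unit variance of the true rates: observed spread of the MLEs
    # minus the part explained by Poisson sampling noise (pooled / t).
    between_var = sum(
        (r - pooled) ** 2 - pooled / t for r, t in zip(mle, times)
    ) / len(counts)
    if between_var <= 0 or pooled == 0:
        # No evidence of heterogeneity: every unit gets the pooled rate.
        return [pooled] * len(counts)
    beta = pooled / between_var   # prior rate parameter
    alpha = pooled * beta         # prior shape parameter
    # Posterior mean of each rate under the fitted gamma prior.
    return [(alpha + x) / (beta + t) for x, t in zip(counts, times)]
```

Each shrunken estimate lies between the unit's own maximum likelihood rate and the pooled rate; units with short sampling times are pulled more strongly toward the pooled value.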
NASA Astrophysics Data System (ADS)
Morlot, T.; Mathevet, T.; Perret, C.; Favre Pugin, A. C.
2014-12-01
Streamflow uncertainty estimation has recently received considerable attention in the literature. A dynamic rating curve assessment method has been introduced (Morlot et al., 2014). This dynamic method makes it possible to compute a rating curve for each gauging and a continuous streamflow time series, while calculating streamflow uncertainties. Streamflow uncertainty takes into account many sources of uncertainty (water level, rating curve interpolation and extrapolation, gauging aging, etc.) and produces an estimated distribution of streamflow for each day. In order to characterise streamflow uncertainty, a probabilistic framework has been applied to a large sample of hydrometric stations of the Division Technique Générale (DTG) of Électricité de France (EDF) hydrometric network (>250 stations) in France. A reliability diagram (Wilks, 1995) has been constructed for some stations, based on the streamflow distribution estimated for a given day and compared to a real streamflow observation estimated via a gauging. To build a reliability diagram, we computed the probability of an observed streamflow (gauging), given the streamflow distribution. The reliability diagram then makes it possible to check that the distribution of probabilities of non-exceedance of the gaugings follows a uniform law (i.e., quantiles should be equiprobable). Given the shape of the reliability diagram, the probabilistic calibration is characterised (underdispersion, overdispersion, bias) (Thyer et al., 2009). In this paper, we present case studies where reliability diagrams have different statistical properties for different periods. Comparing these with our knowledge of the river bed morphology dynamics of these hydrometric stations, we show how the reliability diagram gives us invaluable information on river bed movements, such as continuous digging or backfilling of the hydraulic control due to erosion or sedimentation processes. 
Hence, the careful analysis of reliability diagrams makes it possible to reconcile statistics and long-term river bed morphology processes. This knowledge improves our real-time management of hydrometric stations, given a better characterisation of erosion/sedimentation processes and the stability of hydrometric station hydraulic control.
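The non-exceedance check described in this abstract can be sketched as follows. The sketch assumes that each daily streamflow distribution is available as a list of sampled values (an "ensemble"); the abstract does not say how the distribution is represented, and both function names are hypothetical.

```python
def non_exceedance_probability(ensemble, observed):
    """Fraction of sampled streamflow values that do not exceed the gauging."""
    if not ensemble:
        raise ValueError("ensemble must not be empty")
    return sum(1 for q in ensemble if q <= observed) / len(ensemble)


def pit_histogram(probabilities, n_bins=10):
    """Relative frequency of non-exceedance probabilities in equal-width bins.

    A flat histogram suggests a well-calibrated distribution; a U shape
    suggests underdispersion, a hump suggests overdispersion, and a slope
    suggests bias.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    counts = [0] * n_bins
    for p in probabilities:
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return [c / len(probabilities) for c in counts]
```

In use, one probability would be computed per gauging and the histogram built per station and per period, so that a change of shape between periods can be compared with known river bed changes.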
NASA Technical Reports Server (NTRS)
Mathur, F. P.
1972-01-01
Description of an on-line interactive computer program called CARE (Computer-Aided Reliability Estimation) which can model self-repair and fault-tolerant organizations and perform certain other functions. Essentially CARE consists of a repository of mathematical equations defining the various basic redundancy schemes. These equations, under program control, are then interrelated to generate the desired mathematical model to fit the architecture of the system under evaluation. The mathematical model is then supplied with ground instances of its variables and is then evaluated to generate values for the reliability-theoretic functions applied to the model.
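As an illustration of what an equation for a basic redundancy scheme looks like, here is a minimal sketch using standard textbook formulas: exponential reliability for a single unit and k-of-n majority redundancy with a perfect voter. These are assumptions chosen for illustration, not the equations stored in CARE, and the function names are hypothetical.

```python
import math


def simplex_reliability(failure_rate, mission_time):
    """Reliability of one unit with a constant failure rate."""
    return math.exp(-failure_rate * mission_time)


def k_of_n_reliability(unit_reliability, k, n):
    """Probability that at least k of n identical, independent units work.

    With k=2 and n=3 this is triple modular redundancy (perfect voter assumed),
    which reduces to 3R^2 - 2R^3.
    """
    r = unit_reliability
    return sum(
        math.comb(n, i) * r ** i * (1 - r) ** (n - i) for i in range(k, n + 1)
    )
```

Supplying concrete values for the failure rate and mission time corresponds to the step the abstract calls giving the model ground instances of its variables.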
Examples of Nonconservatism in the CARE 3 Program
NASA Technical Reports Server (NTRS)
Dotson, Kelly J.
1988-01-01
This paper presents parameter regions in the CARE 3 (Computer-Aided Reliability Estimation version 3) computer program where the program overestimates the reliability of a modeled system without warning the user. Five simple models of fault-tolerant computer systems are analyzed; and, the parameter regions where reliability is overestimated are given. The source of the error in the reliability estimates for models which incorporate transient fault occurrences was not readily apparent. However, the source of much of the error for models with permanent and intermittent faults can be attributed to the choice of values for the run-time parameters of the program.
Lee, Minji K; Yost, Kathleen J; McDonald, Jennifer S; Dougherty, Ryne W; Vine, Roanna L; Kallmes, David F
2017-06-01
The majority of validation done on the Roland-Morris Disability Questionnaire (RMDQ) has been in patients with mild or moderate disability. There is paucity of research focusing on the psychometric quality of the RMDQ in patients with severe disability. To evaluate the psychometric quality of the RMDQ in patients with severe disability. Observational clinical study. The sample consisted of 214 patients with painful vertebral compression fractures who underwent vertebroplasty or kyphoplasty. The 23-item version of the RMDQ was completed at two time points: baseline and 30-day postintervention follow-up. With the two-parameter logistic unidimensional item response theory (IRT) analyses, we derived the range of scores that produced reliable measurement and investigated the minimal clinically important difference (MCID). Scores for 214 (100%) patients at baseline and 108 (50%) patients at follow-up did not meet the reliability criterion of 0.90 or higher, with the majority of patients having disability due to back pain that was too severe to be reliably measured by the RMDQ. Depending on methodology, MCID estimates ranged from 2 to 8 points and the proportion of patients classified as having experienced meaningful improvement ranged from 26% to 68%. A greater change in score was needed at the extreme ends of the score scale to be classified as having achieved MCID using IRT methods. Replacing items measuring moderate disability with items measuring severe disability could yield a version of the RMDQ that better targets patients with severe disability due to back pain. Improved precision in measuring disability would be valuable to clinicians who treat patients with greater functional impairments. Caution is needed when choosing criteria for interpreting meaningful change using the RMDQ. Copyright © 2017 Elsevier Inc. All rights reserved.
Valentim, Daniela Pereira; Sato, Tatiana de Oliveira; Comper, Maria Luiza Caíres; Silva, Anderson Martins da; Boas, Cristiana Villas; Padula, Rosimeire Simprini
There are very few observational methods for analysis of biomechanical exposure available in Brazilian Portuguese. This study aimed to cross-culturally adapt and test the measurement properties of the Rapid Upper Limb Assessment (RULA) and Strain Index (SI). The cross-cultural adaptation and the testing of measurement properties followed the Beaton et al. and COSMIN guidelines, respectively. Several tasks that required static posture and/or repetitive motion of the upper limbs were evaluated (n>100). The intra-rater reliability ranged from poor to almost perfect for the RULA (k: 0.00-0.93) and from poor to excellent for the SI (ICC 2.1: 0.05-0.99). The inter-rater reliability was very poor for the RULA (k: -0.12 to 0.13) and ranged from very poor to moderate for the SI (ICC 2.1: 0.00-0.53). The agreement was good for the RULA (75-100% intra-rater and 42.24-100% inter-rater) and for the SI (EPM: -1.03% to 1.97% intra-rater and -0.17% to 1.51% inter-rater). The internal consistency was appropriate for the RULA (α=0.88) and low for the SI (α=0.65). Moderate construct validity was observed between the RULA and SI in wrist/hand-wrist posture (rho: 0.61) and strength/intensity of exertion (rho: 0.39). The adapted versions of the RULA and SI presented semantic and cultural equivalence for Brazilian Portuguese. The RULA and SI had reliability estimates ranging from very poor to almost perfect. The internal consistency of the RULA was better than that of the SI. The correlation between methods was moderate only for muscle request/movement repetition. Prior training is mandatory for the use of observational methods for biomechanical exposure assessment, although it does not guarantee good reproducibility of these measures. Copyright © 2017 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Publicado por Elsevier Editora Ltda. All rights reserved.
Psychometric Evaluation of the PROMIS Fatigue-Short Form Across Diverse Populations
Ameringer, Suzanne; Elswick, R. K.; Menzies, Victoria; Robins, Jo Lynne; Starkweather, Angela; Walter, Jeanne; Gentry, Amanda Elswick; Jallo, Nancy
2016-01-01
Background The need for reliable, valid tools to measure patient-reported outcomes (PROs) is critical for both research and for evaluating treatment effects in practice. The Patient Reported Outcome Measurement Information System (PROMIS) Fatigue-Short Form v1.0 – Fatigue 7a (PROMIS F-SF) has had limited psychometric evaluation in various populations. Objectives The aim of the study is to examine psychometric properties of PROMIS F-SF item responses across various populations. Methods Data from five studies with common data elements were used in this secondary analysis. Samples from patients with fibromyalgia, sickle cell disease, cardiometabolic risk, pregnancy, and healthy controls were used. Reliability was estimated using Cronbach’s alpha. Dimensionality was evaluated with confirmatory factor analysis. Concurrent validity was evaluated by examining Pearson’s correlations between scores from the PROMIS F-SF, the Multidimensional Fatigue Symptom Inventory-Short Form (MFSI-SF), and the Brief Fatigue Inventory (BFI). Discriminant validity was evaluated by examining Pearson’s correlations between scores on the PROMIS F-SF and measures of stress and depressive symptoms. Known groups validity was assessed by comparing PROMIS F-SF scores in the clinical samples to healthy controls. Results Reliability of PROMIS F-SF scores was adequate across samples, ranging from .72 in the pregnancy sample to .88 in healthy controls. Unidimensionality was supported in each sample. Concurrent validity was strong; across the groups, correlations with scores on the MFSI-SF and BFI ranged from .60 to .85. Correlations of the PROMIS F-SF with measures of stress and depressive mood were moderate to strong, ranging from .37 to .64. PROMIS F-SF scores were significantly higher in clinical samples, compared to healthy controls. Discussion Reliability and validity of the PROMIS F-SF were acceptable. 
The PROMIS F-SF is a suitable measure of fatigue across the four diverse clinical populations included in the analysis. PMID:27362514
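For readers unfamiliar with the reliability index used here, a minimal sketch of Cronbach's alpha follows. It uses the standard formula; the function name `cronbach_alpha` and the data layout (one list of item scores per respondent) are assumptions for illustration and are not taken from the study.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        statistics.variance([person[i] for person in responses])
        for i in range(n_items)
    ]
    # Sample variance of the summed scale score.
    total_variance = statistics.variance([sum(person) for person in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when the items are unrelated.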
Forecasting Emergency Department Crowding: An External, Multi-Center Evaluation
Hoot, Nathan R.; Epstein, Stephen K.; Allen, Todd L.; Jones, Spencer S.; Baumlin, Kevin M.; Chawla, Neal; Lee, Anna T.; Pines, Jesse M.; Klair, Amandeep K.; Gordon, Bradley D.; Flottemesch, Thomas J.; LeBlanc, Larry J.; Jones, Ian; Levin, Scott R.; Zhou, Chuan; Gadd, Cynthia S.; Aronsky, Dominik
2009-01-01
Objective To apply a previously described tool to forecast ED crowding at multiple institutions, and to assess its generalizability for predicting the near-future waiting count, occupancy level, and boarding count. Methods The ForecastED tool was validated using historical data from five institutions external to the development site. A sliding-window design separated the data for parameter estimation and forecast validation. Observations were sampled at consecutive 10-minute intervals during 12 months (n = 52,560) at four sites and 10 months (n = 44,064) at the fifth. Three outcome measures – the waiting count, occupancy level, and boarding count – were forecast 2, 4, 6, and 8 hours beyond each observation, and forecasts were compared to observed data at corresponding times. The reliability and calibration were measured following previously described methods. After linear calibration, the forecasting accuracy was measured using the median absolute error (MAE). Results The tool was successfully used for five different sites. Its forecasts were more reliable, better calibrated, and more accurate at 2 hours than at 8 hours. The reliability and calibration of the tool were similar between the original development site and external sites; the boarding count was an exception, which was less reliable at four out of five sites. Some variability in accuracy existed among institutions; when forecasting 4 hours into the future, the MAE of the waiting count ranged between 0.6 and 3.1 patients, the MAE of the occupancy level ranged between 9.0 and 14.5% of beds, and the MAE of the boarding count ranged between 0.9 and 2.7 patients. Conclusion The ForecastED tool generated potentially useful forecasts of input and throughput measures of ED crowding at five external sites, without modifying the underlying assumptions. Noting the limitation that this was not a real-time validation, ongoing research will focus on integrating the tool with ED information systems. PMID:19716629
Near-Earth-object survey progress and population of small near-Earth asteroids
NASA Astrophysics Data System (ADS)
Harris, A.
2014-07-01
Estimating the total population vs. size of NEAs and the completion of surveys is the same thing since the total population is just the number discovered divided by the estimated completion. I review the method of completion estimation based on the ratio of re-detected objects to total detections (known plus new discoveries). The method is quite general and can be used for population estimations of all sorts, from wildlife to various classes of solar system bodies. Since 2001, I have been making estimates of population and survey progress approximately every two years. Plotted below, left, is my latest estimate, including NEA discoveries up to August 2012. I plan to present an update at the meeting. Not all asteroids of a given size are equally easy to detect because of specific orbital geometries. Thus a model of the orbital distribution is necessary, and computer simulations using those orbits need to establish the relation between the raw re-detection ratio and the actual completion fraction. This can be done for any subgroup population, making it possible to estimate the population of a subgroup and the expected current completion. Once a reliable survey computer model has been developed and "calibrated" with respect to actual survey re-detections versus size, it can be extrapolated to smaller sizes to estimate completion even at very small sizes where re-detections are rare or even zero. I have recently investigated the subgroup of extremely low encounter velocity NEAs, the class of interest for the Asteroid Redirect Mission (ARM), recently proposed by NASA. I found that asteroids of diameter ~10 m with encounter velocity with the Earth lower than 2.5 km/sec are detected by current surveys nearly 1,000 times more efficiently than the general background of NEAs of that size. Thus the current completion of these slow relative velocity objects may be around 1%, compared to 10^{-6} for objects of that size in the general velocity distribution. 
Current surveys are nowhere near complete, but there may be fewer such objects than have been suggested. This conclusion is reinforced by the fact that at least a couple such discovered objects are known to be not real asteroids but spent rocket bodies in heliocentric orbit, of which there are only of the order of a hundred. Brown et al. (Nature 503, 238-241, 2013, below right, green squares are a re-plot of my blue circles on left plot) recently suggested that the population of small NEAs in the size range from roughly 5 to 50 meters in diameter may have been substantially under-estimated. To be sure, the greatest uncertainty in population estimates is in that range, since there are very few bolide events to use for estimation, and the surveys are extremely incomplete in that size range, so a factor of 3 or so discrepancy is not significant. However, the population estimated from surveys carried still smaller, where the bolide frequency becomes more secure, disagrees from the bolide estimate by even less than a factor of 3 and in fact intersects at about 3 m diameter. On the other hand, the shallow-sloping size-frequency distribution derived from the sparse large bolide data diverges badly from the survey estimates, in sizes where the survey estimates become ever-increasingly reliable, even by 100-200 m diameter. It appears that the bolide data provides a good "anchor" of the population in the size range up to about 5 m diameter, but above that one might do better just connecting that population with a straight line (on a log-log plot) with the survey-determined population at larger size, 50-100 m diameter or so.
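The core arithmetic of the re-detection method can be sketched as follows. The sketch uses the raw re-detection ratio directly as the completion estimate, which the abstract notes must in practice be corrected by survey simulations; the function name and the counts in the usage note are hypothetical.

```python
def estimate_population(n_known, n_redetected, n_new):
    """Estimate survey completion and total population from detections.

    completion ~ re-detections / (re-detections + new discoveries)
    population ~ number known / completion
    Raw ratio only; no correction for detection bias is applied.
    """
    total_detections = n_redetected + n_new
    if total_detections == 0 or n_redetected == 0:
        raise ValueError("need at least one re-detection to estimate completion")
    completion = n_redetected / total_detections
    return completion, n_known / completion
```

For example, with 900 known objects in a size bin and a survey period that re-detects 90 known objects while discovering 10 new ones, the raw completion is 0.9 and the implied population is about 1,000.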
A joint-space numerical model of metabolic energy expenditure for human multibody dynamic system.
Kim, Joo H; Roberts, Dustyn
2015-09-01
Metabolic energy expenditure (MEE) is a critical performance measure of human motion. In this study, a general joint-space numerical model of MEE is derived by integrating the laws of thermodynamics and principles of multibody system dynamics, which can evaluate MEE without the limitations inherent in experimental measurements (phase delays, steady state and task restrictions, and limited range of motion) or muscle-space models (complexities and indeterminacies from excessive DOFs, contacts and wrapping interactions, and reliance on in vitro parameters). Muscle energetic components are mapped to the joint space, in which the MEE model is formulated. A constrained multi-objective optimization algorithm is established to estimate the model parameters from experimental walking data also used for initial validation. The joint-space parameters estimated directly from active subjects provide reliable MEE estimates with a mean absolute error of 3.6 ± 3.6% relative to validation values, which can be used to evaluate MEE for complex non-periodic tasks that may not be experimentally verifiable. This model also enables real-time calculations of instantaneous MEE rate as a function of time for transient evaluations. Although experimental measurements may not be completely replaced by model evaluations, predicted quantities can be used as strong complements to increase reliability of the results and yield unique insights for various applications. Copyright © 2015 John Wiley & Sons, Ltd.
Model-Based Design of Long-Distance Tracer Transport Experiments in Plants.
Bühler, Jonas; von Lieres, Eric; Huber, Gregor J
2018-01-01
Studies of long-distance transport of tracer isotopes in plants offer a high potential for functional phenotyping, but so far measurement time is a bottleneck because continuous time series of at least 1 h are required to obtain reliable estimates of transport properties. Hence, usual throughput values are between 0.5 and 1 samples h⁻¹. Here, we propose to increase sample throughput by introducing temporal gaps in the data acquisition of each plant sample and measuring multiple plants one after another in a rotating scheme. In contrast to common time series analysis methods, mechanistic tracer transport models allow the analysis of interrupted time series. The uncertainties of the model parameter estimates are used as a measure of how much information was lost compared to complete time series. A case study was set up to systematically investigate different experimental schedules for different throughput scenarios ranging from 1 to 12 samples h⁻¹. Selected designs with only a small number of data points were found to be sufficient for an adequate parameter estimation, implying that the presented approach enables a substantial increase of sample throughput. The presented general framework for automated generation and evaluation of experimental schedules allows the determination of a maximal sample throughput and the respective optimal measurement schedule depending on the required statistical reliability of data acquired by future experiments.
Fusion of electromagnetic trackers to improve needle deflection estimation: simulation study.
Sadjadi, Hossein; Hashtrudi-Zaad, Keyvan; Fichtinger, Gabor
2013-10-01
We present a needle deflection estimation method to anticipate needle bending during insertion into deformable tissue. Using limited additional sensory information, our approach reduces the estimation error caused by uncertainties inherent in the conventional needle deflection estimation methods. We use Kalman filters to combine a kinematic needle deflection model with the position measurements of the base and the tip of the needle taken by electromagnetic (EM) trackers. One EM tracker is installed on the needle base and estimates the needle tip position indirectly using the kinematic needle deflection model. Another EM tracker is installed on the needle tip and estimates the needle tip position through direct, but noisy measurements. Kalman filters are then employed to fuse these two estimates in real time and provide a reliable estimate of the needle tip position, with reduced variance in the estimation error. We implemented this method to compensate for needle deflection during simulated needle insertions and performed sensitivity analysis for various conditions. At an insertion depth of 150 mm, we observed needle tip estimation error reductions in the range of 28% (from 1.8 to 1.3 mm) to 74% (from 4.8 to 1.2 mm), which demonstrates the effectiveness of our method, offering a clinically practical solution.
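The fusion step can be illustrated with a minimal sketch of a scalar Kalman measurement update, which combines a model-based estimate with a noisy direct measurement by inverse-variance weighting. This is a single-step, one-dimensional simplification under assumed known variances, not the authors' filter; the function name `fuse_estimates` is hypothetical.

```python
def fuse_estimates(model_estimate, model_variance, measurement, measurement_variance):
    """Scalar Kalman measurement update.

    Returns the fused estimate and its variance. The fused variance is never
    larger than either input variance.
    """
    if model_variance < 0 or measurement_variance < 0:
        raise ValueError("variances must be non-negative")
    if model_variance + measurement_variance == 0:
        raise ValueError("at least one variance must be positive")
    # Kalman gain: how much weight the direct measurement receives.
    gain = model_variance / (model_variance + measurement_variance)
    fused = model_estimate + gain * (measurement - model_estimate)
    fused_variance = (1 - gain) * model_variance
    return fused, fused_variance
```

When the tip sensor is much noisier than the kinematic model, the gain is small and the fused estimate stays close to the model prediction; the reverse holds when the model is the less certain source.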
The reliability of multidimensional neuropsychological measures: from alpha to omega.
Watkins, Marley W
To demonstrate that coefficient omega, a model-based estimate, is a more appropriate index of reliability than coefficient alpha for the multidimensional scales that are commonly employed by neuropsychologists. As an illustration, a structural model of an overarching general factor and four first-order factors for the WAIS-IV based on the standardization sample of 2200 participants was identified and omega coefficients were subsequently computed for WAIS-IV composite scores. Alpha coefficients were ≥ .90 and omega coefficients ranged from .75 to .88 for WAIS-IV factor index scores, indicating that the blend of general and group factor variance in each index score created a reliable multidimensional composite. However, the amalgam of variance from general and group factors did not allow the precision of Full Scale IQ (FSIQ) and factor index scores to be disentangled. In contrast, omega hierarchical coefficients were low for all four factor index scores (.10-.41), indicating that most of the reliable variance of each factor index score was due to the general intelligence factor. In contrast, the omega hierarchical coefficient for the FSIQ score was .84. Meaningful interpretation of WAIS-IV factor index scores as unambiguous indicators of group factors is imprecise, thereby fostering unreliable identification of neurocognitive strengths and weaknesses, whereas the WAIS-IV FSIQ score can be interpreted as a reliable measure of general intelligence. It was concluded that neuropsychologists should base their clinical decisions on reliable scores as indexed by coefficient omega.
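The distinction between omega and omega hierarchical can be made concrete with a minimal sketch. It assumes standardized loadings from an orthogonal general-plus-group-factor solution (for example, after a Schmid-Leiman transformation), with each indicator loading on the general factor and exactly one group factor. The function name and the loadings in the usage note are hypothetical, not WAIS-IV values.

```python
def omega_coefficients(general_loadings, group_loadings, group_membership):
    """Return (omega total, omega hierarchical) for a unit-weighted composite.

    general_loadings[i]  loading of indicator i on the general factor
    group_loadings[i]    loading of indicator i on its group factor
    group_membership[i]  label of the group factor for indicator i
    """
    general_variance = sum(general_loadings) ** 2
    # Sum loadings within each group factor before squaring.
    sums_by_group = {}
    for loading, label in zip(group_loadings, group_membership):
        sums_by_group[label] = sums_by_group.get(label, 0.0) + loading
    group_variance = sum(s ** 2 for s in sums_by_group.values())
    # Unique variance of each standardized indicator.
    unique_variance = sum(
        1 - g ** 2 - s ** 2 for g, s in zip(general_loadings, group_loadings)
    )
    total_variance = general_variance + group_variance + unique_variance
    omega_total = (general_variance + group_variance) / total_variance
    omega_hierarchical = general_variance / total_variance
    return omega_total, omega_hierarchical
```

With four indicators loading .7 on the general factor and .5 on one of two group factors, omega total is about .90 while omega hierarchical is about .72, showing how a composite can be reliable overall yet owe most of that reliability to the general factor.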
Inter-operator and inter-device agreement and reliability of the SEM Scanner.
Clendenin, Marta; Jaradeh, Kindah; Shamirian, Anasheh; Rhodes, Shannon L
2015-02-01
The SEM Scanner is a medical device designed for use by healthcare providers as part of pressure ulcer prevention programs. The objective of this study was to evaluate the inter-rater and inter-device agreement and reliability of the SEM Scanner. Thirty-one (31) volunteers free of pressure ulcers or broken skin at the sternum, sacrum, and heels were assessed with the SEM Scanner. Each of three operators utilized each of three devices to collect readings from four anatomical sites (sternum, sacrum, left and right heels) on each subject for a total of 108 readings per subject collected over approximately 30 min. For each combination of operator-device-anatomical site, three SEM readings were collected. Inter-operator and inter-device agreement and reliability were estimated. Over the course of this study, more than 3000 SEM Scanner readings were collected. Agreement between operators was good with mean differences ranging from -0.01 to 0.11. Inter-operator and inter-device reliability exceeded 0.80 at all anatomical sites assessed. The results of this study demonstrate the high reliability and good agreement of the SEM Scanner across different operators and different devices. Given the limitations of current methods to prevent and detect pressure ulcers, the SEM Scanner shows promise as an objective, reliable tool for assessing the presence or absence of pressure-induced tissue damage such as pressure ulcers. Copyright © 2015 Bruin Biometrics, LLC. Published by Elsevier Ltd.. All rights reserved.
Chen, Hui-fang; Wu, Ching-yi; Lin, Keh-chung; Li, Ming-wei; Yu, Hung-wen
2012-07-01
To examine the measurement properties of a short version of the Stroke-Specific Quality of Life Scale (SS-QoL-12). Self-report survey of patients with mild to moderate upper extremity dysfunction. A total of 126 patients provided 252 observations before and after treatment. The construct validity and reliability was examined using the Rasch model; the concurrent and predictive validity was estimated using Spearman's rank correlation coefficients. Paired t-test and the standardized response mean (SRM) were performed to estimate the responsiveness of the SS-QoL-12. The 2-factor model (psychosocial and physical domains) fit the data better with smaller deviances. All but 1 item showed acceptable fit, and no item biases were detected. The reliability of the subscales and the whole scale ranged from 0.67 to 0.99. The total score showed fair correlations with the criterion measures at pretreatment (ρ = 0.28-0.40) and fair to good correlations at post-treatment (ρ = 0.39-0.54). The subscales had low to fair correlations at pretreatment (ρ = 0.19-0.49) and fair to good correlations at post-treatment (ρ = 0.31-0.56). The total and the subscales had low to good predictions at baseline (ρ = 0.22-0.52). The whole scale and the psychosocial subscale were mildly responsive to change (SRM = 0.22), but the physical subscale was not responsive to change (SRM = 0.08). The SS-QoL-12 has acceptable to good measurement properties, with an advantage of requiring less time to administer than other scales. The use of the subscale and total scores depends on the purpose of research. Future studies should recruit stroke patients with a broad range of dysfunction and use a large sample size to validate the findings.
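The responsiveness index reported here, the standardized response mean, is the mean change score divided by the standard deviation of the change scores. A minimal sketch follows; the function name and the data layout (paired pre- and post-treatment scores) are assumptions for illustration.

```python
import statistics


def standardized_response_mean(pre_scores, post_scores):
    """Mean of paired change scores divided by their sample standard deviation."""
    if len(pre_scores) != len(post_scores) or len(pre_scores) < 2:
        raise ValueError("need at least two paired observations")
    changes = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return statistics.mean(changes) / statistics.stdev(changes)
```

Because the denominator is the variability of change rather than of baseline scores, a small but very consistent improvement can yield a larger value than a large but erratic one.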
Evaluation of Reliability Coefficients for Two-Level Models via Latent Variable Analysis
ERIC Educational Resources Information Center
Raykov, Tenko; Penev, Spiridon
2010-01-01
A latent variable analysis procedure for evaluation of reliability coefficients for 2-level models is outlined. The method provides point and interval estimates of group means' reliability, overall reliability of means, and conditional reliability. In addition, the approach can be used to test simple hypotheses about these parameters. The…
Estimating Between-Person and Within-Person Subscore Reliability with Profile Analysis.
Bulut, Okan; Davison, Mark L; Rodriguez, Michael C
2017-01-01
Subscores are of increasing interest in educational and psychological testing due to their diagnostic function for evaluating examinees' strengths and weaknesses within particular domains of knowledge. Previous studies about the utility of subscores have mostly focused on the overall reliability of individual subscores and ignored the fact that subscores should be distinct and have added value over the total score. This study introduces a profile reliability approach that partitions the overall subscore reliability into within-person and between-person subscore reliability. The estimation of between-person reliability and within-person reliability coefficients is demonstrated using subscores from number-correct scoring, unidimensional and multidimensional item response theory scoring, and augmented scoring approaches via a simulation study and a real data study. The effects of various testing conditions, such as subtest length, correlations among subscores, and the number of subtests, are examined. Results indicate that there is a substantial trade-off between within-person and between-person reliability of subscores. Profile reliability coefficients can be useful in determining the extent to which subscores provide distinct and reliable information under various testing conditions.
Time in tortoiseshell: a bomb radiocarbon-validated chronology in sea turtle scutes.
Van Houtan, Kyle S; Andrews, Allen H; Jones, T Todd; Murakawa, Shawn K K; Hagemann, Molly E
2016-01-13
Some of the most basic questions of sea turtle life history are also the most elusive. Many uncertainties surround lifespan, growth rates, maturity and spatial structure, yet these are critical factors in assessing population status. Here we examine the keratinized hard tissues of the hawksbill (Eretmochelys imbricata) carapace and use bomb radiocarbon dating to estimate growth and maturity. Scutes have an established dietary record, yet the large keratin deposits of hawksbills evoke a reliable chronology. We sectioned, polished and imaged posterior marginal scutes from 36 individual hawksbills representing all life stages, several Pacific populations and spanning eight decades. We counted the apparent growth lines, microsampled along growth contours and calibrated Δ14C values to reference coral series. We fit von Bertalanffy growth function (VBGF) models to the results, producing a range of age estimates for each turtle. We find Hawaii hawksbills deposit eight growth lines annually (range 5–14), with model ensembles producing a somatic growth parameter (k) of 0.13 (range 0.1–0.2) and first breeding at 29 years (range 23–36). Recent bomb radiocarbon values also suggest declining trophic status. Together, our results may reflect long-term changes in the benthic community structure of Hawaii reefs, and possibly shed light on the critical population status for Hawaii hawksbills. © 2016 The Author(s). PMID:26740617
Jones, Terry L; Schlegel, Cara
2014-02-01
Accurate, precise, unbiased, reliable, and cost-effective estimates of nursing time use are needed to ensure safe staffing levels. Direct observation of nurses is costly, and conventional surrogate measures have limitations. To test the potential of electronic capture of time and motion through real time location systems (RTLS), a pilot study was conducted to assess efficacy (method agreement) of RTLS time use; inter-rater reliability of RTLS time-use estimates; and associated costs. Method agreement was high (mean absolute difference = 28 seconds); inter-rater reliability was high (ICC = 0.81-0.95; mean absolute difference = 2 seconds); and costs for obtaining RTLS time-use estimates on a single nursing unit exceeded $25,000. Continued experimentation with RTLS to obtain time-use estimates for nursing staff is warranted. © 2013 Wiley Periodicals, Inc.
Prediction of Software Reliability using Bio Inspired Soft Computing Techniques.
Diwaker, Chander; Tomar, Pradeep; Poonia, Ramesh C; Singh, Vijander
2018-04-10
Many models have been proposed for predicting software reliability. These models are restricted to particular methodologies and a limited number of parameters. A number of techniques and methodologies may be used for reliability prediction, and the choice of parameters needs attention when estimating reliability: the reliability of a system may increase or decrease depending on the parameters selected. There is thus a need to identify the factors that most heavily affect the reliability of the system. Reusability is now widely used in various areas of research and is the basis of the Component-Based System (CBS). Cost, time and human skill can be saved using Component-Based Software Engineering (CBSE) concepts. CBSE metrics may be used to assess which techniques are more suitable for estimating system reliability. Soft computing is used for small as well as large-scale problems where accurate results are difficult to obtain because of uncertainty or randomness. There are several possibilities for applying soft computing techniques to medicine-related problems. Clinical medicine makes significant use of fuzzy logic and neural network methodology, while basic medical science most frequently and preferably uses neural networks combined with genetic algorithms. Medical scientists have shown considerable interest in using soft computing methodologies in genetics, physiology, radiology, cardiology and neurology. CBSE encourages users to reuse past and existing software when making new products, providing quality while saving time, memory space, and money. This paper focuses on the assessment of commonly used soft computing techniques such as Genetic Algorithm (GA), Neural Network (NN), Fuzzy Logic, Support Vector Machine (SVM), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), and Artificial Bee Colony (ABC).
This paper presents how these soft computing techniques work and assesses their use for predicting reliability. The parameters considered when estimating and predicting reliability are also discussed. This study can be used in the estimation and prediction of the reliability of various instruments used in medical systems, software engineering, computer engineering and mechanical engineering. These concepts can be applied to both software and hardware to predict reliability using CBSE.
Obtaining Reliable Estimates of Ambulatory Physical Activity in People with Parkinson's Disease.
Paul, Serene S; Ellis, Terry D; Dibble, Leland E; Earhart, Gammon M; Ford, Matthew P; Foreman, K Bo; Cavanaugh, James T
2016-05-05
We determined the number of days required, and whether to include weekdays and/or weekends, to obtain reliable measures of ambulatory physical activity in people with Parkinson's disease (PD). Ninety-two persons with PD wore a step activity monitor for seven days. The number of days required to obtain a reliable estimate of daily activity was determined from the mean intraclass correlation (ICC2,1) for all possible combinations of 1-6 consecutive days of monitoring. Two days of monitoring were sufficient to obtain reliable daily activity estimates (ICC2,1 > 0.9). Amount (p = 0.03) but not intensity (p = 0.13) of ambulatory activity was greater on weekdays than weekends. Activity prescription based on amount rather than intensity may be more appropriate for people with PD.
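The ICC(2,1) statistic named in this abstract can be illustrated with a minimal sketch. It assumes the standard two-way random-effects, absolute-agreement, single-measurement definition, with subjects as rows and monitoring days as columns; the function name `icc_2_1` and the table layout are illustrative and are not taken from the study.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column per day.

    Assumed definition (two-way random effects, absolute agreement):
    ICC = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of repeated measurements per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), days (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because this form measures absolute agreement, a constant shift between days lowers the coefficient even when subjects keep the same rank order, which is one reason the choice of ICC form matters when deciding how many monitoring days are enough.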
NASA Astrophysics Data System (ADS)
Wan, Fubin; Tan, Yuanyuan; Jiang, Zhenhua; Chen, Xun; Wu, Yinong; Zhao, Peng
2017-12-01
Lifetime and reliability are the two performance parameters of prime importance for modern space Stirling-type pulse tube refrigerators (SPTRs), which are required to operate in excess of 10 years. Demonstrating these parameters poses a significant challenge. This paper proposes a lifetime prediction and reliability estimation method that utilizes accelerated degradation testing (ADT) for SPTRs related to gaseous contamination failure. The method was experimentally validated via three groups of gaseous contamination ADT. First, a performance degradation model based on the mechanism of contamination failure and the material outgassing characteristics of SPTRs was established. Next, a preliminary test was performed to determine whether the mechanism of contamination failure of the SPTRs during ADT is consistent with normal life testing. Subsequently, the experimental program of ADT was designed for SPTRs. Then, three groups of gaseous contamination ADT were performed at elevated ambient temperatures of 40 °C, 50 °C, and 60 °C, respectively, and the estimated lifetimes of the SPTRs under normal conditions were obtained through an acceleration model (the Arrhenius model). The results show good fitting of the degradation model with the experimental data. Finally, we obtained the reliability estimation of SPTRs using the Weibull distribution. The proposed methodology makes it possible to estimate, in less than one year, the reliability of SPTRs designed to operate for more than 10 years.
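The two models named in this abstract can be sketched in a few lines. This is an illustration of the textbook Arrhenius acceleration factor and the two-parameter Weibull reliability function, not the authors' degradation model; the activation energy, the use temperature, and the Weibull parameters passed in by a caller are assumptions, since the abstract gives only the stress temperatures (40 °C, 50 °C, 60 °C).

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor of a stress temperature relative to a use temperature.

    ea_ev is an assumed activation energy in eV; temperatures are in deg C.
    """
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_stress))


def weibull_reliability(t, scale, shape):
    """Probability of surviving past time t under a two-parameter Weibull model."""
    return math.exp(-((t / scale) ** shape))
```

With a hypothetical activation energy, multiplying a lifetime observed at a stress temperature by `arrhenius_acceleration(...)` gives the extrapolated lifetime at the use temperature, and `weibull_reliability` then converts a fitted lifetime distribution into a survival probability at a mission time such as 10 years.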
Cosmogenic nuclides in cometary materials: Implications for rate of mass loss and exposure history
NASA Astrophysics Data System (ADS)
Herzog, G. F.; Englert, P. A. J.; Reedy, R. C.
As planned, the Rosetta mission will return to earth with a 10-kg core and a 1-kg surface sample from a comet. The selection of a comet with low current activity will maximize the chance of obtaining material altered as little as possible. Current temperature and level of activity, however, may not reliably indicate previous values. Fortunately, from measurements of the cosmogenic nuclide contents of cometary material, one may estimate a rate of mass loss in the past and perhaps learn something about the exposure history of the comet. Perhaps the simplest way to estimate the rate of mass loss is to compare the total inventories of several long-lived cosmogenic radionuclides with the values expected on the basis of model calculations. Although model calculations have become steadily more reliable, application to bodies with the composition of comets will require some extension beyond the normal range of use. In particular, the influence of light elements on the secondary particle cascade will need study, in part through laboratory irradiations of volatile-rich materials. In the analysis of cometary data, it would be valuable to test calculations against measurements of short-lived isotopes.
Wesson, Jacqueline; Clemson, Lindy; Brodaty, Henry; Reppermund, Simone
2016-09-01
Functional cognition is a relatively new concept in assessment of older adults with mild cognitive impairment or dementia. Instruments need to be reliable and valid, hence we conducted a systematic review of observational assessments of task performance used to estimate functional cognition in this population. Two separate database searches were conducted: firstly to identify instruments; and secondly to identify studies reporting on the psychometric properties of the instruments. Studies were analysed using a published checklist and their quality reviewed according to specific published criteria. Clinical utility was reviewed and the information formulated into a best evidence synthesis. We found 21 instruments and included 58 studies reporting on measurement properties. The majority of studies were rated as being of fair methodological quality and the range of properties investigated was restricted. Most instruments had studies reporting on construct validity (hypothesis testing), none on content validity and there were few studies reporting on reliability. Overall the evidence on psychometric properties is lacking and there is an urgent need for further evaluation of instruments. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Ruan, Zhixing; Guo, Huadong; Liu, Guang; Yan, Shiyong
2014-01-01
Glacier movement is closely related to changes in climatic, hydrological, and geological factors. However, detecting glacier surface flow velocity with conventional ground surveys is challenging. Remote sensing techniques, especially synthetic aperture radar (SAR), provide regular observations covering larger-scale glacier regions. Glacier surface flow velocity in the West Kunlun Mountains is estimated using modified offset-tracking techniques based on ALOS/PALSAR images. Three maps of glacier flow velocity for the period 2007 to 2010 are derived from procedures of offset detection using cross correlation in the Fourier domain and global offset elimination of thin plate smooth splines. Our results indicate that, on average, winter glacier motion on the North Slope is 1 cm/day faster than on the South Slope, a result which corresponds well with the local topography. The performance of our method as regards the reliability of extracted displacements and the robustness of this algorithm are discussed. The SAR-based offset tracking is proven to be reliable and robust, making it possible to investigate comprehensive glacier movement and its response mechanism to environmental change.
Cruff, R.W.; Thompson, T.H.
1967-01-01
This study compared potential evapotranspiration, computed from climatological data by each of six empirical methods, with pan evaporation adjusted to equivalent lake evaporation by regional coefficients. The six methods tested were the Thornthwaite, U.S. Weather Bureau (a modification of the Penman method), Lowry-Johnson, Blaney-Criddle, Lane, and Hamon methods. The test was limited to 25 sites in the arid and subhumid parts of Arizona, California, and Nevada, where pan evaporation and concurrent climatological data were available. However, some of the sites lacked complete climatological data for the application of all six methods. Average values of adjusted pan evaporation and computed potential evapotranspiration were compared for two periods: the calendar year and the 6-month period from May 1 through October 31. The 25 sites sampled a wide range of climatic conditions. Ten sites (group 1) were in a highly arid environment and four (group 2) were in an arid environment that was modified by extensive irrigation. The remaining 11 sites (group 3) were in a subhumid environment. Only the Weather Bureau method gave estimates of potential evapotranspiration that closely agreed with the adjusted pan evaporation at all sites where the method was used. However, lack of climatological data restricted the use of the Weather Bureau method to seven sites. Results obtained by use of the Thornthwaite, Lowry-Johnson, and Hamon methods were consistently low. Results obtained by use of the Lane method agreed with adjusted pan evaporation at the group 1 sites but were consistently high at the group 2 and 3 sites. During the analysis it became apparent that adjusted pan evaporation in an arid environment (group 1 sites) was a spurious standard for evaluating the reliability of the methods that were tested. Group 1 data were accordingly not considered when making conclusions as to which of the six methods tested was best. 
The results of this study for group 2 and 3 data indicated that the Blaney-Criddle method, which uses climatological data that can be readily obtained or deduced, was the most practical of the six methods for estimating potential evapotranspiration. At all 15 sites in the two environments, potential evapotranspiration computed by the Blaney-Criddle method checked the adjusted pan evaporation within ±22 percent. This percentage range is generally considered to be the range of reliability for estimating lake evaporation from evaporation pans.
Concordance of DSM-IV Axis I and II diagnoses by personal and informant's interview.
Schneider, Barbara; Maurer, Konrad; Sargk, Dieter; Heiskel, Harald; Weber, Bernhard; Frölich, Lutz; Georgi, Klaus; Fritze, Jürgen; Seidler, Andreas
2004-06-30
The validity and reliability of using psychological autopsies to diagnose a psychiatric disorder is a critical issue. Therefore, interrater and test-retest reliability of the Structured Clinical Interview for DSM-IV Axis I and Personality Disorders and the usefulness of these instruments for the psychological autopsy method were investigated. Diagnoses by informant's interview were compared with diagnoses generated by a personal interview of 35 persons. Interrater reliability and test-retest reliability were assessed in 33 and 29 persons, respectively. Chi-square analysis, kappa and intraclass correlation coefficients, and Kendall's tau were used to determine agreement of diagnoses. Kappa coefficients were above 0.84 for substance-related disorders, mood disorders, and anxiety and adjustment disorders, and above 0.65 for Axis II disorders for interrater and test-retest reliability. Agreement by personal and relative's interview generated kappa coefficients above 0.79 for most Axis I and above 0.65 for most personality disorder diagnoses; Kendall's tau for dimensional individual personality disorder scores ranged from 0.22 to 0.72. Despite a small number of psychiatric disorders in the selected population, the present results provide support for the validity of most diagnoses obtained through the best-estimate method using the Structured Clinical Interview for DSM-IV Axis I and Personality Disorders. This instrument can be recommended as a tool for the psychological autopsy procedure in post-mortem research. Copyright 2004 Elsevier Ireland Ltd.
ERIC Educational Resources Information Center
Fife, Dustin A.; Mendoza, Jorge L.; Terry, Robert
2012-01-01
Though much research and attention has been directed at assessing the correlation coefficient under range restriction, the assessment of reliability under range restriction has been largely ignored. This article uses item response theory to simulate dichotomous item-level data to assess the robustness of KR-20 (α), ω, and test-retest…
The Reliability of Individualized Load-Velocity Profiles.
Banyard, Harry G; Nosaka, K; Vernon, Alex D; Haff, G Gregory
2017-11-15
This study examined the reliability of peak velocity (PV), mean propulsive velocity (MPV), and mean velocity (MV) in the development of load-velocity profiles (LVP) in the full depth free-weight back squat performed with maximal concentric effort. Eighteen resistance-trained men performed a baseline one-repetition maximum (1RM) back squat trial and three subsequent 1RM trials used for reliability analyses, with a 48-hour interval between trials. 1RM trials comprised lifts from six relative loads including 20, 40, 60, 80, 90, and 100% 1RM. Individualized LVPs for PV, MPV, or MV were derived from loads that were highly reliable based on the following criteria: intra-class correlation coefficient (ICC) >0.70, coefficient of variation (CV) ≤10%, and Cohen's d effect size (ES) <0.60. PV was highly reliable at all six loads. Importantly, MPV and MV were highly reliable at 20, 40, 60, 80 and 90% but not 100% 1RM (MPV: ICC=0.66, CV=18.0%, ES=0.10, standard error of the estimate [SEM]=0.04 m·s⁻¹; MV: ICC=0.55, CV=19.4%, ES=0.08, SEM=0.04 m·s⁻¹). When considering the reliable ranges, almost perfect correlations were observed for LVPs derived from PV 20-100% (r=0.91-0.93), MPV 20-90% (r=0.92-0.94) and MV 20-90% (r=0.94-0.95). Furthermore, the LVPs were not significantly different (p>0.05) between trials, movement velocities, or between linear regression versus second order polynomial fits. PV 20-100%, MPV 20-90%, and MV 20-90% are reliable and can be utilized to develop LVPs using linear regression. Conceptually, LVPs can be used to monitor changes in movement velocity and employed as a method for adjusting sessional training loads according to daily readiness.
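The three-part reliability criterion stated in this abstract (ICC >0.70, CV ≤10%, ES <0.60) can be written as a small check. The sketch below is illustrative only: the function name is hypothetical, the use of the absolute value of the effect size is an assumption, and the two tuples are the 100% 1RM values reported above for MPV and MV.

```python
def is_highly_reliable(icc, cv_percent, effect_size):
    """Apply the abstract's criteria: ICC > 0.70, CV <= 10 %, and ES < 0.60.

    Taking abs() of the effect size is an assumption made for this sketch.
    """
    return icc > 0.70 and cv_percent <= 10.0 and abs(effect_size) < 0.60


# (ICC, CV in percent, Cohen's d) at 100% 1RM, as reported in the abstract.
reported_100_1rm = {
    "MPV": (0.66, 18.0, 0.10),
    "MV": (0.55, 19.4, 0.08),
}

verdicts = {name: is_highly_reliable(*values)
            for name, values in reported_100_1rm.items()}
```

Both entries fail on ICC and CV even though their effect sizes are small, which matches the abstract's statement that MPV and MV were not highly reliable at 100% 1RM.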
Building beef cow nutritional programs with the 1996 NRC beef cattle requirements model.
Lardy, G P; Adams, D C; Klopfenstein, T J; Patterson, H H
2004-01-01
Designing a sound cow-calf nutritional program requires knowledge of nutrient requirements, diet quality, and intake. Effectively using the NRC (1996) beef cattle requirements model (1996NRC) also requires knowledge of dietary degradable intake protein (DIP) and microbial efficiency. Objectives of this paper are to 1) describe a framework in which 1996NRC-applicable data can be generated, 2) describe seasonal changes in nutrients on native range, 3) use the 1996NRC to predict nutrient balance for cattle grazing these forages, and 4) make recommendations for using the 1996NRC for forage-fed cattle. Extrusa samples were collected over 2 yr on native upland range and subirrigated meadow in the Nebraska Sandhills. Samples were analyzed for CP, in vitro OM digestibility (IVOMD), and DIP. Regression equations to predict nutrients were developed from these data. The 1996NRC was used to predict nutrient balances based on the dietary nutrient analyses. Recommendations for model users were also developed. On subirrigated meadow, CP and IVOMD increased rapidly during March and April. On native range, CP and IVOMD increased from April through June but decreased rapidly from August through September. Degradable intake protein (DM basis) followed trends similar to CP for both native range and subirrigated meadow. Predicted nutrient balances for spring- and summer-calving cows agreed with reported values in the literature, provided that IVOMD values were converted to DE before use in the model (1.07 x IVOMD - 8.13). When the IVOMD-to-DE conversion was not used, the model gave unrealistically high NE(m) balances. To effectively use the 1996NRC to estimate protein requirements, users should focus on three key estimates: DIP, microbial efficiency, and TDN intake. Consequently, efforts should be focused on adequately describing seasonal changes in forage nutrient content. 
In order to increase use of the 1996NRC, research is needed in the following areas: 1) cost-effective and accurate commercial laboratory procedures to estimate DIP, 2) reliable estimates or indicators of microbial efficiency for various forage types and qualities, 3) improved estimates of dietary TDN for forage-based diets, 4) validation work to improve estimates of DIP and MP requirements, and 5) incorporation of nitrogen recycling estimates.
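The IVOMD-to-DE conversion quoted in this abstract (1.07 x IVOMD - 8.13) is simple enough to state as a one-line function. The sketch assumes that IVOMD is entered as a percentage and that the result is on the same percentage scale; the abstract does not state the units, and the function name is hypothetical.

```python
def ivomd_to_de(ivomd_percent):
    """Convert in vitro OM digestibility to DE using 1.07 * IVOMD - 8.13.

    Percent units for both input and output are an assumption of this sketch.
    """
    return 1.07 * ivomd_percent - 8.13
```

Under that assumption, the converted value falls below the raw IVOMD across the typical forage range, which is consistent with the abstract's remark that skipping the conversion gave unrealistically high NE(m) balances.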
Li, X; Lund, M S; Zhang, Q; Costa, C N; Ducrocq, V; Su, G
2016-06-01
The present study investigated the improvement of prediction reliabilities for 3 production traits in Brazilian Holsteins that had no genotype information by adding information from Nordic and French Holstein bulls that had genotypes. The estimated across-country genetic correlations (ranging from 0.604 to 0.726) indicated that an important genotype by environment interaction exists between Brazilian and Nordic (or Nordic and French) populations. Prediction reliabilities for Brazilian genotyped bulls were greatly increased by including data of Nordic and French bulls, and a 2-trait single-step genomic BLUP performed much better than the corresponding pedigree-based BLUP. However, only a minor improvement in prediction reliabilities was observed in nongenotyped Brazilian cows. The results indicate that although there is a large genotype by environment interaction, inclusion of a foreign reference population can improve accuracy of genetic evaluation for the Brazilian Holstein population. However, a Brazilian reference population is necessary to obtain a more accurate genomic evaluation. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
ASSESSING AND COMBINING RELIABILITY OF PROTEIN INTERACTION SOURCES
LEACH, SONIA; GABOW, AARON; HUNTER, LAWRENCE; GOLDBERG, DEBRA S.
2008-01-01
Integrating diverse sources of interaction information to create protein networks requires strategies sensitive to differences in accuracy and coverage of each source. Previous integration approaches calculate reliabilities of protein interaction information sources based on congruity to a designated ‘gold standard.’ In this paper, we provide a comparison of the two most popular existing approaches and propose a novel alternative for assessing reliabilities which does not require a gold standard. We identify a new method for combining the resultant reliabilities and compare it against an existing method. Further, we propose an extrinsic approach to evaluation of reliability estimates, considering their influence on the downstream tasks of inferring protein function and learning regulatory networks from expression data. Results using this evaluation method show 1) our method for reliability estimation is an attractive alternative to those requiring a gold standard and 2) the new method for combining reliabilities is less sensitive to noise in reliability assignments than the similar existing technique. PMID:17990508
Revisions of rump fat and body scoring indices for deer, elk, and moose
Cook, Rachel C.; Cook, John G.; Stephenson, Thomas R.; Myers, Woodrow L.; Mccorquodale, Scott M.; Vales, David J.; Irwin, Larry L.; Hall, P. Briggs; Spencer, Rocky D.; Murphie, Shannon L.; Schoenecker, Kathryn A.; Miller, Patrick J.
2010-01-01
Because they do not require sacrificing animals, body condition scores (BCS), thickness of rump fat (MAXFAT), and other similar predictors of body fat have advanced estimating nutritional condition of ungulates and their use has proliferated in North America in the last decade. However, initial testing of these predictors was too limited to assess their reliability among diverse habitats, ecotypes, subspecies, and populations across the continent. With data collected from mule deer (Odocoileus hemionus), elk (Cervus elaphus), and moose (Alces alces) during initial model development and data collected subsequently from free-ranging mule deer and elk herds across much of the western United States, we evaluated reliability across a broader range of conditions than were initially available. First, to more rigorously test reliability of the MAXFAT index, we evaluated its robustness across the 3 species, using an allometric scaling function to adjust for differences in animal size. We then evaluated MAXFAT, rump body condition score (rBCS), rLIVINDEX (an arithmetic combination of MAXFAT and rBCS), and our new allometrically scaled rump-fat thickness index using data from 815 free-ranging female Roosevelt and Rocky Mountain elk (C. e. roosevelti and C. e. nelsoni) from 19 populations encompassing 4 geographic regions and 250 free-ranging female mule deer from 7 populations and 2 regions. We tested for effects of subspecies, geographic region, and captive versus free-ranging existence. Rump-fat thickness, when scaled allometrically with body mass, was related to ingesta-free body fat over a 38–522-kg range of body mass (r2 = 0.87; P < 0.001), indicating the technique is remarkably robust among at least the 3 cervid species of our analysis. However, we found an underscoring bias with the rBCS for elk that had >12% body fat. This bias translated into a difference between subspecies, because Rocky Mountain elk tended to be fatter than Roosevelt elk in our sample. 
Effects of observer error with the rBCS also existed for mule deer with moderate to high levels of body fat, and deer body size significantly affected accuracy of the MAXFAT predictor. Our analyses confirm robustness of the rump-fat index for these 3 species but highlight the potential for bias due to differences in body size and to observer error with BCS scoring. We present alternative LIVINDEX equations where potential bias from rBCS and bias due to body size are eliminated or reduced. These modifications improve the accuracy of estimating body fat for projects intended to monitor nutritional status of herds or to evaluate nutrition's influence on population demographics.
A reliable simultaneous representation of seismic hazard and of ground shaking recurrence
NASA Astrophysics Data System (ADS)
Peresan, A.; Panza, G. F.; Magrin, A.; Vaccari, F.
2015-12-01
Different earthquake hazard maps may be appropriate for different purposes - such as emergency management, insurance and engineering design. Accounting for the lower occurrence rate of larger sporadic earthquakes may allow cost-effective policies to be formulated in some specific applications, provided that statistically sound recurrence estimates are used, which is not typically the case for PSHA (Probabilistic Seismic Hazard Assessment). We illustrate the procedure to associate the expected ground motions from Neo-deterministic Seismic Hazard Assessment (NDSHA) with an estimate of their recurrence. Neo-deterministic refers to a scenario-based approach, which allows for the construction of a broad range of earthquake scenarios via full waveform modeling. From the synthetic seismograms the estimates of peak ground acceleration, velocity and displacement, or any other parameter relevant to seismic engineering, can be extracted. NDSHA, in its standard form, defines the hazard computed from a wide set of scenario earthquakes (including the largest deterministically or historically defined credible earthquake, MCE) and it does not supply the frequency of occurrence of the expected ground shaking. A recent enhanced variant of NDSHA that reliably accounts for recurrence has been developed and it is applied to the Italian territory. The characterization of the frequency-magnitude relation can be performed by any statistically sound method supported by data (e.g. multi-scale seismicity model), so that a recurrence estimate is associated with each of the pertinent sources. In this way a standard NDSHA map of ground shaking is obtained simultaneously with the map of the corresponding recurrences. The introduction of recurrence estimates in NDSHA naturally allows for the generation of ground shaking maps at specified return periods. This permits a straightforward comparison between NDSHA and PSHA maps.
NASA Astrophysics Data System (ADS)
Boergens, Eva; Dettmering, Denise; Schwatke, Christian
2015-04-01
For many years the number of in-situ gauging stations has been declining. Satellite altimetry can be used as a gap-filler even over smaller inland waters like rivers. However, since altimetry measurements are not designed for inland water bodies, special data handling is necessary in order to estimate reliable water level heights over inland waters. We developed a new routine for estimating water level heights over smaller inland waters with satellite altimetry by correcting the hooking effect. The hooking effect occurs when, before and after passing a water body, the altimeter does not measure at nadir because the water reflects more strongly than the surrounding land surface. These off-nadir measurements, together with the motion of the satellite, lead to overlong ranges and to heights declining in a parabolic shape. The vertex of this parabola is on the water surface. Therefore, by estimating the parabola we are able to determine the water level height without needing any point over the water body itself. For estimating the parabola we use only selected measurements that are affected by the hooking effect. The applied search approach is based on the RANSAC algorithm (random sample consensus), a non-deterministic algorithm especially designed for finding geometric entities in point clouds with many outliers. With the hooking effect correction we are able to retrieve water level height time series for the Mekong River from Envisat and Saral/Altika high frequency data. It is possible to determine reliable time series even if the river has a width of only 500 m or less. The expected annual variations are clearly depicted, and the comparison of the time series with available in-situ gauging data shows very good agreement.
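The parabola search described in this abstract can be illustrated with a small sketch. It is not the authors' routine: the along-track coordinates, heights, outliers, tolerance and iteration count below are all invented for illustration, and the function names (`parabola_through`, `ransac_parabola`, `vertex_height`) are hypothetical. It only shows the general idea of fitting a downward-opening parabola by random sample consensus and reading the water level from its vertex, using no point over the water itself.

```python
import random

def parabola_through(p1, p2, p3):
    """Coefficients (a, b, c) of y = a*x^2 + b*x + c through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d1 = (y2 - y1) / (x2 - x1)          # first divided differences
    d2 = (y3 - y2) / (x3 - x2)
    a = (d2 - d1) / (x3 - x1)           # second divided difference
    b = d1 - a * (x1 + x2)
    c = y1 - a * x1 ** 2 - b * x1
    return a, b, c

def ransac_parabola(points, n_iter=500, tol=0.05, seed=0):
    """Keep the 3-point parabola that explains the most measurements."""
    rng = random.Random(seed)
    best, best_inliers = None, []
    for _ in range(n_iter):
        sample = rng.sample(points, 3)
        if len({x for x, _ in sample}) < 3:
            continue                    # need three distinct positions
        a, b, c = parabola_through(*sample)
        if a >= 0:
            continue                    # hooking heights decline away from the vertex
        inliers = [(x, y) for x, y in points
                   if abs(a * x * x + b * x + c - y) <= tol]
        if len(inliers) > len(best_inliers):
            best, best_inliers = (a, b, c), inliers
    return best, best_inliers

def vertex_height(a, b, c):
    """Height of the parabola's vertex, taken here as the water level."""
    return c - b * b / (4 * a)

# Invented data: hooking-affected heights around a water level of 10.0 m,
# with no measurement closer than 0.5 km to the water (assumed at x = 0).
grid = [-2.0 + 0.25 * i for i in range(17)]
hooked = [(x, 10.0 - 0.8 * x * x) for x in grid if abs(x) >= 0.5]
outliers = [(-1.9, 14.2), (-1.1, 7.3), (-0.3, 12.9),
            (0.6, 5.1), (1.3, 13.4), (1.7, 8.8)]
points = hooked + outliers

model, inliers = ransac_parabola(points)
level = vertex_height(*model)
print(f"estimated water level: {level:.3f} m from {len(inliers)} inliers")
```

In practice one would probably refit the parabola to all inliers by least squares and choose the tolerance from the altimeter's noise level; both steps are left out here to keep the sketch short.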
Between-User Reliability of Tier 1 Exposure Assessment Tools Used Under REACH.
Lamb, Judith; Galea, Karen S; Miller, Brian G; Hesse, Susanne; Van Tongeren, Martie
2017-10-01
When applying simple screening (Tier 1) tools to estimate exposure to chemicals in a given exposure situation under the Registration, Evaluation, Authorisation and restriction of CHemicals Regulation 2006 (REACH), users must select from several possible input parameters. Previous studies have suggested that results from exposure assessments using expert judgement and from the use of modelling tools can vary considerably between assessors. This study aimed to investigate the between-user reliability of Tier 1 tools. A remote-completion exercise and an in-person workshop were used to identify and evaluate tool parameters and factors, such as user demographics, that may be associated with between-user variability. Participants (N = 146) generated dermal and inhalation exposure estimates (N = 4066) from specified workplace descriptions ('exposure situations') and Tier 1 tool combinations (N = 20). Interactions between users, tools, and situations were investigated and described. Systematic variation associated with individual users was minor compared with random between-user variation. Although variation was observed between choices made for the majority of input parameters, differing choices of Process Category ('PROC') code/activity descriptor and dustiness level had the greatest impact on the resultant exposure estimates. Exposure estimates ranging over several orders of magnitude were generated for the same exposure situation by different tool users. Such unpredictable between-user variation will reduce consistency within REACH processes and could result in underestimation or overestimation of exposure, risking worker ill-health or the implementation of unnecessary risk controls, respectively. Implementation of additional support and quality control systems for all tool users is needed to reduce between-assessor variation and so ensure both the protection of worker health and avoidance of unnecessary business risk management expenditure. © The Author 2017. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
Robust dead reckoning system for mobile robots based on particle filter and raw range scan.
Duan, Zhuohua; Cai, Zixing; Min, Huaqing
2014-09-04
Robust dead reckoning is a complicated problem for wheeled mobile robots (WMRs) when the robots are subject to faults, such as sticking sensors or slipping wheels, because the discrete fault models and the continuous states have to be estimated simultaneously to reach a reliable fault diagnosis and accurate dead reckoning. Particle filters are one of the most promising approaches to hybrid system estimation problems, and they have been widely used in many WMR applications, such as pose tracking, SLAM, video tracking and fault identification. In this paper, the readings of a laser range finder, which may themselves be corrupted by noise, are used to reach accurate dead reckoning. The main contribution is a systematic method for performing fault diagnosis and dead reckoning concurrently in a particle filter framework. First, the perception model of a laser range finder is given, where the raw scan may be faulty. Second, the kinematics of the normal model and of the different fault models for WMRs are given. Third, the particle filter for fault diagnosis and dead reckoning is discussed. Finally, experiments and analyses are reported to show the accuracy and efficiency of the presented method. PMID:25192318
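The idea of estimating a discrete fault mode and a continuous pose together in one particle filter can be sketched on a toy problem. The following is not the method of the paper: it assumes a robot moving in one dimension towards a wall, a single hypothetical fault ("slip", in which the wheels turn but the robot does not move), a single range reading per step instead of a raw scan, and invented noise levels and switching probability. Each particle carries a pair (mode, position), so resampling on the range likelihood selects both at once.

```python
import math
import random

WALL = 20.0   # hypothetical wall position seen by the range finder (m)
STEP = 0.5    # displacement reported by odometry at every step (m)

def simulate(rng, n_steps=30, slip_steps=range(10, 20), sigma_z=0.05):
    """Return true positions, true modes and noisy range readings."""
    x, xs, modes, zs = 0.0, [], [], []
    for t in range(n_steps):
        mode = "slip" if t in slip_steps else "normal"
        x += STEP if mode == "normal" else 0.0   # slipping wheels: no motion
        xs.append(x)
        modes.append(mode)
        zs.append(WALL - x + rng.gauss(0.0, sigma_z))
    return xs, modes, zs

def particle_filter(zs, rng, n=500, p_switch=0.1, sigma_x=0.05, sigma_z=0.1):
    """Jointly track the fault mode and the position from range readings."""
    particles = [("normal", 0.0)] * n
    est_x, est_mode = [], []
    for z in zs:
        # Predict: let the mode switch at random, then move according to it.
        moved = []
        for mode, x in particles:
            if rng.random() < p_switch:
                mode = "slip" if mode == "normal" else "normal"
            dx = STEP if mode == "normal" else 0.0
            moved.append((mode, x + dx + rng.gauss(0.0, sigma_x)))
        # Update: weight each particle by the Gaussian range likelihood.
        weights = [math.exp(-0.5 * ((z - (WALL - x)) / sigma_z) ** 2)
                   for _, x in moved]
        total = sum(weights)
        if total == 0.0:                 # guard against numerical underflow
            weights, total = [1.0] * n, float(n)
        est_x.append(sum(w * x for w, (_, x) in zip(weights, moved)) / total)
        slip_mass = sum(w for w, (m, _) in zip(weights, moved)
                        if m == "slip") / total
        est_mode.append("slip" if slip_mass > 0.5 else "normal")
        # Resample in proportion to the weights.
        particles = rng.choices(moved, weights=weights, k=n)
    return est_x, est_mode

rng = random.Random(1)
true_x, true_mode, ranges = simulate(rng)
est_x, est_mode = particle_filter(ranges, rng)
odometry_only = STEP * len(ranges)       # dead reckoning that ignores faults

print(f"true final position:     {true_x[-1]:.2f} m")
print(f"odometry-only estimate:  {odometry_only:.2f} m")
print(f"particle filter estimate: {est_x[-1]:.2f} m")
```

In this toy setting the odometry-only estimate drifts by the full distance "travelled" while slipping, whereas the filter can use the range readings to recognise the slip mode and keep the position estimate close to the truth.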
PREDICTION OF RELIABILITY IN BIOGRAPHICAL QUESTIONNAIRES.
ERIC Educational Resources Information Center
STARRY, ALLAN R.
THE OBJECTIVES OF THIS STUDY WERE (1) TO DEVELOP A GENERAL CLASSIFICATION SYSTEM FOR LIFE HISTORY ITEMS, (2) TO DETERMINE TEST-RETEST RELIABILITY ESTIMATES, AND (3) TO ESTIMATE RESISTANCE TO EXAMINEE FAKING, FOR REPRESENTATIVE BIOGRAPHICAL QUESTIONNAIRES. TWO 100-ITEM QUESTIONNAIRES WERE CONSTRUCTED THROUGH RANDOM ASSIGNMENT BY CONTENT AREA OF 200…
Monte Carlo Approach for Reliability Estimations in Generalizability Studies.
ERIC Educational Resources Information Center
Dimitrov, Dimiter M.
A Monte Carlo approach is proposed, using the Statistical Analysis System (SAS) programming language, for estimating reliability coefficients in generalizability theory studies. Test scores are generated by a probabilistic model that considers the probability for a person with a given ability score to answer an item with a given difficulty…