Sample records for statistical performance evaluation

  1. EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.

    PubMed

    Tong, Xiaoxiao; Bentler, Peter M

    2013-01-01

    Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performed badly in most conditions except under the normal distribution. The goodness-of-fit χ² test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.
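    The Monte Carlo logic behind such calibration studies can be sketched in a few lines. This is not the authors' code; it is a minimal illustration of estimating a test statistic's empirical rejection rate under the null, with the degrees of freedom and critical value chosen arbitrarily for the example:

```python
import random

def chi2_draw(rng, df):
    # A chi-square variate is the sum of df squared standard normals.
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(42)
df = 8
crit = 15.507  # 95th percentile of the chi-square(8) distribution
reps = 20000

# Empirical Type I error rate: fraction of null draws exceeding the nominal
# 5% critical value; a well-calibrated statistic stays near 0.05.
rejection_rate = sum(chi2_draw(rng, df) > crit for _ in range(reps)) / reps
```

    In a real study such as this one, the draws would come from fitting the SEM to simulated (possibly nonnormal) data rather than directly from the reference distribution, and the gap between the empirical and nominal rates is exactly what the paper reports.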

  2. Evaluation of adding item-response theory analysis for evaluation of the European Board of Ophthalmology Diploma examination.

    PubMed

    Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José

    2013-11-01

    To investigate whether introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20 ≥ 0.90) to be kept as the examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and of its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good, though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011: 0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the only crucial parameter that allows comparison across examinations. While individual item performance analysis is worthwhile as a secondary analysis, drawing final conclusions from it is more difficult. Performance parameters need to be related, as shown by IRT analysis.
Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. Introduction of negative marking has led to a significant increase in the reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
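    For reference, the KR-20 reliability coefficient used throughout this record is straightforward to compute for dichotomously scored (T/F) items. The sketch below uses a small made-up response matrix, not EBOD data, and the population-variance convention for the total-score variance:

```python
# Hypothetical 0/1 response matrix: rows = examinees, columns = items.
responses = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
]

def kr20(matrix):
    n = len(matrix)     # number of examinees
    k = len(matrix[0])  # number of items
    totals = [sum(row) for row in matrix]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    # Sum of item variances p*(1-p) for dichotomous items.
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in matrix) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_t)
```

    On this toy matrix KR-20 is well below the 0.90 threshold the paper uses, as expected for five items; reliability generally rises with test length.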

  3. Evaluation of a New Mean Scaled and Moment Adjusted Test Statistic for SEM

    ERIC Educational Resources Information Center

    Tong, Xiaoxiao; Bentler, Peter M.

    2013-01-01

    Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and 2 well-known robust test…

  4. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Science Inventory

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...

  5. 10 CFR 431.445 - Determination of small electric motor efficiency.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... statistical analysis, computer simulation or modeling, or other analytic evaluation of performance data. (3... statistical analysis, computer simulation or modeling, and other analytic evaluation of performance data on.... (ii) If requested by the Department, the manufacturer shall conduct simulations to predict the...

  6. Analysis of Professional and Pre-Accession Characteristics and Junior Naval Officer Performance

    DTIC Science & Technology

    2018-03-01

    Table-of-contents excerpts: Review (p. 5); A. Navy Performance Evaluation System (p. 5); B. Professional…; A. Data Description (p. 17); B. Summary Statistics (p. 24); C. Descriptive Statistics

  7. Performance Analysis of Live-Virtual-Constructive and Distributed Virtual Simulations: Defining Requirements in Terms of Temporal Consistency

    DTIC Science & Technology

    2009-12-01

    events. Work associated with aperiodic tasks has the same statistical behavior and the same timing requirements. The timing deadlines are soft. • Sporadic…answers, but it is possible to calculate how precise the estimates are. Simulation-based performance analysis of a model includes a statistical…to evaluate all possible states in a timely manner. This is the principal reason for resorting to simulation and statistical analysis to evaluate

  8. Statistically significant performance results of a mine detector and fusion algorithm from an x-band high-resolution SAR

    NASA Astrophysics Data System (ADS)

    Williams, Arnold C.; Pachowicz, Peter W.

    2004-09-01

    Current mine detection research indicates that no single sensor or single look from a sensor will detect mines/minefields in a real-time manner at a performance level suitable for a forward maneuver unit. Hence, the integrated development of detectors and fusion algorithms are of primary importance. A problem in this development process has been the evaluation of these algorithms with relatively small data sets, leading to anecdotal and frequently over trained results. These anecdotal results are often unreliable and conflicting among various sensors and algorithms. Consequently, the physical phenomena that ought to be exploited and the performance benefits of this exploitation are often ambiguous. The Army RDECOM CERDEC Night Vision Laboratory and Electron Sensors Directorate has collected large amounts of multisensor data such that statistically significant evaluations of detection and fusion algorithms can be obtained. Even with these large data sets care must be taken in algorithm design and data processing to achieve statistically significant performance results for combined detectors and fusion algorithms. This paper discusses statistically significant detection and combined multilook fusion results for the Ellipse Detector (ED) and the Piecewise Level Fusion Algorithm (PLFA). These statistically significant performance results are characterized by ROC curves that have been obtained through processing this multilook data for the high resolution SAR data of the Veridian X-Band radar. We discuss the implications of these results on mine detection and the importance of statistical significance, sample size, ground truth, and algorithm design in performance evaluation.
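    The ROC curves the paper reports are, at bottom, threshold sweeps over detector scores. A minimal empirical-ROC sketch, with hypothetical scores standing in for the SAR detector outputs:

```python
def roc_points(pos_scores, neg_scores):
    # Sweep a threshold over every observed score (higher score = "target"),
    # recording (false positive rate, true positive rate) at each threshold.
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    pts = []
    for th in thresholds:
        tpr = sum(s >= th for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= th for s in neg_scores) / len(neg_scores)
        pts.append((fpr, tpr))
    return pts

# Hypothetical detector scores on mine returns vs. clutter returns.
curve = roc_points([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

    The paper's point about statistical significance amounts to putting confidence bounds on each (FPR, TPR) point, which requires the large sample sizes the data collection was designed to provide.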

  9. Uses and Misuses of Student Evaluations of Teaching: The Interpretation of Differences in Teaching Evaluation Means Irrespective of Statistical Information

    ERIC Educational Resources Information Center

    Boysen, Guy A.

    2015-01-01

    Student evaluations of teaching are among the most accepted and important indicators of college teachers' performance. However, faculty and administrators can overinterpret small variations in mean teaching evaluations. The current research examined the effect of including statistical information on the interpretation of teaching evaluations.…

  10. Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis

    NASA Astrophysics Data System (ADS)

    Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan

    2017-10-01

    This research paper aims to propose a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper also discusses the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. This study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using parametric statistical significance tests. The evaluation process statistically demonstrated that the ACO-KNN algorithm improved significantly over the baseline algorithms. In addition, the experimental results showed that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a quality, optimal feature subset that represents the actual data in customer review datasets.
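    The paper uses parametric significance tests; as a distribution-free stand-in that needs no lookup tables, a paired permutation test over per-fold scores can be sketched as below. The fold scores are hypothetical, not the study's data:

```python
import random

def paired_permutation_pvalue(a, b, reps=10000, seed=0):
    # Two-sided test on the mean paired difference: randomly flip the sign
    # of each difference and count how often the permuted mean is at least
    # as extreme as the observed one.
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(reps):
        m = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(m) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)

# Hypothetical per-fold F-scores for two feature-selection methods.
p_better = paired_permutation_pvalue([0.9] * 8, [0.6] * 8)
p_same = paired_permutation_pvalue([0.5] * 8, [0.5] * 8)
```

    A consistently better method yields a small p-value; identical scores yield p = 1.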

  11. Empirical performance of interpolation techniques in risk-neutral density (RND) estimation

    NASA Astrophysics Data System (ADS)

    Bahaludin, H.; Abdullah, M. H.

    2017-03-01

    The objective of this study is to evaluate the empirical performance of interpolation techniques in risk-neutral density (RND) estimation. Firstly, the empirical performance is evaluated by using statistical analysis based on the implied mean and the implied variance of the RND. Secondly, the interpolation performance is measured based on pricing error. We propose using the leave-one-out cross-validation (LOOCV) pricing error for interpolation selection purposes. The statistical analyses indicate that there are statistical differences between the interpolation techniques: second-order polynomial, fourth-order polynomial, and smoothing spline. The LOOCV pricing-error results show that fourth-order polynomial interpolation provides the best fit to option prices, yielding the lowest pricing error.
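    The LOOCV pricing-error idea can be illustrated with the simplest possible interpolator. The sketch below leaves each interior point out, predicts it by linear interpolation between its neighbors, and averages the absolute error; the paper's polynomial and spline fits follow the same leave-one-out pattern, and the strike/price pairs here are hypothetical:

```python
def loocv_interp_error(xs, ys):
    # Leave each interior point out, predict it by linear interpolation
    # between its two neighbors, and average the absolute pricing error.
    errs = []
    for i in range(1, len(xs) - 1):
        x0, x1 = xs[i - 1], xs[i + 1]
        y0, y1 = ys[i - 1], ys[i + 1]
        pred = y0 + (y1 - y0) * (xs[i] - x0) / (x1 - x0)
        errs.append(abs(pred - ys[i]))
    return sum(errs) / len(errs)

# Hypothetical strikes and option prices; a perfectly linear relationship
# gives zero LOOCV error under a linear interpolator.
err = loocv_interp_error([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0])
```

    Comparing this error across candidate interpolators is the selection criterion the paper proposes.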

  12. Statistical evaluation of vibration analysis techniques

    NASA Technical Reports Server (NTRS)

    Milner, G. Martin; Miller, Patrice S.

    1987-01-01

    An evaluation methodology is presented for a selection of candidate vibration analysis techniques applicable to machinery representative of the environmental control and life support system of advanced spacecraft; illustrative results are given. Attention is given to the statistical analysis of small sample experiments, the quantification of detection performance for diverse techniques through the computation of probability of detection versus probability of false alarm, and the quantification of diagnostic performance.

  13. DECIDE: a software for computer-assisted evaluation of diagnostic test performance.

    PubMed

    Chiecchio, A; Bo, A; Manzone, P; Giglioli, F

    1993-05-01

    The evaluation of the performance of clinical tests is a complex problem involving different steps and many statistical tools, not always structured in an organic and rational system. This paper presents software that provides an organic system of statistical tools to aid the evaluation of clinical test performance. The program allows (a) the building and organization of a working database, (b) the selection of the minimal set of tests with the maximum information content, (c) the search for the model best fitting the distribution of the test values, (d) the selection of the optimal diagnostic cut-off value of the test for every positive/negative situation, and (e) the evaluation of the performance of combinations of correlated and uncorrelated tests. The uncertainty associated with all the variables involved is evaluated. The program runs in an MS-DOS environment with an EGA or better graphics card.
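    One common recipe for step (d), choosing a diagnostic cut-off, is maximizing Youden's J statistic. The sketch below is an illustration of that recipe with hypothetical test values, not the program's actual algorithm:

```python
def best_cutoff(pos, neg):
    # Youden's J = sensitivity + specificity - 1, maximized over all
    # candidate cutoffs (values >= cutoff are called "positive").
    best_t, best_j = None, -1.0
    for t in sorted(set(pos) | set(neg)):
        sens = sum(s >= t for s in pos) / len(pos)
        spec = sum(s < t for s in neg) / len(neg)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

    On perfectly separated toy data the optimal cutoff sits at the smallest positive value, with J = 1; real assays trade sensitivity against specificity.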

  14. Evaluating the performance of a fault detection and diagnostic system for vapor compression equipment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Breuker, M.S.; Braun, J.E.

    This paper presents a detailed evaluation of the performance of a statistical, rule-based fault detection and diagnostic (FDD) technique presented by Rossi and Braun (1997). Steady-state and transient tests were performed on a simple rooftop air conditioner over a range of conditions and fault levels. The steady-state data without faults were used to train models that predict outputs for normal operation. The transient data with faults were used to evaluate FDD performance. The effect of a number of design variables on FDD sensitivity for different faults was evaluated, and two prototype systems were specified for more complete evaluation. Good performance was achieved in detecting and diagnosing five faults using only six temperatures (two input and four output) and linear models. The performance improved by about a factor of two when ten measurements (three input and seven output) and higher-order models were used. This approach for evaluating and optimizing the performance of the statistical, rule-based FDD technique could be used as a design and evaluation tool when applying this FDD method to other packaged air-conditioning systems. Furthermore, the approach could also be modified to evaluate the performance of other FDD methods.

  15. Evaluation of Fuzzy-Logic Framework for Spatial Statistics Preserving Methods for Estimation of Missing Precipitation Data

    NASA Astrophysics Data System (ADS)

    El Sharif, H.; Teegavarapu, R. S.

    2012-12-01

    Spatial interpolation methods used for estimation of missing precipitation data at a site seldom check for their ability to preserve site and regional statistics. Such statistics are primarily defined by spatial correlations and other site-to-site statistics in a region. Preservation of site and regional statistics represents a means of assessing the validity of missing precipitation estimates at a site. This study evaluates the efficacy of a fuzzy-logic methodology for infilling missing historical daily precipitation data in preserving site and regional statistics. Rain gauge sites in the state of Kentucky, USA, are used as a case study for evaluation of this newly proposed method in comparison to traditional data infilling techniques. Several error and performance measures are used to evaluate the methods and the trade-offs between accuracy of estimation and preservation of site and regional statistics.

  16. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution times. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reduced memory requirements and remarkably increased speed. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists identify efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
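    The core objects in such comparisons, k-mer histograms and a histogram statistic, fit in a few lines. Manhattan (L1) distance here is just one of the simpler statistics a survey like this covers, and the sequences are toy examples:

```python
from itertools import product

def kmer_hist(seq, k=2):
    # Histogram over all DNA words of length k (assumes an A/C/G/T alphabet;
    # real tools must also handle ambiguity codes and lowercase).
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    return [counts[w] for w in sorted(counts)]

def manhattan(h1, h2):
    # One of the simpler histogram statistics: L1 distance.
    return sum(abs(a - b) for a, b in zip(h1, h2))
```

    Because the histogram has 4^k bins, the paper's observation that shorter words often suffice translates directly into smaller memory footprints and faster distance computations.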

  17. Environmental Health Practice: Statistically Based Performance Measurement

    PubMed Central

    Enander, Richard T.; Gagnon, Ronald N.; Hanumara, R. Choudary; Park, Eugene; Armstrong, Thomas; Gute, David M.

    2007-01-01

    Objectives. State environmental and health protection agencies have traditionally relied on a facility-by-facility inspection-enforcement paradigm to achieve compliance with government regulations. We evaluated the effectiveness of a new approach that uses a self-certification random sampling design. Methods. Comprehensive environmental and occupational health data from a 3-year statewide industry self-certification initiative were collected from representative automotive refinishing facilities located in Rhode Island. Statistical comparisons between baseline and postintervention data facilitated a quantitative evaluation of statewide performance. Results. The analysis of field data collected from 82 randomly selected automotive refinishing facilities showed statistically significant improvements (P<.05, Fisher exact test) in 4 major performance categories: occupational health and safety, air pollution control, hazardous waste management, and wastewater discharge. Statistical significance was also shown when a modified Bonferroni adjustment for multiple comparisons was performed. Conclusions. Our findings suggest that the new self-certification approach to environmental and worker protection is effective and can be used as an adjunct to further enhance state and federal enforcement programs. PMID:17267709
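    The Fisher exact test used on the study's before/after compliance counts can be sketched with stdlib combinatorics. The 2×2 table below is illustrative, not the study's data, and only the one-sided tail is computed:

```python
from math import comb

def fisher_exact_onesided(a, b, c, d):
    # One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]:
    # P(X >= a) under the hypergeometric null with fixed margins.
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        if col1 - x <= c + d:
            p += comb(row1, x) * comb(n - row1, col1 - x) / denom
    return p

# Hypothetical compliant/noncompliant counts before vs. after intervention.
p = fisher_exact_onesided(3, 1, 1, 3)
```

    With m such category comparisons, a Bonferroni-style adjustment like the paper's compares each P-value against α/m rather than α.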

  18. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Pesticide Factsheets

    The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors.

  19. Evaluation of the performance of statistical tests used in making cleanup decisions at Superfund sites. Part 1: Choosing an appropriate statistical test

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, D.W.; Allen, B.C.; Van Landingham, C.B.

    1998-12-31

    The decision rules commonly employed to determine the need for cleanup are evaluated both to identify conditions under which they lead to erroneous conclusions and to quantify the rate at which such errors occur. Their performance is also compared with that of other applicable decision rules. The authors based the evaluation of decision rules on simulations. Results are presented as power curves. These curves demonstrate that the degree of statistical control achieved is independent of the form of the null hypothesis. The loss of statistical control that occurs when a decision rule is applied to a data set that does not satisfy the rule's validity criteria is also clearly demonstrated. Some of the rules evaluated do not offer the formal statistical control that is an inherent design feature of other rules. Nevertheless, results indicate that such informal decision rules may provide superior overall control of error rates when their application is restricted to data exhibiting particular characteristics. The results reported here are limited to decision rules applied to uncensored and lognormally distributed data. To optimize decision rules, it is necessary to evaluate their behavior when applied to data exhibiting a range of characteristics that bracket those common to field data. The performance of decision rules applied to data sets exhibiting a broader range of characteristics is reported in the second paper of this study.
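    The paper's simulation approach, driving a decision rule with synthetic lognormal data and counting wrong decisions, can be sketched as below. The action level, the distribution parameters, and the median rule itself are all hypothetical stand-ins, not the rules the paper evaluates:

```python
import random

def false_trigger_rate(rule, mu, sigma, n, reps=2000, seed=1):
    # Fraction of simulated "clean" sites that the rule flags for cleanup.
    # Sweeping the true contamination level instead yields a power curve.
    rng = random.Random(seed)
    flags = sum(
        rule([rng.lognormvariate(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    )
    return flags / reps

# Hypothetical rule: flag the site if the sample median exceeds the action level.
action_level = 100.0

def median_rule(sample):
    return sorted(sample)[len(sample) // 2] > action_level

# A truly clean site: lognormal with median exp(0) = 1, far below the action level.
rate = false_trigger_rate(median_rule, 0.0, 1.0, n=10)
```

    Repeating the estimate over a grid of true contamination levels traces out exactly the kind of power curve the paper presents.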

  20. Air Force Officials did not Consistently Comply with Requirements for Assessing Contractor Performance

    DTIC Science & Technology

    2016-01-29

    Report excerpts: Appendix B, "Improvement in PAR Completion Statistics" (p. 33). …agencies must perform frequent evaluations of compliance with reporting requirements so they can readily identify delinquent past performance efforts… "Reporting Program," August 13, 2011. …The Senate Armed Services Committee

  1. Flipping the Classroom and Student Performance in Advanced Statistics: Evidence from a Quasi-Experiment

    ERIC Educational Resources Information Center

    Touchton, Michael

    2015-01-01

    I administer a quasi-experiment using undergraduate political science majors in statistics classes to evaluate whether "flipping the classroom" (the treatment) alters students' applied problem-solving performance and satisfaction relative to students in a traditional classroom environment (the control). I also assess whether general…

  2. Performance Indicators in Education.

    ERIC Educational Resources Information Center

    Irvine, David J.

    Evaluation of education involves assessing the effectiveness of schools and trying to determine how best to improve them. Since evaluation often deals only with the question of effectiveness, performance indicators in education are designed to make evaluation more complete. They are a set of statistical models which relate several important…

  3. Statistical shape analysis using 3D Poisson equation--A quantitatively validated approach.

    PubMed

    Gao, Yi; Bouix, Sylvain

    2016-05-01

    Statistical shape analysis has been an important area of research with applications in biology, anatomy, neuroscience, agriculture, paleontology, etc. Unfortunately, the proposed methods are rarely quantitatively evaluated, and as shown in recent studies, when they are evaluated, significant discrepancies exist in their outputs. In this work, we concentrate on the problem of finding the consistent location of deformation between two populations of shapes. We propose a new shape analysis algorithm along with a framework to perform a quantitative evaluation of its performance. Specifically, the algorithm constructs a Signed Poisson Map (SPoM) by solving two Poisson equations on the volumetric shapes of arbitrary topology, and statistical analysis is then carried out on the SPoMs. The method is quantitatively evaluated on synthetic shapes and applied to real shape data sets in brain structures. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Is math anxiety in the secondary classroom limiting physics mastery? A study of math anxiety and physics performance

    NASA Astrophysics Data System (ADS)

    Mercer, Gary J.

    This quantitative study examined the relationship between secondary students with math anxiety and physics performance in an inquiry-based constructivist classroom. The Revised Math Anxiety Rating Scale was used to evaluate math anxiety levels. The results were then compared to the performance on a physics standardized final examination. A simple correlation was performed, followed by a multivariate regression analysis to examine effects based on gender and prior math background. The correlation showed statistical significance between math anxiety and physics performance. The regression analysis showed statistical significance for math anxiety, physics performance, and prior math background, but did not show statistical significance for math anxiety, physics performance, and gender.

  5. Use of the Global Test Statistic as a Performance Measurement in a Reanalysis of Environmental Health Data

    PubMed Central

    Dymova, Natalya; Hanumara, R. Choudary; Gagnon, Ronald N.

    2009-01-01

    Performance measurement is increasingly viewed as an essential component of environmental and public health protection programs. In characterizing program performance over time, investigators often observe multiple changes resulting from a single intervention across a range of categories. Although a variety of statistical tools allow evaluation of data one variable at a time, the global test statistic is uniquely suited for analyses of categories or groups of interrelated variables. Here we demonstrate how the global test statistic can be applied to environmental and occupational health data for the purpose of making overall statements on the success of targeted intervention strategies. PMID:19696393

  6. Use of the global test statistic as a performance measurement in a reanalysis of environmental health data.

    PubMed

    Dymova, Natalya; Hanumara, R Choudary; Enander, Richard T; Gagnon, Ronald N

    2009-10-01

    Performance measurement is increasingly viewed as an essential component of environmental and public health protection programs. In characterizing program performance over time, investigators often observe multiple changes resulting from a single intervention across a range of categories. Although a variety of statistical tools allow evaluation of data one variable at a time, the global test statistic is uniquely suited for analyses of categories or groups of interrelated variables. Here we demonstrate how the global test statistic can be applied to environmental and occupational health data for the purpose of making overall statements on the success of targeted intervention strategies.

  7. A modified F-test for evaluating model performance by including both experimental and simulation uncertainties

    USDA-ARS?s Scientific Manuscript database

    Experimental and simulation uncertainties have not been included in many of the statistics used in assessing agricultural model performance. The objectives of this study were to develop an F-test that can be used to evaluate model performance considering experimental and simulation uncertainties, an...

  8. Accuracy Evaluation of the Unified P-Value from Combining Correlated P-Values

    PubMed Central

    Alves, Gelio; Yu, Yi-Kuo

    2014-01-01

    Meta-analysis methods that combine P-values into a single unified P-value are frequently employed to improve confidence in hypothesis testing. An assumption made by most meta-analysis methods is that the P-values to be combined are independent, which may not always be true. To investigate the accuracy of the unified P-value from combining correlated P-values, we have evaluated a family of statistical methods that combine: independent, weighted independent, correlated, and weighted correlated P-values. Statistical accuracy evaluation by combining simulated correlated P-values showed that correlation among P-values can have a significant effect on the accuracy of the combined P-value obtained. Among the statistical methods evaluated, those that weight P-values compute more accurate combined P-values than those that do not. Also, statistical methods that utilize the correlation information have the best performance, producing significantly more accurate combined P-values. In our study we have demonstrated that statistical methods that combine P-values based on the assumption of independence can produce inaccurate P-values when combining correlated P-values, even when the P-values are only weakly correlated. Therefore, to prevent drawing false conclusions during hypothesis testing, our study advises caution when interpreting a P-value obtained from combining P-values of unknown correlation. However, when the correlation information is available, the weighting-capable statistical method, first introduced by Brown and recently modified by Hou, seems to perform best amongst the methods investigated. PMID:24663491
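    As a baseline for the independence assumption the paper stress-tests, Fisher's method combines k independent P-values through a χ² statistic with 2k degrees of freedom; Brown's and Hou's corrections (not shown here) adjust the degrees of freedom and scale for correlation. A stdlib sketch, using the closed-form χ² survival function for even degrees of freedom:

```python
from math import exp, log

def fisher_combined_pvalue(pvalues):
    # Fisher's method for *independent* P-values:
    # X = -2 * sum(ln p_i) ~ chi-square with 2k df under H0.
    k = len(pvalues)
    x = -2.0 * sum(log(p) for p in pvalues)
    # Chi-square survival function with even df = 2k has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, s = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2) / i
        s += term
    return exp(-x / 2) * s
```

    Feeding correlated P-values into this independent-case formula is precisely the misuse the paper shows can yield inaccurate combined P-values.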

  9. THE ATMOSPHERIC MODEL EVALUATION TOOL

    EPA Science Inventory

    This poster describes a model evaluation tool that is currently being developed and applied for meteorological and air quality model evaluation. The poster outlines the framework and provides examples of statistical evaluations that can be performed with the model evaluation tool...

  10. Evaluation of surface detail reproduction, dimensional stability and gypsum compatibility of monophase polyvinyl-siloxane and polyether elastomeric impression materials under dry and moist conditions

    PubMed Central

    Vadapalli, Sriharsha Babu; Atluri, Kaleswararao; Putcha, Madhu Sudhan; Kondreddi, Sirisha; Kumar, N. Suman; Tadi, Durga Prasad

    2016-01-01

    Objectives: This in vitro study was designed to compare polyvinyl-siloxane (PVS) monophase and polyether (PE) monophase impression materials under dry and moist conditions for surface detail reproduction, dimensional stability, and gypsum compatibility. Materials and Methods: Surface detail reproduction was evaluated using two criteria. Dimensional stability was evaluated according to American Dental Association (ADA) specification no. 19. Gypsum compatibility was assessed by two criteria. All samples were evaluated, and the data obtained were analyzed by two-way analysis of variance (ANOVA) and Pearson's Chi-square tests. Results: When surface detail reproduction was evaluated with a modification of ADA specification no. 19, the two groups showed no statistically significant difference under either condition. When evaluated macroscopically, the two groups showed a statistically significant difference. Results for dimensional stability showed that deviation from the standard differed significantly between the two groups, with the Aquasil group showing significantly more deviation than the Impregum group (P < 0.001). The two conditions also differed significantly, with the moist condition showing significantly more deviation than the dry condition (P < 0.001). Gypsum compatibility, evaluated with a modification of ADA specification no. 19 and by grading the casts, showed no statistically significant difference between the two groups under either condition. Conclusion: Regarding dimensional stability, both Impregum and Aquasil performed better in the dry condition than in the moist condition, and Impregum performed better than Aquasil in both conditions. For surface detail reproduction according to the ADA specification, the two materials performed almost equally under dry and moist conditions. By macroscopic evaluation, both Impregum and Aquasil performed significantly better in the dry condition than in the moist condition.
    In the dry condition, the two materials performed almost equally; in the moist condition, Aquasil performed significantly better than Impregum. Regarding gypsum compatibility according to the ADA specification, the two materials performed almost equally in the dry condition, and in the moist condition Aquasil performed better than Impregum. By macroscopic evaluation, Impregum performed better than Aquasil in both conditions. PMID:27583217

  11. Evaluation of surface detail reproduction, dimensional stability and gypsum compatibility of monophase polyvinyl-siloxane and polyether elastomeric impression materials under dry and moist conditions.

    PubMed

    Vadapalli, Sriharsha Babu; Atluri, Kaleswararao; Putcha, Madhu Sudhan; Kondreddi, Sirisha; Kumar, N Suman; Tadi, Durga Prasad

    2016-01-01

    This in vitro study was designed to compare polyvinyl-siloxane (PVS) monophase and polyether (PE) monophase impression materials under dry and moist conditions for surface detail reproduction, dimensional stability, and gypsum compatibility. Surface detail reproduction was evaluated using two criteria. Dimensional stability was evaluated according to American Dental Association (ADA) specification no. 19. Gypsum compatibility was assessed by two criteria. All samples were evaluated, and the data obtained were analyzed by two-way analysis of variance (ANOVA) and Pearson's Chi-square tests. When surface detail reproduction was evaluated with a modification of ADA specification no. 19, the two groups showed no statistically significant difference under either condition. When evaluated macroscopically, the two groups showed a statistically significant difference. Results for dimensional stability showed that deviation from the standard differed significantly between the two groups, with the Aquasil group showing significantly more deviation than the Impregum group (P < 0.001). The two conditions also differed significantly, with the moist condition showing significantly more deviation than the dry condition (P < 0.001). Gypsum compatibility, evaluated with a modification of ADA specification no. 19 and by grading the casts, showed no statistically significant difference between the two groups under either condition. Regarding dimensional stability, both Impregum and Aquasil performed better in the dry condition than in the moist condition, and Impregum performed better than Aquasil in both conditions. For surface detail reproduction according to the ADA specification, the two materials performed almost equally under dry and moist conditions. By macroscopic evaluation, both Impregum and Aquasil performed significantly better in the dry condition than in the moist condition. In the dry condition, the two materials performed almost equally.
    In the moist condition, Aquasil performed significantly better than Impregum. Regarding gypsum compatibility according to the ADA specification, the two materials performed almost equally in the dry condition, and in the moist condition Aquasil performed better than Impregum. By macroscopic evaluation, Impregum performed better than Aquasil in both conditions.

  12. Statistical machine translation for biomedical text: are we there yet?

    PubMed

    Wu, Cuijun; Xia, Fei; Deleger, Louise; Solti, Imre

    2011-01-01

    In our paper we addressed the research question: "Has machine translation achieved sufficiently high quality to translate PubMed titles for patients?" We analyzed statistical machine translation output for six foreign-language–English translation pairs (bi-directionally). We built a high-performing in-house system and evaluated its output for each translation pair on a large scale, both with automated BLEU scores and with human judgment. In addition to the in-house system, we also evaluated Google Translate's performance specifically within the biomedical domain. We report high performance for the German–, French–, and Spanish–English bi-directional translation pairs for both Google Translate and our system.
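    Sentence-level BLEU, as used for the automated evaluation above, combines modified n-gram precisions with a brevity penalty. A minimal sketch with uniform weights and a single reference (corpus-level BLEU with smoothing is what large-scale evaluations typically report; this is only the core computation):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of n-gram precisions x brevity penalty."""
    c, r = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(c, n))
        ref = Counter(ngrams(r, n))
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        total = max(sum(cand.values()), 1)
        if overlap == 0:           # any zero precision drives BLEU to 0 (unsmoothed)
            return 0.0
        log_prec += math.log(overlap / total) / max_n
    bp = 1.0 if len(c) > len(r) else math.exp(1 - len(r) / max(len(c), 1))
    return bp * math.exp(log_prec)
```

An identical candidate and reference score 1.0; any n-gram mismatch lowers the score toward 0.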

  13. Performance Evaluation of 14 Neural Network Architectures Used for Predicting Heat Transfer Characteristics of Engine Oils

    NASA Astrophysics Data System (ADS)

    Al-Ajmi, R. M.; Abou-Ziyan, H. Z.; Mahmoud, M. A.

    2012-01-01

    This paper reports the results of a comprehensive study that aimed at identifying the best neural network architecture and parameters to predict subcooled boiling characteristics of engine oils. A total of 57 different neural networks (NNs), derived from 14 different NN architectures, were evaluated for four different prediction cases. The NNs were trained on experimental datasets obtained for five engine oils of different chemical compositions. The performance of each NN was evaluated using a rigorous statistical analysis as well as careful examination of the smoothness of the predicted boiling curves. One NN, out of the 57 evaluated, correctly predicted the boiling curves for all cases considered, whether for individual oils or for all oils taken together. It was found that the pattern selection and weight update techniques strongly affect the performance of the NNs. It was also revealed that the use of descriptive statistical analysis, such as R², mean error, standard deviation, and T and slope tests, is a necessary but not sufficient condition for evaluating NN performance. The performance criteria should also include inspection of the smoothness of the predicted curves, either visually or by plotting the slopes of these curves.

  14. Statistical Evaluation of CRM-Simulated Cloud and Precipitation Structures Using Multi- sensor TRMM Measurements and Retrievals

    NASA Astrophysics Data System (ADS)

    Posselt, D.; L'Ecuyer, T.; Matsui, T.

    2009-05-01

    Cloud resolving models are typically used to examine the characteristics of clouds and precipitation and their relationship to radiation and the large-scale circulation. As such, they are not required to reproduce the exact location of each observed convective system, much less each individual cloud. Some of the most relevant information about clouds and precipitation is provided by instruments located on polar-orbiting satellite platforms, but these observations are intermittent "snapshots" in time, making assessment of model performance challenging. In contrast to direct comparison, model results can be evaluated statistically. This avoids the requirement for the model to reproduce the observed systems, while returning valuable information on the performance of the model in a climate-relevant sense. The focus of this talk is a model evaluation study, in which updates to the microphysics scheme used in a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model are evaluated using statistics of observed clouds, precipitation, and radiation. We present the results of multiday (non-equilibrium) simulations of organized deep convection using single- and double-moment versions of a the model's cloud microphysical scheme. Statistics of TRMM multi-sensor derived clouds, precipitation, and radiative fluxes are used to evaluate the GCE results, as are simulated TRMM measurements obtained using a sophisticated instrument simulator suite. We present advantages and disadvantages of performing model comparisons in retrieval and measurement space and conclude by motivating the use of data assimilation techniques for analyzing and improving model parameterizations.

  15. Investigating the Investigative Task: Testing for Skewness--An Investigation of Different Test Statistics and Their Power to Detect Skewness

    ERIC Educational Resources Information Center

    Tabor, Josh

    2010-01-01

    On the 2009 AP® Statistics Exam, students were asked to create a statistic to measure skewness in a distribution. This paper explores several of the most popular student responses and evaluates which statistic performs best when sampling from various skewed populations. (Contains 8 figures, 3 tables, and 4 footnotes.)
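    Candidate skewness statistics of the kind students might propose can be compared directly. A sketch of two common choices, the moment coefficient g1 and the quartile-based (Bowley) coefficient; these are illustrative stand-ins, not the specific exam responses analyzed in the paper:

```python
def moment_skewness(xs):
    # g1 = m3 / m2^(3/2): standardized third central moment
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def quartile(xs, q):
    # simple linear-interpolation quantile on the sorted sample
    s = sorted(xs)
    pos = q * (len(s) - 1)
    lo = int(pos)
    frac = pos - lo
    return s[lo] + frac * (s[min(lo + 1, len(s) - 1)] - s[lo])

def bowley_skewness(xs):
    # (Q3 + Q1 - 2*Q2) / (Q3 - Q1): robust, quartile-based skewness
    q1, q2, q3 = (quartile(xs, q) for q in (0.25, 0.5, 0.75))
    return (q3 + q1 - 2 * q2) / (q3 - q1)
```

Both statistics are positive for a right-skewed sample and near zero for a symmetric one, but they can disagree in magnitude, which is exactly what a power comparison across skewed populations probes.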

  16. [Teaching performance assessment in Public Health employing three different strategies].

    PubMed

    Martínez-González, Adrián; Moreno-Altamirano, Laura; Ponce-Rosas, Efrén Raúl; Martínez-Franco, Adrián Israel; Urrutia-Aguilar, María Esther

    2011-01-01

    The educational system depends on the quality and performance of its faculty and should therefore undergo continuous improvement. The aim was to assess the teaching performance of the Public Health professors at the Faculty of Medicine, UNAM, through three strategies. Justification study. The evaluation was conducted under a mediational model through three strategies: students' opinion assessment, self-assessment, and students' academic achievement. We applied descriptive statistics, Student's t-test, ANOVA, and Pearson correlation. Twenty professors from the Public Health department were evaluated, representing 57% of all those who teach the subject. The professors rated their performance more highly in self-assessment than the students did in their opinion assessment; statistical analysis confirmed that the difference was significant. The difference among the three evaluation strategies was most evident between self-assessment and the scores obtained by students in their academic achievement. The integration of these three strategies offers a more complete view of the quality of a teacher's performance. Academic achievement appears to be a more objective strategy for teaching performance assessment than students' opinion and self-assessment.

  17. Comparison of differences in performance evaluation of faculty by students with faculty's self-assessment.

    PubMed

    Azizi, Kourosh; Aghamolaei, Teamur; Parsa, Nader; Dabbaghmanesh, Tahereh

    2014-07-01

    The present study aimed to compare self-assessment forms for coursework taught in the school of public health at undergraduate, graduate, and postgraduate levels with students' evaluations of the performance of the faculty members at these levels. The subjects in this cross-sectional study were the faculty members and students of the School of Public Health and Nutrition, Shiraz University of Medical Sciences, Shiraz, Iran. The data were collected using a socio-demographic information form and faculty evaluation forms prepared by the Educational Development Center (EDC). The faculty members were assessed by the students in undergraduate and graduate classes. Among the study subjects, 23 faculty members filled out the self-assessment forms and were in turn evaluated by their students. The data were analyzed using SPSS version 14. A paired t-test was used to compare the students' evaluation of the faculty members' performance with the professors' self-assessment. The mean score of the self-assessment of the faculty members who taught undergraduate courses was 289.7±8.3, while that of the students' evaluation was 281.3±16.1; the difference was statistically significant (t=3.56, p=0.001). The mean score of the self-assessment of the faculty members who taught graduate courses was 269.0±9.7, while that of the students' evaluation was 265.7±14.6, but the difference was not statistically significant (t=1.09, p=0.28). The faculty's perceptions of their teaching performance were closer to those of the graduate students than to those of the undergraduates, which may reflect better understanding of coursework at the graduate level. Faculty members may need to adjust teaching methods to improve students' performance and understanding, especially at the undergraduate level.
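    The paired t-test used here reduces to a one-sample t-test on the per-faculty differences between the two ratings. A minimal sketch with hypothetical scores (not the study's data):

```python
import math

def paired_t(x, y):
    # t = mean(d) / (sd(d) / sqrt(n)) on paired differences d = x - y
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)   # sample variance
    return mean_d / math.sqrt(var_d / n)

# hypothetical paired scores: self-assessment vs. student rating for 5 faculty
self_scores = [290, 288, 292, 289, 287]
student_scores = [288, 287, 289, 287, 285]
t = paired_t(self_scores, student_scores)
```

The resulting t is compared against a t-distribution with n − 1 degrees of freedom; in practice a library routine such as SciPy's `ttest_rel` also returns the p-value.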

  18. Syndromic surveillance of influenza activity in Sweden: an evaluation of three tools.

    PubMed

    Ma, T; Englund, H; Bjelkmar, P; Wallensten, A; Hulth, A

    2015-08-01

    An evaluation was conducted to determine which syndromic surveillance tools complement traditional surveillance by serving as earlier indicators of influenza activity in Sweden. Web queries, medical hotline statistics, and school absenteeism data were evaluated against two traditional surveillance tools. Cross-correlation calculations utilized aggregated weekly data for all-age, nationwide activity over four influenza seasons, from 2009/2010 to 2012/2013. The surveillance tool indicative of earlier influenza activity, by way of statistical and visual evidence, was identified. The web query algorithm and the medical hotline statistics performed as well as each other and as the traditional surveillance tools. School absenteeism data were not a reliable resource for influenza surveillance. Overall, the syndromic surveillance tools did not perform consistently enough, either in season lead or in earlier timing of the peak week, to be considered early indicators. They do, however, capture incident cases before they have formally entered the primary healthcare system.
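    Cross-correlation at varying lags, as used in this evaluation, identifies by how many weeks one surveillance series leads another. A toy sketch with synthetic weekly counts (the actual analysis used aggregated national data):

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lag(lead, lagged, max_lag=5):
    # lag (in weeks) at which `lead` correlates best with the delayed series
    scores = {}
    for lag in range(0, max_lag + 1):
        x = lead[: len(lead) - lag]
        y = lagged[lag:]
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get)

# synthetic epidemic curve; the "traditional" series trails web queries by 2 weeks
web = [0, 1, 3, 7, 12, 18, 22, 20, 15, 9, 5, 2, 1, 0, 0, 0]
traditional = [0, 0] + web[:-2]
```

A tool is an "earlier indicator" only if its best lag is consistently positive across seasons, which is the consistency the abstract found lacking.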

  19. System Analysis for the Huntsville Operation Support Center, Distributed Computer System

    NASA Technical Reports Server (NTRS)

    Ingels, F. M.; Massey, D.

    1985-01-01

    HOSC, as a distributed computing system, is responsible for data acquisition and analysis during Space Shuttle operations. HOSC also provides computing services for Marshall Space Flight Center's nonmission activities. As mission and nonmission activities change, so do the support functions of HOSC, demonstrating the need for some method of simulating activity at HOSC in various configurations. The simulation developed in this work primarily models the HYPERchannel network. The model simulates the activity of a steady-state network, reporting statistics such as transmitted bits, collision statistics, frame sequences transmitted, and average message delay. These statistics are used to evaluate performance indicators such as throughput, utilization, and delay. Thus the overall performance of the network is evaluated, and possible overload conditions are predicted.

  20. Detection and Evaluation of Spatio-Temporal Spike Patterns in Massively Parallel Spike Train Data with SPADE.

    PubMed

    Quaglio, Pietro; Yegenoglu, Alper; Torre, Emiliano; Endres, Dominik M; Grün, Sonja

    2017-01-01

    Repeated, precise sequences of spikes are largely considered a signature of activation of cell assemblies. These repeated sequences are commonly known under the name of spatio-temporal patterns (STPs). STPs are hypothesized to play a role in the communication of information in the computational process operated by the cerebral cortex. A variety of statistical methods for the detection of STPs have been developed and applied to electrophysiological recordings, but such methods scale poorly with the current size of available parallel spike train recordings (more than 100 neurons). In this work, we introduce a novel method capable of overcoming the computational and statistical limits of existing analysis techniques in detecting repeating STPs within massively parallel spike trains (MPST). We employ advanced data mining techniques to efficiently extract repeating sequences of spikes from the data. Then, we introduce and compare two alternative approaches to distinguish statistically significant patterns from chance sequences. The first approach uses a measure known as conceptual stability, of which we investigate a computationally cheap approximation for applications to such large data sets. The second approach is based on the evaluation of pattern statistical significance. In particular, we provide an extension to STPs of a method we recently introduced for the evaluation of statistical significance of synchronous spike patterns. The performance of the two approaches is evaluated in terms of computational load and statistical power on a variety of artificial data sets that replicate specific features of experimental data. Both methods provide an effective and robust procedure for detection of STPs in MPST data. The method based on significance evaluation shows the best overall performance, although at a higher computational cost. We name the novel procedure the spatio-temporal Spike PAttern Detection and Evaluation (SPADE) analysis.

  1. Detection and Evaluation of Spatio-Temporal Spike Patterns in Massively Parallel Spike Train Data with SPADE

    PubMed Central

    Quaglio, Pietro; Yegenoglu, Alper; Torre, Emiliano; Endres, Dominik M.; Grün, Sonja

    2017-01-01

    Repeated, precise sequences of spikes are largely considered a signature of activation of cell assemblies. These repeated sequences are commonly known under the name of spatio-temporal patterns (STPs). STPs are hypothesized to play a role in the communication of information in the computational process operated by the cerebral cortex. A variety of statistical methods for the detection of STPs have been developed and applied to electrophysiological recordings, but such methods scale poorly with the current size of available parallel spike train recordings (more than 100 neurons). In this work, we introduce a novel method capable of overcoming the computational and statistical limits of existing analysis techniques in detecting repeating STPs within massively parallel spike trains (MPST). We employ advanced data mining techniques to efficiently extract repeating sequences of spikes from the data. Then, we introduce and compare two alternative approaches to distinguish statistically significant patterns from chance sequences. The first approach uses a measure known as conceptual stability, of which we investigate a computationally cheap approximation for applications to such large data sets. The second approach is based on the evaluation of pattern statistical significance. In particular, we provide an extension to STPs of a method we recently introduced for the evaluation of statistical significance of synchronous spike patterns. The performance of the two approaches is evaluated in terms of computational load and statistical power on a variety of artificial data sets that replicate specific features of experimental data. Both methods provide an effective and robust procedure for detection of STPs in MPST data. The method based on significance evaluation shows the best overall performance, although at a higher computational cost. We name the novel procedure the spatio-temporal Spike PAttern Detection and Evaluation (SPADE) analysis. PMID:28596729

  2. Metrology Standards for Quantitative Imaging Biomarkers

    PubMed Central

    Obuchowski, Nancy A.; Kessler, Larry G.; Raunig, David L.; Gatsonis, Constantine; Huang, Erich P.; Kondratovich, Marina; McShane, Lisa M.; Reeves, Anthony P.; Barboriak, Daniel P.; Guimaraes, Alexander R.; Wahl, Richard L.

    2015-01-01

    Although investigators in the imaging community have been active in developing and evaluating quantitative imaging biomarkers (QIBs), the development and implementation of QIBs have been hampered by the inconsistent or incorrect use of terminology or methods for technical performance and statistical concepts. Technical performance is an assessment of how a test performs in reference objects or subjects under controlled conditions. In this article, some of the relevant statistical concepts are reviewed, methods that can be used for evaluating and comparing QIBs are described, and some of the technical performance issues related to imaging biomarkers are discussed. More consistent and correct use of terminology and study design principles will improve clinical research, advance regulatory science, and foster better care for patients who undergo imaging studies. © RSNA, 2015 PMID:26267831

  3. Efficient statistical tests to compare Youden index: accounting for contingency correlation.

    PubMed

    Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan

    2015-04-30

    The Youden index is widely utilized in studies evaluating the accuracy of diagnostic tests and the performance of predictive, prognostic, or risk models. However, both one-sample and two-independent-sample tests on the Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings. Moreover, a paired-sample test on the Youden index has been unavailable. This article develops efficient statistical inference procedures for one-sample, independent-sample, and paired-sample tests on the Youden index by accounting for contingency correlation, namely the associations between sensitivity and specificity and between paired samples typically represented in contingency tables. For the one- and two-independent-sample tests, the variances are estimated by the Delta method, and the statistical inference is based on central limit theory; the results are then verified by bootstrap estimates. For the paired-sample test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of the kappa statistic, so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than does the original Youden's approach. Therefore, the simple explicit large-sample solution performs very well. Because we can readily implement the asymptotic and exact bootstrap computation with common software like R, the method is broadly applicable to the evaluation of diagnostic tests and model performance. Copyright © 2015 John Wiley & Sons, Ltd.
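    The Youden index itself is J = sensitivity + specificity − 1. A sketch of the point estimate together with the naive large-sample standard error that treats sensitivity and specificity as independent binomials, which is exactly the simplification this paper improves on by accounting for their correlation:

```python
import math

def youden(tp, fn, tn, fp):
    """Youden index J with a naive independent-binomial standard error."""
    se = tp / (tp + fn)              # sensitivity
    sp = tn / (tn + fp)              # specificity
    j = se + sp - 1
    # assumes se and sp are independent; the correlated/paired case needs
    # the covariance terms the paper derives (e.g., via the kappa statistic)
    var_j = se * (1 - se) / (tp + fn) + sp * (1 - sp) / (tn + fp)
    return j, math.sqrt(var_j)

# illustrative 2x2 counts: 80/100 diseased test positive, 90/100 healthy test negative
j, stderr = youden(tp=80, fn=20, tn=90, fp=10)
```

With these counts J = 0.7 with a naive standard error of 0.05; a Wald interval is then J ± 1.96 × stderr.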

  4. Predicting and downscaling ENSO impacts on intraseasonal precipitation statistics in California: The 1997/98 event

    USGS Publications Warehouse

    Gershunov, A.; Barnett, T.P.; Cayan, D.R.; Tubbs, T.; Goddard, L.

    2000-01-01

    Three long-range forecasting methods have been evaluated for prediction and downscaling of seasonal and intraseasonal precipitation statistics in California. Full-statistical, hybrid dynamical–statistical, and full-dynamical approaches have been used to forecast El Niño–Southern Oscillation (ENSO)-related total precipitation, daily precipitation frequency, and average intensity anomalies during the January–March season. For El Niño winters, the hybrid approach emerges as the best performer, while La Niña forecasting skill is poor. The full-statistical forecasting method features reasonable forecasting skill for both La Niña and El Niño winters. The performance of the full-dynamical approach could not be evaluated as rigorously as that of the other two forecasting schemes. Although the full-dynamical forecasting approach is expected to outperform simpler forecasting schemes in the long run, evidence is presented to conclude that, at present, the full-dynamical forecasting approach is the least viable of the three, at least in California. The authors suggest that operational forecasting of any intraseasonal temperature, precipitation, or streamflow statistic derivable from the available records is possible now for ENSO-extreme years.

  5. Analysis of statistical and standard algorithms for detecting muscle onset with surface electromyography.

    PubMed

    Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A

    2017-01-01

    The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60-90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms perform equally well when the time series has multiple bursts of muscle activity.
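    As a toy illustration of the changepoint idea (not the paper's full Bayesian implementation for EMG), a single mean-shift changepoint in a noisy series can be located by scoring every split point under Gaussian likelihood with a flat prior and taking the posterior mode:

```python
import math

def changepoint_posterior(x, sigma=0.1):
    """Posterior over a single mean-shift changepoint; flat prior, known sigma.

    Toy sketch: score each split k by the Gaussian log-likelihood with
    separate segment means, then normalize. Not the published EMG-onset method.
    """
    n = len(x)
    log_post = []
    for k in range(1, n):            # k = number of samples in the first segment
        left, right = x[:k], x[k:]
        m1 = sum(left) / len(left)
        m2 = sum(right) / len(right)
        rss = sum((v - m1) ** 2 for v in left) + sum((v - m2) ** 2 for v in right)
        log_post.append(-rss / (2 * sigma ** 2))
    mx = max(log_post)
    weights = [math.exp(lp - mx) for lp in log_post]   # stable normalization
    total = sum(weights)
    return {k: w / total for k, w in zip(range(1, n), weights)}

# quiet baseline, then sustained "muscle activity" starting at sample 10
signal = [0.02, -0.01, 0.03, 0.0, -0.02, 0.01, 0.02, -0.03, 0.01, 0.0,
          1.01, 0.98, 1.02, 0.97, 1.03, 1.0, 0.99, 1.01, 0.98, 1.02]
post = changepoint_posterior(signal)
onset = max(post, key=post.get)
```

Returning a full posterior rather than a single index is what allows an onset call at a chosen posterior probability threshold (the 60–90% range found best in the study).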

  6. A nonparametric spatial scan statistic for continuous data.

    PubMed

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compared the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
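    The core of a Wilcoxon-based scan is to slide a window over locations and standardize the rank sum inside it. A one-dimensional sketch with midranks for ties (the actual spatial scan statistic uses circular windows of varying size and Monte Carlo hypothesis testing):

```python
import math

def midranks(values):
    # average (mid) ranks for tied values, 1-based
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of ranks i+1 .. j+1
        for idx in order[i: j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def scan_ranksum(values, width):
    """Return (start, z) of the window with the largest standardized rank sum."""
    ranks = midranks(values)
    n, n1 = len(values), width
    n2 = n - n1
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 * (n + 1) / 12       # tie correction omitted for brevity
    best = None
    for s in range(n - width + 1):
        w = sum(ranks[s: s + width])
        z = (w - mean_w) / math.sqrt(var_w)
        if best is None or z > best[1]:
            best = (s, z)
    return best

# elevated values in the middle three locations form the "cluster"
start, z = scan_ranksum([1, 1, 1, 9, 9, 9, 1, 1, 1], width=3)
```

Because only ranks enter the statistic, the same cluster is found whether the raw data are normal, skewed, or heavy-tailed, which is the robustness the simulation study demonstrates.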

  7. Quantitative evaluation of pairs and RS steganalysis

    NASA Astrophysics Data System (ADS)

    Ker, Andrew D.

    2004-06-01

    We give initial results from a new project which performs statistically accurate evaluation of the reliability of image steganalysis algorithms. The focus here is on the Pairs and RS methods, for detection of simple LSB steganography in grayscale bitmaps, due to Fridrich et al. Using libraries totalling around 30,000 images we have measured the performance of these methods and suggest changes which lead to significant improvements. Particular results from the project presented here include notes on the distribution of the RS statistic, the relative merits of different "masks" used in the RS algorithm, the effect on reliability when previously compressed cover images are used, and the effect of repeating steganalysis on the transposed image. We also discuss improvements to the Pairs algorithm, restricting it to spatially close pairs of pixels, which leads to a substantial performance improvement, even to the extent of surpassing the RS statistic which was previously thought superior for grayscale images. We also describe some of the questions for a general methodology of evaluation of steganalysis, and potential pitfalls caused by the differences between uncompressed, compressed, and resampled cover images.

  8. Statistical Process Control: A Quality Tool for a Venous Thromboembolic Disease Registry.

    PubMed

    Posadas-Martinez, Maria Lourdes; Rojas, Liliana Paloma; Vazquez, Fernando Javier; De Quiros, Fernan Bernaldo; Waisman, Gabriel Dario; Giunta, Diego Hernan

    2016-01-01

    We aim to describe Statistical Process Control as a quality tool for the Institutional Registry of Venous Thromboembolic Disease (IRTD), a registry developed in a community-care tertiary hospital in Buenos Aires, Argentina. The IRTD is a prospective cohort. The process of data acquisition began with the creation of a computerized alert generated whenever physicians requested an imaging or laboratory study to diagnose venous thromboembolism, which defined eligible patients. The process then followed a structured methodology for patient inclusion, evaluation, and posterior data entry. To control this process, process performance indicators were designed to be measured monthly. These included the number of eligible patients, the number of included patients, the median time to patient evaluation, and the percentage of patients lost to evaluation. Control charts were graphed for each indicator. The registry was evaluated over 93 months, during which 25,757 patients were reported and 6,798 patients met inclusion criteria. The median time to evaluation was 20 hours (SD, 12) and 7.7% of the total was lost to evaluation. Each indicator presented trends over time, caused by structural changes and improvement cycles, and the center line accordingly showed inflections. Statistical process control through process performance indicators allowed us to monitor the performance of the registry over time and to detect systematic problems. We postulate that this approach could be reproduced for other clinical registries.
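    Control limits for an individuals chart of the kind used to monitor such monthly indicators are typically set from the mean moving range. A minimal sketch with hypothetical monthly "median time to evaluation" values (not the registry's data):

```python
def individuals_chart_limits(baseline):
    """Shewhart individuals (I-MR) limits: center +/- 2.66 * mean moving range."""
    center = sum(baseline) / len(baseline)
    moving_ranges = [abs(a - b) for a, b in zip(baseline[1:], baseline[:-1])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

# hypothetical baseline: six months of median evaluation time (hours)
baseline = [20, 22, 21, 23, 20, 22]
lcl, center, ucl = individuals_chart_limits(baseline)
out_of_control = 40 > ucl    # a 40-hour month would signal a special cause
```

The 2.66 factor is the standard d2-based constant for individuals charts; when a structural change shifts the process, the baseline (and hence the center line) is recomputed, producing the inflections described in the abstract.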

  9. Uncertainties in Estimates of Fleet Average Fuel Economy : A Statistical Evaluation

    DOT National Transportation Integrated Search

    1977-01-01

    Research was performed to assess the current Federal procedure for estimating the average fuel economy of each automobile manufacturer's new car fleet. Test vehicle selection and fuel economy estimation methods were characterized statistically and so...

  10. Evaluating markers for the early detection of cancer: overview of study designs and methods.

    PubMed

    Baker, Stuart G; Kramer, Barnett S; McIntosh, Martin; Patterson, Blossom H; Shyr, Yu; Skates, Steven

    2006-01-01

    The field of cancer biomarker development has been evolving rapidly. New developments both in the biologic and statistical realms are providing increasing opportunities for evaluation of markers for both early detection and diagnosis of cancer. To review the major conceptual and methodological issues in cancer biomarker evaluation, with an emphasis on recent developments in statistical methods together with practical recommendations. We organized this review by type of study: preliminary performance, retrospective performance, prospective performance and cancer screening evaluation. For each type of study, we discuss methodologic issues, provide examples and discuss strengths and limitations. Preliminary performance studies are useful for quickly winnowing down the number of candidate markers; however their results may not apply to the ultimate target population, asymptomatic subjects. If stored specimens from cohort studies with clinical cancer endpoints are available, retrospective studies provide a quick and valid way to evaluate performance of the markers or changes in the markers prior to the onset of clinical symptoms. Prospective studies have a restricted role because they require large sample sizes, and, if the endpoint is cancer on biopsy, there may be bias due to overdiagnosis. Cancer screening studies require very large sample sizes and long follow-up, but are necessary for evaluating the marker as a trigger of early intervention.

  11. On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models

    NASA Astrophysics Data System (ADS)

    Zheng, Feifei; Maier, Holger R.; Wu, Wenyan; Dandy, Graeme C.; Gupta, Hoshin V.; Zhang, Tuqiao

    2018-02-01

    Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
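    The allocation effect is easy to demonstrate: how records are assigned to calibration and evaluation subsets changes how representative the evaluation set is, especially for skewed data. A toy sketch contrasting a contiguous split with a systematic every-other-record split, simple stand-ins for the formal data-splitting methods the paper compares:

```python
def contiguous_split(data, frac=0.5):
    k = int(len(data) * frac)
    return data[:k], data[k:]            # (calibration, evaluation)

def systematic_split(data):
    # sort, then alternate records between the two subsets
    s = sorted(data)
    return s[0::2], s[1::2]              # (calibration, evaluation)

# skewed "streamflow" record: many small values, a few large floods
flows = [1, 2, 2, 3, 3, 4, 5, 7, 10, 15, 25, 60, 120, 400]

pop_mean = sum(flows) / len(flows)
_, eval_contig = contiguous_split(flows)
_, eval_system = systematic_split(flows)
bias_contig = abs(sum(eval_contig) / len(eval_contig) - pop_mean)
bias_system = abs(sum(eval_system) / len(eval_system) - pop_mean)
```

The systematic split spreads the rare large values across both subsets, so its evaluation-set mean sits much closer to the full-record mean, precisely the representativeness issue that grows with skewness.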

  12. Signal Statistics and Maximum Likelihood Sequence Estimation in Intensity Modulated Fiber Optic Links Containing a Single Optical Pre-amplifier.

    PubMed

    Alić, Nikola; Papen, George; Saperstein, Robert; Milstein, Laurence; Fainman, Yeshaiahu

    2005-06-13

    Exact signal statistics for fiber-optic links containing a single optical pre-amplifier are calculated and applied to sequence estimation for electronic dispersion compensation. The performance is evaluated and compared with results based on approximate chi-square statistics. We show that detection in existing systems can be improved by using the exact statistics rather than a chi-square distribution for realistic filter shapes. In contrast, for high-spectral-efficiency systems the difference between the two approaches diminishes, and performance tends to be less dependent on the exact shape of the filter used.

  13. Cognitive Complaints After Breast Cancer Treatments: Examining the Relationship With Neuropsychological Test Performance

    PubMed Central

    2013-01-01

    Background Cognitive complaints are reported frequently after breast cancer treatments. Their association with neuropsychological (NP) test performance is not well-established. Methods Early-stage, posttreatment breast cancer patients were enrolled in a prospective, longitudinal, cohort study prior to starting endocrine therapy. Evaluation included an NP test battery and self-report questionnaires assessing symptoms, including cognitive complaints. Multivariable regression models assessed associations among cognitive complaints, mood, treatment exposures, and NP test performance. Results One hundred eighty-nine breast cancer patients, aged 21–65 years, completed the evaluation; 23.3% endorsed higher memory complaints and 19.0% reported higher executive function complaints (>1 SD above the mean for healthy control sample). Regression modeling demonstrated a statistically significant association of higher memory complaints with combined chemotherapy and radiation treatments (P = .01), poorer NP verbal memory performance (P = .02), and higher depressive symptoms (P < .001), controlling for age and IQ. For executive functioning complaints, multivariable modeling controlling for age, IQ, and other confounds demonstrated statistically significant associations with better NP visual memory performance (P = .03) and higher depressive symptoms (P < .001), whereas combined chemotherapy and radiation treatment (P = .05) approached statistical significance. Conclusions About one in five post–adjuvant treatment breast cancer patients had elevated memory and/or executive function complaints that were statistically significantly associated with domain-specific NP test performances and depressive symptoms; combined chemotherapy and radiation treatment was also statistically significantly associated with memory complaints. 
These results and other emerging studies suggest that subjective cognitive complaints in part reflect objective NP performance, although their etiology and biology appear to be multifactorial, motivating further transdisciplinary research. PMID:23606729

  14. Analysis of statistical and standard algorithms for detecting muscle onset with surface electromyography

    PubMed Central

    Tweedell, Andrew J.; Haynes, Courtney A.

    2017-01-01

    The timing of muscle activity is a commonly applied analytic method to understand how the nervous system controls movement. This study systematically evaluates six classes of standard and statistical algorithms to determine muscle onset in both experimental surface electromyography (EMG) and simulated EMG with a known onset time. Eighteen participants had EMG collected from the biceps brachii and vastus lateralis while performing a biceps curl or knee extension, respectively. Three established methods and three statistical methods for EMG onset were evaluated. Linear envelope, Teager-Kaiser energy operator + linear envelope and sample entropy were the established methods evaluated while general time series mean/variance, sequential and batch processing of parametric and nonparametric tools, and Bayesian changepoint analysis were the statistical techniques used. Visual EMG onset (experimental data) and objective EMG onset (simulated data) were compared with algorithmic EMG onset via root mean square error and linear regression models for stepwise elimination of inferior algorithms. The top algorithms for both data types were analyzed for their mean agreement with the gold standard onset and evaluation of 95% confidence intervals. The top algorithms were all Bayesian changepoint analysis iterations where the parameter of the prior (p0) was zero. The best performing Bayesian algorithms were p0 = 0 and a posterior probability for onset determination at 60–90%. While existing algorithms performed reasonably, the Bayesian changepoint analysis methodology provides greater reliability and accuracy when determining the singular onset of EMG activity in a time series. Further research is needed to determine if this class of algorithms perform equally well when the time series has multiple bursts of muscle activity. PMID:28489897
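    Of the algorithm classes listed above, the general time-series mean/variance rule is the simplest to sketch. The following is a minimal, hypothetical version (not the winning Bayesian changepoint method): onset is the first sample that stays above the baseline mean plus h standard deviations for a sustained run.

```python
# Sketch of a simple mean/variance onset rule, one of the "statistical"
# classes evaluated above (not the Bayesian changepoint method itself).
import statistics

def onset_mean_sd(signal, baseline_len=100, h=3.0, run=10):
    # Threshold from a quiet baseline segment at the start of the record.
    base = signal[:baseline_len]
    thr = statistics.mean(base) + h * statistics.pstdev(base)
    count = 0
    for i, x in enumerate(signal[baseline_len:], start=baseline_len):
        count = count + 1 if x > thr else 0
        if count == run:
            return i - run + 1  # first sample of the sustained burst
    return None  # no onset detected

# Synthetic rectified EMG: flat baseline, burst starting at sample 300.
sig = [0.1] * 300 + [1.0] * 200
print(onset_mean_sd(sig))  # → 300
```

    Real EMG needs rectification/smoothing first, and the baseline length, h, and run parameters here are illustrative choices, not values from the study.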

  15. A novel measure and significance testing in data analysis of cell image segmentation.

    PubMed

    Wu, Jin Chu; Halter, Michael; Kacker, Raghu N; Elliott, John T; Plant, Anne L

    2017-03-14

    Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing the CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, computing the standard errors (SE) of the measures and their correlation coefficient is not described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. We propose the total error rate (TER), a novel performance measure for segmenting all cells in the supervised evaluation. The TER statistically aggregates all misclassification error rates (MER) by taking cell sizes as weights. The MERs are for segmenting each single cell in the population. The TER is fully supported by the pairwise comparisons of MERs using 106 manually segmented ground-truth cells with different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of TER are computed based on the SE of MER that is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, the 95% CI error bars can be used to classify CIS algorithms. The SEs of TERs and their correlation coefficient can be employed to conduct the hypothesis testing, while the CIs overlap, to determine the statistical significance of the performance differences between CIS algorithms. A novel measure TER of CIS is proposed. The TER's SEs and correlation coefficient are computed. Thereafter, CIS algorithms can be evaluated and compared statistically by conducting the significance testing.
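    A minimal sketch of the size-weighted aggregation idea, assuming TER = sum(size_i * MER_i) / sum(size_i) with a simple bootstrap over cells for the standard error; the exact estimator details in the paper may differ, and the cell sizes and MERs below are hypothetical.

```python
# Sketch of a size-weighted total error rate (TER) with a bootstrap SE.
# Weighting scheme is an assumption; cells and error rates are toy values.
import random

def ter(sizes, mers):
    # Cell-size-weighted aggregate of per-cell misclassification error rates.
    return sum(s * m for s, m in zip(sizes, mers)) / sum(sizes)

def bootstrap_se(sizes, mers, n_boot=2000, seed=0):
    # Resample cells with replacement and take the SD of the TER replicates.
    rng = random.Random(seed)
    idx = list(range(len(sizes)))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(idx) for _ in idx]
        stats.append(ter([sizes[i] for i in sample],
                         [mers[i] for i in sample]))
    mean = sum(stats) / n_boot
    var = sum((t - mean) ** 2 for t in stats) / (n_boot - 1)
    return var ** 0.5

sizes = [120, 80, 200, 150]       # hypothetical cell areas (pixels)
mers = [0.05, 0.10, 0.02, 0.08]   # hypothetical per-cell error rates
print(round(ter(sizes, mers), 4), round(bootstrap_se(sizes, mers), 4))
```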

  16. Distinguishing Positive Selection From Neutral Evolution: Boosting the Performance of Summary Statistics

    PubMed Central

    Lin, Kao; Li, Haipeng; Schlötterer, Christian; Futschik, Andreas

    2011-01-01

    Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists that captures all the information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has high power to detect selective sweeps, and that demographic events such as bottlenecks do not result in a large excess of false positives. A comparison shows that our boosting implementation performs well relative to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that integrated haplotype homozygosity is very informative for recent sweeps, whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection. PMID:21041556
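    As a hedged illustration of the boosting idea (combining simple classification rules to improve joint predictive performance), the toy below runs AdaBoost with decision stumps on a single hypothetical summary statistic. The paper's implementation combines many summary statistics and is not reproduced here.

```python
# Toy AdaBoost with 1-D decision stumps: sweeps (+1) vs neutral (-1)
# classified from a single hypothetical summary-statistic value.
import math

def stump_train(x, y, w):
    # Best threshold/polarity stump minimizing weighted error on 1-D data.
    best = (None, 1, float("inf"))  # (threshold, polarity, error)
    for thr in sorted(set(x)):
        for pol in (1, -1):
            pred = [pol if xi >= thr else -pol for xi in x]
            err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
            if err < best[2]:
                best = (thr, pol, err)
    return best

def adaboost(x, y, rounds=5):
    n = len(x)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        thr, pol, err = stump_train(x, y, w)
        err = max(err, 1e-10)                     # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)   # stump weight
        model.append((thr, pol, alpha))
        # Reweight: misclassified points gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * yi * (pol if xi >= thr else -pol))
             for wi, xi, yi in zip(w, x, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return model

def predict(model, xi):
    score = sum(a * (pol if xi >= thr else -pol) for thr, pol, a in model)
    return 1 if score >= 0 else -1

# Hypothetical statistic: selective sweeps (+1) show lower values.
x = [0.1, 0.2, 0.3, 0.4, 0.9, 1.0, 1.1, 1.2]
y = [1, 1, 1, 1, -1, -1, -1, -1]
model = adaboost(x, y)
print([predict(model, xi) for xi in x])
```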

  17. Effects of structured written feedback by cards on medical students' performance at Mini Clinical Evaluation Exercise (Mini-CEX) in an outpatient clinic.

    PubMed

    Haghani, Fariba; Hatef Khorami, Mohammad; Fakhari, Mohammad

    2016-07-01

    Feedback cards are recommended as a feasible tool for delivering structured written feedback in clinical education, but the effectiveness of this tool on medical students' performance remains questionable. The purpose of this study was to compare the effects of structured written feedback by cards combined with verbal feedback versus verbal feedback alone on the clinical performance of medical students at the Mini Clinical Evaluation Exercise (Mini-CEX) test in an outpatient clinic. This quasi-experimental study with pre- and post-tests comprised four groups across two terms of medical students' externship. The students' performance was assessed through the Mini-CEX as a clinical performance evaluation tool. Structured written feedback was given to two experimental groups via designed feedback cards in addition to verbal feedback, while in the two control groups feedback was delivered verbally, as is the routine approach in clinical education. By consecutive sampling, 62 externship students were enrolled in this study; seven were excluded from the final analysis because they were absent for three days. According to ANOVA and post hoc Tukey tests, no statistically significant difference was observed among the four groups at the pre-test, whereas a statistically significant difference was observed between the experimental and control groups at the post-test (F = 4.023, p = 0.012). The effect size of the structured written feedback on clinical performance was 0.19. Structured written feedback by cards improved the performance of medical students in a statistical sense. Further studies should be conducted in other clinical courses with longer durations.

  18. Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.

    PubMed

    Mørk, Søren; Holmes, Ian

    2012-03-01

    Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate hidden Markov model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length-modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system, in terms of statistical information criteria and gene-finding prediction accuracy, on two bacterial genomes. Neither of our implementations of the two most widely used model structures performs best in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts is freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.

  19. Model Performance Evaluation and Scenario Analysis ...

    EPA Pesticide Factsheets

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. The metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude-and-sequence errors. The performance measures include error analysis, the coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics provide useful information only about overall model performance. MPESA is based on the separation of observed and simulated time series into magnitude and sequence components; this separation, and the reconstruction back to time series, provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the too
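    Of the goodness-of-fit measures named above, the Nash-Sutcliffe efficiency has a standard closed form, NSE = 1 - sum((O_t - S_t)^2) / sum((O_t - mean(O))^2). A minimal sketch with hypothetical observed/simulated values (the weighted rank method is tool-specific and not reproduced here):

```python
# Nash-Sutcliffe efficiency: 1.0 is a perfect fit, 0.0 means the model is
# no better than predicting the observed mean. Values are hypothetical.
def nse(obs, sim):
    mean_obs = sum(obs) / len(obs)
    ss_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_err / ss_tot

obs = [2.0, 3.0, 5.0, 4.0, 6.0]
sim = [2.5, 2.5, 5.0, 4.5, 5.5]
print(round(nse(obs, sim), 3))  # → 0.9
```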

  20. Specialized data analysis of SSME and advanced propulsion system vibration measurements

    NASA Technical Reports Server (NTRS)

    Coffin, Thomas; Swanson, Wayne L.; Jong, Yen-Yi

    1993-01-01

    The basic objectives of this contract were to perform detailed analysis and evaluation of dynamic data obtained during Space Shuttle Main Engine (SSME) test and flight operations, including analytical/statistical assessment of component dynamic performance, and to continue the development and implementation of analytical/statistical models to effectively define nominal component dynamic characteristics, detect anomalous behavior, and assess machinery operational conditions. This study was to provide timely assessment of engine component operational status, identify probable causes of malfunction, and define feasible engineering solutions. The work was performed under three broad tasks: (1) Analysis, Evaluation, and Documentation of SSME Dynamic Test Results; (2) Data Base and Analytical Model Development and Application; and (3) Development and Application of Vibration Signature Analysis Techniques.

  1. An evaluation of GTAW-P versus GTA welding of alloy 718

    NASA Technical Reports Server (NTRS)

    Gamwell, W. R.; Kurgan, C.; Malone, T. W.

    1991-01-01

    Mechanical properties were evaluated to determine statistically whether the pulsed current gas tungsten arc welding (GTAW-P) process produces welds in alloy 718 with room-temperature structural performance equivalent to current Space Shuttle Main Engine (SSME) welds manufactured by the constant current GTAW process. Evaluations were conducted on two base metal lots, two filler metal lots, two heat input levels, and two welding processes. The material form was 0.125-inch (3.175-mm) alloy 718 sheet. Prior to welding, sheets were treated to either the ST or STA-1 condition. After welding, panels were left as-welded or heat treated to the STA-1 condition, and weld beads were left intact or machined flush. Statistical analyses were performed on yield strength, ultimate tensile strength (UTS), and high cycle fatigue (HCF) properties for all the post-welded material conditions. Analyses of variance were performed on the data to determine if there were any significant effects on UTS or HCF life due to variations in base metal, filler metal, heat input level, or welding process. Statistical analyses showed that the GTAW-P process does produce welds with room-temperature structural performance equivalent to current SSME welds manufactured by the GTAW process, regardless of prior material condition or post-welding condition.

  2. Performance comparison of LUR and OK in PM2.5 concentration mapping: a multidimensional perspective

    PubMed Central

    Zou, Bin; Luo, Yanqing; Wan, Neng; Zheng, Zhong; Sternberg, Troy; Liao, Yilan

    2015-01-01

    Methods of Land Use Regression (LUR) modeling and Ordinary Kriging (OK) interpolation have been widely used to offset the shortcomings of PM2.5 data observed at sparse monitoring sites. However, traditional point-based performance evaluation strategy for these methods remains stagnant, which could cause unreasonable mapping results. To address this challenge, this study employs ‘information entropy’, an area-based statistic, along with traditional point-based statistics (e.g. error rate, RMSE) to evaluate the performance of LUR model and OK interpolation in mapping PM2.5 concentrations in Houston from a multidimensional perspective. The point-based validation reveals significant differences between LUR and OK at different test sites despite the similar end-result accuracy (e.g. error rate 6.13% vs. 7.01%). Meanwhile, the area-based validation demonstrates that the PM2.5 concentrations simulated by the LUR model exhibits more detailed variations than those interpolated by the OK method (i.e. information entropy, 7.79 vs. 3.63). Results suggest that LUR modeling could better refine the spatial distribution scenario of PM2.5 concentrations compared to OK interpolation. The significance of this study primarily lies in promoting the integration of point- and area-based statistics for model performance evaluation in air pollution mapping. PMID:25731103
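    A hedged sketch of the area-based idea described above: the Shannon information entropy of the histogram of mapped concentration values, where a more detailed surface yields higher entropy. The binning scheme and the two toy "surfaces" are assumptions for illustration, not the paper's exact procedure.

```python
# Shannon entropy of a mapped concentration surface, computed over a
# fixed-width histogram of the cell values. Bin count is an assumption.
import math

def shannon_entropy(values, bins=16):
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # degenerate (constant) surface
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Hypothetical PM2.5 surfaces: one with fine variation, one smoothed flat.
detailed = [5 + (i * 7) % 13 for i in range(200)]
smoothed = [10.0] * 200
print(shannon_entropy(detailed), shannon_entropy(smoothed))
```

    A constant (over-smoothed) surface has zero entropy; a surface with many distinct local values has high entropy, matching the LUR-vs-OK contrast reported above.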

  3. The Shock and Vibration Bulletin. Part 2. Invited Papers, Structural Dynamics

    DTIC Science & Technology

    1974-08-01

    Viking Lander Dynamics, Mr. Joseph C. Pohlen, Martin Marietta Aerospace, Denver, Colorado. Structural Dynamics: Performance of Statistical Energy Analysis ... aerospace structures. Analytical prediction of these environments is beyond the current scope of classical modal techniques. Statistical energy analysis methods have been developed that circumvent the difficulties of high-frequency modal analysis. These statistical energy analysis methods are evaluated.

  4. Performance Evaluation of New-Generation Pulse Oximeters in the NICU: Observational Study.

    PubMed

    Nizami, Shermeen; Greenwood, Kim; Barrowman, Nick; Harrold, JoAnn

    2015-09-01

    This crossover observational study compares the data characteristics and performance of new-generation Nellcor OXIMAX and Masimo SET SmartPod pulse oximeter technologies. The study was conducted independent of either original equipment manufacturer (OEM) across eleven preterm infants in a Neonatal Intensive Care Unit (NICU). The SmartPods were integrated with Dräger Infinity Delta monitors. The Delta monitor measured the heart rate (HR) using an independent electrocardiogram sensor, and the two SmartPods collected arterial oxygen saturation (SpO2) and pulse rate (PR). All patient data were non-Gaussian. Nellcor PR showed a higher correlation with the HR as compared to Masimo PR. The statistically significant difference found in their median values (1% for SpO2, 1 bpm for PR) was deemed clinically insignificant. SpO2 alarms generated by both SmartPods were observed and categorized for performance evaluation. Results for sensitivity, positive predictive value, accuracy and false alarm rates were Nellcor (80.3, 50, 44.5, 50%) and Masimo (72.2, 48.2, 40.6, 51.8%) respectively. These metrics were not statistically significantly different between the two pulse oximeters. Despite claims by OEMs, both pulse oximeters exhibited high false alarm rates, with no statistically or clinically significant difference in performance. These findings have a direct impact on alarm fatigue in the NICU. Performance evaluation studies can also impact medical device purchase decisions made by hospital administrators.
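    The alarm metrics reported above follow from confusion counts. A sketch, taking the false alarm rate as FP/(TP+FP) (i.e., 1 - PPV, consistent with the reported 50%/50% pairing); the counts below are hypothetical, not the study's data.

```python
# Alarm performance metrics from a 2x2 confusion table of SpO2 alarms:
# TP = true alarm, FP = false alarm, FN = missed event, TN = correct silence.
def alarm_metrics(tp, fp, tn, fn):
    sens = tp / (tp + fn)              # sensitivity (recall)
    ppv = tp / (tp + fp)               # positive predictive value
    acc = (tp + tn) / (tp + fp + tn + fn)
    far = fp / (tp + fp)               # false alarm rate, = 1 - PPV here
    return sens, ppv, acc, far

# Hypothetical alarm counts for one device.
sens, ppv, acc, far = alarm_metrics(tp=49, fp=49, tn=10, fn=12)
print(round(sens, 3), round(ppv, 3), round(acc, 3), round(far, 3))
```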

  5. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
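    Both statistical methods have standard closed forms. A minimal sketch of Bland-Altman limits of agreement and Deming regression (with the error-variance ratio lam assumed known, lam = 1 by default); the paired values below are hypothetical, not NGS validation data.

```python
# Bland-Altman bias and 95% limits of agreement, plus Deming regression,
# for comparing two quantitative assays measuring the same samples.
import statistics

def bland_altman(x, y):
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.mean(diffs)                 # constant error estimate
    sd = statistics.stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def deming(x, y, lam=1.0):
    # Errors-in-both-variables fit; lam = ratio of error variances (assumed).
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = ((syy - lam * sxx) +
             ((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2) ** 0.5) / (2 * sxy)
    return slope, my - slope * mx  # slope (proportional), intercept (constant)

# Hypothetical paired variant-allele-fraction measurements.
x = [0.10, 0.20, 0.30, 0.40, 0.50]
y = [0.12, 0.21, 0.33, 0.41, 0.54]
print(bland_altman(x, y))
print(deming(x, y))
```

    A non-zero intercept indicates constant error and a slope away from 1 indicates proportional error, the two components R2 alone cannot separate.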

  6. Results of the 1980 NACUBO Comparative Performance Study and Investment Questionnaire.

    ERIC Educational Resources Information Center

    Dresner, Bruce M.

    The purpose of the annual National Association of College and University Business Officers' (NACUBO) Comparative Performance Study is to aid administrators in evaluating the performance of their investment pools. The 1980 study contains two parts: (1) comparative performance information and related investment performance statistics; and (2) other…

  7. Results of the 1979 NACUBO Comparative Performance Study and Investment Questionnaire.

    ERIC Educational Resources Information Center

    Dresner, Bruce M.

    Results of the 1979 Comparative Performance Study of the National Association of College and Business Officers are presented. The study is designed to aid administrators in evaluating the performance of their investment pools. The report covers comparative performance information and related investment performance statistics and other endowment…

  8. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model.

    PubMed

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images against plaster models for mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were retrieved from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed on each. The data were analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA); descriptive and analytical statistics along with Student's t-test were used to evaluate the data, and P < 0.05 was considered statistically significant. Statistically significant differences were found between CBCT-derived images and plaster models: the mean for Moyer's analysis in the left and right lower arch was 21.2 mm and 21.1 mm for CBCT versus 22.5 mm and 22.5 mm for the plaster models, respectively. CBCT-derived images were less reliable than measurements obtained directly from plaster models for mixed dentition analysis.
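    For context, the Tanaka-Johnston analysis applied above predicts the combined mesiodistal width of the unerupted canine and premolars in one quadrant from the four mandibular incisors, using the standard 10.5 mm (mandibular) and 11.0 mm (maxillary) constants:

```python
# Tanaka-Johnston mixed dentition prediction: half the summed mesiodistal
# width of the four mandibular incisors plus an arch-specific constant.
def tanaka_johnston(lower_incisor_sum_mm, arch="mandibular"):
    half = lower_incisor_sum_mm / 2
    return half + (10.5 if arch == "mandibular" else 11.0)

print(tanaka_johnston(23.0))               # → 22.0 (mandibular, mm)
print(tanaka_johnston(23.0, "maxillary"))  # → 22.5 (maxillary, mm)
```

    The input of 23.0 mm is an illustrative incisor sum, chosen only to show values in the range reported in the abstract.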

  9. Performance assessment through pre- and post-training evaluation of continuing medical education courses in prevention and management of cardio-vascular diseases in primary health care facilities of Armenia.

    PubMed

    Khachatryan, Lilit; Balalian, Arin

    2013-12-01

    To assess differences between pre- and post-training performance in continuing medical education (CME) courses on cardiovascular disease (CVD) management among physicians at primary health care facilities in the Armenian regions, we conducted an evaluation survey. 212 medical records were surveyed to assess performance before and after the training courses using a self-administered structured questionnaire. Analysis of the survey revealed statistically significant differences (p < 0.05) in a number of variables: threefold increased recording of lipids and body mass index (p = 0.001); moderately increased recording of comorbidities and aspirin prescription (p < 0.012); eightfold increased recording of a dyslipidemia management plan, twofold increased recording of a CVD management plan and fivefold increased recording of CVD absolute risk (p < 0.001). Missing records of electrocardiography and urine/creatinine analyses decreased statistically significantly (p < 0.05). A statistically significant decrease was observed in prescription of thiazides and angiotensin receptor blockers/angiotensin-converting enzyme inhibitors (p < 0.005), while prescription of statins, and of statins with diet, for dyslipidemia management showed increased recording (p < 0.05). Similarly, we observed increased records of counseling on rehabilitation physical activity (p = 0.006). Most of the pre- to post-training differences in this assessment may be explained by improved and interactive training modes and more advanced methods of demonstration and modeling. These findings may serve as a basis for future planning of CME courses for physicians in remote areas who face challenges in updating their knowledge, and may expand the experience of performance assessment alongside evaluation of knowledge scores.

  10. Statistical process control applied to mechanized peanut sowing as a function of soil texture.

    PubMed

    Zerbato, Cristiano; Furlani, Carlos Eduardo Angeli; Ormond, Antonio Tassio Santana; Gírio, Lucas Augusto da Silva; Carneiro, Franciele Morlin; da Silva, Rouverson Pereira

    2017-01-01

    The successful establishment of agricultural crops depends on sowing quality, machinery performance, soil type and conditions, among other factors. This study evaluates the operational quality of mechanized peanut sowing in three soil types (sand, silt, and clay) with variable moisture contents. The experiment was conducted in three locations in the state of São Paulo, Brazil. The track-sampling scheme was used for 80 sampling locations of each soil type. Descriptive statistics and statistical process control (SPC) were used to evaluate the quality indicators of mechanized peanut sowing. The variables had normal distributions and were stable from the viewpoint of SPC. The best performance for peanut sowing density, normal spacing, and the initial seedling growing stand was found for clayey soil followed by sandy soil and then silty soil. Sandy or clayey soils displayed similar results regarding sowing depth, which was deeper than in the silty soil. Overall, the texture and the moisture of clayey soil provided the best operational performance for mechanized peanut sowing.

  12. Injuries and Illnesses of Vietnam War POWs Revisited: IV. Air Force Risk Factors

    DTIC Science & Technology

    2017-03-22

    The study examined predominantly aviators imprisoned in North Vietnam. Statistical analyses were performed using SPSS version 19, and Pearson correlations were obtained...

  13. Statistical process control in Deep Space Network operation

    NASA Technical Reports Server (NTRS)

    Hodder, J. A.

    2002-01-01

    This report describes how the Deep Space Mission System (DSMS) Operations Program Office at the Jet Propulsion Laboratory (JPL) uses Statistical Process Control (SPC) to monitor performance and evaluate initiatives for improving processes on the National Aeronautics and Space Administration's (NASA) Deep Space Network (DSN).

  14. Performance evaluation of spectral vegetation indices using a statistical sensitivity function

    USGS Publications Warehouse

    Ji, Lei; Peters, Albert J.

    2007-01-01

    A great number of spectral vegetation indices (VIs) have been developed to estimate biophysical parameters of vegetation. Traditional techniques for evaluating the performance of VIs are regression-based statistics, such as the coefficient of determination and root mean square error. These statistics, however, are not capable of quantifying the detailed relationship between VIs and biophysical parameters because the sensitivity of a VI is usually a function of the biophysical parameter instead of a constant. To better quantify this relationship, we developed a “sensitivity function” for measuring the sensitivity of a VI to biophysical parameters. The sensitivity function is defined as the first derivative of the regression function, divided by the standard error of the dependent variable prediction. The function elucidates the change in sensitivity over the range of the biophysical parameter. The Student's t- or z-statistic can be used to test the significance of VI sensitivity. Additionally, we developed a “relative sensitivity function” that compares the sensitivities of two VIs when the biophysical parameters are unavailable.
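    A minimal sketch of the sensitivity function s(x) = f'(x)/SE(x) for a fitted quadratic VI-parameter relationship. For simplicity the standard error is taken as a constant residual value rather than the full prediction SE, and the coefficients below are hypothetical, not fitted values from the paper.

```python
# Sensitivity function sketch: derivative of the fitted regression divided
# by the standard error of prediction (here simplified to a constant se).
def sensitivity(b, c, se):
    # Fitted model f(x) = a + b*x + c*x**2, so f'(x) = b + 2*c*x.
    return lambda x: (b + 2 * c * x) / se

# Hypothetical fit of a VI against a biophysical parameter (e.g. LAI-like).
s = sensitivity(b=0.8, c=-0.002, se=0.05)
print(s(0.0), s(100.0))  # sensitivity declines as the parameter grows
```

    This captures the paper's point: sensitivity is a function of the biophysical parameter, not a single constant like R2, so a VI can be highly sensitive at low parameter values and saturate at high ones.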

  15. Statistical performance evaluation of ECG transmission using wireless networks.

    PubMed

    Shakhatreh, Walid; Gharaibeh, Khaled; Al-Zaben, Awad

    2013-07-01

    This paper presents a simulation of the transmission of biomedical signals (using an ECG signal as an example) over wireless networks. An investigation of the effects of channel impairments, including SNR, path-loss exponent and path delay, and of network impairments such as packet loss probability, on the diagnosability of the received ECG signal is presented. The ECG signal is transmitted through a wireless network system composed of two communication protocols: an 802.15.4 ZigBee protocol and an 802.11b protocol. The performance of the transmission is evaluated using higher-order statistics such as kurtosis and negative entropy, in addition to common techniques such as PRD, RMS error and cross-correlation.
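    Two of the evaluation metrics named above have simple closed forms: PRD (percent root-mean-square difference) between transmitted and received signals, and sample kurtosis. A sketch with toy sequences rather than real ECG:

```python
# PRD and sample kurtosis for a received signal; signals are toy sequences.
def prd(original, received):
    # Percent RMS difference: 100 * sqrt(sum(e^2) / sum(x^2)).
    num = sum((o - r) ** 2 for o, r in zip(original, received))
    den = sum(o ** 2 for o in original)
    return 100 * (num / den) ** 0.5

def kurtosis(x):
    # Pearson kurtosis m4 / m2^2 (3.0 for a Gaussian).
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2 ** 2

tx = [0.0, 1.0, 0.0, -1.0] * 50        # toy transmitted signal
rx = [v * 0.95 for v in tx]            # received with 5% amplitude loss
print(round(prd(tx, rx), 2), round(kurtosis(tx), 2))  # → 5.0 2.0
```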

  16. ANN based Performance Evaluation of BDI for Condition Monitoring of Induction Motor Bearings

    NASA Astrophysics Data System (ADS)

    Patel, Raj Kumar; Giri, V. K.

    2017-06-01

    Bearings are among the most critical parts of rotating machines, and most failures arise from defective bearings. Bearing failure leads to machine failure and unpredicted productivity loss. Bearing fault detection and prognosis are therefore an integral part of preventive maintenance procedures. In this paper, vibration signals for four conditions of a deep-groove ball bearing (normal (N), inner race defect (IRD), ball defect (BD) and outer race defect (ORD)) were acquired from a customized bearing test rig at three different fault sizes. Two approaches were adopted for statistical feature extraction from the vibration signal: in the first, features are extracted from the raw signal; in the second, features are based on a bearing damage index (BDI). The proposed BDI technique uses a wavelet packet node energy coefficient analysis method. Both feature sets are used as inputs to an ANN classifier to evaluate its performance. A comparison of ANN performance is made between raw vibration data and data chosen using the BDI. ANN performance was found to be appreciably higher when BDI-based features were used as classifier inputs.

  17. Relevance of the c-statistic when evaluating risk-adjustment models in surgery.

    PubMed

    Merkow, Ryan P; Hall, Bruce L; Cohen, Mark E; Dimick, Justin B; Wang, Edward; Chow, Warren B; Ko, Clifford Y; Bilimoria, Karl Y

    2012-05-01

    The measurement of hospital quality based on outcomes requires risk adjustment. The c-statistic is a popular tool used to judge model performance, but can be limited, particularly when evaluating specific operations in focused populations. Our objectives were to examine the interpretation and relevance of the c-statistic when used in models with increasingly similar case mix and to consider an alternative perspective on model calibration based on a graphical depiction of model fit. From the American College of Surgeons National Surgical Quality Improvement Program (2008-2009), patients were identified who underwent a general surgery procedure, and procedure groups were increasingly restricted: colorectal-all, colorectal-elective cases only, and colorectal-elective cancer cases only. Mortality and serious morbidity outcomes were evaluated using logistic regression-based risk adjustment, and model c-statistics and calibration curves were used to compare model performance. During the study period, 323,427 general, 47,605 colorectal-all, 39,860 colorectal-elective, and 21,680 colorectal cancer patients were studied. Mortality ranged from 1.0% in general surgery to 4.1% in the colorectal-all group, and serious morbidity ranged from 3.9% in general surgery to 12.4% in the colorectal-all procedural group. As case mix was restricted, c-statistics progressively declined from the general to the colorectal cancer surgery cohorts for both mortality and serious morbidity (mortality: 0.949 to 0.866; serious morbidity: 0.861 to 0.668). Calibration was evaluated graphically by examining predicted vs observed number of events over risk deciles. For both mortality and serious morbidity, there was no qualitative difference in calibration identified between the procedure groups. 
In the present study, we demonstrate how the c-statistic can become less informative, and in certain circumstances can lead to incorrect model-based conclusions, as case mix is restricted and patients become more homogeneous. Although it remains an important tool, caution is advised when the c-statistic is advanced as the sole measure of model performance. Copyright © 2012 American College of Surgeons. All rights reserved.
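    The two model-performance tools discussed, the c-statistic and decile-based calibration, can be sketched as follows. This uses synthetic, hypothetical risk data, not the NSQIP models:

```python
import numpy as np

def c_statistic(y, p):
    """Concordance (c) statistic: the probability that a randomly chosen
    event carries a higher predicted risk than a randomly chosen
    non-event (equivalently, the area under the ROC curve)."""
    pos, neg = p[y == 1], p[y == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

def calibration_by_decile(y, p, n_bins=10):
    """Observed vs predicted event counts over risk deciles, the graphical
    calibration check used as an alternative to the c-statistic."""
    order = np.argsort(p)
    return [(int(y[b].sum()), float(p[b].sum()))
            for b in np.array_split(order, n_bins)]

# Hypothetical, well-calibrated risk-adjustment data
rng = np.random.default_rng(2)
risk = rng.uniform(0.01, 0.30, 2000)
events = (rng.uniform(size=2000) < risk).astype(int)

print(round(c_statistic(events, risk), 3))
print(calibration_by_decile(events, risk)[:2])   # first two deciles
```

    Note that even a perfectly calibrated model yields only a moderate c-statistic when the risk range is narrow, which is the paper's central point about restricted case mix.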

  18. An interlaboratory comparison programme on radio frequency electromagnetic field measurements: the second round of the scheme.

    PubMed

    Nicolopoulou, E P; Ztoupis, I N; Karabetsos, E; Gonos, I F; Stathopulos, I A

    2015-04-01

    The second round of an interlaboratory comparison scheme on radio frequency electromagnetic field measurements has been conducted in order to evaluate the overall performance of laboratories that perform measurements in the vicinity of mobile phone base stations and broadcast antenna facilities. The participants recorded the electric field strength produced by two high-frequency signal generators inside an anechoic chamber in three measurement scenarios, with the antennas transmitting different signals each time at the FM, VHF, UHF and GSM frequency bands. In each measurement scenario, the participants also used their measurements to calculate the relative exposure ratios. The results were evaluated at each test level by calculating performance statistics (z-scores and En numbers). Subsequently, possible sources of error for each participating laboratory were discussed, and the overall evaluation of their performance was determined using an aggregated performance statistic. A comparison between the two rounds demonstrates the necessity of the scheme. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
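    Both performance statistics have standard proficiency-testing forms. A minimal sketch with hypothetical field-strength values follows; the satisfactory/questionable bands in the comments follow the usual ISO 13528 conventions, which the abstract does not spell out:

```python
import math

def z_score(x_lab, x_ref, sigma_p):
    """z-score against the assigned value; conventionally |z| <= 2 is
    satisfactory, 2 < |z| < 3 questionable, |z| >= 3 unsatisfactory."""
    return (x_lab - x_ref) / sigma_p

def en_number(x_lab, x_ref, u_lab, u_ref):
    """En number using expanded (k=2) uncertainties; |En| <= 1 is
    considered satisfactory."""
    return (x_lab - x_ref) / math.sqrt(u_lab ** 2 + u_ref ** 2)

# Hypothetical electric field strength results (V/m) for one test level
z = z_score(x_lab=2.45, x_ref=2.30, sigma_p=0.10)
en = en_number(x_lab=2.45, x_ref=2.30, u_lab=0.20, u_ref=0.10)
print(round(z, 2), round(en, 2))   # 1.5 0.67
```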

  19. Does daily nurse staffing match ward workload variability? Three hospitals' experiences.

    PubMed

    Gabbay, Uri; Bukchin, Michael

    2009-01-01

    Nurse shortage and rising healthcare resource burdens mean that appropriate workforce use is imperative. This paper aims to evaluate whether daily nurse staffing meets ward workload needs. Nurse attendance and daily nurses' workload capacity in three hospitals were evaluated. Statistical process control, a statistics-based process-monitoring method that uses charts with a predefined target measure and control limits, was used to evaluate intra-ward nurse workload capacity and day-to-day variation. For inter-ward analysis, ward-specific crude measures were standardized into relative measures by dividing observed by expected values. Two charts, for acceptable and tolerable daily nurse workload intensity, were defined. Appropriate staffing indicators were defined as exceeding predefined rates within acceptable and tolerable limits (50 percent and 80 percent, respectively). A total of 42 percent of all days fell within acceptable control limits and 71 percent within tolerable control limits. Appropriate staffing indicators were met in only 33 percent of wards for acceptable nurse workload intensity and in only 45 percent of wards for tolerable workloads. The study did not differentiate crude nurse attendance, and it did not take patient severity into account since crude bed occupancy was used. The use of double statistical process control charts and particular staffing indicators is open to debate.
The methods presented for monitoring daily staffing appropriateness are simple to implement, either for intra-ward day-to-day variation using nurse workload capacity statistical process control charts or for inter-ward evaluation using a standardized measure of nurse workload intensity. The real challenge will be to develop planning systems and implement corrective interventions, such as dynamic and flexible daily staffing, which will face difficulties and barriers. The paper fulfils the need for workforce utilization evaluation. A simple method for evaluating daily staffing appropriateness that uses available data, and that is easy to implement and operate, is presented. The statistical process control method enables intra-ward evaluation, while standardization, by converting crude into relative measures, enables inter-ward analysis. The staffing indicator definitions enable performance evaluation. This original study uses statistical process control, develops simple standardization methods and applies straightforward statistical tools. The method is not limited to crude measures; it can also use weighted workload measures such as nursing acuity or weighted nurse level (i.e. grade/band).
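    A minimal sketch of the standardized SPC approach, using hypothetical attendance data and generic Shewhart-style three-sigma limits (the paper's own acceptable/tolerable limits are defined differently):

```python
import numpy as np

def spc_limits(values, n_sigma=3):
    """Shewhart-style control limits: center line +/- n_sigma * SD."""
    center = values.mean()
    sd = values.std(ddof=1)
    return center - n_sigma * sd, center, center + n_sigma * sd

def staffing_appropriateness(observed, expected, required_fraction):
    """Standardize workload as observed/expected (inter-ward comparable)
    and report whether the fraction of in-control days meets the rate."""
    ratio = np.asarray(observed) / np.asarray(expected)
    lcl, _, ucl = spc_limits(ratio)
    in_control = float(np.mean((ratio >= lcl) & (ratio <= ucl)))
    return in_control >= required_fraction, in_control

# Hypothetical ward: 30 days of planned vs actual nurse capacity
rng = np.random.default_rng(3)
expected = np.full(30, 8.0)
observed = 8.0 + rng.normal(0.0, 0.8, 30)

ok, frac = staffing_appropriateness(observed, expected, required_fraction=0.8)
print(ok, round(frac, 2))
```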

  20. Filter Tuning Using the Chi-Squared Statistic

    NASA Technical Reports Server (NTRS)

    Lilly-Salkowski, Tyler B.

    2017-01-01

    This paper examines the use of the chi-squared statistic as a means of evaluating filter performance. The goal of the process is to characterize filter performance in the metric of covariance realism. The chi-squared statistic is calculated to determine the realism of a covariance based on the prediction accuracy and the covariance values at a given point in time; it is the distribution of this statistic that provides insight into the accuracy of the covariance. The process of tuning an Extended Kalman Filter (EKF) for Aqua and Aura support is described, including examination of the measurement errors of the available observation types and methods of dealing with potentially volatile atmospheric drag modeling. Predictive accuracy and the distribution of the chi-squared statistic, calculated from EKF solutions, are assessed.
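    The covariance-realism statistic is a Mahalanobis-type quadratic form. A sketch with synthetic errors drawn from a hypothetical covariance (not the Aqua/Aura data):

```python
import numpy as np

def chi_squared_statistic(x_true, x_est, P):
    """Quadratic form e^T P^{-1} e; when the filter covariance P is
    realistic, it follows a chi-square distribution with n dof."""
    e = np.asarray(x_true) - np.asarray(x_est)
    return float(e @ np.linalg.solve(P, e))

# Synthetic check: errors actually drawn from the claimed covariance
rng = np.random.default_rng(4)
P = np.diag([4.0, 1.0, 0.25])
errors = rng.multivariate_normal(np.zeros(3), P, size=5000)
stats = np.array([chi_squared_statistic(e, np.zeros(3), P) for e in errors])
print(round(stats.mean(), 2))   # near 3, the state dimension
```

    A sample mean well above the state dimension indicates an overly optimistic covariance; well below, an overly conservative one.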

  1. Statistical tests for detecting associations with groups of genetic variants: generalization, evaluation, and implementation

    PubMed Central

    Ferguson, John; Wheeler, William; Fu, YiPing; Prokunina-Olsson, Ludmila; Zhao, Hongyu; Sampson, Joshua

    2013-01-01

    With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave). PMID:23092956
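    Two common members of the GSS family can be sketched as follows, using hypothetical single-variant statistics and unit weights; the full GSS framework also includes cross-product terms and null-distribution computations not shown here:

```python
import numpy as np

def burden_statistic(z, w):
    """Burden-style test: square of a weighted sum of single-variant
    statistics; powerful when effects share one direction."""
    return float((w @ z) ** 2)

def skat_like_statistic(z, w):
    """Variance-component-style test: weighted sum of squared statistics;
    robust when effect directions are mixed."""
    return float(np.sum(w * z ** 2))

# Hypothetical single-variant z-statistics for five rare variants
z = np.array([1.8, 2.1, 1.5, -0.2, 1.9])
w = np.ones(5)
print(round(burden_statistic(z, w), 2))    # 50.41
print(round(skat_like_statistic(z, w), 2)) # 13.55
```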

  2. Methods in pharmacoepidemiology: a review of statistical analyses and data reporting in pediatric drug utilization studies.

    PubMed

    Sequi, Marco; Campi, Rita; Clavenna, Antonio; Bonati, Maurizio

    2013-03-01

    To evaluate the quality of data reporting and statistical methods performed in drug utilization studies in the pediatric population. Drug utilization studies evaluating all drug prescriptions to children and adolescents published between January 1994 and December 2011 were retrieved and analyzed. For each study, information on measures of exposure/consumption, the covariates considered, descriptive and inferential analyses, statistical tests, and methods of data reporting was extracted. An overall quality score was created for each study using a 12-item checklist that took into account the presence of outcome measures, covariates of measures, descriptive measures, statistical tests, and graphical representation. A total of 22 studies were reviewed and analyzed. Of these, 20 studies reported at least one descriptive measure. The mean was the most commonly used measure (18 studies), but only five of these also reported the standard deviation. Statistical analyses were performed in 12 studies, with the chi-square test being the most commonly performed test. Graphs were presented in 14 papers. Sixteen papers reported the number of drug prescriptions and/or packages, and ten reported the prevalence of the drug prescription. The mean quality score was 8 (median 9). Only seven of the 22 studies received a score of ≥10, while four studies received a score of <6. Our findings document that only a few of the studies reviewed applied statistical methods and reported data in a satisfactory manner. We therefore conclude that the methodology of drug utilization studies needs to be improved.

  3. FAST COGNITIVE AND TASK ORIENTED, ITERATIVE DATA DISPLAY (FACTOID)

    DTIC Science & Technology

    2017-06-01

    approaches. As a result, the following assumptions guided our efforts in developing modeling and descriptive metrics for evaluation purposes... Application Evaluation. Our analytic workflow for evaluation is to first provide descriptive statistics about applications across metrics (performance... distributions for evaluation purposes because the goal of evaluation is accurate description, not inference (e.g., prediction). Outliers depicted

  4. Correlation analysis between 2D and quasi-3D gamma evaluations for both intensity-modulated radiation therapy and volumetric modulated arc therapy

    PubMed Central

    Kim, Jung-in; Choi, Chang Heon; Wu, Hong-Gyun; Kim, Jin Ho; Kim, Kyubo; Park, Jong Min

    2017-01-01

    The aim of this work was to investigate correlations between 2D and quasi-3D gamma passing rates. A total of 20 patients (10 prostate cases and 10 head and neck, H&N, cases) were retrospectively selected. For each patient, both intensity-modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT) plans were generated. For each plan, 2D gamma evaluation with radiochromic films and quasi-3D gamma evaluation with fluence measurements were performed with both 2%/2 mm and 3%/3 mm criteria. Gamma passing rates were grouped according to delivery technique and treatment site. Statistical analyses were performed to examine the correlation between 2D and quasi-3D gamma evaluations. A statistically significant difference between delivery techniques was observed only in the quasi-3D gamma passing rates with 2%/2 mm. Statistically significant differences were observed between treatment sites in the 2D gamma passing rates (differences of less than 8%). No statistically significant correlations were observed between 2D and quasi-3D gamma passing rates except for the VMAT group and the group including both IMRT and VMAT with 3%/3 mm (r = 0.564 with p = 0.012 for the VMAT group and r = 0.372 with p = 0.020 for the combined group); even these correlations were not strong. No strong correlations were observed between 2D and quasi-3D gamma evaluations. PMID:27690300

  5. The relationship between temporomandibular dysfunction and head and cervical posture.

    PubMed

    Matheus, Ricardo Alves; Ramos-Perez, Flávia Maria de Moraes; Menezes, Alynne Vieira; Ambrosano, Gláucia Maria Bovi; Haiter-Neto, Francisco; Bóscolo, Frab Norberto; de Almeida, Solange Maria

    2009-01-01

    This study aimed to evaluate possible correlations between disc displacement and parameters used for evaluating skull positioning in relation to the cervical spine: craniocervical angle, suboccipital space between C0-C1, cervical curvature and position of the hyoid bone, in individuals with and without symptoms of temporomandibular dysfunction. The patients were evaluated following the guidelines set forth by the RDC/TMD. Magnetic resonance imaging was used to establish disc positioning in the temporomandibular joints (TMJs) of 30 volunteer patients without temporomandibular dysfunction symptoms and 30 patients with symptoms. Skull positioning in relation to the cervical spine was evaluated on lateral cephalograms acquired with the individual in natural head position. Data were submitted to statistical analysis by Fisher's exact test at a 5% significance level. To measure the degree of reproducibility/agreement between surveys, the kappa statistic was used. Significant differences in the C0-C1 measurement were observed for both the symptomatic (p=0.04) and asymptomatic (p=0.02) groups. No statistical differences were observed regarding craniocervical angle, C1-C2 and hyoid bone position in relation to TMJs with and without disc displacement. Although a statistically significant difference was found in the C0-C1 space, no association between it and internal temporomandibular joint disorder could be established. Based on the results observed in this study, no direct relationship could be determined between the presence of disc displacement and the variables assessed.

  6. THE RELATIONSHIP BETWEEN TEMPOROMANDIBULAR DYSFUNCTION AND HEAD AND CERVICAL POSTURE

    PubMed Central

    Matheus, Ricardo Alves; Ramos-Perez, Flávia Maria de Moraes; Menezes, Alynne Vieira; Ambrosano, Gláucia Maria Bovi; Haiter, Francisco; Bóscolo, Frab Norberto; de Almeida, Solange Maria

    2009-01-01

    Objective: This study aimed to evaluate possible correlations between disc displacement and parameters used for evaluating skull positioning in relation to the cervical spine: craniocervical angle, suboccipital space between C0-C1, cervical curvature and position of the hyoid bone, in individuals with and without symptoms of temporomandibular dysfunction. Material and Methods: The patients were evaluated following the guidelines set forth by the RDC/TMD. Magnetic resonance imaging was used to establish disc positioning in the temporomandibular joints (TMJs) of 30 volunteer patients without temporomandibular dysfunction symptoms and 30 patients with symptoms. Skull positioning in relation to the cervical spine was evaluated on lateral cephalograms acquired with the individual in natural head position. Data were submitted to statistical analysis by Fisher's exact test at a 5% significance level. To measure the degree of reproducibility/agreement between surveys, the kappa statistic was used. Results: Significant differences in the C0-C1 measurement were observed for both the symptomatic (p=0.04) and asymptomatic (p=0.02) groups. No statistical differences were observed regarding craniocervical angle, C1-C2 and hyoid bone position in relation to TMJs with and without disc displacement. Although a statistically significant difference was found in the C0-C1 space, no association between it and internal temporomandibular joint disorder could be established. Conclusion: Based on the results observed in this study, no direct relationship could be determined between the presence of disc displacement and the variables assessed. PMID:19466252

  7. Recommendations for Improved Performance Appraisal in the Federal Sector

    DTIC Science & Technology

    1986-01-01

    camera-ready copy of a Participant's Coursebook to be used in conducting sessions of the course, and (d) an evaluation instrument for use in obtaining... Timeliness and Availability of Departmental Statistics and Analyses. Develop complete plans for conducting the 1990 census; improve statistics on

  8. Proceedings of the Conference on the Design of Experiments in Army Research, Development, and Testing (33rd)

    DTIC Science & Technology

    1988-05-01

    Evaluation Directorate (ARMTE) was tasked to conduct a "side-by-side" comparison of EMPS vs. DATMs and to conduct a human factors evaluation of the EMPS... performance ("side-by-side") comparison of EMPS vs. DATMs and to conduct a human factors evaluation. The performance evaluation was based on the speed... independent targets over time. To acquire data for this research, the BRL conducted a statistically designed experiment, the Firepower Control Experiment

  9. Evaluation of maintenance/rehabilitation alternatives for continuously reinforced concrete pavement

    NASA Astrophysics Data System (ADS)

    Barnett, T. L.; Darter, M. I.; Laybourne, N. R.

    1981-05-01

    The design, construction, performance, and costs of several maintenance and rehabilitation methods were evaluated. Patching, cement grout and asphalt undersealing, epoxying of cracks, and an asphalt overlay were considered. Nondestructive testing, deflections, reflection cracking, cost, and statistical analyses were used to evaluate the methods.

  10. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), with R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
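    The core Bland-Altman computation (bias and 95% limits of agreement) is compact. A sketch with hypothetical variant-allele-fraction data into which a constant error of +0.02 has been deliberately injected:

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman bias (mean difference) and 95% limits of agreement."""
    diffs = np.asarray(a) - np.asarray(b)
    bias = diffs.mean()
    sd = diffs.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical variant allele fractions from two NGS assays, where the
# second assay carries a deliberate constant error of +0.02
rng = np.random.default_rng(5)
truth = rng.uniform(0.05, 0.50, 40)
assay_a = truth + rng.normal(0.0, 0.01, 40)
assay_b = truth + 0.02 + rng.normal(0.0, 0.01, 40)

bias, (lo, hi) = bland_altman(assay_b, assay_a)
print(round(bias, 3))   # recovers a bias near the injected +0.02
```

    A nonzero bias exposes constant error that a high R² on the same data would conceal, which is the paper's argument for going beyond R² alone.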

  11. Evaluation of Solid Rocket Motor Component Data Using a Commercially Available Statistical Software Package

    NASA Technical Reports Server (NTRS)

    Stefanski, Philip L.

    2015-01-01

    Commercially available software packages today allow users to quickly perform the routine evaluations of (1) descriptive statistics to numerically and graphically summarize both sample and population data, (2) inferential statistics that draws conclusions about a given population from samples taken of it, (3) probability determinations that can be used to generate estimates of reliability allowables, and finally (4) the setup of designed experiments and analysis of their data to identify significant material and process characteristics for application in both product manufacturing and performance enhancement. This paper presents examples of analysis and experimental design work that has been conducted using Statgraphics® statistical software to obtain useful information with regard to solid rocket motor propellants and internal insulation material. Data were obtained from a number of programs (Shuttle, Constellation, and Space Launch System) and sources that include solid propellant burn rate strands, tensile specimens, sub-scale test motors, full-scale operational motors, rubber insulation specimens, and sub-scale rubber insulation analog samples. Besides facilitating the experimental design process to yield meaningful results, statistical software has demonstrated its ability to quickly perform complex data analyses and yield significant findings that might otherwise have gone unnoticed. One caveat to these successes is that useful results not only derive from the inherent power of the software package, but also from the skill and understanding of the data analyst.

  12. Test Vehicle Forebody Wake Effects on CPAS Parachutes

    NASA Technical Reports Server (NTRS)

    Ray, Eric S.

    2017-01-01

    Parachute drag performance has been reconstructed for a large number of Capsule Parachute Assembly System (CPAS) flight tests. This allows for determining forebody wake effects indirectly through statistical means. When data are available in a "clean" wake, such as behind a slender test vehicle, the relative degradation in performance for other test vehicles can be computed as a Pressure Recovery Fraction (PRF). All four CPAS parachute types were evaluated: Forward Bay Cover Parachutes (FBCPs), Drogues, Pilots, and Mains. Many tests used the missile-shaped Parachute Compartment Drop Test Vehicle (PCDTV) to obtain data at high airspeeds. Other tests used the Orion "boilerplate" Parachute Test Vehicle (PTV) to evaluate parachute performance in a representative heatshield wake. Drag data from both vehicles are normalized to a "capsule" forebody equivalent for Orion simulations. A separate database of PCDTV-specific performance is maintained to accurately predict flight tests. Data are shared among analogous parachutes whenever possible to maximize statistical significance.

  13. A method to evaluate process performance by integrating time and resources

    NASA Astrophysics Data System (ADS)

    Wang, Yu; Wei, Qingjie; Jin, Shuang

    2017-06-01

    The purpose of process mining is to improve an enterprise's existing processes, so measuring process performance is particularly important. However, current research on performance evaluation methods is still insufficient: existing approaches rely mainly on simple time or resource statistics, which cannot evaluate process performance well. In this paper, a method for evaluating process performance based on both the time dimension and the resource dimension is proposed. The method can be used to measure the utilization and redundancy of resources in a process. The paper introduces the design principle and formula of the evaluation algorithm, then describes the design and implementation of the evaluation method. Finally, the evaluation method is used to analyse the event log of a telephone maintenance process and to propose an optimization plan.

  14. The Shock and Vibration Digest. Volume 14, Number 12

    DTIC Science & Technology

    1982-12-01

    to evaluate the uses of statistical energy analysis for determining sound transmission performance. Coupling loss factors were measured and compared... measurements for the artificial (Also see No. 2623) cracks in mild-steel test pieces. 82-2676 Improvement of the Method of Statistical Energy Analysis for... eters, using a large number of free-response time histories in the application of the statistical energy analysis theory simultaneously in one analysis

  15. Preliminary Evaluation of Visual and Flight Performance of Three Current Multifocal Contact Lens Designs for Presbyopic US Army Aviators

    DTIC Science & Technology

    2005-01-01

    absolute emmetropia. Up to a substantial 25% of the aviation population develops ametropia requiring the use of spectacles or other refractive correction... to determine statistical differences in visual performance. The types of contact lenses were compared in general to determine whether there was any... Lomb multifocal, and 187 (sd=25) for the Ciba progressive. There was not a statistically significant difference for high luminance contrast

  16. The Statistical Loop Analyzer (SLA)

    NASA Technical Reports Server (NTRS)

    Lindsey, W. C.

    1985-01-01

    The statistical loop analyzer (SLA) is designed to automatically measure the acquisition, tracking and frequency stability performance characteristics of symbol synchronizers, code synchronizers, carrier tracking loops, and coherent transponders. Automated phase lock and system level tests can also be made using the SLA. Standard baseband, carrier and spread spectrum modulation techniques can be accommodated. Through the SLA's phase error jitter and cycle slip measurements, the acquisition and tracking thresholds of the unit under test are determined; any false phase and frequency lock events are statistically analyzed and reported in the SLA output in probabilistic terms. Automated signal dropout tests can be performed in order to troubleshoot algorithms and evaluate the reacquisition statistics of the unit under test. Cycle slip rates and cycle slip probabilities can be measured using the SLA. These measurements, combined with bit error probability measurements, are all that are needed to fully characterize the acquisition and tracking performance of a digital communication system.

  17. U.S. Geological Survey Standard Reference Sample Project: Performance Evaluation of Analytical Laboratories

    USGS Publications Warehouse

    Long, H. Keith; Daddow, Richard L.; Farrar, Jerry W.

    1998-01-01

    Since 1962, the U.S. Geological Survey (USGS) has operated the Standard Reference Sample Project to evaluate the performance of USGS, cooperator, and contractor analytical laboratories that analyze chemical constituents of environmental samples. The laboratories are evaluated by using performance evaluation samples, called Standard Reference Samples (SRSs), which are submitted to laboratories semiannually for round-robin comparison of laboratory performance. Currently, approximately 100 laboratories are evaluated on their analytical performance for six SRSs covering inorganic and nutrient constituents. As part of the SRS Project, a surplus of homogeneous, stable SRSs is maintained for purchase by USGS offices and participating laboratories for use in continuing quality-assurance and quality-control activities. Statistical evaluation of the laboratories' results provides information for comparing the analytical performance of the laboratories and for identifying possible analytical deficiencies and problems. SRS results also provide information on the bias and variability of the different analytical methods used in the SRS analyses.

  18. Statistical Evaluation of Time Series Analysis Techniques

    NASA Technical Reports Server (NTRS)

    Benignus, V. A.

    1973-01-01

    The performance of a modified version of NASA's multivariate spectrum analysis program is discussed. A multiple regression model was used to make the revisions. Performance improvements were documented and compared to the standard fast Fourier transform by Monte Carlo techniques.

  19. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model

    PubMed Central

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    Aims and Objective: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images compared with plaster models for mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were retrieved from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical statistics, along with Student's t-test, were used to evaluate the data, and P < 0.05 was considered statistically significant. Results: Statistically significant differences were obtained on comparing data between CBCT-derived images and plaster models; the means for Moyer's analysis in the left and right lower arch were 21.2 mm and 21.1 mm for CBCT versus 22.5 mm and 22.5 mm for the plaster models, respectively. Conclusion: CBCT-derived images were less reliable than data obtained directly from plaster models for mixed dentition analysis. PMID:28852639
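    The Tanaka-Johnston prediction used in the study is a simple arithmetic rule; a sketch with hypothetical incisor widths (not the study's measurements):

```python
def tanaka_johnston(lower_incisor_sum_mm):
    """Tanaka-Johnston rule: half the summed mesiodistal width of the four
    mandibular incisors, plus 10.5 mm (lower arch) or 11.0 mm (upper
    arch), estimates the canine + premolar width in one quadrant."""
    half = lower_incisor_sum_mm / 2.0
    return {"lower_quadrant_mm": half + 10.5, "upper_quadrant_mm": half + 11.0}

# Hypothetical incisor widths (mm) measured on a plaster model
widths = [5.4, 5.5, 6.0, 5.9]
pred = tanaka_johnston(sum(widths))
print(pred)   # lower quadrant 21.9 mm, upper quadrant 22.4 mm
```

    The study's CBCT-vs-model discrepancy enters through the measured incisor widths, which is why a systematic measurement offset propagates directly into the predicted segment widths.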

  20. Non-parametric early seizure detection in an animal model of temporal lobe epilepsy

    NASA Astrophysics Data System (ADS)

    Talathi, Sachin S.; Hwang, Dong-Uk; Spano, Mark L.; Simonotto, Jennifer; Furman, Michael D.; Myers, Stephen M.; Winters, Jason T.; Ditto, William L.; Carney, Paul R.

    2008-03-01

    The performance of five non-parametric, univariate seizure detection schemes (embedding delay, Hurst scale, wavelet scale, nonlinear autocorrelation and variance energy) was evaluated as a function of the sampling rate of EEG recordings, the electrode types used for EEG acquisition, and the spatial location of the EEG electrodes in order to determine the applicability of the measures in real-time closed-loop seizure intervention. The criteria chosen for evaluating the performance were high statistical robustness (as determined through the sensitivity and the specificity of a given measure in detecting a seizure) and the lag in seizure detection with respect to the seizure onset time (as determined by visual inspection of the EEG signal by a trained epileptologist). An optimality index was designed to evaluate the overall performance of each measure. For the EEG data recorded with a microwire electrode array at a sampling rate of 12 kHz, the wavelet scale measure exhibited the best overall performance, detecting seizures with a high optimality index value and high sensitivity and specificity.
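The evaluation idea above can be sketched as a small computation. The abstract does not give the paper's exact optimality-index formula, so the combination below (mean of sensitivity and specificity, discounted by normalized detection delay) and all the counts and delays are illustrative assumptions only:

```python
# Hypothetical scoring of a seizure detector against expert onset labels.
def sensitivity(tp, fn):
    """Fraction of true seizures the detector caught."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of non-seizure segments correctly left unflagged."""
    return tn / (tn + fp)

def optimality_index(sens, spec, delay_s, max_delay_s=60.0):
    """One plausible combination: average of sensitivity and specificity,
    discounted by detection delay normalized to a maximum useful delay.
    This is an assumed formula, not the one from the paper."""
    delay_penalty = min(delay_s, max_delay_s) / max_delay_s
    return 0.5 * (sens + spec) * (1.0 - delay_penalty)

sens = sensitivity(tp=18, fn=2)    # detector caught 18 of 20 seizures
spec = specificity(tn=95, fp=5)    # 5 false alarms in 100 clean segments
oi = optimality_index(sens, spec, delay_s=6.0)  # 6 s mean detection lag
```

A detector with perfect sensitivity and specificity but a long lag would still score poorly under such an index, which is the point of combining robustness and timeliness in one number.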

  1. Improving esthetic results in benign parotid surgery: statistical evaluation of facelift approach, sternocleidomastoid flap, and superficial musculoaponeurotic system flap application.

    PubMed

    Bianchi, Bernardo; Ferri, Andrea; Ferrari, Silvano; Copelli, Chiara; Sesenna, Enrico

    2011-04-01

    The purpose of this article was to analyze the efficacy of facelift incision, sternocleidomastoid muscle flap, and superficial musculoaponeurotic system flap for improving the esthetic results in patients undergoing partial parotidectomy for benign parotid tumor resection. The usefulness of partial parotidectomy is discussed, and a statistical evaluation of the esthetic results was performed. From January 1, 1996, to January 1, 2007, 274 patients treated for benign parotid tumors were studied. Of these, 172 underwent partial parotidectomy. The 172 patients were divided into 4 groups: partial parotidectomy with classic or modified Blair incision without reconstruction (group 1), partial parotidectomy with facelift incision and without reconstruction (group 2), partial parotidectomy with facelift incision associated with sternocleidomastoid muscle flap (group 3), and partial parotidectomy with facelift incision associated with superficial musculoaponeurotic system flap (group 4). Patients were considered, after a follow-up of at least 18 months, for functional and esthetic evaluation. The functional outcome was assessed considering the facial nerve function, Frey syndrome, and recurrence. The esthetic evaluation was performed by inviting the patients and a blind panel of 1 surgeon and 2 secretaries of the department to give a score of 1 to 10 to assess the final cosmetic outcome. The statistical analysis was finally performed using the Mann-Whitney U test for nonparametric data to compare the different group results. P less than .05 was considered significant. No recurrence developed in any of the 4 groups or in any of the 274 patients during the follow-up period. The statistical analysis, comparing group 1 and the other groups, revealed a highly significant statistical difference (P < .0001) for all groups. 
Also, when group 2 was compared with groups 3 and 4, the differences were highly statistically significant (P = .0018 for group 3 and P = .0005 for group 4). Finally, when groups 3 and 4 were compared, the difference was not statistically significant (P = .3467). Partial parotidectomy is the key step for improving esthetic results in benign parotid surgery. The evaluation of functional complications and the recurrence rate in this series of patients has confirmed that this technique can be safely used for parotid benign tumor resection. The use of a facelift incision alone led to a highly statistically significant improvement in the esthetic outcome. When the facelift incision was used with reconstructive techniques, such as the sternocleidomastoid muscle flap or the superficial musculoaponeurotic system flap, the esthetic results improved further. Finally, no statistically significant difference was found between the superficial musculoaponeurotic system flap and the sternocleidomastoid muscle flap. Copyright © 2011 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.

  2. Simulated performance of an order statistic threshold strategy for detection of narrowband signals

    NASA Technical Reports Server (NTRS)

    Satorius, E.; Brady, R.; Deich, W.; Gulkis, S.; Olsen, E.

    1988-01-01

    The application of order statistics to signal detection is becoming an increasingly active area of research. This is due to the inherent robustness of rank estimators in the presence of large outliers that would significantly degrade more conventional mean-level-based detection systems. A detection strategy is presented in which the threshold estimate is obtained using order statistics. The performance of this algorithm in the presence of simulated interference and broadband noise is evaluated. In this way, the robustness of the proposed strategy in the presence of the interference can be fully assessed as a function of the interference, noise, and detector parameters.
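The robustness argument above can be illustrated with a minimal sketch: set the detection threshold from a rank (order statistic) of a reference noise window rather than from its mean. The scale factor, window contents, and the single large interferer below are all made-up illustration values, not the paper's simulation setup:

```python
import random

def order_statistic_threshold(noise_samples, k, scale=2.0):
    """Threshold from the k-th order statistic (1-based rank) of the
    reference samples. Rank-based estimates resist large outliers that
    would inflate a mean-level-based threshold."""
    s = sorted(noise_samples)
    return scale * s[k - 1]

random.seed(1)
# Squared Gaussian samples stand in for noise power in spectral bins.
noise = [random.gauss(0.0, 1.0) ** 2 for _ in range(1000)]
noise[10] = 500.0  # one reference cell corrupted by strong interference

median_rank = len(noise) // 2
thr = order_statistic_threshold(noise, median_rank)       # rank-based
mean_thr = 2.0 * sum(noise) / len(noise)                  # mean-based, for contrast
# The single outlier shifts the mean-based threshold far more than the
# median-based one, so the rank-based detector keeps its sensitivity.
```

This is the essence of why order-statistic thresholds hold up in the presence of narrowband interferers while mean-level constant-false-alarm-rate schemes degrade.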

  3. Clinical relevance vs. statistical significance: Using neck outcomes in patients with temporomandibular disorders as an example.

    PubMed

    Armijo-Olivo, Susan; Warren, Sharon; Fuentes, Jorge; Magee, David J

    2011-12-01

    Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. The assessment of clinical relevance can facilitate the interpretation of the research results into clinical practice. The objective of this study was to explore different methods to evaluate the clinical relevance of the results using a cross-sectional study as an example comparing different neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The evaluation of clinical relevance of the results was performed based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without having clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without having statistical significance, or to have neither statistical significance nor clinical relevance. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results. Copyright © 2011 Elsevier Ltd. All rights reserved.
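One common way to operationalize the effect-size and MID criteria mentioned above is to compute a standardized mean difference (such as Cohen's d) and compare the raw difference against a minimal important difference. The neck-endurance scores and the MID value below are invented for illustration and are not the study's data:

```python
import math

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled sample SD."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical neck flexor endurance times (seconds).
controls = [31.0, 35.0, 33.0, 29.0, 34.0]
tmd      = [24.0, 30.0, 27.0, 22.0, 26.0]

d = cohens_d(controls, tmd)
MID = 5.0  # assumed minimal important difference, in seconds
mean_diff = sum(controls) / len(controls) - sum(tmd) / len(tmd)
clinically_relevant = abs(mean_diff) >= MID
```

A small P value with a mean difference below the MID would be the "statistically significant but not clinically relevant" case the abstract describes; the reverse combination is equally possible in small samples.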

  4. Determining Functional Reliability of Pyrotechnic Mechanical Devices

    NASA Technical Reports Server (NTRS)

    Bement, Laurence J.; Multhaup, Herbert A.

    1997-01-01

    This paper describes a new approach for evaluating mechanical performance and predicting the mechanical functional reliability of pyrotechnic devices. Not included are other possible failure modes, such as the initiation of the pyrotechnic energy source. The generally accepted go/no-go statistical approach, which requires hundreds or thousands of consecutive successful tests on identical components for reliability predictions, routinely ignores the physics of failure. The approach described in this paper begins with measuring, understanding, and controlling mechanical performance variables. Then, the energy required to accomplish the function is compared to that delivered by the pyrotechnic energy source to determine the mechanical functional margin. Finally, the data collected in establishing functional margin are analyzed to predict mechanical functional reliability, using small-sample statistics. A careful application of this approach can provide considerable cost savings and deeper understanding compared with go/no-go statistics. Performance and the effects of variables can be defined, and reliability predictions can be made by evaluating 20 or fewer units. The application of this approach to a pin puller used on a successful NASA mission is provided as an example.
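The functional-margin idea lends itself to a stress-strength style sketch: estimate the probability that delivered energy exceeds required energy from their means and standard deviations. The normality and independence assumptions, and all numbers below, are illustrative; the paper's actual small-sample procedure may use tolerance factors rather than this plain normal-overlap calculation:

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def functional_reliability(delivered_mean, delivered_sd,
                           required_mean, required_sd):
    """Probability that delivered energy exceeds required energy, assuming
    both are approximately normal and independent (stress-strength model).
    Margin in combined-sigma units drives the reliability estimate."""
    margin = delivered_mean - required_mean
    combined_sd = math.sqrt(delivered_sd ** 2 + required_sd ** 2)
    return norm_cdf(margin / combined_sd)

# Hypothetical energies (joules) measured on ~20 units.
r = functional_reliability(delivered_mean=12.0, delivered_sd=0.8,
                           required_mean=8.0, required_sd=0.6)
```

With a four-sigma margin as in this made-up example, the predicted functional reliability is already extremely high, which is how a handful of well-instrumented tests can substitute for thousands of go/no-go firings.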

  5. A Web-Based Learning Tool Improves Student Performance in Statistics: A Randomized Masked Trial

    ERIC Educational Resources Information Center

    Gonzalez, Jose A.; Jover, Lluis; Cobo, Erik; Munoz, Pilar

    2010-01-01

    Background: e-status is a web-based tool able to generate different statistical exercises and to provide immediate feedback to students' answers. Although the use of Information and Communication Technologies (ICTs) is becoming widespread in undergraduate education, there are few experimental studies evaluating its effects on learning. Method: All…

  6. Correlation of MRI Visual Scales with Neuropsychological Profile in Mild Cognitive Impairment of Parkinson's Disease.

    PubMed

    Vasconcellos, Luiz Felipe; Pereira, João Santos; Adachi, Marcelo; Greca, Denise; Cruz, Manuela; Malak, Ana Lara; Charchat-Fichman, Helenice; Spitz, Mariana

    2017-01-01

    Few studies have evaluated magnetic resonance imaging (MRI) visual scales in Parkinson's disease-Mild Cognitive Impairment (PD-MCI). We selected 79 PD patients and 92 controls (CO) to perform neurologic and neuropsychological evaluation. Brain MRI was performed to evaluate the following scales: Global Cortical Atrophy (GCA), Fazekas, and medial temporal atrophy (MTA). The analysis revealed that both PD groups (amnestic and nonamnestic) showed worse performance on several tests when compared to CO. Memory, executive function, and attention impairment were more severe in amnestic PD-MCI group. Overall analysis of frequency of MRI visual scales by MCI subtype did not reveal any statistically significant result. Statistically significant inverse correlation was observed between GCA scale and Mini-Mental Status Examination (MMSE), Montreal Cognitive Assessment (MoCA), semantic verbal fluency, Stroop test, figure memory test, trail making test (TMT) B, and Rey Auditory Verbal Learning Test (RAVLT). The MTA scale correlated with Stroop test and Fazekas scale with figure memory test, digit span, and Stroop test according to the subgroup evaluated. Visual scales by MRI in MCI should be evaluated by cognitive domain and might be more useful in more severely impaired MCI or dementia patients.

  7. Comparative Analysis Between Computed and Conventional Inferior Alveolar Nerve Block Techniques.

    PubMed

    Araújo, Gabriela Madeira; Barbalho, Jimmy Charles Melo; Dias, Tasiana Guedes de Souza; Santos, Thiago de Santana; Vasconcellos, Ricardo José de Holanda; de Morais, Hécio Henrique Araújo

    2015-11-01

    The aim of this randomized, double-blind, controlled trial was to compare the computed and conventional inferior alveolar nerve block techniques in symmetrically positioned inferior third molars. Both computed and conventional anesthetic techniques were performed in 29 healthy patients (58 surgeries) aged between 18 and 40 years. The anesthetic of choice was 2% lidocaine with 1:200,000 epinephrine. The Visual Analogue Scale assessed the pain variable after anesthetic infiltration. Patient satisfaction was evaluated using the Likert Scale. Heart and respiratory rates, mean time to perform the technique, and the need for additional anesthesia were also evaluated. Pain variable means were higher for the conventional technique than for the computed technique (3.45 ± 2.73 vs. 2.86 ± 1.96), but the difference was not statistically significant (P > 0.05). Patient satisfaction showed no statistically significant differences. The mean times required to perform the computed and conventional techniques were 3.85 and 1.61 minutes, respectively, a statistically significant difference (P < 0.001). The computed anesthetic technique showed lower mean pain perception, but did not differ statistically significantly from the conventional technique.

  8. Development and evaluation of statistical shape modeling for principal inner organs on torso CT images.

    PubMed

    Zhou, Xiangrong; Xu, Rui; Hara, Takeshi; Hirano, Yasushi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Hoshi, Hiroaki; Kido, Shoji; Fujita, Hiroshi

    2014-07-01

    The shapes of the inner organs are important information for medical image analysis. Statistical shape modeling provides a way of quantifying and measuring shape variations of the inner organs in different patients. In this study, we developed a universal scheme that can be used for building the statistical shape models for different inner organs efficiently. This scheme combines the traditional point distribution modeling with a group-wise optimization method based on a measure called minimum description length to provide a practical means for 3D organ shape modeling. In experiments, the proposed scheme was applied to the building of five statistical shape models for hearts, livers, spleens, and right and left kidneys by use of 50 cases of 3D torso CT images. The performance of these models was evaluated by three measures: model compactness, model generalization, and model specificity. The experimental results showed that the constructed shape models have good "compactness" and satisfactory "generalization" performance for different organ shape representations; however, the "specificity" of these models should be improved in the future.

  9. Videotape Reliability: A Method of Evaluation of a Clinical Performance Examination.

    ERIC Educational Resources Information Center

    Liu, Philip; And Others

    1980-01-01

    A method of statistically analyzing clinical performance examinations for reliability and the application of this method in determining the reliability of two examinations of skill in administering anesthesia are described. Videotaped performances for the Spinal Anesthesia Skill Examination and the Anesthesia Setup and Machine Checkout Examination…

  10. Performance assessment in a flight simulator test—Validation of a space psychology methodology

    NASA Astrophysics Data System (ADS)

    Johannes, B.; Salnitski, Vyacheslav; Soll, Henning; Rauch, Melina; Goeters, Klaus-Martin; Maschke, Peter; Stelling, Dirk; Eißfeldt, Hinnerk

    2007-02-01

    The objective assessment of operator performance in hand-controlled docking of a spacecraft on a space station has 30 years of tradition and is well established. In recent years the performance assessment was successfully combined with a psycho-physiological approach for the objective assessment of the levels of physiological arousal and psychological load. These methods are based on statistical reference data. To enhance the statistical power of the evaluation methods, both were subsequently implemented in a comparable terrestrial task: the flight simulator test of DLR in the selection procedure for ab initio pilot applicants for civil airlines. In the first evaluation study 134 male subjects were analysed. Subjects underwent a flight simulator test including three tasks, which were evaluated by instructors applying well-established and standardised rating scales. The principles of the performance algorithms of the docking training were adapted for the automated flight performance assessment. They are presented here. The increased rate of human error under instrument flight conditions without visual feedback required a manoeuvre recognition algorithm before calculating the deviation of the flown track from the given task elements. Each manoeuvre had to be evaluated independently of former failures. The expert-rated performance showed a highly significant correlation with the automatically calculated performance for each of the three tasks: r=.883, r=.874, r=.872, respectively. An automated algorithm successfully assessed the flight performance. This new method will possibly provide a wide range of other future applications in aviation and space psychology.

  11. The Student-to-Student Chemistry Initiative: Training High School Students To Perform Chemistry Demonstration Programs for Elementary School Students

    NASA Astrophysics Data System (ADS)

    Voegel, Phillip D.; Quashnock, Kathryn A.; Heil, Katrina M.

    2004-05-01

    The Student-to-Student Chemistry Initiative is an outreach program started in the fall of 2001 at Midwestern State University (MSU). The on-campus program trains high school science students to perform a series of chemistry demonstrations and subsequently provides kits containing necessary supplies and reagents for the high school students to perform demonstration programs at elementary schools. The program focuses on improving student perception of science. The program's impact on high school student perception is evaluated through statistical analysis of paired preparticipation and postparticipation surveys. The surveys focus on four areas of student perception: general attitude toward science, interest in careers in science, science awareness, and interest in attending MSU for postsecondary education. Increased scores were observed in all evaluation areas, including a statistically significant increase in science awareness following participation.

  12. Implementation of statistical process control for proteomic experiments via LC MS/MS.

    PubMed

    Bereman, Michael S; Johnson, Richard; Bollinger, James; Boss, Yuval; Shulman, Nick; MacLean, Brendan; Hoofnagle, Andrew N; MacCoss, Michael J

    2014-04-01

    Statistical process control (SPC) is a robust set of tools that aids in the visualization, detection, and identification of assignable causes of variation in any process that creates products, services, or information. A tool termed Statistical Process Control in Proteomics (SProCoP) has been developed, which implements aspects of SPC (e.g., control charts and Pareto analysis) in the Skyline proteomics software. It monitors five quality control metrics in a shotgun or targeted proteomic workflow. None of these metrics require peptide identification. The source code, written in the R statistical language, runs directly from the Skyline interface, which supports the use of raw data files from several of the mass spectrometry vendors. It provides real-time evaluation of the chromatographic performance (e.g., retention time reproducibility, peak asymmetry, and resolution) and mass spectrometric performance (targeted peptide ion intensity, and mass measurement accuracy for high resolving power instruments) via control charts. Thresholds are experiment- and instrument-specific and are determined empirically from user-defined quality control standards that enable the separation of random noise and systematic error. Finally, Pareto analysis provides a summary of performance metrics and guides the user to metrics with high variance. The utility of these charts to evaluate proteomic experiments is illustrated in two case studies.
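The control-chart logic described here is easy to sketch. SProCoP itself is R code running inside Skyline; the Python fragment below is only a generic Shewhart-style illustration, with made-up retention times for a QC peptide:

```python
import statistics

def control_limits(qc_values, n_sd=3.0):
    """Shewhart-style limits derived empirically from runs of a
    user-defined quality control standard."""
    mean = statistics.fmean(qc_values)
    sd = statistics.stdev(qc_values)
    return mean - n_sd * sd, mean + n_sd * sd

def out_of_control(series, lo, hi):
    """Indices of runs falling outside the control limits, i.e. points
    suggesting systematic error rather than random noise."""
    return [i for i, x in enumerate(series) if x < lo or x > hi]

# Hypothetical retention times (minutes) for one QC peptide across runs.
qc = [10.02, 9.98, 10.01, 10.00, 9.99, 10.03, 9.97, 10.00]
lo, hi = control_limits(qc)

new_runs = [10.01, 9.99, 10.35, 10.00]  # the third run has drifted
flagged = out_of_control(new_runs, lo, hi)  # -> [2]
```

Plotting each metric (retention time, peak asymmetry, mass accuracy, and so on) on its own chart with limits from the QC runs is exactly what separates assignable causes from ordinary run-to-run variation.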

  13. An examination of the relationships between physicians' clinical and hospital-utilization performance.

    PubMed Central

    Saywell, R M; Bean, J A; Ludke, R L; Redman, R W; McHugh, G J

    1981-01-01

    To examine the relationships between measures of attending physician teams' clinical and utilization performance, inpatient hospital audits were conducted in 22 Maryland and western Pennsylvania nonfederal short-term hospitals. A total of 6,980 medical records were abstracted from eight diagnostic categories using the Payne and JCAH PEP medical audit procedures. The results indicate weak statistical associations between the two medical care evaluation audits; between clinical performance and utilization performance, as measured by appropriateness of admissions and length of stay; and between three utilization measures. Based on these findings, it does not appear valid to use performance in one area to evaluate performance in the other in order to measure, evaluate, and ultimately improve physicians' clinical or utilization performance. PMID:6946048

  14. Performance evaluation of a hybrid-passive landfill leachate treatment system using multivariate statistical techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wallace, Jack, E-mail: jack.wallace@ce.queensu.ca; Champagne, Pascale, E-mail: champagne@civil.queensu.ca; Monnier, Anne-Charlotte, E-mail: anne-charlotte.monnier@insa-lyon.fr

    Highlights: • Performance of a hybrid passive landfill leachate treatment system was evaluated. • 33 Water chemistry parameters were sampled for 21 months and statistically analyzed. • Parameters were strongly linked and explained most (>40%) of the variation in data. • Alkalinity, ammonia, COD, heavy metals, and iron were criteria for performance. • Eight other parameters were key in modeling system dynamics and criteria. - Abstract: A pilot-scale hybrid-passive treatment system operated at the Merrick Landfill in North Bay, Ontario, Canada, treats municipal landfill leachate and provides for subsequent natural attenuation. Collected leachate is directed to a hybrid-passive treatment system, followed by controlled release to a natural attenuation zone before entering the nearby Little Sturgeon River. The study presents a comprehensive evaluation of the performance of the system using multivariate statistical techniques to determine the interactions between parameters, major pollutants in the leachate, and the biological and chemical processes occurring in the system. Five parameters (ammonia, alkalinity, chemical oxygen demand (COD), “heavy” metals of interest, with atomic weights above calcium, and iron) were set as criteria for the evaluation of system performance based on their toxicity to aquatic ecosystems and importance in treatment with respect to discharge regulations. System data for a full range of water quality parameters over a 21-month period were analyzed using principal components analysis (PCA), as well as principal components (PC) and partial least squares (PLS) regressions. PCA indicated a high degree of association for most parameters with the first PC, which explained a high percentage (>40%) of the variation in the data, suggesting strong statistical relationships among most of the parameters in the system. 
Regression analyses identified 8 parameters (set as independent variables) that were most frequently retained for modeling the five criteria parameters (set as dependent variables), on a statistically significant level: conductivity, dissolved oxygen (DO), nitrite (NO{sub 2}{sup −}), organic nitrogen (N), oxidation reduction potential (ORP), pH, sulfate and total volatile solids (TVS). The criteria parameters and the significant explanatory parameters were most important in modeling the dynamics of the passive treatment system during the study period. Such techniques and procedures were found to be highly valuable and could be applied to other sites to determine parameters of interest in similar naturalized engineered systems.
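The ">40% of variance on the first PC" finding has a simple two-variable analogue worth sketching: for two standardized variables with Pearson correlation r, the correlation matrix has eigenvalues 1 + |r| and 1 − |r|, so PC1 explains (1 + |r|)/2 of total variance. The study's PCA ran over 33 parameters; the COD and ammonia readings below are invented to illustrate the mechanism only:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical leachate readings (mg/L): COD and ammonia rise together.
cod     = [820.0, 760.0, 900.0, 610.0, 700.0, 880.0]
ammonia = [410.0, 395.0, 455.0, 300.0, 350.0, 440.0]

r = pearson_r(cod, ammonia)
# For two standardized variables, the correlation matrix eigenvalues are
# 1 + |r| and 1 - |r|; PC1's share of total variance is therefore:
pc1_share = (1.0 + abs(r)) / 2.0
```

Strongly co-varying water chemistry parameters load together on PC1 in exactly this way, which is why one component can absorb so much of the system's variation.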

  15. An Assessment of the Effectiveness of Air Force Risk Management Practices in Program Acquisition Using Survey Instrument Analysis

    DTIC Science & Technology

    2015-06-18

    Engineering Effectiveness Survey. CMU/SEI-2012-SR-009. Carnegie Mellon University. November 2012. Field, Andy. Discovering Statistics Using SPSS, 3rd...enough into the survey to begin answering questions on risk practices. All of the data statistical analysis will be performed using SPSS. Prior to...probabilistically using distributions for likelihood and impact. Statistical methods like Monte Carlo can more comprehensively evaluate the cost and

  16. Do clinical safety charts improve paramedic key performance indicator results? (A clinical improvement programme evaluation).

    PubMed

    Ebbs, Phillip; Middleton, Paul M; Bonner, Ann; Loudfoot, Allan; Elliott, Peter

    2012-07-01

    Is the Clinical Safety Chart clinical improvement programme (CIP) effective at improving paramedic key performance indicator (KPI) results within the Ambulance Service of New South Wales? The CIP intervention area was compared with the non-intervention area in order to determine whether there was a statistically significant improvement in KPI results. The CIP was associated with a statistically significant improvement in paramedic KPI results within the intervention area. The strategies used within this CIP are recommended for further consideration.

  17. Implementation of Insight Responsibilities in Process Engineering

    NASA Technical Reports Server (NTRS)

    Osborne, Deborah M.

    1997-01-01

    This report describes an approach for evaluating flight readiness (COFR) and contractor performance evaluation (award fee) as part of the insight role of NASA Process Engineering at Kennedy Space Center. Several evaluation methods are presented, including systems engineering evaluations and use of systems performance data. The transition from an oversight function to the insight function is described. The types of analytical tools appropriate for achieving the flight readiness and contractor performance evaluation goals are described and examples are provided. Special emphasis is placed upon short and small run statistical quality control techniques. Training requirements for system engineers are delineated. The approach described herein would be equally appropriate in other directorates at Kennedy Space Center.

  18. Match statistics related to winning in the group stage of 2014 Brazil FIFA World Cup.

    PubMed

    Liu, Hongyou; Gomez, Miguel-Ángel; Lago-Peñas, Carlos; Sampaio, Jaime

    2015-01-01

    Identifying match statistics that strongly contribute to winning in football matches is a very important step towards a more predictive and prescriptive performance analysis. The current study aimed to determine relationships between 24 match statistics and the match outcome (win, loss and draw) in all games and close games of the group stage of FIFA World Cup (2014, Brazil) by employing the generalised linear model. The cumulative logistic regression was run in the model taking the value of each match statistic as independent variable to predict the logarithm of the odds of winning. Relationships were assessed as effects of a two-standard-deviation increase in the value of each variable on the change in the probability of a team winning a match. Non-clinical magnitude-based inferences were employed and were evaluated by using the smallest worthwhile change. Results showed that for all the games, nine match statistics had clearly positive effects on the probability of winning (Shot, Shot on Target, Shot from Counter Attack, Shot from Inside Area, Ball Possession, Short Pass, Average Pass Streak, Aerial Advantage and Tackle), four had clearly negative effects (Shot Blocked, Cross, Dribble and Red Card), and the other 12 statistics had either trivial or unclear effects. In the close games, the effect of Aerial Advantage became trivial and that of Yellow Card became clearly negative. Information from the tactical modelling can provide a more thorough and objective match understanding to coaches and performance analysts for evaluating post-match performances and for scouting upcoming oppositions.
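The "effect of a two-standard-deviation increase" can be sketched directly from a logistic model: add twice the per-SD coefficient to the linear predictor and compare the resulting win probabilities. The intercept and coefficient below are invented for illustration, not the study's fitted values:

```python
import math

def win_probability(logit):
    """Inverse-logit: converts log-odds into a probability."""
    return 1.0 / (1.0 + math.exp(-logit))

def effect_of_two_sd(intercept, beta_per_sd):
    """Change in win probability when a (standardized) match statistic
    increases by two standard deviations, under a logistic model."""
    p0 = win_probability(intercept)
    p1 = win_probability(intercept + 2.0 * beta_per_sd)
    return p1 - p0

# Hypothetical coefficients, e.g. for Shots on Target per match.
delta = effect_of_two_sd(intercept=-0.5, beta_per_sd=0.4)
```

Expressing effects as probability changes for a 2-SD shift makes coefficients on very different scales (possession percentage vs. red cards) directly comparable, which is why the authors report effects this way.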

  19. Evaluating and implementing temporal, spatial, and spatio-temporal methods for outbreak detection in a local syndromic surveillance system

    PubMed Central

    Lall, Ramona; Levin-Rector, Alison; Sell, Jessica; Paladini, Marc; Konty, Kevin J.; Olson, Don; Weiss, Don

    2017-01-01

    The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method’s implementation, programming time, run time, and the ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System’s C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis. PMID:28886112
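Of the detectors compared above, the cumulative sum (CUSUM) control chart is the simplest to sketch. The version below is the textbook one-sided CUSUM on standardized daily counts, with invented baseline parameters and an injected outbreak; it is not the authors' exact configuration:

```python
def cusum_detect(series, mean, sd, k=0.5, h=5.0):
    """One-sided CUSUM: accumulate standardized excursions above the
    baseline (less a slack k) and raise an alarm when the running sum
    crosses the decision threshold h. Resets after each alarm."""
    s = 0.0
    alarms = []
    for i, x in enumerate(series):
        z = (x - mean) / sd
        s = max(0.0, s + z - k)
        if s > h:
            alarms.append(i)
            s = 0.0
    return alarms

# Hypothetical daily syndromic counts; an outbreak is injected at index 5.
baseline_mean, baseline_sd = 50.0, 5.0
counts = [51, 48, 52, 49, 50, 70, 72, 75, 71, 69]
alarms = cusum_detect(counts, baseline_mean, baseline_sd)
```

The tradeoff the study measures is visible even here: small values of k and h catch outbreaks sooner (better timeliness) but generate more false alarms on ordinary day-to-day noise (worse specificity).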

  20. Statistical approach for selection of biologically informative genes.

    PubMed

    Das, Samarendra; Rai, Anil; Mishra, D C; Rai, Shesh N

    2018-05-20

    Selection of informative genes from high-dimensional gene expression data has emerged as an important research area in genomics. Most gene selection techniques proposed so far are based on either a relevancy or a redundancy measure. Further, the performance of these techniques has typically been judged through post-selection classification accuracy computed by a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, i.e. Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biologically sufficient criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and a biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique against 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biologically relevant (based on QTL and GO) criteria under a multiple criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique was also found to be quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple criteria decision-making setup, the proposed technique is best for informative gene selection over the available alternatives. Based on the proposed approach, an R package, BootMRMR, has been developed and is available at https://cran.r-project.org/web/packages/BootMRMR. 
This study will provide a practical guide to select statistical techniques for selecting informative genes from high dimensional expression data for breeding and system biology studies. Published by Elsevier B.V.
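    The composite relevance-minus-redundancy idea described above can be sketched as follows. This is a hypothetical, simplified illustration (a plain mean-difference relevancy score, absolute-correlation redundancy, and bootstrap "win" counting); it is not the authors' exact Boot-MRMR algorithm or their BootMRMR R package.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation; returns 0.0 for a zero-variance vector."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def relevance(expr, labels):
    """Absolute two-class mean difference scaled by the overall SD."""
    g0 = [x for x, y in zip(expr, labels) if y == 0]
    g1 = [x for x, y in zip(expr, labels) if y == 1]
    spread = statistics.pstdev(g0 + g1) or 1e-12
    return abs(statistics.mean(g1) - statistics.mean(g0)) / spread

def redundancy(expr, others):
    """Mean absolute correlation with the other candidate genes."""
    if not others:
        return 0.0
    return statistics.mean(abs(pearson(expr, o)) for o in others)

def bootstrap_mrmr_rank(genes, labels, n_boot=50, seed=0):
    """Rank genes by how often their (relevance - redundancy) score
    wins across bootstrap resamples of the subjects."""
    rng = random.Random(seed)
    names = list(genes)
    wins = {g: 0 for g in names}
    n = len(labels)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [labels[i] for i in idx]
        if len(set(yb)) < 2:  # need both classes in the resample
            continue
        scores = {
            g: relevance([genes[g][i] for i in idx], yb)
               - redundancy([genes[g][i] for i in idx],
                            [[genes[h][i] for i in idx] for h in names if h != g])
            for g in names
        }
        wins[max(scores, key=scores.get)] += 1
    return sorted(names, key=lambda g: -wins[g])
```

    A gene whose expression tracks the class labels should dominate the bootstrap wins, while constant or noise genes fall to the bottom of the ranking.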

  1. Evaluating and implementing temporal, spatial, and spatio-temporal methods for outbreak detection in a local syndromic surveillance system.

    PubMed

    Mathes, Robert W; Lall, Ramona; Levin-Rector, Alison; Sell, Jessica; Paladini, Marc; Konty, Kevin J; Olson, Don; Weiss, Don

    2017-01-01

    The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method's implementation, programming time, run time, and ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System's C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis.
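    As a concrete illustration, a minimal version of an EARS C2-style detector of the kind mentioned above (the NYC system used a modified C2) might look like this. The 7-day baseline, 2-day lag, and 3-sigma threshold are common parameterizations assumed here, not values taken from the paper.

```python
import statistics

def ears_c2(counts, baseline=7, lag=2, threshold=3.0):
    """Flag days whose count exceeds the baseline mean by more than
    `threshold` standard deviations; the baseline is the `baseline`-day
    window ending `lag` days before the day under test."""
    alerts = []
    for t in range(baseline + lag, len(counts)):
        window = counts[t - lag - baseline : t - lag]
        mu = statistics.mean(window)
        sd = statistics.pstdev(window) or 1.0  # guard against a flat baseline
        if (counts[t] - mu) / sd > threshold:
            alerts.append(t)
    return alerts
```

    On a flat series with a single spike, only the spike day is flagged; once the spike enters the baseline window, later normal days stay below threshold.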

  2. 40 CFR 610.10 - Program purpose.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...

  3. 40 CFR 610.10 - Program purpose.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...

  4. 40 CFR 610.10 - Program purpose.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...

  5. 40 CFR 610.10 - Program purpose.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...

  6. A statistical, task-based evaluation method for three-dimensional x-ray breast imaging systems using variable-background phantoms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Park, Subok; Jennings, Robert; Liu, Haimo

    Purpose: For the last few years, development and optimization of three-dimensional (3D) x-ray breast imaging systems, such as digital breast tomosynthesis (DBT) and computed tomography, have drawn much attention from the medical imaging community, in both academia and industry. However, there is still much room for understanding how best to optimize and evaluate the devices over a large space of many different system parameters and geometries. Current evaluation methods, which work well for 2D systems, do not incorporate the depth information from the 3D imaging systems. Therefore, it is critical to develop a statistically sound evaluation method to investigate the usefulness of including depth and background-variability information in the assessment and optimization of the 3D systems. Methods: In this paper, we present a mathematical framework for a statistical assessment of planar and 3D x-ray breast imaging systems. Our method is based on statistical decision theory, in particular, making use of the ideal linear observer called the Hotelling observer. We also present a physical phantom that consists of spheres of different sizes and materials for producing an ensemble of randomly varying backgrounds to be imaged for a given patient class. Lastly, we demonstrate our evaluation method by comparing laboratory mammography and three-angle DBT systems for signal detection tasks using the phantom's projection data. We compare the variable phantom case to that of a phantom of the same dimensions filled with water, which we call the uniform phantom, based on the performance of the Hotelling observer as a function of signal size and intensity. Results: Detectability trends calculated using the variable and uniform phantom methods are different from each other for both mammography and DBT systems. 
    Conclusions: Our results indicate that measuring the system's detection performance with consideration of background variability may lead to differences in system performance estimates and comparisons. For the assessment of 3D systems, to accurately determine trade-offs between image quality and radiation dose, it is critical to incorporate randomness arising from the imaging chain, including background variability, into system performance calculations.

  7. Evaluation of Methods Used for Estimating Selected Streamflow Statistics, and Flood Frequency and Magnitude, for Small Basins in North Coastal California

    USGS Publications Warehouse

    Mann, Michael P.; Rizzardo, Jule; Satkowski, Richard

    2004-01-01

    Accurate streamflow statistics are essential to water resource agencies involved in both science and decision-making. When long-term streamflow data are lacking at a site, estimation techniques are often employed to generate streamflow statistics. However, procedures for accurately estimating streamflow statistics often are lacking, and when estimation procedures are developed, they often are not evaluated properly before being applied. Use of unevaluated or underevaluated flow-statistic estimation techniques can result in improper water-resources decision-making. The California State Water Resources Control Board (SWRCB) uses two key techniques, a modified rational equation and drainage-basin area-ratio transfer, to estimate streamflow statistics at ungaged locations. These techniques have been implemented to varying degrees, but have not been formally evaluated. For estimating peak flows at the 2-, 5-, 10-, 25-, 50-, and 100-year recurrence intervals, the SWRCB uses the U.S. Geological Survey's (USGS) regional peak-flow equations. In this study, done cooperatively by the USGS and SWRCB, the SWRCB estimated several flow statistics at 40 USGS streamflow gaging stations in the north coast region of California. The SWRCB estimates were made without reference to USGS flow data. The USGS used the streamflow data from the 40 stations to generate flow statistics that could be compared with the SWRCB estimates for accuracy. While some SWRCB estimates compared favorably with the USGS statistics, results were subject to varying degrees of error over the region. Flow-based estimation techniques generally performed better than rain-based methods, especially for estimation of December 15 to March 31 mean daily flows. The USGS peak-flow equations also performed well, but tended to underestimate peak flows. The USGS equations performed within reported error bounds, but will require updating in the future as peak-flow data sets grow larger. 
Little correlation was discovered between estimation errors and geographic locations or various basin characteristics. However, for 25-percentile year mean-daily-flow estimates for December 15 to March 31, the greatest estimation errors were at east San Francisco Bay area stations with mean annual precipitation less than or equal to 30 inches, and estimated 2-year/24-hour rainfall intensity less than 3 inches.
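    The drainage-basin area-ratio transfer mentioned above reduces to a one-line scaling rule: a flow statistic at an ungaged site is the gaged value scaled by the ratio of drainage areas raised to an exponent. The default exponent of 1.0 below is a hypothetical illustration; the SWRCB's exact formulation is not given in the abstract.

```python
def area_ratio_transfer(q_gaged, area_gaged, area_ungaged, exponent=1.0):
    """Transfer a flow statistic from a gaged basin to a nearby ungaged
    one by scaling with the drainage-area ratio: Qu = Qg * (Au/Ag)**b."""
    return q_gaged * (area_ungaged / area_gaged) ** exponent
```

    With an exponent of 1.0, a basin half the size of the gaged basin is assigned half the gaged flow statistic; regional studies often fit the exponent rather than assuming it.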

  8. AIDS Clinical Trials Group Network

    MedlinePlus


  9. A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data.

    PubMed

    Adams, Dean C

    2014-09-01

    Phylogenetic signal is the tendency for closely related species to display similar trait values due to their common ancestry. Several methods have been developed for quantifying phylogenetic signal in univariate traits and for sets of traits treated simultaneously, and the statistical properties of these approaches have been extensively studied. However, methods for assessing phylogenetic signal in high-dimensional multivariate traits like shape are less well developed, and their statistical performance is not well characterized. In this article, I describe a generalization of the K statistic of Blomberg et al. that is useful for quantifying and evaluating phylogenetic signal in high-dimensional multivariate data. The method (K(mult)) is derived from the equivalence between statistical methods based on covariance matrices and those based on distance matrices. Using computer simulations based on Brownian motion, I demonstrate that the expected value of K(mult) remains at 1.0 as trait variation among species is increased or decreased, and as the number of trait dimensions is increased. By contrast, estimates of phylogenetic signal found with a squared-change parsimony procedure for multivariate data change with increasing trait variation among species and with increasing numbers of trait dimensions, confounding biological interpretations. I also evaluate the statistical performance of hypothesis testing procedures based on K(mult) and find that the method displays appropriate Type I error and high statistical power for detecting phylogenetic signal in high-dimensional data. Statistical properties of K(mult) were consistent for simulations using bifurcating and random phylogenies, for simulations using different numbers of species, for simulations that varied the number of trait dimensions, and for different underlying models of trait covariance structure. 
    Overall, these findings demonstrate that K(mult) provides a useful means of evaluating phylogenetic signal in high-dimensional multivariate traits. Finally, I illustrate the utility of the new approach by evaluating the strength of phylogenetic signal for head shape in a lineage of Plethodon salamanders. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  10. Reproducible detection of disease-associated markers from gene expression data.

    PubMed

    Omae, Katsuhiro; Komori, Osamu; Eguchi, Shinto

    2016-08-18

    Detection of disease-associated markers plays a crucial role in gene screening for biological studies. Two-sample test statistics, such as the t-statistic, are widely used to rank genes based on gene expression data. However, the resultant gene ranking is often not reproducible among different data sets. Such irreproducibility may be caused by disease heterogeneity. When we divided data into two subsets, we found that the signs of the two t-statistics were often reversed. Focusing on such instability, we propose a sign-sum statistic that counts the signs of the t-statistics over all possible subsets. The proposed method excludes genes affected by heterogeneity, thereby improving the reproducibility of gene ranking. We compared the sign-sum statistic with the t-statistic by a theoretical evaluation of the upper confidence limit. Through simulations and applications to real data sets, we show that the sign-sum statistic exhibits superior performance. We derive the sign-sum statistic to obtain a robust gene ranking, and it gives a more reproducible ranking than the t-statistic. Using simulated data sets, we show that the sign-sum statistic excludes heterogeneity-affected genes well. For the real data sets, too, the sign-sum statistic performs well from the viewpoint of ranking reproducibility.
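    A hedged sketch of the sign-sum idea: the paper counts t-statistic signs over all possible subsets, which is combinatorially expensive, so this illustration uses random subsets instead. The subset fraction, subset count, and Welch t-statistic are arbitrary choices for the sketch, not the authors' exact construction.

```python
import math
import random

def t_stat(x, y):
    """Welch two-sample t-statistic."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((a - mx) ** 2 for a in x) / (nx - 1)
    vy = sum((b - my) ** 2 for b in y) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

def sign_sum(case, control, n_subsets=200, frac=0.6, seed=1):
    """Sum of the signs of the t-statistic over random sample subsets;
    a total near +/- n_subsets indicates a stable (reproducible)
    direction of differential expression."""
    rng = random.Random(seed)
    k_case = max(2, int(frac * len(case)))
    k_ctrl = max(2, int(frac * len(control)))
    total = 0
    for _ in range(n_subsets):
        t = t_stat(rng.sample(case, k_case), rng.sample(control, k_ctrl))
        total += 1 if t > 0 else (-1 if t < 0 else 0)
    return total
```

    A gene whose effect flips direction across patient subsets accumulates cancelling signs and is pushed down the ranking, which is the stability property the abstract describes.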

  11. Performance of Bootstrapping Approaches To Model Test Statistics and Parameter Standard Error Estimation in Structural Equation Modeling.

    ERIC Educational Resources Information Center

    Nevitt, Jonathan; Hancock, Gregory R.

    2001-01-01

    Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…

  12. An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests

    ERIC Educational Resources Information Center

    Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.

    2013-01-01

    Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G[squared],…

  13. Statistical Classification for Cognitive Diagnostic Assessment: An Artificial Neural Network Approach

    ERIC Educational Resources Information Center

    Cui, Ying; Gierl, Mark; Guo, Qi

    2016-01-01

    The purpose of the current investigation was to describe how the artificial neural networks (ANNs) can be used to interpret student performance on cognitive diagnostic assessments (CDAs) and evaluate the performances of ANNs using simulation results. CDAs are designed to measure student performance on problem-solving tasks and provide useful…

  14. Statistical properties of a utility measure of observer performance compared to area under the ROC curve

    NASA Astrophysics Data System (ADS)

    Abbey, Craig K.; Samuelson, Frank W.; Gallas, Brandon D.; Boone, John M.; Niklason, Loren T.

    2013-03-01

    The receiver operating characteristic (ROC) curve has become a common tool for evaluating diagnostic imaging technologies, and the primary endpoint of such evaluations is the area under the curve (AUC), which integrates sensitivity over the entire false positive range. An alternative figure of merit for ROC studies is expected utility (EU), which focuses on the relevant region of the ROC curve as defined by disease prevalence and the relative utility of the task. However, if this measure is to be used, it must also have desirable statistical properties to keep the burden of observer performance studies as low as possible. Here, we evaluate effect size and variability for EU and AUC. We use two observer performance studies recently submitted to the FDA to compare the EU and AUC endpoints. The studies were conducted using the multi-reader multi-case methodology in which all readers score all cases in all modalities. ROC curves from the study were used to generate both the AUC and EU values for each reader and modality. The EU measure was computed assuming an iso-utility slope of 1.03. We find mean effect sizes, the reader-averaged difference between modalities, to be roughly 2.0 times as large for EU as for AUC. The standard deviation across readers is roughly 1.4 times as large, suggesting better statistical properties for the EU endpoint. In a simple power analysis of paired comparisons across readers, the utility measure required 36% fewer readers on average to achieve 80% statistical power compared to AUC.
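    One simplified reading of the contrast between the two endpoints: EU rewards the best achievable TPF − β·FPF operating point on the empirical ROC curve (with β the iso-utility slope, 1.03 in the study), while AUC integrates over the whole curve. This is an illustrative sketch of that idea, not the exact estimator used in the paper.

```python
def expected_utility(roc_points, beta=1.03):
    """Simplified expected-utility figure of merit: the best value of
    TPF - beta*FPF over the empirical ROC operating points, where beta
    is the iso-utility slope set by prevalence and the cost/benefit
    ratio of the task."""
    return max(tpf - beta * fpf for fpf, tpf in roc_points)

def auc_trapezoid(roc_points):
    """Trapezoidal AUC over (FPF, TPF) points; endpoints included."""
    pts = sorted(roc_points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

    Note that EU depends only on the curve near its optimal operating point, which is why it can be more sensitive than AUC to modality differences in the clinically relevant region.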

  15. A statistical evaluation and comparison of VISSR Atmospheric Sounder (VAS) data

    NASA Technical Reports Server (NTRS)

    Jedlovec, G. J.

    1984-01-01

    In order to account for the temporal and spatial discrepancies between the VAS and rawinsonde soundings, the rawinsonde data were adjusted to a common hour of release such that the new observation time corresponded to the satellite scan time. Both the satellite and rawinsonde observations of the basic atmospheric parameters (T, Td, and Z) were objectively analyzed to a uniform grid, maintaining the same mesoscale structure in each data set. The performance of each retrieval algorithm in producing accurate and representative soundings was evaluated using statistical parameters such as the mean, standard deviation, and root mean square of the difference fields for each parameter and grid level. Horizontal structure was also qualitatively evaluated by examining atmospheric features on constant pressure surfaces. An analysis of the vertical structure of the atmosphere was also performed by examining colocated and grid-mean vertical profiles of both the satellite and rawinsonde data sets. Highlights of these results are presented.
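    The difference-field summary statistics named above (mean, standard deviation, and root mean square of the differences) can be computed directly from paired fields; for a paired field, RMS² = bias² + SD² when the population SD is used.

```python
import math
import statistics

def difference_stats(satellite, rawinsonde):
    """Mean (bias), standard deviation, and RMS of a difference field
    formed from paired satellite and rawinsonde values."""
    diffs = [s - r for s, r in zip(satellite, rawinsonde)]
    bias = statistics.mean(diffs)
    sd = statistics.pstdev(diffs)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, sd, rms
```
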

  16. Using statistical text classification to identify health information technology incidents

    PubMed Central

    Chai, Kevin E K; Anthony, Stephen; Coiera, Enrico; Magrabi, Farah

    2013-01-01

    Objective: To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the US Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. Design: We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both ‘balanced’ (50% HIT) and ‘stratified’ (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. Measurements: κ statistic, F1 score, precision, and recall. Results: Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). Conclusions: Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation. PMID:23666777
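    The evaluation measures listed (precision, recall, F1, and the κ statistic) all follow from the binary confusion matrix; a minimal sketch:

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, F1, and Cohen's kappa from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    p_obs = (tp + tn) / n                       # observed agreement
    p_exp = ((tp + fp) * (tp + fn)              # chance agreement
             + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else 0.0
    return precision, recall, f1, kappa
```

    On a severely imbalanced dataset like the stratified MAUDE subset, κ and F1 are more informative than raw accuracy, which is why the study reports them.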

  17. [Treatment of proximal humeral fractures by reverse shoulder arthroplasty: mid-term evaluation of functional results and Notching].

    PubMed

    Hernández-Elena, J; de la Red-Gallego, M Á; Garcés-Zarzalejo, C; Pascual-Carra, M A; Pérez-Aguilar, M D; Rodríguez-López, T; Alfonso-Fernández, A; Pérez-Núñez, M I

    2015-01-01

    An analysis was made of the relationship between notching and functional and radiographic parameters after treatment of acute proximal humeral fractures with reverse total shoulder arthroplasty. A retrospective evaluation was performed on 37 patients with acute proximal humeral fracture treated by reverse shoulder arthroplasty. The mean follow-up was 24 months. Range of motion and intraoperative and postoperative complications were recorded. Nerot's classification was used to evaluate notching, and patient satisfaction was evaluated with the Constant Score (CS). Statistical analysis was performed to evaluate the relationship between notching and glenosphere position, as well as functional outcomes. Mean range of elevation, abduction, external and internal rotation was 106.22°, 104.46°, 46.08°, and 40.27°, respectively. Mean CS was 63. Notching was present at 12 months in 29% of patients. Statistical analysis showed significant associations between age and CS, between age and notching development, and between tilt and notching. No statistically significant associations were found between elevation, abduction, internal or external rotation, or CS and either the scapular or the glenosphere-neck angle. Reverse shoulder arthroplasty is a valuable option for acute humeral fractures in patients with osteoporosis and cuff-tear arthropathy. It leads to early pain relief and shoulder motion. Nevertheless, it is not exempt from complications, and long-term studies are needed to determine the importance of notching. Copyright © 2014 SECOT. Published by Elsevier Espana. All rights reserved.

  18. Precipitation forecast using artificial neural networks. An application to the Guadalupe Valley, Baja California, Mexico

    NASA Astrophysics Data System (ADS)

    Herrera-Oliva, C. S.

    2013-05-01

    In this work we design and implement a method for precipitation forecasting through the application of an elementary neural network (perceptron) to the statistical analysis of the precipitation reported in catalogues. The method is limited mainly by the catalogue length (and, to a smaller degree, by its accuracy). The method's performance is measured using grading functions that evaluate a tradeoff between positive and negative aspects of performance. The method is applied to the Guadalupe Valley, Baja California, Mexico, using consecutive intervals of dt = 0.1 year and employing data from several climatological stations situated in and around this important wine-industry zone. We evaluated the performance of different ANN models whose input variables are the precipitation depths. The results obtained were satisfactory, except for exceptional rainfall values. Key words: precipitation forecast, artificial neural networks, statistical analysis
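    An elementary perceptron of the kind this abstract describes can be sketched in a few lines; the learning rate, epoch count, and binary thresholding below are illustrative defaults, not the authors' configuration or their grading functions.

```python
import random

def train_perceptron(samples, labels, epochs=20, lr=0.1, seed=0):
    """Classic perceptron rule: predict 1 when w.x + b > 0, and nudge
    the weights toward each misclassified sample."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            pred = 1 if sum(wj * xj for wj, xj in zip(w, samples[i])) + b > 0 else 0
            err = labels[i] - pred  # -1, 0, or +1
            if err:
                w = [wj + lr * err * xj for wj, xj in zip(w, samples[i])]
                b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0
```

    On linearly separable data the perceptron converges to a separating hyperplane, which is consistent with the abstract's observation that the model struggles only on exceptional (outlier) rainfall values.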

  19. How to Perform a Systematic Review and Meta-analysis of Diagnostic Imaging Studies.

    PubMed

    Cronin, Paul; Kelly, Aine Marie; Altaee, Duaa; Foerster, Bradley; Petrou, Myria; Dwamena, Ben A

    2018-05-01

    A systematic review is a comprehensive search, critical evaluation, and synthesis of all the relevant studies on a specific (clinical) topic that can be applied to the evaluation of diagnostic and screening imaging studies. It can be a qualitative or a quantitative (meta-analysis) review of available literature. A meta-analysis uses statistical methods to combine and summarize the results of several studies. In this review, a 12-step approach to performing a systematic review (and meta-analysis) is outlined under the four domains: (1) Problem Formulation and Data Acquisition, (2) Quality Appraisal of Eligible Studies, (3) Statistical Analysis of Quantitative Data, and (4) Clinical Interpretation of the Evidence. This review is specifically geared toward the performance of a systematic review and meta-analysis of diagnostic test accuracy (imaging) studies. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

  20. Simulator evaluation of the effects of reduced spoiler and thrust authority on a decoupled longitudinal control system during landings in wind shear

    NASA Technical Reports Server (NTRS)

    Miller, G. K., Jr.

    1981-01-01

    The effect of reduced control authority, both in symmetric spoiler travel and thrust level, on the effectiveness of a decoupled longitudinal control system was examined during the approach and landing of the NASA terminal configured vehicle (TCV) aft flight deck simulator in the presence of wind shear. The evaluation was conducted in a fixed-base simulator that represented the TCV aft cockpit. There were no statistically significant effects of reduced spoiler and thrust authority on pilot performance during approach and landing. Increased wind severity degraded approach and landing performance by an amount that was often significant. However, every attempted landing was completed safely regardless of the wind severity. There were statistically significant differences in performance between subjects, but the differences were generally restricted to the control wheel and control-column activity during the approach.

  1. Appraisal of within- and between-laboratory reproducibility of non-radioisotopic local lymph node assay using flow cytometry, LLNA:BrdU-FCM: comparison of OECD TG429 performance standard and statistical evaluation.

    PubMed

    Yang, Hyeri; Na, Jihye; Jang, Won-Hee; Jung, Mi-Sook; Jeon, Jun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Lim, Kyung-Min; Bae, SeungJin

    2015-05-05

    The mouse local lymph node assay (LLNA, OECD TG429) is an alternative test replacing conventional guinea pig tests (OECD TG406) for skin sensitization, but its use of a radioisotopic agent, (3)H-thymidine, deters its active dissemination. A new non-radioisotopic LLNA, LLNA:BrdU-FCM, employs a non-radioisotopic analog, 5-bromo-2'-deoxyuridine (BrdU), and flow cytometry. For an analogous method, the OECD TG429 performance standard (PS) advises that two reference compounds be tested repeatedly and that the ECt (threshold) values obtained fall within acceptable ranges to prove within- and between-laboratory reproducibility. However, these criteria are somewhat arbitrary, and the sample size for ECt is less than 5, raising concerns about insufficient reliability. Here, we explored various statistical methods to evaluate the reproducibility of LLNA:BrdU-FCM with the stimulation index (SI), the raw data for ECt calculation, produced by 3 laboratories. Descriptive statistics along with graphical representation of SI were presented. For inferential statistics, parametric and non-parametric methods were applied to test the reproducibility of the SI of a concurrent positive control, and the robustness of the results was investigated. Descriptive statistics and graphical representation of SI alone could illustrate the within- and between-laboratory reproducibility. Inferential statistics employing parametric and nonparametric methods drew similar conclusions. While all labs passed the within- and between-laboratory reproducibility criteria given by the OECD TG429 PS based on ECt values, statistical evaluation based on SI values showed that only two labs succeeded in achieving within-laboratory reproducibility. For the two labs that satisfied within-lab reproducibility, between-laboratory reproducibility could also be attained based on inferential as well as descriptive statistics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  2. Statistical and Machine Learning forecasting methods: Concerns and ways forward

    PubMed Central

    Makridakis, Spyros; Assimakopoulos, Vassilios

    2018-01-01

    Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions. PMID:29584784
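    One of the accuracy measures used in the M competitions referenced above is the symmetric MAPE; a minimal implementation (the paper also uses other measures not shown here):

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent:
    mean of 2*|F - A| / (|A| + |F|) over the forecast horizon."""
    return 100.0 * sum(
        2.0 * abs(f - a) / (abs(a) + abs(f))
        for a, f in zip(actual, forecast)
    ) / len(actual)
```

    Unlike plain MAPE, sMAPE bounds each term at 200% and treats over- and under-forecasts more symmetrically, which matters when comparing methods across the 1045 heterogeneous M3 series.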

  3. An empirical evaluation of genetic distance statistics using microsatellite data from bear (Ursidae) populations.

    PubMed

    Paetkau, D; Waits, L P; Clarkson, P L; Craighead, L; Strobeck, C

    1997-12-01

    A large microsatellite data set from three species of bear (Ursidae) was used to empirically test the performance of six genetic distance measures in resolving relationships at a variety of scales ranging from adjacent areas in a continuous distribution to species that diverged several million years ago. At the finest scale, while some distance measures performed extremely well, statistics developed specifically to accommodate the mutational processes of microsatellites performed relatively poorly, presumably because of the relatively higher variance of these statistics. At the other extreme, no statistic was able to resolve the close sister relationship of polar bears and brown bears from more distantly related pairs of species. This failure is most likely due to constraints on allele distributions at microsatellite loci. At intermediate scales, both within continuous distributions and in comparisons to insular populations of late Pleistocene origin, it was not possible to define the point where linearity was lost for each of the statistics, except that it is clearly lost after relatively short periods of independent evolution. All of the statistics were affected by the amount of genetic diversity within the populations being compared, significantly complicating the interpretation of genetic distance data.

  4. An Empirical Evaluation of Genetic Distance Statistics Using Microsatellite Data from Bear (Ursidae) Populations

    PubMed Central

    Paetkau, D.; Waits, L. P.; Clarkson, P. L.; Craighead, L.; Strobeck, C.

    1997-01-01

    A large microsatellite data set from three species of bear (Ursidae) was used to empirically test the performance of six genetic distance measures in resolving relationships at a variety of scales ranging from adjacent areas in a continuous distribution to species that diverged several million years ago. At the finest scale, while some distance measures performed extremely well, statistics developed specifically to accommodate the mutational processes of microsatellites performed relatively poorly, presumably because of the relatively higher variance of these statistics. At the other extreme, no statistic was able to resolve the close sister relationship of polar bears and brown bears from more distantly related pairs of species. This failure is most likely due to constraints on allele distributions at microsatellite loci. At intermediate scales, both within continuous distributions and in comparisons to insular populations of late Pleistocene origin, it was not possible to define the point where linearity was lost for each of the statistics, except that it is clearly lost after relatively short periods of independent evolution. All of the statistics were affected by the amount of genetic diversity within the populations being compared, significantly complicating the interpretation of genetic distance data. PMID:9409849

  5. Correcting Too Much or Too Little? The Performance of Three Chi-Square Corrections.

    PubMed

    Foldnes, Njål; Olsson, Ulf Henning

    2015-01-01

This simulation study investigates the performance of three test statistics, T1, T2, and T3, used to evaluate structural equation model fit under nonnormal data conditions. T1 is the well-known mean-adjusted statistic of Satorra and Bentler. T2 is the mean-and-variance adjusted statistic of the Satterthwaite type, in which the degrees of freedom are adjusted. T3 is a recently proposed version of T2 that does not adjust the degrees of freedom. Discrepancies between these statistics and their nominal chi-square distribution, in terms of Type I and Type II errors, are investigated. All statistics are shown to be sensitive to increasing kurtosis in the data, with Type I error rates often far from the nominal level. Under excess kurtosis, true models are generally over-rejected by T1 and under-rejected by T2 and T3, which perform similarly in all conditions. Under misspecification there is a loss of power with increasing kurtosis, especially for T2 and T3. The coefficient of variation of the nonzero eigenvalues of a certain matrix is shown to be a reliable indicator of the adequacy of these statistics.
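The generic shape of these corrections can be sketched as follows. This is an illustrative sketch, not the authors' code: the mean-adjusted statistic divides the ML statistic by a scaling factor, while the mean-and-variance adjusted statistic replaces the model degrees of freedom with a Satterthwaite-type quantity. The `eigenvalues` argument stands in for the nonzero eigenvalues of the matrix referred to in the abstract; all inputs here are hypothetical.

```python
import numpy as np

def adjusted_statistics(T_ml, eigenvalues, df):
    """Sketch of mean-adjusted (T1-style) and mean-and-variance adjusted
    (T2-style) chi-square corrections.  Inputs are hypothetical:
    T_ml        -- maximum-likelihood test statistic
    eigenvalues -- nonzero eigenvalues of the relevant weight matrix
    df          -- model degrees of freedom
    """
    lam = np.asarray(eigenvalues, dtype=float)
    c = lam.sum() / df                           # mean scaling factor
    T1 = T_ml / c                                # mean-adjusted statistic
    d_star = lam.sum() ** 2 / (lam ** 2).sum()   # Satterthwaite degrees of freedom
    T2 = T_ml * d_star / lam.sum()               # mean-and-variance adjusted
    cv = lam.std() / lam.mean()                  # eigenvalue coefficient of variation
    return T1, T2, d_star, cv
```

When all eigenvalues are equal (e.g., under normality), both corrections leave the statistic unchanged and the coefficient of variation is zero, which is consistent with the abstract's use of that coefficient as an adequacy indicator.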

  6. Nursing students' attitudes toward statistics: Effect of a biostatistics course and association with examination performance.

    PubMed

    Kiekkas, Panagiotis; Panagiotarou, Aliki; Malja, Alvaro; Tahirai, Daniela; Zykai, Rountina; Bakalis, Nick; Stefanopoulos, Nikolaos

    2015-12-01

    Although statistical knowledge and skills are necessary for promoting evidence-based practice, health sciences students have expressed anxiety about statistics courses, which may hinder their learning of statistical concepts. To evaluate the effects of a biostatistics course on nursing students' attitudes toward statistics and to explore the association between these attitudes and their performance in the course examination. One-group quasi-experimental pre-test/post-test design. Undergraduate nursing students of the fifth or higher semester of studies, who attended a biostatistics course. Participants were asked to complete the pre-test and post-test forms of The Survey of Attitudes Toward Statistics (SATS)-36 scale at the beginning and end of the course respectively. Pre-test and post-test scale scores were compared, while correlations between post-test scores and participants' examination performance were estimated. Among 156 participants, post-test scores of the overall SATS-36 scale and of the Affect, Cognitive Competence, Interest and Effort components were significantly higher than pre-test ones, indicating that the course was followed by more positive attitudes toward statistics. Among 104 students who participated in the examination, higher post-test scores of the overall SATS-36 scale and of the Affect, Difficulty, Interest and Effort components were significantly but weakly correlated with higher examination performance. Students' attitudes toward statistics can be improved through appropriate biostatistics courses, while positive attitudes contribute to higher course achievements and possibly to improved statistical skills in later professional life. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Evaluating Educational Programs. ETS R&D Scientific and Policy Contributions Series. ETS SPC-11-01. ETS Research Report No. RR-11-15

    ERIC Educational Resources Information Center

    Ball, Samuel

    2011-01-01

    Since its founding in 1947, ETS has conducted a significant and wide-ranging research program that has focused on, among other things, psychometric and statistical methodology; educational evaluation; performance assessment and scoring; large-scale assessment and evaluation; cognitive, developmental, personality, and social psychology; and…

  8. Statistical and Hydrological evaluation of precipitation forecasts from IMD MME and ECMWF numerical weather forecasts for Indian River basins

    NASA Astrophysics Data System (ADS)

    Mohite, A. R.; Beria, H.; Behera, A. K.; Chatterjee, C.; Singh, R.

    2016-12-01

Flood forecasting using hydrological models is an important and cost-effective non-structural flood management measure. For forecasting at short lead times, empirical models using real-time precipitation estimates have proven to be reliable; however, their skill depreciates with increasing lead time. Coupling a hydrologic model with real-time rainfall forecasts issued from numerical weather prediction (NWP) systems could increase the lead time substantially. In this study, we compared 1-5 day precipitation forecasts from the India Meteorological Department (IMD) Multi-Model Ensemble (MME) with European Centre for Medium-Range Weather Forecasts (ECMWF) NWP forecasts over 86 major river basins in India. We then evaluated the hydrologic utility of these forecasts over the Basantpur catchment (approx. 59,000 km2) of the Mahanadi River basin. Coupled MIKE 11 RR (NAM) and MIKE 11 hydrodynamic (HD) models were used for the development of a flood forecast system (FFS). The RR model was calibrated using IMD station rainfall data, and cross-sections extracted from SRTM 30 were used as input to the MIKE 11 HD model. IMD began issuing operational MME forecasts in 2008; hence, both the statistical and hydrologic evaluations were carried out for 2008-2014. The performance of the FFS was evaluated using both NWP datasets separately for the year 2011, a large flood year in the Mahanadi River basin. We will present figures and metrics for the statistical evaluation (threshold-based statistics, skill in terms of correlation and bias) and the hydrologic evaluation (Nash-Sutcliffe efficiency, mean and peak error statistics). The statistical evaluation will be at the pan-India scale for all major river basins, while the hydrologic evaluation will be for the Basantpur catchment of the Mahanadi River basin.

  9. Evaluation and comparison of statistical methods for early temporal detection of outbreaks: A simulation-based study

    PubMed Central

    Le Strat, Yann

    2017-01-01

    The objective of this paper is to evaluate a panel of statistical algorithms for temporal outbreak detection. Based on a large dataset of simulated weekly surveillance time series, we performed a systematic assessment of 21 statistical algorithms, 19 implemented in the R package surveillance and two other methods. We estimated false positive rate (FPR), probability of detection (POD), probability of detection during the first week, sensitivity, specificity, negative and positive predictive values and F1-measure for each detection method. Then, to identify the factors associated with these performance measures, we ran multivariate Poisson regression models adjusted for the characteristics of the simulated time series (trend, seasonality, dispersion, outbreak sizes, etc.). The FPR ranged from 0.7% to 59.9% and the POD from 43.3% to 88.7%. Some methods had a very high specificity, up to 99.4%, but a low sensitivity. Methods with a high sensitivity (up to 79.5%) had a low specificity. All methods had a high negative predictive value, over 94%, while positive predictive values ranged from 6.5% to 68.4%. Multivariate Poisson regression models showed that performance measures were strongly influenced by the characteristics of time series. Past or current outbreak size and duration strongly influenced detection performances. PMID:28715489
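The confusion-matrix performance measures named in this abstract (sensitivity, specificity, predictive values, F1) can be illustrated with a minimal sketch, assuming each surveillance week is labeled with whether an alarm was raised and whether an outbreak was truly present; this is not the paper's code, just the standard definitions applied per week.

```python
def detection_metrics(alarms, outbreaks):
    """Per-week confusion-matrix metrics for an outbreak detection algorithm.
    alarms, outbreaks -- equal-length sequences of booleans (one per week).
    Assumes at least one count in every cell of the confusion matrix.
    """
    tp = sum(a and o for a, o in zip(alarms, outbreaks))         # true alarms
    fp = sum(a and not o for a, o in zip(alarms, outbreaks))     # false alarms
    fn = sum(o and not a for a, o in zip(alarms, outbreaks))     # missed outbreaks
    tn = sum(not a and not o for a, o in zip(alarms, outbreaks)) # quiet weeks
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp)   # positive predictive value
    npv = tn / (tn + fn)   # negative predictive value
    f1 = 2 * ppv * sens / (ppv + sens)
    return {"sensitivity": sens, "specificity": spec,
            "ppv": ppv, "npv": npv, "f1": f1}
```

The trade-off reported in the abstract (high-specificity methods with low sensitivity, and vice versa) falls directly out of these formulas: raising the alarm threshold moves weeks from the fp cell to the tn cell at the cost of moving tp weeks into fn.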

  10. The effect of various factors on the masticatory performance of removable denture wearer

    NASA Astrophysics Data System (ADS)

    Pratama, S.; Koesmaningati, H.; Kusdhany, L. S.

    2017-08-01

    An individual’s masticatory performance concerns his/her ability to break down food in order to facilitate digestion, and it therefore plays an important role in nutrition. Removable dentures are used to rehabilitate a loss of teeth, which could jeopardize masticatory performance. Further, there exist various other factors that can affect masticatory performance. The objective of this research is to analyze the relationship between various factors and masticatory performance. Thirty-four removable denture wearers (full dentures, single complete dentures, or partial dentures) participated in a cross-sectional study of masticatory performance using color-changeable chewing gum (Masticatory Performance Evaluating Gum Xylitol®). The volume of saliva was evaluated using measuring cups, while the residual ridge heights were measured using a modified mouth mirror no. 3 with metric measurements. The residual ridge height and removable-denture-wearing experience exhibited a significant relationship with masticatory performance. However, age, gender, saliva volume, denture type, and the number and location of the missing teeth did not have a statistically significant association with masticatory performance. The residual ridge height influences the masticatory performance of removable denture wearers, since the greater the ridge height, the better the performance. The experience of using dentures also has a statistically significant influence on masticatory performance.

  11. An Evaluation of the Impact of E-Learning Media Formats on Student Perception and Performance

    NASA Astrophysics Data System (ADS)

    Kurbel, Karl; Stankov, Ivo; Datsenka, Rastsislau

    Factors influencing student evaluation of web-based courses are analyzed, based on student feedback from an online distance-learning graduate program. The impact of different media formats on the perception of the courses by the students as well as on their performance in these courses are examined. In particular, we studied conventional hypertext-based courses, video-based courses and audio-based courses, and tried to find out whether the media format has an effect on how students assess courses and how good or bad their grades are. Statistical analyses were performed to answer several research questions related to the topic and to properly evaluate the factors influencing student evaluation.

  12. Evaluation of normalization methods in mammalian microRNA-Seq data

    PubMed Central

    Garmire, Lana Xia; Subramaniam, Shankar

    2012-01-01

    Simple total tag count normalization is inadequate for microRNA sequencing data generated from the next generation sequencing technology. However, so far systematic evaluation of normalization methods on microRNA sequencing data is lacking. We comprehensively evaluate seven commonly used normalization methods including global normalization, Lowess normalization, Trimmed Mean Method (TMM), quantile normalization, scaling normalization, variance stabilization, and invariant method. We assess these methods on two individual experimental data sets with the empirical statistical metrics of mean square error (MSE) and Kolmogorov-Smirnov (K-S) statistic. Additionally, we evaluate the methods with results from quantitative PCR validation. Our results consistently show that Lowess normalization and quantile normalization perform the best, whereas TMM, a method applied to the RNA-Sequencing normalization, performs the worst. The poor performance of TMM normalization is further evidenced by abnormal results from the test of differential expression (DE) of microRNA-Seq data. Comparing with the models used for DE, the choice of normalization method is the primary factor that affects the results of DE. In summary, Lowess normalization and quantile normalization are recommended for normalizing microRNA-Seq data, whereas the TMM method should be used with caution. PMID:22532701
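Quantile normalization, one of the two methods this study recommends, can be sketched in a few lines: every sample (column) is forced to share a common reference distribution, taken as the mean of the sorted columns. This is a minimal illustration of the general technique, not the authors' pipeline, and it ignores tie handling for simplicity.

```python
import numpy as np

def quantile_normalize(counts):
    """Quantile normalization sketch: replace each value by the reference
    quantile (mean of sorted columns) at its within-column rank.
    counts -- 2D array-like, rows = features (microRNAs), columns = samples.
    """
    x = np.asarray(counts, dtype=float)
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # per-column ranks
    mean_sorted = np.sort(x, axis=0).mean(axis=1)      # reference distribution
    return mean_sorted[ranks]                          # map ranks to reference
```

After this transform, every column has identical sorted values, so between-sample distributional differences (the target of normalization) are removed while within-column orderings are preserved.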

  13. Evaluating Statistical Process Control (SPC) techniques and computing the uncertainty of force calibrations

    NASA Technical Reports Server (NTRS)

    Navard, Sharon E.

    1989-01-01

    In recent years there has been a push within NASA to use statistical techniques to improve the quality of production. Two areas where statistics are used are in establishing product and process quality control of flight hardware and in evaluating the uncertainty of calibration of instruments. The Flight Systems Quality Engineering branch is responsible for developing and assuring the quality of all flight hardware; the statistical process control methods employed are reviewed and evaluated. The Measurement Standards and Calibration Laboratory performs the calibration of all instruments used on-site at JSC as well as those used by all off-site contractors. These calibrations must be performed in such a way as to be traceable to national standards maintained by the National Institute of Standards and Technology, and they must meet a four-to-one ratio of the instrument specifications to calibrating standard uncertainty. In some instances this ratio is not met, and in these cases it is desirable to compute the exact uncertainty of the calibration and determine ways of reducing it. A particular example where this problem is encountered is with a machine which does automatic calibrations of force. The process of force calibration using the United Force Machine is described in detail. The sources of error are identified and quantified when possible. Suggestions for improvement are made.

  14. Maxillary sinus augmentation by crestal access: a retrospective study on cavity size and outcome correlation.

    PubMed

    Spinato, Sergio; Bernardello, Fabio; Galindo-Moreno, Pablo; Zaffe, Davide

    2015-12-01

Cone-beam computed tomography (CBCT) and radiographic outcomes of crestal sinus elevation, performed using mineralized human bone allograft, were analyzed to correlate results with maxillary sinus size. A total of 60 sinus augmentations in 60 patients, with initial bone ≤5 mm, were performed. Digital radiographs were taken at surgical implant placement time up to post-prosthetic loading follow-up (12-72 months), when CBCT evaluation was carried out. Marginal bone loss (MBL) was radiographically analyzed at 6 months and at follow-up time post-loading. Sinus size (BPD), implant distance from the palatal (PID) and buccal (BID) walls, and absence of bone coverage of the implant (intra-sinus bone loss, IBL) were measured and statistically evaluated by ANOVA and linear regression analyses. MBL increased as a function of time. MBL at final follow-up was statistically associated with MBL at 6 months. A statistically significant correlation of IBL with wall distance and of IBL/mm with time was identified, with greater values in wide sinuses (WS ≥ 13.27 mm) than in narrow sinuses (NS < 13.27 mm). This study is the first quantitative and statistically significant confirmation that the crestal technique with residual ridge height <5 mm is more appropriate and predictable, in terms of intra-sinus bone coverage, in narrow sinuses than in wide sinuses. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. A PERFORMANCE EVALUATION OF THE ETA- CMAQ AIR QUALITY FORECAST SYSTEM FOR THE SUMMER OF 2005

    EPA Science Inventory

    This poster presents an evaluation of the Eta-CMAQ Air Quality Forecast System's experimental domain using O3 observations obtained from EPA's AIRNOW program and a suite of statistical metrics examining both discrete and categorical forecasts.

  16. Implementation of Statistical Process Control: Evaluating the Mechanical Performance of a Candidate Silicone Elastomer Docking Seal

    NASA Technical Reports Server (NTRS)

    Oravec, Heather Ann; Daniels, Christopher C.

    2014-01-01

    The National Aeronautics and Space Administration has been developing a novel docking system to meet the requirements of future exploration missions to low-Earth orbit and beyond. A dynamic gas pressure seal is located at the main interface between the active and passive mating components of the new docking system. This seal is designed to operate in the harsh space environment, but is also to perform within strict loading requirements while maintaining an acceptable level of leak rate. In this study, a candidate silicone elastomer seal was designed, and multiple subscale test articles were manufactured for evaluation purposes. The force required to fully compress each test article at room temperature was quantified and found to be below the maximum allowable load for the docking system. However, a significant amount of scatter was observed in the test results. Due to the stochastic nature of the mechanical performance of this candidate docking seal, a statistical process control technique was implemented to isolate unusual compression behavior from typical mechanical performance. The results of this statistical analysis indicated a lack of process control, suggesting a variation in the manufacturing phase of the process. Further investigation revealed that changes in the manufacturing molding process had occurred which may have influenced the mechanical performance of the seal. This knowledge improves the chance of this and future space seals to satisfy or exceed design specifications.
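The statistical process control technique described here can be illustrated with a standard individuals control chart, a common SPC tool for isolating unusual observations (such as the anomalous compression loads) from typical process variation. This is a generic sketch under the usual 3-sigma convention, not NASA's actual analysis; the d2 = 1.128 constant is the standard moving-range bias correction for subgroups of size 2.

```python
def control_limits(measurements):
    """Individuals control chart: center line +/- 3 sigma, with sigma
    estimated from the average moving range (d2 = 1.128 for n = 2)."""
    mr = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
    sigma = (sum(mr) / len(mr)) / 1.128
    center = sum(measurements) / len(measurements)
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(measurements):
    """Indices of points falling outside the control limits."""
    lcl, _, ucl = control_limits(measurements)
    return [i for i, x in enumerate(measurements) if x < lcl or x > ucl]
```

Points flagged by such a chart indicate special-cause variation, which is consistent with the study's conclusion that the scatter traced back to a change in the manufacturing molding process rather than to ordinary test-to-test noise.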

  17. Performance of cancer cluster Q-statistics for case-control residential histories

    PubMed Central

    Sloan, Chantel D.; Jacquez, Geoffrey M.; Gallagher, Carolyn M.; Ward, Mary H.; Raaschou-Nielsen, Ole; Nordsborg, Rikke Baastrup; Meliker, Jaymie R.

    2012-01-01

Few investigations of health event clustering have evaluated residential mobility, though causative exposures for chronic diseases such as cancer often occur long before diagnosis. Recently developed Q-statistics incorporate human mobility into disease cluster investigations by quantifying space- and time-dependent nearest neighbor relationships. Using residential histories from two cancer case-control studies, we created simulated clusters to examine Q-statistic performance. Results suggest the intersection of cases with significant clustering over their life course, Qi, with cases who are constituents of significant local clusters at given times, Qit, yielded the best performance, which improved with increasing cluster size. Upon comparison, a larger proportion of true positives were detected with Kulldorff's spatial scan method if the time of clustering was provided. We recommend using Q-statistics to identify when and where clustering may have occurred, followed by the scan method to localize the candidate clusters. Future work should investigate the generalizability of these findings. PMID:23149326

  18. Meta-analysis of the technical performance of an imaging procedure: guidelines and statistical methodology.

    PubMed

    Huang, Erich P; Wang, Xiao-Feng; Choudhury, Kingshuk Roy; McShane, Lisa M; Gönen, Mithat; Ye, Jingjing; Buckler, Andrew J; Kinahan, Paul E; Reeves, Anthony P; Jackson, Edward F; Guimaraes, Alexander R; Zahlmann, Gudrun

    2015-02-01

    Medical imaging serves many roles in patient care and the drug approval process, including assessing treatment response and guiding treatment decisions. These roles often involve a quantitative imaging biomarker, an objectively measured characteristic of the underlying anatomic structure or biochemical process derived from medical images. Before a quantitative imaging biomarker is accepted for use in such roles, the imaging procedure to acquire it must undergo evaluation of its technical performance, which entails assessment of performance metrics such as repeatability and reproducibility of the quantitative imaging biomarker. Ideally, this evaluation will involve quantitative summaries of results from multiple studies to overcome limitations due to the typically small sample sizes of technical performance studies and/or to include a broader range of clinical settings and patient populations. This paper is a review of meta-analysis procedures for such an evaluation, including identification of suitable studies, statistical methodology to evaluate and summarize the performance metrics, and complete and transparent reporting of the results. This review addresses challenges typical of meta-analyses of technical performance, particularly small study sizes, which often causes violations of assumptions underlying standard meta-analysis techniques. Alternative approaches to address these difficulties are also presented; simulation studies indicate that they outperform standard techniques when some studies are small. The meta-analysis procedures presented are also applied to actual [18F]-fluorodeoxyglucose positron emission tomography (FDG-PET) test-retest repeatability data for illustrative purposes. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.

  19. Meta-analysis of the technical performance of an imaging procedure: Guidelines and statistical methodology

    PubMed Central

    Huang, Erich P; Wang, Xiao-Feng; Choudhury, Kingshuk Roy; McShane, Lisa M; Gönen, Mithat; Ye, Jingjing; Buckler, Andrew J; Kinahan, Paul E; Reeves, Anthony P; Jackson, Edward F; Guimaraes, Alexander R; Zahlmann, Gudrun

    2017-01-01

    Medical imaging serves many roles in patient care and the drug approval process, including assessing treatment response and guiding treatment decisions. These roles often involve a quantitative imaging biomarker, an objectively measured characteristic of the underlying anatomic structure or biochemical process derived from medical images. Before a quantitative imaging biomarker is accepted for use in such roles, the imaging procedure to acquire it must undergo evaluation of its technical performance, which entails assessment of performance metrics such as repeatability and reproducibility of the quantitative imaging biomarker. Ideally, this evaluation will involve quantitative summaries of results from multiple studies to overcome limitations due to the typically small sample sizes of technical performance studies and/or to include a broader range of clinical settings and patient populations. This paper is a review of meta-analysis procedures for such an evaluation, including identification of suitable studies, statistical methodology to evaluate and summarize the performance metrics, and complete and transparent reporting of the results. This review addresses challenges typical of meta-analyses of technical performance, particularly small study sizes, which often causes violations of assumptions underlying standard meta-analysis techniques. Alternative approaches to address these difficulties are also presented; simulation studies indicate that they outperform standard techniques when some studies are small. The meta-analysis procedures presented are also applied to actual [18F]-fluorodeoxyglucose positron emission tomography (FDG-PET) test–retest repeatability data for illustrative purposes. PMID:24872353

  20. Evaluation of the prediction precision capability of partial least squares regression approach for analysis of high alloy steel by laser induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Sarkar, Arnab; Karki, Vijay; Aggarwal, Suresh K.; Maurya, Gulab S.; Kumar, Rohit; Rai, Awadhesh K.; Mao, Xianglei; Russo, Richard E.

    2015-06-01

Laser induced breakdown spectroscopy (LIBS) was applied for elemental characterization of high alloy steel using partial least squares regression (PLSR), with an objective to evaluate the analytical performance of this multivariate approach. The optimization of the number of principal components for minimizing error in the PLSR algorithm was investigated. The effect of different pre-treatment procedures on the raw spectral data before PLSR analysis was evaluated based on several statistical parameters (standard error of prediction, percentage relative error of prediction, etc.). The pre-treatment with the "NORM" parameter gave the optimum statistical results. The analytical performance of the PLSR model improved by increasing the number of laser pulses accumulated per spectrum as well as by truncating the spectrum to an appropriate wavelength region. It was found that the statistical benefit of truncating the spectrum can also be accomplished by increasing the number of laser pulses per accumulation without spectral truncation. The constituents (Co and Mo) present in hundreds of ppm were determined with relative precision of 4-9% (2σ), whereas the major constituents Cr and Ni (present at a few percent levels) were determined with a relative precision of ~2% (2σ).
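The error metrics used above to compare pre-treatments can be sketched as follows. These are common definitions of the standard error of prediction (SEP, scatter of residuals around their bias) and a percentage relative error of prediction (REP%); the abstract does not give exact formulas, so the definitions here are assumptions for illustration.

```python
import numpy as np

def prediction_errors(y_true, y_pred):
    """SEP and REP% under one common set of definitions (assumed, not taken
    from the paper): SEP is the bias-corrected residual standard deviation,
    REP% the root-mean-square relative error in percent."""
    y_true = np.asarray(y_true, dtype=float)
    e = np.asarray(y_pred, dtype=float) - y_true          # residuals
    sep = np.sqrt(np.sum((e - e.mean()) ** 2) / (len(e) - 1))
    rep = 100.0 * np.sqrt(np.mean((e / y_true) ** 2))
    return sep, rep
```

Note that SEP measures scatter around the mean residual, so a model with a pure constant bias has SEP = 0 but nonzero REP%; reporting both, as the study does, separates precision from accuracy.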

  1. Early seizure detection in an animal model of temporal lobe epilepsy

    NASA Astrophysics Data System (ADS)

    Talathi, Sachin S.; Hwang, Dong-Uk; Ditto, William; Carney, Paul R.

    2007-11-01

The performance of five seizure detection schemes (nonlinear embedding delay, Hurst scaling, wavelet scale, autocorrelation, and gradient of accumulated energy) in detecting EEG seizures close to the seizure onset time was evaluated to determine the feasibility of their application in the development of a real-time closed-loop seizure intervention program (RCLSIP). The criteria chosen for the performance evaluation were: high statistical robustness, as determined through the predictability index; the sensitivity and specificity of a given measure in detecting an EEG seizure; the lag in seizure detection with respect to the EEG seizure onset time, as determined through visual inspection; and the computational efficiency of each detection measure. An optimality function was designed to evaluate the overall performance of each measure against the chosen criteria. While each of the measures performed very well in terms of the statistical parameters, the nonlinear embedding delay measure was found to have the highest optimality index due to its ability to detect seizures very close to the EEG seizure onset time, making it the most suitable dynamical measure for the development of an RCLSIP in a rat model of chronic limbic epilepsy.

  2. Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?

    PubMed

    Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D

    2017-01-01

    If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
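The transformations this study recommends before meta-analysis can be sketched directly: pool C-statistics on the logit scale and back-transform the pooled value. This is a minimal illustration with fixed weights, not the authors' random-effects procedure, and the weights argument is a hypothetical simplification (a real analysis would derive weights from within- and between-study variances).

```python
import math

def logit(p):
    """Map a probability-scale value (0, 1) to the real line."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Back-transform from the logit scale to (0, 1)."""
    return 1 / (1 + math.exp(-x))

def pool_c_statistics(c_values, weights=None):
    """Weighted average of C-statistics on the logit scale, back-transformed.
    weights must sum to 1; defaults to equal weights (illustrative only)."""
    n = len(c_values)
    w = weights or [1.0 / n] * n
    pooled = sum(wi * logit(c) for wi, c in zip(w, c_values))
    return inv_logit(pooled)
```

Pooling on the logit scale keeps the combined estimate inside (0, 1) and reduces the skew the authors observed when C-statistics near 1 are averaged on their natural scale; the same logic motivates their log transformation for the E/O statistic.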

  3. US-VISIT Identity Matching Algorithm Evaluation Program: ADIS Algorithm Evaluation Project Plan Update

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grant, C W; Lenderman, J S; Gansemer, J D

This document is an update to the 'ADIS Algorithm Evaluation Project Plan' specified in the Statement of Work for the US-VISIT Identity Matching Algorithm Evaluation Program, as deliverable II.D.1. The original plan was delivered in August 2010. This document modifies the plan to reflect revised deliverables resulting from delays in obtaining a database refresh, and describes the revised schedule of the program deliverables. The detailed description of the processes used, the statistical analysis processes, and the results of the statistical analysis will be described fully in the program deliverables. The US-VISIT Identity Matching Algorithm Evaluation Program is work performed by Lawrence Livermore National Laboratory (LLNL) under IAA HSHQVT-07-X-00002 P00004 from the Department of Homeland Security (DHS).

  4. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

    PubMed

    Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

    2012-08-08

    Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vigg, Steven; Johnson, John

In this annual Monitoring & Evaluation (M&E) report to the Bonneville Power Administration (BPA), we summarize significant activities and performance measures resulting from enhanced protection by Columbia River Inter-Tribal Fisheries Enforcement (CRITFE) in the mainstem corridor (BPA Project 2000-056). This report covers the Fiscal Year (FY) 2000 performance period, May 15, 2000 to May 14, 2001. Quarterly progress reports have previously been submitted to BPA and are posted on the M&E Web site (www.Eco-Law.net) for the periods April-December 2000 (Vigg 2000b,c,d) and January-June 2001 (Vigg 2001a,b). We also present comprehensive data representing the first quarter of year 2000 in this report for a pre-project comparison. In addition, we have analyzed specific annual enforcement statistics to evaluate trends during the baseline period 1996-2000. Additional statistics and more years of comprehensive baseline data are now being summarized and will be presented in future M&E annual reports, to provide a longer time series for evaluation of trends in input, output, and outcome performance standards.

  6. Statistically Comparing the Performance of Multiple Automated Raters across Multiple Items

    ERIC Educational Resources Information Center

    Kieftenbeld, Vincent; Boyer, Michelle

    2017-01-01

    Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…

  7. Learner Characteristics Predict Performance and Confidence in E-Learning: An Analysis of User Behavior and Self-Evaluation

    ERIC Educational Resources Information Center

    Jeske, Debora; Roßnagell, Christian Stamov; Backhaus, Joy

    2014-01-01

    We examined the role of learner characteristics as predictors of four aspects of e-learning performance: knowledge test performance, learning confidence, learning efficiency, and navigational effectiveness. We used both self-reports and log file records to compute the relevant statistics. Regression analyses showed that both need for…

  8. Anxiety and performance of nursing students in regard to assessment via clinical simulations in the classroom versus filmed assessments.

    PubMed

    de Souza Teixeira, Carla Regina; Kusumota, Luciana; Alves Pereira, Marta Cristiane; Merizio Martins Braga, Fernanda Titareli; Pirani Gaioso, Vanessa; Mara Zamarioli, Cristina; Campos de Carvalho, Emilia

    2014-01-01

    To compare the level of anxiety and performance of nursing students when performing a clinical simulation through the traditional method of assessment with the presence of an evaluator and through a filmed assessment without the presence of an evaluator. Controlled trial with the participation of 20 students from a Brazilian public university, who were randomly assigned to one of two groups: a) assessment through the traditional method with the presence of an evaluator; or b) filmed assessment. The level of anxiety was assessed using the Zung test and performance was measured based on the number of correct answers. Averages of 32 and 27 were obtained on the anxiety scale by the group assessed through the traditional method before and after the simulation, respectively, while the filmed group obtained averages of 33 and 26; the final scores correspond to mild anxiety. Even though there was a statistically significant reduction in the intra-group scores before and after the simulation, there was no difference between the groups. As for the performance assessments in the clinical simulation, the groups obtained similar percentages of correct answers (83% in the traditional assessment and 84% in the filmed assessment) without statistically significant differences. Filming can be used and encouraged as a strategy to assess nursing undergraduate students.

  9. SIRU utilization. Volume 1: Theory, development and test evaluation

    NASA Technical Reports Server (NTRS)

    Musoff, H.

    1974-01-01

    The theory, development, and test evaluations of the Strapdown Inertial Reference Unit (SIRU) are discussed. The statistical failure detection and isolation, single position calibration, and self alignment techniques are emphasized. Circuit diagrams of the system components are provided. Mathematical models are developed to show the performance characteristics of the subsystems. Specific areas of the utilization program are identified as: (1) error source propagation characteristics and (2) local level navigation performance demonstrations.

  10. Monitoring the metering performance of an electronic voltage transformer on-line based on cyber-physics correlation analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Zhu; Li, Hongbin; Tang, Dengping; Hu, Chen; Jiao, Yang

    2017-10-01

    Metering performance is the key parameter of an electronic voltage transformer (EVT), and it requires high accuracy. The conventional off-line calibration method using a standard voltage transformer is not suitable for key equipment in a smart substation, which needs on-line monitoring. In this article, we propose a method for monitoring the metering performance of an EVT on-line based on cyber-physics correlation analysis. Exploiting the electrical and physical properties of a substation running in three-phase symmetry, the principal component analysis method is used to separate the metering deviation caused by primary-side fluctuations from that caused by an EVT anomaly. The characteristic statistics of the measured data during operation are extracted, and the metering performance of the EVT is evaluated by analyzing changes in these statistics. The experimental results show that the method accurately monitors the metering deviation of a Class 0.2 EVT. The method demonstrates accurate on-line monitoring of the metering performance of an EVT without a standard voltage transformer.

  11. Evaluation and Assessment of a Biomechanics Computer-Aided Instruction.

    ERIC Educational Resources Information Center

    Washington, N.; Parnianpour, M.; Fraser, J. M.

    1999-01-01

    Describes the Biomechanics Tutorial, a computer-aided instructional tool that was developed at Ohio State University to expedite the transition from lecture to application for undergraduate students. Reports evaluation results that used statistical analyses and student questionnaires to show improved performance on posttests as well as positive…

  12. Public-Private Partnership Program Evaluation. 1988-89.

    ERIC Educational Resources Information Center

    Bland, June

    This evaluation of the 1988-89 Public Private Partnership (PPP) program in Washington (District of Columbia) was seriously limited by the unavailability of statistical data on student progress and internship performance. PPP was designed to improve the preparation of high school students for the world of work by involving community businesses in…

  13. Spillover in the Academy: Marriage Stability and Faculty Evaluations.

    ERIC Educational Resources Information Center

    Ludlow, Larry H.; Alvarez-Salvat, Rose M.

    2001-01-01

    Studied the spillover between family and work by examining the link between marital status and work performance across marriage, divorce, and remarriage. A polynomial regression model was fit to the data from 78 evaluations of an individual professor, and a cubic curve through the 3 periods was statistically significant. (SLD)

  14. Prediction of Muscle Performance During Dynamic Repetitive Exercise

    NASA Technical Reports Server (NTRS)

    Byerly, D. L.; Byerly, K. A.; Sognier, M. A.; Squires, W. G.

    2002-01-01

    A method for predicting human muscle performance was developed. Eight test subjects performed a repetitive dynamic exercise to failure using a Lordex spinal machine. Electromyography (EMG) data was collected from the erector spinae. Evaluation of the EMG data using a 5th order Autoregressive (AR) model and statistical regression analysis revealed that an AR parameter, the mean average magnitude of AR poles, can predict performance to failure as early as the second repetition of the exercise. Potential applications to the space program include evaluating on-orbit countermeasure effectiveness, maximizing post-flight recovery, and future real-time monitoring capability during Extravehicular Activity.

  15. Heart rate and performance during combat missions in a flight simulator.

    PubMed

    Lahtinen, Taija M M; Koskelo, Jukka P; Laitinen, Tomi; Leino, Tuomo K

    2007-04-01

    The psychological workload of flying has been shown to increase heart rate (HR) during flight simulator operation. The association between HR changes and flight performance remains unclear. There were 15 pilots who performed a combat flight mission in a Weapons Tactics Trainer simulator of an F-18 Hornet. An electrocardiogram (ECG) was recorded, and individual incremental heart rates (deltaHR) above the resting HR were calculated for each flight phase and used in statistical analyses. The combat flight period was divided into 13 phases, which were evaluated on a scale of 1 to 5 by the flight instructor. HR increased during interceptions (from a mean resting level of 79.0 to a mean of 96.7 bpm in one of the interception flight phases), decreased during the return to base, and increased slightly during the ILS approach and landing. DeltaHR appeared to be similar among experienced and less experienced pilots. DeltaHR responses during the flight phases did not correlate with simulator flight performance scores. Overall simulator flight performance correlated statistically significantly (r = 0.50) with F-18 Hornet flight experience. HR reflected the amount of cognitive load during the simulated flight. Hence, HR analysis can be used in the evaluation of the psychological workload of military simulator flight phases. However, more detailed flight performance evaluation methods are needed for this kind of complex flight simulation to replace the traditional but rough interval scales. Use of a visual analog scale by the flight instructors is suggested for simulator flight performance evaluation.

  16. Evaluation of SLAR and thematic mapper MSS data for forest cover mapping using computer-aided analysis techniques

    NASA Technical Reports Server (NTRS)

    Hoffer, R. M. (Principal Investigator); Knowlton, D. J.; Dean, M. E.

    1981-01-01

    A set of training statistics for the 30 meter resolution simulated thematic mapper MSS data was generated based on land use/land cover classes. In addition to this supervised data set, a nonsupervised multicluster block of training statistics is being defined in order to compare the classification results and evaluate the effect of the different training selection methods on classification performance. Two test data sets, one defined using a stratified sampling procedure incorporating a grid system with dimensions of 50 lines by 50 columns, and another based on an analyst-supervised set of test fields, were used to evaluate the classifications of the TMS data. Training statistics were generated from the supervised training data set, and a per-point Gaussian maximum likelihood classification of the 1979 TMS data was obtained. The August 1980 MSS data was radiometrically adjusted. The SAR data was redigitized and the SAR imagery was qualitatively analyzed.

  17. A 20-year period of orthotopic liver transplantation activity in a single center: a time series analysis performed using the R Statistical Software.

    PubMed

    Santori, G; Andorno, E; Morelli, N; Casaccia, M; Bottino, G; Di Domenico, S; Valente, U

    2009-05-01

    In many Western countries a "minimum volume rule" policy has been adopted as a quality measure for complex surgical procedures. In Italy, the National Transplant Centre set the minimum number of orthotopic liver transplantation (OLT) procedures/y at 25/center. OLT procedures performed in a single center over a reasonably large period may be treated as a time series to evaluate trend, seasonal cycles, and nonsystematic fluctuations. Between January 1, 1987 and December 31, 2006, we performed 563 cadaveric donor OLTs to adult recipients. During 2007, there were another 28 procedures. The greatest numbers of OLTs/y were performed in 2001 (n = 51), 2005 (n = 50), and 2004 (n = 49). A time series analysis performed using the R Statistical Software (R Foundation for Statistical Computing, Vienna, Austria), a free software environment for statistical computing and graphics, showed an incremental trend after exponential smoothing as well as after seasonal decomposition. The predicted OLTs/mo for 2007, calculated with Holt-Winters exponential smoothing applied to the period 1987-2006, helped to identify the months with a major difference between predicted and performed procedures. The time series approach may be helpful to establish a minimum volume/y at the single-center level.
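
    As an illustration of the smoothing step, here is a minimal sketch of Holt's linear (trend) exponential smoothing, a simplified, non-seasonal relative of the Holt-Winters procedure the authors ran in R; the monthly counts and smoothing constants below are invented, not the center's data.

```python
def holt_linear(series, alpha=0.8, beta=0.2):
    """Holt's linear exponential smoothing (level + trend, no seasonality).

    Returns the one-step-ahead forecasts; forecasts[i] is the prediction
    for observation i+1 made after seeing observations 0..i.
    """
    level = series[0]
    trend = series[1] - series[0]          # crude initial trend estimate
    forecasts = [level + trend]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        forecasts.append(level + trend)
    return forecasts

# Hypothetical monthly OLT counts; the last forecast extrapolates one month ahead.
monthly_olt = [2, 3, 4, 4, 5, 6]
next_month = holt_linear(monthly_olt)[-1]
```

    Comparing each month's forecast with the count actually observed is exactly the "predicted versus performed" check described above.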

  18. How Many Oral and Maxillofacial Surgeons Does It Take to Perform Virtual Orthognathic Surgical Planning?

    PubMed

    Borba, Alexandre Meireles; Haupt, Dustin; de Almeida Romualdo, Leiliane Teresinha; da Silva, André Luis Fernandes; da Graça Naclério-Homem, Maria; Miloro, Michael

    2016-09-01

    Virtual surgical planning (VSP) has become routine practice in orthognathic treatment planning; however, most surgeons do not perform the planning without technical assistance, nor do they routinely evaluate the accuracy of the postoperative outcomes. The purpose of the present study was to propose a reproducible method that would allow surgeons to have an improved understanding of VSP orthognathic planning and to compare the planned surgical movements with the results obtained. A retrospective cohort of bimaxillary orthognathic surgery cases was used to evaluate the variability between the predicted and obtained movements using craniofacial landmarks and McNamara 3-dimensional cephalometric analysis from computed tomography scans. The demographic data (age, gender, and skeletal deformity type) were gathered from the medical records. The data analysis included the level of variability from the predicted to obtained surgical movements as assessed by the mean and standard deviation. For the overall sample, statistical analysis was performed using the 1-sample t test. The statistical analysis between the Class II and III patient groups used an unpaired t test. The study sample consisted of 50 patients who had undergone bimaxillary orthognathic surgery. The overall evaluation of the mean values revealed a discrepancy between the predicted and obtained values of less than 2.0 ± 2.0 mm for all maxillary landmarks, although some mandibular landmarks were greater than this value. An evaluation of the influence of gender and deformity type on the accuracy of surgical movements did not demonstrate statistical significance for most landmarks (P > .05). The method provides a reproducible tool for surgeons who use orthognathic VSP to perform routine evaluation of the postoperative outcomes, permitting the identification of specific variables that could assist in improving the accuracy of surgical planning and execution. 

  19. Analysis of Publications and Citations from a Geophysics Research Institute.

    ERIC Educational Resources Information Center

    Frohlich, Cliff; Resler, Lynn

    2001-01-01

    Performs an analysis of all 1128 publications produced by scientists during their employment at the University of Texas Institute for Geophysics, thus assessing research performance using bibliometric indicators such as publications per year, citations per paper, and cited half-lives. Evaluates five different methods for determining…

  20. STATISTICAL EVALUATION OF CONFOCAL MICROSCOPY IMAGES

    EPA Science Inventory

    Abstract

    In this study the CV is defined as the SD/Mean of the population of beads or pixels. Flow cytometry uses the CV of beads to determine if the machine is aligned correctly and performing properly. This CV concept to determine machine performance has been adapted to...
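
    The bead-based check reduces to a one-liner; a minimal sketch with invented bead intensities (the CV here is the standard SD/mean ratio):

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV = SD / mean; smaller values mean a tighter, better-aligned signal."""
    return stdev(values) / mean(values)

# Hypothetical fluorescence intensities of calibration beads.
beads = [102, 98, 101, 99, 100]
cv = coefficient_of_variation(beads)  # a low CV suggests proper alignment
```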

  1. Evaluating the statistical methodology of randomized trials on dentin hypersensitivity management.

    PubMed

    Matranga, Domenica; Matera, Federico; Pizzo, Giuseppe

    2017-12-27

    The present study aimed to evaluate the characteristics and quality of the statistical methodology used in clinical studies on dentin hypersensitivity management. An electronic search was performed for data published from 2009 to 2014 by using the PubMed, Ovid/MEDLINE, and Cochrane Library databases. The primary search terms were used in combination. Eligibility criteria included randomized clinical trials that evaluated the efficacy of desensitizing agents in terms of reducing dentin hypersensitivity. A total of 40 studies were considered eligible for assessment of the quality of statistical methodology. The four main concerns identified were i) use of nonparametric tests in the presence of large samples, coupled with lack of information about normality and equality of variances of the response; ii) lack of P-value adjustment for multiple comparisons; iii) failure to account for interactions between treatment and follow-up time; and iv) no information about the number of teeth examined per patient and the consequent lack of a cluster-specific approach in data analysis. Owing to these concerns, the statistical methodology was judged inappropriate in 77.1% of the 35 studies that used parametric methods. Additional studies with appropriate statistical analysis are required to obtain an appropriate assessment of the efficacy of desensitizing agents.

  2. Wave and Wind Model Performance Metrics Tools

    NASA Astrophysics Data System (ADS)

    Choi, J. K.; Wang, D. W.

    2016-02-01

    Continual improvements and upgrades of Navy ocean wave and wind models are essential to the assurance of battlespace environment predictability of ocean surface wave and surf conditions in support of Naval global operations. Thus, constant verification and validation of model performance is equally essential to assure the progress of model developments and maintain confidence in the predictions. Global and regional scale model evaluations may require large areas and long periods of time. For observational data to compare against, altimeter winds and waves along the tracks from past and current operational satellites as well as moored/drifting buoys can be used for global and regional coverage. Using data and model runs from previous trials such as the planned experiment, the Dynamics of the Adriatic in Real Time (DART), we demonstrated the use of altimeter wind and wave data accumulated over several years to obtain an objective evaluation of the performance of the SWAN (Simulating Waves Nearshore) model running in the Adriatic Sea. The assessment provided a detailed picture of wind and wave model performance by using maps of cell-averaged statistical variables, with spatial statistics including slope, correlation, and scatter index to summarize model performance. Such a methodology is easily generalized to other regions and to global scales. Operational technology currently used by subject matter experts evaluating the Navy Coastal Ocean Model and the Hybrid Coordinate Ocean Model can be expanded to evaluate wave and wind models using tools developed for ArcMAP, a GIS application developed by ESRI. Recent inclusion of altimeter and buoy data into a common format through the Naval Oceanographic Office's (NAVOCEANO) quality control system, together with the netCDF standards applicable to all model output, makes the fusion of these data and direct model verification possible.
Also, procedures were developed for accumulating match-ups of modelled and observed parameters to form a database from which statistics are readily calculated, for the short or long term. Such a system has potential for a quick transition to operations at NAVOCEANO.
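
    The per-cell statistics mentioned above can be sketched as follows; the match-up arrays are invented, and the scatter index is taken as RMSE normalised by the observed mean, one common convention.

```python
from math import sqrt

def model_vs_obs_stats(model, obs):
    """Bias, RMSE, scatter index, and linear correlation for one grid cell."""
    n = len(obs)
    mean_m, mean_o = sum(model) / n, sum(obs) / n
    bias = mean_m - mean_o
    rmse = sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    si = rmse / mean_o                       # scatter index
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    sd_m = sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    sd_o = sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    r = cov / (sd_m * sd_o)                  # Pearson correlation
    return {"bias": bias, "rmse": rmse, "si": si, "r": r}

# Invented significant-wave-height match-ups (model vs. altimeter) for one cell.
stats = model_vs_obs_stats([1.1, 2.1, 3.1, 4.1], [1.0, 2.0, 3.0, 4.0])
```

    Mapping these per-cell values over the model domain gives the cell-averaged statistics maps described above.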

  3. Comparison of the Mahalanobis distance and Pearson's χ² statistic as measures of similarity of isotope patterns.

    PubMed

    Zamanzad Ghavidel, Fatemeh; Claesen, Jürgen; Burzykowski, Tomasz; Valkenborg, Dirk

    2014-02-01

    To extract a genuine peptide signal from a mass spectrum, an observed series of peaks at a particular mass can be compared with the isotope distribution expected for a peptide of that mass. To decide whether the observed series of peaks is similar to the isotope distribution, a similarity measure is needed. In this short communication, we investigate whether the Mahalanobis distance could be an alternative measure for the commonly employed Pearson's χ² statistic. We evaluate the performance of the two measures by using a controlled MALDI-TOF experiment. The results indicate that Pearson's χ² statistic has better discriminatory performance than the Mahalanobis distance and is a more robust measure.
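
    The two measures compared in the study can be sketched as follows; the diagonal-covariance form of the Mahalanobis distance is a simplification adopted here for illustration, and the peak intensities are invented.

```python
from math import sqrt

def pearson_chi2(observed, expected):
    """Pearson's χ² distance between an observed peak series and the
    theoretical isotope distribution (both assumed normalised to sum 1)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def mahalanobis_diag(observed, expected, variances):
    """Mahalanobis distance assuming a diagonal covariance matrix,
    i.e. independent peak intensities (a simplifying assumption)."""
    return sqrt(sum((o - e) ** 2 / v
                    for o, e, v in zip(observed, expected, variances)))

# Invented isotope pattern: theoretical vs. observed relative intensities.
theoretical = [0.60, 0.30, 0.10]
observed = [0.55, 0.32, 0.13]
chi2 = pearson_chi2(observed, theoretical)
d = mahalanobis_diag(observed, theoretical, [0.01, 0.01, 0.01])
```

    A peak series is accepted as a plausible peptide signal when its distance to the theoretical distribution falls below a chosen threshold.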

  4. Comparison of the Mahalanobis Distance and Pearson's χ² Statistic as Measures of Similarity of Isotope Patterns

    NASA Astrophysics Data System (ADS)

    Zamanzad Ghavidel, Fatemeh; Claesen, Jürgen; Burzykowski, Tomasz; Valkenborg, Dirk

    2014-02-01

    To extract a genuine peptide signal from a mass spectrum, an observed series of peaks at a particular mass can be compared with the isotope distribution expected for a peptide of that mass. To decide whether the observed series of peaks is similar to the isotope distribution, a similarity measure is needed. In this short communication, we investigate whether the Mahalanobis distance could be an alternative measure for the commonly employed Pearson's χ² statistic. We evaluate the performance of the two measures by using a controlled MALDI-TOF experiment. The results indicate that Pearson's χ² statistic has better discriminatory performance than the Mahalanobis distance and is a more robust measure.

  5. Statistical evaluation of forecasts

    NASA Astrophysics Data System (ADS)

    Mader, Malenka; Mader, Wolfgang; Gluckman, Bruce J.; Timmer, Jens; Schelter, Björn

    2014-08-01

    Reliable forecasts of extreme but rare events, such as earthquakes, financial crashes, and epileptic seizures, would render interventions and precautions possible. Therefore, forecasting methods have been developed which intend to raise an alarm if an extreme event is about to occur. In order to statistically validate the performance of a prediction system, it must be compared to the performance of a random predictor, which raises alarms independent of the events. Such a random predictor can be obtained by bootstrapping or analytically. We propose an analytic statistical framework which, in contrast to conventional methods, allows for validating independently the sensitivity and specificity of a forecasting method. Moreover, our method accounts for the periods during which an event has to remain absent or occur after a respective forecast.
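
    A toy version of the validation idea: a random predictor that raises alarms independently of the events has expected sensitivity p and expected specificity 1 − p, so a real forecaster must beat those baselines. The sequences below are invented.

```python
import random

def sensitivity_specificity(alarms, events):
    """Fraction of events covered by an alarm, and of quiet periods without one."""
    tp = sum(a and e for a, e in zip(alarms, events))
    fn = sum((not a) and e for a, e in zip(alarms, events))
    tn = sum((not a) and (not e) for a, e in zip(alarms, events))
    fp = sum(a and (not e) for a, e in zip(alarms, events))
    return tp / (tp + fn), tn / (tn + fp)

def random_predictor(n, p, seed=0):
    """Raises an alarm in each interval with probability p, ignoring the events."""
    rng = random.Random(seed)
    return [rng.random() < p for _ in range(n)]

# Invented event sequence and forecaster output over 8 intervals.
events = [False, False, True, False, False, True, False, False]
alarms = [False, True, True, False, False, True, False, False]
sens, spec = sensitivity_specificity(alarms, events)
```

    Validating sensitivity and specificity separately, as the paper proposes, amounts to comparing each of these two numbers against the corresponding random-predictor baseline rather than collapsing them into one score.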

  6. Best practices for evaluating the capability of nondestructive evaluation (NDE) and structural health monitoring (SHM) techniques for damage characterization

    NASA Astrophysics Data System (ADS)

    Aldrin, John C.; Annis, Charles; Sabbagh, Harold A.; Lindgren, Eric A.

    2016-02-01

    A comprehensive approach to NDE and SHM characterization error (CE) evaluation is presented that follows the framework of the `ahat-versus-a' regression analysis for POD assessment. Characterization capability evaluation is typically more complex with respect to current POD evaluations and thus requires engineering and statistical expertise in the model-building process to ensure all key effects and interactions are addressed. Justifying the statistical model choice with underlying assumptions is key. Several sizing case studies are presented with detailed evaluations of the most appropriate statistical model for each data set. The use of a model-assisted approach is introduced to help assess the reliability of NDE and SHM characterization capability under a wide range of part, environmental and damage conditions. Best practices of using models are presented for both an eddy current NDE sizing and vibration-based SHM case studies. The results of these studies highlight the general protocol feasibility, emphasize the importance of evaluating key application characteristics prior to the study, and demonstrate an approach to quantify the role of varying SHM sensor durability and environmental conditions on characterization performance.

  7. Analysis and interpretation of cost data in randomised controlled trials: review of published studies

    PubMed Central

    Barber, Julie A; Thompson, Simon G

    1998-01-01

    Objective To review critically the statistical methods used for health economic evaluations in randomised controlled trials where an estimate of cost is available for each patient in the study. Design Survey of published randomised trials including an economic evaluation with cost values suitable for statistical analysis; 45 such trials published in 1995 were identified from Medline. Main outcome measures The use of statistical methods for cost data was assessed in terms of the descriptive statistics reported, use of statistical inference, and whether the reported conclusions were justified. Results Although all 45 trials reviewed apparently had cost data for each patient, only 9 (20%) reported adequate measures of variability for these data and only 25 (56%) gave results of statistical tests or a measure of precision for the comparison of costs between the randomised groups. Only 16 (36%) of the articles gave conclusions which were justified on the basis of results presented in the paper. No paper reported sample size calculations for costs. Conclusions The analysis and interpretation of cost data from published trials reveal a lack of statistical awareness. Strong and potentially misleading conclusions about the relative costs of alternative therapies have often been reported in the absence of supporting statistical evidence. Improvements in the analysis and reporting of health economic assessments are urgently required. Health economic guidelines need to be revised to incorporate more detailed statistical advice. 
Key messages: Health economic evaluations required for important healthcare policy decisions are often carried out in randomised controlled trials. A review of such published economic evaluations assessed whether statistical methods for cost outcomes have been appropriately used and interpreted. Few publications presented adequate descriptive information for costs or performed appropriate statistical analyses. In at least two thirds of the papers, the main conclusions regarding costs were not justified. The analysis and reporting of health economic assessments within randomised controlled trials urgently need improving. PMID:9794854

  8. An astronomer's guide to period searching

    NASA Astrophysics Data System (ADS)

    Schwarzenberg-Czerny, A.

    2003-03-01

    We concentrate on the analysis of unevenly sampled time series, interrupted by periodic gaps, as often encountered in astronomy. While some of our conclusions may appear surprising, all are based on the classical statistical principles of Fisher and his successors. Except for the discussion of resolution issues, it is best for the reader to forget temporarily about Fourier transforms and to concentrate on problems of fitting a time series with a model curve. According to their statistical content we divide the issues into several sections: (ii) statistical and numerical aspects of model fitting; (iii) evaluation of fitted models as hypothesis testing; (iv) the role of orthogonal models in signal detection; (v) conditions for equivalence of periodograms; and (vi) rating sensitivity by test power. An experienced observer working with individual objects would benefit little from a formalized statistical approach. However, we demonstrate the usefulness of this approach in evaluating the performance of periodograms and in the quantitative design of large variability surveys.

  9. Estimation of absorption rate constant (ka) following oral administration by Wagner-Nelson, Loo-Riegelman, and statistical moments in the presence of a secondary peak.

    PubMed

    Mahmood, Iftekhar

    2004-01-01

    The objective of this study was to evaluate the performance of the Wagner-Nelson, Loo-Riegelman, and statistical moments methods in determining the absorption rate constant(s) in the presence of a secondary peak. These methods were also evaluated when there were two absorption rates without a secondary peak. Different sets of plasma concentration versus time data for a hypothetical drug following one- or two-compartment models were generated by simulation. The true ka was compared with the ka estimated by the Wagner-Nelson, Loo-Riegelman, and statistical moments methods. The results of this study indicate that the Wagner-Nelson, Loo-Riegelman, and statistical moments methods may not be suitable for the estimation of absorption rate constants in the presence of a secondary peak or when absorption takes place with two absorption rates.

  10. Performance evaluation of dispersion parameterization schemes in the plume simulation of FFT-07 diffusion experiment

    NASA Astrophysics Data System (ADS)

    Pandey, Gavendra; Sharan, Maithili

    2018-01-01

    Application of atmospheric dispersion models in air quality analysis requires a proper representation of the vertical and horizontal growth of the plume. For this purpose, various schemes for the parameterization of the dispersion parameters (σ's) are described for both stable and unstable conditions. These schemes differ in (i) the extent to which they use on-site measurements, (ii) formulations developed for other sites, and (iii) empirical relations. The performance of these schemes is evaluated in an earlier developed IIT (Indian Institute of Technology) dispersion model with the data set of single and multiple releases conducted at the Fusion Field Trials, Dugway Proving Ground, Utah, 2007. Qualitative and quantitative evaluation of the relative performance of all the schemes is carried out in both stable and unstable conditions in the light of (i) peak/maximum concentrations and (ii) the overall concentration distribution. The blocked bootstrap resampling technique is adopted to investigate the statistical significance of the differences in performance of the schemes by computing 95% confidence limits on the parameters FB and NMSE. The various analyses based on selected statistical measures indicated consistency in the qualitative and quantitative performance of the σ schemes. The scheme based on the standard deviation of wind velocity fluctuations and Lagrangian time scales exhibits a relatively better performance in predicting the peak as well as the lateral spread.
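
    The two measures and the bootstrap step can be sketched as follows; a plain percentile bootstrap stands in for the blocked bootstrap of the study (blocking resamples contiguous segments to preserve serial correlation), and the concentration pairs are invented.

```python
import random

def fb(obs, pred):
    """Fractional bias: 0 for a perfect model, positive when underpredicting
    (with the sign convention FB = 2(Co - Cp)/(Co + Cp))."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    return 2.0 * (mo - mp) / (mo + mp)

def nmse(obs, pred):
    """Normalised mean square error: mean squared error over (mean_obs * mean_pred)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / (len(obs) * mo * mp)

def bootstrap_ci(obs, pred, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence limits for a paired statistic."""
    rng = random.Random(seed)
    pairs = list(zip(obs, pred))
    vals = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        o, p = zip(*sample)
        vals.append(stat(o, p))
    vals.sort()
    return vals[int(alpha / 2 * n_boot)], vals[int((1 - alpha / 2) * n_boot) - 1]

# Invented observed/predicted concentrations at a few receptors.
obs = [4.0, 2.5, 6.0, 3.2, 5.1]
pred = [3.5, 2.8, 5.0, 3.0, 6.0]
ci_fb = bootstrap_ci(obs, pred, fb)
```

    Two schemes differ significantly when their 95% confidence intervals on FB (or NMSE) do not overlap, which is essentially the check performed in the study.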

  11. Statistical analysis on the concordance of the radiological evaluation of fractures of the distal radius subjected to traction☆

    PubMed Central

    Machado, Daniel Gonçalves; da Cruz Cerqueira, Sergio Auto; de Lima, Alexandre Fernandes; de Mathias, Marcelo Bezerra; Aramburu, José Paulo Gabbi; Rodarte, Rodrigo Ribeiro Pinho

    2016-01-01

    Objective The objective of this study was to evaluate the current classifications for fractures of the distal extremity of the radius, since the classifications made using traditional radiographs in anteroposterior and lateral views have been questioned regarding their reproducibility. In the literature, it has been suggested that other options are needed, such as use of preoperative radiographs on fractures of the distal radius subjected to traction, with stratification by the evaluators. The aim was to demonstrate which classification systems present better statistical reliability. Results In the Universal classification, the results from the third-year resident group (R3) and from the group of more experienced evaluators (Staff) presented excellent correlation, with a statistically significant p-value (p < 0.05). Neither of the groups presented a statistically significant result through the Frykman classification. In the AO classification, there were high correlations in the R3 and Staff groups (respectively 0.950 and 0.800), with p-values lower than 0.05 (respectively <0.001 and 0.003). Conclusion It can be concluded that radiographs performed under traction showed good concordance in the Staff group and in the R3 group, and that this is a good tactic for radiographic evaluations of fractures of the distal extremity of the radius. PMID:26962498

  12. A full year evaluation of the CALIOPE-EU air quality modeling system over Europe for 2004

    NASA Astrophysics Data System (ADS)

    Pay, M. T.; Piot, M.; Jorba, O.; Gassó, S.; Gonçalves, M.; Basart, S.; Dabdub, D.; Jiménez-Guerrero, P.; Baldasano, J. M.

    The CALIOPE-EU high-resolution air quality modeling system, namely WRF-ARW/HERMES-EMEP/CMAQ/BSC-DREAM8b, is developed and applied to Europe (12 km × 12 km, 1 h). The model performances are tested in terms of air quality levels and dynamics reproducibility on a yearly basis. The present work describes a quantitative evaluation of gas phase species (O3, NO2 and SO2) and particulate matter (PM2.5 and PM10) against ground-based measurements from the EMEP (European Monitoring and Evaluation Programme) network for the year 2004. The evaluation is based on statistics. Simulated O3 achieves satisfactory performance for both daily mean and daily maximum concentrations, especially in summer, with annual mean correlations of 0.66 and 0.69, respectively. Mean normalized errors are within the recommendations proposed by the United States Environmental Protection Agency (US-EPA). The general trends and daily variations of primary pollutants (NO2 and SO2) are satisfactory. Daily mean concentrations of NO2 correlate well with observations (annual correlation r = 0.67) but tend to be underestimated. For SO2, mean concentrations are well simulated (mean bias = 0.5 μg m-3) with a relatively high annual mean correlation (r = 0.60), although peaks are generally overestimated. The dynamics of PM2.5 and PM10 is well reproduced (0.49 < r < 0.62), but mean concentrations remain systematically underestimated. Deficiencies in particulate matter source characterization are discussed. Also, the spatially distributed statistics and the general patterns for each pollutant over Europe are examined. The model performances are compared with those of other European studies. While O3 statistics generally remain lower than those obtained by the other considered studies, statistics for NO2, SO2, PM2.5 and PM10 present higher scores than most models.

  13. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network

    PubMed Central

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-01

    A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to extract weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove components of high similarity. The optimal significance level α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, SPs with high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006

  14. Intelligent Condition Diagnosis Method Based on Adaptive Statistic Test Filter and Diagnostic Bayesian Network.

    PubMed

    Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing

    2016-01-08

    A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to extract weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove components of high similarity. The optimal significance level α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitivity of symptom parameters (SPs) for condition diagnosis. In this way, SPs with high sensitivity for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method.

  15. Drug safety data mining with a tree-based scan statistic.

    PubMed

    Kulldorff, Martin; Dashevsky, Inna; Avery, Taliser R; Chan, Arnold K; Davis, Robert L; Graham, David; Platt, Richard; Andrade, Susan E; Boudreau, Denise; Gunter, Margaret J; Herrinton, Lisa J; Pawloski, Pamala A; Raebel, Marsha A; Roblin, Douglas; Brown, Jeffrey S

    2013-05-01

    In post-marketing drug safety surveillance, data mining can potentially detect rare but serious adverse events. Assessing an entire collection of drug-event pairs is traditionally performed on a predefined level of granularity. It is unknown a priori whether a drug causes a very specific or a set of related adverse events, such as mitral valve disorders, all valve disorders, or different types of heart disease. This methodological paper evaluates the tree-based scan statistic data mining method to enhance drug safety surveillance. We use a three-million-member electronic health records database from the HMO Research Network. Using the tree-based scan statistic, we assess the safety of selected antifungal and diabetes drugs, simultaneously evaluating overlapping diagnosis groups at different granularity levels, adjusting for multiple testing. Expected and observed adverse event counts were adjusted for age, sex, and health plan, producing a log likelihood ratio test statistic. Out of 732 evaluated disease groupings, 24 were statistically significant, divided among 10 non-overlapping disease categories. Five of the 10 signals are known adverse effects, four are likely due to confounding by indication, while one may warrant further investigation. The tree-based scan statistic can be successfully applied as a data mining tool in drug safety surveillance using observational data. The total number of statistical signals was modest and does not imply a causal relationship. Rather, data mining results should be used to generate candidate drug-event pairs for rigorous epidemiological studies to evaluate the individual and comparative safety profiles of drugs. Copyright © 2013 John Wiley & Sons, Ltd.
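
    The log likelihood ratio test statistic this record computes for each disease grouping can be illustrated with a single-node score under a Kulldorff-style conditional Poisson model. This is a textbook formulation assumed for illustration, not necessarily the exact variant used in the study:

    ```python
    # Sketch of the per-node score in a tree-based scan statistic:
    # c observed adverse events at the node, n expected events (adjusted
    # for covariates), C total events across the whole tree.
    from math import log

    def node_llr(c, n, C):
        if c <= n:          # only event excesses generate safety signals
            return 0.0
        llr = c * log(c / n)
        if C - c > 0:
            llr += (C - c) * log((C - c) / (C - n))
        return llr
    ```

    In the scan-statistic framework, the maximum of this score over all (overlapping) tree nodes is compared against its Monte Carlo null distribution, which is what provides the adjustment for multiple testing mentioned in the abstract.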

  16. 2008 Post-Election Voting Survey of Overseas Citizens: Statistical Methodology Report

    DTIC Science & Technology

    2009-08-01

    Gorsak. Westat performed data collection and editing. DMDC’s Survey Technology Branch, under the guidance of Frederick Licari, Branch Chief, is...POST-ELECTION VOTING SURVEY OF OVERSEAS CITIZENS: STATISTICAL METHODOLOGY REPORT Executive Summary The Uniformed and Overseas Citizens Absentee ...ease the process of voting absentee , (3) to evaluate other progress made to facilitate voting participation, and (4) to identify any remaining

  17. Various Effects of Embedded Intrapulse Communications on Pulsed Radar

    DTIC Science & Technology

    2017-06-01

    specific type of interference that may be encountered by radar; however, this introductory information should suffice to illustrate to the reader why...chapter we seek to not merely understand the overall statistical performance of the radar with embedded intrapulse communications but rather to evaluate...Theory Probability of detection, discussed in Chapter 4, assesses the statistical probability of a radar accurately identifying a target given a

  18. Performance of Between-Study Heterogeneity Measures in the Cochrane Library.

    PubMed

    Ma, Xiaoyue; Lin, Lifeng; Qu, Zhiyong; Zhu, Motao; Chu, Haitao

    2018-05-29

    The growth in comparative effectiveness research and evidence-based medicine has increased attention to systematic reviews and meta-analyses. Meta-analysis synthesizes and contrasts evidence from multiple independent studies to improve statistical efficiency and reduce bias. Assessing heterogeneity is critical for performing a meta-analysis and interpreting its results. As a widely used heterogeneity measure, the I² statistic quantifies the proportion of total variation across studies that is due to real differences in effect size. The presence of outlying studies can seriously exaggerate the I² statistic. Two alternative heterogeneity measures, I²r and I²m, have recently been proposed to reduce the impact of outlying studies. To evaluate these measures' performance empirically, we applied them to 20,599 meta-analyses in the Cochrane Library. We found that I²r and I²m agree strongly with I², while being more robust than I² when outlying studies appear.
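
    The I² (I-squared) heterogeneity measure discussed in this record (though not the paper's robust variants) can be sketched from Cochran's Q under a fixed-effect model; this is a minimal illustration with hypothetical names:

    ```python
    # Sketch of I^2 from Cochran's Q: the proportion of total variation
    # across studies attributable to heterogeneity rather than chance.
    def i_squared(effects, variances):
        weights = [1.0 / v for v in variances]            # inverse-variance weights
        pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
        q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
        df = len(effects) - 1
        return max(0.0, (q - df) / q) if q > 0 else 0.0   # truncated at zero
    ```

    A single outlying effect size inflates Q, and hence I², which is exactly the sensitivity the record's alternative measures are designed to reduce.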

  19. A statistical framework for evaluating neural networks to predict recurrent events in breast cancer

    NASA Astrophysics Data System (ADS)

    Gorunescu, Florin; Gorunescu, Marina; El-Darzi, Elia; Gorunescu, Smaranda

    2010-07-01

    Breast cancer is the second leading cause of cancer deaths in women today. Sometimes, breast cancer can return after primary treatment. A medical diagnosis of recurrent cancer is often a more challenging task than the initial one. In this paper, we investigate the potential contribution of neural networks (NNs) to support health professionals in diagnosing such events. The NN algorithms are tested and applied to two different datasets. An extensive statistical analysis has been performed to verify our experiments. The results show that a simple network structure for both the multi-layer perceptron and radial basis function can produce equally good results, not all attributes are needed to train these algorithms and, finally, the classification performances of all algorithms are statistically robust. Moreover, we have shown that the best performing algorithm will strongly depend on the features of the datasets, and hence, there is not necessarily a single best classifier.

  20. Impact of Embedded Military Metal Alloys on Skeletal Physiology in an Animal Model

    DTIC Science & Technology

    2017-04-04

    turnover were completed and statistical comparison performed for each time point. Each ELISA was performed according to the instructions within each kit...expectations for controls. Results of osteocalcin ELISA were evaluated and any results with a coefficient of variation greater than 25% were omitted...Results of TRAP5b ELISA were evaluated and any results with a coefficient of variation greater than 25% were omitted from analysis. Measures of TRAP5b

  1. The Impact of Linking Distinct Achievement Test Scores on the Interpretation of Student Growth in Achievement

    ERIC Educational Resources Information Center

    Airola, Denise Tobin

    2011-01-01

    Changes to state tests impact the ability of State Education Agencies (SEAs) to monitor change in performance over time. The purpose of this study was to evaluate the Standardized Performance Growth Index (PGIz), a proposed statistical model for measuring change in student and school performance, across transitions in tests. The PGIz is a…

  2. Relationship between sitting volleyball performance and field fitness of sitting volleyball players in Korea

    PubMed Central

    Jeoung, Bogja

    2017-01-01

    The purpose of this study was to evaluate the relationship between sitting volleyball performance and the field fitness of sitting volleyball players. Forty-five elite sitting volleyball players participated in 10 field fitness tests. Additionally, the players' head coach and coach assessed their volleyball performance (receive and defense, block, attack, and serve). Data were analyzed with SPSS software version 21 using correlation and regression analyses, and the significance level was set at P < 0.05. The results showed that chest pass, overhand throw, one-hand throw, one-hand side throw, sprint, speed endurance, reaction time, and graded exercise test results had a statistically significant influence on the players' abilities to attack, serve, and block. Grip strength, t-test, speed, and agility showed a statistically significant relationship with the players' skill at defense and receive. Our results showed that chest pass, overhand throw, one-hand throw, one-hand side throw, speed endurance, reaction time, and graded exercise test results had a statistically significant influence on volleyball performance. PMID:29326896

  3. Cognitive load, emotion, and performance in high-fidelity simulation among beginning nursing students: a pilot study.

    PubMed

    Schlairet, Maura C; Schlairet, Timothy James; Sauls, Denise H; Bellflowers, Lois

    2015-03-01

    Establishing the impact of the high-fidelity simulation environment on student performance, as well as identifying factors that could predict learning, would refine simulation outcome expectations among educators. The purpose of this quasi-experimental pilot study was to explore the impact of simulation on emotion and cognitive load among beginning nursing students. Forty baccalaureate nursing students participated in teaching simulations, rated their emotional state and cognitive load, and completed evaluation simulations. Two principal components of emotion were identified, representing the pleasant activation and pleasant deactivation components of affect. The mean rating of cognitive load following simulation was high. Linear regression identified slight but statistically nonsignificant positive associations between the principal components of emotion and cognitive load. Logistic regression identified a negative but statistically nonsignificant effect of cognitive load on assessment performance. Among lower-ability students, a more pronounced effect of cognitive load on assessment performance was observed; this also was statistically nonsignificant. Copyright 2015, SLACK Incorporated.

  4. Prediction of Osteopathic Medical School Performance on the basis of MCAT score, GPA, sex, undergraduate major, and undergraduate institution.

    PubMed

    Dixon, Donna

    2012-04-01

    The relationships of students' preadmission academic variables, sex, undergraduate major, and undergraduate institution to academic performance in medical school have not been thoroughly examined. To determine the ability of students' preadmission academic variables to predict osteopathic medical school performance and whether students' sex, undergraduate major, or undergraduate institution influence osteopathic medical school performance. The study followed students who graduated from New York College of Osteopathic Medicine of New York Institute of Technology in Old Westbury between 2003 and 2006. Student preadmission data were Medical College Admission Test (MCAT) scores, undergraduate grade point averages (GPAs), sex, undergraduate major, and undergraduate institutional selectivity. Medical school performance variables were GPAs, clinical performance (ie, clinical subject examinations and clerkship evaluations), and scores on the Comprehensive Osteopathic Medical Licensing Examination-USA (COMLEX-USA) Level 1 and Level 2-Clinical Evaluation (CE). Data were analyzed with Pearson product moment correlation coefficients and multivariate linear regression analyses. Differences between student groups were compared with the independent-samples, 2-tailed t test. A total of 737 students were included. All preadmission academic variables, except nonscience undergraduate GPA, were statistically significant predictors of performance on COMLEX-USA Level 1, and all preadmission academic variables were statistically significant predictors of performance on COMLEX-USA Level 2-CE. The MCAT score for biological sciences had the highest correlation among all variables with COMLEX-USA Level 1 performance (Pearson r=0.304; P<.001) and Level 2-CE performance (Pearson r=0.272; P<.001). All preadmission variables were moderately correlated with the mean clinical subject examination scores. 
The mean clerkship evaluation score was moderately correlated with mean clinical examination results (Pearson r=0.267; P<.001) and COMLEX-USA Level 2-CE performance (Pearson r=0.301; P<.001). Clinical subject examination scores were highly correlated with COMLEX-USA Level 2-CE scores (Pearson r=0.817; P<.001). No statistically significant difference in medical school performance was found between students with science and nonscience undergraduate majors, nor was undergraduate institutional selectivity a factor influencing performance. Students' preadmission academic variables were predictive of osteopathic medical school performance, including GPAs, clinical performance, and COMLEX-USA Level 1 and Level 2-CE results. Clinical performance was predictive of COMLEX-USA Level 2-CE performance.

  5. Physical fitness modulates incidental but not intentional statistical learning of simultaneous auditory sequences during concurrent physical exercise.

    PubMed

    Daikoku, Tatsuya; Takahashi, Yuji; Futagami, Hiroko; Tarumoto, Nagayoshi; Yasuda, Hideki

    2017-02-01

    In real-world auditory environments, humans are exposed to overlapping auditory information, such as that produced by human voices and musical instruments, even during routine physical activities such as walking and cycling. The present study investigated how concurrent physical exercise affects incidental and intentional learning of overlapping auditory streams, and whether physical fitness modulates learning performance. Participants were divided into lower- and higher-fitness groups of 11 participants each, based on their VO2max values. They were presented with two simultaneous auditory sequences, each with a distinct statistical regularity (i.e., statistical learning), both while pedaling on a stationary bike and while seated on the bike at rest. In experiment 1, they were instructed to attend to one of the two sequences and ignore the other. In experiment 2, they were instructed to attend to both sequences. After exposure to the sequences, learning effects were evaluated with a familiarity test. In experiment 1, statistical learning of the ignored sequence during concurrent pedaling was better in participants with high physical fitness than in those with low fitness, whereas for the attended sequence there was no significant difference in statistical learning between the high- and low-fitness groups. Furthermore, there was no significant effect of physical fitness on learning while resting. In experiment 2, participants with both high and low physical fitness could intentionally learn the statistics of the two simultaneous sequences in both the exercise and rest sessions. Improved physical fitness might thus facilitate incidental, but not intentional, statistical learning of simultaneous auditory sequences during concurrent physical exercise.

  6. Dynamic Assessment for 3- and 4-Year-Old Children Who Use Augmentative and Alternative Communication: Evaluating Expressive Syntax.

    PubMed

    Binger, Cathy; Kent-Walsh, Jennifer; King, Marika

    2017-07-12

    The developmental readiness to produce early sentences with an iPad communication application was assessed with ten 3- and 4-year-old children with severe speech disorders using graduated prompting dynamic assessment (DA) techniques. The participants' changes in performance within the DA sessions were evaluated, and DA performance was compared with performance during a subsequent intervention. Descriptive statistics were used to examine patterns of performance at various cueing levels and mean levels of cueing support. The Wilcoxon signed-ranks test was used to measure changes within the DA sessions. Correlational data were calculated to determine how well performance in DA predicted performance during a subsequent intervention. Participants produced targets successfully in DA at various cueing levels, with some targets requiring less cueing than others. Performance improved significantly within the DA sessions-that is, the level of cueing required for accurate productions of the targets decreased during DA sessions. Last, moderate correlations existed between DA scores and performance during the intervention for 3 out of 4 targets, with statistically significant findings for 2 of 4 targets. DA offers promise for examining the developmental readiness of young children who use augmentative and alternative communication to produce early expressive language structures.

  7. Dynamic Assessment for 3- and 4-Year-Old Children Who Use Augmentative and Alternative Communication: Evaluating Expressive Syntax

    PubMed Central

    Kent-Walsh, Jennifer; King, Marika

    2017-01-01

    Purpose The developmental readiness to produce early sentences with an iPad communication application was assessed with ten 3- and 4-year-old children with severe speech disorders using graduated prompting dynamic assessment (DA) techniques. The participants' changes in performance within the DA sessions were evaluated, and DA performance was compared with performance during a subsequent intervention. Method Descriptive statistics were used to examine patterns of performance at various cueing levels and mean levels of cueing support. The Wilcoxon signed-ranks test was used to measure changes within the DA sessions. Correlational data were calculated to determine how well performance in DA predicted performance during a subsequent intervention. Results Participants produced targets successfully in DA at various cueing levels, with some targets requiring less cueing than others. Performance improved significantly within the DA sessions—that is, the level of cueing required for accurate productions of the targets decreased during DA sessions. Last, moderate correlations existed between DA scores and performance during the intervention for 3 out of 4 targets, with statistically significant findings for 2 of 4 targets. Conclusion DA offers promise for examining the developmental readiness of young children who use augmentative and alternative communication to produce early expressive language structures. PMID:28614580

  8. Mathematical modeling and statistical analysis of SPE-OCDMA systems utilizing second harmonic generation effect in thick crystal receivers

    NASA Astrophysics Data System (ADS)

    Matinfar, Mehdi D.; Salehi, Jawad A.

    2009-11-01

    In this paper we analytically study and evaluate the performance of a spectral-phase-encoded optical CDMA (SPE-OCDMA) system for different parameters, such as the user's code length and the number of users in the network. In this system, an advanced receiver structure is used in which the second harmonic generation effect in a thick crystal is employed as a nonlinear pre-processor prior to the conventional low-speed photodetector. We consider the ASE noise of the optical amplifiers, which is significant in low-power conditions, in addition to multiple access interference (MAI) noise, the dominant noise source in any OCDMA communications system. We use the results of our previous work, in which we analyzed the statistical behavior of thick crystals in an optically amplified digital lightwave communication system, to evaluate the performance of the SPE-OCDMA system with the thick-crystal receiver structure. The error probability is evaluated using a saddle-point approximation, and the approximation is verified by Monte Carlo simulation.
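
    The Monte Carlo verification step mentioned at the end of this record can be illustrated with a toy error-probability estimator. The binary antipodal channel with Gaussian noise below is an assumed stand-in, far simpler than the SPE-OCDMA system model, but it shows the pattern of checking an analytic error probability against simulated trials:

    ```python
    # Toy Monte Carlo bit-error-probability estimate for a binary antipodal
    # decision (+1 sent, decide error if received sample falls below 0).
    import random
    from math import erfc, sqrt

    def mc_error_probability(snr_db, trials=200_000, seed=1):
        random.seed(seed)
        snr = 10 ** (snr_db / 10)
        sigma = 1.0 / sqrt(2 * snr)          # noise std dev for unit signal energy
        errors = sum(1 for _ in range(trials)
                     if 1.0 + random.gauss(0.0, sigma) < 0.0)
        return errors / trials

    # Analytic reference for this toy channel: 0.5 * erfc(sqrt(snr)).
    ```

    In the paper's setting, the quantity being verified is instead the saddle-point approximation of the error probability under MAI and ASE noise, but the simulate-and-compare methodology is the same.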

  9. Initial evaluation of discrete orthogonal basis reconstruction of ECT images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moody, E.B.; Donohue, K.D.

    1996-12-31

    Discrete orthogonal basis restoration (DOBR) is a linear, non-iterative, and robust method for solving inverse problems for systems characterized by shift-variant transfer functions. This simulation study evaluates the feasibility of using DOBR for reconstructing emission computed tomographic (ECT) images. The imaging system model uses typical SPECT parameters and incorporates the effects of attenuation, spatially-variant PSF, and Poisson noise in the projection process. Sample reconstructions and statistical error analyses for a class of digital phantoms compare the DOBR performance for Hartley and Walsh basis functions. Test results confirm that DOBR with either basis set produces images with good statistical properties. No problems were encountered with reconstruction instability. The flexibility of the DOBR method and its consistent performance warrants further investigation of DOBR as a means of ECT image reconstruction.

  10. Randomized clinical trial of encapsulated and hand-mixed glass-ionomer ART restorations: one-year follow-up.

    PubMed

    Freitas, Maria Cristina Carvalho de Almendra; Fagundes, Ticiane Cestari; Modena, Karin Cristina da Silva; Cardia, Guilherme Saintive; Navarro, Maria Fidela de Lima

    2018-01-18

    This prospective, randomized, split-mouth clinical trial evaluated the clinical performance of conventional glass ionomer cement (GIC; Riva Self-Cure, SDI), supplied in capsules or in powder/liquid kits and placed in Class I cavities in permanent molars by the Atraumatic Restorative Treatment (ART) approach. A total of 80 restorations were randomly placed in 40 patients aged 11-15 years. Each patient received one restoration with each type of GIC. The restorations were evaluated after periods of 15 days (baseline), 6 months, and 1 year, according to ART criteria. Wilcoxon matched pairs, multivariate logistic regression, and Gehan-Wilcoxon tests were used for statistical analysis. Patients were evaluated after 15 days (n=40), 6 months (n=34), and 1 year (n=29). Encapsulated GICs showed significantly superior clinical performance compared with hand-mixed GICs at baseline (p=0.017), 6 months (p=0.001), and 1 year (p=0.026). For hand-mixed GIC, a statistically significant difference was only observed over the period of baseline to 1 year (p=0.001). Encapsulated GIC presented statistically significant differences for the following periods: 6 months to 1 year (p=0.028) and baseline to 1 year (p=0.002). Encapsulated GIC presented superior cumulative survival rate than hand-mixed GIC over one year. Importantly, both GICs exhibited decreased survival over time. Encapsulated GIC promoted better ART performance, with an annual failure rate of 24%; in contrast, hand-mixed GIC demonstrated a failure rate of 42%.

  11. Effect of multiple spin species on spherical shell neutron transmission analysis

    NASA Technical Reports Server (NTRS)

    Semler, T. T.

    1972-01-01

    A series of Monte Carlo calculations was performed in order to evaluate the effect of separated versus merged spin statistics on the analysis of spherical shell neutron transmission experiments for gold. It is shown that the use of separated spin statistics results in larger average capture cross sections for gold at 24 keV. This effect is explained by stronger windows in the total cross section caused by interference between potential and J(+) resonances and by the J(+) and J(-) resonance overlap allowed by the use of separated spin statistics.

  12. Comparative evaluation of the accuracy of linear measurements between cone beam computed tomography and 3D microtomography.

    PubMed

    Mangione, Francesca; Meleo, Deborah; Talocco, Marco; Pecci, Raffaella; Pacifici, Luciano; Bedini, Rossella

    2013-01-01

    The aim of this study was to evaluate the influence of artifacts on the accuracy of linear measurements estimated with a common cone beam computed tomography (CBCT) system used in dental clinical practice, by comparing it with a micro-CT system as the standard reference. Ten cylindrical bovine bone samples, each containing one implant able to provide both points of reference and image quality degradation, were scanned with the CBCT and micro-CT systems. Using the software of the two systems, two diameters were measured for each cylindrical sample at different levels, using different points of the implants as references. The results were analyzed by ANOVA, and a statistically significant difference was found. Based on these results, the measurements made with the two instruments are not yet statistically comparable, although some samples yielded similar performances with differences that were not statistically significant. As the hardware and software of CBCT systems improve, the two instruments may in the near future provide similar performances.

  13. Academic Performance and Perceived Stress among University Students

    ERIC Educational Resources Information Center

    Talib, Nadeem; Zia-ur-Rehman, Muhammad

    2012-01-01

    This study aims to investigate the effect of factors such as perceived stress on the academic performance of students. A sample of 199 university graduates and undergraduates in Rawalpindi and Islamabad was selected as a statistical frame. The instrument used in this study is a previously validated construct intended to evaluate the effect of…

  14. 75 FR 47592 - Final Test Guideline; Product Performance of Skin-applied Insect Repellents of Insect and Other...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-06

    ... considerations affecting the design and conduct of repellent studies when human subjects are involved. Any... recommendations for the design and execution of studies to evaluate the performance of pesticide products intended... recommends appropriate study designs and methods for selecting subjects, statistical analysis, and reporting...

  15. Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Ye; Ma, Xiaosong; Liu, Qing Gary

    2015-01-01

    Parallel application benchmarks are indispensable for evaluating/optimizing HPC software and hardware. However, it is very challenging and costly to obtain high-fidelity benchmarks reflecting the scale and complexity of state-of-the-art parallel applications. Hand-extracted synthetic benchmarks are time- and labor-intensive to create. Real applications themselves, while offering the most accurate performance evaluation, are expensive to compile, port, reconfigure, and often plainly inaccessible due to security or ownership concerns. This work contributes APPRIME, a novel tool for trace-based automatic parallel benchmark generation. Taking as input standard communication-I/O traces of an application's execution, it couples accurate automatic phase identification with statistical regeneration of event parameters to create compact, portable, and to some degree reconfigurable parallel application benchmarks. Experiments with four NAS Parallel Benchmarks (NPB) and three real scientific simulation codes confirm the fidelity of APPRIME benchmarks. They retain the original applications' performance characteristics, in particular the relative performance across platforms.

  16. Performance evaluation of three computed radiography systems using methods recommended in American Association of Physicists in Medicine Report 93

    PubMed Central

    Muhogora, Wilbroad; Padovani, Renato; Bonutti, Faustino; Msaki, Peter; Kazema, R.

    2011-01-01

    The performances of three clinical computed radiography (CR) systems (Agfa CR 75 with CRMD 4.0 image plates, Kodak CR 850 with Kodak GP plates, and Kodak CR 850A with Kodak GP plates) were evaluated using six tests recommended in American Association of Physicists in Medicine Report 93. The results indicated variable performances, with the majority being within acceptable limits. The variations were mainly attributed to differences in detector formulations, plate readers' characteristics, and aging effects. The differences in mean low-contrast scores between the imaging systems for three observers were statistically significant between Agfa and Kodak CR 850A (P=0.009) and between the Kodak CR systems (P=0.006), probably because of differences in age. However, the differences were not statistically significant between Agfa and Kodak CR 850 (P=0.284), suggesting similar perceived image quality. The study demonstrates the need to implement a quality control program regularly. PMID:21897559

  17. Performance evaluation of three computed radiography systems using methods recommended in American Association of Physicists in Medicine Report 93.

    PubMed

    Muhogora, Wilbroad; Padovani, Renato; Bonutti, Faustino; Msaki, Peter; Kazema, R

    2011-07-01

    The performances of three clinical computed radiography (CR) systems (Agfa CR 75 with CRMD 4.0 image plates, Kodak CR 850 with Kodak GP plates, and Kodak CR 850A with Kodak GP plates) were evaluated using six tests recommended in American Association of Physicists in Medicine Report 93. The results indicated variable performances, with the majority being within acceptable limits. The variations were mainly attributed to differences in detector formulations, plate readers' characteristics, and aging effects. The differences in mean low-contrast scores between the imaging systems for three observers were statistically significant between Agfa and Kodak CR 850A (P=0.009) and between the Kodak CR systems (P=0.006), probably because of differences in age. However, the differences were not statistically significant between Agfa and Kodak CR 850 (P=0.284), suggesting similar perceived image quality. The study demonstrates the need to implement a quality control program regularly.

  18. Knee osteoarthritis, dyslipidemia syndrome and exercise.

    PubMed

    Păstrăiguş, Carmen; Ancuţa, Codrina; Miu, Smaranda; Ancuţa, E; Chirieac, Rodica

    2012-01-01

    The aim of our study was to evaluate the influence of aerobic training on dyslipidemia in patients with knee osteoarthritis (KOA). This prospective observational six-month study was performed on 40 patients with KOA fulfilling the inclusion criteria, classified into two subgroups according to their participation in a specific aerobic training program (30 minutes/day, 5 days/week). A standard evaluation protocol was followed, assessing lipid parameters (total cholesterol, triglycerides, LDL-cholesterol, HDL-cholesterol levels) at baseline, three and six months. Statistical analysis was performed in SPSS 16.0, with significance set at p < 0.05. Subgroup analysis demonstrated a statistically significant improvement in plasma lipid levels (cholesterol, triglycerides, HDL-cholesterol, LDL-cholesterol) in all patients performing regular aerobic training (p < 0.05). Although the differences reported for total cholesterol, triglycerides and LDL-cholesterol between subgroups after six months were not significant (p > 0.05), the mean level of HDL-cholesterol was significantly higher in patients performing aerobic training, reaching cardiovascular-protective levels. Regular aerobic exercise has a positive effect on plasma lipoprotein concentrations; further research is needed to assess the long-term effects of physical exercise on both KOA and the lipid profile.

  19. Phosphorylated neurofilament heavy: A potential blood biomarker to evaluate the severity of acute spinal cord injuries in adults

    PubMed Central

    Singh, Ajai; Kumar, Vineet; Ali, Sabir; Mahdi, Abbas Ali; Srivastava, Rajeshwer Nath

    2017-01-01

    Aims: The aim of this study is to analyze the serial estimation of phosphorylated neurofilament heavy (pNF-H) in blood plasma as a potential biomarker for early prediction of the neurological severity of acute spinal cord injuries (SCI) in adults. Settings and Design: Pilot/observational study. Subjects and Methods: A total of 40 patients (28 cases and 12 controls) with spine injury were included in this study. In the enrolled cases, the plasma level of pNF-H was evaluated in blood samples and neurological evaluation was performed with the American Spinal Injury Association Injury Scale at specified periods. Serial plasma pNF-H values were then correlated with the neurological status of these patients during follow-up visits and analyzed statistically. Statistical Analysis Used: Statistical analysis was performed using GraphPad InStat software (version 3.05 for Windows, San Diego, CA, USA). The correlation between clinical progression and pNF-H expression was analyzed using Spearman's correlation. Results: The mean baseline level of pNF-H in cases was 6.40 ± 2.49 ng/ml, whereas in controls it was 0.54 ± 0.27 ng/ml. On analyzing the association between the two by the Mann-Whitney U test, the difference in levels was found to be statistically significant. The association between neurological progression and pNF-H expression was determined using correlation analysis (Spearman's correlation). At the 95% confidence interval, the correlation coefficient was found to be 0.64, and the correlation was statistically significant. Conclusions: Plasma pNF-H levels were elevated in accordance with the severity of SCI. Therefore, pNF-H may be considered a potential biomarker for early determination of the severity of SCI in adult patients. PMID:29291173
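
    Where the record above reports a Mann-Whitney U test between cases and controls and a Spearman correlation with neurological status, the same two analyses can be sketched in Python with SciPy (all data values below are invented for illustration; GraphPad InStat itself is not scriptable here):

```python
# Hypothetical sketch: Mann-Whitney U test comparing plasma pNF-H levels
# between SCI cases and controls, then Spearman's correlation between
# pNF-H and an ordinal severity grade. All numbers are invented.
from scipy import stats

cases = [4.1, 5.8, 6.3, 7.9, 9.2, 6.6, 5.0, 8.4]    # ng/ml, invented
controls = [0.3, 0.5, 0.6, 0.8, 0.4, 0.7]           # ng/ml, invented

u_stat, p_value = stats.mannwhitneyu(cases, controls, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.4f}")

severity = [2, 3, 3, 4, 5, 3, 2, 5]                 # invented ordinal grades
rho, p_corr = stats.spearmanr(cases, severity)
print(f"Spearman rho = {rho:.2f}, p = {p_corr:.4f}")
```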

  20. Flight tests for the assessment of task performance and control activity

    NASA Technical Reports Server (NTRS)

    Pausder, H. J.; Hummes, D.

    1982-01-01

    The tests were performed with the helicopters BO 105 and UH-1D. In line with tactical demands, the six test pilots' task was to minimize the time and the altitude over the obstacles. The data reduction yields statistical evaluation parameters describing the control activity of the pilots and the achieved task performance. The results are shown in the form of evaluation diagrams. Additionally, dolphin tests with varied control strategies were performed to gain more insight into the influence of control techniques. From these test results, recommendations can be derived to emphasize direct force control and to reduce the collective-to-pitch cross-coupling for the dolphin maneuver.

  1. Statistical methods for convergence detection of multi-objective evolutionary algorithms.

    PubMed

    Trautmann, H; Wagner, T; Naujoks, B; Preuss, M; Mehnen, J

    2009-01-01

    In this paper, two approaches for estimating the generation in which a multi-objective evolutionary algorithm (MOEA) shows statistically significant signs of convergence are introduced. A set-based perspective is taken where convergence is measured by performance indicators. The proposed techniques fulfill the requirements of proper statistical assessment on the one hand and efficient optimisation for real-world problems on the other hand. The first approach accounts for the stochastic nature of the MOEA by repeating the optimisation runs for increasing generation numbers and analysing the performance indicators using statistical tools. This technique results in a very robust offline procedure. Moreover, an online convergence detection method is introduced as well. This method automatically stops the MOEA when either the variance of the performance indicators falls below a specified threshold or a stagnation of their overall trend is detected. Both methods are analysed and compared for two MOEAs on different classes of benchmark functions. It is shown that the methods successfully operate on all stated problems while requiring fewer function evaluations and preserving good approximation quality.
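
    The online stopping rule described above (halt when the indicator's variance drops below a threshold) can be sketched as follows; the decaying "indicator" series, window size and threshold are all illustrative assumptions, not the authors' settings:

```python
# Toy sketch of online convergence detection: stop the run once the
# variance of a performance indicator over a sliding window of
# generations falls below a threshold. The indicator here is a toy
# decaying series, not the output of a real MOEA.
import random
import statistics

def converged(history, window=10, var_threshold=1e-4):
    """True once the variance over the last `window` values is small."""
    if len(history) < window:
        return False
    return statistics.variance(history[-window:]) < var_threshold

random.seed(0)
indicator, value = [], 1.0
for generation in range(1000):
    value = 0.9 * value + random.gauss(0, 0.001)   # toy improving indicator
    indicator.append(value)
    if converged(indicator):
        break                                      # stop: no significant change

print(f"stopped at generation {generation}")
```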

  2. Statistical process control: A feasibility study of the application of time-series measurement in early neurorehabilitation after acquired brain injury.

    PubMed

    Markovic, Gabriela; Schult, Marie-Louise; Bartfai, Aniko; Elg, Mattias

    2017-01-31

    Progress in early cognitive recovery after acquired brain injury is uneven and unpredictable, which makes the evaluation of rehabilitation complex. Time-series measurement is sensitive to statistical change arising from process variation. The objective was to evaluate the feasibility of using a time-series method, statistical process control, in early cognitive rehabilitation. Participants were 27 patients with acquired brain injury undergoing interdisciplinary rehabilitation of attention within 4 months post-injury. The outcome measure, the Paced Auditory Serial Addition Test, was analysed using statistical process control. Statistical process control identifies if and when change occurs in the process, here according to 3 patterns: rapid, steady or stationary performers. The statistical process control method was adjusted, in terms of constructing the baseline and the total number of measurement points, in order to measure a process in change. Statistical process control methodology is feasible for use in early cognitive rehabilitation, since it provides information about change in a process, thus enabling adjustment of the individual treatment response. Together with the results indicating discernible subgroups that respond differently to rehabilitation, statistical process control could be a valid tool in clinical decision-making. This study is a starting point in understanding the rehabilitation process using a real-time measurement approach.
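
    A minimal individuals control chart, the core tool of statistical process control, can be sketched as below; the baseline and follow-up scores are invented, not PASAT data from the study:

```python
# Illustrative individuals control chart: a baseline phase fixes the
# centre line and 3-sigma limits; follow-up scores outside the limits
# signal genuine change rather than random process variation.
import statistics

baseline = [31, 29, 33, 30, 28, 32, 30, 31]   # invented baseline scores
follow_up = [32, 34, 38, 41, 45, 47]          # invented later scores

centre = statistics.mean(baseline)
sigma = statistics.stdev(baseline)
ucl, lcl = centre + 3 * sigma, centre - 3 * sigma

signals = [i for i, score in enumerate(follow_up) if not lcl <= score <= ucl]
print(f"centre={centre:.1f}, UCL={ucl:.1f}, LCL={lcl:.1f}, signals={signals}")
```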

  3. Assessing the Robustness of Graph Statistics for Network Analysis Under Incomplete Information

    DTIC Science & Technology

    strategy for dismantling these networks based on their network structure. However, these strategies typically assume complete information about the...combat them with missing information . This thesis analyzes the performance of a variety of network statistics in the context of incomplete information by...leveraging simulation to remove nodes and edges from networks and evaluating the effect this missing information has on our ability to accurately

  4. Statistical evaluation of the metallurgical test data in the ORR-PSF-PVS irradiation experiment. [PWR; BWR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stallmann, F.W.

    1984-08-01

    A statistical analysis of Charpy test results of the two-year Pressure Vessel Simulation metallurgical irradiation experiment was performed. Determinations of transition temperature and upper shelf energy derived from computer fits compare well with eyeball fits. Uncertainties for all results can be obtained with computer fits. The results were compared with predictions in Regulatory Guide 1.99 and other irradiation damage models.

  5. Efficient evaluation of wireless real-time control networks.

    PubMed

    Horvath, Peter; Yampolskiy, Mark; Koutsoukos, Xenofon

    2015-02-11

    In this paper, we present a system simulation framework for the design and performance evaluation of complex wireless cyber-physical systems. We describe the simulator architecture and the specific developments that are required to simulate cyber-physical systems relying on multi-channel, multi-hop mesh networks. We introduce realistic and efficient physical layer models and a system simulation methodology, which provides statistically significant performance evaluation results with low computational complexity. The capabilities of the proposed framework are illustrated using the example of WirelessHART, a centralized, real-time, multi-hop mesh network designed for industrial control and monitoring applications.

  6. Statistical methodologies for the control of dynamic remapping

    NASA Technical Reports Server (NTRS)

    Saltz, J. H.; Nicol, D. M.

    1986-01-01

    Following an initial mapping of a problem onto a multiprocessor machine or computer network, system performance often deteriorates with time. In order to maintain high performance, it may be necessary to remap the problem. The decision to remap must take into account measurements of performance deterioration, the cost of remapping, and the estimated benefits achieved by remapping. We examine the tradeoff between the costs and the benefits of remapping two qualitatively different kinds of problems. One problem assumes that performance deteriorates gradually, the other assumes that performance deteriorates suddenly. We consider a variety of policies for governing when to remap. In order to evaluate these policies, statistical models of problem behaviors are developed. Simulation results are presented which compare simple policies with computationally expensive optimal decision policies; these results demonstrate that for each problem type, the proposed simple policies are effective and robust.
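
    The tradeoff described above can be made concrete with a toy simulation: performance decays each step, remapping restores it at a fixed cost, and a threshold policy decides when to remap. All constants are illustrative assumptions, not taken from the paper:

```python
# Toy model of the remapping tradeoff: compare total work done with no
# remapping against a simple "remap when slowdown exceeds a threshold"
# policy. Decay rate and remap cost are invented for illustration.
def run(policy_threshold, steps=200, decay=0.01, remap_cost=0.5):
    rate, total = 1.0, 0.0
    for _ in range(steps):
        if 1.0 - rate > policy_threshold:
            total -= remap_cost        # pay the remapping cost (work units)
            rate = 1.0                 # mapping restored to full speed
        total += rate                  # work done this step
        rate -= decay * rate           # gradual performance deterioration
    return total

never = run(policy_threshold=2.0)      # unreachable threshold: never remap
policy = run(policy_threshold=0.2)     # remap after roughly 20% slowdown
print(f"no remapping: {never:.1f} work units, threshold policy: {policy:.1f}")
```

With these toy numbers the threshold policy completes roughly twice the work of never remapping, illustrating why the paper's simple policies can compete with expensive optimal ones.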

  7. Evaluation on the use of cerium in the NBL Titrimetric Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zebrowski, J.P.; Orlowicz, G.J.; Johnson, K.D.

    An alternative to potassium dichromate as titrant in the New Brunswick Laboratory Titrimetric Method for uranium analysis was sought since chromium in the waste makes disposal difficult. Substitution of a ceric-based titrant was statistically evaluated. Analysis of the data indicated statistically equivalent precisions for the two methods, but a significant overall bias of +0.035% for the ceric titrant procedure. The cause of the bias was investigated, alterations to the procedure were made, and a second statistical study was performed. This second study revealed no statistically significant bias, nor any analyst-to-analyst variation in the ceric titration procedure. A statistically significant day-to-day variation was detected, but this was physically small (0.015%) and was only detected because of the within-day precision of the method. The standard deviation of the %RD for a single measurement was found to be 0.031%. A comparison with quality control blind dichromate titration data again indicated similar overall precision. The effects of ten elements (Co, Ti, Cu, Ni, Na, Mg, Gd, Zn, Cd, and Cr) on the ceric titration's performance were determined; in previous work at NBL these impurities did not interfere with the potassium dichromate titrant. This study indicated similar results for the ceric titrant, with the exception of Ti. All the elements (excluding Ti and Cr) caused no statistically significant bias in uranium measurements at levels of 10 mg impurity per 20-40 mg uranium. The presence of Ti was found to cause a bias of -0.05%; this is attributed to the presence of sulfate ions, resulting in precipitation of titanium sulfate and occlusion of uranium. A negative bias of 0.012% was also statistically observed in the samples containing chromium impurities.
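
    The bias checks described above amount to testing whether the mean relative deviation differs from zero; a one-sample t-test on invented %RD values sketches the idea (SciPy stands in for whatever software NBL used):

```python
# Hypothetical sketch: one-sample t-test of whether the mean percent
# relative deviation (%RD) of a titrant procedure differs from zero,
# i.e. whether a statistically significant bias exists. Data invented.
from scipy import stats

percent_rd = [0.031, 0.042, 0.028, 0.039, 0.030, 0.044, 0.025, 0.041]

t_stat, p_value = stats.ttest_1samp(percent_rd, popmean=0.0)
mean_bias = sum(percent_rd) / len(percent_rd)
print(f"mean bias = {mean_bias:.3f}%, t = {t_stat:.2f}, p = {p_value:.2e}")
```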

  8. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis

    PubMed Central

    Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Background Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than that of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent. PMID:26053876

  9. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis.

    PubMed

    Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than that of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
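
    The pooling step of such a meta-analysis can be sketched with a fixed-effect, inverse-variance model (the review's actual pooling method is not specified in this record, and the study figures below are invented):

```python
# Hedged sketch of a fixed-effect meta-analysis of proportions: each
# study's proportion is weighted by the inverse of its variance.
import math

studies = [(0.74, 310), (0.71, 150), (0.78, 420)]   # (proportion, n), invented

weights = [n / (p * (1 - p)) for p, n in studies]   # 1/var, var = p(1-p)/n
pooled = sum(w * p for w, (p, _) in zip(weights, studies)) / sum(weights)
se = math.sqrt(1 / sum(weights))
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
print(f"pooled proportion = {pooled:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```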

  10. Comparison of the Effects of Daily Single-Dose Use of Flurbiprofen, Diclofenac Sodium, and Tenoxicam on Postoperative Pain, Swelling, and Trismus: A Randomized Double-Blind Study.

    PubMed

    Kaplan, Volkan; Eroğlu, Cennet Neslihan

    2016-10-01

    The aim of the present study was to compare the effects of daily single-dose use of flurbiprofen, diclofenac sodium, and tenoxicam on pain, swelling, and trismus after surgical extraction of impacted wisdom teeth under local anesthesia. The present study included 3 groups with 30 patients in each group. Those volunteering to participate in this double-blind randomized study (n = 90) were selected from a patient population with an indication for extraction of impacted wisdom teeth. Group 1 patients received 200 mg flurbiprofen, group 2 patients received 100 mg diclofenac sodium, and group 3 patients received 20 mg tenoxicam. All doses were once a day, starting preoperatively. Pain was evaluated postoperatively at 1, 2, 3, 6, 8, and 24 hours and at 2 and 7 days using a visual analog scale (VAS). For comparison with the preoperative measurements, the patients were invited to postoperative follow-up visits 2 and 7 days after extraction to evaluate swelling and trismus. The statistical analysis was performed using descriptive statistics in SAS, version 9.4 (SAS Institute, Cary, NC) software. Statistical analysis of the pain, swelling, and trismus data was performed using the Kruskal-Wallis, Dunn, and Wilcoxon-Mann-Whitney U tests. The statistical level of significance was accepted at P = .05 and a power of 0.80. Clinically, tenoxicam showed better analgesic and anti-inflammatory efficacy compared with diclofenac sodium and, in particular, flurbiprofen. Although the VAS scores in the evaluation of pain showed statistically significant differences at 2 days, no statistically significant difference was found for swelling and trismus. Our study evaluated the analgesic and anti-inflammatory effects of a daily single dose of flurbiprofen, diclofenac sodium, and tenoxicam. A daily 20 mg dose of tenoxicam can be accepted as an adequate and safe option for patients after such a surgical procedure. Copyright © 2016 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
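
    The nonparametric workflow in the record above can be sketched with SciPy (Dunn's post hoc test is not in SciPy, so pairwise Mann-Whitney U tests stand in for it; all VAS scores below are invented):

```python
# Illustrative Kruskal-Wallis test across three treatment groups,
# followed by pairwise Mann-Whitney U comparisons. Scores are invented.
from itertools import combinations
from scipy import stats

vas = {
    "flurbiprofen": [6, 7, 5, 8, 6, 7, 7, 6],
    "diclofenac":   [5, 6, 4, 6, 5, 7, 5, 6],
    "tenoxicam":    [3, 4, 2, 4, 3, 5, 3, 4],
}

h_stat, p_overall = stats.kruskal(*vas.values())
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_overall:.4f}")

for (name_a, a), (name_b, b) in combinations(vas.items(), 2):
    _, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    print(f"{name_a} vs {name_b}: p = {p:.4f}")
```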

  11. Evaluation of different models to estimate the global solar radiation on inclined surface

    NASA Astrophysics Data System (ADS)

    Demain, C.; Journée, M.; Bertrand, C.

    2012-04-01

    Global and diffuse solar radiation intensities are, in general, measured on horizontal surfaces, whereas stationary solar conversion systems (both flat-plate solar collectors and solar photovoltaics) are mounted on inclined surfaces to maximize the amount of solar radiation incident on the collector surface. Consequently, the solar radiation incident on a tilted surface has to be determined by converting solar radiation measured on a horizontal surface to the tilted surface of interest. This study evaluates the performance of 14 models transposing 10-minute, hourly and daily diffuse solar irradiation from horizontal to inclined surfaces. Solar radiation data from 8 months (April to November 2011), covering diverse atmospheric conditions and solar altitudes, measured on the roof of the radiation tower of the Royal Meteorological Institute of Belgium in Uccle (longitude 4.35°, latitude 50.79°), were used for validation purposes. The individual model performance is assessed by an inter-comparison between the calculated and measured global solar radiation on the south-oriented surface tilted at 50.79°, using statistical methods. The relative performance of the different models under different sky conditions has been studied. Comparison of the statistical errors between the different radiation models as a function of the clearness index shows that some models perform better under one type of sky condition. Combining different models for different sky conditions can therefore reduce the statistical error between measured and estimated global solar radiation. As the models described in this paper were developed for hourly data inputs, the statistical error indexes are lowest for hourly data and increase for 10-minute and daily data.
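
    The statistical comparison of transposition models typically rests on error indexes such as mean bias error (MBE) and root-mean-square error (RMSE); a sketch with invented irradiance values:

```python
# Scoring two hypothetical transposition models against measured
# tilted-plane irradiance with MBE and RMSE. All values are invented.
import math

measured = [120.0, 340.0, 510.0, 620.0, 580.0, 410.0, 190.0]  # W/m2
model_a  = [130.0, 355.0, 500.0, 640.0, 600.0, 400.0, 205.0]
model_b  = [100.0, 310.0, 470.0, 580.0, 540.0, 380.0, 170.0]

def mbe(pred, obs):
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

for name, pred in (("model A", model_a), ("model B", model_b)):
    print(f"{name}: MBE = {mbe(pred, measured):+.1f} W/m2, "
          f"RMSE = {rmse(pred, measured):.1f} W/m2")
```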

  12. A simulation-based evaluation of methods for inferring linear barriers to gene flow

    Treesearch

    Christopher Blair; Dana E. Weigel; Matthew Balazik; Annika T. H. Keeley; Faith M. Walker; Erin Landguth; Sam Cushman; Melanie Murphy; Lisette Waits; Niko Balkenhol

    2012-01-01

    Different analytical techniques used on the same data set may lead to different conclusions about the existence and strength of genetic structure. Therefore, reliable interpretation of the results from different methods depends on the efficacy and reliability of different statistical methods. In this paper, we evaluated the performance of multiple analytical methods to...

  13. Percutaneous Tracheostomy under Bronchoscopic Visualization Does Not Affect Short-Term or Long-Term Complications.

    PubMed

    Easterday, Thomas S; Moore, Joshua W; Redden, Meredith H; Feliciano, David V; Henderson, Vernon J; Humphries, Timothy; Kohler, Katherine E; Ramsay, Philip T; Spence, Stanston D; Walker, Mark; Wyrzykowski, Amy D

    2017-07-01

    Percutaneous tracheostomy is a safe and effective bedside procedure. Some advocate the use of bronchoscopy during the procedure to reduce the rate of complications. We evaluated our complication rate in trauma patients undergoing percutaneous tracheostomy with and without bronchoscopic guidance to ascertain if there was a difference in the rate of complications. A retrospective review of all tracheostomies performed in critically ill trauma patients was performed using the trauma registry from an urban, Level I Trauma Center. Bronchoscopy assistance was used based on surgeon preference. Standard statistical methodology was used to determine if there was a difference in complication rates for procedures performed with and without the bronchoscope. From January 2007 to April 2016, 649 patients underwent modified percutaneous tracheostomy; 289 with the aid of a bronchoscope and 360 without. There were no statistically significant differences in any type of complication regardless of utilization of a bronchoscope. The addition of bronchoscopy provides several theoretical benefits when performing percutaneous tracheostomy. Our findings, however, do not demonstrate a statistically significant difference in complications between procedures performed with and without a bronchoscope. Use of the bronchoscope should, therefore, be left to the discretion of the performing physician.

  14. Transient evoked oto-acoustic emission screening in newborns in Bogotá, Colombia: a retrospective study.

    PubMed

    Rojas, Jorge A; Bernal, Jaime E; García, Mary A; Zarante, Ignacio; Ramírez, Natalia; Bernal, Constanza; Gelvez, Nancy; Tamayo, Marta L

    2014-10-01

    The aim of this study was to investigate the characteristics and performance of transient evoked oto-acoustic emission (TEOAE) hearing screening in newborns in Colombia, and to analyze all possible variables and factors affecting the results. An observational, descriptive and retrospective study with bivariate analysis was performed. The study population consisted of 56,822 newborns evaluated at the private institution, PREGEN. TEOAE testing was carried out as a pediatric hearing screening test from December 2003 to March 2012. The database from PREGEN was revised, and the protocol for evaluation included the same screening test performed twice. Demographic characteristics were recorded and the newborn's background was evaluated. Basic statistics of the qualitative and quantitative variables were obtained, and statistical analysis was performed using the chi-square test. Of the 56,822 records examined, 0.28% were classed as abnormal, which corresponds to a prevalence of 1 in 350. In the screened newborns, 0.08% had a major abnormality or other clinical condition diagnosed, and 0.29% reported a family history of hearing loss. A prevalence of 6.7 in 10,000 was obtained for microtia, which is similar to the 6.4 in 10,000 previously reported in Colombia (database of the Latin-American Collaborative Study of Congenital Malformations - ECLAMC). Statistical analysis demonstrated an association between presenting with a major anomaly and a higher frequency of abnormal results on both TEOAE tests. Newborns in Colombia do not currently undergo screening for the early detection of hearing impairment. The results from this study suggest TEOAE screening tests, when performed twice, are able to detect hearing abnormalities in newborns. This highlights the need to improve the long-term evaluation and monitoring of patients in Colombia through diagnostic tests, and to provide tests that are both sensitive and specific. Furthermore, the use of TEOAE screening is justified by the favorable cost-benefit ratio demonstrated in many countries worldwide. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
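
    The chi-square analysis mentioned above tests whether two categorical factors are associated; a sketch with an invented 2x2 table (counts chosen only to echo the study's scale):

```python
# Hypothetical chi-square test of association between a major anomaly
# and an abnormal TEOAE screening result. Counts are invented.
from scipy.stats import chi2_contingency

#         abnormal  normal
table = [[12,       34],      # major anomaly present
         [148,      56628]]   # no major anomaly

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2e}")
```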

  15. Specialized data analysis for the Space Shuttle Main Engine and diagnostic evaluation of advanced propulsion system components

    NASA Technical Reports Server (NTRS)

    1993-01-01

    The Marshall Space Flight Center is responsible for the development and management of advanced launch vehicle propulsion systems, including the Space Shuttle Main Engine (SSME), which is presently operational, and the Space Transportation Main Engine (STME) under development. The SSMEs provide high performance within stringent constraints on size, weight, and reliability. Based on operational experience, continuous design improvement is in progress to enhance system durability and reliability. Specialized data analysis and interpretation is required in support of SSME and advanced propulsion system diagnostic evaluations. Comprehensive evaluation of the dynamic measurements obtained from test and flight operations is necessary to provide timely assessment of the vibrational characteristics indicating the operational status of turbomachinery and other critical engine components. Efficient performance of this effort is critical due to the significant impact of dynamic evaluation results on ground test and launch schedules, and requires direct familiarity with SSME and derivative systems, test data acquisition, and diagnostic software. Detailed analysis and evaluation of dynamic measurements obtained during SSME and advanced system ground test and flight operations was performed including analytical/statistical assessment of component dynamic behavior, and the development and implementation of analytical/statistical models to efficiently define nominal component dynamic characteristics, detect anomalous behavior, and assess machinery operational condition. In addition, the SSME and J-2 data will be applied to develop vibroacoustic environments for advanced propulsion system components, as required. This study will provide timely assessment of engine component operational status, identify probable causes of malfunction, and indicate feasible engineering solutions. This contract will be performed through accomplishment of negotiated task orders.

  16. Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications.

    PubMed

    du Prel, Jean-Baptist; Hommel, Gerhard; Röhrig, Bernd; Blettner, Maria

    2009-05-01

    An understanding of p-values and confidence intervals is necessary for the evaluation of scientific articles. This article will inform the reader of the meaning and interpretation of these two statistical concepts. The uses of these two statistical concepts and the differences between them are discussed on the basis of a selective literature search concerning the methods employed in scientific articles. P-values in scientific studies are used to determine whether a null hypothesis formulated before the performance of the study is to be accepted or rejected. In exploratory studies, p-values enable the recognition of any statistically noteworthy findings. Confidence intervals provide information about a range in which the true value lies with a certain degree of probability, as well as about the direction and strength of the demonstrated effect. This enables conclusions to be drawn about the statistical plausibility and clinical relevance of the study findings. It is often useful for both statistical measures to be reported in scientific articles, because they provide complementary types of information.
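
    The complementary roles of the two measures can be shown on invented data: a two-sample t-test yields the p-value, and a confidence interval for the mean difference conveys direction and magnitude:

```python
# Worked sketch: p-value from a two-sample t-test plus a 95% confidence
# interval for the difference in means (pooled-variance formula).
# The two samples are invented.
import math
from scipy import stats

treatment = [5.1, 4.8, 6.0, 5.5, 5.9, 4.9, 5.7, 6.2]
control   = [4.2, 4.5, 4.0, 4.8, 4.1, 4.6, 4.3, 4.4]

t_stat, p_value = stats.ttest_ind(treatment, control)

n1, n2 = len(treatment), len(control)
mean_diff = stats.tmean(treatment) - stats.tmean(control)
sp2 = ((n1 - 1) * stats.tvar(treatment)
       + (n2 - 1) * stats.tvar(control)) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)

print(f"p = {p_value:.4f}, 95% CI for mean difference: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Here a small p-value says the difference is unlikely under the null hypothesis, while the interval shows how large the difference plausibly is, which is the distinction the article draws.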

  17. The effect of berberine on insulin resistance in women with polycystic ovary syndrome: detailed statistical analysis plan (SAP) for a multicenter randomized controlled trial.

    PubMed

    Zhang, Ying; Sun, Jin; Zhang, Yun-Jiao; Chai, Qian-Yun; Zhang, Kang; Ma, Hong-Li; Wu, Xiao-Ke; Liu, Jian-Ping

    2016-10-21

    Although Traditional Chinese Medicine (TCM) has been widely used in clinical settings, a major challenge that remains in TCM is to evaluate its efficacy scientifically. This randomized controlled trial aims to evaluate the efficacy and safety of berberine in the treatment of patients with polycystic ovary syndrome. In order to improve the transparency and research quality of this clinical trial, we prepared this statistical analysis plan (SAP). The trial design, primary and secondary outcomes, and safety outcomes were declared to reduce selection biases in data analysis and result reporting. We specified detailed methods for data management and statistical analyses. Statistics in the corresponding tables, listings, and graphs were outlined. The SAP provides more detailed information than the trial protocol on data management and statistical analysis methods. Any post hoc analyses can be identified by referring to this SAP, and possible selection bias and performance bias will be reduced in the trial. This study is registered at ClinicalTrials.gov, NCT01138930, registered on 7 June 2010.

  18. Evaluation of Skylab IB sensitivity to on-pad winds with turbulence

    NASA Technical Reports Server (NTRS)

    Coffin, T.

    1972-01-01

    Computer simulation was performed to estimate displacements and bending moments experienced by the SKYLAB 1B vehicle on the launch pad due to atmospheric winds. The vehicle was assumed to be a beam-like structure represented by a finite number of generalized coordinates. Wind flow across the vehicle was treated as a nonhomogeneous, stationary random process. Response computations were performed by the assumption of simple strip theory and application of generalized harmonic analysis. Displacement and bending moment statistics were obtained for six vehicle propellant loading conditions and four representative reference wind profile and turbulence levels. Means, variances and probability distributions are presented graphically for each case. A separate analysis was performed to indicate the influence of wind gradient variations on vehicle response statistics.

  19. Exploring the Relationship between the Ventures for Excellence Teacher StyleProfile Data and Teacher Performance

    ERIC Educational Resources Information Center

    Nelson, Barry

    2013-01-01

    The purpose of this study was to determine if a commercial teacher selection tool, the Ventures for Excellence Teacher StyleProfile, had a statistically significant relationship with teacher evaluation and performance feedback data gathered during a teacher's first year of teaching in the Midwest School District. A review of the literature…

  20. Recent evaluations of crack-opening-area in circumferentially cracked pipes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rahman, S.; Brust, F.; Ghadiali, N.

    1997-04-01

    Leak-before-break (LBB) analyses for circumferentially cracked pipes are currently being conducted in the nuclear industry to justify elimination of pipe whip restraints and jet shields which are present because of the expected dynamic effects from pipe rupture. The application of the LBB methodology frequently requires calculation of leak rates. The leak rates depend on the crack-opening area of the through-wall crack in the pipe. In addition to LBB analyses which assume a hypothetical flaw size, there is also interest in the integrity of actual leaking cracks corresponding to current leakage detection requirements in NRC Regulatory Guide 1.45, or for assessing temporary repair of Class 2 and 3 pipes that have leaks as are being evaluated in ASME Section XI. The objectives of this study were to review, evaluate, and refine current predictive models for performing crack-opening-area analyses of circumferentially cracked pipes. The results from twenty-five full-scale pipe fracture experiments, conducted in the Degraded Piping Program, the International Piping Integrity Research Group Program, and the Short Cracks in Piping and Piping Welds Program, were used to verify the analytical models. Standard statistical analyses were performed to assess quantitatively the accuracy of the predictive models. The evaluation also involved finite element analyses for determining the crack-opening profile often needed to perform leak-rate calculations.

  1. Dynamic online peer evaluations to improve group assignments in nursing e-learning environment.

    PubMed

    Adwan, Jehad

    2016-06-01

    The purpose of this research was to evaluate the use of online peer evaluation forms for online group activities in improving group project outcomes. The investigator developed and used a web-based Google Forms® self and peer evaluation form covering the rubric of two group assignments for junior and senior nursing students. The form covered elements of the assignments including: research activity, analysis of the literature, writing of the report, participation in making the presentation, overall contribution to the project, and participation in the weekly group discussions. Items were rated from 1 (did not contribute) to 5 (outstanding contribution), in addition to NA when an activity did not apply. The self and peer evaluation process was conducted twice: once after group assignment 1 and once after group assignment 2. The final products of the group assignments took the form of VoiceThread online presentations that were shared with the rest of the class, reflecting the groups' work on a health informatics topic of interest. Data were collected as the students completed self and peer evaluations for group assignments 1 and 2. Optional comments regarding member performance were also collected to add contextual information to the ratings. Students received credit for completing the peer evaluations, and the grade for the particular assignment was affected by their performance based on peer evaluations of their contributions. Students' peer evaluations were displayed in a color-coded spreadsheet, which enabled the course faculty to view real-time results of students' ratings after each assignment. The faculty provided timely and tailored feedback to groups or individuals as needed, using positive feedback and commending high performance while urging struggling individual students and groups to improve lower ratings in specific areas. Comparing evaluations of both assignments, there were statistically significant improvements among all students. 
The mean scores of the entire sample were skewed toward the higher end of the scale, suggesting an overall high-performing group. However, analysis of the lower-performing individuals showed consistent and statistically significant improvements in all areas of the evaluation criteria. Anonymous peer evaluation activities and timely faculty feedback in an e-Learning environment can be a useful tool for faculty to improve group performance over time by engaging the learners within their groups. Peer evaluations provided a real-time view of mid-semester formative group evaluations, which allowed the faculty to provide timely and tailored feedback on student performance, allowing for better outcomes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Performance analysis of different tuning rules for an isothermal CSTR using integrated EPC and SPC

    NASA Astrophysics Data System (ADS)

    Roslan, A. H.; Karim, S. F. Abd; Hamzah, N.

    2018-03-01

    This paper demonstrates the integration of Engineering Process Control (EPC) and Statistical Process Control (SPC) for the control of product concentration of an isothermal CSTR. The objectives of this study are to evaluate the performance of the Ziegler-Nichols (Z-N), Direct Synthesis (DS) and Internal Model Control (IMC) tuning methods and determine the most effective method for this process. The simulation model was obtained from past literature and reconstructed in SIMULINK (MATLAB) to evaluate the process response. Additionally, the process stability, capability and normality were analyzed using Process Capability Sixpack reports in Minitab. Based on the results, DS displays the best response, having the smallest rise time, settling time, overshoot, undershoot, Integral Time Absolute Error (ITAE) and Integral Square Error (ISE). Statistical analysis likewise identifies DS as the best tuning method, as it exhibits the highest process stability and capability.
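    The ITAE and ISE criteria used to rank the tuning methods are straightforward to compute from a simulated step response. A minimal sketch (the signal below is an illustrative underdamped response, not the paper's CSTR model):

```python
import numpy as np

def step_response_metrics(t, y, setpoint=1.0):
    """Error metrics commonly used to compare controller tunings.
    t: uniformly spaced sample times (s); y: process output."""
    dt = t[1] - t[0]
    e = setpoint - y                          # control error at each sample
    ise = float(np.sum(e**2) * dt)            # Integral of Squared Error
    itae = float(np.sum(t * np.abs(e)) * dt)  # Integral of Time-weighted Absolute Error
    overshoot = max(0.0, (float(y.max()) - setpoint) / setpoint * 100)
    return ise, itae, overshoot

# Illustrative underdamped step response (assumed signal, not the CSTR model)
t = np.linspace(0, 10, 1001)
y = 1 - np.exp(-t) * np.cos(2 * t)
ise, itae, os_pct = step_response_metrics(t, y)
```

    Smaller ITAE and ISE indicate a tuning that drives the error to zero faster; ITAE's time weighting penalizes persistent late error more heavily than ISE does.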

  3. Enabling High-Energy, High-Voltage Lithium-Ion Cells: Standardization of Coin-Cell Assembly, Electrochemical Testing, and Evaluation of Full Cells

    DOE PAGES

    Long, Brandon R.; Rinaldo, Steven G.; Gallagher, Kevin G.; ...

    2016-11-09

    Coin-cells are often the test format of choice for laboratories engaged in battery research and development, as they provide a convenient platform for rapid testing of new materials on a small scale. However, obtaining reliable, reproducible data via the coin-cell format is inherently difficult, particularly in the full-cell configuration. In addition, statistical evaluation to prove the consistency and reliability of such data is often neglected. Herein we report on several studies aimed at formalizing physical process parameters and coin-cell construction related to full cells. Statistical analysis and performance benchmarking approaches are advocated as a means to more confidently track changes in cell performance. Finally, we show that trends in the electrochemical data obtained from coin-cells can be reliable and informative when standardized approaches are implemented in a consistent manner.

  4. Effect of Advanced Trauma Life Support program on medical interns' performance in simulated trauma patient management.

    PubMed

    Ahmadi, Koorosh; Sedaghat, Mohammad; Safdarian, Mahdi; Hashemian, Amir-Masoud; Nezamdoust, Zahra; Vaseie, Mohammad; Rahimi-Movaghar, Vafa

    2013-01-01

    Since appropriate and timely methods in trauma care have an important impact on patients' outcome, we evaluated the effect of the Advanced Trauma Life Support (ATLS) program on medical interns' performance in simulated trauma patient management. A descriptive and analytical before-and-after study was conducted on 24 randomly selected undergraduate medical interns from Imam Reza Hospital in Mashhad, Iran. On the first day, we assessed interns' clinical knowledge and their practical skill performance in confronting simulated trauma patients. After 2 days of ATLS training, we repeated the assessment and evaluated their scores again on the fourth day. The pre- and post-ATLS findings were compared using SPSS version 15.0 software; P values less than 0.05 were considered statistically significant. Our findings showed that interns' ability in all three tasks improved after the training course. On the fourth day after training, there was a statistically significant increase in interns' clinical knowledge of ATLS procedures, the sequence of procedures and skill performance in trauma situations (P < 0.001, P = 0.016 and P = 0.01, respectively). The ATLS course has an important role in increasing clinical knowledge and practical skill performance of trauma care in medical interns.

  5. Effect of Internet-Based Cognitive Apprenticeship Model (i-CAM) on Statistics Learning among Postgraduate Students.

    PubMed

    Saadati, Farzaneh; Ahmad Tarmizi, Rohani; Mohd Ayub, Ahmad Fauzi; Abu Bakar, Kamariah

    2015-01-01

    Because students' ability to use statistics, which is mathematical in nature, is one of the concerns of educators, embedding the pedagogical characteristics of learning within an e-learning system is 'value added' because it facilitates the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, i-CAM significantly promoted students' problem-solving performance at the end of each phase. In addition, the differences in students' test scores were statistically significant after controlling for the pre-test scores. The findings conveyed in this paper confirm the considerable value of i-CAM in the improvement of statistics learning for non-specialized postgraduate students.

  6. Evaluation of cloud detection instruments and performance of laminar-flow leading-edge test articles during NASA Leading-Edge Flight-Test Program

    NASA Technical Reports Server (NTRS)

    Davis, Richard E.; Maddalon, Dal V.; Wagner, Richard D.; Fisher, David F.; Young, Ronald

    1989-01-01

    Summary evaluations of the performance of laminar-flow control (LFC) leading edge test articles on a NASA JetStar aircraft are presented. Statistics, presented for the test articles' performance in haze and cloud situations, as well as in clear air, show a significant effect of cloud particle concentrations on the extent of laminar flow. The cloud particle environment was monitored by two instruments, a cloud particle spectrometer (Knollenberg probe) and a charging patch. Both instruments are evaluated as diagnostic aids for avoiding laminar-flow detrimental particle concentrations in future LFC aircraft operations. The data base covers 19 flights in the simulated airline service phase of the NASA Leading-Edge Flight-Test (LEFT) Program.

  7. Engineering evaluation of SSME dynamic data from engine tests and SSV flights

    NASA Technical Reports Server (NTRS)

    1986-01-01

    An engineering evaluation of dynamic data from SSME hot firing tests and SSV flights is summarized. The basic objective of the study is to provide analyses of vibration, strain and dynamic pressure measurements in support of MSFC performance and reliability improvement programs. A brief description of the SSME test program is given and a typical test evaluation cycle reviewed. Data banks generated to characterize SSME component dynamic characteristics are described and statistical analyses performed on these data base measurements are discussed. Analytical models applied to define the dynamic behavior of SSME components (such as turbopump bearing elements and the flight accelerometer safety cut-off system) are also summarized. Appendices are included to illustrate some typical tasks performed under this study.

  8. The statistical evaluation and comparison of ADMS-Urban model for the prediction of nitrogen dioxide with air quality monitoring network.

    PubMed

    Dėdelė, Audrius; Miškinytė, Auksė

    2015-09-01

    In many countries, road traffic is one of the main sources of air pollution associated with adverse effects on human health and the environment. Nitrogen dioxide (NO2) is considered a measure of traffic-related air pollution, with concentrations tending to be higher near highways, along busy roads, and in city centers, and exceedances are mainly observed at measurement stations located close to traffic. Air quality models are used to assess the air quality in a city and the impact of air pollution on public health. However, before a model can be used for these purposes, it is important to evaluate the accuracy of dispersion modelling, one of the most widely used methods. Monitoring and dispersion modelling are the two components of an air quality monitoring system (AQMS) that were statistically compared in this research. The Atmospheric Dispersion Modelling System (ADMS-Urban) was evaluated by comparing monthly modelled NO2 concentrations with the data of continuous air quality monitoring stations in Kaunas city. The statistical measures of model performance were calculated for annual and monthly concentrations of NO2 for each monitoring station site. The spatial analysis was made using geographic information systems (GIS). The calculated statistical parameters indicated a good ADMS-Urban model performance for the prediction of NO2. The results of this study showed that the agreement of modelled values with observations was better for traffic monitoring stations than for the background and residential stations.
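    Model-to-monitor agreement of this kind is typically summarized with a handful of standard statistics. The abstract does not list its exact metric set, so the fractional bias (FB), normalized mean square error (NMSE), factor-of-two fraction (FAC2) and Pearson correlation below are an assumed but conventional choice, computed on hypothetical monthly NO2 means:

```python
import numpy as np

def model_performance(obs, mod):
    """Standard air-quality model evaluation statistics (assumed metric set;
    the study may have used a different selection)."""
    obs, mod = np.asarray(obs, float), np.asarray(mod, float)
    fb = 2 * (obs.mean() - mod.mean()) / (obs.mean() + mod.mean())  # fractional bias
    nmse = np.mean((obs - mod) ** 2) / (obs.mean() * mod.mean())    # normalized MSE
    fac2 = np.mean((mod >= 0.5 * obs) & (mod <= 2 * obs))           # fraction within factor 2
    r = np.corrcoef(obs, mod)[0, 1]                                 # Pearson correlation
    return {"FB": fb, "NMSE": nmse, "FAC2": fac2, "r": r}

# Hypothetical monthly NO2 means (µg/m³) at one station: observed vs modelled
obs = [28, 31, 35, 30, 26, 22, 21, 24, 27, 33, 36, 32]
mod = [25, 29, 38, 28, 27, 20, 19, 26, 25, 30, 39, 34]
stats = model_performance(obs, mod)
```

    A well-performing model has FB and NMSE near zero, FAC2 near 1, and high correlation; computing these per station is what allows traffic, background and residential sites to be ranked against each other.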

  9. Statistical error model for a solar electric propulsion thrust subsystem

    NASA Technical Reports Server (NTRS)

    Bantell, M. H.

    1973-01-01

    The solar electric propulsion thrust subsystem statistical error model was developed as a tool for investigating the effects of thrust subsystem parameter uncertainties on navigation accuracy. The model is currently being used to evaluate the impact of electric engine parameter uncertainties on navigation system performance for a baseline mission to Encke's Comet in the 1980s. The data given represent the next generation in statistical error modeling for low-thrust applications. Principal improvements include the representation of thrust uncertainties and random process modeling in terms of random parametric variations in the thrust vector process for a multi-engine configuration.

  10. Statistical EMC: A new dimension in electromagnetic compatibility of digital electronic systems

    NASA Astrophysics Data System (ADS)

    Tsaliovich, Anatoly

    Electromagnetic compatibility compliance test results are used as a database for addressing three classes of electromagnetic-compatibility (EMC) related problems: statistical EMC profiles of digital electronic systems, the effect of equipment-under-test (EUT) parameters on the electromagnetic emission characteristics, and EMC measurement specifics. Open area test site (OATS) and absorber line shielded room (AR) results are compared for equipment-under-test highest radiated emissions. The suggested statistical evaluation methodology can be utilized to correlate the results of different EMC test techniques, characterize the EMC performance of electronic systems and components, and develop recommendations for electronic product optimal EMC design.

  11. Lack of grading agreement among international hemostasis external quality assessment programs

    PubMed Central

    Olson, John D.; Jennings, Ian; Meijer, Piet; Bon, Chantal; Bonar, Roslyn; Favaloro, Emmanuel J.; Higgins, Russell A.; Keeney, Michael; Mammen, Joy; Marlar, Richard A.; Meley, Roland; Nair, Sukesh C.; Nichols, William L.; Raby, Anne; Reverter, Joan C.; Srivastava, Alok; Walker, Isobel

    2018-01-01

    Laboratory quality programs rely on internal quality control and external quality assessment (EQA). EQA programs provide unknown specimens for the laboratory to test. The laboratory's result is compared with other (peer) laboratories performing the same test. EQA programs assign target values using a variety of statistical tools, and a performance assessment of 'pass' or 'fail' is made. EQA provider members of the international organization External Quality Assurance in Thrombosis and Hemostasis took part in a study to compare the outcome of performance analysis using the same data set of laboratory results. Eleven EQA organizations using eight different analytical approaches participated. Data for a normal and prolonged activated partial thromboplastin time (aPTT) and a normal and reduced factor VIII (FVIII) from 218 laboratories were sent to the EQA providers, who analyzed the data set using their method of evaluation for aPTT and FVIII, determining the performance for each laboratory record in the data set. Providers also summarized their statistical approach to assignment of target values and laboratory performance. Each laboratory record in the data set was graded pass/fail by all EQA providers for each of the four analytes. There was a lack of agreement of pass/fail grading among EQA programs. Grading was discordant for 17.9% and 11% of normal and prolonged aPTT results, respectively, and for 20.2% and 17.4% of normal and reduced FVIII results, respectively. All EQA programs in this study employed statistical methods compliant with the International Organization for Standardization (ISO) standard ISO 13528, yet the evaluation of laboratory results for all four analytes showed remarkable grading discordance. PMID:29232255
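    One common EQA grading approach (an illustrative assumption here; each provider in the study applied its own method) is to score each laboratory with a z-score against the peer-group consensus and fail results beyond a fixed limit:

```python
import statistics

def grade(results, limit=2.0):
    """Illustrative EQA grading sketch: z-score each laboratory result
    against the peer-group mean and SD; pass if |z| <= limit.
    Real providers differ in target assignment (consensus vs. reference
    value), dispersion estimate, and pass limit."""
    mean = statistics.fmean(results)
    sd = statistics.stdev(results)
    return [abs((x - mean) / sd) <= limit for x in results]

# Hypothetical aPTT results (s) from a small peer group; the outlier fails
aptt = [29.5, 30.1, 31.0, 30.4, 29.8, 30.6, 45.0]
flags = grade(aptt)
```

    Discordance of the kind reported arises because changing any ingredient of this calculation (target, dispersion estimate, or limit) can flip a borderline result from pass to fail.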

  12. Evaluation of Cranio-cervical Posture in Children with Bruxism Before and After Bite Plate Therapy: A Pilot Project.

    PubMed

    Bortoletto, Carolina Carvalho; Cordeiro da Silva, Fernanda; Silva, Paula Fernanda da Costa; Leal de Godoy, Camila Haddad; Albertini, Regiane; Motta, Lara J; Mesquita-Ferrari, Raquel Agnelli; Fernandes, Kristianne Porta Santos; Romano, Renata; Bussadori, Sandra Kalil

    2014-07-01

    [Purpose] The aim of the present study was to evaluate the effect of a biteplate on the cranio-cervical posture of children with bruxism. [Subjects and Methods] Twelve male and female children aged six to 10 years with a diagnosis of bruxism participated in this study. The children used a biteplate during sleep for 30 days and underwent three postural evaluations: initial, immediately following placement of the biteplate, and at the end of treatment. Posture analysis was performed with the aid of the Alcimagem(®) 2.1 program. Data analysis (IBM SPSS Statistics 2.0) involved descriptive statistics and Student's t-test. [Results] A statistically significant difference was found between the initial cranio-cervical angle and the angle immediately following placement of the biteplate. However, no statistically significant difference was found between the initial angle and the angle after one month of biteplate usage. [Conclusion] No significant change in the cranio-cervical posture of the children was found after one month of biteplate usage. However, a reduction occurred in the cranio-cervical angle when the biteplate was in position.

  13. Do hospitalist physicians improve the quality of inpatient care delivery? A systematic review of process, efficiency and outcome measures

    PubMed Central

    2011-01-01

    Background Despite more than a decade of research on hospitalists and their performance, disagreement still exists regarding whether and how hospital-based physicians improve the quality of inpatient care delivery. This systematic review summarizes the findings from 65 comparative evaluations to determine whether hospitalists provide a higher quality of inpatient care relative to traditional inpatient physicians who maintain hospital privileges with concurrent outpatient practices. Methods Articles on hospitalist performance published between January 1996 and December 2010 were identified through MEDLINE, Embase, Science Citation Index, CINAHL, NHS Economic Evaluation Database and a hand-search of reference lists, key journals and editorials. Comparative evaluations presenting original, quantitative data on processes, efficiency or clinical outcome measures of care between hospitalists, community-based physicians and traditional academic attending physicians were included (n = 65). After proposing a conceptual framework for evaluating inpatient physician performance, major findings on quality are summarized according to their percentage change, direction and statistical significance. Results The majority of reviewed articles demonstrated that hospitalists are efficient providers of inpatient care on the basis of reductions in their patients' average length of stay (69%) and total hospital costs (70%); however, the clinical quality of hospitalist care appears to be comparable to that provided by their colleagues. The methodological quality of hospitalist evaluations remains a concern and has not improved over time. Persistent issues include insufficient reporting of source or sample populations (n = 30), patients lost to follow-up (n = 42) and estimates of effect or random variability (n = 35); inappropriate use of statistical tests (n = 55); and failure to adjust for established confounders (n = 37). 
Conclusions Future research should include an expanded focus on the specific structures of care that differentiate hospitalists from other inpatient physician groups as well as the development of better conceptual and statistical models that identify and measure underlying mechanisms driving provider-outcome associations in quality. PMID:21592322

  14. A statistical approach to evaluate the performance of cardiac biomarkers in predicting death due to acute myocardial infarction: time-dependent ROC curve

    PubMed

    Karaismailoğlu, Eda; Dikmen, Zeliha Günnur; Akbıyık, Filiz; Karaağaoğlu, Ahmet Ergun

    2018-04-30

    Background/aim: Myoglobin, cardiac troponin T, B-type natriuretic peptide (BNP), and creatine kinase isoenzyme MB (CK-MB) are frequently used biomarkers for evaluating the risk of patients admitted to an emergency department with chest pain. Recently, time-dependent receiver operating characteristic (ROC) analysis has been used to evaluate the predictive power of biomarkers where disease status can change over time. We aimed to determine the best set of biomarkers for estimating cardiac death during follow-up. We also obtained optimal cut-off values of these biomarkers, which differentiate between patients with and without risk of death. A web tool was developed to estimate time intervals in risk. Materials and methods: A total of 410 patients admitted to the emergency department with chest pain and shortness of breath were included. Cox regression analysis was used to determine an optimal set of biomarkers for estimating cardiac death and to combine the significant biomarkers. Time-dependent ROC analysis was performed to evaluate the performances of the significant biomarkers and a combined biomarker during 240 h. The bootstrap method was used to compare statistical significance, and the Youden index was used to determine optimal cut-off values. Results: Myoglobin and BNP were significant by multivariate Cox regression analysis. Areas under the time-dependent ROC curves of myoglobin and BNP were about 0.80 during 240 h, and that of the combined biomarker (myoglobin + BNP) increased to 0.90 during the first 180 h. Conclusion: Although myoglobin is not clinically specific to a cardiac event, in our study both myoglobin and BNP were found to be statistically significant for estimating cardiac death. Using this combined biomarker may increase the power of prediction. Our web tool can be useful for evaluating the risk status of new patients and helping clinicians in making decisions.
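    The Youden-index step named in the methods can be sketched in isolation: for a single biomarker, the cutoff is chosen to maximize J = sensitivity + specificity - 1. The values below are hypothetical, and the sketch ignores the time-dependent and bootstrap aspects of the paper's analysis:

```python
def youden_cutoff(values, labels):
    """Return the cutoff maximizing Youden's J = sensitivity + specificity - 1.
    values: biomarker levels; labels: 1 = event (e.g. cardiac death), 0 = no event.
    Static illustration only; the study's ROC analysis is time-dependent."""
    best_j, best_cut = -1.0, None
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cut)
        fn = sum(1 for v, y in zip(values, labels) if y == 1 and v < cut)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cut)
        fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= cut)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j

# Hypothetical BNP values (pg/mL) with outcome labels
bnp    = [40, 85, 120, 300, 450, 800, 950, 60, 150, 700]
status = [0,  0,  0,   1,   1,   1,   1,   0,  0,   1]
cut, j = youden_cutoff(bnp, status)
```

    In the time-dependent setting the same maximization is repeated at each follow-up time, since the event status of a patient changes as deaths accrue.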

  15. Randomized clinical trial of encapsulated and hand-mixed glass-ionomer ART restorations: one-year follow-up

    PubMed Central

    Freitas, Maria Cristina Carvalho de Almendra; Fagundes, Ticiane Cestari; Modena, Karin Cristina da Silva; Cardia, Guilherme Saintive; Navarro, Maria Fidela de Lima

    2018-01-01

    Abstract Objective This prospective, randomized, split-mouth clinical trial evaluated the clinical performance of a conventional glass ionomer cement (GIC; Riva Self-Cure, SDI), supplied in capsules or in powder/liquid kits and placed in Class I cavities in permanent molars by the Atraumatic Restorative Treatment (ART) approach. Material and Methods A total of 80 restorations were randomly placed in 40 patients aged 11-15 years. Each patient received one restoration with each type of GIC. The restorations were evaluated after periods of 15 days (baseline), 6 months, and 1 year, according to ART criteria. Wilcoxon matched pairs, multivariate logistic regression, and Gehan-Wilcoxon tests were used for statistical analysis. Results Patients were evaluated after 15 days (n=40), 6 months (n=34), and 1 year (n=29). Encapsulated GICs showed significantly superior clinical performance compared with hand-mixed GICs at baseline (p=0.017), 6 months (p=0.001), and 1 year (p=0.026). For hand-mixed GIC, a statistically significant difference was only observed from baseline to 1 year (p=0.001). Encapsulated GIC presented statistically significant differences for the following periods: 6 months to 1 year (p=0.028) and baseline to 1 year (p=0.002). Encapsulated GIC presented a superior cumulative survival rate compared with hand-mixed GIC over one year. Importantly, both GICs exhibited decreased survival over time. Conclusions Encapsulated GIC promoted better ART performance, with an annual failure rate of 24%; in contrast, hand-mixed GIC demonstrated a failure rate of 42%. PMID:29364343

  16. Macular thickness and volume after uncomplicated phacoemulsification surgery evaluated by optical coherence tomography. A one-year follow-up.

    PubMed

    Kecik, Dariusz; Makowiec-Tabernacka, Marta; Golebiewska, Joanna; Moneta-Wielgos, Joanna; Kasprzak, Jan

    2009-01-01

    To evaluate changes in the macular thickness and volume using optical coherence tomography in patients after phacoemulsification and intracapsular implantation of a foldable intraocular lens. The study included 82 patients (37 males and 45 females) after phacoemulsification and intracapsular implantation of the same type of a foldable intraocular lens, without any other eye disease. Phacoemulsification was performed with an INFINITI machine. In all patients, macular thickness and volume were measured with an optical coherence tomograph (Stratus OCT) using the Fast Macular Thickness Map. The OCT evaluation was performed on days 1, 7, 30 and 90 postoperatively. In 58 patients (71%), it was additionally performed at 12 months after surgery and in 52 patients (63%) the macular parameters in the healthy and operated eyes were compared. A statistically significant increase in the minimal retinal thickness was observed on days 30 (p<0.0005) and 90 (p<0.005) postoperatively compared to post-operative day 1. A statistically significant increase in the foveal volume was seen on days 30 (p<0.00005) and 90 (p<0.0005). A statistically significant increase in the volume of the entire macula was found on days 7, 30 and 90 (p<0.00005). Uncomplicated cataract phacoemulsification is followed by increases in the central retinal thickness, foveal volume and volume of the entire macula on days 30 and 90 and at 12 months postoperatively. Further observation of patients is required to confirm whether the macular parameters will return to their values on day 1 postoperatively and if so, when this will occur.

  17. Process evaluation to explore internal and external validity of the "Act in Case of Depression" care program in nursing homes.

    PubMed

    Leontjevas, Ruslan; Gerritsen, Debby L; Koopmans, Raymond T C M; Smalbrugge, Martin; Vernooij-Dassen, Myrra J F J

    2012-06-01

    A multidisciplinary, evidence-based care program to improve the management of depression in nursing home residents, "Act in case of Depression" (AiD), was implemented and tested using a stepped-wedge design in 23 nursing homes (NHs). The aim was to evaluate, before the effect analyses, AiD process data on sampling quality (recruitment and randomization, reach) and intervention quality (relevance and feasibility, extent to which AiD was performed), which can be used for understanding internal and external validity. In this article, a model is presented that divides process evaluation data into first- and second-order process data. Qualitative and quantitative data based on personal files of residents, interviews of nursing home professionals, and a research database were analyzed according to the following process evaluation components: sampling quality and intervention quality. The setting was the nursing home. The pattern of residents' informed consent rates differed for dementia special care units and somatic units during the study. The nursing home staff was satisfied with the AiD program and reported that the program was feasible and relevant. With the exception of the first screening step (nursing staff members using a short observer-based depression scale), AiD components were not performed fully by NH staff as prescribed in the AiD protocol. Although NH staff found the program relevant and feasible and was satisfied with the program content, individual AiD components may have different feasibility. The results on sampling quality implied that statistical analyses of AiD effectiveness should account for the type of unit, whereas the findings on intervention quality implied that, next to the type of unit, analyses should account for the extent to which individual AiD program components were performed. In general, our first-order process data evaluation confirmed the internal and external validity of the AiD trial, and this evaluation enabled further statistical fine-tuning. 
The importance of evaluating the first-order process data before executing statistical effect analyses is thus underlined. Copyright © 2012 American Medical Directors Association, Inc. Published by Elsevier Inc. All rights reserved.

  18. Sequential CFAR detectors using a dead-zone limiter

    NASA Astrophysics Data System (ADS)

    Tantaratana, Sawasd

    1990-09-01

    The performances of some proposed sequential constant-false-alarm-rate (CFAR) detectors are evaluated. The observations are passed through a dead-zone limiter, whose output is -1, 0, or +1, depending on whether the input is less than -c, between -c and c, or greater than c, where c is a constant. The test is thus performed on a reduced set of data (those with absolute value larger than c), with the test statistic being the sum of the signs of the reduced set of data. Both constant and linear boundaries are considered. Numerical results show a significant reduction in the average number of observations needed to achieve the same false alarm and detection probabilities as a fixed-sample-size CFAR detector using the same kind of test statistic.
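    The detector described above can be sketched directly from the abstract: a dead-zone limiter followed by a running sign sum compared against boundaries. The constant boundaries below are illustrative choices, not values from the paper:

```python
def dead_zone_sign_test(samples, c, upper, lower):
    """Sequential sign test with a dead-zone limiter.
    Each observation x maps to +1 (x > c), -1 (x < -c), or 0 (dead zone);
    the running sum is compared against constant boundaries.
    `upper`/`lower` are illustrative decision boundaries, not the paper's.
    Returns ('H1'|'H0', samples used) or ('undecided', n)."""
    s = 0
    for n, x in enumerate(samples, start=1):
        if x > c:
            s += 1
        elif x < -c:
            s -= 1
        # |x| <= c contributes nothing (dead zone)
        if s >= upper:
            return "H1", n      # decide signal present
        if s <= lower:
            return "H0", n      # decide noise only
    return "undecided", len(samples)

# Toy data: a positive bias drives the sum to the upper boundary early
data = [0.9, 0.2, 1.4, 1.1, -0.1, 1.3, 0.8, 1.2]
decision, n_used = dead_zone_sign_test(data, c=0.5, upper=4, lower=-4)
```

    Stopping as soon as a boundary is crossed is what yields the reduced average sample number relative to a fixed-sample-size detector with the same error probabilities.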

  19. Improving Service Delivery in a County Health Department WIC Clinic: An Application of Statistical Process Control Techniques

    PubMed Central

    Boe, Debra Thingstad; Parsons, Helen

    2009-01-01

    Local public health agencies are challenged to continually improve service delivery, yet they frequently operate with constrained resources. Quality improvement methods and techniques such as statistical process control are commonly used in other industries, and they have recently been proposed as a means of improving service delivery and performance in public health settings. We analyzed a quality improvement project undertaken at a local Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) clinic to reduce waiting times and improve client satisfaction with a walk-in nutrition education service. We used statistical process control techniques to evaluate initial process performance, implement an intervention, and assess process improvements. We found that implementation of these techniques significantly reduced waiting time and improved clients' satisfaction with the WIC service. PMID:19608964
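    Statistical process control on a waiting-time series like the one studied can be sketched with an individuals chart. This is a minimal illustration with hypothetical waiting times in minutes, not the clinic's actual analysis:

    ```python
    def individuals_chart_limits(data):
        """Individuals (X) chart: estimate sigma from the average moving
        range using the standard d2 constant for subgroups of 2 (1.128),
        and set 3-sigma control limits around the mean."""
        center = sum(data) / len(data)
        moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
        mr_bar = sum(moving_ranges) / len(moving_ranges)
        sigma_hat = mr_bar / 1.128
        return center - 3 * sigma_hat, center, center + 3 * sigma_hat

    def out_of_control(data):
        """Indices of points falling outside the control limits."""
        lcl, _, ucl = individuals_chart_limits(data)
        return [i for i, x in enumerate(data) if x < lcl or x > ucl]
    ```

    In practice the limits would be computed from the baseline (pre-intervention) period and the post-intervention points judged against them.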

  20. Consistency of performance of robot-assisted surgical tasks in virtual reality.

    PubMed

    Suh, I H; Siu, K-C; Mukherjee, M; Monk, E; Oleynikov, D; Stergiou, N

    2009-01-01

    The purpose of this study was to investigate consistency of performance of robot-assisted surgical tasks in a virtual reality environment. Eight subjects performed two surgical tasks, bimanual carrying and needle passing, with both the da Vinci surgical robot and a virtual reality equivalent environment. Nonlinear analysis was utilized to evaluate consistency of performance by calculating the regularity and the amount of divergence in the movement trajectories of the surgical instrument tips. Our results revealed that movement patterns for both training tasks were statistically similar between the two environments. Consistency of performance as measured by nonlinear analysis could be an appropriate methodology to evaluate the complexity of the training tasks between actual and virtual environments and assist in developing better surgical training programs.

  1. A benchmark for statistical microarray data analysis that preserves actual biological and technical variance.

    PubMed

    De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric

    2010-01-11

    Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.

  2. Bio-based renewable additives for anti-icing applications (phase one).

    DOT National Transportation Integrated Search

    2016-09-04

    The performance and impacts of several bio-based anti-icers along with a traditional chloride-based anti-icer (salt brine) were evaluated. : A statistical design of experiments (uniform design) was employed for developing anti-icing liquids consistin...

  3. Methods for estimating aboveground biomass and its components for Douglas-fir and lodgepole pine trees

    Treesearch

    K.P. Poudel; H. Temesgen

    2016-01-01

    Estimating aboveground biomass and its components requires sound statistical formulation and evaluation. Using data collected from 55 destructively sampled trees in different parts of Oregon, we evaluated the performance of three groups of methods to estimate total aboveground biomass and (or) its components based on the bias and root mean squared error (RMSE) that...
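    Bias and root mean squared error, the two evaluation criteria named above, can be computed as below. This is a generic sketch with hypothetical observed/predicted vectors, not the authors' code; note that the sign convention for bias (observed minus predicted, or the reverse) varies by author:

    ```python
    import math

    def bias(observed, predicted):
        """Mean of (observed - predicted) over all trees."""
        return sum(o - p for o, p in zip(observed, predicted)) / len(observed)

    def rmse(observed, predicted):
        """Root mean squared error of the predictions."""
        return math.sqrt(
            sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
        )
    ```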

  4. An Assessment of Statistical Process Control-Based Approaches for Charting Student Evaluation Scores

    ERIC Educational Resources Information Center

    Ding, Xin; Wardell, Don; Verma, Rohit

    2006-01-01

    We compare three control charts for monitoring data from student evaluations of teaching (SET) with the goal of improving student satisfaction with teaching performance. The two charts that we propose are a modified "p" chart and a z-score chart. We show that these charts overcome some of the shortcomings of the more traditional charts…
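    A conventional p chart for per-class satisfaction proportions, with per-subgroup limits to handle unequal class sizes, can be sketched as follows. This is a generic illustration with toy counts, not the modified "p" chart or z-score chart proposed in the article:

    ```python
    import math

    def p_chart_limits(counts, sizes):
        """p chart with per-subgroup 3-sigma limits for unequal sample sizes.
        counts[i] = number of 'satisfied' responses in class i of sizes[i]."""
        p_bar = sum(counts) / sum(sizes)
        limits = []
        for n in sizes:
            se = math.sqrt(p_bar * (1 - p_bar) / n)
            limits.append((max(0.0, p_bar - 3 * se), min(1.0, p_bar + 3 * se)))
        return p_bar, limits
    ```

    A class whose observed proportion falls below its lower limit would be flagged for investigation rather than attributed to common-cause variation.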

  5. Social and Environmental Impacts of Forest Management Certification in Indonesia

    PubMed Central

    Miteva, Daniela A.; Loucks, Colby J.; Pattanayak, Subhrendu K.

    2015-01-01

    In response to unsustainable timber production in tropical forest concessions, voluntary forest management certification programs such as the Forest Stewardship Council (FSC) have been introduced to improve environmental, social, and economic performance over existing management practices. However, despite the proliferation of forest certification over the past two decades, few studies have evaluated its effectiveness. Using temporally and spatially explicit village-level data on environmental and socio-economic indicators in Kalimantan (Indonesia), we evaluate the performance of the FSC-certified timber concessions compared to non-certified logging concessions. Employing triple difference matching estimators, we find that between 2000 and 2008 FSC reduced aggregate deforestation by 5 percentage points and the incidence of air pollution by 31%. It had no statistically significant impacts on fire incidence or core areas, but increased forest perforation by 4 km2 on average. In addition, we find that FSC reduced firewood dependence (by 33%), respiratory infections (by 32%) and malnutrition (by 1 person) on average. By conducting a rigorous statistical evaluation of FSC certification in a biodiversity hotspot such as Indonesia, we provide a reference point and offer methodological and data lessons that could aid the design of ongoing and future evaluations of a potentially critical conservation policy. PMID:26132491

  6. [Evaluation of using statistical methods in selected national medical journals].

    PubMed

    Sych, Z

    1996-01-01

    The paper evaluates the frequency with which statistical methods were applied in papers published in six selected national medical journals in the years 1988-1992. The journals chosen were: Klinika Oczna, Medycyna Pracy, Pediatria Polska, Polski Tygodnik Lekarski, Roczniki Państwowego Zakładu Higieny, and Zdrowie Publiczne. From the respective volumes of Pol. Tyg. Lek., a number of papers corresponding to the average in the remaining journals was randomly selected. Papers in which no statistical analysis was implemented were excluded, for both national and international publications; the exclusion also extended to review papers, case reports, reviews of books, handbooks and monographs, reports from scientific congresses, and papers on historical topics. The number of papers was determined for each volume. Next, the mode of selecting the study sample in each paper was analyzed, distinguishing two categories: random and purposive selection. Attention was also paid to the presence of a control sample in the individual papers, and to the sample characteristics, for which three categories were set up: complete, partial, and lacking. The results of the studies are presented in tables and figures (Tab. 1, 3). The rate of use of statistical methods was analyzed across the relevant volumes of the six journals for the years 1988-1992, simultaneously determining the number of papers in which no statistical methods were used, and the frequency of the individual statistical methods was analyzed in the scrutinized papers. 
Prominence was given to fundamental methods of descriptive statistics (measures of position, measures of dispersion) and to the most important methods of mathematical statistics, such as parametric tests of significance, analysis of variance (in single and dual classifications), non-parametric tests of significance, and correlation and regression. Papers using multiple correlation, multiple regression, or more complex methods for studying relationships among two or more variables were counted with those whose statistical methods consisted of correlation and regression, as well as with other methods, e.g. statistical methods used in epidemiology (coefficients of incidence and morbidity, standardization of coefficients, survival tables), factor analysis by the Jacobi-Hotelling method, taxonomic methods, and others. On the basis of the performed studies, it was established that the frequency of statistical methods in the six selected national medical journals in the years 1988-1992 was 61.1-66.0% of the analyzed papers (Tab. 3), generally similar to the frequency reported for English-language medical journals. On the whole, no significant differences were found in the frequency of the applied statistical methods (Tab. 4) or in the frequency of random samples (Tab. 3) in the analyzed papers across the respective years 1988-1992. The statistical methods used most frequently in the analyzed papers for 1988-1992 were measures of position (44.2-55.6%), measures of dispersion (32.5-38.5%), and parametric tests of significance (26.3-33.1% of the papers analyzed) (Tab. 4). To increase the frequency and reliability of the statistical methods used, the teaching of biostatistics should be expanded in medical studies and in postgraduate training for physicians and scientific-didactic staff.

  7. Value assignment and uncertainty evaluation for single-element reference solutions

    NASA Astrophysics Data System (ADS)

    Possolo, Antonio; Bodnar, Olha; Butler, Therese A.; Molloy, John L.; Winchester, Michael R.

    2018-06-01

    A Bayesian statistical procedure is proposed for value assignment and uncertainty evaluation for the mass fraction of the elemental analytes in single-element solutions distributed as NIST standard reference materials. The principal novelty that we describe is the use of information about relative differences observed historically between the measured values obtained via gravimetry and via high-performance inductively coupled plasma optical emission spectrometry, to quantify the uncertainty component attributable to between-method differences. This information is encapsulated in a prior probability distribution for the between-method uncertainty component, and it is then used, together with the information provided by current measurement data, to produce a probability distribution for the value of the measurand from which an estimate and evaluation of uncertainty are extracted using established statistical procedures.

  8. Accuracy assessment: The statistical approach to performance evaluation in LACIE. [Great Plains corridor, United States

    NASA Technical Reports Server (NTRS)

    Houston, A. G.; Feiveson, A. H.; Chhikara, R. S.; Hsu, E. M. (Principal Investigator)

    1979-01-01

    A statistical methodology was developed to check the accuracy of the products of the experimental operations throughout crop growth and to determine whether the procedures are adequate to accomplish the desired accuracy and reliability goals. It has allowed the identification and isolation of key problems in wheat area yield estimation, some of which have been corrected and some of which remain to be resolved. The major unresolved problem in accuracy assessment is that of precisely estimating the bias of the LACIE production estimator. Topics covered include: (1) evaluation techniques; (2) variance and bias estimation for the wheat production estimate; (3) the 90/90 evaluation; (4) comparison of the LACIE estimate with reference standards; and (5) first and second order error source investigations.

  9. [Statistical process control applied to intensity modulated radiotherapy pretreatment controls with portal dosimetry].

    PubMed

    Villani, N; Gérard, K; Marchesi, V; Huger, S; François, P; Noël, A

    2010-06-01

    The first purpose of this study was to illustrate the contribution of statistical process control to improving the security of intensity modulated radiotherapy (IMRT) treatments. This improvement is possible by controlling the dose delivery process, characterized by pretreatment quality control results. It is therefore necessary to bring portal dosimetry measurements under control (the ionisation chamber measurements were already monitored with statistical process control tools). The second objective was to determine whether the ionisation chamber could be replaced by portal dosimetry in order to optimize the time devoted to pretreatment quality control. At the Alexis-Vautrin center, pretreatment quality controls in IMRT for prostate and head and neck treatments were performed for each beam of each patient. These controls were made with an ionisation chamber, which is the reference detector for absolute dose measurement, and with portal dosimetry for the verification of the dose distribution. Statistical process control is a statistical analysis method, originating in industry, used to control and improve the quality of the studied process. It uses graphic tools, such as control charts, to follow up the process and warn the operator in case of failure, and quantitative tools to evaluate the ability of the process to respect guidelines: this is the capability study. The study was performed on 450 head and neck beams and on 100 prostate beams. Control charts of the mean and standard deviation, showing drifts both slow and weak as well as strong and fast, were established and revealed a special cause that had been introduced (a manual shift of the leaf gap of the multileaf collimator). The correlation between the dose measured at one point with the EPID and with the ionisation chamber was evaluated at more than 97%, and cases of disagreement between the two measurements were identified. 
The study demonstrated the feasibility of reducing the time devoted to pretreatment controls by substituting the ionisation chamber measurements with those performed with the EPID, and showed that statistical process control monitoring of the data provided a guarantee of security. 2010 Société française de radiothérapie oncologique (SFRO). Published by Elsevier SAS. All rights reserved.
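    The capability study mentioned above compares the process spread to its tolerance limits. A minimal sketch of the usual Cp/Cpk indices, with hypothetical measurements and specification limits (not the center's actual tolerances):

    ```python
    import statistics

    def capability(data, lsl, usl):
        """Process capability indices from sample mean and standard deviation.
        Cp compares the tolerance width to 6 sigma; Cpk also penalizes
        off-center processes. lsl/usl are lower/upper specification limits."""
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)  # sample estimate; long-run sigma in practice
        cp = (usl - lsl) / (6 * sigma)
        cpk = min(usl - mu, mu - lsl) / (3 * sigma)
        return cp, cpk
    ```

    A centered process gives Cp == Cpk; values above roughly 1.33 are conventionally read as "capable", though the threshold is a local quality-policy choice.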

  10. The MSFC UNIVAC 1108 EXEC 8 simulation model

    NASA Technical Reports Server (NTRS)

    Williams, T. G.; Richards, F. M.; Weatherbee, J. E.; Paul, L. K.

    1972-01-01

    A model is presented which simulates the MSFC Univac 1108 multiprocessor system. The hardware/operating system is described to enable a good statistical measurement of the system behavior. The performance of the 1108 is evaluated by performing twenty-four different experiments designed to locate system bottlenecks and also to test the sensitivity of system throughput with respect to perturbation of the various Exec 8 scheduling algorithms. The model is implemented in the general purpose system simulation language and the techniques described can be used to assist in the design, development, and evaluation of multiprocessor systems.

  11. The discriminatory capability of existing scores to predict advanced colorectal neoplasia: a prospective colonoscopy study of 5,899 screening participants.

    PubMed

    Wong, Martin C S; Ching, Jessica Y L; Ng, Simpson; Lam, Thomas Y T; Luk, Arthur K C; Wong, Sunny H; Ng, Siew C; Ng, Simon S M; Wu, Justin C Y; Chan, Francis K L; Sung, Joseph J Y

    2016-02-03

    We evaluated the performance of seven existing risk scoring systems in predicting advanced colorectal neoplasia in an asymptomatic Chinese cohort. We prospectively recruited 5,899 Chinese subjects aged 50-70 years in a colonoscopy screening programme (2008-2014). Scoring systems under evaluation included two scoring tools from the US; one each from Spain, Germany, and Poland; the Korean Colorectal Screening (KCS) scores; and the modified Asia Pacific Colorectal Screening (APCS) scores. The c-statistics, sensitivity, specificity, positive predictive values (PPVs), and negative predictive values (NPVs) of these systems were evaluated. The resources required were estimated based on the Number Needed to Screen (NNS) and the Number Needed to Refer for colonoscopy (NNR). Advanced neoplasia was detected in 364 (6.2%) subjects. The German system referred the smallest proportion of subjects (11.2%) for colonoscopy, whilst the KCS scoring system referred the highest (27.4%). The c-statistics of all systems ranged from 0.56 to 0.65, with sensitivities ranging from 0.04 to 0.44 and specificities from 0.74 to 0.99. The modified APCS scoring system had the highest c-statistic (0.65, 95% C.I. 0.58-0.72). The NNS (12-19) and NNR (5-10) were similar among the scoring systems. The existing scoring systems have variable capability to predict advanced neoplasia among asymptomatic Chinese subjects, and further external validation should be performed.
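    The c-statistic and the threshold-based metrics evaluated here can be computed directly from scores and outcomes. A generic sketch with toy data, not the study's scoring systems:

    ```python
    def c_statistic(scores, labels):
        """Concordance (c) statistic: the probability that a randomly chosen
        positive case scores higher than a randomly chosen negative case,
        with ties counting one half. Equivalent to the ROC AUC."""
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        return concordant / (len(pos) * len(neg))

    def sens_spec(scores, labels, cutoff):
        """Sensitivity and specificity when scores >= cutoff are called positive."""
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        return tp / (tp + fn), tn / (tn + fp)
    ```

    A c-statistic of 0.5 corresponds to chance-level discrimination, which puts the reported range of 0.56-0.65 in context.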

  12. Statistical Evaluation of VIIRS Ocean Color Products

    NASA Astrophysics Data System (ADS)

    Mikelsons, K.; Wang, M.; Jiang, L.

    2016-02-01

    Evaluation and validation of satellite-derived ocean color products is a complicated task, which often relies on precise in-situ measurements for satellite data quality assessment. However, in-situ measurements are available at only comparatively few locations and times, and they are expensive to acquire. In the open ocean, variability occurs over longer spatial and temporal scales, and water conditions are generally more stable. We use this fact to perform extensive statistical evaluations of the consistency of ocean color retrievals, based on comparison of data retrieved at different times and for various retrieval parameters. We used the NOAA Multi-Sensor Level-1 to Level-2 (MSL12) ocean color data processing system for ocean color products derived from the Visible Infrared Imaging Radiometer Suite (VIIRS). We show results for the statistical dependence of normalized water-leaving radiance spectra on various parameters of the retrieval geometry, such as solar- and sensor-zenith angles, as well as on physical variables such as wind speed, air pressure, ozone amount, and water vapor. In most cases, the results show consistent retrievals within the relevant range of retrieval parameters, indicating good performance of MSL12 in the open ocean. The results also yield upper bounds on the solar- and sensor-zenith angles for reliable ocean color retrievals, and show a slight increase of VIIRS-derived normalized water-leaving radiances with wind speed and water vapor concentration.

  13. An investigation of new toxicity test method performance in validation studies: 1. Toxicity test methods that have predictive capacity no greater than chance.

    PubMed

    Bruner, L H; Carr, G J; Harbell, J W; Curren, R D

    2002-06-01

    An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then to identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between the results of an NTM and an RTM is random, the sum of each complementary pair of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and the PPV are equal. 
Given that combinations of high sensitivity-low specificity or low sensitivity-high specificity (i.e., the sum of the sensitivity and specificity equal to approximately 1) indicate a lack of predictive capacity, an NTM with these performance characteristics should be considered no better at predicting toxicity than chance alone.
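    The limiting behavior described, sensitivity + specificity ≈ 1 and PPV ≈ prevalence under random association, can be reproduced with a small simulation. A sketch with arbitrary prevalence and call-rate values (any values exhibit the same pattern):

    ```python
    import random

    def random_association_cps(n=100000, prevalence=0.3, call_rate=0.5, seed=1):
        """Simulate an NTM whose positive/negative calls are independent of
        the RTM reference label, then compute the contingent probability
        statistics: sensitivity, specificity, PPV, NPV."""
        rng = random.Random(seed)
        tp = fp = tn = fn = 0
        for _ in range(n):
            toxic = rng.random() < prevalence      # RTM reference label
            positive = rng.random() < call_rate    # independent NTM call
            if toxic and positive:
                tp += 1
            elif toxic:
                fn += 1
            elif positive:
                fp += 1
            else:
                tn += 1
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        ppv = tp / (tp + fp)
        npv = tn / (tn + fn)
        return sens, spec, ppv, npv
    ```

    With no true association, sensitivity tracks the call rate and specificity its complement, so their sum hovers near 1 regardless of the cut-off chosen.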

  14. Evaluation of scheduling techniques for payload activity planning

    NASA Technical Reports Server (NTRS)

    Bullington, Stanley F.

    1991-01-01

    Two tasks related to payload activity planning and scheduling were performed. The first task involved making a comparison of space mission activity scheduling problems with production scheduling problems. The second task consisted of a statistical analysis of the output of runs of the Experiment Scheduling Program (ESP). Details of the work which was performed on these two tasks are presented.

  15. Girls in Science and Technology in Secondary and Post-Secondary Education: The Case of France

    ERIC Educational Resources Information Center

    Stevanovic, Biljana

    2014-01-01

    Based on surveys undertaken by the Institut national de la statistique et des études économiques (France's National Institute of Statistics and Economic Studies) and by the Direction de l'évaluation de la prospective et de la performance (Directorate of Evaluation, Forecasting and Performance), this article examines the evolution of female student…

  16. A Monte Carlo investigation of thrust imbalance of solid rocket motor pairs

    NASA Technical Reports Server (NTRS)

    Sforzini, R. H.; Foster, W. A., Jr.; Johnson, J. S., Jr.

    1974-01-01

    A technique is described for theoretical, statistical evaluation of the thrust imbalance of pairs of solid-propellant rocket motors (SRMs) firing in parallel. Sets of the significant variables, determined as a part of the research, are selected using a random sampling technique and the imbalance calculated for a large number of motor pairs. The performance model is upgraded to include the effects of statistical variations in the ovality and alignment of the motor case and mandrel. Effects of cross-correlations of variables are minimized by selecting for the most part completely independent input variables, over forty in number. The imbalance is evaluated in terms of six time-varying parameters as well as eleven single-valued ones which themselves are subject to statistical analysis. A sample study of the thrust imbalance of 50 pairs of 146 in. dia. SRMs of the type to be used on the space shuttle is presented. The FORTRAN IV computer program of the analysis and complete instructions for its use are included. Performance computation time for one pair of SRMs is approximately 35 seconds on the IBM 370/155 using the FORTRAN H compiler.
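    The Monte Carlo scheme, sampling independent input variables for each motor of a pair and evaluating the imbalance over many pairs, can be caricatured as below. The two-variable "thrust model" and its sigma values are purely illustrative stand-ins for the forty-plus variables of the actual program:

    ```python
    import random

    def simulate_imbalance(pairs=1000, seed=42):
        """Toy Monte Carlo: draw independent perturbations for each motor of
        a pair, evaluate a crude normalized thrust, and record the absolute
        imbalance for each pair. Returns (mean imbalance, worst case)."""
        rng = random.Random(seed)

        def thrust():
            burn_rate = rng.gauss(1.0, 0.01)     # normalized; 1% sigma assumed
            throat_area = rng.gauss(1.0, 0.005)  # normalized; 0.5% sigma assumed
            return burn_rate / throat_area        # illustrative proportionality

        imbalances = [abs(thrust() - thrust()) for _ in range(pairs)]
        return sum(imbalances) / pairs, max(imbalances)
    ```

    The single-valued output parameters of the real study (e.g. peak imbalance) would themselves be summarized statistically across the sampled pairs, as in the last line here.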

  17. Validating the simulation of large-scale parallel applications using statistical characteristics

    DOE PAGES

    Zhang, Deli; Wilke, Jeremiah; Hendry, Gilbert; ...

    2016-03-01

    Simulation is a widely adopted method to analyze and predict the performance of large-scale parallel applications. Validating the hardware model is highly important for complex simulations with a large number of parameters. Common practice involves calculating the percent error between the projected and the real execution time of a benchmark program. However, in a high-dimensional parameter space, this coarse-grained approach often suffers from parameter insensitivity, which may not be known a priori. Moreover, the traditional approach cannot be applied to the validation of software models, such as application skeletons used in online simulations. In this work, we present a methodology and a toolset for validating both hardware and software models by quantitatively comparing fine-grained statistical characteristics obtained from execution traces. Although statistical information has been used in tasks like performance optimization, this is the first attempt to apply it to simulation validation. Lastly, our experimental results show that the proposed evaluation approach offers significant improvement in fidelity when compared to evaluation using total execution time, and the proposed metrics serve as reliable criteria that progress toward automating the simulation tuning process.

  18. 2008 Post-Election Voting Survey of Department of State Voting Assistance Officers: Statistical Methodology Report

    DTIC Science & Technology

    2009-08-01

    Mike Wilson, Westat, Inc. developed weights for this survey. Westat performed data collection and editing. DMDC’s Survey Technology Branch, under...STATISTICAL METHODOLOGY REPORT Executive Summary The Uniformed and Overseas Citizens Absentee Voting Act of 1986 (UOCAVA), 42 USC 1973ff, permits members of...citizens covered by UOCAVA, (2) to assess the impact of the FVAP’s efforts to simplify and ease the process of voting absentee , (3) to evaluate other

  19. Statistical metrology—measurement and modeling of variation for advanced process development and design rule generation

    NASA Astrophysics Data System (ADS)

    Boning, Duane S.; Chung, James E.

    1998-11-01

    Advanced process technology will require more detailed understanding and tighter control of variation in devices and interconnects. The purpose of statistical metrology is to provide methods to measure and characterize variation, to model systematic and random components of that variation, and to understand the impact of variation on both yield and performance of advanced circuits. Of particular concern are spatial or pattern-dependencies within individual chips; such systematic variation within the chip can have a much larger impact on performance than wafer-level random variation. Statistical metrology methods will play an important role in the creation of design rules for advanced technologies. For example, a key issue in multilayer interconnect is the uniformity of interlevel dielectric (ILD) thickness within the chip. For the case of ILD thickness, we describe phases of statistical metrology development and application to understanding and modeling thickness variation arising from chemical-mechanical polishing (CMP). These phases include screening experiments, including the design of test structures and test masks to gather electrical or optical data; techniques for statistical decomposition and analysis of the data; and approaches to calibrating empirical and physical variation models. These models can be integrated with circuit CAD tools to evaluate different process integration or design rule strategies. One focus for the generation of interconnect design rules is guidelines for the use of "dummy fill" or "metal fill" to improve the uniformity of underlying metal density and thus improve the uniformity of oxide thickness within the die. Trade-offs that can be evaluated via statistical metrology include the improvements to uniformity possible versus the effect of increased capacitance due to additional metal.

  20. Statistical significance of trace evidence matches using independent physicochemical measurements

    NASA Astrophysics Data System (ADS)

    Almirall, Jose R.; Cole, Michael; Furton, Kenneth G.; Gettinby, George

    1997-02-01

    A statistical approach to the significance of glass evidence is proposed using independent physicochemical measurements and chemometrics. Traditional interpretation of the significance of trace evidence matches or exclusions relies on qualitative descriptors such as 'indistinguishable from,' 'consistent with,' 'similar to' etc. By performing physical and chemical measurements which are independent of one another, the significance of object exclusions or matches can be evaluated statistically. One of the problems with this approach is that the human brain is excellent at recognizing and classifying patterns and shapes but performs less well when an object is represented by a numerical list of attributes. Chemometrics can be employed to group similar objects using clustering algorithms and to provide statistical significance in a quantitative manner. This approach is enhanced when population databases exist or can be created and the data in question can be evaluated against these databases. Since the selection of the variables used and their pre-processing can greatly influence the outcome, several different methods could be employed in order to obtain a more complete picture of the information contained in the data. Presently, we report on the analysis of glass samples using refractive index measurements and the quantitative analysis of the concentrations of the metals: Mg, Al, Ca, Fe, Mn, Ba, Sr, Ti and Zr. The extension of this general approach to fiber and paint comparisons also is discussed. This statistical approach should not replace the current interpretative approaches to trace evidence matches or exclusions but rather yields an additional quantitative measure. The lack of sufficient general population databases containing the needed physicochemical measurements and the potential for confusion arising from statistical analysis currently hamper this approach, and ways of overcoming these obstacles are presented.

  1. A note on the kappa statistic for clustered dichotomous data.

    PubMed

    Zhou, Ming; Yang, Zhao

    2014-06-30

    The kappa statistic is widely used to assess the agreement between two raters. Motivated by a simulation-based cluster bootstrap method to calculate the variance of the kappa statistic for clustered physician-patient dichotomous data, we investigate its special correlation structure and develop a new simple and efficient data generation algorithm. For clustered physician-patient dichotomous data, based on the delta method and its special covariance structure, we propose a semi-parametric variance estimator for the kappa statistic. An extensive Monte Carlo simulation study is performed to evaluate the performance of the new proposal and five existing methods with respect to the empirical coverage probability, root-mean-square error, and average width of the 95% confidence interval for the kappa statistic. The variance estimator ignoring the dependence within a cluster is generally inappropriate, and the variance estimators from the new proposal, bootstrap-based methods, and the sampling-based delta method perform reasonably well for at least a moderately large number of clusters (e.g., the number of clusters K ⩾ 50). The new proposal and the sampling-based delta method provide convenient tools for efficient computations and non-simulation-based alternatives to the existing bootstrap-based methods. Moreover, the new proposal has acceptable performance even when the number of clusters is as small as K = 25. To illustrate the practical application of all the methods, one psychiatric research dataset and two simulated clustered physician-patient dichotomous datasets are analyzed. Copyright © 2014 John Wiley & Sons, Ltd.
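    For reference, the point estimate of the (unclustered) kappa statistic for two raters with dichotomous ratings is computed as below; the clustered variance estimators discussed in the paper are considerably more involved and are not reproduced here:

    ```python
    def cohens_kappa(ratings):
        """Cohen's kappa for two raters and dichotomous (0/1) ratings.
        `ratings` is a list of (rater1, rater2) pairs. Kappa compares observed
        agreement p_o with chance agreement p_e from the marginal rates."""
        n = len(ratings)
        po = sum(a == b for a, b in ratings) / n
        p1 = sum(a for a, _ in ratings) / n   # rater 1 positive rate
        p2 = sum(b for _, b in ratings) / n   # rater 2 positive rate
        pe = p1 * p2 + (1 - p1) * (1 - p2)
        return (po - pe) / (1 - pe)
    ```

    Kappa is 1 for perfect agreement and 0 when agreement equals what the marginal rates alone would predict; the paper's contribution concerns its variance when the pairs are not independent but nested within clusters.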

  2. Statistical Performance Evaluation Of Soft Seat Pressure Relief Valves

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harris, Stephen P.; Gross, Robert E.

    2013-03-26

    Risk-based inspection methods enable estimation of the probability of failure on demand for spring-operated pressure relief valves at the United States Department of Energy's Savannah River Site in Aiken, South Carolina. This paper presents a statistical performance evaluation of soft seat spring operated pressure relief valves. These pressure relief valves are typically smaller and of lower cost than hard seat (metal to metal) pressure relief valves and can provide substantial cost savings in fluid service applications (air, gas, liquid, and steam) providing that probability of failure on demand (the probability that the pressure relief valve fails to perform its intended safety function during a potentially dangerous over pressurization) is at least as good as that for hard seat valves. The research in this paper shows that the proportion of soft seat spring operated pressure relief valves failing is the same or less than that of hard seat valves, and that for failed valves, soft seat valves typically have failure ratios of proof test pressure to set pressure less than that of hard seat valves.
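At its simplest, the probability of failure on demand above is a failure proportion over proof tests. A minimal sketch, using a normal-approximation confidence interval (my assumption for illustration, not the risk-based method actually used at Savannah River):

```python
import math

def pfd_estimate(failures, demands, z=1.96):
    """Point estimate and approximate 95% CI for the probability of
    failure on demand, from counts of failed proof tests and total tests.
    Uses the normal (Wald) approximation, clipped to [0, 1]."""
    p = failures / demands
    se = math.sqrt(p * (1 - p) / demands)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)
```

Comparing soft seat against hard seat valves would then amount to comparing two such proportions; for small failure counts an exact (binomial) interval would be preferable to this approximation.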

  3. Assessing residents' operative skills for external ventricular drain placement and shunt surgery in pediatric neurosurgery.

    PubMed

    Aldave, Guillermo; Hansen, Daniel; Briceño, Valentina; Luerssen, Thomas G; Jea, Andrew

    2017-04-01

    OBJECTIVE The authors previously demonstrated the use of a validated Objective Structured Assessment of Technical Skills (OSATS) tool for evaluating residents' operative skills in pediatric neurosurgery. However, no benchmarks have been established for specific pediatric procedures despite an increased need for meaningful assessments that can either allow for early intervention for underperforming trainees or allow proficient residents to progress to conducting operations independently with more passive supervision. This validated methodology and tool for assessment of operative skills for common pediatric neurosurgical procedures, external ventricular drain (EVD) placement and shunt surgery, were applied to establish procedure-based feasibility and reliability, and to document the effect of repetition on achieving surgical skill proficiency in pediatric EVD placement and shunt surgery. METHODS A procedure-based technical skills assessment for EVD placements and shunt surgeries in pediatric neurosurgery was established through the use of task analysis. The authors enrolled all residents from 3 training programs (Baylor College of Medicine, Houston Methodist Hospital, and University of Texas-Medical Branch) who rotated through pediatric neurosurgery at Texas Children's Hospital over a 26-month period. For each EVD placement or shunt procedure performed with a resident, the faculty and resident (for self-assessment) completed an evaluation form (OSATS) based on a 5-point Likert scale with 7 categories. Data forms were then grouped according to faculty versus resident (self) assessment, length of pediatric neurosurgery rotation, postgraduate year level, and date of evaluation ("beginning of rotation," within 1 month of start date; "end of rotation," within 1 month of completion date; or "middle of rotation"). Descriptive statistical analyses were performed with the commercially available SPSS statistical software package.
A p value < 0.05 was considered statistically significant. RESULTS Five attending evaluators (including 2 fellows who acted as attending surgeons) completed 260 evaluations. Twenty house staff completed 269 evaluations for self-assessment. Evaluations were completed in 562 EVD and shunt procedures before the surgeons left the operating room. There were statistically significant differences (p < 0.05) between overall attending (mean 4.3) and junior resident (self; mean 3.6) assessments, and between overall attending (mean 4.8) and senior resident (self; mean 4.6) assessment scores on general performance and technical skills. The learning curves produced for the residents demonstrate a stereotypical U- or V-shaped curve for acquiring skills, with a significant improvement in overall scores at the end of the rotation compared with the beginning. The improvement for junior residents (Δ score = 0.5; p = 0.002) was larger than for senior residents (Δ score = 0.2; p = 0.018). CONCLUSIONS The OSATS is an effective assessment tool as part of a comprehensive evaluation of neurosurgery residents' performance for specific pediatric procedures. The authors observed a U-shaped learning curve, contradicting the idea that developing one's surgical technique and learning a procedure represents a monotonic, cumulative process of repetitions and improvement.

  4. Seismic activity prediction using computational intelligence techniques in northern Pakistan

    NASA Astrophysics Data System (ADS)

    Asim, Khawaja M.; Awais, Muhammad; Martínez-Álvarez, F.; Iqbal, Talat

    2017-10-01

    An earthquake prediction study is carried out for the region of northern Pakistan. The prediction methodology involves an interdisciplinary interaction of seismology and computational intelligence. Eight seismic parameters are computed based upon past earthquakes. The predictive ability of these eight seismic parameters is evaluated in terms of information gain, which leads to the selection of six parameters to be used in prediction. Multiple computationally intelligent models have been developed for earthquake prediction using the selected seismic parameters. These models include a feed-forward neural network, recurrent neural network, random forest, multilayer perceptron, radial basis neural network, and support vector machine. The performance of every prediction model is evaluated, and McNemar's statistical test is applied to assess the statistical significance of the computational methodologies. The feed-forward neural network shows statistically significant predictions along with an accuracy of 75% and a positive predictive value of 78% in the context of northern Pakistan.
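McNemar's test, used above to compare paired predictions from two models on the same events, can be sketched in pure Python. This is a generic textbook version (chi-square(1) approximation, optional continuity correction), not code from the study itself:

```python
import math

def mcnemar(b, c, correction=True):
    """McNemar's test for two classifiers evaluated on the same cases.
    b = cases model A got right and model B got wrong; c = the reverse.
    Returns (chi-square statistic, approximate two-sided p-value)."""
    num = (abs(b - c) - 1) ** 2 if correction else (b - c) ** 2
    chi2 = num / (b + c)
    # Survival function of a chi-square(1) variable at chi2
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

Only the discordant counts b and c matter: cases where both models agree carry no information about which model is better. For very small b + c, an exact binomial test on b out of b + c would be used instead of this approximation.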

  5. Subjective comparison and evaluation of speech enhancement algorithms

    PubMed Central

    Hu, Yi; Loizou, Philipos C.

    2007-01-01

    Making meaningful comparisons between the performance of the various speech enhancement algorithms proposed over the years has been elusive due to the lack of a common speech database, differences in the types of noise used, and differences in testing methodology. To facilitate such comparisons, we report on the development of a noisy speech corpus suitable for the evaluation of speech enhancement algorithms. This corpus is subsequently used for the subjective evaluation of 13 speech enhancement methods encompassing four classes of algorithms: spectral subtractive, subspace, statistical-model based, and Wiener-type algorithms. The subjective evaluation was performed by Dynastat, Inc. using the ITU-T P.835 methodology designed to evaluate speech quality along three dimensions: signal distortion, noise distortion, and overall quality. This paper reports the results of the subjective tests. PMID:18046463

  6. Information retrieval and terminology extraction in online resources for patients with diabetes.

    PubMed

    Seljan, Sanja; Baretić, Maja; Kucis, Vlasta

    2014-06-01

    Terminology use, as a means of information retrieval or document indexing, plays an important role in health literacy. Specific types of users, i.e., patients with diabetes, need access to various online resources (in foreign and/or native languages) when searching for information on basic diabetic knowledge, on self-care activities such as the importance of dietetic food, medication, and physical exercise, and on self-management of insulin pumps. Automatic extraction of corpus-based terminology from online texts, manuals, or professional papers can help in building terminology lists or lists of "browsing phrases" useful in information retrieval or document indexing. Specific terminology lists represent an intermediate step between free-text search and a controlled vocabulary, between users' demands and existing online resources in native and foreign languages. The research, aiming to detect the role of terminology in online resources, is conducted on English and Croatian manuals and Croatian online texts, and is divided into three interrelated parts: i) comparison of professional and popular terminology use; ii) evaluation of automatic statistically based terminology extraction on English and Croatian texts; iii) comparison and evaluation of extracted terminology performed on an English manual using statistical and hybrid approaches. Extracted terminology candidates are evaluated by comparison with three types of reference lists: a list created by a medical professional, a list of highly professional vocabulary contained in MeSH, and a list created by non-medical persons, made as the intersection of 15 lists. Results report on the use of popular and professional terminology in online diabetes resources, on the evaluation of automatically extracted terminology candidates in English and Croatian texts, and on the comparison of statistical and hybrid extraction methods on the English text. 
Evaluation of the automatic and semi-automatic terminology extraction methods is performed using recall, precision, and F-measure.
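The precision, recall, and F-measure used above to score extracted term lists against a reference list reduce to simple set arithmetic. A minimal sketch (function name and example terms are mine):

```python
def prf(extracted, reference):
    """Precision, recall, and F1 of an extracted term list
    against a reference term list (both treated as sets)."""
    extracted, reference = set(extracted), set(reference)
    tp = len(extracted & reference)           # correctly extracted terms
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Precision penalizes spurious candidates, recall penalizes missed reference terms, and F1 is their harmonic mean; comparing the three reference lists described above simply means calling this with a different `reference` each time.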

  7. [Statistical analysis using freely-available "EZR (Easy R)" software].

    PubMed

    Kanda, Yoshinobu

    2015-10-01

    Clinicians must often perform statistical analyses for purposes such as evaluating preexisting evidence and designing or executing clinical studies. R is a free software environment for statistical computing. R supports many statistical analysis functions, but does not incorporate a statistical graphical user interface (GUI). The R commander provides an easy-to-use basic-statistics GUI for R. However, the statistical functionality of the R commander is limited, especially in the field of biostatistics. Therefore, the author added several important statistical functions to the R commander and named it "EZR (Easy R)", which is now being distributed on the following website: http://www.jichi.ac.jp/saitama-sct/. EZR allows the application of statistical functions that are frequently used in clinical studies, such as survival analyses, including competing risk analyses and the use of time-dependent covariates, by point-and-click access. In addition, by saving the script automatically created by EZR, users can learn R script writing, maintain the traceability of the analysis, and assure that the statistical process is overseen by a supervisor.

  8. Evaluation of the Air Void Analyzer

    DTIC Science & Technology

    2013-07-01

    lack of measurement would help explain the difference in values shown. Brief descriptions of other unpublished testing (Wang et al. 2008)  CTL Group...structure measurements taken from the controlled laboratory mixtures. A three-phase approach was used to evaluate the machine. First, a global ...method. Hypothesis testing using t-statistics was performed to increase understanding of the data collected globally in terms of the processes used for

  9. A Study of Strengths and Weaknesses of Descriptive Assessment from Principals, Teachers and Experts Points of View in Chaharmahal and Bakhteyari Primary Schools

    ERIC Educational Resources Information Center

    Sharief, Mostafa; Naderi, Mahin; Hiedari, Maryam Shoja; Roodbari, Omolbanin; Jalilvand, Mohammad Reza

    2012-01-01

    The aim of current study is to determine the strengths and weaknesses of descriptive evaluation from the viewpoint of principals, teachers and experts of Chaharmahal and Bakhtiari province. A descriptive survey was performed. Statistical population includes 208 principals, 303 teachers, and 100 executive experts of descriptive evaluation scheme in…

  10. Statistical analysis of DOE EML QAP data from 1982 to 1998.

    PubMed

    Mizanur Rahman, G M; Isenhour, T L; Larget, B; Greenlaw, P D

    2001-01-01

    The historical database from the Environmental Measurements Laboratory's Quality Assessment Program from 1982 to 1998 has been analyzed to determine control limits for future performance evaluations of the different laboratories contracted to the U.S. Department of Energy. Seventy-three radionuclides in four different matrices (air filter, soil, vegetation, and water) were analyzed. The evaluation criteria were established based on a z-score calculation.
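A z-score evaluation of this kind compares each laboratory's reported value to the known (reference) value in units of the assigned standard deviation. The sketch below uses the conventional proficiency-testing bands (|z| ≤ 2 acceptable, 2 < |z| ≤ 3 warning, |z| > 3 action); these bands are a common convention, not necessarily the EML program's exact criteria:

```python
def z_scores(reported, known, sigma):
    """z-score of each reported value against the known value, with a
    conventional proficiency-testing flag for each laboratory."""
    out = []
    for x in reported:
        z = (x - known) / sigma
        flag = ('acceptable' if abs(z) <= 2
                else 'warning' if abs(z) <= 3
                else 'action')
        out.append((z, flag))
    return out
```

Control limits derived from the historical database would correspond to choosing `sigma` (and possibly the bands) per radionuclide and matrix.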

  11. Evaluation program for secondary spacecraft cells

    NASA Technical Reports Server (NTRS)

    Christy, D. E.

    1972-01-01

    The life cycle test of secondary spacecraft electric cells is discussed. The purpose of the tests is to insure that all cells put into the life cycle test meet the required specifications. The evaluation program gathers statistical information concerning cell performance characteristics and limitations. Weaknesses in cell design which are discovered during the tests are reported to research facilities in order to increase the service life of the cells.

  12. Exploring Robust Methods for Evaluating Treatment and Comparison Groups in Chronic Care Management Programs

    PubMed Central

    Hamar, Brent; Bradley, Chastity; Gandy, William M.; Harrison, Patricia L.; Sidney, James A.; Coberley, Carter R.; Rula, Elizabeth Y.; Pope, James E.

    2013-01-01

    Abstract Evaluation of chronic care management (CCM) programs is necessary to determine the behavioral, clinical, and financial value of the programs. Financial outcomes of members who are exposed to interventions (treatment group) typically are compared to those not exposed (comparison group) in a quasi-experimental study design. However, because member assignment is not randomized, outcomes reported from these designs may be biased or inefficient if study groups are not comparable or balanced prior to analysis. Two matching techniques used to achieve balanced groups are Propensity Score Matching (PSM) and Coarsened Exact Matching (CEM). Unlike PSM, CEM has been shown to yield estimates of causal (program) effects that are lowest in variance and bias for any given sample size. The objective of this case study was to provide a comprehensive comparison of these 2 matching methods within an evaluation of a CCM program administered to a large health plan during a 2-year time period. Descriptive and statistical methods were used to assess the level of balance between comparison and treatment members pre matching. Compared with PSM, CEM retained more members, achieved better balance between matched members, and resulted in a statistically insignificant Wald test statistic for group aggregation. In terms of program performance, the results showed an overall higher medical cost savings among treatment members matched using CEM compared with those matched using PSM (-$25.57 versus -$19.78, respectively). Collectively, the results suggest CEM is a viable alternative, if not the most appropriate matching method, to apply when evaluating CCM program performance. (Population Health Management 2013;16:35–45) PMID:22788834

  13. Exploring robust methods for evaluating treatment and comparison groups in chronic care management programs.

    PubMed

    Wells, Aaron R; Hamar, Brent; Bradley, Chastity; Gandy, William M; Harrison, Patricia L; Sidney, James A; Coberley, Carter R; Rula, Elizabeth Y; Pope, James E

    2013-02-01

    Evaluation of chronic care management (CCM) programs is necessary to determine the behavioral, clinical, and financial value of the programs. Financial outcomes of members who are exposed to interventions (treatment group) typically are compared to those not exposed (comparison group) in a quasi-experimental study design. However, because member assignment is not randomized, outcomes reported from these designs may be biased or inefficient if study groups are not comparable or balanced prior to analysis. Two matching techniques used to achieve balanced groups are Propensity Score Matching (PSM) and Coarsened Exact Matching (CEM). Unlike PSM, CEM has been shown to yield estimates of causal (program) effects that are lowest in variance and bias for any given sample size. The objective of this case study was to provide a comprehensive comparison of these 2 matching methods within an evaluation of a CCM program administered to a large health plan during a 2-year time period. Descriptive and statistical methods were used to assess the level of balance between comparison and treatment members pre matching. Compared with PSM, CEM retained more members, achieved better balance between matched members, and resulted in a statistically insignificant Wald test statistic for group aggregation. In terms of program performance, the results showed an overall higher medical cost savings among treatment members matched using CEM compared with those matched using PSM (-$25.57 versus -$19.78, respectively). Collectively, the results suggest CEM is a viable alternative, if not the most appropriate matching method, to apply when evaluating CCM program performance.
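The core of Coarsened Exact Matching as described above is: coarsen each covariate into bins, then match treatment and comparison members exactly on the tuple of bins. A minimal sketch, where the covariate names, bin widths, and one-to-many matching are illustrative assumptions rather than the study's actual specification:

```python
from collections import defaultdict

def cem(treated, comparison, bins):
    """Coarsened Exact Matching sketch.
    treated, comparison: lists of dicts of numeric covariates.
    bins: dict mapping covariate name -> bin width.
    Returns (treated_member, matched_comparison_members) pairs,
    dropping treated members whose stratum has no comparison member."""
    def signature(member):
        # Coarsen each covariate to its bin index
        return tuple(int(member[k] // w) for k, w in bins.items())

    strata = defaultdict(list)
    for m in comparison:
        strata[signature(m)].append(m)
    return [(t, strata[signature(t)]) for t in treated
            if strata[signature(t)]]
```

Because matching is exact on the coarsened values, balance within strata is guaranteed by construction; the trade-off, visible in the member-retention figures reported above, is that members in unmatched strata are pruned.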

  14. [Quality of clinical studies published in the RBGO over one decade (1999-2009): methodological and ethical aspects and statistical procedures].

    PubMed

    de Sá, Joceline Cássia Ferezini; Marini, Gabriela; Gelaleti, Rafael Bottaro; da Silva, João Batista; de Azevedo, George Gantas; Rudge, Marilza Vieira Cunha

    2013-11-01

    To evaluate the evolution of the methodological and statistical design of publications in the Brazilian Journal of Gynecology and Obstetrics (RBGO) since resolution 196/96. A review of 133 articles published in 1999 (65) and 2009 (68) was performed by two independent reviewers with training in clinical epidemiology and scientific research methodology. We included all original clinical articles, case and series reports, and excluded editorials, letters to the editor, systematic reviews, experimental studies, opinion articles, and abstracts of theses and dissertations. Characteristics related to the methodological quality of the studies were analyzed in each article using a checklist that evaluated two criteria: methodological aspects and statistical procedures. We used descriptive statistics and the χ2 test for comparison of the two years. There was a difference between 1999 and 2009 regarding study design and statistical procedures, with more accurate procedures and the use of more robust tests in 2009. In RBGO, we observed an evolution in the methods of published articles and a more in-depth use of statistical analyses, with more sophisticated tests such as regression and multilevel analyses, which are essential techniques for the understanding and planning of health interventions, leading to fewer interpretation errors.

  15. Design of a testing strategy using non-animal based test methods: lessons learnt from the ACuteTox project.

    PubMed

    Kopp-Schneider, Annette; Prieto, Pilar; Kinsner-Ovaskainen, Agnieszka; Stanzel, Sven

    2013-06-01

    In the framework of toxicology, a testing strategy can be viewed as a series of steps which are taken to come to a final prediction about a characteristic of a compound under study. The testing strategy is performed as a single-step procedure, usually called a test battery, using simultaneously all information collected on different endpoints, or as tiered approach in which a decision tree is followed. Design of a testing strategy involves statistical considerations, such as the development of a statistical prediction model. During the EU FP6 ACuteTox project, several prediction models were proposed on the basis of statistical classification algorithms which we illustrate here. The final choice of testing strategies was not based on statistical considerations alone. However, without thorough statistical evaluations a testing strategy cannot be identified. We present here a number of observations made from the statistical viewpoint which relate to the development of testing strategies. The points we make were derived from problems we had to deal with during the evaluation of this large research project. A central issue during the development of a prediction model is the danger of overfitting. Procedures are presented to deal with this challenge. Copyright © 2012 Elsevier Ltd. All rights reserved.
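The overfitting danger flagged above is typically checked by estimating prediction error on data held out from model fitting, for example with k-fold cross-validation. A generic sketch (not the ACuteTox procedure; `fit` and `predict` are placeholder hooks for any classification algorithm):

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(xs, ys, k, fit, predict):
    """Mean held-out accuracy over k folds."""
    accs = []
    for fold in k_fold_indices(len(xs), k):
        hold = set(fold)
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys))
                 if i not in hold]
        model = fit(train)
        correct = sum(predict(model, xs[i]) == ys[i] for i in fold)
        accs.append(correct / len(fold))
    return sum(accs) / len(accs)
```

A large gap between training accuracy and this held-out accuracy is the operational symptom of overfitting; for small toxicology datasets, shuffling before splitting and repeating the procedure reduces the variance of the estimate.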

  16. Regular Formal Evaluation Sessions are Effective as Frame-of-Reference Training for Faculty Evaluators of Clerkship Medical Students.

    PubMed

    Hemmer, Paul A; Dadekian, Gregory A; Terndrup, Christopher; Pangaro, Louis N; Weisbrod, Allison B; Corriere, Mark D; Rodriguez, Rechell; Short, Patricia; Kelly, William F

    2015-09-01

    Face-to-face formal evaluation sessions between clerkship directors and faculty can facilitate the collection of trainee performance data and provide frame-of-reference training for faculty. We hypothesized that ambulatory faculty who attended evaluation sessions at least once in an academic year (attendees) would use the Reporter-Interpreter-Manager/Educator (RIME) terminology more appropriately than faculty who did not attend evaluation sessions (non-attendees). Investigators conducted a retrospective cohort study using the narrative assessments of ambulatory internal medicine clerkship students during the 2008-2009 academic year. The study included assessments of 49 clerkship medical students, which comprised 293 individual teacher narratives. Single-teacher written and transcribed verbal comments about student performance were masked and reviewed by a panel of experts who, by consensus, (1) determined whether RIME was used, (2) counted the number of RIME utterances, and (3) assigned a grade based on the comments. Analysis included descriptive statistics and Pearson correlation coefficients. The authors reviewed 293 individual teacher narratives regarding the performance of 49 students. Attendees explicitly used RIME more frequently than non-attendees (69.8 vs. 40.4 %; p < 0.0001). Grades recommended by attendees correlated more strongly with grades assigned by experts than grades recommended by non-attendees (r = 0.72; 95 % CI (0.65, 0.78) vs. 0.47; 95 % CI (0.26, 0.64); p = 0.005). Grade recommendations from individual attendees and non-attendees each correlated significantly with overall student clerkship clinical performance [r = 0.63; 95 % CI (0.54, 0.71) vs. 0.52 (0.36, 0.66), respectively], although the difference between the groups was not statistically significant (p = 0.21). 
On an ambulatory clerkship, teachers who attended evaluation sessions used RIME terminology more frequently and provided more accurate grade recommendations than teachers who did not attend. Formal evaluation sessions may provide frame-of-reference training for the RIME framework, a method that improves the validity and reliability of workplace assessment.
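The comparison above of two correlation coefficients (r = 0.72 vs. 0.47, p = 0.005) is commonly done via Fisher's z transformation. A minimal sketch, which assumes the two correlations come from independent samples; since the study's two correlations share the expert grades, this independence is only an approximation here:

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Approximate two-sided p-value for H0: rho1 == rho2 between two
    independent Pearson correlations, via Fisher's z transformation."""
    z1, z2 = math.atanh(r1), math.atanh(r2)       # variance-stabilizing
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # SE of z1 - z2
    z = (z1 - z2) / se
    return math.erfc(abs(z) / math.sqrt(2))       # 2 * (1 - Phi(|z|))
```

The transformation atanh(r) makes the sampling distribution of each correlation approximately normal with variance 1/(n - 3), which is what licenses the simple z test on the difference.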

  17. Sensorimotor abilities predict on-field performance in professional baseball.

    PubMed

    Burris, Kyle; Vittetoe, Kelly; Ramger, Benjamin; Suresh, Sunith; Tokdar, Surya T; Reiter, Jerome P; Appelbaum, L Gregory

    2018-01-08

    Baseball players must be able to see and react in an instant, yet it is hotly debated whether superior performance is associated with superior sensorimotor abilities. In this study, we compare sensorimotor abilities, measured through 8 psychomotor tasks comprising the Nike Sensory Station assessment battery, and game statistics in a sample of 252 professional baseball players to evaluate the links between sensorimotor skills and on-field performance. For this purpose, we develop a series of Bayesian hierarchical latent variable models enabling us to compare statistics across professional baseball leagues. Within this framework, we find that sensorimotor abilities are significant predictors of on-base percentage, walk rate and strikeout rate, accounting for age, position, and league. We find no such relationship for either slugging percentage or fielder-independent pitching. The pattern of results suggests performance contributions from both visual-sensory and visual-motor abilities and indicates that sensorimotor screenings may be useful for player scouting.

  18. Digitized radiographs in skeletal trauma: a performance comparison between a digital workstation and the original film images.

    PubMed

    Wilson, A J; Hodge, J C

    1995-08-01

    To evaluate the diagnostic performance of a teleradiology system in skeletal trauma. Radiographs from 180 skeletal trauma patients were digitized (matrix, 2,000 x 2,500) and transmitted to a remote digital viewing console (1,200-line monitor). Four radiologists interpreted both the original film images and digital images. Each reader was asked to identify, locate, and characterize fractures and dislocations. Receiver operating characteristic curves were generated, and the results of the original and digitized film readings were compared. All readers performed better with the original film when interpreting fractures. Although the patterns varied between readers, all had statistically significant differences (P < .01) for the two image types. There was no statistically significant difference in performance with the two images when dislocations were diagnosed. The system tested is not a satisfactory alternative to the original radiograph for routine reading of fracture films.
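The area under the ROC curves generated above has a useful rank interpretation: it equals the probability that a randomly chosen positive case (fracture present) receives a higher reader score than a randomly chosen negative case. A minimal illustration of that equivalence (an O(n·m) sketch, fine for small reader studies):

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via its rank (Mann-Whitney) form:
    the fraction of positive/negative pairs where the positive case
    scores higher, counting ties as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 0.5 corresponds to chance-level reading and 1.0 to perfect separation; comparing original film against digitized images for one reader amounts to comparing two such areas on the same cases.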

  19. Preliminary criteria for the definition of allergic rhinitis: a systematic evaluation of clinical parameters in a disease cohort (I).

    PubMed

    Ng, M L; Warlow, R S; Chrishanthan, N; Ellis, C; Walls, R

    2000-09-01

    The aim of this study is to formulate criteria for the definition of allergic rhinitis. Other studies have sought to develop scoring systems to categorize the severity of allergic rhinitis symptoms, but these were never used for the formulation of diagnostic criteria. These other scoring systems were arbitrarily chosen and were not derived by any statistical analysis. To date, a study of this kind has not been performed. The hypothesis of this study is that it is possible to formulate criteria for the definition of allergic rhinitis. This is the first study to systematically examine and evaluate the relative importance of symptoms, signs and investigative tests in allergic rhinitis. We sought to statistically rank, from the most to the least important, the multiplicity of symptoms, signs and test results. Forty-seven allergic rhinitis and 23 normal subjects were evaluated with a detailed questionnaire and history, physical examination, serum total immunoglobulin E, skin prick tests and serum enzyme allergosorbent tests (EAST). Statistical ranking of variables indicated rhinitis symptoms (nasal, ocular and oronasal) were the most commonly occurring, followed by a history of allergen provocation, then serum total IgE, positive skin prick tests and positive EASTs to house dust mite, perennial rye and bermuda/couch grass. Throat symptoms ranked even lower, whilst EASTs to cat epithelia, plantain and cockroach were the least important. Not all symptoms, signs and tests evaluated proved to be statistically significant when compared to a control group; this included symptoms and signs which had been considered historically to be traditionally associated with allergic rhinitis, e.g. sore throat and bleeding nose. In performing statistical analyses, we were able to rank, from most to least important, the multiplicity of symptoms, signs and test results. 
The most important symptoms and signs were identified for the first time, even though some of these were not included in our original selection criteria for defining the disease cohort i.e. sniffing, postnasal drip, oedematous nasal mucosa, impaired sense of smell, mouth breathing, itchy nose and many of the specific provocation factors.

  20. The Opinion of Students and Faculty Members about the Effect of the Faculty Performance Evaluation

    PubMed Central

    Ghahrani, Nassim; Siamian, Hasan; Balaghafari, Azita; Aligolbandi, Kobra; Vahedi, Mohammad

    2015-01-01

    Background: In most countries, including Iran, one of the most common ways of determining the status of teacher training is evaluation by students. The most common method of evaluation is a survey questionnaire provided to the study subjects, composed of questions about educational activities. The researchers evaluated the opinions of students and faculty members about the effect of faculty performance evaluation at Mazandaran University of Medical Sciences in 2014-15. Methods: In this descriptive cross-sectional survey, the attitudes of students and faculty members toward the impact of evaluation on their academic performance were studied. The population comprised 3904 students and 149 basic-science faculty members of Mazandaran University of Medical Sciences. A sample of 350 students and 107 faculty members was drawn using the Cochran formula through proportional stratified random sampling. Data were collected with a 28-question Likert-scale questionnaire. Statistical Analysis: Data were analyzed with descriptive and inferential statistics using the Kruskal-Wallis and Mann-Whitney U tests. Results: Of the 350 sampled students, 309 participated, and of the 107 faculty members, 76 basic-science faculty participated in this study. Most of the students, 80 (25.9%), were from the Faculty of Allied Medical Sciences, and most of the faculty members, 33 (43.4%), were from the faculty of medicine. Comparing student views across the evaluation domains using the binomial test, no significant differences were found between students' views in the regulatory, scientific, educational, and communications domains. The education domain had the most supporters, 193 (62%), and the examination domain the most critics, 147 (48%). 
Regarding the viewpoints of the faculty members towards the evaluation domains, the binomial test indicated that only the regulatory domain showed a significant difference (significance level 0.000): 53 faculty members (70%) regarded the effect of evaluation on the regulatory status of teachers as positive. The Mann-Whitney U test was used to compare student and faculty evaluations; a statistically significant difference between the views of teachers and students was observed only in the regulatory domain (significance level 0.01). Conclusion: Most students believed that evaluation had its greatest impact on improving educational performance: the faculty member's sense of responsibility for education, interest and enthusiasm in presenting lessons, use of audio-visual teaching aids, preparation of lesson plans, participation in seminars, and creating interest among students in participating in class discussions. Faculty members, by contrast, indicated that evaluation had its greatest impact in the regulatory domain, on regular attendance, discipline, and the timely and orderly conduct of their activities. PMID:26543421

  1. The Opinion of Students and Faculty Members about the Effect of the Faculty Performance Evaluation.

    PubMed

    Ghahrani, Nassim; Siamian, Hasan; Balaghafari, Azita; Aligolbandi, Kobra; Vahedi, Mohammad

    2015-08-01

    Evaluation by students is one of the most common ways, in Iran as in most other countries, of determining the status of teacher training. The most common evaluation method is a survey questionnaire, distributed to the study subjects, comprising questions about educational activities. This study evaluated the opinions of students and faculty members about the effect of faculty performance evaluation at Mazandaran University of Medical Sciences in 2014-15. In this descriptive cross-sectional survey, the attitudes of students and professors toward the impact of evaluation on their academic performance were studied. The population comprised 3904 students and 149 basic sciences faculty members of Mazandaran University of Medical Sciences. Using the Cochran formula, a sample of 350 students and 107 faculty members was selected through proportional stratified random sampling. Data were collected with a 28-question Likert-scale questionnaire and analyzed with descriptive and inferential statistics, using the Kruskal-Wallis and Mann-Whitney U tests. Of the 350 sampled students, 309 participated in the study, as did 76 of the 107 sampled basic sciences faculty members. The largest group of students, 80 (25.9%), came from the Faculty of Allied Medical Sciences, and the largest group of basic sciences faculty, 33 (43.4%), from the faculty of medicine. Comparing students' views across the evaluation domains with the binomial test, no significant differences were found among the regulatory, scientific, educational, and communication domains. Education had the greatest support, 193 students (62%), and exams the most opposition, 147 (48%).
Regarding the viewpoints of the faculty members towards the evaluation domains, the binomial test showed a significant difference only in the regulatory domain (significance level 0.000): 53 faculty members (70%) considered the effect of evaluation on improving the regulatory status of teachers to be positive. The Mann-Whitney U test was used to compare student and faculty evaluations; only in the regulatory domain was a statistically significant difference between teachers and students observed (significance level 0.01). Considering these viewpoints, most students believed that evaluation had its greatest impact on improving educational performance: faculty members' responsibility for education, interest and enthusiasm in presenting lessons, use of audio-visual teaching aids, having lesson plans, participation in seminars, creating interest among students in class discussions, and conveying the importance of the lessons from the teachers' perspective. Faculty members, in contrast, saw the greatest impact of evaluation in the regulatory domain: regular attendance, discipline, and the timely and orderly performance of their activities. PMID:26543421
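
The Mann-Whitney U comparison used in this record can be sketched in a few lines. This is a generic illustration with toy data (not the study's), assuming midrank handling of ties:

```python
def midranks(values):
    """Assign ranks 1..n, averaging ranks across tied values (midranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over a run of tied values
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U statistic: min(U1, U2)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    r1 = sum(ranks[:n1])  # rank sum of the first sample
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    return min(u1, n1 * n2 - u1)
```

With fully separated samples the statistic is 0 (e.g. `mann_whitney_u([1, 2, 3], [4, 5, 6])`); in practice a p-value would then come from the exact distribution or a normal approximation.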

  2. METHOD FOR EVALUATING MOLD GROWTH ON CEILING TILE

    EPA Science Inventory

    A method to extract mold spores from porous ceiling tiles was developed using a masticator blender. Ceiling tiles were inoculated and analyzed using four species of mold. Statistical analysis comparing results obtained by masticator extraction and the swab method was performed. T...

  3. Neural-genetic synthesis for state-space controllers based on linear quadratic regulator design for eigenstructure assignment.

    PubMed

    da Fonseca Neto, João Viana; Abreu, Ivanildo Silva; da Silva, Fábio Nogueira

    2010-04-01

    Toward the synthesis of state-space controllers, a neural-genetic model based on the linear quadratic regulator design for the eigenstructure assignment of multivariable dynamic systems is presented. The neural-genetic model represents a fusion of a genetic algorithm and a recurrent neural network (RNN) to perform the selection of the weighting matrices and the algebraic Riccati equation solution, respectively. A fourth-order electric circuit model is used to evaluate the convergence of the computational intelligence paradigms and the control design method performance. The genetic search convergence evaluation is performed in terms of the fitness function statistics and the RNN convergence, which is evaluated by landscapes of the energy and norm, as a function of the parameter deviations. The control problem solution is evaluated in the time and frequency domains by the impulse response, singular values, and modal analysis.
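
The LQR design in this record pivots on solving the algebraic Riccati equation for a given choice of weighting matrices. A minimal pure-Python sketch for the classic double-integrator example (illustrative only: the paper uses an RNN for this step, which is here replaced by plain forward integration of the Riccati ODE to its fixed point):

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))]
            for i in range(len(Ms[0]))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def scale(A, s):
    return [[s * v for v in row] for row in A]

def lqr_gain(A, B, Q, r, dt=0.005, iters=100000):
    """Steady-state LQR gain for scalar control cost r, found by integrating
    dP/dt = A'P + PA - P B (1/r) B' P + Q until it converges."""
    n = len(A)
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # start from I
    At = transpose(A)
    for _ in range(iters):
        PB = mat_mul(P, B)
        quad = scale(mat_mul(PB, transpose(PB)), 1.0 / r)  # P B r^-1 B' P
        dP = mat_add(mat_mul(At, P), mat_mul(P, A), scale(quad, -1.0), Q)
        if max(abs(v) for row in dP for v in row) < 1e-10:
            break
        P = mat_add(P, scale(dP, dt))
    K = scale(transpose(mat_mul(P, B)), 1.0 / r)  # K = r^-1 B' P
    return K, P
```

For the double integrator (A = [[0,1],[0,0]], B = [[0],[1]], Q = I, r = 1) the known optimal gain is K = [1, sqrt(3)], which the iteration recovers.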

  4. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators have observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of current methods are mainly due to their inability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic based on smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic can capture position-level variant information and account for gametic phase disequilibrium. Through intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with rare variants, common variants, or both has the correct type 1 error rates. We also evaluate the power of the SFPCA-based statistic and of 22 existing statistics, and find that the SFPCA-based statistic has much higher power than the others in all scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for identifying pathway associations than the other existing methods.

  5. Evaluation of satellite rainfall estimates for drought and flood monitoring in Mozambique

    USGS Publications Warehouse

    Tote, Carolien; Patricio, Domingos; Boogaard, Hendrik; van der Wijngaart, Raymond; Tarnavsky, Elena; Funk, Christopher C.

    2015-01-01

    Satellite derived rainfall products are useful for drought and flood early warning and overcome the problem of sparse, unevenly distributed and erratic rain gauge observations, provided their accuracy is well known. Mozambique is highly vulnerable to extreme weather events such as major droughts and floods and thus, an understanding of the strengths and weaknesses of different rainfall products is valuable. Three dekadal (10-day) gridded satellite rainfall products (TAMSAT African Rainfall Climatology And Time-series (TARCAT) v2.0, Famine Early Warning System NETwork (FEWS NET) Rainfall Estimate (RFE) v2.0, and Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS)) are compared to independent gauge data (2001–2012). This is done using pairwise comparison statistics to evaluate the performance in estimating rainfall amounts and categorical statistics to assess rain-detection capabilities. The analysis was performed for different rainfall categories, over the seasonal cycle and for regions dominated by different weather systems. Overall, satellite products overestimate low and underestimate high dekadal rainfall values. The RFE and CHIRPS products perform comparably well, generally outperforming TARCAT on the majority of statistical measures of skill. TARCAT best detects the relative frequency of rainfall events, while RFE underestimates and CHIRPS overestimates that frequency. Differences in product performance disappear with higher rainfall, and all products achieve better results during the wet season. During the cyclone season, CHIRPS shows the best results, while RFE outperforms the other products for lower dekadal rainfall. Products blending thermal infrared and passive microwave imagery perform better than infrared-only products, particularly when meteorological patterns are more complex, such as over the coastal, central and southern regions of Mozambique, where precipitation is influenced by frontal systems.
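
The pairwise comparison statistics (e.g. bias, RMSE) and categorical rain-detection scores (probability of detection, false alarm ratio) used in such validations can be sketched generically. A toy implementation, with the metric names following standard forecast-verification usage and the 1 mm rain/no-rain threshold an arbitrary assumption:

```python
import math

def validation_stats(sat, gauge, rain_thresh=1.0):
    """Pairwise (bias, RMSE) and categorical (POD, FAR) skill scores
    for satellite estimates against gauge observations."""
    n = len(sat)
    bias = sum(s - g for s, g in zip(sat, gauge)) / n
    rmse = math.sqrt(sum((s - g) ** 2 for s, g in zip(sat, gauge)) / n)
    hits = sum(1 for s, g in zip(sat, gauge)
               if s >= rain_thresh and g >= rain_thresh)
    misses = sum(1 for s, g in zip(sat, gauge)
                 if s < rain_thresh and g >= rain_thresh)
    false_alarms = sum(1 for s, g in zip(sat, gauge)
                       if s >= rain_thresh and g < rain_thresh)
    pod = hits / (hits + misses) if hits + misses else float('nan')
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float('nan')
    return bias, rmse, pod, far
```

Each dekadal grid cell/gauge pair contributes one (sat, gauge) element; the categorical scores then summarize rain-detection skill independently of the amounts.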

  6. Adaptive strategies of remote systems operators exposed to perturbed camera-viewing conditions

    NASA Technical Reports Server (NTRS)

    Stuart, Mark A.; Manahan, Meera K.; Bierschwale, John M.; Sampaio, Carlos E.; Legendre, A. J.

    1991-01-01

    This report describes a preliminary investigation of the use of perturbed visual feedback during the performance of simulated space-based remote manipulation tasks. The primary objective of this NASA evaluation was to determine to what extent operators exhibit adaptive strategies which allow them to perform these specific types of remote manipulation tasks more efficiently while exposed to perturbed visual feedback. A secondary objective of this evaluation was to establish a set of preliminary guidelines for enhancing remote manipulation performance and reducing the adverse effects. These objectives were accomplished by studying the remote manipulator performance of test subjects exposed to various perturbed camera-viewing conditions while performing a simulated space-based remote manipulation task. Statistical analysis of performance and subjective data revealed that remote manipulation performance was adversely affected by the use of perturbed visual feedback and performance tended to improve with successive trials in most perturbed viewing conditions.

  7. Vaccine stability study design and analysis to support product licensure.

    PubMed

    Schofield, Timothy L

    2009-11-01

    Stability evaluation supporting vaccine licensure includes studies of bulk intermediates as well as final container product. Long-term and accelerated studies are performed to support shelf life and to determine release limits for the vaccine. Vaccine shelf life is best determined utilizing a formal statistical evaluation outlined in the ICH guidelines, while minimum release is calculated to help assure adequate potency through handling and storage of the vaccine. In addition to supporting release potency determination, accelerated stability studies may be used to support a strategy to recalculate product expiry after an unintended temperature excursion such as a cold storage unit failure or mishandling during transport. Appropriate statistical evaluation of vaccine stability data promotes strategic stability study design, in order to reduce the uncertainty associated with the determination of the degradation rate, and the associated risk to the customer.
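
The formal statistical evaluation of shelf life outlined in the ICH guidelines is, at its core, a regression of potency against storage time. A deliberately simplified sketch with hypothetical data: ICH Q1E bases shelf life on where the 95% one-sided confidence bound crosses the specification, whereas this version uses only the fitted line's point estimate:

```python
def estimate_shelf_life(times, potencies, spec_limit):
    """Least-squares fit of potency vs. time; returns the time at which the
    fitted degradation line crosses the specification limit (point estimate
    only -- no confidence bound, unlike a full ICH Q1E analysis)."""
    n = len(times)
    tbar = sum(times) / n
    pbar = sum(potencies) / n
    slope = (sum((t - tbar) * (p - pbar) for t, p in zip(times, potencies))
             / sum((t - tbar) ** 2 for t in times))
    intercept = pbar - slope * tbar
    return (spec_limit - intercept) / slope
```

For a lot starting at 100% potency and degrading 0.5% per month against a 94% specification, the crossing point is 12 months.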

  8. Evaluation of air quality in a megacity using statistics tools

    NASA Astrophysics Data System (ADS)

    Ventura, Luciana Maria Baptista; de Oliveira Pinto, Fellipe; Soares, Laiza Molezon; Luna, Aderval Severino; Gioda, Adriana

    2018-06-01

    Local physical characteristics (e.g., meteorology and topography), being associated with particle concentrations, are important for evaluating air quality in a region. Meteorology and topography affect air pollutant dispersion. This study used statistical tools (PCA, HCA, the Kruskal-Wallis test, the Mann-Whitney U test, and others) to gain a better understanding of the relationship between fine particulate matter (PM2.5) levels and seasons, meteorological conditions and air basins. To our knowledge, it is one of the few studies performed in Latin America involving all of these parameters together. PM2.5 samples were collected at six sampling sites with different emission sources (industrial, vehicular, soil dust) in Rio de Janeiro, Brazil. The PM2.5 daily concentrations ranged from 1 to 61 µg m-3, with averages higher than the annual limit (15 µg m-3) at some of the sites. The statistical evaluation showed that PM2.5 concentrations were not influenced by seasonality. Furthermore, the previously defined air basins were not confirmed, because some sites presented similar emission sources. Therefore, the air basins need to be redefined, since they are important to air quality management.

  9. Evaluation of air quality in a megacity using statistics tools

    NASA Astrophysics Data System (ADS)

    Ventura, Luciana Maria Baptista; de Oliveira Pinto, Fellipe; Soares, Laiza Molezon; Luna, Aderval Severino; Gioda, Adriana

    2017-03-01

    Local physical characteristics (e.g., meteorology and topography), being associated with particle concentrations, are important for evaluating air quality in a region. Meteorology and topography affect air pollutant dispersion. This study used statistical tools (PCA, HCA, the Kruskal-Wallis test, the Mann-Whitney U test, and others) to gain a better understanding of the relationship between fine particulate matter (PM2.5) levels and seasons, meteorological conditions and air basins. To our knowledge, it is one of the few studies performed in Latin America involving all of these parameters together. PM2.5 samples were collected at six sampling sites with different emission sources (industrial, vehicular, soil dust) in Rio de Janeiro, Brazil. The PM2.5 daily concentrations ranged from 1 to 61 µg m-3, with averages higher than the annual limit (15 µg m-3) at some of the sites. The statistical evaluation showed that PM2.5 concentrations were not influenced by seasonality. Furthermore, the previously defined air basins were not confirmed, because some sites presented similar emission sources. Therefore, the air basins need to be redefined, since they are important to air quality management.

  10. Benefits and Disadvantages of Neoadjuvant Radiochemotherapy (RCT) in the Multimodal Therapy of Squamous Esophageal Cancer (ESC).

    PubMed

    Hanna, Adrian; Birla, Rodica; Iosif, Cristina; Boeriu, Marius; Constantinoiu, Silviu

    2016-01-01

    The purpose of this paper is to present the advantages and disadvantages of neoadjuvant RCT in the multimodal therapy of ESC. Between 1998 and 2014, 221 patients were treated for ESC, 85 of whom received neoadjuvant RCT. For these patients, imaging and pathologic assessment of response was performed using the RECIST and MANDARD criteria, and the data were interpreted statistically in terms of the factors that influence response. Statistical correlations between RCT and resectability, postoperative morbidity, mortality and long-term survival were also evaluated. Forty-five patients were imaging responders, of whom 34 underwent surgery; of the 40 non-responders, 14 underwent surgery. Of the 48 surgical patients with preoperative RCT, histopathological evaluation showed that 32 were pathological responders and 16 non-responders. Statistical analyses of the correlations between RCT and resectability, stage, location of the ESC, morbidity, mortality and survival were performed. RCT increases resectability and improves survival and maximum duration of survival, more in responders than in non-responders, and affects neither postoperative complications nor postoperative mortality, among either responders or non-responders. Imaging evaluation of the response to RCT overestimates responders.

  11. Evaluation of the learning curve of non-penetrating glaucoma surgery.

    PubMed

    Aslan, Fatih; Yuce, Berna; Oztas, Zafer; Ates, Halil

    2017-08-11

    To evaluate the learning curve of non-penetrating glaucoma surgery (NPGS). The study included 32 eyes of 27 patients (20 male, 7 female) with medically uncontrolled glaucoma. Non-penetrating glaucoma surgeries performed by trainees under the supervision of an experienced surgeon between 2005 and 2007 at our tertiary referral hospital were evaluated. Residents were separated into two groups. The resident in the first group followed a humanistic training model, practicing on experimental models before performing NPGS. The two residents in the second group performed NPGS after a conventional training model. The residents' surgeries were recorded on video, and intraoperative parameters were scored by the experienced surgeon at the end of the study. Postoperative intraocular pressure and absolute and total success rates were analyzed. In the first group, 19 eyes of 16 patients and, in the second group, 13 eyes of 11 patients were operated on by residents. Intraoperative parameters and complication rates did not differ significantly between groups (p > 0.05, Chi-square). The duration of surgery was 32.7 ± 5.6 min in the first group and 45 ± 3.8 min in the second group; the difference was statistically significant (p < 0.001, Student's t test). Absolute and total success rates were 68.8 and 93.8% in the first group and 62.5 and 87.5% in the second group, respectively; the difference was not statistically significant. Humanistic and conventional training models under the supervision of an experienced surgeon are safe and effective for senior residents who manage phacoemulsification surgery in routine cataract cases. Senior residents can practice these surgical techniques with reasonable complication rates.

  12. An analysis of student performance benchmarks in dental hygiene via distance education.

    PubMed

    Olmsted, Jodi L

    2010-01-01

    Three graduate programs, 35 undergraduate programs and 12 dental hygiene degree completion programs in the United States use varying forms of Distance Learning (DL). Relying heavily on DL leaves an unanswered question: Is learner performance on standard benchmark assessments impacted when using technology as a delivery system? A 10 year, longitudinal examination looked for student performance differences in a Distance Education (DE) dental hygiene program. The purpose of this research was to determine if there was a difference in performance between learners taught in a traditional classroom as compared to their counterparts taking classes through an alternative delivery system. A longitudinal, ex post facto design was used. Two hundred and sixty-six subject records were examined. Seventy-seven individuals (29%) were lost through attrition over 10 years. One hundred and eighty-nine records were used as the study sample, 117 individuals were located face-to-face and 72 were at a distance. Independent variables included time and location, while the dependent variables included course grades, grade point average (GPA) and the National Board of Dental Hygiene Examination (NBDHE). Three research questions were asked: Were there statistically significant differences in learner performance on the National Board of Dental Hygiene Examination (NBDHE)? Were there statistically significant differences in learner performance when considering GPAs? Did statistically significant differences in performance exist relating to individual course grades? T-tests were used for data analysis in answering the research questions. From a cumulative perspective, no statistically significant differences were apparent for the NBDHE and GPAs or for individual courses. Interactive Television (ITV), the synchronous DL system examined, was considered effective for delivering education to learners if similar performance outcomes were the evaluation criteria.
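
The t-tests used here to compare distance and face-to-face learners have the standard pooled two-sample form. A minimal sketch with toy data (not the study's records):

```python
import math

def students_t(x, y):
    """Pooled two-sample t statistic and degrees of freedom
    (equal-variance form of Student's t-test)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)  # sum of squared deviations, group x
    ssy = sum((v - my) ** 2 for v in y)
    sp2 = (ssx + ssy) / (nx + ny - 2)    # pooled variance
    t = (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2
```

The statistic would then be referred to a t distribution with the returned degrees of freedom to obtain a p-value.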

  13. Physique and Performance of Young Wheelchair Basketball Players in Relation with Classification

    PubMed Central

    Zancanaro, Carlo

    2015-01-01

    The relationships among physical characteristics, performance, and functional ability classification of younger wheelchair basketball players have barely been investigated to date. The purpose of this work was to assess anthropometry, body composition, and performance in sport-specific field tests in a national sample of Italian younger wheelchair basketball players, as well as to evaluate the association of these variables with the players' functional ability classification and game-related statistics. Several anthropometric measurements were obtained for 52 out of 91 eligible players nationwide. Performance was assessed in seven sport-specific field tests (5m sprint, 20m sprint with ball, suicide, maximal pass, pass for accuracy, spot shot and lay-ups) and game-related statistics (free-throw points scored per match, two- and three-point field-goals scored per match, and their sum). Associations between variables and predictivity were assessed by correlation and regression analysis, respectively. Players were grouped into four Classes of increasing functional ability (A-D). One-way ANOVA with Bonferroni's correction for multiple comparisons was used to assess differences between Classes. Sitting height and functional ability Class especially correlated with performance outcomes, but wheelchair basketball experience and skinfolds did not. Game-related statistics and sport-specific field-test scores all correlated significantly with each other. Upper arm circumference and/or maximal pass and lay-ups test scores were able to explain 42 to 59% of the variance in game-related statistics (P<0.001). A clear difference in performance was found only between functional ability Classes A and D. In conclusion, in younger wheelchair basketball players, sitting height contributes positively to performance. The maximal pass and lay-ups tests should be carefully considered in younger wheelchair basketball training plans.
Functional ability Class reflects to a limited extent the actual differences in performance. PMID:26606681
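
The correlation analyses reported in this record reduce to Pearson's r between anthropometric measures, field-test scores, and game statistics. A generic sketch with toy data:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```

Values near +1/-1 indicate a strong linear relationship; the variance "explained" by a simple regression on one predictor is r squared.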

  14. Familiar units prevail over statistical cues in word segmentation.

    PubMed

    Poulin-Charronnat, Bénédicte; Perruchet, Pierre; Tillmann, Barbara; Peereman, Ronald

    2017-09-01

    In language acquisition research, the prevailing position is that listeners exploit statistical cues, in particular transitional probabilities between syllables, to discover words of a language. However, other cues are also involved in word discovery. Assessing the weight learners give to these different cues leads to a better understanding of the processes underlying speech segmentation. The present study evaluated whether adult learners preferentially used known units or statistical cues for segmenting continuous speech. Before the exposure phase, participants were familiarized with part-words of a three-word artificial language. This design allowed the dissociation of the influence of statistical cues and familiar units, with statistical cues favoring word segmentation and familiar units favoring (nonoptimal) part-word segmentation. In Experiment 1, performance in a two-alternative forced choice (2AFC) task between words and part-words revealed part-word segmentation (even though part-words were less cohesive in terms of transitional probabilities and less frequent than words). By contrast, an unfamiliarized group exhibited word segmentation, as usually observed in standard conditions. Experiment 2 used a syllable-detection task to remove the likely contamination of performance by memory and strategy effects in the 2AFC task. Overall, the results suggest that familiar units overrode statistical cues, ultimately questioning the need for computation mechanisms of transitional probabilities (TPs) in natural language speech segmentation.
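
The statistical cue at issue here is the forward transitional probability between adjacent syllables, high within words and low across word boundaries. A minimal sketch over a hypothetical syllable stream (the syllables are invented, not from the study's artificial language):

```python
from collections import Counter

def transitional_probs(syllables):
    """Forward transitional probability P(next | current) for each adjacent
    syllable pair in a continuous stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: count / first_counts[pair[0]]
            for pair, count in pair_counts.items()}
```

A learner relying on this cue would posit word boundaries at the dips in transitional probability along the stream.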

  15. Statistical performance and information content of time lag analysis and redundancy analysis in time series modeling.

    PubMed

    Angeler, David G; Viedma, Olga; Moreno, José M

    2009-11-01

    Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to result in decreased statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of and information content provided by RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.

  16. Comparative evaluation of statistical and mechanistic models of Escherichia coli at beaches in southern Lake Michigan

    USGS Publications Warehouse

    Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.

    2016-01-01

    Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.

  17. Statistical Model of Dynamic Markers of the Alzheimer's Pathological Cascade.

    PubMed

    Balsis, Steve; Geraci, Lisa; Benge, Jared; Lowe, Deborah A; Choudhury, Tabina K; Tirso, Robert; Doody, Rachelle S

    2018-05-05

    Alzheimer's disease (AD) is a progressive disease reflected in markers across assessment modalities, including neuroimaging, cognitive testing, and evaluation of adaptive function. Identifying a single continuum of decline across assessment modalities in a single sample is statistically challenging because of the multivariate nature of the data. To address this challenge, we implemented advanced statistical analyses designed specifically to model complex data across a single continuum. We analyzed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 1,056), focusing on indicators from the assessments of magnetic resonance imaging (MRI) volume, fluorodeoxyglucose positron emission tomography (FDG-PET) metabolic activity, cognitive performance, and adaptive function. Item response theory was used to identify the continuum of decline. Then, through a process of statistical scaling, indicators across all modalities were linked to that continuum and analyzed. Findings revealed that measures of MRI volume, FDG-PET metabolic activity, and adaptive function added measurement precision beyond that provided by cognitive measures, particularly in the relatively mild range of disease severity. More specifically, MRI volume, and FDG-PET metabolic activity become compromised in the very mild range of severity, followed by cognitive performance and finally adaptive function. Our statistically derived models of the AD pathological cascade are consistent with existing theoretical models.
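
Item response theory, used in this record to place indicators from different modalities on one continuum, models each indicator with an item response function. A minimal two-parameter logistic sketch (the parameter values are illustrative, not estimates from the ADNI analysis):

```python
import math

def irf_2pl(theta, a, b):
    """Two-parameter logistic item response function: probability of a
    'positive' item response given latent severity theta, with
    discrimination a and difficulty (location) b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

At theta equal to the item's location the probability is exactly 0.5; linking items across modalities amounts to estimating each item's (a, b) on the shared theta scale.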

  18. Effect of Internet-Based Cognitive Apprenticeship Model (i-CAM) on Statistics Learning among Postgraduate Students

    PubMed Central

    Saadati, Farzaneh; Ahmad Tarmizi, Rohani

    2015-01-01

    Because students’ ability to use statistics, which is mathematical in nature, is one of the concerns of educators, embedding within an e-learning system the pedagogical characteristics of learning is ‘value added’ because it facilitates the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, the i-CAM could significantly promote students’ problem-solving performance at the end of each phase. In addition, the combination of the differences in students' test scores were considered to be statistically significant after controlling for the pre-test scores. The findings conveyed in this paper confirmed the considerable value of i-CAM in the improvement of statistics learning for non-specialized postgraduate students. PMID:26132553

  19. Local multiplicity adjustment for the spatial scan statistic using the Gumbel distribution.

    PubMed

    Gangnon, Ronald E

    2012-03-01

    The spatial scan statistic is an important and widely used tool for cluster detection. It is based on the simultaneous evaluation of the statistical significance of the maximum likelihood ratio test statistic over a large collection of potential clusters. In most cluster detection problems, there is variation in the extent of local multiplicity across the study region. For example, using a fixed maximum geographic radius for clusters, urban areas typically have many overlapping potential clusters, whereas rural areas have relatively few. The spatial scan statistic does not account for local multiplicity variation. We describe a previously proposed local multiplicity adjustment based on a nested Bonferroni correction and propose a novel adjustment based on a Gumbel distribution approximation to the distribution of a local scan statistic. We compare the performance of all three statistics in terms of power and a novel unbiased cluster detection criterion. These methods are then applied to the well-known New York leukemia dataset and a Wisconsin breast cancer incidence dataset. © 2011, The International Biometric Society.
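
The Gumbel device in this record approximates the null distribution of a maximum-based scan statistic with a fitted extreme-value distribution. A sketch of the generic recipe under Monte Carlo simulation, simplified from the authors' procedure: fit the Gumbel by the method of moments to simulated maxima, then read off an upper-tail p-value for the observed statistic:

```python
import math

def gumbel_fit(maxima):
    """Method-of-moments Gumbel fit to a sample of simulated maxima:
    scale = sd * sqrt(6) / pi, location = mean - gamma * scale."""
    n = len(maxima)
    mean = sum(maxima) / n
    sd = math.sqrt(sum((m - mean) ** 2 for m in maxima) / (n - 1))
    beta = sd * math.sqrt(6) / math.pi
    mu = mean - 0.5772156649 * beta  # Euler-Mascheroni constant
    return mu, beta

def gumbel_pvalue(observed, mu, beta):
    """Upper-tail probability of the observed scan statistic under the fit."""
    return 1.0 - math.exp(-math.exp(-(observed - mu) / beta))
```

The payoff is resolution: with, say, 999 Monte Carlo replicates, rank-based p-values bottom out at 0.001, while the fitted Gumbel tail extrapolates smoothly beyond the simulated maxima.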

  20. Local multiplicity adjustment for the spatial scan statistic using the Gumbel distribution

    PubMed Central

    Gangnon, Ronald E.

    2011-01-01

    The spatial scan statistic is an important and widely used tool for cluster detection. It is based on the simultaneous evaluation of the statistical significance of the maximum likelihood ratio test statistic over a large collection of potential clusters. In most cluster detection problems, there is variation in the extent of local multiplicity across the study region. For example, using a fixed maximum geographic radius for clusters, urban areas typically have many overlapping potential clusters, while rural areas have relatively few. The spatial scan statistic does not account for local multiplicity variation. We describe a previously proposed local multiplicity adjustment based on a nested Bonferroni correction and propose a novel adjustment based on a Gumbel distribution approximation to the distribution of a local scan statistic. We compare the performance of all three statistics in terms of power and a novel unbiased cluster detection criterion. These methods are then applied to the well-known New York leukemia dataset and a Wisconsin breast cancer incidence dataset. PMID:21762118
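The Gumbel approximation at the heart of this record can be illustrated with a small moment-matching fit: given simulated null maxima of a local scan statistic (the simulation itself is assumed to happen elsewhere), fit a Gumbel by matching mean and variance and read adjusted p-values off its upper tail. A generic sketch, not the authors' implementation:

```python
import math
import random

def fit_gumbel(samples):
    """Moment-match a Gumbel(mu, beta) to a sample of maxima.

    For a Gumbel distribution: mean = mu + gamma*beta and
    var = (pi^2 / 6) * beta^2, where gamma is the Euler-Mascheroni constant.
    """
    n = len(samples)
    m = sum(samples) / n
    var = sum((x - m) ** 2 for x in samples) / (n - 1)
    beta = math.sqrt(6.0 * var) / math.pi
    mu = m - 0.5772156649015329 * beta
    return mu, beta

def gumbel_sf(x, mu, beta):
    """Upper-tail probability P(X > x), i.e., the locally adjusted p-value
    for an observed local scan statistic x."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / beta))
```

In practice the `samples` would be maxima of the local likelihood ratio statistic computed over Monte Carlo replicates under the null.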

  1. Computed Tomography Image Quality Evaluation of a New Iterative Reconstruction Algorithm in the Abdomen (Adaptive Statistical Iterative Reconstruction-V): A Comparison With Model-Based Iterative Reconstruction, Adaptive Statistical Iterative Reconstruction, and Filtered Back Projection Reconstructions.

    PubMed

    Goodenberger, Martin H; Wagner-Bartak, Nicolaus A; Gupta, Shiva; Liu, Xinming; Yap, Ramon Q; Sun, Jia; Tamm, Eric P; Jensen, Corey T

    The purpose of this study was to compare abdominopelvic computed tomography images reconstructed with adaptive statistical iterative reconstruction-V (ASIR-V) with model-based iterative reconstruction (Veo 3.0), ASIR, and filtered back projection (FBP). Abdominopelvic computed tomography scans for 36 patients (26 males and 10 females) were reconstructed using FBP, ASIR (80%), Veo 3.0, and ASIR-V (30%, 60%, 90%). Mean ± SD patient age was 32 ± 10 years with mean ± SD body mass index of 26.9 ± 4.4 kg/m². Images were reviewed by 2 independent readers in a blinded, randomized fashion. Hounsfield unit, noise, and contrast-to-noise ratio (CNR) values were calculated for each reconstruction algorithm for further comparison. Phantom evaluation of low-contrast detectability (LCD) and high-contrast resolution was performed. Adaptive statistical iterative reconstruction-V 30%, ASIR-V 60%, and ASIR 80% were generally superior qualitatively compared with ASIR-V 90%, Veo 3.0, and FBP (P < 0.05). Adaptive statistical iterative reconstruction-V 90% showed superior LCD and had the highest CNR in the liver, aorta, and pancreas, measuring 7.32 ± 3.22, 11.60 ± 4.25, and 4.60 ± 2.31, respectively, compared with the next best series of ASIR-V 60% with respective CNR values of 5.54 ± 2.39, 8.78 ± 3.15, and 3.49 ± 1.77 (P < 0.0001). Veo 3.0 and ASIR 80% had the best and worst spatial resolution, respectively. Adaptive statistical iterative reconstruction-V 30% and ASIR-V 60% provided the best combination of qualitative and quantitative performance. Adaptive statistical iterative reconstruction 80% was equivalent qualitatively, but demonstrated inferior spatial resolution and LCD.
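For reference, contrast-to-noise ratios like those reported in this record are computed from region-of-interest statistics. A minimal sketch using one common definition (the study may have used a variant):

```python
from statistics import mean, stdev

def cnr(roi_hu, background_hu):
    """Contrast-to-noise ratio: absolute difference in mean attenuation
    (Hounsfield units) between a tissue ROI and a reference background,
    divided by the background standard deviation (image noise)."""
    return abs(mean(roi_hu) - mean(background_hu)) / stdev(background_hu)
```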

  2. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.

  3. Correcting for Optimistic Prediction in Small Data Sets

    PubMed Central

    Smith, Gordon C. S.; Seaman, Shaun R.; Wood, Angela M.; Royston, Patrick; White, Ian R.

    2014-01-01

    The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. We used clinical data sets (United Kingdom Down syndrome screening data from Glasgow (1991–2003), Edinburgh (1999–2003), and Cambridge (1990–2006), as well as Scottish national pregnancy discharge data (2004–2007)) to evaluate different approaches to adjustment for optimism. We found that sample splitting, cross-validation without replication, and leave-1-out cross-validation produced optimism-adjusted estimates of the C statistic that were biased and/or associated with greater absolute error than other available methods. Cross-validation with replication, bootstrapping, and a new method (leave-pair-out cross-validation) all generated unbiased optimism-adjusted estimates of the C statistic and had similar absolute errors in the clinical data set. Larger simulation studies confirmed that all 3 methods performed similarly with 10 or more events per variable, or when the C statistic was 0.9 or greater. However, with lower events per variable or lower C statistics, bootstrapping tended to be optimistic but with lower absolute and mean squared errors than both methods of cross-validation. PMID:24966219
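Harrell-style bootstrap optimism correction of the C statistic, one of the unbiased approaches evaluated in this record, can be sketched as follows. The linear scoring "model" here is a deliberately simple stand-in for whatever prediction model is being validated:

```python
import random
from statistics import mean

def c_statistic(scores, labels):
    """Proportion of event/non-event pairs ranked concordantly (ties 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in pos:
        for q in neg:
            total += 1.0 if p > q else (0.5 if p == q else 0.0)
    return total / (len(pos) * len(neg))

def fit_linear_score(rows, labels):
    """Toy model fit: weight each predictor by its mean difference
    between events and non-events."""
    p = len(rows[0])
    mu1 = [mean(r[j] for r, y in zip(rows, labels) if y == 1) for j in range(p)]
    mu0 = [mean(r[j] for r, y in zip(rows, labels) if y == 0) for j in range(p)]
    w = [a - b for a, b in zip(mu1, mu0)]
    return lambda row: sum(wj * x for wj, x in zip(w, row))

def bootstrap_optimism(rows, labels, n_boot=100, seed=1):
    """Apparent C, average optimism, and optimism-corrected C."""
    rng = random.Random(seed)
    model = fit_linear_score(rows, labels)
    apparent = c_statistic([model(r) for r in rows], labels)
    n = len(rows)
    optimisms = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        brows = [rows[i] for i in idx]
        blabels = [labels[i] for i in idx]
        if len(set(blabels)) < 2:
            continue  # need both classes to fit and score
        bmodel = fit_linear_score(brows, blabels)
        c_boot = c_statistic([bmodel(r) for r in brows], blabels)
        c_test = c_statistic([bmodel(r) for r in rows], labels)
        optimisms.append(c_boot - c_test)
    optimism = mean(optimisms)
    return apparent, optimism, apparent - optimism
```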

  4. Data Analysis and Instrumentation Requirements for Evaluating Rail Joints and Rail Fasteners in Urban Track

    DOT National Transportation Integrated Search

    1975-02-01

    Rail fasteners for concrete ties and direct fixation and bolted rail joints have been identified as key components for improving track performance. However, the lack of statistical load data limits the development of improved design criteria and eval...

  5. A Randomized Comparative Study Evaluating Various Cough Stress Tests and 24-Hour Pad Test with Urodynamics in the Diagnosis of Stress Urinary Incontinence.

    PubMed

    Henderson, Joseph W; Kane, Sarah M; Mangel, Jeffrey M; Kikano, Elias G; Garibay, Jorge A; Pollard, Robert R; Mahajan, Sangeeta T; Debanne, Sara M; Hijaz, Adonis K

    2018-06-01

    The cough stress test is a common and accepted tool to evaluate stress urinary incontinence but there is no agreement on how the test should be performed. We assessed the diagnostic ability of different cough stress tests performed when varying patient position and bladder volume using urodynamic stress urinary incontinence as the gold standard. The 24-hour pad test was also evaluated. We recruited women who presented to specialty outpatient clinics with the complaint of urinary incontinence and who were recommended to undergo urodynamic testing. A total of 140 patients were randomized to 4 cough stress test groups: group 1, a comfortably full bladder; group 2, an empty bladder; group 3, a bladder infused with 200 cc saline; and group 4, a bladder filled to half functional capacity. The sequence of standing and sitting was randomly assigned. The groups were compared by 1-way ANOVA or the generalized Fisher exact test. The κ statistic was used to evaluate agreement between the sitting and standing positions. The 95% CIs of sensitivity and specificity were calculated using the Wilson method. ROC analysis was done to evaluate the performance of the 24-hour pad test. The cough stress test performed with a bladder filled to half functional capacity was the best performing test with 83% sensitivity and 90% specificity. There was no statistically significant evidence that the sensitivity or specificity of 1 cough stress test differed from that of the others. The pad test had no significant predictive ability to diagnose urodynamic stress urinary incontinence (AUC 0.60, p = 0.08). Cough stress tests were accurate to diagnose urodynamic stress urinary incontinence. The 24-hour pad test was not predictive of urodynamic stress urinary incontinence and not helpful when used in conjunction with the cough stress test. Copyright © 2018 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
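The Wilson method used in this record for the sensitivity and specificity CIs is straightforward to implement:

```python
import math
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion, e.g., a test's
    sensitivity computed as true positives over all disease-positive cases."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half
```

Unlike the simple Wald interval, the Wilson interval stays inside [0, 1] and behaves well for proportions near 0 or 1.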

  6. Student performance in a flipped classroom dental anatomy course.

    PubMed

    Chutinan, S; Riedy, C A; Park, S E

    2017-11-09

    The purpose of this study was to assess dental student learning in a dental anatomy module between traditional lecture and flipped classroom cohorts. Two cohorts of predoctoral dental students (N = 70 within each cohort) participated in a dental anatomy module within an Introduction to the Dental Patient (IDP) course ([traditional/lecture cohort: academic year (AY) 2012, 2013] and [flipped classroom cohort: AY 2014, 2015]). For the dental anatomy module, both cohorts were evaluated on pre-clinical tooth waxing exercises immediately after each of five lectures and tooth identification after all lectures were given. Additionally, the cohorts' performance on the overall IDP course examination was compared. The flipped classroom cohort had statistically significant higher waxing scores (dental anatomy module) than students in the traditional classroom. There was no statistically significant difference in tooth identification scores or the overall IDP course examination between the traditional and flipped cohorts, likely because these two assessments, conducted at the end of the course, gave all students enough time to review the lecture content beforehand, resulting in similar scores for both cohorts. The flipped classroom promoted students' individual learning and resulted in improved student performance on immediate evaluation but not on the end-of-course evaluation. Redesign of courses to include a new pedagogical approach should be carefully implemented and evaluated for students' educational success. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Fuzzy-logic based strategy for validation of multiplex methods: example with qualitative GMO assays.

    PubMed

    Bellocchi, Gianni; Bertholet, Vincent; Hamels, Sandrine; Moens, W; Remacle, José; Van den Eede, Guy

    2010-02-01

    This paper illustrates the advantages that a fuzzy-based aggregation method could bring into the validation of a multiplex method for GMO detection (DualChip GMO kit, Eppendorf). Guidelines for validation of chemical, bio-chemical, pharmaceutical and genetic methods have been developed and ad hoc validation statistics are available and routinely used, for in-house and inter-laboratory testing, and decision-making. Fuzzy logic allows summarising the information obtained by independent validation statistics into one synthetic indicator of overall method performance. The microarray technology, introduced for simultaneous identification of multiple GMOs, poses specific validation issues (patterns of performance for a variety of GMOs at different concentrations). A fuzzy-based indicator for overall evaluation is illustrated in this paper, and applied to validation data for different genetically modified elements. Conclusions were drawn from the analytical results. The fuzzy-logic based rules were shown to be applicable to improve interpretation of results and facilitate overall evaluation of the multiplex method.

  8. Impact of the 80-hour workweek on surgical exposure and national in-training examination scores in an orthopedic residency program.

    PubMed

    Froelich, John; Milbrandt, Joseph C; Allan, D Gordon

    2009-01-01

    This study examines the impact of the 80-hour workweek on the number of surgical cases performed by PGY-2 through PGY-5 orthopedic residents. We also evaluated orthopedic in-training examination (OITE) scores during the same time period. Data were collected from the Accreditation Council for Graduate Medical Education (ACGME) national database for 3 academic years before and 5 years after July 1, 2003. CPT surgical procedure codes logged by all residents 3 years before and 5 years after implementation of the 80-hour workweek were compared. The average raw OITE scores for each class obtained during the same time period were also evaluated. Data were reported as the mean +/- standard deviation (SD), and group means were compared using independent t-tests. No statistical difference was noted in the number of surgical procedure codes logged before or after the institution of the 80-hour week during any single year of training. However, an increase in the number of CPT codes logged in the PGY-3 years after 2003 did approach significance (457.7 vs 551.9, p = 0.057). Overall, the average number of cases performed per resident increased each year after implementation of the work-hour restriction (464.4 vs 515.5 cases). No statistically significant difference was noted in the raw OITE scores before or after work-hour restrictions for our residents or nationally. We found no statistical difference for each residency class in the average number of cases performed or OITE scores, although the total number of cases performed has increased after implementation of the work-hour restrictions. We also found no statistical difference in the national OITE scores. Our data suggest that the impact of the 80-hour workweek has not had a detrimental effect on these 2 resident training measurements.

  9. Comparison of performance of various tumour response criteria in assessment of regorafenib activity in advanced gastrointestinal stromal tumours after failure of imatinib and sunitinib.

    PubMed

    Shinagare, Atul B; Jagannathan, Jyothi P; Kurra, Vikram; Urban, Trinity; Manola, Judith; Choy, Edwin; Demetri, George D; George, Suzanne; Ramaiya, Nikhil H

    2014-03-01

    To compare performance of various tumour response criteria (TRCs) in assessment of regorafenib activity in patients with advanced gastrointestinal stromal tumour (GIST) with prior failure of imatinib and sunitinib. Twenty participants in a phase II trial received oral regorafenib (median duration 47 weeks; interquartile range (IQR) 24-88) with computed tomography (CT) imaging at baseline and every two months thereafter. Tumour response was prospectively determined on using Response Evaluation Criteria in Solid Tumours (RECIST) 1.1, and retrospectively reassessed for comparison per RECIST 1.0, World Health Organization (WHO) and Choi criteria, using the same target lesions. Clinical benefit rate [CBR; complete or partial response (CR or PR) or stable disease (SD)≥16 weeks] and progression-free survival (PFS) were compared between various TRCs using kappa statistics. Performance of TRCs in predicting overall survival (OS) was compared by comparing OS in groups with progression-free intervals less than or greater than 20 weeks by each TRC using c-statistics. PR was more frequent by Choi (90%) than RECIST 1.1, RECIST 1.0 and WHO (20% each), however, CBR was similar between various TRCs (overall CBR 85-90%, 95-100% agreement between all TRC pairs). PFS per RECIST 1.0 was similar to RECIST 1.1 (median 44 weeks versus 58 weeks), and shorter for WHO (median 34 weeks) and Choi (median 24 weeks). With RECIST 1.1, RECIST 1.0 and WHO, there was moderate concordance between PFS and OS (c-statistics 0.596-0.679). Choi criteria had less favourable concordance (c-statistic 0.506). RECIST 1.1 and WHO performed somewhat better than Choi criteria as TRC for response evaluation in patients with advanced GIST after prior failure on imatinib and sunitinib. Copyright © 2013 Elsevier Ltd. All rights reserved.

  10. AERMOD performance evaluation for three coal-fired electrical generating units in Southwest Indiana.

    PubMed

    Frost, Kali D

    2014-03-01

    An evaluation of the steady-state dispersion model AERMOD was conducted to determine its accuracy at predicting hourly ground-level concentrations of sulfur dioxide (SO2) by comparing model-predicted concentrations to a full year of monitored SO2 data. The two study sites comprise three coal-fired electrical generating units (EGUs) located in southwest Indiana. The sites are characterized by tall, buoyant stacks, flat terrain, multiple SO2 monitors, and relatively isolated locations. AERMOD v12060 and AERMOD v12345 with BETA options were evaluated at each study site. For the six monitor-receptor pairs evaluated, AERMOD showed generally good agreement with monitor values for the hourly 99th percentile SO2 design value, with design value ratios that ranged from 0.92 to 1.99. AERMOD was within acceptable performance limits for the Robust Highest Concentration (RHC) statistic (RHC ratios ranged from 0.54 to 1.71) at all six monitors. Analysis of the top 5% of hourly concentrations at the six monitor-receptor sites, paired in time and space, indicated poor model performance in the upper concentration range. The amount of hourly model predicted data that was within a factor of 2 of observations at these higher concentrations ranged from 14 to 43% over the six sites. Analysis of subsets of data showed consistent overprediction during low wind speed and unstable meteorological conditions, and underprediction during stable, low wind conditions. Hourly paired comparisons represent a stringent measure of model performance; however, given the potential for application of hourly model predictions to the SO2 NAAQS design value, this may be appropriate. At these two sites, AERMOD v12345 BETA options do not improve model performance. A regulatory evaluation of AERMOD utilizing quantile-quantile (Q-Q) plots, the RHC statistic, and 99th percentile design value concentrations indicates that model performance is acceptable according to widely accepted regulatory performance limits. However, a scientific evaluation examining hourly paired monitor and model values at concentrations of interest indicates overprediction and underprediction bias that is outside of acceptable model performance measures. Overprediction of 1-hr SO2 concentrations by AERMOD presents major ramifications for state and local permitting authorities when establishing emission limits.
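The "within a factor of 2" measure used in this evaluation (often called FAC2 in dispersion-model literature) is simple to compute for paired predictions and observations:

```python
def fac2(predicted, observed):
    """Fraction of paired model predictions within a factor of two of the
    observations (0.5 <= pred/obs <= 2). Pairs with a non-positive
    observation are skipped to avoid division by zero."""
    pairs = [(p, o) for p, o in zip(predicted, observed) if o > 0]
    within = sum(1 for p, o in pairs if 0.5 <= p / o <= 2.0)
    return within / len(pairs)
```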

  11. Evaluation of PCR Systems for Field Screening of Bacillus anthracis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ozanich, Richard M.; Colburn, Heather A.; Victry, Kristin D.

    There is little published data on the performance of hand-portable polymerase chain reaction (PCR) instruments that could be used by first responders to determine if a suspicious powder contains a potential biothreat agent. We evaluated five commercially available hand-portable PCR instruments for detection of Bacillus anthracis (Ba). We designed a cost-effective, statistically-based test plan that allows instruments to be evaluated at performance levels ranging from 0.85-0.95 lower confidence bound (LCB) on the probability of detection (POD) at confidence levels of 80-95%. We assessed specificity using purified genomic DNA from 13 Ba strains and 18 Bacillus near neighbors, interference with 22 common hoax powders encountered in the field, and PCR inhibition when Ba spores were spiked into these powders. Our results indicated that three of the five instruments achieved >0.95 LCB on the POD with 95% confidence at test concentrations of 2,000 genome equivalents/mL (comparable to 2,000 spores/mL), displaying more than sufficient sensitivity for screening suspicious powders. These instruments exhibited no false positive results or PCR inhibition with common hoax powders, and reliably detected Ba spores spiked into common hoax powders, though some issues with instrument controls were observed. Our testing approach enables efficient instrument performance testing to a statistically rigorous and cost-effective test plan to generate performance data that will allow users to make informed decisions regarding the purchase and use of biodetection equipment in the field.
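The lower confidence bound on probability of detection can be computed exactly from the binomial tail (a Clopper-Pearson-style bound); for instance, 59 detections in 59 trials give a 95%-confidence LCB of about 0.95. A sketch, not the authors' test plan:

```python
import math

def binom_tail_ge(n, x, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(x, n + 1))

def pod_lcb(successes, n, confidence=0.95):
    """Exact one-sided lower confidence bound on the probability of
    detection: the p0 at which observing this many or more successes
    has probability 1 - confidence, found by bisection (the tail
    probability is increasing in p)."""
    alpha = 1 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_tail_ge(n, successes, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```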

  12. Preliminary Evaluation of an Aviation Safety Thesaurus' Utility for Enhancing Automated Processing of Incident Reports

    NASA Technical Reports Server (NTRS)

    Barrientos, Francesca; Castle, Joseph; McIntosh, Dawn; Srivastava, Ashok

    2007-01-01

    This document presents a preliminary evaluation of the utility of the FAA Safety Analytics Thesaurus (SAT) in enhancing automated document processing applications under development at NASA Ames Research Center (ARC). Current development efforts at ARC are described, including overviews of the statistical machine learning techniques that have been investigated. An analysis of opportunities for applying thesaurus knowledge to improving algorithm performance is then presented.

  13. Correcting evaluation bias of relational classifiers with network cross validation

    DOE PAGES

    Neville, Jennifer; Gallagher, Brian; Eliassi-Rad, Tina; ...

    2011-01-04

    Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that will result in significantly different levels of performance. We show that the commonly used form of evaluation (paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). Lastly, we propose a method for network cross-validation that, combined with paired t-tests, produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1 - Type II error).

  14. Automated detection of hospital outbreaks: A systematic review of methods.

    PubMed

    Leclère, Brice; Buckeridge, David L; Boëlle, Pierre-Yves; Astagneau, Pascal; Lepelletier, Didier

    2017-01-01

    Several automated algorithms for epidemiological surveillance in hospitals have been proposed. However, the usefulness of these methods to detect nosocomial outbreaks remains unclear. The goal of this review was to describe outbreak detection algorithms that have been tested within hospitals, consider how they were evaluated, and synthesize their results. We developed a search query using keywords associated with hospital outbreak detection and searched the MEDLINE database. To ensure the highest sensitivity, no limitations were initially imposed on publication languages and dates, although we subsequently excluded studies published before 2000. Every study that described a method to detect outbreaks within hospitals was included, without any exclusion based on study design. Additional studies were identified through citations in retrieved studies. Twenty-nine studies were included. The detection algorithms were grouped into 5 categories: simple thresholds (n = 6), statistical process control (n = 12), scan statistics (n = 6), traditional statistical models (n = 6), and data mining methods (n = 4). The evaluation of the algorithms was often solely descriptive (n = 15), but more complex epidemiological criteria were also investigated (n = 10). The performance measures varied widely between studies: e.g., the sensitivity of an algorithm in a real-world setting could vary between 17% and 100%. Even if outbreak detection algorithms are useful complementary tools for traditional surveillance, the heterogeneity in results among published studies does not support quantitative synthesis of their performance. A standardized framework should be followed when evaluating outbreak detection methods to allow comparison of algorithms across studies and synthesis of results.
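As one concrete example of the statistical-process-control category reviewed here, a Shewhart c-chart flags periods whose case counts exceed the baseline mean by k Poisson standard deviations. This is a generic sketch; deployed surveillance systems often use EWMA or CUSUM variants instead:

```python
import math

def c_chart_alarms(baseline_counts, new_counts, k=3.0):
    """Return indices of new_counts exceeding the upper control limit
    mean + k * sqrt(mean), treating counts as approximately Poisson
    (variance equal to the mean)."""
    m = sum(baseline_counts) / len(baseline_counts)
    ucl = m + k * math.sqrt(m)
    return [i for i, c in enumerate(new_counts) if c > ucl]
```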

  15. Frequency of otitis media based on otoendoscopic evaluation in preterm infants.

    PubMed

    Coticchia, James; Shah, Priyanka; Sachdeva, Livjot; Kwong, Kelvin; Cortez, Josef M; Nation, Javan; Rudd, Tracy; Zidan, Marwan; Cepeda, Eugene; Gonik, Bernard

    2014-10-01

    This study was conducted to determine the frequency of otitis media in preterm neonates using otoendoscopy and tympanometry. Prospective study. Wayne State University, Hutzel Women's Hospital Neonatal Intensive Care Unit. Eighty-six preterm infants were included (gestational age <36 weeks). Otoendoscopy and tympanometry were performed to detect the presence of otitis media. Kappa statistic and logistic regression were used for statistical analysis. Otoendoscopy was performed in 85 patients. The frequency of otoendoscopy-diagnosed otitis media was 72.9% (62/85). Tympanometry could be performed on 69.76% of the ears. There was 73.5% agreement between the findings of tympanometry and those of otoendoscopy. The association between the presence of otitis media and gestational age at birth was statistically significant. The lower the gestational age, the higher the frequency of otoendoscopy-diagnosed otitis media (P = .001). Otoendoscopically diagnosed otitis media is frequent in preterm neonates. There was agreement between the results of tympanometry and those of otoendoscopy. The frequency of otitis media increased with lower gestational age. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2014.
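The kappa statistic used here to quantify tympanometry/otoendoscopy agreement beyond chance can be computed directly from paired ratings:

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (observed agreement - chance agreement) /
    (1 - chance agreement) for two raters or tests on the same subjects."""
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    po = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n
    pe = sum((ratings_a.count(c) / n) * (ratings_b.count(c) / n)
             for c in categories)
    return (po - pe) / (1 - pe)
```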

  16. Comparison of optimization strategy and similarity metric in atlas-to-subject registration using statistical deformation model

    NASA Astrophysics Data System (ADS)

    Otake, Y.; Murphy, R. J.; Grupp, R. B.; Sato, Y.; Taylor, R. H.; Armand, M.

    2015-03-01

    A robust atlas-to-subject registration using a statistical deformation model (SDM) is presented. The SDM uses statistics of voxel-wise displacement learned from pre-computed deformation vectors of a training dataset. This allows an atlas instance to be directly translated into an intensity volume and compared with a patient's intensity volume. Rigid and nonrigid transformation parameters were simultaneously optimized via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), with image similarity used as the objective function. The algorithm was tested on CT volumes of the pelvis from 55 female subjects. A performance comparison of the CMA-ES and Nelder-Mead downhill simplex optimization algorithms with the mutual information and normalized cross correlation similarity metrics was conducted. Simulation studies using synthetic subjects were performed, as well as leave-one-out cross validation studies. Both studies suggested that mutual information and CMA-ES achieved the best performance. The leave-one-out test demonstrated 4.13 mm error with respect to the true displacement field, and 26,102 function evaluations in 180 seconds, on average.
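Mutual information, one of the two similarity metrics compared in this record, can be estimated from a joint intensity histogram. A minimal sketch for flattened, equally sized intensity arrays; the bin count and intensity range are illustrative, not the paper's settings:

```python
import math
from collections import Counter

def mutual_information(img_a, img_b, bins=8, lo=0.0, hi=256.0):
    """Estimate mutual information (in nats) between two intensity arrays
    via a joint histogram: sum of p(i,j) * log(p(i,j) / (p(i) p(j)))."""
    def bin_of(v):
        return min(bins - 1, int((v - lo) / (hi - lo) * bins))
    n = len(img_a)
    joint = Counter((bin_of(a), bin_of(b)) for a, b in zip(img_a, img_b))
    pa, pb = Counter(), Counter()
    for (i, j), c in joint.items():
        pa[i] += c
        pb[j] += c
    mi = 0.0
    for (i, j), c in joint.items():
        pij = c / n
        mi += pij * math.log(pij / ((pa[i] / n) * (pb[j] / n)))
    return mi
```

Higher values indicate stronger statistical dependence between the two images, which is why MI serves as a registration objective even across differing intensity characteristics.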

  17. Testing homogeneity of proportion ratios for stratified correlated bilateral data in two-arm randomized clinical trials.

    PubMed

    Pei, Yanbo; Tian, Guo-Liang; Tang, Man-Lai

    2014-11-10

    Stratified data analysis is an important research topic in many biomedical studies and clinical trials. In this article, we develop five test statistics for testing the homogeneity of proportion ratios for stratified correlated bilateral binary data based on an equal correlation model assumption. Bootstrap procedures based on these test statistics are also considered. To evaluate the performance of these statistics and procedures, we conduct Monte Carlo simulations to study their empirical sizes and powers under various scenarios. Our results suggest that the procedure based on score statistic performs well generally and is highly recommended. When the sample size is large, procedures based on the commonly used weighted least square estimate and logarithmic transformation with Mantel-Haenszel estimate are recommended as they do not involve any computation of maximum likelihood estimates requiring iterative algorithms. We also derive approximate sample size formulas based on the recommended test procedures. Finally, we apply the proposed methods to analyze a multi-center randomized clinical trial for scleroderma patients. Copyright © 2014 John Wiley & Sons, Ltd.

  18. Evaluating pictogram prediction in a location-aware augmentative and alternative communication system.

    PubMed

    Garcia, Luís Filipe; de Oliveira, Luís Caldas; de Matos, David Martins

    2016-01-01

    This study compared the performance of two statistical location-aware pictogram prediction mechanisms, with an all-purpose (All) pictogram prediction mechanism, having no location knowledge. The All approach had a unique language model under all locations. One of the location-aware alternatives, the location-specific (Spec) approach, made use of specific language models for pictogram prediction in each location of interest. The other location-aware approach resulted from combining the Spec and the All approaches, and was designated the mixed approach (Mix). In this approach, the language models acquired knowledge from all locations, but a higher relevance was assigned to the vocabulary from the associated location. Results from simulations showed that the Mix and Spec approaches could only outperform the baseline in a statistically significant way if pictogram users reuse more than 50% and 75% of their sentences, respectively. Under low sentence reuse conditions there were no statistically significant differences between the location-aware approaches and the All approach. Under these conditions, the Mix approach performed better than the Spec approach in a statistically significant way.
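The "Mix" approach can be sketched as an interpolated unigram model in which the current location's counts receive higher weight. The class name, data layout, and interpolation scheme below are illustrative assumptions, not the paper's exact formulation:

```python
from collections import Counter

class MixPictogramModel:
    """Location-aware pictogram prediction: interpolate a location-specific
    unigram model with an all-purpose model over every location's usage."""

    def __init__(self, location_logs, lam=0.7):
        self.lam = lam  # relevance assigned to the current location's counts
        self.by_loc = {loc: Counter(seq) for loc, seq in location_logs.items()}
        self.global_counts = Counter()
        for counts in self.by_loc.values():
            self.global_counts.update(counts)

    def prob(self, pictogram, location):
        loc_counts = self.by_loc.get(location, Counter())
        p_loc = loc_counts[pictogram] / max(1, sum(loc_counts.values()))
        p_all = self.global_counts[pictogram] / max(1, sum(self.global_counts.values()))
        return self.lam * p_loc + (1 - self.lam) * p_all

    def predict(self, location, k=3):
        """Top-k pictograms to offer the user at this location."""
        return sorted(self.global_counts, key=lambda w: -self.prob(w, location))[:k]
```

Setting `lam=0` recovers the all-purpose (All) baseline, while `lam=1` recovers the location-specific (Spec) approach, making the trade-off described in the abstract explicit.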

  19. Performance of Reclassification Statistics in Comparing Risk Prediction Models

    PubMed Central

    Paynter, Nina P.

    2012-01-01

    Concerns have been raised about the use of traditional measures of model fit in evaluating risk prediction models for clinical use, and reclassification tables have been suggested as an alternative means of assessing the clinical utility of a model. Several measures based on the table have been proposed, including the reclassification calibration (RC) statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI), but the performance of these in practical settings has not been fully examined. We used simulations to estimate the type I error and power for these statistics in a number of scenarios, as well as the impact of the number and type of categories, when adding a new marker to an established or reference model. The type I error was found to be reasonable in most settings, and power was highest for the IDI, which was similar to the test of association. The relative power of the RC statistic, a test of calibration, and the NRI, a test of discrimination, varied depending on the model assumptions. These tools provide unique but complementary information. PMID:21294152
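The categorical net reclassification improvement discussed in this record has a simple closed form, sketched here for risk categories coded as ordered integers:

```python
def net_reclassification_improvement(old_cat, new_cat, events):
    """Categorical NRI: (P(up|event) - P(down|event)) +
    (P(down|nonevent) - P(up|nonevent)), where 'up' means the new model
    assigns a higher risk category than the old model."""
    ev_up = ev_down = ne_up = ne_down = n_ev = n_ne = 0
    for old, new, event in zip(old_cat, new_cat, events):
        if event:
            n_ev += 1
            ev_up += new > old
            ev_down += new < old
        else:
            n_ne += 1
            ne_up += new > old
            ne_down += new < old
    return (ev_up - ev_down) / n_ev + (ne_down - ne_up) / n_ne
```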

  20. Rapid On-Site Evaluation of Fine-Needle Aspiration by Non-Cytopathologists: A Systematic Review and Meta-Analysis of Diagnostic Accuracy Studies for Adequacy Assessment.

    PubMed

    Pearson, Lauren; Factor, Rachel E; White, Sandra K; Walker, Brandon S; Layfield, Lester J; Schmidt, Robert L

    2018-06-06

    Rapid on-site evaluation (ROSE) has been shown to improve adequacy rates and reduce needle passes. ROSE is often performed by cytopathologists who have limited availability and may be costlier than alternatives. Several recent studies examined the use of alternative evaluators (AEs) for ROSE. A summary of this information could help inform guidelines regarding the use of AEs. The objective was to assess the accuracy of AEs compared to cytopathologists in assessing the adequacy of specimens during ROSE. This was a systematic review and meta-analysis. Reporting and study quality were assessed using the STARD guidelines and QUADAS-2. All steps were performed independently by two evaluators. Summary estimates were obtained using the hierarchal method in Stata v14. Heterogeneity was evaluated using Higgins' I² statistic. The systematic review identified 13 studies that were included in the meta-analysis. Summary estimates of sensitivity and specificity for AEs were 97% (95% CI: 92-99%) and 83% (95% CI: 68-92%). There was wide variation in accuracy statistics between studies (I² = 0.99). AEs sometimes have accuracy that is close to cytopathologists. However, there is wide variability between studies, so it is not possible to provide a broad guideline regarding the use of AEs. © 2018 S. Karger AG, Basel.
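    The Higgins' I² statistic used above quantifies between-study heterogeneity from Cochran's Q under a fixed-effect model. An illustrative sketch taking per-study effect estimates and their variances as inputs (not code from the cited study):

```python
def higgins_i2(effects, variances):
    """Higgins' I^2 = (Q - df) / Q, truncated at 0, where Q is Cochran's
    inverse-variance-weighted heterogeneity statistic."""
    w = [1.0 / v for v in variances]                    # inverse-variance weights
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    if q <= df:
        return 0.0
    return (q - df) / q
```

    I² near 1 (as in the 0.99 reported above) means nearly all observed variation reflects real between-study differences rather than sampling error.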

  1. Investigating the feasibility of using partial least squares as a method of extracting salient information for the evaluation of digital breast tomosynthesis

    NASA Astrophysics Data System (ADS)

    Zhang, George Z.; Myers, Kyle J.; Park, Subok

    2013-03-01

    Digital breast tomosynthesis (DBT) has shown promise for improving the detection of breast cancer, but it has not yet been fully optimized due to a large space of system parameters to explore. A task-based statistical approach [1] is a rigorous method for evaluating and optimizing this promising imaging technique with the use of optimal observers such as the Hotelling observer (HO). However, the high data dimensionality found in DBT has been the bottleneck for the use of a task-based approach in DBT evaluation. To reduce data dimensionality while extracting salient information for performing a given task, efficient channels have to be used for the HO. In the past few years, 2D Laguerre-Gauss (LG) channels, which are a complete basis for stationary backgrounds and rotationally symmetric signals, have been utilized for DBT evaluation [2,3]. But since background and signal statistics from DBT data are neither stationary nor rotationally symmetric, LG channels may not be efficient in providing reliable performance trends as a function of system parameters. Recently, partial least squares (PLS) has been shown to generate efficient channels for the Hotelling observer in detection tasks involving random backgrounds and signals [4]. In this study, we investigate the use of PLS as a method for extracting salient information from DBT in order to better evaluate such systems.
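    The first PLS weight vector, which can serve as a data-driven channel, is obtainable in one NIPALS step as the normalized covariance direction X^T y. A minimal sketch under the assumption that X (rows are flattened, mean-centered images) and y (centered class labels) are plain Python lists; real channelized-observer pipelines extract several such components and deflate X between steps:

```python
def pls_channel(X, y):
    """First PLS weight vector (one NIPALS step): w proportional to X^T y,
    normalized to unit length. X: n rows of p features; y: n responses."""
    n, p = len(X), len(X[0])
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in xty) ** 0.5
    return [v / norm for v in xty]
```

    Projecting each image onto a handful of such vectors reduces the dimensionality seen by the Hotelling observer from p pixels to a few channel outputs.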

  2. Expected p-values in light of an ROC curve analysis applied to optimal multiple testing procedures.

    PubMed

    Vexler, Albert; Yu, Jihnhee; Zhao, Yang; Hutson, Alan D; Gurevich, Gregory

    2017-01-01

    Many statistical studies report p-values for inferential purposes. In several scenarios, the stochastic aspect of p-values is neglected, which may contribute to drawing wrong conclusions in real data experiments. The stochastic nature of p-values makes it difficult to use them to examine the performance of given testing procedures or the associations between investigated factors. We turn our focus to the modern statistical literature to address the expected p-value (EPV) as a measure of the performance of decision-making rules. During the course of our study, we prove that the EPV can be considered in the context of receiver operating characteristic (ROC) curve analysis, a well-established biostatistical methodology. The ROC-based framework provides a new and efficient methodology for investigating and constructing statistical decision-making procedures, including: (1) evaluation and visualization of properties of the testing mechanisms, considering, e.g., partial EPVs; (2) development of optimal tests via the minimization of EPVs; (3) creation of novel methods for optimally combining multiple test statistics. We demonstrate that the proposed EPV-based approach allows us to maximize the integrated power of testing algorithms with respect to various significance levels. In an application, we use the proposed method to construct the optimal test and analyze a myocardial infarction disease dataset. We outline the usefulness of the "EPV/ROC" technique for evaluating different decision-making procedures, their construction and properties with an eye towards practical applications.
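    The ROC connection can be made concrete in the simplest case: for a one-sided z-test with effect size δ, the EPV equals P(Z₀ > Z₁) for independent null and alternative statistics, which is Φ(−δ/√2). A Monte Carlo check of this identity (an illustrative sketch, not the authors' code):

```python
import math
import random

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_p_value(delta, n_sim=100_000, seed=1):
    """Monte Carlo EPV for a one-sided z-test: under H1 the statistic is
    Z ~ N(delta, 1) and the p-value is 1 - Phi(Z)."""
    rng = random.Random(seed)
    total = sum(1.0 - phi(rng.gauss(delta, 1.0)) for _ in range(n_sim))
    return total / n_sim

# Closed form for comparison: EPV = Phi(-delta / sqrt(2))
exact = phi(-1.5 / math.sqrt(2.0))
```

    Smaller EPV means the test tends to produce smaller p-values under the alternative, i.e. a better decision rule, which is why minimizing EPV yields optimal tests.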

  3. Risk assessment model for development of advanced age-related macular degeneration.

    PubMed

    Klein, Michael L; Francis, Peter J; Ferris, Frederick L; Hamon, Sara C; Clemons, Traci E

    2011-12-01

    To design a risk assessment model for development of advanced age-related macular degeneration (AMD) incorporating phenotypic, demographic, environmental, and genetic risk factors. We evaluated longitudinal data from 2846 participants in the Age-Related Eye Disease Study. At baseline, these individuals had all levels of AMD, ranging from none to unilateral advanced AMD (neovascular or geographic atrophy). Follow-up averaged 9.3 years. We performed a Cox proportional hazards analysis with demographic, environmental, phenotypic, and genetic covariates and constructed a risk assessment model for development of advanced AMD. Performance of the model was evaluated using the C statistic and the Brier score and externally validated in participants in the Complications of Age-Related Macular Degeneration Prevention Trial. The final model included the following independent variables: age, smoking history, family history of AMD (first-degree member), phenotype based on a modified Age-Related Eye Disease Study simple scale score, and genetic variants CFH Y402H and ARMS2 A69S. The model did well on performance measures, with very good discrimination (C statistic = 0.872) and excellent calibration and overall performance (Brier score at 5 years = 0.08). Successful external validation was performed, and a risk assessment tool was designed for use with or without the genetic component. We constructed a risk assessment model for development of advanced AMD. The model performed well on measures of discrimination, calibration, and overall performance and was successfully externally validated. This risk assessment tool is available for online use.
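    The C statistic and Brier score used above to evaluate the model can both be computed directly from predicted risks and outcomes. A minimal sketch for the binary, non-censored case (the AREDS analysis uses survival data, so this is a deliberate simplification):

```python
def c_statistic(risk, outcome):
    """Probability that a randomly chosen case receives a higher predicted
    risk than a randomly chosen non-case; ties count one half."""
    cases = [r for r, y in zip(risk, outcome) if y]
    controls = [r for r, y in zip(risk, outcome) if not y]
    pairs = len(cases) * len(controls)
    concordant = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return concordant / pairs

def brier_score(risk, outcome):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((r - y) ** 2 for r, y in zip(risk, outcome)) / len(risk)
```

    Discrimination (C statistic) and calibration-plus-sharpness (Brier score) are complementary, which is why the study reports both.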

  4. The influence of various test plans on mission reliability. [for Shuttle Spacelab payloads

    NASA Technical Reports Server (NTRS)

    Stahle, C. V.; Gongloff, H. R.; Young, J. P.; Keegan, W. B.

    1977-01-01

    Methods have been developed for the evaluation of cost-effective vibroacoustic test plans for Shuttle Spacelab payloads. The shock and vibration environments of components have been statistically represented, and statistical decision theory has been used to evaluate the cost effectiveness of five basic test plans, with structural test options for two of the plans. Component, subassembly, and payload testing have been performed for each plan, along with calculations of optimum test levels and expected costs. The test plans have been ranked both by minimizing expected project costs and by vibroacoustic reliability. It was found that optimum costs may vary by up to $6 million, with the lowest-cost plan eliminating component testing and maintaining flight vibration reliability via subassembly tests at high acoustic levels.

  5. Color stability of shade guides after autoclave sterilization.

    PubMed

    Schmeling, Max; Sartori, Neimar; Monteiro, Sylvio; Baratieri, Luiz

    2014-01-01

    This study evaluated the influence of 120 autoclave sterilization cycles on the color stability of two commercial shade guides (Vita Classical and Vita System 3D-Master). The specimens were evaluated by spectrophotometer before and after the sterilization cycles. The color was described using the three-dimensional CIELab system. The statistical analysis was performed in three chromaticity coordinates, before and after sterilization cycles, using the paired samples t test. All specimens became darker after autoclave sterilization cycles. However, specimens of Vita Classical became redder, while those of the Vita System 3D-Master became more yellow. Repeated cycles of autoclave sterilization caused statistically significant changes in the color coordinates of the two shade guides. However, these differences are considered clinically acceptable.
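    The paired samples t test applied above to the chromaticity coordinates reduces to a one-sample test on the before/after differences. A self-contained sketch of the statistic (data values are illustrative, not measurements from the study):

```python
import math

def paired_t(before, after):
    """Paired t statistic: t = mean(d) / (sd(d) / sqrt(n)) on the
    differences d = after - before; also returns degrees of freedom n-1."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1
```

    The resulting t is compared against the t distribution with n-1 degrees of freedom; a large negative t for the L* coordinate would correspond to the darkening reported above.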

  6. Design and Test of Pseudorandom Number Generator Using a Star Network of Lorenz Oscillators

    NASA Astrophysics Data System (ADS)

    Cho, Kenichiro; Miyano, Takaya

    We have recently developed a chaos-based stream cipher based on augmented Lorenz equations as a star network of Lorenz subsystems. In our method, the augmented Lorenz equations are used as a pseudorandom number generator. In this study, we propose a new method based on the augmented Lorenz equations for generating binary pseudorandom numbers and evaluate its security using the statistical tests of SP 800-22 published by the National Institute of Standards and Technology, in comparison with the performances of other chaotic dynamical models used as binary pseudorandom number generators. We further propose a faster version of the proposed method and evaluate its security using the statistical tests of TestU01 published by L’Ecuyer and Simard.
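    The first test in the SP 800-22 battery referenced above is the frequency (monobit) test: bits are mapped to ±1, summed, and the p-value is erfc(|Sₙ|/√(2n)), with the sequence passing if the p-value is at least 0.01. A self-contained sketch:

```python
import math

def monobit_p_value(bits):
    """SP 800-22 frequency (monobit) test p-value for a bit sequence:
    p = erfc(|S_n| / sqrt(2n)), where S_n sums bits mapped to +/-1."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(2.0 * n))
```

    For the worked example in the SP 800-22 document, the sequence 1011010101 gives S_n = 2 and a p-value near 0.53, so it passes.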

  7. Evaluation of the operating performance of conventional versus flocculator secondary clarifiers at the Kuwahee Wastewater Treatment Plant, Knoxville, Tennessee.

    PubMed

    Moreno, Patricio A; Reed, Gregory D

    2007-05-01

    This paper analyzed the difference in performance of three differently designed circular secondary clarifiers in the same wastewater treatment plant. Data obtained using flocculated suspended solids and dispersed suspended solids tests were analyzed using statistical tools. The conventional clarifier showed more variability in the average effluent suspended solids concentration when compared with the flocculator-clarifiers. Furthermore, a difference in performance between the two flocculator-clarifiers was found.

  8. A probabilistic approach to photovoltaic generator performance prediction

    NASA Astrophysics Data System (ADS)

    Khallat, M. A.; Rahman, S.

    1986-09-01

    A method for predicting the performance of a photovoltaic (PV) generator based on long term climatological data and expected cell performance is described. The equations for cell model formulation are provided. Use of the statistical model for characterizing the insolation level is discussed. The insolation data is fitted to appropriate probability distribution functions (Weibull, beta, normal). The probability distribution functions are utilized to evaluate the capacity factors of PV panels or arrays. An example is presented revealing the applicability of the procedure.
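    Once an insolation distribution has been fitted, the capacity factor can be estimated by integrating a cell output model against the probability density. An illustrative sketch for a Weibull insolation model with a linear-to-rated output model; the rated insolation level, integration limit, and step count are assumptions for the example, not values from the paper:

```python
import math

def weibull_pdf(x, k, lam):
    """Weibull probability density with shape k and scale lam."""
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

def capacity_factor(k, lam, rated=1000.0, x_max=1400.0, steps=20000):
    """Expected PV output as a fraction of rated output, assuming output
    rises linearly with insolation up to the rated level (midpoint rule)."""
    dx = x_max / steps
    cf = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        cf += min(x / rated, 1.0) * weibull_pdf(x, k, lam) * dx
    return cf
```

    A larger scale parameter (sunnier site) pushes the capacity factor up, which is the kind of comparison the probabilistic approach enables.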

  9. From Data to Bonuses: A Case Study of the Issues Related to Awarding Teachers Pay on the Basis of Their Students' Progress. Working Paper 2008-14

    ERIC Educational Resources Information Center

    McCaffrey, Daniel F.; Han, Bing; Lockwood, J. R.

    2008-01-01

    A key component to the new wave of performance-based pay initiatives is the use of student achievement data to evaluate teacher performance. As greater amounts of student achievement data are being collected, researchers have been developing and applying innovative statistical and econometric models to longitudinal data to develop measures of an…

  10. Computer architecture evaluation for structural dynamics computations: Project summary

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1989-01-01

    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.

  11. Multiple performance measures are needed to evaluate triage systems in the emergency department.

    PubMed

    Zachariasse, Joany M; Nieboer, Daan; Oostenbrink, Rianne; Moll, Henriëtte A; Steyerberg, Ewout W

    2018-02-01

    Emergency department triage systems can be considered prediction rules with an ordinal outcome, where different directions of misclassification have different clinical consequences. We evaluated strategies to compare the performance of triage systems and aimed to propose a set of performance measures that should be used in future studies. We identified performance measures based on literature review and expert knowledge. Their properties are illustrated in a case study evaluating two triage modifications in a cohort of 14,485 pediatric emergency department visits. Strengths and weaknesses of the performance measures were systematically appraised. Commonly reported performance measures are measures of statistical association (34/60 studies) and diagnostic accuracy (17/60 studies). The case study illustrates that none of the performance measures fulfills all criteria for triage evaluation. Decision curves are the performance measures with the most attractive features but require dichotomization. In addition, paired diagnostic accuracy measures can be recommended for dichotomized analysis, and the triage-weighted kappa and Nagelkerke's R² for ordinal analyses. Other performance measures provide limited additional information. When comparing modifications of triage systems, decision curves and diagnostic accuracy measures should be used in a dichotomized analysis, and the triage-weighted kappa and Nagelkerke's R² in an ordinal approach. Copyright © 2017 Elsevier Inc. All rights reserved.
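    Decision curves, recommended above for dichotomized analysis, plot net benefit against the threshold probability p_t: NB = TP/n − (FP/n)·p_t/(1 − p_t). A minimal sketch of a single point on such a curve (function name and data layout are illustrative):

```python
def net_benefit(pred, outcome, threshold):
    """Net benefit of classifying as positive when predicted probability
    >= threshold: TP/n - (FP/n) * threshold / (1 - threshold)."""
    n = len(outcome)
    tp = sum(1 for p, y in zip(pred, outcome) if p >= threshold and y)
    fp = sum(1 for p, y in zip(pred, outcome) if p >= threshold and not y)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))
```

    Sweeping the threshold over the clinically relevant range and plotting the result yields the decision curve; the odds factor encodes the harm of a false positive relative to the benefit of a true positive.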

  12. Generic Feature Selection with Short Fat Data

    PubMed Central

    Clarke, B.; Chu, J.-H.

    2014-01-01

    Consider a regression problem in which there are many more explanatory variables than data points, i.e., p ≫ n. Essentially, without reducing the number of variables inference is impossible. So, we group the p explanatory variables into blocks by clustering, evaluate statistics on the blocks and then regress the response on these statistics under a penalized error criterion to obtain estimates of the regression coefficients. We examine the performance of this approach for a variety of choices of n, p, classes of statistics, clustering algorithms, penalty terms, and data types. When n is not large, the discrimination over number of statistics is weak, but computations suggest regressing on approximately [n/K] statistics where K is the number of blocks formed by a clustering algorithm. Small deviations from this are observed when the blocks of variables are of very different sizes. Larger deviations are observed when the penalty term is an Lq norm with high enough q. PMID:25346546

  13. Informativeness of Diagnostic Marker Values and the Impact of Data Grouping.

    PubMed

    Ma, Hua; Bandos, Andriy I; Gur, David

    2018-01-01

    Assessing performance of diagnostic markers is a necessary step for their use in decision making regarding various conditions of interest in diagnostic medicine and other fields. Globally useful markers could, however, have ranges of values that are "diagnostically non-informative". This paper demonstrates that the presence of marker values from diagnostically non-informative ranges could lead to a loss in statistical efficiency during nonparametric evaluation and shows that grouping non-informative values provides a natural resolution to this problem. These points are theoretically proven and an extensive simulation study is conducted to illustrate the possible benefits of using grouped marker values in a number of practically reasonable scenarios. The results contradict the common conjecture regarding the detrimental effect of grouped marker values during performance assessments. Specifically, contrary to the common assumption that grouped marker values lead to bias, grouping non-informative values does not introduce bias and could substantially reduce sampling variability. The proven concept that grouped marker values could be statistically beneficial without detrimental consequences implies that in practice, tied values do not always require resolution whereas the use of continuous diagnostic results without addressing diagnostically non-informative ranges could be statistically detrimental. Based on these findings, more efficient methods for evaluating diagnostic markers could be developed.

  14. Feedback Effects of Teaching Quality Assessment: Macro and Micro Evidence

    ERIC Educational Resources Information Center

    Bianchini, Stefano

    2014-01-01

    This study investigates the feedback effects of teaching quality assessment. Previous literature looked separately at the evolution of individual and aggregate scores to understand whether instructors and university performance depends on its past evaluation. I propose a new quantitative-based methodology, combining statistical distributions and…

  15. 10 CFR 431.17 - Determination of efficiency.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... characteristics of that basic model, and (ii) Based on engineering or statistical analysis, computer simulation or... simulation or modeling, and other analytic evaluation of performance data on which the AEDM is based... applied. (iii) If requested by the Department, the manufacturer shall conduct simulations to predict the...

  16. 10 CFR 431.17 - Determination of efficiency.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... characteristics of that basic model, and (ii) Based on engineering or statistical analysis, computer simulation or... simulation or modeling, and other analytic evaluation of performance data on which the AEDM is based... applied. (iii) If requested by the Department, the manufacturer shall conduct simulations to predict the...

  17. Evaluation of an Automated Keywording System.

    ERIC Educational Resources Information Center

    Malone, Linda C.; And Others

    1990-01-01

    Discussion of automated indexing techniques focuses on ways to statistically document improvements in the development of an automated keywording system over time. The system developed by the Joint Chiefs of Staff to automate the storage, categorization, and retrieval of information from military exercises is explained, and performance measures are…

  18. Assessment and evaluations of I-80 truck loads and their load effects : final report.

    DOT National Transportation Integrated Search

    2016-12-01

    The research objective is to examine the safety of Wyoming bridges on the I-80 corridor considering the actual truck traffic on the : interstate based upon weigh in motion (WIM) data. This was accomplished by performing statistical analyses to determ...

  19. Comparison of AERMOD and CALPUFF models for simulating SO2 concentrations in a gas refinery.

    PubMed

    Atabi, Farideh; Jafarigol, Farzaneh; Moattar, Faramarz; Nouri, Jafar

    2016-09-01

    In this study, SO2 concentrations from a gas refinery located in complex terrain were calculated with the steady-state AERMOD model and the nonsteady-state CALPUFF model. First, SO2 concentrations emitted from 16 refinery stacks were obtained by field measurements at nine receptors in four seasons, and the performance of both models was evaluated. The simulated SO2 ambient concentrations from each model were then compared with the observed concentrations, and the model results were compared with each other. The evaluation of the two models was based on statistical analysis and Q-Q plots. Review of the statistical parameters and Q-Q plots showed that the performance of both models in simulating SO2 concentrations in the region can be considered acceptable. The results showed that the composite ratio between simulated and observed values across receptors, for all four averaging times, is 0.72 for AERMOD and 0.89 for CALPUFF. However, in the complex topographic conditions, CALPUFF offers better agreement with the observed concentrations.

  20. Slice simulation from a model of the parenchymous vascularization to evaluate texture features: work in progress.

    PubMed

    Rolland, Y; Bézy-Wendling, J; Duvauferrier, R; Coatrieux, J L

    1999-03-01

    To demonstrate the usefulness of a model of the parenchymous vascularization to evaluate texture analysis methods. Slices with thickness varying from 1 to 4 mm were reformatted from a 3D vascular model corresponding to either normal tissue perfusion or local hypervascularization. Parameters of statistical methods were measured on 16 regions of interest of 128x128 pixels, and mean values and standard deviations were calculated. For each parameter, the performance (discrimination power and stability) was evaluated. Among 11 calculated statistical parameters, three (homogeneity, entropy, mean of gradients) were found to have good discriminating power to differentiate normal perfusion from hypervascularization, but only the gradient mean was found to have good stability with respect to slice thickness. Five parameters (run percentage, run length distribution, long run emphasis, contrast, and gray level distribution) were found to have intermediate results. Of the remaining three, kurtosis and correlation were found to have little discrimination power, and skewness none. This 3D vascular model, which allows the generation of various examples of vascular textures, is a powerful tool to assess the performance of texture analysis methods. It improves our knowledge of the methods and should contribute to their a priori choice when designing clinical studies.

  1. Workload Characterization of CFD Applications Using Partial Differential Equation Solvers

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Workload characterization is used for modeling and evaluating computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, IBM SP-2, and a cluster of Intel Pentium Pro based PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which result in the workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.

  2. Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset.

    PubMed

    Ramus, Claire; Hovasse, Agnès; Marcellin, Marlène; Hesse, Anne-Marie; Mouton-Barbosa, Emmanuelle; Bouyssié, David; Vaca, Sebastian; Carapito, Christine; Chaoui, Karima; Bruley, Christophe; Garin, Jérôme; Cianférani, Sarah; Ferro, Myriam; Van Dorssaeler, Alain; Burlet-Schiltz, Odile; Schaeffer, Christine; Couté, Yohann; Gonzalez de Peredo, Anne

    2016-01-30

    Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, based either on spectral counting or on MS signal analysis, which appear as an attractive way to analyze differential protein expression in complex biological samples. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we used a proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed a fine-grained assessment of their performance in terms of sensitivity and false discovery rate, by counting true and false positives (UPS1 and yeast background proteins, respectively, found as differential). The spiked standard dataset has been deposited to the ProteomeXchange repository with the identifier PXD001819 and can be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods. Bioinformatic pipelines for label-free quantitative analysis must be objectively evaluated in their ability to detect variant proteins with good sensitivity and low false discovery rate in large-scale proteomic studies. This can be done through the use of complex spiked samples, for which the "ground truth" of variant proteins is known, allowing a statistical evaluation of the performances of the data processing workflow. 
We provide here such a controlled standard dataset and used it to evaluate the performances of several label-free bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, for detection of variant proteins with different absolute expression levels and fold change values. The dataset presented here can be useful for tuning software tool parameters, and also testing new algorithms for label-free quantitative analysis, or for evaluation of downstream statistical methods. Copyright © 2015 Elsevier B.V. All rights reserved.
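    With a spiked standard, the ground truth is known, so sensitivity and false discovery rate follow directly from the list of proteins a workflow calls differential. A minimal sketch (protein names are illustrative placeholders):

```python
def benchmark(called, truth):
    """Sensitivity and FDR of a list of proteins called differential,
    given the set of truly spiked (ground-truth) proteins."""
    tp = sum(1 for name in called if name in truth)  # true positives
    fp = len(called) - tp                            # false positives
    sensitivity = tp / len(truth)
    fdr = fp / len(called)
    return sensitivity, fdr
```

    Comparing these two numbers across software pipelines at matched settings is exactly the evaluation the spiked UPS1/yeast design enables.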

  3. Study design and statistical analysis of data in human population studies with the micronucleus assay.

    PubMed

    Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano

    2011-01-01

    The most common study design performed in population studies based on the micronucleus (MN) assay is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace, in the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies, since most recent studies considering gene-environment interaction often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data have been addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows possible errors in the dataset to be detected and the validity of assumptions required for more complex analyses to be checked. Basic issues dealing with statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented, before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.

  4. High-order statistical equalizer for nonlinearity compensation in dispersion-managed coherent optical communications.

    PubMed

    Koike-Akino, Toshiaki; Duan, Chunjie; Parsons, Kieran; Kojima, Keisuke; Yoshida, Tsuyoshi; Sugihara, Takashi; Mizuochi, Takashi

    2012-07-02

    Fiber nonlinearity has become a major limiting factor to realize ultra-high-speed optical communications. We propose a fractionally-spaced equalizer which exploits trained high-order statistics to deal with data-pattern dependent nonlinear impairments in fiber-optic communications. The computer simulation reveals that the proposed 3-tap equalizer improves the Q-factor by more than 2 dB for long-haul transmissions over a 5,230 km distance at a 40 Gbps data rate. We also demonstrate that the joint use of digital backpropagation (DBP) and the proposed equalizer offers an additional 1-2 dB performance improvement due to the channel shortening gain. Performance in high-speed transmissions at 100 Gbps and beyond is evaluated as well.

  5. Performance of the general circulation models in simulating temperature and precipitation over Iran

    NASA Astrophysics Data System (ADS)

    Abbasian, Mohammadsadegh; Moghim, Sanaz; Abrishamchi, Ahmad

    2018-03-01

    General Circulation Models (GCMs) are advanced tools for impact assessment and climate change studies. Previous studies show that the performance of the GCMs in simulating climate variables varies significantly over different regions. This study intends to evaluate the performance of the Coupled Model Intercomparison Project phase 5 (CMIP5) GCMs in simulating temperature and precipitation over Iran. Simulations from 37 GCMs and observations from the Climatic Research Unit (CRU) were obtained for the period of 1901-2005. Six performance measures (mean bias, root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), linear correlation coefficient (r), Kolmogorov-Smirnov statistic (KS), and Sen's slope estimator) and the Taylor diagram are used for the evaluation. GCMs are ranked based on each statistic at seasonal and annual time scales. Results show that most GCMs perform reasonably well in simulating the annual and seasonal temperature over Iran. The majority of the GCMs have a poor skill to simulate precipitation, particularly at seasonal scale. Based on the results, the best GCMs to represent temperature and precipitation simulations over Iran are the CMCC-CMS (Euro-Mediterranean Center on Climate Change) and the MRI-CGCM3 (Meteorological Research Institute), respectively. The results are valuable for climate and hydrometeorological studies and can help water resources planners and managers to choose the proper GCM based on their criteria.
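    Three of the ranking statistics used above (mean bias, RMSE, and Nash-Sutcliffe efficiency) can be computed from paired simulated/observed series. A minimal sketch (function name and data layout are illustrative):

```python
import math

def evaluate(sim, obs):
    """Mean bias, RMSE, and Nash-Sutcliffe efficiency (NSE) for paired
    simulated and observed series of equal length."""
    n = len(obs)
    bias = sum(s - o for s, o in zip(sim, obs)) / n
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / n)
    mean_obs = sum(obs) / n
    # NSE = 1 - SS_residual / SS_total; 1 is perfect, <= 0 is no better
    # than predicting the observed mean.
    nse = 1.0 - (sum((o - s) ** 2 for s, o in zip(sim, obs))
                 / sum((o - mean_obs) ** 2 for o in obs))
    return bias, rmse, nse
```

    Ranking the 37 GCMs then amounts to computing these per model against the CRU observations and sorting on each statistic in turn.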

  6. Improving alignment in Tract-based spatial statistics: evaluation and optimization of image registration.

    PubMed

    de Groot, Marius; Vernooij, Meike W; Klein, Stefan; Ikram, M Arfan; Vos, Frans M; Smith, Stephen M; Niessen, Wiro J; Andersson, Jesper L R

    2013-08-01

    Anatomical alignment in neuroimaging studies is of such importance that considerable effort is put into improving the registration used to establish spatial correspondence. Tract-based spatial statistics (TBSS) is a popular method for comparing diffusion characteristics across subjects. TBSS establishes spatial correspondence using a combination of nonlinear registration and a "skeleton projection" that may break topological consistency of the transformed brain images. We therefore investigated feasibility of replacing the two-stage registration-projection procedure in TBSS with a single, regularized, high-dimensional registration. To optimize registration parameters and to evaluate registration performance in diffusion MRI, we designed an evaluation framework that uses native space probabilistic tractography for 23 white matter tracts, and quantifies tract similarity across subjects in standard space. We optimized parameters for two registration algorithms on two diffusion datasets of different quality. We investigated reproducibility of the evaluation framework, and of the optimized registration algorithms. Next, we compared registration performance of the regularized registration methods and TBSS. Finally, feasibility and effect of incorporating the improved registration in TBSS were evaluated in an example study. The evaluation framework was highly reproducible for both algorithms (R² = 0.993; 0.931). The optimal registration parameters depended on the quality of the dataset in a graded and predictable manner. At optimal parameters, both algorithms outperformed the registration of TBSS, showing feasibility of adopting such approaches in TBSS. This was further confirmed in the example experiment. Copyright © 2013 Elsevier Inc. All rights reserved.

  7. Evaluation of a statistics-based Ames mutagenicity QSAR model and interpretation of the results obtained.

    PubMed

    Barber, Chris; Cayley, Alex; Hanser, Thierry; Harding, Alex; Heghes, Crina; Vessey, Jonathan D; Werner, Stephane; Weiner, Sandy K; Wichard, Joerg; Giddings, Amanda; Glowienke, Susanne; Parenty, Alexis; Brigo, Alessandro; Spirkl, Hans-Peter; Amberg, Alexander; Kemper, Ray; Greene, Nigel

    2016-04-01

    The relative wealth of bacterial mutagenicity data available in the public literature means that in silico quantitative/qualitative structure activity relationship (QSAR) systems can readily be built for this endpoint. A good means of evaluating the performance of such systems is to use private unpublished data sets, which generally represent a more distinct chemical space than publicly available test sets and, as a result, provide a greater challenge to the model. However, raw performance metrics should not be the only factor considered when judging this type of software since expert interpretation of the results obtained may allow for further improvements in predictivity. Enough information should be provided by a QSAR to allow the user to make general, scientifically-based arguments in order to assess and overrule predictions when necessary. With all this in mind, we sought to validate the performance of the statistics-based in vitro bacterial mutagenicity prediction system Sarah Nexus (version 1.1) against private test data sets supplied by nine different pharmaceutical companies. The results of these evaluations were then analysed in order to identify findings presented by the model which would be useful for the user to take into consideration when interpreting the results and making their final decision about the mutagenic potential of a given compound. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Predictive variables for postoperative pain after 520 consecutive dental extraction surgeries.

    PubMed

    Bortoluzzi, Marcelo Carlos; Manfro, Aline Rosler Grings; Nodari, Rudy Jose; Presta, Andreia Antoniuk

    2012-01-01

    The aim of this study was to evaluate postoperative pain in patients who had a single tooth or multiple erupted teeth extracted. This research evaluated 520 consecutive dental extraction surgeries in which 680 teeth were removed. Data were collected through questionnaires administered to the patients and to the undergraduate students who performed all procedures. Pain was evaluated through qualitative self-reported scores at seven days postsurgery. An increased pain level was statistically associated with ostectomy, postoperative complications, and tobacco consumption. Pain that persisted for more than two days was statistically associated with the amount of anesthetic solution used, with increased surgical time, and with the development of postoperative complications. Periods of pain lasting more than two days could be expected for traumatic surgeries lasting more than 30 minutes. Both severe and prolonged pain were signs of the development of postoperative complications, such as alveolar osteitis and alveolar infection.

  9. A time series analysis performed on a 25-year period of kidney transplantation activity in a single center.

    PubMed

    Santori, G; Fontana, I; Bertocchi, M; Gasloli, G; Valente, U

    2010-05-01

    Following the example of many Western countries, where a "minimum volume rule" policy has been adopted as a quality parameter for complex surgical procedures, the Italian National Transplant Centre set the minimum number of kidney transplantation procedures/y at 30/center. The number of procedures performed in a single center over a long period may be treated as a time series to evaluate trends, seasonal cycles, and nonsystematic fluctuations. Between January 1, 1983, and December 31, 2007, we performed 1376 procedures in adult or pediatric recipients from living or cadaveric donors. The greatest number of cases/y was performed in 1998 (n = 86), followed by 2004 (n = 82), 1996 (n = 75), and 2003 (n = 73). A time series analysis performed using R Statistical Software (Foundation for Statistical Computing, Vienna, Austria), a free software environment for statistical computing and graphics, showed an overall incremental trend after exponential smoothing as well as after seasonal decomposition. However, starting from 2005, we observed a decreasing trend in the series. Holt-Winters exponential smoothing applied to the 1983 to 2007 series predicted 58 procedures for 2008, while 52 were actually performed that year. The time series approach may be helpful to establish a minimum volume/y at a single-center level. Copyright (c) 2010 Elsevier Inc. All rights reserved.
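The forecasting step described above (the analysis itself was done in R) can be sketched in pure Python. The additive Holt-Winters recursion below is a minimal illustration; the yearly counts, the 4-year seasonal cycle, and the smoothing parameters are invented, not the center's data:

```python
def holt_winters_forecast(y, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: return `horizon` forecasts past the end of y."""
    # Initialize level, trend, and seasonal indices from the first two seasons.
    level = sum(y[:season_len]) / season_len
    trend = (sum(y[season_len:2 * season_len]) - sum(y[:season_len])) / season_len ** 2
    seasonal = [y[i] - level for i in range(season_len)]
    for t in range(season_len, len(y)):
        s = seasonal[t % season_len]
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (y[t] - level) + (1 - gamma) * s
    return [level + (h + 1) * trend + seasonal[(len(y) + h) % season_len]
            for h in range(horizon)]

# Illustrative only: yearly transplant counts with a fictitious 4-year cycle.
counts = [52, 58, 61, 55, 60, 66, 70, 63, 68, 75, 78, 71]
print(holt_winters_forecast(counts, season_len=4, alpha=0.5, beta=0.1,
                            gamma=0.3, horizon=1))
```

Here alpha, beta, and gamma control how quickly the level, trend, and seasonal components adapt; in practice they are chosen by minimizing in-sample forecast error.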

  10. Performance of RVGui sensor and Kodak Ektaspeed Plus film for proximal caries detection.

    PubMed

    Abreu, M; Mol, A; Ludlow, J B

    2001-03-01

    A high-resolution charge-coupled device was used to compare the diagnostic performances obtained with Trophy's new RVGui sensor and Kodak Ektaspeed Plus film with respect to caries detection. Three acquisition modes of the Trophy RVGui sensor were compared with Kodak Ektaspeed Plus film. Images of the proximal surfaces of 40 extracted posterior teeth were evaluated by 6 observers. The presence or absence of caries was scored by means of a 5-point confidence scale. The actual caries status of each surface was determined through ground-section histology. Responses were evaluated by means of receiver operating characteristic analysis. Areas under receiver operating characteristic curves (A(Z)) were assessed through analysis of variance. The mean A(Z) scores were 0.85 for film, 0.84 for the high-resolution caries mode, and 0.82 for both the low resolution caries mode and the high-resolution periodontal mode. These differences were not statistically significant (P =.70). The differences among observers also were not statistically significant (P =.23). The performance of the RVGui sensor in high- and low-resolution modes for proximal caries detection is comparable to that of Ektaspeed Plus film.
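The area under the ROC curve (A(Z)) from ordinal confidence ratings is equivalent to the Mann-Whitney statistic: the probability that a randomly chosen carious surface receives a higher score than a randomly chosen sound one, counting ties as half. A minimal sketch, with invented 5-point ratings rather than the study's data:

```python
def roc_auc(scores_present, scores_absent):
    """AUC = P(score_present > score_absent) + 0.5 * P(tie), over all pairs."""
    pairs = len(scores_present) * len(scores_absent)
    wins = sum(1.0 if p > a else 0.5 if p == a else 0.0
               for p in scores_present for a in scores_absent)
    return wins / pairs

# 5-point confidence ratings (1 = definitely sound ... 5 = definitely carious)
carious = [5, 4, 4, 3, 5]   # surfaces with caries on histology
sound = [1, 2, 3, 2, 1]     # surfaces without
print(round(roc_auc(carious, sound), 2))  # → 0.98
```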

  11. Clinical evaluation of flowable resins in non-carious cervical lesions: two-year results.

    PubMed

    Celik, Cigdem; Ozgünaltay, Gül; Attar, Nuray

    2007-01-01

    This study evaluated the two-year clinical performance of one microhybrid composite and three different types of flowable resin materials in non-carious cervical lesions. A total of 252 noncarious cervical lesions were restored in 37 patients (12 male, 25 female) with Admira Flow, Dyract Flow, Filtek Flow and Filtek Z250, according to manufacturers' instructions. All the restorations were placed by one operator, and two other examiners evaluated the restorations clinically within one week after placement and after 6, 12, 18 and 24 months, using modified USPHS criteria. At the end of 24 months, 172 restorations were evaluated in 26 patients, with a recall rate of 68%. Statistical analysis was completed using the Pearson Chi-square and Fisher-Freeman-Halton tests (p < 0.05). Additionally, survival rates were analyzed with the Kaplan-Meier estimator and the Log-Rank test (p < 0.05). The Log-Rank test indicated statistically significant differences between the survival rates of Dyract Flow/Admira Flow and Dyract Flow/Filtek Z250 (p < 0.05). While there was a statistically significant difference between Dyract Flow and the other materials for color match at 12 and 18 months, no significant difference was observed among all of the materials tested at 24 months. Significant differences were revealed between Filtek Z250 and the other materials for marginal adaptation at 18 and 24 months (p < 0.05). With respect to marginal discoloration, secondary caries, surface texture and anatomic form, no significant differences were found between the resin materials (p > 0.05). It was concluded that different types of resin materials demonstrated acceptable clinical performance in non-carious cervical lesions, except for the retention rates of the Dyract Flow restorations.
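The survival analysis used here rests on the Kaplan-Meier product-limit estimator, which can be sketched in a few lines; the restoration follow-up times and failure flags below are invented:

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in months; events: 1 = failure observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, out = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            surv *= 1 - deaths / n_at_risk
            out.append((t, surv))
        # drop everyone (failed or censored) with this time from the risk set
        removed = sum(1 for tt, _ in data if tt == t)
        n_at_risk -= removed
        i += removed
    return out

# e.g. 6 restorations: failures at 6 and 18 months, the rest censored at recall
print(kaplan_meier([6, 12, 18, 24, 24, 24], [1, 0, 1, 0, 0, 0]))
```

The log-rank test then compares such curves between materials by pooling observed versus expected failures at each event time.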

  12. Posterior tibial nerve stimulation vs parasacral transcutaneous neuromodulation for overactive bladder in children.

    PubMed

    Barroso, Ubirajara; Viterbo, Walter; Bittencourt, Joana; Farias, Tiago; Lordêlo, Patrícia

    2013-08-01

    Parasacral transcutaneous electrical nerve stimulation and posterior tibial nerve stimulation have emerged as effective methods to treat overactive bladder in children. However, to our knowledge no study has compared the 2 methods. We evaluated the results of parasacral transcutaneous electrical nerve stimulation and posterior tibial nerve stimulation in children with overactive bladder. We prospectively studied children with overactive bladder without dysfunctional voiding. Success of treatment was evaluated by visual analogue scale and dysfunctional voiding symptom score, and by level of improvement of each specific symptom. Parasacral transcutaneous electrical nerve stimulation was performed 3 times weekly and posterior tibial nerve stimulation was performed once weekly. A total of 22 consecutive patients were treated with posterior tibial nerve stimulation and 37 with parasacral transcutaneous electrical nerve stimulation. There was no difference between the 2 groups regarding demographic characteristics or types of symptoms. Concerning the evaluation by visual analogue scale, complete resolution of symptoms was seen in 70% of the group undergoing parasacral transcutaneous electrical nerve stimulation and in 9% of the group undergoing posterior tibial nerve stimulation (p = 0.02). When the groups were compared, there was no statistically significant difference (p = 0.55). The frequency of persistence of urgency and diurnal urinary incontinence was nearly double in the group undergoing posterior tibial nerve stimulation. However, this difference was not statistically significant. We found that parasacral transcutaneous electrical nerve stimulation is more effective in resolving overactive bladder symptoms, which matches parental perception. However, there were no statistically significant differences in the evaluation by dysfunctional voiding symptom score, or in complete resolution of urgency or diurnal incontinence. 
Copyright © 2013 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.

  13. Statistical evaluation of stability data: criteria for change-over-time and data variability.

    PubMed

    Bar, Raphael

    2003-01-01

    In a recently issued ICH Q1E guidance on evaluation of stability data of drug substances and products, the need to perform a statistical extrapolation of a shelf-life of a drug product or a retest period for a drug substance is based heavily on whether data exhibit a change-over-time and/or variability. However, this document suggests neither measures nor acceptance criteria of these two parameters. This paper demonstrates a useful application of simple statistical parameters for determining whether sets of stability data from either accelerated or long-term storage programs exhibit a change-over-time and/or variability. These parameters are all derived from a simple linear regression analysis first performed on the stability data. The p-value of the slope of the regression line is taken as a measure for change-over-time, and a value of 0.25 is suggested as a limit to insignificant change of the quantitative stability attributes monitored. The minimal process capability index, Cpk, calculated from the standard deviation of the regression line, is suggested as a measure for variability with a value of 2.5 as a limit for an insignificant variability. The usefulness of the above two parameters, p-value and Cpk, was demonstrated on stability data of a refrigerated drug product and on pooled data of three batches of a drug substance. In both cases, the determined parameters allowed characterization of the data in terms of change-over-time and variability. Consequently, complete evaluation of the stability data could be pursued according to the ICH guidance. It is believed that the application of the above two parameters with their acceptance criteria will allow a more unified evaluation of stability data.
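The two screening quantities proposed (the slope of the regression line, and a capability index computed from its residual standard deviation) can be sketched as follows. The assay values and specification limits are invented, and the slope's p-value would additionally require a t-test of the slope against its standard error:

```python
import math

def fit_line(x, y):
    """Ordinary least squares: slope, intercept, residual standard deviation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(r * r for r in resid) / (n - 2))  # root mean square error
    return slope, intercept, s

def cpk(mean, s, lsl, usl):
    """Minimal process-capability index against two-sided spec limits."""
    return min(usl - mean, mean - lsl) / (3 * s)

# Invented assay (%) at 0, 3, 6, 9, 12 months; spec limits 95.0-105.0%
months = [0, 3, 6, 9, 12]
assay = [100.1, 99.9, 100.0, 99.8, 99.9]
slope, intercept, s = fit_line(months, assay)
print(round(slope, 4), round(cpk(sum(assay) / len(assay), s, 95.0, 105.0), 1))
```

With this invented series, the slope is essentially flat and Cpk is far above the paper's suggested 2.5 limit, so the data would be classed as showing neither change-over-time nor meaningful variability.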

  14. Identification of robust statistical downscaling methods based on a comprehensive suite of performance metrics for South Korea

    NASA Astrophysics Data System (ADS)

    Eum, H. I.; Cannon, A. J.

    2015-12-01

    Climate models are a key tool for investigating the impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch in spatial resolution between GCMs and regional applications, particularly in regions characterized by complex terrain such as the Korean Peninsula. A downscaling procedure is therefore essential for assessing the regional impacts of climate change. Numerous statistical downscaling methods have been used, mainly because of their computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis (CFSR) data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. Using a split-sampling scheme, all methods are calibrated with observational station data for the 19 years from 1973 to 1991 and tested on the recent 19 years from 1992 to 2010. To assess the skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure the ability to reproduce temporal correlation, distributions, spatial correlation, and extreme events. In addition, we employ the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR, and that all methods lead to large improvements across all performance metrics. When TOPSIS is applied to the seasonal performance metrics, MACA is identified as the most reliable and robust method for all variables and seasons. Note that this result is derived from CFSR output, which is treated as near-perfect climate data in climate studies. 
Therefore, the ranking found in this study may change when various GCMs are downscaled and evaluated. Nevertheless, it may help end-users (i.e., modelers or water resources managers) understand and select downscaling methods better suited to the priorities of their regional applications.
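TOPSIS itself is compact: vector-normalize the decision matrix, apply the criterion weights, and rank alternatives by their relative closeness to the ideal point. A minimal sketch with an invented 3-method by 2-metric skill matrix (not the study's actual metrics):

```python
import math

def topsis(matrix, weights, benefit):
    """Rank alternatives by relative closeness to the ideal solution.

    matrix: rows = alternatives, cols = criteria; benefit[j] = True if
    larger values of criterion j are better.
    """
    ncols = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(ncols)]
    v = [[weights[j] * row[j] / norms[j] for j in range(ncols)] for row in matrix]
    ideal = [max(col) if benefit[j] else min(col)
             for j, col in enumerate(zip(*v))]
    anti = [min(col) if benefit[j] else max(col)
            for j, col in enumerate(zip(*v))]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to ideal point
        d_neg = math.dist(row, anti)   # distance to anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Invented skill matrix: [temporal correlation, extreme-event skill] per method
scores = topsis([[0.90, 0.70], [0.80, 0.85], [0.60, 0.60]],
                weights=[0.5, 0.5], benefit=[True, True])
print(max(range(3), key=scores.__getitem__))  # index of the top-ranked method
```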

  15. Methods of learning in statistical education: Design and analysis of a randomized trial

    NASA Astrophysics Data System (ADS)

    Boyd, Felicity Turner

    Background. Recent psychological and technological advances suggest that active learning may enhance understanding and retention of statistical principles. A randomized trial was designed to evaluate the addition of innovative instructional methods within didactic biostatistics courses for public health professionals. Aims. The primary objectives were to evaluate and compare the addition of two active learning methods (cooperative and internet) on students' performance; assess their impact on performance after adjusting for differences in students' learning style; and examine the influence of learning style on trial participation. Methods. Consenting students enrolled in a graduate introductory biostatistics course were randomized to cooperative learning, internet learning, or control after completing a pretest survey. The cooperative learning group participated in eight small group active learning sessions on key statistical concepts, while the internet learning group accessed interactive mini-applications on the same concepts. Controls received no intervention. Students completed evaluations after each session and a post-test survey. Study outcome was performance quantified by examination scores. Intervention effects were analyzed by generalized linear models using intent-to-treat analysis and marginal structural models accounting for reported participation. Results. Of 376 enrolled students, 265 (70%) consented to randomization; 69, 100, and 96 students were randomized to the cooperative, internet, and control groups, respectively. Intent-to-treat analysis showed no differences between study groups; however, 51% of students in the intervention groups had dropped out after the second session. 
After accounting for reported participation, expected examination scores were 2.6 points higher (of 100 points) after completing one cooperative learning session (95% CI: 0.3, 4.9) and 2.4 points higher after one internet learning session (95% CI: 0.0, 4.7), versus nonparticipants or controls, adjusting for other performance predictors. Students who preferred learning by reflective observation and active experimentation experienced improved performance through internet learning (5.9 points, 95% CI: 1.2, 10.6) and cooperative learning (2.9 points, 95% CI: 0.6, 5.2), respectively. Learning style did not influence study participation. Conclusions. No performance differences by group were observed by intent-to-treat analysis. Participation in active learning appears to improve student performance in an introductory biostatistics course and provides opportunities for enhancing understanding beyond that attained in traditional didactic classrooms.

  16. Stochastic performance modeling and evaluation of obstacle detectability with imaging range sensors

    NASA Technical Reports Server (NTRS)

    Matthies, Larry; Grandjean, Pierrick

    1993-01-01

    Statistical modeling and evaluation of the performance of obstacle detection systems for Unmanned Ground Vehicles (UGVs) is essential for the design, evaluation, and comparison of sensor systems. In this report, we address this issue for imaging range sensors by dividing the evaluation problem into two levels: quality of the range data itself and quality of the obstacle detection algorithms applied to the range data. We review existing models of the quality of range data from stereo vision and AM-CW LADAR, then use these to derive a new model for the quality of a simple obstacle detection algorithm. This model predicts the probability of detecting obstacles and the probability of false alarms, as a function of the size and distance of the obstacle, the resolution of the sensor, and the level of noise in the range data. We evaluate these models experimentally using range data from stereo image pairs of a gravel road with known obstacles at several distances. The results show that the approach is a promising tool for predicting and evaluating the performance of obstacle detection with imaging range sensors.
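The flavor of such a model can be sketched with a toy version: assume range noise is Gaussian and an obstacle is declared whenever the measured height exceeds a threshold. The threshold, obstacle height, and noise levels below are illustrative assumptions, not the report's actual model:

```python
import math

def gauss_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def p_detect(h, threshold, sigma):
    """P(measured height > threshold) when true height is h, noise N(0, sigma^2)."""
    return 1 - gauss_cdf((threshold - h) / sigma)

def p_false_alarm(threshold, sigma):
    """Same decision rule applied to flat ground (true height 0)."""
    return 1 - gauss_cdf(threshold / sigma)

# Range noise grows with distance; suppose sigma is 2 cm nearby, 8 cm far away.
for sigma in (2.0, 8.0):
    print(round(p_detect(h=10.0, threshold=5.0, sigma=sigma), 3),
          round(p_false_alarm(threshold=5.0, sigma=sigma), 3))
```

The trade-off the report quantifies shows up directly: as noise grows with distance, detection probability falls while false alarms rise for the same threshold.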

  17. Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic.

    PubMed

    Wang, Ming; Long, Qi

    2016-09-01

    Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
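In its simplest unweighted (Harrell) form, the c-statistic for censored data is the fraction of usable pairs that the risk score orders correctly; the IPCW estimators studied in the paper reweight these pairs using an estimate of the censoring distribution. A bare sketch of the unweighted version, with invented data:

```python
def harrell_c(times, events, risk):
    """Unweighted concordance for right-censored data.

    A pair (i, j) is usable if the subject with the shorter time had an
    event; it is concordant if that subject also has the higher risk score.
    """
    num = den = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                den += 1
                if risk[i] > risk[j]:
                    num += 1
                elif risk[i] == risk[j]:
                    num += 0.5  # ties in risk count as half-concordant
    return num / den

# Invented follow-up (months), event flags (1 = recurrence), model risk scores
print(harrell_c(times=[5, 10, 15, 20], events=[1, 1, 0, 0],
                risk=[0.9, 0.6, 0.7, 0.2]))  # → 0.8
```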

  18. Virtual and stereoscopic anatomy: when virtual reality meets medical education.

    PubMed

    de Faria, Jose Weber Vieira; Teixeira, Manoel Jacobsen; de Moura Sousa Júnior, Leonardo; Otoch, Jose Pinhata; Figueiredo, Eberval Gadelha

    2016-11-01

    OBJECTIVE The authors sought to construct, implement, and evaluate an interactive and stereoscopic resource for teaching neuroanatomy, accessible from personal computers. METHODS Forty fresh brains (80 hemispheres) were dissected. Images of areas of interest were captured using a manual turntable and processed and stored in a 5337-image database. Pedagogic evaluation was performed with 84 graduate medical students, divided into 3 groups: 1 (conventional method), 2 (interactive nonstereoscopic), and 3 (interactive and stereoscopic). The method was evaluated through a written theory test and a lab practicum. RESULTS Groups 2 and 3 showed the highest mean scores in pedagogic evaluations and differed significantly from Group 1 (p < 0.05). Group 2 did not differ statistically from Group 3 (p > 0.05). Effect sizes, measured as differences in scores before and after lectures, indicate the effectiveness of the method. ANOVA results showed a significant difference (p < 0.05) between groups, and the Tukey test showed statistical differences between Group 1 and the other 2 groups (p < 0.05). No statistical differences between Groups 2 and 3 were found in the practicum. However, there were significant differences when Groups 2 and 3 were compared with Group 1 (p < 0.05). CONCLUSIONS The authors conclude that this method promoted further improvement in knowledge for students and fostered significantly higher learning when compared with traditional teaching resources.

  19. Learning process mapping heuristics under stochastic sampling overheads

    NASA Technical Reports Server (NTRS)

    Ieumwananonthachai, Arthur; Wah, Benjamin W.

    1991-01-01

    A statistical method was developed previously for improving process mapping heuristics. The method systematically explores the space of possible heuristics under a specified time constraint. Its goal is to obtain the best possible heuristics while trading off the solution quality of the process mapping heuristics against their execution time. Here, the statistical selection method is extended to account for variations in the amount of time needed to evaluate heuristics on a problem instance. The resulting performance improvement is presented under this more realistic assumption, along with methods that alleviate the additional complexity.

  20. Color stability and degree of cure of direct composite restoratives after accelerated aging.

    PubMed

    Sarafianou, Aspasia; Iosifidou, Soultana; Papadopoulos, Triantafillos; Eliades, George

    2007-01-01

    This study evaluated the color changes and amount of remaining C = C bonds (%RDB) in three dental composites after hydrothermal- and photoaging. The materials tested were Estelite sigma, Filtek Supreme and Tetric Ceram. Specimens were fabricated from each material and subjected to L* a* b* colorimetry and FTIR spectroscopy before and after aging. Statistical evaluation of the deltaL*, deltaa*, deltab*, deltaE and %deltaRDB data was performed by one-way ANOVA and Tukey's test. The %RDB data before and after aging were statistically analyzed using two-way ANOVA and Student-Newman-Keuls test. In all cases an alpha = 0.05 significance level was used. No statistically significant differences were found in deltaL*, deltaa*, deltaE and %deltaRDB among the materials tested. Tetric Ceram demonstrated a significant difference in deltab*. All the materials showed visually perceptible (deltaE > 1) but clinically acceptable values (deltaE < 3.3). Within each material group, statistically significant differences in %RDB were noticed before and after aging (p < 0.05). Filtek Supreme presented the lowest %RDB before aging, with Tetric Ceram presenting the lowest %RDB after aging (p < 0.05). The %deltaRDB mean values were statistically significantly different among all the groups tested. No correlation was found between deltaE and %deltaRDB.
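The deltaE reported here is the CIE76 color difference: the Euclidean distance between two points in L*a*b* space, judged against the perceptibility (deltaE > 1) and clinical-acceptability (deltaE < 3.3) cutoffs. A minimal sketch with invented coordinates:

```python
import math

def delta_e(lab1, lab2):
    """CIE76 color difference: Euclidean distance in L*a*b* space."""
    return math.dist(lab1, lab2)

before = (72.0, 1.5, 18.0)  # invented L*, a*, b* of a specimen before aging
after = (70.8, 1.9, 19.6)   # ... and after accelerated aging
de = delta_e(before, after)
# perceptible (deltaE > 1) but clinically acceptable (deltaE < 3.3)?
print(round(de, 2), 1 < de < 3.3)
```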

  1. An automated system for chromosome analysis. Volume 1: Goals, system design, and performance

    NASA Technical Reports Server (NTRS)

    Castleman, K. R.; Melnyk, J. H.

    1975-01-01

    The design, construction, and testing of a complete system to produce karyotypes and chromosome measurement data from human blood samples, and a basis for statistical analysis of quantitative chromosome measurement data is described. The prototype was assembled, tested, and evaluated on clinical material and thoroughly documented.

  2. Effectiveness of an Online Simulation for Teacher Education

    ERIC Educational Resources Information Center

    Badiee, Farnaz; Kaufman, David

    2014-01-01

    This study evaluated the effectiveness of the "simSchool" (v.1) simulation as a tool for preparing student teachers for actual classroom teaching. Twenty-two student teachers used the simulation for a practice session and two test sessions; data included objective performance statistics generated by the simulation program, self-rated…

  3. Developing and Assessing E-Learning Techniques for Teaching Forecasting

    ERIC Educational Resources Information Center

    Gel, Yulia R.; O'Hara Hines, R. Jeanette; Chen, He; Noguchi, Kimihiro; Schoner, Vivian

    2014-01-01

    In the modern business environment, managers are increasingly required to perform decision making and evaluate related risks based on quantitative information in the face of uncertainty, which in turn increases demand for business professionals with sound skills and hands-on experience with statistical data analysis. Computer-based training…

  4. Use Of Statistical Tools To Evaluate The Reductive Dechlorination Of High Levels Of TCE In Microcosm Studies

    EPA Science Inventory

    A large, multi-laboratory microcosm study was performed to select amendments for supporting reductive dechlorination of high levels of trichloroethylene (TCE) found at an industrial site in the United Kingdom (UK) containing dense non-aqueous phase liquid (DNAPL) TCE. The study ...

  5. Evaluating Teachers and Schools Using Student Growth Models

    ERIC Educational Resources Information Center

    Schafer, William D.; Lissitz, Robert W.; Zhu, Xiaoshu; Zhang, Yuan; Hou, Xiaodong; Li, Ying

    2012-01-01

    Interest in Student Growth Modeling (SGM) and Value Added Modeling (VAM) arises from educators concerned with measuring the effectiveness of teaching and other school activities through changes in student performance as a companion and perhaps even an alternative to status. Several formal statistical models have been proposed for year-to-year…

  6. The New "LJ" Index

    ERIC Educational Resources Information Center

    Lance, Keith Curry; Lyons, Ray

    2008-01-01

    As published critics of Hennen's American Public Library Ratings (HAPLR), the authors propose a new ranking system that focuses more transparently on ranking libraries based on their performance. These annual rankings are intended to contribute to self-evaluation and peer comparison, prompt questions about the statistics and how to improve them,…

  7. Wavelet methodology to improve single unit isolation in primary motor cortex cells

    PubMed Central

    Ortiz-Rosario, Alexis; Adeli, Hojjat; Buford, John A.

    2016-01-01

    The proper isolation of action potentials recorded extracellularly from neural tissue is an active area of research in the fields of neuroscience and biomedical signal processing. This paper presents an isolation methodology for neural recordings using the wavelet transform (WT), a statistical thresholding scheme, and the principal component analysis (PCA) algorithm. The effectiveness of five different mother wavelets was investigated: biorthogonal, Daubechies, discrete Meyer, symmetric, and Coifman; along with three different wavelet coefficient thresholding schemes: fixed form threshold, Stein’s unbiased estimate of risk, and minimax; and two different thresholding rules: soft and hard thresholding. The signal quality was evaluated using three different statistical measures: mean-squared error, root-mean squared, and signal to noise ratio. The clustering quality was evaluated using two different statistical measures: isolation distance, and L-ratio. This research shows that the selection of the mother wavelet has a strong influence on the clustering and isolation of single unit neural activity, with the Daubechies 4 wavelet and minimax thresholding scheme performing the best. PMID:25794461
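The thresholding rules compared above are one-liners once the wavelet detail coefficients are in hand. A sketch of soft versus hard thresholding under the fixed-form ("universal") threshold sigma*sqrt(2 ln N); the coefficients and the noise estimate sigma are invented (in practice sigma is estimated from the finest-scale coefficients):

```python
import math

def universal_threshold(coeffs, sigma):
    """Fixed-form ("universal") threshold: sigma * sqrt(2 * ln N)."""
    return sigma * math.sqrt(2 * math.log(len(coeffs)))

def soft(c, t):
    """Shrink toward zero: kills small coefficients, shrinks large ones."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def hard(c, t):
    """Keep-or-kill: coefficients at or below the threshold are zeroed."""
    return c if abs(c) > t else 0.0

coeffs = [4.0, -0.3, 0.1, -2.5, 0.05, 0.2, -0.15, 3.1]  # invented detail coeffs
t = universal_threshold(coeffs, sigma=0.5)
print([round(soft(c, t), 2) for c in coeffs])
print([hard(c, t) for c in coeffs])
```

Soft thresholding gives smoother denoised signals at the cost of biasing large spikes toward zero; hard thresholding preserves spike amplitude but keeps more noise.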

  8. Effect of the Oxidant-Antioxidant System in Seminal Plasma on Varicocele and Idiopathic Infertility in Male Humans.

    PubMed

    Yazar, Hayrullah; Halis, Fikret; Nasir, Yasemin; Guzel, Derya; Akdogan, Mehmet; Gokce, Ahmet

    2017-05-01

    The aim of this study was to investigate seminal oxidant-antioxidant activity in idiopathic and varicocele infertility in men. Total anti-oxidant capacity (TAC), total oxidant status (TOS), paraoxonase (PON1), aryl esterase (ARE), and total thiol levels (TTL) were measured in seminal plasma with an autoanalyzer. The TOS/TAC ratio was determined as the oxidative stress index (OSI). A histopathological evaluation of the sperm was performed in the andrology laboratory of the hospital. Number, motility, morphology, volume, pH, and leukocytes were evaluated in all samples according to World Health Organization criteria. The three study groups were as follows: G1, males with idiopathic infertility; G2, males with varicocele infertility; and G3, normal healthy males (had fathered a child in the last 2 years). Each group was composed of 36 men (age, 25 - 40 years). The Rel Assay Diagnostics kit was used to determine the levels of the parameters. The study was conducted according to the principles of the Declaration of Helsinki and was approved by Sakarya University Medicine Faculty Ethics Committee (e.n: 16214662/050.01.04/07). Statistical significance was assumed if p < 0.05. All statistical evaluations were performed using SPSS (version 20.0 for Windows; SPSS, Inc., Chicago, IL, USA). No differences were detected between the mean values of antioxidant parameters among the three groups (Kruskal-Wallis test). The p-values of the test parameters (TAC, TOS, PON1, ARE, TTL, OSI) are respectively: 0.494, 0.548, 0.068, 0.151, 0.202, 0.873. The antioxidant parameters of all subjects were compared using the Mann-Whitney U-test between two groups: fertile (G3) and infertile (G1 + G2). The PON1 levels in infertile subjects were significantly higher than those in fertile subjects. There was a statistically significant difference (p = 0.042). The other antioxidant parameters showed no statistically significant differences (p > 0.05). 
The ARE was not performed in group 3 (control) due to a methodological problem. PON1 levels in infertile subjects were significantly higher than those of fertile subjects.

  9. [Evaluation of the capacity of the APR-DRG classification system to predict hospital mortality].

    PubMed

    De Marco, Maria Francesca; Lorenzoni, Luca; Addari, Piero; Nante, Nicola

    2002-01-01

    Inpatient mortality has increasingly been used as a hospital outcome measure. Comparing mortality rates across hospitals requires adjustment for patient risks before making inferences about quality of care based on patient outcomes. It is therefore essential to have well-performing severity measures. The aim of this study is to evaluate the ability of the All Patient Refined DRG (APR-DRG) system to predict inpatient mortality for congestive heart failure, myocardial infarction, pneumonia and ischemic stroke. Administrative records were used in this analysis. We used two statistical methods to assess the ability of the APR-DRG to predict mortality: the area under the receiver operating characteristic curve (referred to as the c-statistic) and the Hosmer-Lemeshow test. The database for the study included 19,212 discharges for stroke, pneumonia, myocardial infarction and congestive heart failure from fifteen hospitals participating in the Italian APR-DRG Project. A multivariate analysis was performed to predict mortality for each condition in study using age, sex and APR-DRG risk-of-mortality subclass as independent variables. Inpatient mortality rate ranges from 9.7% (pneumonia) to 16.7% (stroke). Model discrimination, calculated using the c-statistic, was 0.91 for myocardial infarction, 0.68 for stroke, 0.78 for pneumonia and 0.71 for congestive heart failure. The model calibration assessed using the Hosmer-Lemeshow test was quite good. The performance of the APR-DRG scheme when used on Italian hospital activity records is similar to that reported in the literature, and it seems to improve when age and sex are added to the model. The APR-DRG system does not completely capture the effects of these variables. In some cases, the better performance might be due to the inclusion of specific complications in the risk-of-mortality subclass assignment.
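The Hosmer-Lemeshow calibration check groups subjects by predicted probability (typically into deciles) and compares observed with expected event counts in each group; the statistic is referred to a chi-square distribution with g - 2 degrees of freedom. A minimal sketch with invented predictions (a well-calibrated model yields a statistic near zero):

```python
def hosmer_lemeshow(probs, outcomes, g=10):
    """Hosmer-Lemeshow statistic over g risk groups (compare to chi2, g-2 df).

    Each group must mix event and non-event risk (group mean probability
    strictly between 0 and 1), otherwise the denominator vanishes.
    """
    ranked = sorted(zip(probs, outcomes))
    size = len(ranked) // g
    stat = 0.0
    for k in range(g):
        grp = ranked[k * size:(k + 1) * size] if k < g - 1 else ranked[(g - 1) * size:]
        n = len(grp)
        expected = sum(p for p, _ in grp)   # sum of predicted probabilities
        observed = sum(y for _, y in grp)   # observed event count
        p_bar = expected / n
        stat += (observed - expected) ** 2 / (n * p_bar * (1 - p_bar))
    return stat

# Invented predicted mortality risks and outcomes; two groups for brevity
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
deaths = [0, 0, 0, 1, 0, 1, 1, 1]
print(hosmer_lemeshow(probs, deaths, g=2))  # near 0: well calibrated
```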

  10. [Diversity and frequency of scientific research design and statistical methods in the "Arquivos Brasileiros de Oftalmologia": a systematic review of the "Arquivos Brasileiros de Oftalmologia"--1993-2002].

    PubMed

    Crosta, Fernando; Nishiwaki-Dantas, Maria Cristina; Silvino, Wilmar; Dantas, Paulo Elias Correa

    2005-01-01

    To verify the frequency of study designs, applied statistical analyses and approval by institutional review offices (Ethics Committees) of articles published in the "Arquivos Brasileiros de Oftalmologia" during a 10-year interval, with subsequent comparative and critical analysis against some of the main international journals in the field of ophthalmology. A systematic review without meta-analysis was performed. Scientific papers published in the "Arquivos Brasileiros de Oftalmologia" between January 1993 and December 2002 were reviewed by two independent reviewers and classified according to the applied study design, statistical analysis and approval by institutional review offices. A descriptive statistical analysis was used to categorize these variables. After applying inclusion and exclusion criteria, 584 articles were reviewed for evaluation of statistical analysis and 725 articles for evaluation of study design. Contingency tables (23.10%) were the most frequently applied statistical method, followed by non-parametric tests (18.19%), Student's t test (12.65%), central tendency measures (10.60%) and analysis of variance (9.81%). Of the 584 reviewed articles, 291 (49.82%) presented no statistical analysis. Observational case series (26.48%) was the most frequently used type of study design, followed by interventional case series (18.48%), observational case description (13.37%), non-random clinical study (8.96%) and experimental study (8.55%). We found a high frequency of observational clinical studies and a lack of statistical analysis in almost half of the published papers. An increase in studies with Ethics Committee approval was noted after it became mandatory in 1996.

  11. Performance evaluation of 388 full-scale waste stabilization pond systems with seven different configurations.

    PubMed

    Espinosa, Maria Fernanda; von Sperling, Marcos; Verbyla, Matthew E

    2017-02-01

    Waste stabilization ponds (WSPs) and their variants are one of the most widely used wastewater treatment systems in the world. However, the scarcity of systematic performance data from full-scale plants has led to challenges associated with their design. The objective of this research was to assess the performance of 388 full-scale WSP systems located in Brazil, Ecuador, Bolivia and the United States through the statistical analysis of available monitoring data. Descriptive statistics were calculated for the influent and effluent concentrations and the removal efficiencies for 5-day biochemical oxygen demand (BOD5), total suspended solids (TSS), ammonia nitrogen (N-Ammonia), and either thermotolerant coliforms (TTC) or Escherichia coli for each WSP system, leading to a broad characterization of actual treatment performance. Compliance with different water quality and system performance goals was also evaluated. The treatment plants were subdivided into seven different categories according to their units and flowsheet. The median influent concentrations of BOD5 and TSS were 431 mg/L and 397 mg/L; the effluent concentrations varied from technology to technology, but median values were 50 mg/L and 47 mg/L, respectively. The median removal efficiencies were 85% for BOD5 and 75% for TSS. The overall removals of TTC and E. coli were 1.74 and 1.63 log10 units, respectively. Future research is needed to better understand the influence of design, operational and environmental factors on WSP system performance.
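
    The log10-unit removal figures quoted for TTC and E. coli convert to concentration ratios as follows; the influent and effluent counts below are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

influent_ecoli = 1.0e7   # organisms/100 mL (assumed influent count)
effluent_ecoli = 2.3e5   # organisms/100 mL (assumed effluent count)

# a removal of k log10 units means effluent = influent / 10**k
log_removal = math.log10(influent_ecoli / effluent_ecoli)
percent_removal = (1 - effluent_ecoli / influent_ecoli) * 100
print(round(log_removal, 2), "log10 units =", round(percent_removal, 1), "% removal")
```

    By the same arithmetic, the 1.74 log10 removal reported for TTC corresponds to roughly 98.2% removal.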

  12. An entropy-based statistic for genomewide association studies.

    PubMed

    Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao

    2005-07-01

    Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard chi2 statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard chi2 statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard chi2 statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard chi2 statistic.
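
    To make the contrast concrete, the sketch below compares the standard chi-square statistic on a 2x2 case-control allele table with a simple nonlinear contrast built from Shannon entropy H(p) = -sum p*log(p). The counts are invented and the entropy contrast is only a schematic of the idea; the authors' actual test statistic has its own specific form.

```python
import numpy as np
from scipy import stats

table = np.array([[180, 220],    # cases:    allele A, allele a (invented)
                  [140, 260]])   # controls: allele A, allele a (invented)

# standard chi-square: a quadratic function of allele-frequency differences
chi2, p_chi2, dof, _ = stats.chi2_contingency(table, correction=False)

def shannon(p):
    """Shannon entropy of a frequency vector, in nats."""
    p = np.asarray(p, float)
    p = p[p > 0]
    return -(p * np.log(p)).sum()

# nonlinear contrast: difference of entropies of case/control allele frequencies
freq_cases = table[0] / table[0].sum()
freq_controls = table[1] / table[1].sum()
entropy_gap = abs(shannon(freq_cases) - shannon(freq_controls))

print(round(chi2, 2), round(entropy_gap, 4))
```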

  13. [Nursing care time in a teaching hospital].

    PubMed

    Rogenski, Karin Emília; Fugulin, Fernanda Maria Togeiro; Gaidzinski, Raquel Rapone; Rogenski, Noemi Marisa Brunet

    2011-03-01

    This quantitative, exploratory, descriptive study was performed with the objective of identifying and analyzing the average time of nursing care delivered to patients in the inpatient units of the University Hospital of the University of São Paulo (HU-USP) from 2001 to 2005. The average nursing care time delivered to patients of the referred units was identified by applying a mathematical equation proposed in the literature, after surveying data from the Medical and Statistical Service and the monthly working shifts of the nursing professionals. Data analysis was performed using descriptive statistics. The average nursing care time observed in most units, despite some variations, remained stable during the analyzed period. Based on this observed stability, it is concluded that the nursing staff in the referred HU-USP units has been continuously evaluated with the purpose of maintaining the average time of assistance and, thus, the quality of the care being delivered.

  14. Psychophysical Map Stability in Bilateral Sequential Cochlear Implantation: Comparing Current Audiology Methods to a New Statistical Definition.

    PubMed

    Domville-Lewis, Chloe; Santa Maria, Peter L; Upson, Gemma; Chester-Browne, Ronel; Atlas, Marcus D

    2015-01-01

    The purpose of this study was to establish a statistical definition for stability in cochlear implant maps. Once defined, this study aimed to compare the duration taken to achieve a stable map in first and second implants in patients who underwent sequential bilateral cochlear implantation. This article also sought to evaluate a number of factors that potentially affect map stability. This was a retrospective cohort study of 33 patients with sensorineural hearing loss who received sequential bilateral cochlear implantation (Cochlear, Sydney, Australia), performed by the senior author. Psychophysical parameters of hearing threshold scores, comfort scores, and the dynamic range were measured for the apical, medial, and basal portions of the cochlear implant electrode at a range of intervals postimplantation. Stability was defined statistically as a less than 10% difference in threshold, comfort, and dynamic range scores over three consecutive mapping sessions. A senior cochlear implant audiologist, blinded to implant order and the statistical results, separately analyzed these psychophysical map parameters using current assessment methods. First and second implants were compared for duration to achieve stability; age, gender, duration of deafness, etiology of deafness, time between the insertion of the first and second implants, and the presence or absence of preoperative hearing aids were evaluated for their relationship to stability. Statistical analysis included two-tailed Student's t tests and least squares regression analysis, with statistical significance set at p ≤ 0.05. There was a significant positive correlation between the devised statistical definition and the current audiology methods for assessing stability, with a Pearson correlation coefficient r = 0.36 and a least squares regression slope (b) of 0.41, df(58), 95% confidence interval 0.07 to 0.55 (p = 0.004). 
The average duration from device switch-on to stability in the first implant was 87 days using current audiology methods and 81 days using the statistical definition, with no statistically significant difference between assessment methods (p = 0.2). The duration to achieve stability in the second implant was 51 days using current audiology methods and 60 days using the statistical method, again with no statistically significant difference between the two assessment methods (p = 0.13). There was a significant reduction in the time to achieve stability in second implants for both audiology and statistical methods (p < 0.001 and p = 0.02, respectively). There was a difference in the duration to achieve stability by electrode array region, with basal portions taking longer to stabilize than apical portions in the first implant (p = 0.02) and than both apical and medial segments in second implants (p = 0.004 and p = 0.01, respectively). None of the factors evaluated in this study, including gender, age, etiology of deafness, duration of deafness, time between implant insertions, and preoperative hearing aid status, were correlated with stability duration under either stability assessment method. Our statistical definition can accurately predict cochlear implant map stability when compared with current audiology practices. Cochlear implants that are implanted second tend to stabilize sooner than the first, which has a significant impact on counseling before a second implant. No factor evaluated affected the duration required to achieve stability in this study.

  15. The effectiveness of repeat lumbar transforaminal epidural steroid injections.

    PubMed

    Murthy, Naveen S; Geske, Jennifer R; Shelerud, Randy A; Wald, John T; Diehn, Felix E; Thielen, Kent R; Kaufmann, Timothy J; Morris, Jonathan M; Lehman, Vance T; Amrami, Kimberly K; Carter, Rickey E; Maus, Timothy P

    2014-10-01

    The aim of this study was to determine 1) if repeat lumbar transforaminal epidural steroid injections (TFESIs) resulted in recovery of pain relief, which has waned since an index injection, and 2) if cumulative benefit could be achieved by repeat injections within 3 months of the index injection. Retrospective observational study with statistical modeling of the response to repeat TFESI. Academic radiology practice. Two thousand eighty-seven single-level TFESIs were performed for radicular pain on 933 subjects. Subjects received repeat TFESIs >2 weeks and <1 year from the index injection. Hierarchical linear modeling was performed to evaluate changes in continuous and categorical pain relief outcomes after repeat TFESI. Subgroup analyses were performed on patients with <3 months duration of pain (acute pain), patients receiving repeat injections within 3 months (clustered injections), and in patients with both acute pain and clustered injections. Repeat TFESIs achieved pain relief in both continuous and categorical outcomes. Relative to the index injection, there was a minimal but statistically significant decrease in pain relief in modeled continuous outcome measures with subsequent injections. Acute pain patients recovered all prior benefit with a statistically significant cumulative benefit. Patients receiving clustered injections achieved statistically significant cumulative benefit, of greater magnitude in acute pain patients. Repeat TFESI may be performed for recurrence of radicular pain with the expectation of recovery of most or all previously achieved benefit; acute pain patients will likely recover all prior benefit. Repeat TFESIs within 3 months of the index injection can provide cumulative benefit. Wiley Periodicals, Inc.

  16. Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.

    PubMed

    Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A

    2010-11-01

    The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P = 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.

  17. MR Spectroscopy to Distinguish between Supratentorial Intraventricular Subependymoma and Central Neurocytoma.

    PubMed

    Ueda, Fumiaki; Aburano, Hiroyuki; Ryu, Yasuji; Yoshie, Yuichi; Nakada, Mitsutoshi; Hayashi, Yutaka; Matsui, Osamu; Gabata, Toshifumi

    2017-07-10

    The purpose of this study was to discriminate supratentorial intraventricular subependymoma (SIS) from central neurocytoma (CNC) using magnetic resonance spectroscopy (MRS). Single-voxel proton MRS spectra, acquired with a 1.5T or 3T MR scanner from five SISs, five CNCs, and normal controls, were evaluated. They were examined using a point-resolved spectroscopy sequence. Automatically calculated ratios comparing choline (Cho), N-acetylaspartate (NAA), myoinositol (MI), and/or glycine (Gly) to creatine (Cr) were determined. Evaluation of Cr relative to unsuppressed water (USW) was also performed. The Mann-Whitney U test was carried out to test the significance of differences in the metabolite ratios. Detectability of lactate (Lac) and alanine (Ala) was evaluated. Although a statistically significant difference (P < 0.0001) was observed in Cho/Cr among SIS, control spectra, and CNC, no statistical difference was noted between SIS and control spectra (P = 0.11). Statistically significant differences were observed in NAA/Cr between SIS and CNC (P = 0.04) or control spectra (P < 0.0001). A statistically significant difference was observed in MI and/or Gly to Cr between SIS and control spectra (P = 0.03), and between CNC and control spectra (P < 0.0006). There were no statistical differences between SIS and CNC for MI and/or Gly to Cr (P = 0.32). Significant statistical differences were found between SIS and control spectra (P < 0.0053), control spectra and CNC (P < 0.0016), and SIS and CNC (P < 0.0083) for Cr to USW. Lac inverted doublets were confirmed in two SISs. Triplets of Lac and Ala were detected in four spectra of CNC. The present study showed that MRS can be useful in discriminating SIS from CNC.

  18. Comparative Evaluation of Microleakage Between Nano-Ionomer, Giomer and Resin Modified Glass Ionomer Cement in Class V Cavities- CLSM Study

    PubMed Central

    Hari, Archana; Thumu, Jayaprakash; Velagula, Lakshmi Deepa; Bolla, Nagesh; Varri, Sujana; Kasaraneni, Srikanth; Nalli, Siva Venkata Malathi

    2016-01-01

    Introduction Marginal integrity of adhesive restorative materials provides better sealing ability for enamel and dentin and plays an important role in the success of restorations in Class V cavities. A restorative material with good marginal adaptation improves the longevity of restorations. Aim The aim of this study was to evaluate microleakage in Class V cavities restored with Resin Modified Glass Ionomer Cement (RMGIC), Giomer and Nano-Ionomer. Materials and Methods This in-vitro study was performed on 60 human maxillary and mandibular premolars extracted for orthodontic reasons. A standard wedge-shaped defect was prepared on the buccal surfaces of the teeth with the gingival margin placed near the Cemento Enamel Junction (CEJ). Teeth were divided into three groups of 20 each, restored with RMGIC, Giomer and Nano-Ionomer, and subjected to thermocycling. Teeth were then immersed in 0.5% Rhodamine B dye for 48 hours. They were sectioned longitudinally from the middle of the cavity into mesial and distal parts. The sections were observed under a Confocal Laser Scanning Microscope (CLSM) to evaluate microleakage. Depth of dye penetration was measured in millimeters. Statistical Analysis The data were analysed using the Kruskal-Wallis test. Pairwise comparison was done with the Mann-Whitney U test. A p-value <0.05 was taken as statistically significant. Results Nano-Ionomer showed significantly less microleakage than Giomer (p=0.0050). No statistically significant difference was found between Nano-Ionomer and RMGIC (p=0.3550). There was a statistically significant difference between RMGIC and Giomer (p=0.0450). Conclusion Nano-Ionomer and RMGIC showed significantly less leakage and better adaptation than Giomer, and there was no statistically significant difference between Nano-Ionomer and RMGIC. PMID:27437363
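
    A minimal sketch, with invented dye-penetration depths, of the two-stage nonparametric analysis described above: an omnibus Kruskal-Wallis test across the three materials, followed by a pairwise Mann-Whitney U test.

```python
from scipy import stats

# hypothetical dye-penetration depths (mm) for each material
nano   = [0.2, 0.3, 0.1, 0.4, 0.2, 0.3]
rmgic  = [0.3, 0.4, 0.2, 0.5, 0.3, 0.4]
giomer = [0.6, 0.8, 0.7, 0.9, 0.5, 0.7]

# omnibus test: do the three groups differ at all?
h, p_omnibus = stats.kruskal(nano, rmgic, giomer)

# pairwise follow-up: Nano-Ionomer vs Giomer
u, p_pair = stats.mannwhitneyu(nano, giomer, alternative="two-sided")

print("omnibus significant:", p_omnibus < 0.05)
print("nano vs giomer significant:", p_pair < 0.05)
```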

  19. Experimental evaluation of LED-based solar blind NLOS communication links.

    PubMed

    Chen, Gang; Abou-Galala, Feras; Xu, Zhengyuan; Sadler, Brian M

    2008-09-15

    Experimental results are reported demonstrating non-line of sight short-range ultraviolet communication link losses, and performance of photon counting detectors, operating in the solar blind spectrum regime. We employ light emitting diodes with divergent beams, a solar blind filter, and a wide field-of-view detector. Signal and noise statistics are characterized, and receiver performance is demonstrated. The effects of transmitter and receiver elevation angles, separation distance, and path loss are included.

  20. SWATH Mass Spectrometry Performance Using Extended Peptide MS/MS Assay Libraries.

    PubMed

    Wu, Jemma X; Song, Xiaomin; Pascovici, Dana; Zaw, Thiri; Care, Natasha; Krisp, Christoph; Molloy, Mark P

    2016-07-01

    The use of data-independent acquisition methods such as SWATH for mass spectrometry based proteomics is usually performed with peptide MS/MS assay libraries which enable identification and quantitation of peptide peak areas. Reference assay libraries can be generated locally through information dependent acquisition, or obtained from community data repositories for commonly studied organisms. However, there have been no studies performed to systematically evaluate how locally generated or repository-based assay libraries affect SWATH performance for proteomic studies. To undertake this analysis, we developed a software workflow, SwathXtend, which generates extended peptide assay libraries by integration with a local seed library and delivers statistical analysis of SWATH-quantitative comparisons. We designed test samples using peptides from a yeast extract spiked into peptides from human K562 cell lysates at three different ratios to simulate protein abundance change comparisons. SWATH-MS performance was assessed using local and external assay libraries of varying complexities and proteome compositions. These experiments demonstrated that local seed libraries integrated with external assay libraries achieve better performance than local assay libraries alone, in terms of the number of identified peptides and proteins and the specificity to detect differentially abundant proteins. Our findings show that the performance of extended assay libraries is influenced by the MS/MS feature similarity of the seed and external libraries, while statistical analysis using multiple testing corrections increases the statistical rigor needed when searching against large extended assay libraries. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  2. Estimating the Probability of Traditional Copying, Conditional on Answer-Copying Statistics.

    PubMed

    Allen, Jeff; Ghattas, Andrew

    2016-06-01

    Statistics for detecting copying on multiple-choice tests produce p values measuring the probability of a value at least as large as that observed, under the null hypothesis of no copying. The posterior probability of copying is arguably more relevant than the p value, but cannot be derived from Bayes' theorem unless the population probability of copying and probability distribution of the answer-copying statistic under copying are known. In this article, the authors develop an estimator for the posterior probability of copying that is based on estimable quantities and can be used with any answer-copying statistic. The performance of the estimator is evaluated via simulation, and the authors demonstrate how to apply the formula using actual data. Potential uses, generalizability to other types of cheating, and limitations of the approach are discussed.
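
    The Bayes computation referred to above can be written directly once the two "unknown" quantities are supplied: the population probability of copying (the prior) and the probability of exceeding the observed statistic under copying (the power). The sketch below is an illustration of that identity, not the authors' estimator, and the input values are hypothetical.

```python
def posterior_prob_copying(p_value, prior, power):
    """P(copying | S >= s) by Bayes' theorem, where
    p_value = P(S >= s | no copying),
    prior   = population rate of copying,
    power   = P(S >= s | copying)."""
    numerator = prior * power
    return numerator / (numerator + (1 - prior) * p_value)

# Hypothetical inputs: an answer-copying statistic with p = 0.001,
# a 2% population copying rate, and 60% power at the observed threshold.
print(round(posterior_prob_copying(p_value=0.001, prior=0.02, power=0.6), 3))
```

    With these inputs the posterior is about 0.92, illustrating the abstract's point that the posterior probability of copying is a different, and arguably more relevant, quantity than the p value itself.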

  3. A review of empirical research related to the use of small quantitative samples in clinical outcome scale development.

    PubMed

    Houts, Carrie R; Edwards, Michael C; Wirth, R J; Deal, Linda S

    2016-11-01

    There has been a notable increase in the advocacy of using small-sample designs as an initial quantitative assessment of item and scale performance during the scale development process. This is particularly true in the development of clinical outcome assessments (COAs), where Rasch analysis has been advanced as an appropriate statistical tool for evaluating the developing COAs using a small sample. We review the benefits such methods are purported to offer from both a practical and statistical standpoint and detail several problematic areas, including both practical and statistical theory concerns, with respect to the use of quantitative methods, including Rasch-consistent methods, with small samples. The feasibility of obtaining accurate information and the potential negative impacts of misusing large-sample statistical methods with small samples during COA development are discussed.

  4. Statistical assessment of crosstalk enrichment between gene groups in biological networks.

    PubMed

    McCormack, Theodore; Frings, Oliver; Alexeyenko, Andrey; Sonnhammer, Erik L L

    2013-01-01

    Analyzing groups of functionally coupled genes or proteins in the context of global interaction networks has become an important aspect of bioinformatic investigations. Assessing the statistical significance of crosstalk enrichment between or within groups of genes can be a valuable tool for functional annotation of experimental gene sets. Here we present CrossTalkZ, a statistical method and software to assess the significance of crosstalk enrichment between pairs of gene or protein groups in large biological networks. We demonstrate that the standard z-score is generally an appropriate and unbiased statistic. We further evaluate the ability of four different methods to reliably recover crosstalk within known biological pathways. We conclude that the methods preserving the second-order topological network properties perform best. Finally, we show how CrossTalkZ can be used to annotate experimental gene sets using known pathway annotations and that its performance at this task is superior to gene enrichment analysis (GEA). CrossTalkZ (available at http://sonnhammer.sbc.su.se/download/software/CrossTalkZ/) is implemented in C++, easy to use, fast, accepts various input file formats, and produces a number of statistics. These include z-score, p-value, false discovery rate, and a test of normality for the null distributions.
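
    A toy sketch of the underlying z-score idea (not CrossTalkZ itself): count network links between two gene groups and standardize against a permutation null obtained by shuffling node labels. Note that plain label shuffling does not preserve the second-order (degree) structure that the abstract identifies as important for reliable results.

```python
import random
import statistics

# toy undirected network and two hypothetical gene groups
edges = [("g1", "g5"), ("g2", "g6"), ("g3", "g7"), ("g1", "g6"),
         ("g4", "g8"), ("g2", "g5"), ("g3", "g8"), ("g5", "g9"),
         ("g6", "g10")]
nodes = sorted({n for e in edges for n in e})
group_a = {"g1", "g2", "g3", "g4"}
group_b = {"g5", "g6", "g7", "g8"}

def crosstalk(a, b):
    """Number of edges with one endpoint in each group."""
    return sum((u in a and v in b) or (u in b and v in a) for u, v in edges)

observed = crosstalk(group_a, group_b)

random.seed(1)
null = []
for _ in range(2000):                      # node-label permutation null
    shuffled = nodes[:]
    random.shuffle(shuffled)
    relabel = dict(zip(nodes, shuffled))
    null.append(crosstalk({relabel[g] for g in group_a},
                          {relabel[g] for g in group_b}))

z = (observed - statistics.mean(null)) / statistics.pstdev(null)
print("observed:", observed, "z:", round(z, 2))
```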

  5. Comparison of Artificial Neural Networks and ARIMA statistical models in simulations of target wind time series

    NASA Astrophysics Data System (ADS)

    Kolokythas, Kostantinos; Vasileios, Salamalikis; Athanassios, Argiriou; Kazantzidis, Andreas

    2015-04-01

    The wind is the result of complex interactions among numerous mechanisms taking place at small or large scales, so better knowledge of its behavior is essential in a variety of applications, especially in the field of power production from wind turbines. The literature contains a considerable number of models, either physical or statistical, dealing with the problem of simulation and prediction of wind speed. Among others, Artificial Neural Networks (ANNs) are widely used for the purpose of wind forecasting and, in the great majority of cases, outperform other conventional statistical models. In this study, a number of ANNs with different architectures, created and applied to a dataset of wind time series, are compared to Auto Regressive Integrated Moving Average (ARIMA) statistical models. The data consist of mean hourly wind speeds from a wind farm in a hilly Greek region and cover a period of one year (2013). The main goal is to evaluate the models' ability to successfully simulate the wind speed at a significant point (target). Goodness-of-fit statistics are computed for the comparison of the different methods. In general, the ANNs showed the best performance in the estimation of wind speed, outperforming the ARIMA models.
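
    As a small illustration of the kind of comparison described, the sketch below fits an AR(1) model (the simplest ARIMA special case) to a synthetic hourly wind-speed series and scores its one-step-ahead predictions with RMSE and MAE; the data and model order are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 24 * 365                               # one year of hourly values
wind = np.empty(n)
wind[0] = 6.0
for t in range(1, n):                      # synthetic AR(1) wind speeds (m/s)
    wind[t] = 1.2 + 0.8 * wind[t - 1] + rng.normal(0, 0.8)

x, y = wind[:-1], wind[1:]
phi, c = np.polyfit(x, y, 1)               # least-squares AR(1) coefficients
pred = c + phi * x                         # one-step-ahead predictions

rmse = np.sqrt(np.mean((y - pred) ** 2))
mae = np.mean(np.abs(y - pred))
print("phi:", round(phi, 2), "RMSE:", round(rmse, 2), "MAE:", round(mae, 2))
```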

  6. A Novel Performance Evaluation Methodology for Single-Target Trackers.

    PubMed

    Kristan, Matej; Matas, Jiri; Leonardis, Ales; Vojir, Tomas; Pflugfelder, Roman; Fernandez, Gustavo; Nebehay, Georg; Porikli, Fatih; Cehovin, Luka

    2016-11-01

    This paper addresses the problem of single-target tracker performance evaluation. We consider the performance measures, the dataset and the evaluation system to be the most important components of tracker evaluation and propose requirements for each of them. The requirements are the basis of a new evaluation methodology that aims at a simple and easily interpretable tracker comparison. The ranking-based methodology addresses tracker equivalence in terms of statistical significance and practical differences. A fully-annotated dataset with per-frame annotations with several visual attributes is introduced. The diversity of its visual properties is maximized in a novel way by clustering a large number of videos according to their visual attributes. This makes it the most sophisticatedly constructed and annotated dataset to date. A multi-platform evaluation system allowing easy integration of third-party trackers is presented as well. The proposed evaluation methodology was tested on the VOT2014 challenge on the new dataset and 38 trackers, making it the largest benchmark to date. Most of the tested trackers are indeed state-of-the-art since they outperform the standard baselines, resulting in a highly challenging benchmark. An exhaustive analysis of the dataset from the perspective of tracking difficulty is carried out. To facilitate tracker comparison, a new performance visualization technique is proposed.

  7. Performance map of a cluster detection test using extended power

    PubMed Central

    2013-01-01

    Background Conventional power studies possess limited ability to assess the performance of cluster detection tests. In particular, they cannot evaluate the accuracy of the cluster location, which is essential in such assessments. Furthermore, they usually estimate power for one or a few particular alternative hypotheses and thus cannot assess performance over an entire region. Takahashi and Tango developed the concept of extended power that indicates both the rate of null hypothesis rejection and the accuracy of the cluster location. We propose a systematic assessment method, using here extended power, to produce a map showing the performance of cluster detection tests over an entire region. Methods To explore the behavior of a cluster detection test on identical cluster types at any possible location, we successively applied four different spatial and epidemiological parameters. These parameters determined four cluster collections, each covering the entire study region. We simulated 1,000 datasets for each cluster and analyzed them with Kulldorff’s spatial scan statistic. From the area under the extended power curve, we constructed a map for each parameter set showing the performance of the test across the entire region. Results Consistent with previous studies, the performance of the spatial scan statistic increased with the baseline incidence of disease, the size of the at-risk population and the strength of the cluster (i.e., the relative risk). Performance was heterogeneous, however, even for very similar clusters (i.e., similar with respect to the aforementioned factors), suggesting the influence of other factors. Conclusions The area under the extended power curve is a single measure of performance and, although needing further exploration, it is suitable to conduct a systematic spatial evaluation of performance. The performance map we propose enables epidemiologists to assess cluster detection tests across an entire study region. PMID:24156765

  8. Dimensional Changes of Fresh Sockets With Reactive Soft Tissue Preservation: A Cone Beam CT Study.

    PubMed

    Crespi, Roberto; Capparé, Paolo; Crespi, Giovanni; Gastaldi, Giorgio; Gherlone, Enrico Felice

    2017-06-01

The aim of this study was to assess dimensional changes of fresh sockets grafted with collagen sheets, with maintenance of reactive soft tissue, using cone beam computed tomography (CBCT). Tooth extractions were performed with maximum preservation of the alveolar housing; reactive soft tissue was left in the sockets, and collagen sheets filled the bone defects. CBCT scans were performed before and 3 months after extraction. One hundred forty-five teeth, 60 monoradicular teeth and 85 molars, were extracted. In total, 269 alveoli were evaluated. In Group A, no statistically significant differences were found for monoradicular teeth, whereas statistically significant differences (P < 0.05) were found for molars, in both mesial and distal alveoli. In Group B, no statistically significant differences were found between maxillary and mandibular bone change values (P > 0.05) for any tooth type. This study suggests that atraumatic tooth extraction, with reactive soft tissue left in situ and a grafted collagen sponge, may help reduce fresh socket collapse after extraction procedures.

  9. Automated Cognitive Health Assessment From Smart Home-Based Behavior Data.

    PubMed

    Dawadi, Prafulla Nath; Cook, Diane Joyce; Schmitter-Edgecombe, Maureen

    2016-07-01

Smart home technologies offer potential benefits for assisting clinicians by automating health monitoring and well-being assessment. In this paper, we examine the actual benefits of smart home-based analysis by monitoring daily behavior in the home and predicting clinical scores of the residents. To accomplish this goal, we propose a clinical assessment using activity behavior (CAAB) approach to model a smart home resident's daily behavior and predict the corresponding clinical scores. CAAB uses statistical features that describe characteristics of a resident's daily activity performance to train machine learning algorithms that predict the clinical scores. We evaluate the performance of CAAB utilizing smart home sensor data collected from 18 smart homes over two years. We obtain a statistically significant correlation (r = 0.72) between CAAB-predicted and clinician-provided cognitive scores and a statistically significant correlation (r = 0.45) between CAAB-predicted and clinician-provided mobility scores. These prediction results suggest that it is feasible to predict clinical scores using smart home sensor data and learning-based data analysis.
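The reported associations are ordinary Pearson correlations between predicted and clinician-provided scores, with a significance test against zero correlation. A minimal sketch on synthetic stand-in data (every number below is made up, not from the study):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for clinician-provided and model-predicted scores.
n = 36
clinician = rng.normal(size=n)
predicted = 0.9 * clinician + rng.normal(scale=0.3, size=n)

# Pearson correlation between the two score series.
r = np.corrcoef(predicted, clinician)[0, 1]

# t statistic for H0: rho = 0; |t| well above ~2 indicates statistical
# significance at the 5% level for this sample size.
t = r * np.sqrt((n - 2) / (1 - r ** 2))
```

The same computation applied to the study's data would yield its r = 0.72 and r = 0.45 figures.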

  10. Statistical optimization of process parameters for lipase-catalyzed synthesis of triethanolamine-based esterquats using response surface methodology in 2-liter bioreactor.

    PubMed

    Masoumi, Hamid Reza Fard; Basri, Mahiran; Kassim, Anuar; Abdullah, Dzulkefly Kuang; Abdollahi, Yadollah; Abd Gani, Siti Salwa; Rezaee, Malahat

    2013-01-01

Lipase-catalyzed production of triethanolamine-based esterquat by esterification of oleic acid (OA) with triethanolamine (TEA) in n-hexane was performed in a 2 L stirred-tank reactor. A set of experiments was designed by central composite design for process modeling and statistical evaluation of the findings. Five independent process variables (enzyme amount, reaction time, reaction temperature, substrate molar ratio of OA to TEA, and agitation speed) were studied under conditions designed with the Design Expert software. Experimental data were examined for normality before the data processing stage, and skewness and kurtosis indices were determined. The mathematical model developed was found to be adequate and statistically accurate in predicting the optimum product conversion. Response surface methodology with central composite design gave the best performance in this study, and the methodology as a whole proved adequate for the design and optimization of the enzymatic process.

  11. Health data in Ontario: taking stock and moving ahead.

    PubMed

    Iron, Karey

    2006-01-01

    Ontario has been a leader in performance-reporting in clinical areas such as surgery, cardiac care and drug use in the elderly. Data used to report on these areas are readily available for performance evaluation and are of reasonable quality. But other key areas like managing chronic disease and preventive care cannot be fully evaluated because relevant data are either unavailable or of poor quality. A focus on timely access to good quality demographic and vital statistics data would enhance our ability to evaluate components of the Ontario health system. New comprehensive primary care, laboratory services and drug prescriptions data sources are also necessary for health-system evaluation and planning. In the short term, a dedicated, centralized agency with legislative authority is proposed to move Ontario's health information agenda forward in a holistic, strategic and timely manner.

  12. Performance comparison between total variation (TV)-based compressed sensing and statistical iterative reconstruction algorithms.

    PubMed

    Tang, Jie; Nett, Brian E; Chen, Guang-Hong

    2009-10-07

Of all available reconstruction methods, statistical iterative reconstruction algorithms appear particularly promising since they enable accurate physical noise modeling. The newly developed compressive sampling/compressed sensing (CS) algorithm has shown the potential to accurately reconstruct images from highly undersampled data. The CS algorithm can be implemented in the statistical reconstruction framework as well. In this study, we compared the performance of two standard statistical reconstruction algorithms (penalized weighted least squares and q-GGMRF) to the CS algorithm. In assessing the image quality using these iterative reconstructions, it is critical to utilize realistic background anatomy as the reconstruction results are object dependent. A cadaver head was scanned on a Varian Trilogy system at different dose levels. Several figures of merit, including the relative root mean square error and a quality factor which accounts for the noise performance and the spatial resolution, were introduced to objectively evaluate reconstruction performance. A comparison is presented between the three algorithms at a constant undersampling factor and at several dose levels. To facilitate this comparison, the original CS method was formulated in the framework of the statistical image reconstruction algorithms. Important conclusions of the measurements from our studies are that (1) for realistic neuro-anatomy, over 100 projections are required to avoid streak artifacts in the reconstructed images even with CS reconstruction, (2) regardless of the algorithm employed, it is beneficial to distribute the total dose to more views as long as each view remains quantum noise limited and (3) the total variation-based CS method is not appropriate for very low dose levels because while it can mitigate streaking artifacts, the images exhibit patchy behavior, which is potentially harmful for medical diagnosis.
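Of the figures of merit mentioned, the relative root mean square error is straightforward to state. The paper does not spell out its exact normalization, so this sketch uses one common convention, normalizing by the RMS of the reference image:

```python
import numpy as np

def relative_rmse(recon, reference):
    """RMS error of a reconstruction, normalized by the RMS of the
    reference image (one common 'relative RMSE' convention)."""
    recon = np.asarray(recon, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rmse = np.sqrt(np.mean((recon - reference) ** 2))
    return rmse / np.sqrt(np.mean(reference ** 2))
```

A perfect reconstruction gives 0; larger values indicate more error relative to the reference signal level, which makes the measure comparable across dose levels.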

  13. Enhanced Component Performance Study. Emergency Diesel Generators 1998–2013

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schroeder, John Alton

    2014-11-01

This report presents an enhanced performance evaluation of emergency diesel generators (EDGs) at U.S. commercial nuclear power plants. This report evaluates component performance over time using Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES) data from 1998 through 2013 and maintenance unavailability (UA) performance data using Mitigating Systems Performance Index (MSPI) Basis Document data from 2002 through 2013. The objective is to present an analysis of factors that could influence the system and component trends in addition to annual performance trends of failure rates and probabilities. The factors analyzed for the EDG component are the differences in failures between all demands and actual unplanned engineered safety feature (ESF) demands, differences among manufacturers, and differences among EDG ratings. Statistical analyses of these differences are performed, and the results show whether pooling is acceptable across these factors. In addition, engineering analyses were performed with respect to time period and failure mode. The factors analyzed are sub-component, failure cause, detection method, recovery, manufacturer, and EDG rating.

  14. Reliability and Validity of the Turkish Version of the Job Performance Scale Instrument.

    PubMed

    Harmanci Seren, Arzu Kader; Tuna, Rujnan; Eskin Bacaksiz, Feride

    2018-02-01

    Objective measurement of the job performance of nursing staff using valid and reliable instruments is important in the evaluation of healthcare quality. A current, valid, and reliable instrument that specifically measures the performance of nurses is required for this purpose. The aim of this study was to determine the validity and reliability of the Turkish version of the Job Performance Instrument. This study used a methodological design and a sample of 240 nurses working at different units in four hospitals in Istanbul, Turkey. A descriptive data form, the Job Performance Scale, and the Employee Performance Scale were used to collect data. Data were analyzed using IBM SPSS Statistics Version 21.0 and LISREL Version 8.51. On the basis of the data analysis, the instrument was revised. Some items were deleted, and subscales were combined. The Turkish version of the Job Performance Instrument was determined to be valid and reliable to measure the performance of nurses. The instrument is suitable for evaluating current nursing roles.

  15. Relationship of admissions variables and college of osteopathic medicine variables to performance on COMLEX-USA level 3.

    PubMed

    Baker, Helen H; Shuman, Victoria L; Ridpath, Lance C; Pence, Lorenzo L; Fisk, Robert M; Boisvert, Craig S

    2015-02-01

New accreditation standards require that all US colleges of osteopathic medicine (COMs) publicly report the first-time pass rates of graduates on the Comprehensive Osteopathic Medical Licensing Examination-USA (COMLEX-USA) Level 3. Little is known about the extent to which admissions variables or COM performance measures relate to Level 3 performance. This study examined the relationship of admissions variables and COM performance to scores on Level 3 and assessed whether a relationship existed between Level 3 scores and sex, curriculum track, year of graduation, and residency specialty in the first postgraduate year. Data were analyzed from 4 graduating classes (2008-2011) of the West Virginia School of Osteopathic Medicine in Lewisburg. Relationships were examined between first-attempt scores on COMLEX-USA Level 3 and Medical College Admission Test (MCAT) scores; undergraduate grade point averages (GPAs); GPAs in COM year 1, year 2, and clinical rotation years (years 3 and 4); and first-attempt scores on COMLEX-USA Level 1, Level 2-Cognitive Evaluation, and Level 2-Performance Evaluation. Of the 556 graduates during this 4-year period, COMLEX-USA Level 3 scores were available for 552 graduates (99.3%). No statistically significant differences were found in Level 3 scores based on sex, curriculum track, graduating class, or residency specialty. The strongest relationship between Level 3 scores and any admissions variable was with total MCAT score, which accounted for 4.2% of the variation in Level 3 scores. The strongest relationship between Level 3 scores and COM year performance measures was with year 2 GPA, which accounted for 35.4% of the variation in Level 3 scores. Level 1 scores accounted for 38.5% of the variation in Level 3 scores, and Level 2-Cognitive Evaluation scores accounted for the greatest percentage of variation (45.7%). 
The correlation of Level 3 scores with passing the Level 2-Performance Evaluation on the first attempt was not statistically significant. A weak relationship was found between admissions variables and performance on COMLEX-USA Level 3, suggesting that graduates with lower MCAT scores and undergraduate GPAs may have overcome their early disadvantage. Strong relationships were found between Level 3 scores and year 2 GPAs, as well as scores on COMLEX-USA Level 1 and Level 2-Cognitive Evaluation. © 2015 The American Osteopathic Association.
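The "percentage of variation accounted for" figures are squared correlations under the standard r² reading of such abstracts, so each percentage implies a correlation coefficient of √(r²). A quick check of that arithmetic:

```python
import math

# Reported percentages of variation in Level 3 scores accounted for,
# interpreted as 100 * r^2 and back-converted to correlations.
reported = {"MCAT total": 4.2, "Year 2 GPA": 35.4,
            "Level 1": 38.5, "Level 2-CE": 45.7}
implied_r = {name: math.sqrt(pct / 100) for name, pct in reported.items()}
# e.g. Level 2-CE's 45.7% corresponds to a correlation of about 0.68,
# while MCAT's 4.2% corresponds to only about 0.20.
```

This makes the contrast in the conclusions concrete: the admissions variable correlates weakly (r ≈ 0.2) while in-program measures correlate strongly (r ≈ 0.6-0.7).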

  16. Empirical evaluation of data normalization methods for molecular classification.

    PubMed

    Huang, Huei-Chung; Qin, Li-Xuan

    2018-01-01

Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers, an increasingly important application of microarrays in the era of personalized medicine. In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated in an independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy.
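The abstract does not name the three normalization methods, but quantile normalization is one widely used method of this kind; a hedged sketch for illustration:

```python
import numpy as np

def quantile_normalize(X):
    """Quantile normalization: force every sample (column) to share the
    same empirical distribution, namely the mean of the per-sample sorted
    values. Ties are broken arbitrarily by argsort."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    mean_quantiles = np.sort(X, axis=0).mean(axis=1)
    return mean_quantiles[ranks]

# Toy expression matrix: 4 markers (rows) x 3 samples (columns).
X = np.array([[5.0, 4.0, 3.0],
              [2.0, 1.0, 4.0],
              [3.0, 4.0, 6.0],
              [4.0, 2.0, 8.0]])
Xn = quantile_normalize(X)
# Every column of Xn now contains the same set of values.
```

Removing per-sample distributional differences is exactly how such methods can absorb confounding handling effects, for better or worse, in downstream classification.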

  17. Radiographic technical quality of root canal treatment performed ex vivo by dental students at Valencia University Medical and Dental School, Spain

    PubMed Central

    Faus-Matoses, Vicente; Alegre-Domingo, Teresa; Faus-Llácer, Vicente J.

    2014-01-01

Objectives: To evaluate radiographically the quality of root canal fillings and compare manual and rotary preparation performed on extracted teeth by undergraduate dental students. Study Design: A total of 561 extracted premolars and molars were prepared using nickel-titanium rotary files or manual instrumentation and filled with gutta-percha using a cold lateral condensation technique, by 4th-year undergraduate students. Periapical radiographs were used to assess the technical quality of the root canal filling, evaluating three variables: length, density and taper. These data were recorded, scored and used to study the “technical success rate” and the “overall score”. The length of each root canal filling was classified as acceptable, short or overfilled, based on its relationship with the radiographic apex. Density and taper of filling were evaluated based on the presence of voids and the uniform tapering of the filling, respectively. Statistical analysis was used to evaluate the quality of root canal treatment, considering p < 0.05 as the level of statistical significance. Results: The percentage of technical success was 44% and the overall score was 7.8 out of 10. Technical success and overall score were greater with rotary instruments (52% versus 28% with manual instrumentation, p < 0.001; overall score 8.3 versus 6.7, p < 0.001). Conclusions: It appears that inexperienced operators perform better root canal treatment (RCT) with the use of rotary instrumentation. Key words: Dental education, endodontics, rotary instrumentation, radiographs, root canal treatment, undergraduate students. PMID:24121911

  18. Evaluation of neutron total and capture cross sections on 99Tc in the unresolved resonance region

    NASA Astrophysics Data System (ADS)

    Iwamoto, Nobuyuki; Katabuchi, Tatsuya

    2017-09-01

The long-lived fission product technetium-99 is one of the most important radioisotopes for nuclear transmutation. Reliable nuclear data over a wide energy range, up to a few MeV, are indispensable for developing technology to reduce the environmental load. Statistical analyses of resolved resonances were performed using the truncated Porter-Thomas distribution, a coupled-channels optical model, a nuclear level density model, and Bayes' theorem on conditional probability. The total and capture cross sections were calculated with the nuclear reaction model code CCONE. The resulting cross sections are statistically consistent between the resolved and unresolved resonance regions. The evaluated capture data reproduce those recently measured at ANNRI of J-PARC/MLF above the resolved resonance region, up to 800 keV.

  19. Evidential evaluation of DNA profiles using a discrete statistical model implemented in the DNA LiRa software.

    PubMed

    Puch-Solis, Roberto; Clayton, Tim

    2014-07-01

The high sensitivity of the technology for producing profiles means that it has become routine to produce profiles from relatively small quantities of DNA. The profiles obtained from low template DNA (LTDNA) are affected by several phenomena which must be taken into consideration when interpreting and evaluating this evidence. Furthermore, many of the same phenomena affect profiles from higher amounts of DNA (e.g. where complex mixtures have been revealed). In this article we present a statistical model, which forms the basis of the DNA LiRa software, that is able to calculate likelihood ratios where one to four donors are postulated and for any number of replicates. The model can take into account drop-in and allelic drop-out for different contributors, template degradation and uncertain allele designations. In this statistical model, unknown parameters are treated following the empirical Bayesian paradigm. The performance of LiRa is tested using examples, and the outputs are compared with those generated using two other statistical software packages, likeLTD and LRmix. The concept of ban efficiency is introduced as a measure for assessing model sensitivity. Copyright © 2014. Published by Elsevier Ireland Ltd.

  20. A spatial scan statistic for multiple clusters.

    PubMed

    Li, Xiao-Zhou; Wang, Jin-Feng; Yang, Wei-Zhong; Li, Zhong-Jie; Lai, Sheng-Jie

    2011-10-01

Spatial scan statistics are commonly used for geographical disease surveillance and cluster detection. When multiple clusters coexist in the study area, they become difficult to detect because of their shadowing effect on one another. The recently proposed sequential method showed better power for detecting the second, weaker cluster, but did not improve the ability to detect the first, stronger cluster, which is the more important of the two. We propose a new extension of the spatial scan statistic that can detect multiple clusters. By constructing two or more clusters in the alternative hypothesis, our proposed method accounts for other coexisting clusters in the detection and evaluation process. The performance of the proposed method is compared to the sequential method through an intensive simulation study, in which our proposed method shows better power in terms of both rejecting the null hypothesis and accurately detecting the coexisting clusters. In a real study of hand-foot-mouth disease data in Pingdu city, a true cluster town was successfully detected by our proposed method; it could not be evaluated as statistically significant by the standard method because of another cluster's shadowing effect. Copyright © 2011 Elsevier Inc. All rights reserved.
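For context, the core of Kulldorff's spatial scan statistic is a Poisson likelihood ratio scored for each candidate cluster; the paper's extension places two or more such clusters in the alternative hypothesis. A minimal sketch of the standard single-cluster log-likelihood ratio, with made-up counts:

```python
import math

def scan_llr(c_in, e_in, c_tot, e_tot):
    """Poisson log-likelihood ratio for one candidate cluster, from
    observed (c) and expected (e) cases inside the cluster vs the whole
    region. Assumes c_in > 0; only elevated-risk candidates are scored."""
    c_out, e_out = c_tot - c_in, e_tot - e_in
    if c_in / e_in <= c_out / e_out:
        return 0.0  # relative risk inside not elevated
    return c_in * math.log(c_in / e_in) + c_out * math.log(c_out / e_out)

# Hypothetical candidate: 30 cases observed where 10 are expected,
# out of 100 cases / 100 expected in the whole region.
llr = scan_llr(30, 10, 100, 100)
```

The scan maximizes this score over candidate windows, and significance is typically assessed by Monte Carlo replication under the null.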

  1. Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline

    PubMed Central

    Rahmatallah, Yasir; Emmert-Streib, Frank

    2016-01-01

Transcriptome sequencing (RNA-seq) is gradually replacing microarrays for high-throughput studies of gene expression. The main challenge of analyzing microarray data is not in finding differentially expressed genes, but in gaining insights into the biological processes underlying phenotypic differences. To interpret experimental results from microarrays, gene set analysis (GSA) has become the method of choice, in particular because it incorporates pre-existing biological knowledge (in the form of functionally related gene sets) into the analysis. Here we provide a brief review of several statistically different GSA approaches (competitive and self-contained) that can be adapted from microarray practice as well as those specifically designed for RNA-seq. We evaluate their performance (in terms of Type I error rate, power, robustness to sample size and heterogeneity, as well as sensitivity to different types of selection biases) on simulated and real RNA-seq data. Not surprisingly, the performance of the various GSA approaches depends only on the statistical hypothesis they test and not on whether the test was developed for microarray or RNA-seq data. Interestingly, we found that competitive methods have lower power and lower robustness to sample heterogeneity than self-contained methods, leading to poor reproducibility of results. We also found that the power of unsupervised competitive methods depends on the balance between up- and down-regulated genes in the tested gene sets. These properties of competitive methods have been overlooked before. Our evaluation provides a concise guideline for selecting the GSA approaches that perform best under particular experimental settings in the context of RNA-seq. PMID:26342128
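A self-contained test of the kind favored by this evaluation can be sketched as a label-permutation test restricted to the genes in the set; the score function and all data below are illustrative assumptions, not any reviewed method's exact definition:

```python
import numpy as np

rng = np.random.default_rng(2)

def self_contained_pvalue(expr, labels, gene_set, n_perm=1000):
    """Self-contained GSA: test whether the genes in `gene_set` are
    associated with the phenotype, ignoring all other genes. The score
    is the mean absolute between-group difference across the set."""
    def score(lab):
        g = expr[gene_set]
        return np.abs(g[:, lab == 1].mean(axis=1)
                      - g[:, lab == 0].mean(axis=1)).mean()
    observed = score(labels)
    perms = [score(rng.permutation(labels)) for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in perms)) / (1 + n_perm)

# Toy data: 100 genes x 20 samples; genes 0-4 shifted in the second group.
labels = np.repeat([0, 1], 10)
expr = rng.normal(size=(100, 20))
expr[:5, labels == 1] += 2.0
p = self_contained_pvalue(expr, labels, np.arange(5))
```

A competitive test would instead compare the set's score against genes outside the set, which is why the two families answer different statistical hypotheses.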

  2. Evaluation of PCR Systems for Field Screening of Bacillus anthracis

    PubMed Central

    Ozanich, Richard M.; Colburn, Heather A.; Victry, Kristin D.; Bartholomew, Rachel A.; Arce, Jennifer S.; Heredia-Langner, Alejandro; Jarman, Kristin; Kreuzer, Helen W.

    2017-01-01

    There is little published data on the performance of hand-portable polymerase chain reaction (PCR) systems that can be used by first responders to determine if a suspicious powder contains a potential biothreat agent. We evaluated 5 commercially available hand-portable PCR instruments for detection of Bacillus anthracis. We used a cost-effective, statistically based test plan to evaluate systems at performance levels ranging from 0.85-0.95 lower confidence bound (LCB) of the probability of detection (POD) at confidence levels of 80% to 95%. We assessed specificity using purified genomic DNA from 13 B. anthracis strains and 18 Bacillus near neighbors, potential interference with 22 suspicious powders that are commonly encountered in the field by first responders during suspected biothreat incidents, and the potential for PCR inhibition when B. anthracis spores were spiked into these powders. Our results indicate that 3 of the 5 systems achieved 0.95 LCB of the probability of detection with 95% confidence levels at test concentrations of 2,000 genome equivalents/mL (GE/mL), which is comparable to 2,000 spores/mL. This is more than sufficient sensitivity for screening visible suspicious powders. These systems exhibited no false-positive results or PCR inhibition with common suspicious powders and reliably detected B. anthracis spores spiked into these powders, though some issues with assay controls were observed. Our testing approach enables efficient performance testing using a statistically rigorous and cost-effective test plan to generate performance data that allow users to make informed decisions regarding the purchase and use of field biodetection equipment. PMID:28192050
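For intuition about the trial counts behind such statistically based test plans: when all n trials detect the agent, the exact one-sided lower confidence bound (LCB) on the probability of detection has the closed form α^(1/n), the zero-failure Clopper-Pearson case. The function names below are ours, not from the paper:

```python
import math

def lcb_all_success(n, confidence=0.95):
    """One-sided lower confidence bound on the probability of detection
    when all n trials succeed (zero-failure Clopper-Pearson case)."""
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n)

def trials_needed(target_lcb=0.95, confidence=0.95):
    """Smallest all-success n whose lower bound reaches the target."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(target_lcb))
```

With this formula, demonstrating a 0.95 LCB at 95% confidence takes 59 consecutive detections with no misses; any observed failure requires the full Clopper-Pearson interval instead.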

  3. Predicting adsorptive removal of chlorophenol from aqueous solution using artificial intelligence based modeling approaches.

    PubMed

    Singh, Kunwar P; Gupta, Shikha; Ojha, Priyanka; Rai, Premanjali

    2013-04-01

The research aims to develop an artificial intelligence (AI)-based model to predict the adsorptive removal of 2-chlorophenol (CP) in aqueous solution by coconut shell carbon (CSC) using four operational variables (pH of solution, adsorbate concentration, temperature, and contact time), and to investigate their effects on the adsorption process. Accordingly, based on a factorial design, 640 batch experiments were conducted. Nonlinearities in experimental data were checked using Brock-Dechert-Scheinkman (BDS) statistics. Five nonlinear models were constructed to predict the adsorptive removal of CP in aqueous solution by CSC using the four variables as input. Performances of the constructed models were evaluated and compared using statistical criteria. BDS statistics revealed strong nonlinearity in the experimental data. Performance of all the models constructed here was satisfactory. Radial basis function network (RBFN) and multilayer perceptron network (MLPN) models performed better than the generalized regression neural network, support vector machine, and gene expression programming models. Sensitivity analysis revealed that contact time had the highest effect on adsorption, followed by solution pH, temperature, and CP concentration. The study concluded that all the models constructed here were capable of capturing the nonlinearity in the data. The better generalization and predictive performance of the RBFN and MLPN models suggested that these can be used to predict the adsorption of CP in aqueous solution using CSC.

  4. Interrater Reliability and Diagnostic Performance of Subjective Evaluation of Sublingual Microcirculation Images by Physicians and Nurses: A Multicenter Observational Study.

    PubMed

    Lima, Alexandre; López, Alejandra; van Genderen, Michel E; Hurtado, Francisco Javier; Angulo, Martin; Grignola, Juan C; Shono, Atsuko; van Bommel, Jasper

    2015-09-01

This was a cross-sectional multicenter study to investigate the ability of physicians and nurses from three different countries to subjectively evaluate sublingual microcirculation images and thereby discriminate normal from abnormal sublingual microcirculation based on flow and density abnormalities. Forty-five physicians and 61 nurses (mean age, 36 ± 10 years; 44 males) from three different centers in The Netherlands (n = 61), Uruguay (n = 12), and Japan (n = 33) were asked to subjectively evaluate a sample of 15 microcirculation videos randomly selected from an experimental model of endotoxic shock in pigs. All videos were first analyzed offline using the A.V.A. software by an independent, experienced investigator and were categorized as good, bad, or very bad microcirculation based on the microvascular flow index, perfused capillary density, and proportion of perfused capillaries. Then, the videos were randomly assigned to the examiners, who were instructed to subjectively categorize each image as good, bad, or very bad. An interrater analysis was performed, and sensitivity and specificity were calculated to evaluate the proportion of A.V.A.-score abnormalities that the examiners correctly identified. The κ statistics indicated moderate agreement in the evaluation of microcirculation abnormalities using three categories, i.e., good, bad, or very bad (κ = 0.48), and substantial agreement using two categories, i.e., normal (good) and abnormal (bad or very bad) (κ = 0.66). There was no significant difference between the three-category and two-category κ statistics. We found that the examiners' subjective evaluations had good diagnostic performance and were highly sensitive (84%; 95% confidence interval, 81%-86%) and specific (87%; 95% confidence interval, 84%-90%) for sublingual microcirculatory abnormalities as assessed using the A.V.A. software. 
The subjective evaluations of sublingual microcirculation by physicians and nurses agreed well with a conventional offline analysis and were highly sensitive and specific for sublingual microcirculatory abnormalities.
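The agreement and diagnostic figures quoted are standard computations from a rating-vs-reference table; a minimal sketch with made-up counts (the study's raw table is not given in the abstract):

```python
import numpy as np

def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix: observed agreement
    corrected for the agreement expected by chance."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    po = np.trace(confusion) / n                        # observed agreement
    pe = (confusion.sum(0) @ confusion.sum(1)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical 2x2 table: reference (A.V.A., rows) vs examiner (columns),
# categories normal / abnormal.
table = np.array([[40, 6],    # reference normal: 40 rated normal, 6 abnormal
                  [8, 52]])   # reference abnormal: 8 rated normal, 52 abnormal
kappa = cohens_kappa(table)
sensitivity = table[1, 1] / table[1].sum()  # abnormal correctly flagged
specificity = table[0, 0] / table[0].sum()  # normal correctly cleared
```

The same code extends to the three-category (good/bad/very bad) kappa by passing a 3x3 table.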

  5. Evaluation of the Geotech SMART24BH 20Vpp/5Vpp data acquisition system with active fortezza crypto card data signing and authentication.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rembold, Randy Kai; Hart, Darren M.

Sandia National Laboratories has tested and evaluated the Geotech SMART24BH borehole data acquisition system with active Fortezza crypto card data signing and authentication. The test results included in this report were in response to static and tonal-dynamic input signals. Most test methodologies used were based on IEEE Standards 1057 for Digitizing Waveform Recorders and 1241 for Analog to Digital Converters; others were designed by Sandia specifically for infrasound application evaluation and for supplementary criteria not addressed in the IEEE standards. The objective of this work was to evaluate the overall technical performance of two Geotech SMART24BH digitizers with a Fortezza PCMCIA crypto card actively implementing the signing of data packets. The results of this evaluation were compared to relevant specifications provided within the manufacturer's documentation notes. The tests performed were chosen to demonstrate different performance aspects of the digitizers under test: noise floor, least significant bit (LSB), dynamic range, cross-talk, relative channel-to-channel timing, time-tag accuracy/statistics/drift, and analog bandwidth.

  6. Automated detection of hospital outbreaks: A systematic review of methods

    PubMed Central

    Buckeridge, David L.; Lepelletier, Didier

    2017-01-01

Objectives Several automated algorithms for epidemiological surveillance in hospitals have been proposed. However, the usefulness of these methods to detect nosocomial outbreaks remains unclear. The goal of this review was to describe outbreak detection algorithms that have been tested within hospitals, consider how they were evaluated, and synthesize their results. Methods We developed a search query using keywords associated with hospital outbreak detection and searched the MEDLINE database. To ensure the highest sensitivity, no limitations were initially imposed on publication languages and dates, although we subsequently excluded studies published before 2000. Every study that described a method to detect outbreaks within hospitals was included, without any exclusion based on study design. Additional studies were identified through citations in retrieved studies. Results Twenty-nine studies were included. The detection algorithms were grouped into 5 categories: simple thresholds (n = 6), statistical process control (n = 12), scan statistics (n = 6), traditional statistical models (n = 6), and data mining methods (n = 4). The evaluation of the algorithms was often solely descriptive (n = 15), but more complex epidemiological criteria were also investigated (n = 10). The performance measures varied widely between studies: e.g., the sensitivity of an algorithm in a real-world setting could vary between 17% and 100%. Conclusion Although outbreak detection algorithms are useful complementary tools for traditional surveillance, the heterogeneity in results among published studies does not support quantitative synthesis of their performance. A standardized framework should be followed when evaluating outbreak detection methods to allow comparison of algorithms across studies and synthesis of results. PMID:28441422
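Of the five method categories, the simple-threshold / statistical-process-control family is the easiest to illustrate. A hedged sketch of a Shewhart-style detector (the counts and the 3-sigma rule are illustrative, not taken from any reviewed study):

```python
import numpy as np

def shewhart_alerts(counts, baseline, z=3.0):
    """Flag time periods whose count exceeds baseline mean + z * sd,
    a simple statistical-process-control rule for outbreak detection."""
    mu = np.mean(baseline)
    sd = np.std(baseline, ddof=1)
    threshold = mu + z * sd
    return [i for i, c in enumerate(counts) if c > threshold]

# Hypothetical weekly infection counts: historical baseline, then the
# period under surveillance; week index 2 is a candidate outbreak.
baseline = [2, 3, 1, 4, 2, 3, 2, 3]
current = [3, 2, 12, 4]
alerts = shewhart_alerts(current, baseline)  # -> [2]
```

Scan statistics and model-based methods in the review refine this idea by accounting for spatial structure, trends, and seasonality rather than a single fixed threshold.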

  7. Performance of the ATLAS muon trigger in pp collisions at √s = 8 TeV

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aad, G.

    The performance of the ATLAS muon trigger system is evaluated with proton–proton collision data collected in 2012 at the Large Hadron Collider at a centre-of-mass energy of 8 TeV. It is primarily evaluated using events containing a pair of muons from the decay of Z bosons. The efficiency of the single-muon trigger is measured for muons with transverse momentum 25 < pT < 100 GeV, with a statistical uncertainty of less than 0.01% and a systematic uncertainty of 0.6%. The pT range for efficiency determination is extended by using muons from decays of J/ψ mesons, W bosons, and top quarks. The muon trigger shows highly uniform and stable performance, and its performance is compared to the prediction of a detailed simulation.

  8. Performance of the ATLAS muon trigger in pp collisions at √s = 8 TeV

    DOE PAGES

    Aad, G.

    2015-03-13

    The performance of the ATLAS muon trigger system is evaluated with proton–proton collision data collected in 2012 at the Large Hadron Collider at a centre-of-mass energy of 8 TeV. It is primarily evaluated using events containing a pair of muons from the decay of Z bosons. The efficiency of the single-muon trigger is measured for muons with transverse momentum 25 < pT < 100 GeV, with a statistical uncertainty of less than 0.01% and a systematic uncertainty of 0.6%. The pT range for efficiency determination is extended by using muons from decays of J/ψ mesons, W bosons, and top quarks. The muon trigger shows highly uniform and stable performance, and its performance is compared to the prediction of a detailed simulation.

  9. [Comparison between administrative and clinical databases in the evaluation of cardiac surgery performance].

    PubMed

    Rosato, Stefano; D'Errigo, Paola; Badoni, Gabriella; Fusco, Danilo; Perucci, Carlo A; Seccareccia, Fulvia

    2008-08-01

    The availability of two contemporary sources of information about coronary artery bypass graft (CABG) interventions made it possible 1) to verify the feasibility of performing outcome evaluation studies using administrative data sources, and 2) to compare hospital performance obtained using the CABG Project clinical database with hospital performance derived from current administrative data. Interventions recorded in the CABG Project were linked to the hospital discharge record (HDR) administrative database. Only the linked records were considered for subsequent analyses (46% of the total CABG Project). A new selected population, "clinical card-HDR", was then defined. Two independent risk-adjustment models were applied, each using information derived from one of the two sources. HDR information was then supplemented with some patient preoperative conditions from the CABG clinical database. The two models were compared in terms of their adaptability to the data, and hospital performances that the two models identified as significantly different from the mean were compared. In only 4 of the 13 hospitals considered for analysis did the results obtained using the HDR model not completely overlap with those obtained by the CABG model. When comparing statistical parameters of the HDR model and the HDR model plus patient preoperative conditions, the latter showed the better adaptability to the data. In this "clinical card-HDR" population, hospital performance assessment obtained using information from the clinical database is similar to that derived from current administrative data. However, when risk-adjustment models built on administrative databases are supplemented with a few clinical variables, their statistical parameters improve and hospital performance assessment becomes more accurate.

  10. Contralateral Bimodal Stimulation: A Way to Enhance Speech Performance in Arabic-Speaking Cochlear Implant Patients.

    PubMed

    Abdeltawwab, Mohamed M; Khater, Ahmed; El-Anwar, Mohammad W

    2016-01-01

    The combination of acoustic and electric stimulation as a way to enhance speech recognition performance in cochlear implant (CI) users has generated considerable interest in recent years. The purpose of this study was to evaluate the bimodal advantage of the FS4 speech processing strategy in combination with hearing aids (HA) as a means to improve low-frequency resolution in CI patients. Nineteen postlingual CI adults were selected to participate in this study. All patients wore an implant on one side and an HA on the contralateral side with residual hearing. Monosyllabic word recognition, speech in noise, and emotion and talker identification were assessed using CI with the fine structure processing/FS4 and high-definition continuous interleaved sampling strategies, HA alone, and a combination of CI and HA. Bimodal stimulation showed improvement in speech performance and in emotion identification for the question/statement/order tasks that was statistically significant compared to CI alone, but there were no statistically significant differences in intragender talker discrimination or in emotion identification for the happy/angry/neutral tasks. The poorest performance was obtained with HA alone, a statistically significant difference from the other modalities. Bimodal stimulation enhanced speech performance in CI patients, mitigating the limitations of electric or acoustic stimulation alone. © 2016 S. Karger AG, Basel.

  11. Comparison of statistical sampling methods with ScannerBit, the GAMBIT scanning module

    NASA Astrophysics Data System (ADS)

    Martinez, Gregory D.; McKay, James; Farmer, Ben; Scott, Pat; Roebber, Elinore; Putze, Antje; Conrad, Jan

    2017-11-01

    We introduce ScannerBit, the statistics and sampling module of the public, open-source global fitting framework GAMBIT. ScannerBit provides a standardised interface to different sampling algorithms, enabling the use and comparison of multiple computational methods for inferring profile likelihoods, Bayesian posteriors, and other statistical quantities. The current version offers random, grid, raster, nested sampling, differential evolution, Markov Chain Monte Carlo (MCMC) and ensemble Monte Carlo samplers. We also announce the release of a new standalone differential evolution sampler, Diver, and describe its design, usage and interface to ScannerBit. We subject Diver and three other samplers (the nested sampler MultiNest, the MCMC GreAT, and the native ScannerBit implementation of the ensemble Monte Carlo algorithm T-Walk) to a battery of statistical tests. For this we use a realistic physical likelihood function, based on the scalar singlet model of dark matter. We examine the performance of each sampler as a function of its adjustable settings, and the dimensionality of the sampling problem. We evaluate performance on four metrics: optimality of the best fit found, completeness in exploring the best-fit region, number of likelihood evaluations, and total runtime. For Bayesian posterior estimation at high resolution, T-Walk provides the most accurate and timely mapping of the full parameter space. For profile likelihood analysis in less than about ten dimensions, we find that Diver and MultiNest score similarly in terms of best fit and speed, outperforming GreAT and T-Walk; in ten or more dimensions, Diver substantially outperforms the other three samplers on all metrics.
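
    Differential evolution, the algorithm family to which the Diver sampler announced above belongs, can be sketched minimally as follows. This is the generic DE/rand/1/bin scheme with illustrative default settings, not Diver's actual implementation or interface.

```python
import random

def differential_evolution(f, bounds, pop_size=20, weight=0.8, cr=0.9, iters=200, seed=1):
    """Minimal DE/rand/1/bin minimizer: mutate with a scaled difference of
    two population members, cross over binomially, keep the trial if better."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantee at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == j_rand or rng.random() < cr:
                    v = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                else:
                    v = pop[i][d]
                lo, hi = bounds[d]
                trial.append(min(max(v, lo), hi))  # clip to the search bounds
            trial_cost = f(trial)
            if trial_cost < cost[i]:  # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

# Usage: minimize a 2-D sphere function
best_x, best_f = differential_evolution(lambda v: sum(t * t for t in v), [(-5, 5), (-5, 5)])
```

    The population-based difference vectors make the method self-scaling, which is one reason DE variants perform well on the high-dimensional likelihoods the paper benchmarks.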

  12. Is there a relationship between periodontal disease and causes of death? A cross sectional study.

    PubMed

    Natto, Zuhair S; Aladmawy, Majdi; Alasqah, Mohammed; Papas, Athena

    2015-01-01

    The aim of this study was to evaluate whether there is any correlation between periodontal disease and mortality-contributing factors, such as cardiovascular disease and diabetes mellitus, in the elderly population. A dental evaluation was performed by a single examiner at Tufts University dental clinics for 284 patients. Periodontal assessments were performed by probing with a manual UNC-15 periodontal probe to measure pocket depth and clinical attachment level (CAL) at 6 sites. Causes of death were abstracted from death certificates. Statistical analysis involved ANOVA, chi-square, and multivariate logistic regression analysis. The demographics of the population sample indicated that most were female (except for diabetes mellitus), white, married, had completed 13 years of education, and were 83 years old on average. CAL (continuous or dichotomous) and marital status attained statistical significance (p<0.05) in contingency table analysis (chi-square for independence). Individuals with increased CAL were 2.16 times more likely (OR=2.16, 95% CI=1.47-3.17) to die due to CVD, and this effect persisted even after controlling for age, marital status, gender, race, and years of education (OR=2.03, 95% CI=1.35-3.03). CAL (continuous or dichotomous) was much higher among those who died due to diabetes mellitus or outside the state of Massachusetts. However, these results were not statistically significant. The same pattern was observed with pocket depth (continuous or dichotomous), but these results were not statistically significant either. CAL seems to be more sensitive to chronic diseases than pocket depth. Among those conditions, cardiovascular disease has the strongest effect.

  13. Optical properties of mice skin for optical therapy relevant wavelengths: influence of gender and pigmentation

    NASA Astrophysics Data System (ADS)

    Sabino, C. P.; Deana, A. M.; Silva, D. F. T.; França, C. M.; Yoshimura, T. M.; Ribeiro, M. S.

    2015-03-01

    Red and near-infrared light have been widely employed in optical therapies. Skin is the most common optical barrier in non-invasive techniques, and in many cases it is the target tissue itself. Consequently, to optimize the outcomes of light-based therapies, the optical properties of skin tissue must be very well elucidated. In the present study, we evaluated the dorsal skin optical properties of albino (BALB/c) and pigmented (C57BL/6) mice using the Kubelka-Munk photon transport model. We evaluated samples from male and female young mice of both strains. Analysis was performed for wavelengths at 630, 660, 780, 810 and 905 nm due to their prevalent use in optical therapies, such as low-level light (or laser) and photodynamic therapies. Spectrophotometric measurements of diffuse transmittance and reflectance were performed using a single integrating sphere coupled to a spectrophotometer. Statistical analysis was performed by two-way ANOVA, with Tukey as the post hoc test and Levene and Shapiro-Wilk as pre-tests. Statistical significance was considered when p<0.05. Our results show only a slight transmittance increment (<10%) as wavelengths increase from 630 to 905 nm, with no statistical significance. Albino male mice present reduced transmittance levels at all wavelengths. The organization and abundance of the tissues composing skin significantly influence its scattering properties, although absorption remains constant. We conclude that factors such as subcutaneous adiposity and connective tissue structure can have a statistically significant influence on mouse skin optical properties, with relevant variations among genders and strains.
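
    The Kubelka-Munk model referenced above relates diffuse reflectance to the ratio of absorption to scattering. A minimal sketch of the standard Kubelka-Munk function for an optically thick layer follows; the reflectance value is illustrative, not data from the study.

```python
def kubelka_munk(R):
    """Kubelka-Munk function F(R) = (1 - R)^2 / (2R), equal to the ratio
    K/S of absorption to scattering for an optically thick diffuse layer."""
    if not 0 < R <= 1:
        raise ValueError("diffuse reflectance must lie in (0, 1]")
    return (1 - R) ** 2 / (2 * R)

print(kubelka_munk(0.5))  # -> 0.25
```

    Extracting K and S separately, as done when both transmittance and reflectance of a finite-thickness sample are measured, requires the fuller Kubelka layer equations rather than this single-parameter form.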

  14. Latin-American Special Olympics athletes: evaluation of oral health status, 2010.

    PubMed

    Hanke-Herrero, Rosana; López Del Valle, Lydia M; Sánchez, Carolina; Waldman, H Barry; Perlman, Steven P

    2013-01-01

    The purpose of this study was to evaluate the oral health status and dental needs of athletes with intellectual disabilities from Latin-American and Caribbean countries who were participating in the II Latin-American Special Olympics games held in Puerto Rico, February 2010. There were 930 athletes who participated in the games, of whom 445 received a dental examination, including 367 from Latin-American and 78 from Caribbean countries. Forty-four trained and standardized dental professionals performed dental screenings of athletes with intellectual disabilities, following Special Olympics Special Smiles and CDC protocols. These criteria were used to record untreated caries, missing and filled teeth, and gingival status. Socio-demographics, existence and severity of pain, and oral hygiene habits were assessed by questionnaire. Statistical analysis was performed using the EPI-INFO and SPSS statistical programs to produce descriptive statistics and chi-square tests. Untreated dental caries was recorded for more than half of the examined athletes. Missing teeth were noted in more than one-third of the athletes. More than half of the participants had signs of gingival disease, and half needed preventive mouth guards. Statistics for each Latin-American country suggest dissimilar trends of dental decay and treatment needs among nations. While the Special Olympics athletes may not be representative of the entire population of individuals with intellectual disabilities in their specific countries, the general consistency of the oral health status of these athletes from the 31 countries supports the need for increased dental services for individuals with intellectual disabilities in the respective countries. ©2013 Special Care Dentistry Association and Wiley Periodicals, Inc.

  15. Alpha1 LASSO data bundles Lamont, OK

    DOE Data Explorer

    Gustafson, William Jr; Vogelmann, Andrew; Endo, Satoshi; Toto, Tami; Xiao, Heng; Li, Zhijin; Cheng, Xiaoping; Krishna, Bhargavi (ORCID: 0000-0001-8828-528X)

    2016-08-03

    A data bundle is a unified package consisting of LASSO LES input and output, observations, evaluation diagnostics, and model skill scores. LES input includes model configuration information and forcing data. LES output includes profile statistics and full domain fields of cloud and environmental variables. Model evaluation data consists of LES output and ARM observations co-registered on the same grid and sampling frequency. Model performance is quantified by skill scores and diagnostics in terms of cloud and environmental variables.

  16. Examining the Statistical Rigor of Test and Evaluation Results in the Live, Virtual and Constructive Environment

    DTIC Science & Technology

    2011-06-01

    Committee Meeting. 23 June 2008. Bjorkman, Eileen A. and Frank B. Gray. “Testing in a Joint Environment 2004-2008: Findings, Conclusions and...the LVC joint test environment to evaluate system performance and joint mission effectiveness (Bjorkman and Gray 2009a). The LVC battlespace...attack (Bjorkman and Gray 2009b). Figure 3 - JTEM Methodology (Bjorkman 2008) A key INTEGRAL FIRE lesson learned was realizing the need for each

  17. Dental students' self-assessment of operative preparations using CAD/CAM: a preliminary analysis.

    PubMed

    Mays, Keith A; Levine, Eric

    2014-12-01

    The Commission on Dental Accreditation (CODA)'s accreditation standards for dental schools state that "graduates must demonstrate the ability to self-assess." Therefore, dental schools have developed preclinical and clinical self-assessment (SA) protocols aimed at fostering a reflective process. This study, comparing students' visual SA with students' digital SA and with faculty assessment, was designed to test the hypothesis that higher agreement would occur when utilizing a digital evaluation. Twenty-five first-year dental students at one dental school participated by preparing a mesial occlusal preparation on tooth #30 and performing both types of SAs. A faculty evaluation was then performed both visually and digitally using the same evaluation criteria. The Kappa statistic was used to measure agreement between evaluators. The results showed statistically significant moderate agreement between the faculty visual and faculty digital modes of evaluation for occlusal shape (K=0.507, p=0.002), proximal shape (K=0.564, p=0.001), orientation (K=0.425, p=0.001), and definition (K=0.480, p=0.001). There was slight to poor agreement between the student visual and faculty visual assessments for occlusal shape (K=0.164, p=0.022), proximal shape (K=-0.227, p=0.032), orientation (K=0.253, p=0.041), and definition (K=-0.027, p=0.824). This study showed that the students had challenges in self-assessing even when using CAD/CAM, and the digital assessment did not improve the amount of student/faculty agreement.
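
    The Kappa statistic used above to measure inter-rater agreement corrects observed agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for two raters follows; the pass/fail ratings are hypothetical, not the study's data.

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is chance agreement from the marginal frequencies."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical pass/fail ratings from two evaluators
a = ["pass", "pass", "fail", "pass", "fail", "fail"]
b = ["pass", "fail", "fail", "pass", "fail", "pass"]
print(round(cohens_kappa(a, b), 3))  # -> 0.333
```

    Values near 0.4-0.6, as in the faculty visual/digital comparison above, are conventionally read as moderate agreement.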

  18. Communicating Patient Status: Comparison of Teaching Strategies in Prelicensure Nursing Education.

    PubMed

    Lanz, Amelia S; Wood, Felecia G

    Research indicates that nurses lack adequate preparation for reporting patient status. This study compared 2 instructional methods focused on patient status reporting in the clinical setting using a randomized posttest-only comparison group design. Reporting performance using a standardized communication framework and student perceptions of satisfaction and confidence with learning were measured in a simulated event that followed the instruction. Between the instructional methods, there was no statistical difference in student reporting performance or perceptions of learning. Performance evaluations provided helpful insights for the nurse educator.

  19. Comparative evaluation of stress levels before, during, and after periodontal surgical procedures with and without nitrous oxide-oxygen inhalation sedation

    PubMed Central

    Sandhu, Gurkirat; Khinda, Paramjit Kaur; Gill, Amarjit Singh; Singh Khinda, Vineet Inder; Baghi, Kamal; Chahal, Gurparkash Singh

    2017-01-01

    Context: Periodontal surgical procedures produce varying degrees of stress in all patients. Nitrous oxide-oxygen inhalation sedation is very effective for adult patients with mild-to-moderate anxiety due to dental procedures and needle phobia. Aim: The present study was designed to perform periodontal surgical procedures under nitrous oxide-oxygen inhalation sedation and assess whether this technique actually reduces stress physiologically, in comparison to local anesthesia alone (LA) during lengthy periodontal surgical procedures. Settings and Design: This was a randomized, split-mouth, cross-over study. Materials and Methods: A total of 16 patients were selected for this randomized, split-mouth, cross-over study. One surgical session (SS) was performed under local anesthesia aided by nitrous oxide-oxygen inhalation sedation, and the other SS was performed on the contralateral quadrant under LA. For each session, blood samples to measure and evaluate serum cortisol levels were obtained, and vital parameters including blood pressure, heart rate, respiratory rate, and arterial blood oxygen saturation were monitored before, during, and after periodontal surgical procedures. Statistical Analysis Used: Paired t-test and repeated-measures ANOVA. Results: The findings of the present study revealed a statistically significant decrease in serum cortisol levels, blood pressure and pulse rate and a statistically significant increase in respiratory rate and arterial blood oxygen saturation during periodontal surgical procedures under nitrous oxide inhalation sedation. Conclusion: Nitrous oxide-oxygen inhalation sedation for periodontal surgical procedures is capable of reducing stress physiologically, in comparison to LA during lengthy periodontal surgical procedures. PMID:29386796

  20. A reliability study on brain activation during active and passive arm movements supported by an MRI-compatible robot.

    PubMed

    Estévez, Natalia; Yu, Ningbo; Brügger, Mike; Villiger, Michael; Hepp-Reymond, Marie-Claude; Riener, Robert; Kollias, Spyros

    2014-11-01

    In neurorehabilitation, longitudinal assessment of arm movement related brain function in patients with motor disability is challenging due to variability in task performance. MRI-compatible robots monitor and control task performance, yielding more reliable evaluation of brain function over time. The main goals of the present study were first to define the brain network activated while performing active and passive elbow movements with an MRI-compatible arm robot (MaRIA) in healthy subjects, and second to test the reproducibility of this activation over time. For the fMRI analysis two models were compared. In model 1 movement onset and duration were included, whereas in model 2 force and range of motion were added to the analysis. Reliability of brain activation was tested with several statistical approaches applied on individual and group activation maps and on summary statistics. The activated network included mainly the primary motor cortex, primary and secondary somatosensory cortex, superior and inferior parietal cortex, medial and lateral premotor regions, and subcortical structures. Reliability analyses revealed robust activation for active movements with both fMRI models and all the statistical methods used. Imposed passive movements also elicited mainly robust brain activation for individual and group activation maps, and reliability was improved by including additional force and range of motion using model 2. These findings demonstrate that the use of robotic devices, such as MaRIA, can be useful to reliably assess arm movement related brain activation in longitudinal studies and may contribute in studies evaluating therapies and brain plasticity following injury in the nervous system.

  1. Creation of a virtual cutaneous tissue bank

    NASA Astrophysics Data System (ADS)

    LaFramboise, William A.; Shah, Sujal; Hoy, R. W.; Letbetter, D.; Petrosko, P.; Vennare, R.; Johnson, Peter C.

    2000-04-01

    Cellular and non-cellular constituents of skin contain fundamental morphometric features and structural patterns that correlate with tissue function. High-resolution digital image acquisition is performed using an automated system and proprietary software to assemble adjacent images and create a contiguous, lossless digital representation of individual microscope slide specimens. Serial extraction, evaluation, and statistical analysis of cutaneous features are performed utilizing an automated analysis system to derive normal cutaneous parameters comprising essential structural skin components. Automated digital cutaneous analysis allows fast extraction of microanatomic data with accuracy approximating manual measurement. The process provides rapid assessment of features both within individual specimens and across sample populations. The images, component data, and statistical analysis comprise a bioinformatics database to serve as an architectural blueprint for skin tissue engineering and as a diagnostic standard of comparison for pathologic specimens.

  2. Lithium-Ion Batteries Being Evaluated for Low-Earth-Orbit Applications

    NASA Technical Reports Server (NTRS)

    McKissock, Barbara I.

    2005-01-01

    The performance characteristics and long-term cycle life of aerospace lithium-ion (Li-ion) batteries in low-Earth-orbit applications are being investigated. A statistically designed test using Li-ion cells from various manufacturers began in September 2004 to study the effects of temperature, end-of-charge voltage, and depth-of-discharge operating conditions on the cycle life and performance of these cells. Performance degradation with cycling is being evaluated, and performance characteristics and failure modes are being modeled statistically. As technology improvements are incorporated into aerospace Li-ion cells, these new designs can be added to the test to evaluate the effect of the design changes on performance and life. Cells from Lithion and Saft have achieved over 2000 cycles under 10 different test condition combinations and are being evaluated. Cells from Mine Safety Appliances (MSA) and modules made up of commercial-off-the-shelf 18650 Li-ion cells connected in series/parallel combinations are scheduled to be added in the summer of 2005. The test conditions include temperatures of 10, 20, and 30 C, end-of-charge voltages of 3.85, 3.95, and 4.05 V, and depth-of-discharges from 20 to 40 percent. The low-Earth-orbit regime consists of a 55 min charge, at a constant-current rate that is 110 percent of the current required to fully recharge the cells in 55 min until the charge voltage limit is reached, and then at a constant voltage for the remaining charge time. Cells are discharged for 35 min at the current required for their particular depth-of-discharge condition. Cells are being evaluated in four-cell series strings with charge voltage limits being applied to individual cells by the use of charge-control units designed and produced at the NASA Glenn Research Center. 
These charge-control units clamp the individual cell voltages as each cell reaches its end-of-charge voltage limit, and they bypass the excess current from that cell, while allowing the full current flow to the remaining cells in the pack. The goal of this evaluation is to identify conditions and cell designs for Li-ion technology that can achieve more than 30,000 low-Earth-orbit cycles. Testing is being performed at the Naval Surface Warfare Center, Crane Division, in Crane, Indiana.
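
    The constant-current setpoint described above (110 percent of the current required to return the discharged capacity in 55 minutes) reduces to simple arithmetic. A sketch follows; the 40 Ah cell capacity in the example is an illustrative assumption, not a figure from the test program.

```python
def leo_charge_current(capacity_ah, dod, charge_min=55.0, overcharge=1.10):
    """Constant-current setpoint for the low-Earth-orbit regime: 110% of
    the current that would return the amp-hours removed at the given
    depth-of-discharge (DOD) within the 55-minute charge period."""
    removed_ah = capacity_ah * dod          # capacity removed per discharge
    return overcharge * removed_ah / (charge_min / 60.0)

# Hypothetical 40 Ah cell cycled to 30% depth-of-discharge
print(round(leo_charge_current(40.0, 0.30), 2))  # -> 14.4
```

    Once a cell reaches its end-of-charge voltage limit, the regime switches from this constant current to constant voltage for the remainder of the 55-minute charge window.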

  3. Development and evaluation of automatic landing control laws for light wing loading STOL aircraft

    NASA Technical Reports Server (NTRS)

    Feinreich, B.; Degani, O.; Gevaert, G.

    1981-01-01

    Automatic flare and decrab control laws were developed for NASA's experimental Twin Otter. This light wing loading STOL aircraft was equipped with direct lift control (DLC) wing spoilers to enhance flight path control. Automatic landing control laws that made use of the spoilers were developed and evaluated in a simulation, and the results were compared with those obtained for configurations that did not use DLC. The spoilers produced a significant improvement in performance. A simulation that could be operated faster than real time, in order to provide statistical landing data for a large number of landings over a wide spectrum of disturbances in a short time, was constructed and used in the evaluation and refinement of control law configurations. A longitudinal control law that had been previously developed and evaluated in flight was also simulated, and its performance was compared with that of the control laws developed. Runway alignment control laws were also defined, evaluated, and refined to result in a final recommended configuration. Good landing performance, compatible with Category 3 operation into STOL runways, was obtained.

  4. mvp - an open-source preprocessor for cleaning duplicate records and missing values in mass spectrometry data.

    PubMed

    Lee, Geunho; Lee, Hyun Beom; Jung, Byung Hwa; Nam, Hojung

    2017-07-01

    Mass spectrometry (MS) data are used to analyze biological phenomena based on chemical species. However, these data often contain unexpected duplicate records and missing values due to technical or biological factors. These 'dirty data' problems increase the difficulty of performing MS analyses because they lead to performance degradation when statistical or machine-learning tests are applied to the data. Thus, we have developed missing values preprocessor (mvp), an open-source software for preprocessing data that might include duplicate records and missing values. mvp uses the property of MS data in which identical chemical species present the same or similar values for key identifiers, such as the mass-to-charge ratio and intensity signal, and forms cliques via graph theory to process dirty data. We evaluated the validity of the mvp process via quantitative and qualitative analyses and compared the results from a statistical test that analyzed the original and mvp-applied data. This analysis showed that using mvp reduces problems associated with duplicate records and missing values. We also examined the effects of using unprocessed data in statistical tests and examined the improved statistical test results obtained with data preprocessed using mvp.
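
    mvp's clique-based handling of duplicate records can be approximated with a much simpler sketch: group records whose mass-to-charge values fall within a tolerance of their neighbors (connected components on the closeness graph). The tolerance and values are illustrative, and this stand-in is not mvp's actual algorithm.

```python
def group_duplicates(mz_values, tol=0.01):
    """Group record indices whose m/z values chain together within `tol`
    (connected components of the closeness graph, found by sorting --
    a simplified stand-in for mvp's clique-based grouping)."""
    if not mz_values:
        return []
    order = sorted(range(len(mz_values)), key=mz_values.__getitem__)
    groups, current = [], [order[0]]
    for prev, idx in zip(order, order[1:]):
        if mz_values[idx] - mz_values[prev] <= tol:
            current.append(idx)  # close enough to its neighbor: same group
        else:
            groups.append(current)
            current = [idx]
    groups.append(current)
    return groups

print(group_duplicates([100.001, 100.003, 250.5]))  # -> [[0, 1], [2]]
```

    Each group of near-identical records can then be merged (e.g., by averaging intensities), which removes the duplicate-record problem before statistical testing.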

  5. MORTICIA, a statistical analysis software package for determining optical surveillance system effectiveness.

    NASA Astrophysics Data System (ADS)

    Ramkilowan, A.; Griffith, D. J.

    2017-10-01

    Surveillance modelling in terms of the standard Detect, Recognise and Identify (DRI) thresholds remains a key requirement for determining the effectiveness of surveillance sensors. With readily available computational resources, it has become feasible to perform statistically representative evaluations of the effectiveness of these sensors. A new capability for performing this Monte-Carlo type analysis is demonstrated in the MORTICIA (Monte-Carlo Optical Rendering for Theatre Investigations of Capability under the Influence of the Atmosphere) software package developed at the Council for Scientific and Industrial Research (CSIR). This first-generation, Python-based, open-source integrated software package, currently in the alpha stage of development, aims to provide all the functionality required to perform statistical investigations of the effectiveness of optical surveillance systems in specific or generic deployment theatres. This includes modelling of the mathematical and physical processes that govern, among other components of a surveillance system, a sensor's detector and optical components, a target and its background, as well as the intervening atmospheric influences. In this paper we discuss integral aspects of the bespoke framework that are critical to the longevity of all subsequent modelling efforts. Additionally, some preliminary results are presented.

  6. Explaining Crossing DIF in Polytomous Items Using Differential Step Functioning Effects

    ERIC Educational Resources Information Center

    Penfield, Randall D.

    2010-01-01

    Crossing, or intersecting, differential item functioning (DIF) is a form of nonuniform DIF that exists when the sign of the between-group difference in expected item performance changes across the latent trait continuum. The presence of crossing DIF presents a problem for many statistics developed for evaluating DIF because positive and negative…

  7. Administration and Research of Competency-/Performance-Based Teacher Education Programs.

    ERIC Educational Resources Information Center

    Trzasko, Joseph A.

    This paper describes the proposed Assessment Center at Mercy College in Dobbs Ferry, New York. The center is intended to provide statistical and technical support for the Mercy College elementary education, special education, and speech and hearing departments in the areas of student assessment, student guidance, and program evaluation. Evaluation…

  8. Performance of DIMTEST-and NOHARM-Based Statistics for Testing Unidimensionality

    ERIC Educational Resources Information Center

    Finch, Holmes; Habing, Brian

    2007-01-01

    This Monte Carlo study compares the ability of the parametric bootstrap version of DIMTEST with three goodness-of-fit tests calculated from a fitted NOHARM model to detect violations of the assumption of unidimensionality in testing data. The effectiveness of the procedures was evaluated for different numbers of items, numbers of examinees,…

  9. 10 CFR 431.445 - Determination of small electric motor efficiency.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... determined either by testing in accordance with § 431.444 of this subpart, or by application of an... method. An AEDM applied to a basic model must be: (i) Derived from a mathematical model that represents... statistical analysis, computer simulation or modeling, or other analytic evaluation of performance data. (3...

  10. 76 FR 79548 - Loan Participations; Purchase, Sale and Pledge of Eligible Obligations; Purchase of Assets and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-22

    ..., a FICU purchasing a loan participation pool might perform statistical sampling in evaluating the..., since 2007, FISCUs- overall experienced a higher delinquency rate in their loan participation portfolios. At year-end 2010, the delinquency rate for the FISCU-participated portfolio was 4.11 percent...

  11. Performance Evaluation Methods for Army Finance and Accounting Offices.

    DTIC Science & Technology

    1981-12-01

    FINOPS and FINES. FINOPS provides data through command channels to USAFAC, which is the basis for management to ascertain the overall performance of... It should be emphasized that these tests do not constitute a classical statistical, controlled experiment to...

  12. Executive Functions: Formative versus Reflective Measurement

    ERIC Educational Resources Information Center

    Willoughby, Michael; Holochwost, Steven J.; Blanton, Zane E.; Blair, Clancy B.

    2014-01-01

    The primary objective of this article was to critically evaluate the routine use of confirmatory factor analysis (CFA) for representing an individual's performance across a battery of executive function tasks. A conceptual review and statistical reanalysis of N = 10 studies that used CFA methods of EF tasks was undertaken. Despite evidence of…

  13. Quantile regression reveals hidden bias and uncertainty in habitat models

    Treesearch

    Brian S. Cade; Barry R. Noon; Curtis H. Flather

    2005-01-01

    We simulated the effects of missing information on statistical distributions of animal response that covaried with measured predictors of habitat to evaluate the utility and performance of quantile regression for providing more useful intervals of uncertainty in habitat relationships. These procedures were evaluated for conditions in which heterogeneity and hidden bias...
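
The mechanics behind this record's method can be sketched with simulated data: under heteroscedastic error, upper regression quantiles have steeper slopes than the median, which is exactly what quantile regression is designed to reveal. The data and the brute-force grid search below are illustrative stand-ins (a real analysis would use an LP-based quantile regression solver); only the pinball-loss criterion itself is standard.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 1, n)
# Heteroscedastic response: the error spread grows with x, so upper
# conditional quantiles are steeper than the median (hypothetical data).
y = 1.0 + 2.0 * x + (0.2 + 2.0 * x) * rng.standard_normal(n)

def pinball(residuals, tau):
    # check (pinball) loss minimized by the tau-th regression quantile
    return np.where(residuals >= 0, tau * residuals, (tau - 1) * residuals).mean()

def fit_quantile(x, y, tau, intercepts, slopes):
    # brute-force grid search; a stand-in for a proper quantile solver
    best = (np.inf, 0.0, 0.0)
    for b0 in intercepts:
        for b1 in slopes:
            loss = pinball(y - b0 - b1 * x, tau)
            if loss < best[0]:
                best = (loss, b0, b1)
    return best[1], best[2]

grid0 = np.linspace(-1.0, 3.0, 41)
grid1 = np.linspace(0.0, 6.0, 61)
_, slope_med = fit_quantile(x, y, 0.5, grid0, grid1)
_, slope_q90 = fit_quantile(x, y, 0.9, grid0, grid1)
print(slope_med, slope_q90)  # the 0.9-quantile slope exceeds the median slope
```

The widening gap between the two slopes is the "more useful interval of uncertainty" the record refers to: ordinary least squares would report only a single mean slope.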

  14. Estimating annual bole biomass production using uncertainty analysis

    Treesearch

    Travis J. Woolley; Mark E. Harmon; Kari B. O'Connell

    2007-01-01

    Two common sampling methodologies coupled with a simple statistical model were evaluated to determine the accuracy and precision of annual bole biomass production (BBP) and inter-annual variability estimates using this type of approach. We performed an uncertainty analysis using Monte Carlo methods in conjunction with radial growth core data from trees in three Douglas...
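
A minimal sketch of this kind of Monte Carlo uncertainty analysis, assuming a generic allometric bole-biomass model biomass = a * dbh^b; all parameter values, error magnitudes, and diameters below are hypothetical, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical allometric model: bole biomass (kg) = a * dbh^b, dbh in cm.
# Every number below is illustrative, not the study's.
a_mu, a_sd = 0.10, 0.01          # allometric coefficient and its uncertainty
b_mu, b_sd = 2.40, 0.05          # allometric exponent and its uncertainty
dbh_y1, dbh_y2 = 42.0, 42.6      # diameters reconstructed from radial growth
dbh_sd = 0.15                    # measurement error on each diameter (cm)

n_draws = 50_000
a = rng.normal(a_mu, a_sd, n_draws)
b = rng.normal(b_mu, b_sd, n_draws)
d1 = rng.normal(dbh_y1, dbh_sd, n_draws)
d2 = rng.normal(dbh_y2, dbh_sd, n_draws)

bbp = a * d2**b - a * d1**b      # annual bole biomass production, per draw
mean = bbp.mean()
lo, hi = np.percentile(bbp, [2.5, 97.5])
print(f"BBP = {mean:.1f} kg/yr, 95% interval ({lo:.1f}, {hi:.1f})")
```

The spread of the `bbp` draws is the inter-annual-scale uncertainty estimate; tightening `dbh_sd` versus the allometric-parameter uncertainties shows which error source dominates.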

  15. Impacting Information Literacy Learning in First-Year Seminars: A Rubric-Based Evaluation

    ERIC Educational Resources Information Center

    Lowe, M. Sara; Booth, Char; Stone, Sean; Tagge, Natalie

    2015-01-01

    The authors conducted a rubric assessment of information literacy (IL) skills in research papers across five undergraduate first-year seminar programs to explore the question "What impact does librarian intervention in first-year courses have on IL performance in student work?" Statistical results indicate that students in courses with…

  16. Evaluation of the flame propagation within an SI engine using flame imaging and LES

    NASA Astrophysics Data System (ADS)

    He, Chao; Kuenne, Guido; Yildar, Esra; van Oijen, Jeroen; di Mare, Francesca; Sadiki, Amsini; Ding, Carl-Philipp; Baum, Elias; Peterson, Brian; Böhm, Benjamin; Janicka, Johannes

    2017-11-01

    This work shows experiments and simulations of the fired operation of a spark ignition engine with port-fuelled injection. The test rig considered is an optically accessible single cylinder engine specifically designed at TU Darmstadt for the detailed investigation of in-cylinder processes and model validation. The engine was operated under lean conditions using iso-octane as a substitute for gasoline. Experiments have been conducted to provide a sound database of the combustion process. A planar flame imaging technique has been applied within the swirl- and tumble-planes to provide statistical information on the combustion process to complement a pressure-based comparison between simulation and experiments. This data is then analysed and used to assess the large eddy simulation performed within this work. For the simulation, the engine code KIVA has been extended by the dynamically thickened flame model combined with chemistry reduction by means of pressure dependent tabulation. Sixty cycles have been simulated to perform a statistical evaluation. Based on a detailed comparison with the experimental data, a systematic study has been conducted to obtain insight into the most crucial modelling uncertainties.

  17. Dried, ground banana plant leaves (Musa spp.) for the control of Haemonchus contortus and Trichostrongylus colubriformis infections in sheep.

    PubMed

    Gregory, L; Yoshihara, E; Ribeiro, B L M; Silva, L K F; Marques, E C; Meira, E B S; Rossi, R S; Sampaio, P H; Louvandini, H; Hasegawa, M Y

    2015-12-01

    To evaluate the anthelmintic effect of Musa spp. leaves, 12 animals were artificially infected with Haemonchus contortus, and another 12 animals were infected with Trichostrongylus colubriformis. Then, both treatment groups were offered 400 g of dried ground banana plant leaves, and the control animals were offered only 1000 g of coast cross hay. During the trials, the animals received weekly physical examinations. The methods used to evaluate the efficiency of this treatment were packed cell volume, total plasma protein, faecal egg counts (FEC) and egg hatchability tests (EHT), performed on days -2, +3, +6, +9, +13 and +15. Coproculture tests were performed on day -2 to confirm monospecific infections. In the FEC and EHT, a statistically significant difference (0.04, 0.005; p < 0.05) was noted for T. colubriformis. There were no statistically significant differences (p > 0.05) for the Haemonchus contortus group in any test. Our results confirmed previous findings suggesting that dried ground banana plant leaves possess anthelmintic activity.

  18. On the suitability of Elekta’s Agility 160 MLC for tracked radiation delivery: closed-loop machine performance

    NASA Astrophysics Data System (ADS)

    Glitzner, M.; Crijns, S. P. M.; de Senneville, B. Denis; Lagendijk, J. J. W.; Raaymakers, B. W.

    2015-03-01

    For motion adaptive radiotherapy, dynamic multileaf collimator tracking can be employed to reduce treatment margins by steering the beam according to the organ motion. The Elekta Agility 160 MLC has hitherto not been evaluated for its tracking suitability. Both dosimetric performance and latency are key figures and need to be assessed generically, independent of the motion sensor used. In this paper, we propose the use of harmonic functions directly fed to the MLC to determine its latency during continuous motion. Furthermore, a control variable is extracted from a camera system and fed to the MLC. Using this setup, film dosimetry and subsequent γ statistics are performed, evaluating the response when tracking MRI-based physiologic motion in a closed loop. The delay attributed to the MLC itself was shown to be a minor contributor to the overall feedback chain as compared to the impact of imaging components such as MRI sequences. Delay showed a linear phase behaviour of the MLC employed in continuously dynamic applications, which enables a general MLC characterization. Using the exemplary feedback chain, dosimetry showed a substantial increase in γ pass rate. In this early stage, the tracking performance of the Agility using the test bench yielded promising results, making the technique eligible for translation to tracking using clinical imaging modalities.

  19. DEPEND - A design environment for prediction and evaluation of system dependability

    NASA Technical Reports Server (NTRS)

    Goswami, Kumar K.; Iyer, Ravishankar K.

    1990-01-01

    The development of DEPEND, an integrated simulation environment for the design and dependability analysis of fault-tolerant systems, is described. DEPEND models both hardware and software components at a functional level, and allows automatic failure injection to assess system performance and reliability. It relieves the user of the work needed to inject failures, maintain statistics, and output reports. The automatic failure injection scheme is geared toward evaluating a system under high stress (workload) conditions. The failures that are injected can affect both hardware and software components. To illustrate the capability of the simulator, a distributed system which employs a prediction-based, dynamic load-balancing heuristic is evaluated. Experiments were conducted to determine the impact of failures on system performance and to identify the failures to which the system is especially susceptible.

  20. A comparison of statistical methods for evaluating matching performance of a biometric identification device: a preliminary report

    NASA Astrophysics Data System (ADS)

    Schuckers, Michael E.; Hawley, Anne; Livingstone, Katie; Mramba, Nona

    2004-08-01

    Confidence intervals are an important way to assess and estimate a parameter. In the case of biometric identification devices, several approaches to confidence intervals for an error rate have been proposed. Here we evaluate six of these methods. To complete this evaluation, we simulate data from a wide variety of parameter values. These data are simulated via a correlated binary distribution. We then determine how well these methods do at what they say they do: capturing the parameter inside the confidence interval. In addition, the average widths of the various confidence intervals are recorded for each set of parameters. The complete results of this simulation are presented graphically for easy comparison. We conclude by making a recommendation regarding which method performs best.
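
The kind of coverage simulation described can be sketched with a beta-binomial stand-in for the paper's correlated binary distribution: intra-user correlation inflates the variance of the pooled error rate, so a naive interval that assumes independent attempts (Wilson, here) tends to undercover. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def wilson_interval(errors, n, z=1.96):
    # Wilson score interval for a binomial proportion
    p_hat = errors / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Beta-binomial stand-in for correlated binary matching attempts: each user
# has an own error rate drawn from a Beta distribution, which induces
# intra-user correlation rho = 1 / (alpha + beta + 1) between attempts.
p_true, rho = 0.05, 0.1
k = (1 - rho) / rho                        # alpha + beta
alpha, beta = p_true * k, (1 - p_true) * k
n_users, attempts = 50, 20

reps, covered = 1000, 0
for _ in range(reps):
    user_p = rng.beta(alpha, beta, n_users)
    errors = rng.binomial(attempts, user_p).sum()
    lo, hi = wilson_interval(errors, n_users * attempts)
    covered += lo <= p_true <= hi
coverage = covered / reps
print(coverage)   # well below the nominal 0.95: the correlation is ignored
```

Repeating the loop for each candidate interval method, over a grid of (p_true, rho), is exactly the design the abstract describes: empirical coverage and average width per parameter set.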

  1. The Reliability of Panoramic Radiography Versus Cone Beam Computed Tomography when Evaluating the Distance to the Alveolar Nerve in the Site of Lateral Teeth.

    PubMed

    Česaitienė, Gabrielė; Česaitis, Kęstutis; Junevičius, Jonas; Venskutonis, Tadas

    2017-07-04

    BACKGROUND The aim of this study was to compare the reliability of panoramic radiography (PR) and cone beam computed tomography (CBCT) in the evaluation of the distance of the roots of lateral teeth to the inferior alveolar nerve canal (IANC). MATERIAL AND METHODS 100 PR and 100 CBCT images that met the selection criteria were selected from the database. In PR images, the distances were measured using an electronic caliper with 0.01 mm accuracy and white light x-ray film reviewer. Actual values of the measurements were calculated taking into consideration the magnification used in PR images (130%). Measurements on CBCT images were performed using i-CAT Vision software. Statistical data analysis was performed using R software and applying Welch's t-test and the Wilcoxon test. RESULTS There was no statistically significant difference in the mean distance from the root of the second premolar and the mesial and distal roots of the first molar to the IANC between PR and CBCT images. The difference in the mean distance from the mesial and distal roots of the second and the third molars to the IANC measured in PR and CBCT images was statistically significant. CONCLUSIONS PR may be uninformative or misleading when measuring the distance from the mesial and distal roots of the second and the third molars to the IANC.
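
The two tests named in this record can be sketched in Python with large-sample normal approximations (the study itself used R, and the record does not say whether the Wilcoxon test was the paired or rank-sum variant; the rank-sum form is sketched here). The distance data below are simulated, not the study's measurements.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)

# Hypothetical root-to-IANC distances (mm); values are illustrative only
pr = rng.normal(4.0, 1.2, 100)      # panoramic radiography, after rescaling
cbct = rng.normal(3.0, 1.0, 100)    # cone beam CT

def welch_t(x, y):
    # Welch's unequal-variance t statistic, large-sample normal p-value
    vx, vy = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
    t = (x.mean() - y.mean()) / np.sqrt(vx + vy)
    return t, 2 * (1 - NormalDist().cdf(abs(t)))

def rank_sum(x, y):
    # Wilcoxon rank-sum (Mann-Whitney) with normal approximation
    nx, ny = len(x), len(y)
    ranks = np.argsort(np.argsort(np.concatenate([x, y]))) + 1
    w = ranks[:nx].sum()
    mu = nx * (nx + ny + 1) / 2
    sd = np.sqrt(nx * ny * (nx + ny + 1) / 12)
    z = (w - mu) / sd
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

t, p_t = welch_t(pr, cbct)
z, p_w = rank_sum(pr, cbct)
print(p_t, p_w)   # both flag the simulated 1 mm mean difference
```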

  2. The Reliability of Panoramic Radiography Versus Cone Beam Computed Tomography when Evaluating the Distance to the Alveolar Nerve in the Site of Lateral Teeth

    PubMed Central

    Česaitienė, Gabrielė; Česaitis, Kęstutis; Junevičius, Jonas; Venskutonis, Tadas

    2017-01-01

    Background The aim of this study was to compare the reliability of panoramic radiography (PR) and cone beam computed tomography (CBCT) in the evaluation of the distance of the roots of lateral teeth to the inferior alveolar nerve canal (IANC). Material/Methods 100 PR and 100 CBCT images that met the selection criteria were selected from the database. In PR images, the distances were measured using an electronic caliper with 0.01 mm accuracy and white light x-ray film reviewer. Actual values of the measurements were calculated taking into consideration the magnification used in PR images (130%). Measurements on CBCT images were performed using i-CAT Vision software. Statistical data analysis was performed using R software and applying Welch’s t-test and the Wilcoxon test. Results There was no statistically significant difference in the mean distance from the root of the second premolar and the mesial and distal roots of the first molar to the IANC between PR and CBCT images. The difference in the mean distance from the mesial and distal roots of the second and the third molars to the IANC measured in PR and CBCT images was statistically significant. Conclusions PR may be uninformative or misleading when measuring the distance from the mesial and distal roots of the second and the third molars to the IANC. PMID:28674379

  3. Quantifying discrimination of Framingham risk functions with different survival C statistics.

    PubMed

    Pencina, Michael J; D'Agostino, Ralph B; Song, Linye

    2012-07-10

    Cardiovascular risk prediction functions offer an important diagnostic tool for clinicians and patients themselves. They are usually constructed with the use of parametric or semi-parametric survival regression models. It is essential to be able to evaluate the performance of these models, preferably with summaries that offer natural and intuitive interpretations. The concept of discrimination, popular in the logistic regression context, has been extended to survival analysis. However, the extension is not unique. In this paper, we define discrimination in survival analysis as the model's ability to separate those with longer event-free survival from those with shorter event-free survival within some time horizon of interest. This definition remains consistent with that used in logistic regression, in the sense that it assesses how well the model-based predictions match the observed data. Practical and conceptual examples and numerical simulations are employed to examine four C statistics proposed in the literature to evaluate the performance of survival models. We observe that they differ in the numerical values and aspects of discrimination that they capture. We conclude that the index proposed by Harrell is the most appropriate to capture discrimination described by the above definition. We suggest researchers report which C statistic they are using, provide a rationale for their selection, and be aware that comparing different indices across studies may not be meaningful. Copyright © 2012 John Wiley & Sons, Ltd.
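
Harrell's index, the one the authors recommend, can be computed directly from its definition: over all usable pairs (the earlier observed time must be an event, not a censoring), count how often the earlier-failing subject has the higher predicted risk. A toy example with illustrative data:

```python
import numpy as np

def harrell_c(time, event, risk):
    # Harrell's C: among usable pairs (the earlier observed time must be an
    # event), the fraction where the earlier-failing subject has higher risk
    concordant = ties = usable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] and time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / usable

# Toy data (illustrative): higher predicted risk goes with earlier events
time  = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
event = np.array([1, 1, 0, 1, 0])            # 0 = censored
risk  = np.array([0.9, 0.7, 0.6, 0.4, 0.1])
print(harrell_c(time, event, risk))          # 1.0: perfectly ranked toy data
```

The O(n²) pair loop is fine for a sketch; production implementations sort by time and handle risk ties and a time horizon more carefully.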

  4. Pupil Influence on the Visual Outcomes of a New-Generation Multifocal Toric Intraocular Lens With a Surface-Embedded Near Segment.

    PubMed

    Wang, Mengmeng; Corpuz, Christine Carole C; Huseynova, Tukezban; Tomita, Minoru

    2016-02-01

    To evaluate the influences of preoperative pupil parameters on the visual outcomes of a new-generation multifocal toric intraocular lens (IOL) model with a surface-embedded near segment. In this prospective study, patients with cataract had phacoemulsification and implantation of Lentis Mplus toric LU-313 30TY IOLs (Oculentis GmbH, Berlin, Germany). The visual and optical outcomes were measured and compared preoperatively and postoperatively. The correlations between preoperative pupil parameters (diameter and decentration) and 3-month postoperative visual outcomes were evaluated using Spearman's rank-order correlation coefficient (Rs) for the nonparametric data. A total of 27 eyes (16 patients) were enrolled in the current study. Statistically significant improvements in visual and refractive performances were found after the implantation of Lentis Mplus toric LU-313 30TY IOLs (P < .05). Statistically significant correlations were present between preoperative pupil diameters and postoperative visual acuities (Rs > 0; P < .05). Patients with larger pupils tended to have better postoperative visual acuities. Meanwhile, there was no statistically significant correlation between pupil decentration and visual acuities (P > .05). Lentis Mplus toric LU-313 30TY IOLs provided excellent visual and optical performances during the 3-month follow-up. The preoperative pupil size is an important parameter when this toric multifocal IOL model is contemplated for surgery. Copyright 2016, SLACK Incorporated.

  5. Outcomes of office-based temporomandibular joint arthroscopy: a 5-year retrospective study.

    PubMed

    Hossameldin, R H; McCain, J P

    2018-01-01

    Temporomandibular joint (TMJ) arthroscopy is a minimally invasive surgical approach for intra-articular TMJ diseases. Office-based arthroscopy using the smallest TMJ scope allows for good visualization, as well as the ability to lavage the joint in an office setting. This study aimed to assess the efficacy of an office-based TMJ arthroscopic technique. A retrospective evaluation of 363 patients with a TMJ disorder was performed. These patients underwent office-based arthroscopy using the OnPoint 1.2mm Scope System (Biomet Microfixation, Jacksonville, FL, USA) in Florida, USA, from July 2007. The following outcomes of the procedure were assessed: improvement in painless range of mandibular motion, pain on loading, and functional jaw pain; these were evaluated using a visual analog scale (VAS) over an average follow-up period of 263.81±142.1 days. The statistical analysis was performed using IBM SPSS Statistics version 20. Statistically significant improvements in TMJ pain and function, and other variables (P=0.001) were shown following TMJ arthroscopic lysis and lavage. Office-based arthroscopy using the OnPoint System was demonstrated to be a safe and efficient procedure for the treatment of patients with TMJ disorders as the first level of the algorithm of care. Copyright © 2017 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.

  6. Subject-enabled analytics model on measurement statistics in health risk expert system for public health informatics.

    PubMed

    Chung, Chi-Jung; Kuo, Yu-Chen; Hsieh, Yun-Yu; Li, Tsai-Chung; Lin, Cheng-Chieh; Liang, Wen-Miin; Liao, Li-Na; Li, Chia-Ing; Lin, Hsueh-Chun

    2017-11-01

    This study applied open source technology to establish a subject-enabled analytics model that can enhance measurement statistics of case studies with public health data in cloud computing. The infrastructure of the proposed model comprises three domains: 1) the health measurement data warehouse (HMDW) for the case study repository, 2) the self-developed modules of online health risk information statistics (HRIStat) for cloud computing, and 3) the prototype of a Web-based process automation system in statistics (PASIS) for the health risk assessment of case studies with subject-enabled evaluation. The system design employed freeware including Java applications, MySQL, and R packages to drive a health risk expert system (HRES). In the design, the HRIStat modules enforce the typical analytics methods for biomedical statistics, and the PASIS interfaces enable process automation of the HRES for cloud computing. The Web-based model supports two modes, step-by-step analysis and an automated computing process, for preliminary evaluation and real-time computation respectively. The proposed model was evaluated by recomputing prior studies on the epidemiological measurement of diseases caused by either heavy metal exposure in the environment or clinical complications in hospital. The simulation validity was verified against commercial statistics software. The model was installed in a stand-alone computer and in a cloud-server workstation to verify computing performance for a data amount of more than 230K sets. Both setups reached an efficiency of about 10^5 sets per second. The Web-based PASIS interface can be used for cloud computing, and the HRIStat module can be flexibly expanded with advanced subjects for measurement statistics. The analytics procedure of the HRES prototype is capable of providing assessment criteria prior to estimating the potential risk to public health. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Performance evaluation of mobile downflow booths for reducing airborne particles in the workplace.

    PubMed

    Lo, Li-Ming; Hocker, Braden; Steltz, Austin E; Kremer, John; Feng, H Amy

    2017-11-01

    Compared to other common control measures, the downflow booth is a costly engineering control used to contain airborne dust or particles. The downflow booth provides unidirectional filtered airflow from the ceiling, entraining released particles away from the workers' breathing zone, and delivers contained airflow to a lower level exhaust for removing particulates by filtering media. In this study, we designed and built a mobile downflow booth that is capable of quick assembly and easy size change to provide greater flexibility and particle control for various manufacturing processes or tasks. An experimental study was conducted to thoroughly evaluate the control performance of downflow booths used for removing airborne particles generated by the transfer of powdered lactose between two containers. Statistical analysis compared particle reduction ratios obtained from various test conditions including booth size (short, regular, or extended), supply air velocity (0.41 and 0.51 m/s or 80 and 100 feet per minute, fpm), powder transfer location (near or far from the booth exhaust), and inclusion or exclusion of curtains at the booth entrance. Our study results show that only short-depth downflow booths failed to protect the worker performing powder transfer far from the booth exhausts. Statistical analysis shows that better control performance can be obtained with supply air velocity of 0.51 m/s (100 fpm) than with 0.41 m/s (80 fpm) and that use of curtains for downflow booths did not improve their control performance.

  8. Interactive design and analysis of future large spacecraft concepts

    NASA Technical Reports Server (NTRS)

    Garrett, L. B.

    1981-01-01

    An interactive computer aided design program used to perform systems level design and analysis of large spacecraft concepts is presented. Emphasis is on rapid design, analysis of integrated spacecraft, and automatic spacecraft modeling for lattice structures. Capabilities and performance of multidiscipline applications modules, the executive and data management software, and graphics display features are reviewed. A single user at an interactive terminal can create, design, analyze, and conduct parametric studies of Earth-orbiting spacecraft with relative ease. Data generated in the design, analysis, and performance evaluation of an Earth-orbiting large diameter antenna satellite are used to illustrate current capabilities. Computer run time statistics for the individual modules quantify the speed at which modeling, analysis, and design evaluation of integrated spacecraft concepts are accomplished in a user interactive computing environment.

  9. Integrating image quality in 2nu-SVM biometric match score fusion.

    PubMed

    Vatsa, Mayank; Singh, Richa; Noore, Afzel

    2007-10-01

    This paper proposes an intelligent 2nu-support vector machine based match score fusion algorithm to improve the performance of face and iris recognition by integrating the quality of images. The proposed algorithm applies redundant discrete wavelet transform to evaluate the underlying linear and non-linear features present in the image. A composite quality score is computed to determine the extent of smoothness, sharpness, noise, and other pertinent features present in each subband of the image. The match score and the corresponding quality score of an image are fused using 2nu-support vector machine to improve the verification performance. The proposed algorithm is experimentally validated using the FERET face database and the CASIA iris database. The verification performance and statistical evaluation show that the proposed algorithm outperforms existing fusion algorithms.

  10. Synthetic data sets for the identification of key ingredients for RNA-seq differential analysis.

    PubMed

    Rigaill, Guillem; Balzergue, Sandrine; Brunaud, Véronique; Blondet, Eddy; Rau, Andrea; Rogier, Odile; Caius, José; Maugis-Rabusseau, Cathy; Soubigou-Taconnat, Ludivine; Aubourg, Sébastien; Lurin, Claire; Martin-Magniette, Marie-Laure; Delannoy, Etienne

    2018-01-01

    Numerous statistical pipelines are now available for the differential analysis of gene expression measured with RNA-sequencing technology. Most of them are based on similar statistical frameworks after normalization, differing primarily in the choice of data distribution, mean and variance estimation strategy and data filtering. We propose an evaluation of the impact of these choices when few biological replicates are available through the use of synthetic data sets. This framework is based on real data sets and allows the exploration of various scenarios differing in the proportion of non-differentially expressed genes. Hence, it provides an evaluation of the key ingredients of the differential analysis, free of the biases associated with the simulation of data using parametric models. Our results show the relevance of a proper modeling of the mean by using linear or generalized linear modeling. Once the mean is properly modeled, the impact of the other parameters on the performance of the test is much less important. Finally, we propose to use the simple visualization of the raw P-value histogram as a practical evaluation criterion of the performance of differential analysis methods on real data sets. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
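
The authors' practical criterion, the raw p-value histogram, can be sketched with simulated p-values: a mixture of uniform nulls and near-zero differentially expressed genes gives a flat histogram with a spike at small p, and departures from that shape flag modelling problems. All proportions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated raw p-values: 90% null genes (uniform on [0, 1]) plus 10%
# differentially expressed genes concentrated near zero (illustrative mix).
n_genes = 10_000
null_p = rng.uniform(0.0, 1.0, int(0.9 * n_genes))
de_p = rng.beta(0.5, 25.0, n_genes - len(null_p))
pvals = np.concatenate([null_p, de_p])

# A well-behaved analysis gives a flat histogram with a spike at small p;
# a hump near 1 or a sagging middle hints at mis-modelled means or variances.
counts, edges = np.histogram(pvals, bins=20, range=(0.0, 1.0))
for c, lo in zip(counts, edges):
    print(f"{lo:4.2f} {'#' * int(c // 25)}")
```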

  11. Simulation and evaluation of phase noise for optical amplification using semiconductor optical amplifiers in DPSK applications

    NASA Astrophysics Data System (ADS)

    Hong, Wei; Huang, Dexiu; Zhang, Xinliang; Zhu, Guangxi

    2008-01-01

    A thorough simulation and evaluation of phase noise for optical amplification using a semiconductor optical amplifier (SOA) is very important for predicting its performance in differential phase-shift keyed (DPSK) applications. In this paper, the standard deviation and probability distribution of differential phase noise at the SOA output are obtained from the statistics of simulated differential phase noise. By using a full-wave model of the SOA, the noise performance over the entire operating range can be investigated. It is shown that nonlinear phase noise substantially contributes to the total phase noise in the case of a noisy signal amplified by a saturated SOA, and that the nonlinear contribution is larger with shorter SOA carrier lifetime. It is also shown that a Gaussian distribution is a good approximation of the total differential phase noise statistics over the whole operating range. Power penalty due to differential phase noise is evaluated using a semi-analytical probability density function (PDF) of receiver noise. A marked increase in power penalty at high signal input powers is found for low input OSNR, which is due to both the large nonlinear differential phase noise and the dependence of the BER vs. receiving power curvature on the differential phase noise standard deviation.

  12. European cardiovascular mortality over the last three decades: evaluation of time trends, forecasts for 2016.

    PubMed

    Gaeta, M; Campanella, F; Gentile, L; Schifino, G M; Capasso, L; Bandera, F; Banfi, G; Arpesella, M; Ricci, C

    2017-01-01

    Circulatory diseases, in particular ischemic heart diseases and stroke, represent the main causes of death worldwide, both in high income and in middle and low income countries. Our aim is to provide a comprehensive report depicting circulatory disease mortality in Europe over the last 30 years and to address the sources of heterogeneity among different countries. Our study was performed using the WHO statistical information system - mortality database - and was restricted to the 28 countries belonging to the European Union (EU-28). We evaluated gender and age time series of all circulatory disease mortality, ischemic heart diseases, cerebrovascular diseases, pulmonary and other circulatory diseases, and then produced forecasts for 2016. Mortality heterogeneity across countries was evaluated using Cochran's Q statistic and the I-squared index. Between 1985 and 2011 the standardized death rate (SDR) for deaths attributable to all circulatory system diseases decreased from 440.9 to 212.0 per 100,000 in the EU-28, and a clear uniform reduction was observed. Heterogeneity among countries was found to be considerable; therefore separate analyses were carried out by geographical area. We forecast a reduction in European cardiovascular mortality. Heterogeneity among countries could only in part be explained by geographical and health expenditure factors.
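
Cochran's Q and the I-squared index used here to quantify between-country heterogeneity can be sketched from their definitions (inverse-variance weights; I² = (Q − df)/Q, floored at zero). The country-level estimates below are hypothetical, not the study's values.

```python
import numpy as np

def cochran_q_i2(estimates, variances):
    # Inverse-variance (fixed-effect) pooling, Cochran's Q, and I-squared
    w = 1.0 / np.asarray(variances, dtype=float)
    est = np.asarray(estimates, dtype=float)
    pooled = (w * est).sum() / w.sum()
    q = (w * (est - pooled) ** 2).sum()
    df = len(est) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical per-country mortality declines (SDR change) with variances;
# these numbers are illustrative, not the study's estimates.
declines = [42.0, 55.0, 38.0, 60.0, 30.0]
variances = [4.0, 5.0, 3.0, 6.0, 4.0]
q, i2 = cochran_q_i2(declines, variances)
print(f"Q = {q:.1f} on {len(declines) - 1} df, I^2 = {i2:.0f}%")
```

A Q far above its degrees of freedom, equivalently an I² near 100%, is the signature of the between-country heterogeneity the abstract reports.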

  13. Improvement of IFNγ ELISPOT Performance Following Overnight Resting of Frozen PBMC Samples Confirmed Through Rigorous Statistical Analysis

    PubMed Central

    Santos, Radleigh; Buying, Alcinette; Sabri, Nazila; Yu, John; Gringeri, Anthony; Bender, James; Janetzki, Sylvia; Pinilla, Clemencia; Judkowski, Valeria A.

    2014-01-01

    Immune monitoring of functional responses is a fundamental parameter to establish correlates of protection in clinical trials evaluating vaccines and therapies to boost antigen-specific responses. The IFNγ ELISPOT assay is a well-standardized and validated method for the determination of functional IFNγ-producing T-cells in peripheral blood mononuclear cells (PBMC); however, its performance greatly depends on the quality and integrity of the cryopreserved PBMC. Here, we investigate the effect of overnight (ON) resting of the PBMC on the detection of CD8-restricted peptide-specific responses by IFNγ ELISPOT. The study used PBMC from healthy donors to evaluate the CD8 T-cell response to five pooled or individual HLA-A2 viral peptides. The results were analyzed using a modification of the existing distribution-free resampling (DFR) method recommended for the analysis of ELISPOT data to ensure the most rigorous possible standard of significance. The results of the study demonstrate that ON resting of PBMC samples prior to IFNγ ELISPOT increases both the magnitude and the statistical significance of the responses. In addition, a comparison of the results with a 13-day preculture of PBMC with the peptides before testing demonstrates that ON resting is sufficient for the efficient evaluation of immune functioning. PMID:25546016
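
The resampling idea behind DFR-style significance testing can be sketched with a generic exact permutation test on replicate well counts; this is a simpler stand-in for, not a reproduction of, the modified DFR procedure the authors used. The spot counts are hypothetical.

```python
from itertools import combinations

import numpy as np

antigen = np.array([48.0, 55.0, 41.0])   # spot counts, antigen-stimulated wells
control = np.array([12.0, 9.0, 15.0])    # medium-only control wells

obs = antigen.mean() - control.mean()
pooled = np.concatenate([antigen, control])

# Exact permutation null: every way of relabelling three of the six wells
# as "antigen" (a generic resampling test in the spirit of DFR)
diffs = []
for pick in combinations(range(6), 3):
    mask = np.zeros(6, dtype=bool)
    mask[list(pick)] = True
    diffs.append(pooled[mask].mean() - pooled[~mask].mean())
p = np.mean(np.array(diffs) >= obs)
print(obs, p)   # observed difference 36.0; p = 1/20 = 0.05
```

With triplicate wells there are only C(6,3) = 20 relabelings, so p can never fall below 1/20; this floor is one reason the full DFR method pools information across donors and conditions.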

  14. Cytotoxicity and Initial Biocompatibility of Endodontic Biomaterials (MTA and Biodentine™) Used as Root-End Filling Materials.

    PubMed

    Escobar-García, Diana María; Aguirre-López, Eva; Méndez-González, Verónica; Pozos-Guillén, Amaury

    2016-01-01

    Objective. The aim of this study was to evaluate the cytotoxicity and cellular adhesion of Mineral Trioxide Aggregate (MTA) and Biodentine (BD) on periodontal ligament (PDL) fibroblasts. Methods. PDL cells were obtained from nonerupted third molars and cultured; the MTS cellular proliferation assay was carried out in two groups, MTA and BD, with respective controls at different time periods. Also, the LIVE/DEAD assay was performed at 24 h. For evaluation of cellular adhesion, immunocytochemistry was conducted to discern the expression of Integrin β1 and Vinculin at 12 h and 24 h. Statistical analysis was performed by the Kruskal-Wallis and Mann-Whitney U tests. Results. MTA and BD exhibited living cells up to 7 days. Greater expression of Integrin β1 and Vinculin was demonstrated in the control group, followed by BD and MTA, which also showed cellular loss and morphological changes. There was a significant difference in the experimental groups cultured for 5 and 7 days compared with the control, but there was no statistically significant difference between the two cements. Conclusions. Neither material was cytotoxic during the time evaluated. There was an increase of cell adhesion through the expression of focal contacts observed in the case of BD, followed by MTA, but not significantly.

  15. Metabolic alterations in broiler chickens experimentally infected with sporulated oocysts of Eimeria maxima.

    PubMed

    Freitas, Fagner Luiz da Costa

    2014-01-01

    Metabolic and morphometric alterations of the duodenal villi caused by parasitism of chickens by Eimeria maxima were evaluated, using 100 male Cobb birds, randomly distributed into two groups (control and infected). The infected group was inoculated with 0.5 ml of a solution containing 5 × 10³ sporulated oocysts of Eimeria maxima. Ten birds per group were sacrificed on the 6th, 11th, 22nd and 41st days post-infection (dpi). In order to evaluate the alterations, samples of duodenum, jejunum and ileum fragments were collected after necropsy for histological analysis. Villus biometry was determined by means of a slide graduated in microns that was attached to a binocular microscope. To evaluate the biochemical data, 5 ml of blood were sampled from the birds before sacrifice. The statistical analyses were performed using the GraphPad 5 statistical software for Windows. Tukey's multiple comparison test (p < 0.05) was performed for the different dpis, and the unpaired t test for the difference between the groups. Infection by E. maxima causes both qualitative and quantitative alterations to the structure of the intestinal villi, thereby interfering with the absorption of nutrients such as calcium, phosphorus, magnesium, protein and lipids, with consequent reductions in the birds' weights.
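
    The between-group comparison described (an unpaired t test between control and infected birds) corresponds to a standard two-sample test. A minimal sketch with hypothetical serum values, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical serum calcium (mg/dL) at one dpi, ten birds per group
control = rng.normal(10.2, 0.4, size=10)
infected = rng.normal(8.9, 0.4, size=10)

# Unpaired (two-sample) t test between the groups
t_stat, p_val = stats.ttest_ind(control, infected, equal_var=True)
significant = p_val < 0.05
```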

  16. Mean template for tensor-based morphometry using deformation tensors.

    PubMed

    Leporé, Natasha; Brun, Caroline; Pennec, Xavier; Chou, Yi-Yu; Lopez, Oscar L; Aizenstein, Howard J; Becker, James T; Toga, Arthur W; Thompson, Paul M

    2007-01-01

    Tensor-based morphometry (TBM) studies anatomical differences between brain images statistically, to identify regions that differ between groups, over time, or correlate with cognitive or clinical measures. Using a nonlinear registration algorithm, all images are mapped to a common space, and statistics are most commonly performed on the Jacobian determinant (local expansion factor) of the deformation fields. In previous work, it was shown that the detection sensitivity of the standard TBM approach could be increased by using the full deformation tensors in a multivariate statistical analysis. Here we set out to improve the common space itself, by choosing the shape that minimizes a natural metric on the deformation tensors from that space to the population of control subjects. This method avoids statistical bias and should ease nonlinear registration of new subjects' data to a template that is 'closest' to all subjects' anatomies. As deformation tensors are symmetric positive-definite matrices and do not form a vector space, all computations are performed in the log-Euclidean framework. The control brain B that is already the closest to 'average' is found. A gradient descent algorithm is then used to perform the minimization that iteratively deforms this template and obtains the mean shape. We apply our method to map the profile of anatomical differences in a dataset of 26 HIV/AIDS patients and 14 controls, via a log-Euclidean Hotelling's T2 test on the deformation tensors. These results are compared to the ones found using the 'best' control, B. Statistics on both shapes are evaluated using cumulative distribution functions of the p-values in maps of inter-group differences.
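
    The log-Euclidean averaging the template construction relies on can be sketched directly: symmetric positive-definite tensors are mapped to their matrix logarithms (which live in a vector space), averaged there, and mapped back. A minimal illustration with two hypothetical 2×2 deformation tensors:

```python
import numpy as np
from scipy.linalg import logm, expm

def log_euclidean_mean(tensors):
    """Mean of SPD matrices in the log-Euclidean framework: average the
    matrix logarithms, then exponentiate back to the SPD manifold."""
    logs = [logm(t) for t in tensors]
    return expm(np.mean(logs, axis=0)).real

# Two hypothetical 2x2 symmetric positive-definite deformation tensors
a = np.array([[2.0, 0.3], [0.3, 1.0]])
b = np.array([[1.5, -0.1], [-0.1, 0.8]])
mean_t = log_euclidean_mean([a, b])
```

    Unlike a plain arithmetic average, this mean is guaranteed to stay symmetric positive-definite, which is why the log-Euclidean framework is used for deformation tensors.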

  17. Validation of satellite-based rainfall in Kalahari

    NASA Astrophysics Data System (ADS)

    Lekula, Moiteela; Lubczynski, Maciek W.; Shemang, Elisha M.; Verhoef, Wouter

    2018-06-01

    Water resources management in arid and semi-arid areas is hampered by insufficient rainfall data, typically obtained from sparsely distributed rain gauges. Satellite-based rainfall estimates (SREs) are alternative sources of such data in these areas. In this study, daily rainfall estimates from FEWS-RFE∼11 km, TRMM-3B42∼27 km, CMORPH∼27 km and CMORPH∼8 km were evaluated against nine daily rain gauge records in the Central Kalahari Basin (CKB), over a five-year period, 01/01/2001-31/12/2005. The aims were to evaluate the daily rainfall detection capabilities of the four SRE algorithms, analyze the spatio-temporal variability of rainfall in the CKB and perform bias correction of the four SREs. Evaluation methods included scatter plot analysis, descriptive statistics, categorical statistics and bias decomposition. The spatio-temporal variability of rainfall was assessed using the SREs' mean annual rainfall, standard deviation, coefficient of variation and spatial correlation functions. Bias correction of the four SREs was conducted using a Time-Varying Space-Fixed (TVSF) bias-correction scheme. The results underlined the importance of validating daily SREs, as they had different rainfall detection capabilities in the CKB. The FEWS-RFE∼11 km performed best, providing better results of descriptive and categorical statistics than the other three SREs, although bias decomposition showed that all SREs underestimated rainfall. The analysis showed that the most reliable SRE performance indicators were the frequency of "miss" rainfall events and the "miss-bias", as they directly indicated SREs' sensitivity and bias of rainfall detection, respectively. The TVSF bias-correction scheme improved some error measures but reduced the spatial correlation distance, thus increasing the already high spatial rainfall variability of all four SREs.
    This study highlighted SREs as a valuable source of daily rainfall data, providing good spatio-temporal coverage especially suitable for areas with limited rain gauges, such as the CKB, but also emphasized the SREs' drawbacks, creating an avenue for follow-up research.
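
    The categorical statistics used in such validations derive from a wet/dry contingency table of gauge versus satellite days. A sketch with hypothetical daily series and an assumed 1 mm/day wet-day threshold (the study's thresholds may differ):

```python
import numpy as np

def categorical_stats(gauge, sre, threshold=1.0):
    """Contingency-table scores for daily rain detection (mm/day).
    hits: both wet; misses: gauge wet, SRE dry; false alarms: SRE wet, gauge dry."""
    g = gauge >= threshold
    s = sre >= threshold
    hits = np.sum(g & s)
    misses = np.sum(g & ~s)
    false_alarms = np.sum(~g & s)
    pod = hits / (hits + misses)                 # probability of detection
    far = false_alarms / (hits + false_alarms)   # false alarm ratio
    csi = hits / (hits + misses + false_alarms)  # critical success index
    return pod, far, csi

# Hypothetical daily rainfall series (mm/day)
gauge = np.array([0, 5, 0, 12, 3, 0, 8, 0])
sre   = np.array([0, 4, 2, 10, 0, 0, 6, 1])
pod, far, csi = categorical_stats(gauge, sre)
```

    The "miss" frequency singled out in the abstract is simply the `misses` count; "miss-bias" additionally weights those days by the rainfall amount missed.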

  18. A framework for evaluating statistical downscaling performance under changing climatic conditions (Invited)

    NASA Astrophysics Data System (ADS)

    Dixon, K. W.; Balaji, V.; Lanzante, J.; Radhakrishnan, A.; Hayhoe, K.; Stoner, A. K.; Gaitan, C. F.

    2013-12-01

    Statistical downscaling (SD) methods may be viewed as generating a value-added product - a refinement of global climate model (GCM) output designed to add finer scale detail and to address GCM shortcomings via a process that gleans information from a combination of observations and GCM-simulated climate change responses. Making use of observational data sets and GCM simulations representing the same historical period, cross-validation techniques allow one to assess how well an SD method meets this goal. However, lacking observations of the future, the extent to which a particular SD method's skill might degrade when applied to future climate projections cannot be assessed in the same manner. Here we illustrate and describe extensions to a 'perfect model' experimental design that seeks to quantify aspects of SD method performance both for a historical period (1979-2008) and for late 21st century climate projections. Examples highlighting cases in which downscaling performance deteriorates in future climate projections will be discussed. Also, results will be presented showing how synthetic datasets having known statistical properties may be used to further isolate factors responsible for degradations in SD method skill under changing climatic conditions. We will describe a set of input files used to conduct these analyses that are being made available to researchers who wish to utilize this experimental framework to evaluate SD methods they have developed. The gridded data sets cover a region centered on the contiguous 48 United States with a grid spacing of approximately 25km, have daily time resolution (e.g., maximum and minimum near-surface temperature and precipitation), and represent a total of 120 years of model simulations. 
This effort is consistent with the 2013 National Climate Predictions and Projections Platform Quantitative Evaluation of Downscaling Workshop goal of supporting a community approach to promote the informed use of downscaled climate projections.

  19. An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

    PubMed

    Ng'andu, N H

    1997-03-30

    In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.

  20. Evaluation of a Performance-Based Expert Elicitation: WHO Global Attribution of Foodborne Diseases.

    PubMed

    Aspinall, W P; Cooke, R M; Havelaar, A H; Hoffmann, S; Hald, T

    2016-01-01

    For many societally important science-based decisions, data are inadequate, unreliable or non-existent, and expert advice is sought. In such cases, procedures for eliciting structured expert judgments (SEJ) are increasingly used. This raises questions regarding validity and reproducibility. This paper presents new findings from a large-scale international SEJ study intended to estimate the global burden of foodborne disease on behalf of WHO. The study involved 72 experts distributed over 134 expert panels, with panels comprising thirteen experts on average. Elicitations were conducted in five languages. Performance-based weighted solutions for target questions of interest were formed for each panel. These weights were based on individual expert's statistical accuracy and informativeness, determined using between ten and fifteen calibration variables from the experts' field with known values. Equal weights combinations were also calculated. The main conclusions on expert performance are: (1) SEJ does provide a science-based method for attribution of the global burden of foodborne diseases; (2) equal weighting of experts per panel increased statistical accuracy to acceptable levels, but at the cost of informativeness; (3) performance-based weighting increased informativeness, while retaining accuracy; (4) due to study constraints individual experts' accuracies were generally lower than in other SEJ studies, and (5) there was a negative correlation between experts' informativeness and statistical accuracy which attenuated as accuracy improved, revealing that the least accurate experts drive the negative correlation. It is shown, however, that performance-based weighting has the ability to yield statistically accurate and informative combinations of experts' judgments, thereby offsetting this contrary influence. 
The present findings suggest that application of SEJ on a large scale is feasible, and motivate the development of enhanced training and tools for remote elicitation of multiple, internationally-dispersed panels.
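
    The statistical-accuracy component of performance-based weighting can be sketched in the style of Cooke's classical model: compare where the known calibration values fall among an expert's stated 5/50/95% quantiles against the expected inter-quantile frequencies, and score the expert by the p-value of a chi-square-type statistic. This is an illustrative simplification with invented numbers, not the study's implementation:

```python
import numpy as np
from scipy.stats import chi2

def calibration_score(realizations, quantiles_5_50_95):
    """Statistical accuracy of one expert: 2n * relative entropy between the
    empirical inter-quantile frequencies and (0.05, 0.45, 0.45, 0.05) is
    asymptotically chi-square with 3 degrees of freedom."""
    expected = np.array([0.05, 0.45, 0.45, 0.05])
    counts = np.zeros(4)
    for x, (q5, q50, q95) in zip(realizations, quantiles_5_50_95):
        if x < q5:
            counts[0] += 1
        elif x < q50:
            counts[1] += 1
        elif x < q95:
            counts[2] += 1
        else:
            counts[3] += 1
    n = len(realizations)
    s = counts / n
    mask = s > 0
    stat = 2 * n * np.sum(s[mask] * np.log(s[mask] / expected[mask]))
    return chi2.sf(stat, df=3)  # higher = better-calibrated expert

# Hypothetical calibration variables: true values and one expert's quantiles
realizations = [3.1, 4.0, 2.2, 5.5, 3.7, 2.9, 4.4, 3.3, 2.5, 4.9]
quantiles = [(1, 3, 6)] * 10
score = calibration_score(realizations, quantiles)
```

    In the full method, this accuracy score is multiplied by an information score (relative to a background measure) to form each expert's weight.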

  2. Variational stereo imaging of oceanic waves with statistical constraints.

    PubMed

    Gallego, Guillermo; Yezzi, Anthony; Fedele, Francesco; Benetazzo, Alvise

    2013-11-01

    An image processing observational technique for the stereoscopic reconstruction of the waveform of oceanic sea states is developed. The technique incorporates the enforcement of any given statistical wave law modeling the quasi-Gaussianity of oceanic waves observed in nature. The problem is posed in a variational optimization framework, where the desired waveform is obtained as the minimizer of a cost functional that combines image observations, smoothness priors and a weak statistical constraint. The minimizer is obtained by combining gradient descent and multigrid methods on the necessary optimality equations of the cost functional. Robust photometric error criteria and a spatial intensity compensation model are also developed to improve the performance of the presented image matching strategy. The weak statistical constraint is thoroughly evaluated, in combination with the other elements presented, by reconstructing and enforcing constraints on experimental stereo data, demonstrating the improvement in the estimation of the observed ocean surface.
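
    The variational setup (data fidelity plus smoothness prior plus a weak statistical constraint) can be illustrated in one dimension: a gradient descent denoises a synthetic signal while weakly pushing its variance toward a target value prescribed by an assumed wave statistic. All weights and signals below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 1D "wave elevation": smooth truth plus observation noise
x = np.linspace(0, 4 * np.pi, 200)
truth = np.sin(x)
data = truth + rng.normal(0.0, 0.3, x.size)

alpha, beta = 2.0, 0.5     # weights: smoothness prior / weak statistical constraint
target_var = truth.var()   # variance prescribed by an assumed statistical wave law

u = data.copy()
step = 0.05
for _ in range(500):
    grad = 2.0 * (u - data)                      # data fidelity term
    lap = np.zeros_like(u)                       # discrete Laplacian (interior)
    lap[1:-1] = u[:-2] - 2.0 * u[1:-1] + u[2:]
    grad += -2.0 * alpha * lap                   # smoothness prior gradient
    # weak statistical constraint: penalize (var(u) - target_var)^2
    grad += beta * 4.0 * (u.var() - target_var) * (u - u.mean()) / u.size
    u -= step * grad

noisy_err = np.mean((data - truth) ** 2)
denoised_err = np.mean((u - truth) ** 2)
```

    The constraint is "weak" in the sense that it enters as a penalty term rather than a hard equality, so the data and smoothness terms are never overridden.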

  3. Student performance and course evaluations before and after use of the Classroom Performance System ™ in a third-year veterinary radiology course.

    PubMed

    Hecht, Silke; Adams, W H; Cunningham, M A; Lane, I F; Howell, N E

    2013-01-01

    Effective teaching of veterinary radiology can be challenging in a traditional classroom environment. Audience response systems, colloquially known as "clickers," provide a means of encouraging student interaction. The purpose of this study was to compare student performance and course evaluations before and after using the Classroom Performance System™ in the third-year (fifth semester) didactic radiology course at the University of Tennessee College of Veterinary Medicine. Overall student performance was assessed by comparing median numeric final course grades (%) between years without and with use of the Classroom Performance System™. Grades of students were determined for individual instructors' sections. Student evaluations of the radiology course were compared for the years available (2007-2010). Student interactions were also evaluated subjectively by instructors who used the Classroom Performance System™. There was a significant difference (p = 0.009) between the median student grade before (2005-2008, median 82.2%; interquartile range 77.6-85.7%; range 61.9-95.5%) and after use of the Classroom Performance System™ (2009-2010, median 83.6%; interquartile range 79.9-87.9%; range 68.2-93.2%). There was no statistically significant difference in median student grades for individual instructors over the study period. The radiology course student evaluation scores were significantly higher in years where the Classroom Performance System™ was used in comparison to previous years (p = 0.019). Subjectively, students appeared more involved when using clickers. Findings indicated that the Classroom Performance System™ may be a useful tool for enhancing veterinary radiology education. © 2012 Veterinary Radiology & Ultrasound.
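
    A nonparametric comparison of two grade distributions by their medians, of the kind reported here, is commonly run as a Mann-Whitney U test. A sketch with simulated grades, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical final course grades (%) for years before and after clicker use
before = rng.normal(82, 5, size=120).clip(60, 100)
after = rng.normal(84, 5, size=60).clip(60, 100)

# Two-sided nonparametric comparison of the two grade distributions
u_stat, p = stats.mannwhitneyu(before, after, alternative="two-sided")
```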

  4. Estimation of diagnostic test accuracy without full verification: a review of latent class methods

    PubMed Central

    Collins, John; Huynh, Minh

    2014-01-01

    The performance of a diagnostic test is best evaluated against a reference test that is without error. For many diseases, this is not possible, and an imperfect reference test must be used. However, diagnostic accuracy estimates may be biased if inaccurately verified status is used as the truth. Statistical models have been developed to handle this situation by treating disease as a latent variable. In this paper, we conduct a systematized review of statistical methods using latent class models for estimating test accuracy and disease prevalence in the absence of complete verification. PMID:24910172
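
    A minimal latent class analysis can be sketched with an EM algorithm: with three conditionally independent imperfect tests, prevalence, sensitivities and specificities are identifiable with no gold standard at all. This is a generic illustration on simulated data, not a specific model from the review:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate three conditionally independent imperfect tests, no gold standard
n, prev = 5000, 0.3
se_true = np.array([0.90, 0.80, 0.85])  # sensitivities
sp_true = np.array([0.95, 0.90, 0.92])  # specificities
disease = rng.random(n) < prev
tests = np.where(disease[:, None],
                 rng.random((n, 3)) < se_true,
                 rng.random((n, 3)) < 1.0 - sp_true).astype(float)

# EM for the latent class model: true disease status is the latent variable
prev_hat, se, sp = 0.5, np.full(3, 0.7), np.full(3, 0.7)
for _ in range(200):
    # E-step: posterior probability of disease given each subject's results
    l1 = prev_hat * np.prod(se ** tests * (1.0 - se) ** (1.0 - tests), axis=1)
    l0 = (1.0 - prev_hat) * np.prod((1.0 - sp) ** tests * sp ** (1.0 - tests), axis=1)
    w = l1 / (l1 + l0)
    # M-step: re-estimate prevalence, sensitivities and specificities
    prev_hat = w.mean()
    se = (w[:, None] * tests).sum(axis=0) / w.sum()
    sp = ((1.0 - w)[:, None] * (1.0 - tests)).sum(axis=0) / (1.0 - w).sum()
```

    Conditional independence given disease status is the key assumption; the review discusses models that relax it.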

  5. Understanding evaluation of learning support in mathematics and statistics

    NASA Astrophysics Data System (ADS)

    MacGillivray, Helen; Croft, Tony

    2011-03-01

    With rapid and continuing growth of learning support initiatives in mathematics and statistics found in many parts of the world, and with the likelihood that this trend will continue, there is a need to ensure that robust and coherent measures are in place to evaluate the effectiveness of these initiatives. The nature of learning support brings challenges for measurement and analysis of its effects. After briefly reviewing the purpose, rationale for, and extent of current provision, this article provides a framework for those working in learning support to think about how their efforts can be evaluated. It provides references and specific examples of how workers in this field are collecting, analysing and reporting their findings. The framework is used to structure evaluation in terms of usage of facilities, resources and services provided, and also in terms of improvements in performance of the students and staff who engage with them. Very recent developments have started to address the effects of learning support on the development of deeper approaches to learning, the affective domain and the development of communities of practice of both learners and teachers. This article intends to be a stimulus to those who work in mathematics and statistics support to gather even richer, more valuable, forms of data. It provides a 'toolkit' for those interested in evaluation of learning support and closes by referring to an on-line resource being developed to archive the growing body of evidence.

  6. A compliance assessment of midpoint formative assessments completed by APPE preceptors.

    PubMed

    Lea Bonner, C; Staton, April G; Naro, Patricia B; McCullough, Elizabeth; Lynn Stevenson, T; Williamson, Margaret; Sheffield, Melody C; Miller, Mindi; Fetterman, James W; Fan, Shirley; Momary, Kathryn M

    Experiential pharmacy preceptors should provide formative and summative feedback during a learning experience. Preceptors are required to provide colleges and schools of pharmacy with assessments or evaluations of students' performance. Students and experiential programs value on-time completion of midpoint evaluations by preceptors. The objective of this study was to determine the number of on-time electronically documented formative midpoint evaluations completed by preceptors during advanced pharmacy practice experiences (APPEs). Compliance rates of on-time electronically documented formative midpoint evaluations were reviewed by the Office of Experiential Education of a five-member consortium during the two-year study period prior to the adoption of Standards 2016. Pearson chi-square test and generalized linear models were used to determine if statistically significant differences were present. Average midpoint compliance rates for the two-year research period were 40.7% and 41% respectively. No statistical significance was noted comparing compliance rates for year one versus year two. However, statistical significance was present when comparing compliance rates between schools during year two. Feedback from students and preceptors pointed to the need for brief formal midpoint evaluations that require minimal time to complete, user friendly experiential management software, and methods for documenting verbal feedback through student self-reflection. Additional education and training to both affiliate and faculty preceptors on the importance of written formative feedback at midpoint is critical to remaining in compliance with Standards 2016. Copyright © 2017 Elsevier Inc. All rights reserved.
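
    The Pearson chi-square comparison of compliance rates between schools can be sketched as a test on an on-time/late contingency table (counts invented for illustration, not the study's data):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical on-time vs. late midpoint-evaluation counts for five schools
on_time = np.array([120, 95, 80, 140, 60])
late = np.array([150, 210, 130, 160, 145])
table = np.vstack([on_time, late])

chi2_stat, p, dof, expected = chi2_contingency(table)
differs_between_schools = p < 0.05
```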

  7. SU-E-J-85: Leave-One-Out Perturbation (LOOP) Fitting Algorithm for Absolute Dose Film Calibration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chu, A; Ahmad, M; Chen, Z

    2014-06-01

    Purpose: To introduce an outlier-recognition fitting routine for film dosimetry. It is not only flexible enough to work with any linear or non-linear regression, but can also provide information on the minimal number of sampling points, critical sampling distributions, and the evaluation of analytical functions for absolute film-dose calibration. Methods: The technique, leave-one-out (LOO) cross validation, is often used for statistical analyses of model performance. We used LOO analysis with perturbed bootstrap fitting, called leave-one-out perturbation (LOOP), for film-dose calibration. Given a threshold, the LOO process detects unfit points ("outliers") compared to other cohorts, and a bootstrap fitting process follows to seek any possibilities of using perturbations for further improvement. After outliers were reconfirmed by traditional t-test statistics and eliminated, another LOOP feedback produced the final result. An over-sampled film-dose-calibration dataset was collected as a reference (dose range: 0-800 cGy), and various simulated conditions for outliers and sampling distributions were derived from the reference. Comparisons over the various conditions were made, and the performance of fitting functions, polynomial and rational, was evaluated. Results: (1) LOOP demonstrated sensitive outlier recognition through its statistical correlation to an exceptionally better goodness-of-fit as outliers were left out. (2) With sufficient statistical information, LOOP can correct outliers under some low-sampling conditions that other "robust fits", e.g. Least Absolute Residuals, cannot. (3) Complete cross-validated analyses of LOOP indicate that the rational-type function demonstrates much superior performance compared to the polynomial. Even with 5 data points including one outlier, LOOP with a rational function can restore more than 95% of the value back to its reference, while polynomial fitting completely failed under the same conditions. 
    Conclusion: LOOP can cooperate with any fitting routine, functioning as a "robust fit". In addition, it can serve as a benchmark for film-dose calibration fitting performance.
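
    The LOO step at the core of LOOP can be sketched as follows: each calibration point is left out in turn, the remaining points are refit, and the left-out point's prediction error flags candidate outliers. This simplified version uses a quadratic fit and an invented rational-type dose response, not the authors' full routine:

```python
import numpy as np

# Hypothetical net-OD vs. dose calibration points, with one corrupted reading
dose = np.array([0, 100, 200, 300, 400, 500, 600, 700, 800], float)
od = 2.5 * dose / (dose + 300.0)  # underlying rational-type response (invented)
od[4] += 0.6                      # inject an outlier at 400 cGy

def loo_residuals(x, y, deg=2):
    """Leave-one-out: refit without each point and measure how badly the
    left-out point is predicted; large values flag candidate outliers."""
    res = np.empty(len(x))
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        coef = np.polyfit(x[keep], y[keep], deg)
        res[i] = abs(y[i] - np.polyval(coef, x[i]))
    return res

r = loo_residuals(dose, od)
outlier = int(np.argmax(r))  # index of the most suspect calibration point
```

    In the full LOOP routine, flagged points are then re-examined with perturbed bootstrap refits and a t-test before being discarded.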

  8. Results of the U. S. Geological Survey's analytical evaluation program for standard reference samples distributed in April 2001

    USGS Publications Warehouse

    Woodworth, M.T.; Connor, B.F.

    2001-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-165 (trace constituents), M-158 (major constituents), N-69 (nutrient constituents), N-70 (nutrient constituents), P-36 (low ionic-strength constituents), and Hg-32 (mercury) -- that were distributed in April 2001 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 73 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.
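
    The "most probable value ... using nonparametric statistics" step can be illustrated with a median estimate and a MAD-based robust spread for rating laboratories. This is a generic sketch, not the USGS procedure's exact rules:

```python
import numpy as np

# Hypothetical reported concentrations (µg/L) for one analyte from nine labs
reported = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1, 9.7])

mpv = np.median(reported)                # most probable value: robust to outliers
mad = np.median(np.abs(reported - mpv))  # median absolute deviation
robust_sd = 1.4826 * mad                 # consistent with sigma for normal data

# Rate each laboratory's result by its robust z-score
z = (reported - mpv) / robust_sd
flagged = np.where(np.abs(z) > 3)[0]     # gross outliers (here, lab index 6)
```

    Using the median rather than the mean keeps a single grossly erroneous laboratory result from shifting the consensus value.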

  9. Results of the U. S. Geological Survey's Analytical Evaluation Program for Standard Reference Samples Distributed in March 2002

    USGS Publications Warehouse

    Woodworth, M.T.; Conner, B.F.

    2002-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-169 (trace constituents), M-162 (major constituents), N-73 (nutrient constituents), N-74 (nutrient constituents), P-38 (low ionic-strength constituents), and Hg-34 (mercury) -- that were distributed in March 2002 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 93 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  10. Results of the U.S. Geological Survey's analytical evaluation program for standard reference samples distributed in September 2002

    USGS Publications Warehouse

    Woodworth, Mark T.; Connor, Brooke F.

    2003-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-171 (trace constituents), M-164 (major constituents), N-75 (nutrient constituents), N-76 (nutrient constituents), P-39 (low ionic-strength constituents), and Hg-35 (mercury) -- that were distributed in September 2002 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 102 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  11. Results of the U.S. Geological Survey's analytical evaluation program for standard reference samples distributed in September 2001

    USGS Publications Warehouse

    Woodworth, Mark T.; Connor, Brooke F.

    2002-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-167 (trace constituents), M-160 (major constituents), N-71 (nutrient constituents), N-72 (nutrient constituents), P-37 (low ionic-strength constituents), and Hg-33 (mercury) -- that were distributed in September 2001 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 98 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  12. Results of the U.S. Geological Survey's Analytical Evaluation Program for Standard Reference Samples Distributed in March 2000

    USGS Publications Warehouse

    Farrar, Jerry W.; Copen, Ashley M.

    2000-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-161 (trace constituents), M-154 (major constituents), N-65 (nutrient constituents), N-66 nutrient constituents), P-34 (low ionic strength constituents), and Hg-30 (mercury) -- that were distributed in March 2000 to 144 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 132 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  13. Results of the U.S. Geological Survey's analytical evaluation program for standard reference samples distributed in October 1999

    USGS Publications Warehouse

    Farrar, T.W.

    2000-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-159 (trace constituents), M-152 (major constituents), N-63 (nutrient constituents), N-64 (nutrient constituents), P-33 (low ionic strength constituents), and Hg-29 (mercury) -- that were distributed in October 1999 to 149 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 131 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  14. Results of the U.S. Geological Survey's analytical evaluation program for standard reference samples distributed in March 2003

    USGS Publications Warehouse

    Woodworth, Mark T.; Connor, Brooke F.

    2003-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-173 (trace constituents), M-166 (major constituents), N-77 (nutrient constituents), N-78 (nutrient constituents), P-40 (low ionic-strength constituents), and Hg-36 (mercury) -- that were distributed in March 2003 to laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data received from 110 laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  15. Results of the U.S. Geological Survey's analytical evaluation program for standard reference samples distributed in October 2000

    USGS Publications Warehouse

    Connor, B.F.; Currier, J.P.; Woodworth, M.T.

    2001-01-01

    This report presents the results of the U.S. Geological Survey's analytical evaluation program for six standard reference samples -- T-163 (trace constituents), M-156 (major constituents), N-67 (nutrient constituents), N-68 (nutrient constituents), P-35 (low ionic strength constituents), and Hg-31 (mercury) -- that were distributed in October 2000 to 126 laboratories enrolled in the U.S. Geological Survey sponsored interlaboratory testing program. Analytical data that were received from 122 of the laboratories were evaluated with respect to overall laboratory performance and relative laboratory performance for each analyte in the six reference samples. Results of these evaluations are presented in tabular form. Also presented are tables and graphs summarizing the analytical data provided by each laboratory for each analyte in the six standard reference samples. The most probable value for each analyte was determined using nonparametric statistics.

  16. A short-term clinical evaluation of IPS Empress 2 crowns.

    PubMed

    Toksavul, Suna; Toman, Muhittin

    2007-01-01

    The aim of this study was to evaluate the clinical performance of all-ceramic crowns made with the IPS Empress 2 system after an observation period of 12 to 60 months. Seventy-nine IPS Empress 2 crowns were placed in 21 patients. The all-ceramic crowns were evaluated clinically, radiographically, and using clinical photographs. The evaluations took place at baseline (2 days after cementation) and at 6-month intervals for 12 to 60 months. Survival rate of the crowns was determined using Kaplan-Meier statistical analysis. Based on the US Public Health Service criteria, 95.24% of the crowns were rated satisfactory after a mean follow-up period of 58 months. Fracture was registered in only 1 crown. One endodontically treated tooth failed as a result of fracture at the cervical margin area. In this in vivo study, IPS Empress 2 crowns exhibited a satisfactory clinical performance during an observation period ranging from 12 to 60 months.
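
    The Kaplan-Meier analysis used above can be sketched as follows; the follow-up times and censoring pattern here are invented for illustration, not taken from the study:

```python
def kaplan_meier(observations):
    """Kaplan-Meier estimate of the survival function.  `observations`
    is a list of (months, failed) pairs; failed=False marks a censored
    crown (still intact at last follow-up).  At each failure time the
    survival probability is multiplied by (n_at_risk - 1) / n_at_risk;
    failures are sorted before censorings at tied times."""
    ordered = sorted(observations, key=lambda o: (o[0], not o[1]))
    n_at_risk = len(ordered)
    survival = 1.0
    curve = []
    for months, failed in ordered:
        if failed:
            survival *= (n_at_risk - 1) / n_at_risk
            curve.append((months, survival))
        n_at_risk -= 1
    return curve

# Hypothetical follow-up data: two failures, four censored crowns.
obs = [(12, False), (24, True), (36, False), (48, True), (60, False), (60, False)]
curve = kaplan_meier(obs)
```

    Censored observations reduce the number at risk without reducing the survival estimate, which is how the method accommodates the 12-to-60-month spread in follow-up times.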

  17. Two Methods of Automatic Evaluation of Speech Signal Enhancement Recorded in the Open-Air MRI Environment

    NASA Astrophysics Data System (ADS)

    Přibil, Jiří; Přibilová, Anna; Frollo, Ivan

    2017-12-01

    The paper focuses on two methods for evaluating the success of enhancement of speech signals recorded in an open-air magnetic resonance imager during phonation for 3D human vocal tract modeling. The first approach provides a comparison based on statistical analysis using ANOVA and hypothesis tests. The second method is based on classification by Gaussian mixture models (GMM). The experiments performed confirmed that the proposed ANOVA and GMM classifiers for automatic evaluation of speech quality are functional and produce results fully comparable with the standard evaluation based on the listening test method.
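
    The ANOVA-based comparison can be illustrated with a minimal one-way F statistic computed over feature values from two recording conditions. The feature values below are invented; the paper's actual analysis operates on spectral properties of the recordings:

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic: the ratio of between-group to
    within-group mean squares for k groups of feature values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical feature values for original vs. enhanced recordings.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

    A large F value indicates that the enhancement shifted the feature distribution by more than the within-condition variability, which is then tested against the F distribution for significance.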

  18. Adaptively Tuned Iterative Low Dose CT Image Denoising

    PubMed Central

    Hashemi, SayedMasoud; Paul, Narinder S.; Beheshti, Soosan; Cobbold, Richard S. C.

    2015-01-01

    Improving image quality is a critical objective in low dose computed tomography (CT) imaging and is the primary focus of CT image denoising. State-of-the-art CT denoising algorithms are mainly based on iterative minimization of an objective function, in which the performance is controlled by regularization parameters. To achieve the best results, these parameters should be chosen carefully. However, the parameter selection is typically performed in an ad hoc manner, which can cause the algorithms to converge slowly or become trapped in a local minimum. To overcome these issues, a noise confidence region evaluation (NCRE) method is used, which evaluates the denoising residuals iteratively and compares their statistics with those produced by additive noise. It then updates the parameters at the end of each iteration to achieve a better match to the noise statistics. By combining NCRE with the fundamentals of the block matching and 3D filtering (BM3D) approach, a new iterative CT image denoising method is proposed. It is shown that this new denoising method improves the BM3D performance in terms of both the mean square error and a structural similarity index. Moreover, simulations and patient results show that this method preserves the clinically important details of low dose CT images together with a substantial noise reduction. PMID:26089972
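
    The parameter-update idea behind NCRE can be sketched as a discrepancy-principle loop: after each denoising pass, compare the residual's statistics with those expected of the additive noise and adjust the regularization weight accordingly. The toy shrinkage "denoiser" and the variance-only test below are simplifying assumptions; the actual NCRE method evaluates confidence regions over several residual statistics:

```python
import statistics

def tune_lambda(noisy, denoise, sigma, lam=0.1, iters=20, step=1.5):
    """Iteratively adjust the regularization weight `lam` until the
    denoising residual's standard deviation matches the assumed noise
    level `sigma`.  `denoise(img, lam)` stands in for one BM3D-style
    pass.  A residual that statistically resembles the additive noise
    suggests the denoiser removed noise rather than image detail."""
    for _ in range(iters):
        restored = denoise(noisy, lam)
        residual = [a - b for a, b in zip(noisy, restored)]
        r_sigma = statistics.pstdev(residual)
        if abs(r_sigma - sigma) / sigma < 0.05:  # residual looks like noise
            break
        lam = lam * step if r_sigma < sigma else lam / step
    return restored, lam

# Toy demonstration with a linear shrinkage "denoiser" (hypothetical):
def shrink(img, lam):
    mean = sum(img) / len(img)
    return [mean + (x - mean) / (1 + lam) for x in img]

noisy = [1.0, -1.0] * 8  # zero-mean "image" with unit noise std
restored, lam_hat = tune_lambda(noisy, shrink, sigma=0.43)
```

    If the residual is too small relative to the noise level, regularization is increased (more smoothing); if too large, it is decreased, so the loop steers the denoiser toward removing exactly the noise power and nothing more.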

  19. A risk-based approach to management of leachables utilizing statistical analysis of extractables.

    PubMed

    Stults, Cheryl L M; Mikl, Jaromir; Whelehan, Oliver; Morrical, Bradley; Duffield, William; Nagao, Lee M

    2015-04-01

    To incorporate quality by design concepts into the management of leachables, an emphasis is often put on understanding the extractable profile for the materials of construction for manufacturing disposables, container-closure, or delivery systems. Component manufacturing processes may also impact the extractable profile. An approach was developed to (1) identify critical components that may be sources of leachables, (2) enable an understanding of manufacturing process factors that affect extractable profiles, (3) determine if quantitative models can be developed that predict the effect of those key factors, and (4) evaluate the practical impact of the key factors on the product. A risk evaluation for an inhalation product identified injection molding as a key process. Designed experiments were performed to evaluate the impact of molding process parameters on the extractable profile from an ABS inhaler component. Statistical analysis of the resulting GC chromatographic profiles identified processing factors that were correlated with peak levels in the extractable profiles. The combination of statistically significant molding process parameters was different for different types of extractable compounds. ANOVA models were used to obtain optimal process settings and predict extractable levels for a selected number of compounds. The proposed paradigm may be applied to evaluate the impact of material composition and processing parameters on extractable profiles and utilized to manage product leachables early in the development process and throughout the product lifecycle.

  20. Supporting creativity and appreciation of uncertainty in exploring geo-coded public health data.

    PubMed

    Thew, S L; Sutcliffe, A; de Bruijn, O; McNaught, J; Procter, R; Jarvis, Paul; Buchan, I

    2011-01-01

    We present a prototype visualisation tool, ADVISES (Adaptive Visualization for e-Science), designed to support epidemiologists and public health practitioners in exploring geo-coded datasets and generating spatial epidemiological hypotheses. The tool is designed to support creative thinking while providing the means for the user to evaluate the validity of the visualization in terms of statistical uncertainty. We present an overview of the application and the results of an evaluation exploring public health researchers' responses to maps as a new way of viewing familiar data, in particular the use of thematic maps with adjoining descriptive statistics and forest plots to support the generation and evaluation of new hypotheses. A series of qualitative evaluations involved one experienced researcher asking 21 volunteers to interact with the system to perform a series of relatively complex, realistic map-building and exploration tasks, using a 'think aloud' protocol, followed by a semi-structured interview. The volunteers were academic epidemiologists and UK National Health Service analysts. All users quickly and confidently created maps, and went on to spend substantial amounts of time exploring and interacting with the system, generating hypotheses about their maps. Our findings suggest that the tool is able to support creativity and statistical appreciation among public health professionals and epidemiologists building thematic maps. Software such as this, introduced appropriately, could increase the capability of existing personnel for generating public health intelligence.
