Metrics for Performance Evaluation of Patient Exercises during Physical Therapy.
Vakanski, Aleksandar; Ferguson, Jake M; Lee, Stephen
2017-06-01
The article proposes a set of metrics for evaluation of patient performance in physical therapy exercises. A taxonomy is employed that classifies the metrics into quantitative and qualitative categories, based on the level of abstraction of the captured motion sequences. Further, the quantitative metrics are classified into model-less and model-based metrics, according to whether the evaluation employs the raw measurements of patient-performed motions or a mathematical model of the motions. The reviewed metrics include root-mean-square distance, Kullback-Leibler divergence, log-likelihood, heuristic consistency, Fugl-Meyer Assessment, and similar. The metrics are evaluated for a set of five human motions captured with a Kinect sensor. The metrics can potentially be integrated into a system that employs machine learning for modelling and assessment of the consistency of patient performance in a home-based therapy setting. Automated performance evaluation can overcome the inherent subjectivity in human-performed therapy assessment, increase adherence to prescribed therapy plans, and reduce healthcare costs.
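As a rough illustration of two of the quantitative metrics named in this abstract, the sketch below computes a root-mean-square distance between a reference motion and a patient motion, and a Kullback-Leibler divergence between two discrete distributions. The data layout (one list of joint coordinates per time step), the function names, and the sample values are assumptions made for illustration; the article's own preprocessing, such as temporal alignment of the sequences, is not reproduced here.

```python
import math


def rms_distance(reference, observed):
    """Root-mean-square distance between two motion sequences.

    Each sequence is a list of frames; each frame is a list of joint
    coordinates. Both sequences are assumed to be time-aligned already
    and to have the same shape (an assumption of this sketch).
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must have the same number of frames")
    total = 0.0
    count = 0
    for ref_frame, obs_frame in zip(reference, observed):
        if len(ref_frame) != len(obs_frame):
            raise ValueError("frames must have the same number of coordinates")
        for r, o in zip(ref_frame, obs_frame):
            total += (r - o) ** 2
            count += 1
    if count == 0:
        raise ValueError("sequences must not be empty")
    return math.sqrt(total / count)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps to
    avoid division by zero (a choice made for this sketch).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical one-coordinate motion with two frames.
reference_motion = [[0.0], [0.0]]
patient_motion = [[3.0], [4.0]]
print(rms_distance(reference_motion, patient_motion))
print(kl_divergence([0.5, 0.5], [0.25, 0.75]))
```

A lower value of either quantity indicates closer agreement with the reference. Note that the KL divergence is not symmetric in its two arguments.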
Evaluating hydrological model performance using information theory-based metrics
USDA-ARS's Scientific Manuscript database
The accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to use the information theory-based metrics to see whether they can be used as a complementary tool for hydrologic m...
Texture metric that predicts target detection performance
NASA Astrophysics Data System (ADS)
Culpepper, Joanne B.
2015-12-01
Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict probability of detection and mean search time. The two texture metrics are local clutter metrics and are based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance of the nine military vehicles in complex natural scenes found in the Search_2 dataset are presented. Comparison is also made between four other common clutter metrics found in the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.
Rivard, Justin D; Vergis, Ashley S; Unger, Bertram J; Hardy, Krista M; Andrew, Chris G; Gillman, Lawrence M; Park, Jason
2014-06-01
Computer-based surgical simulators capture a multitude of metrics based on different aspects of performance, such as speed, accuracy, and movement efficiency. However, without rigorous assessment, it may be unclear whether all, some, or none of these metrics actually reflect technical skill, which can compromise educational efforts on these simulators. We assessed the construct validity of individual performance metrics on the LapVR simulator (Immersion Medical, San Jose, CA, USA) and used these data to create task-specific summary metrics. Medical students with no prior laparoscopic experience (novices, N = 12), junior surgical residents with some laparoscopic experience (intermediates, N = 12), and experienced surgeons (experts, N = 11) all completed three repetitions of four LapVR simulator tasks. The tasks included three basic skills (peg transfer, cutting, clipping) and one procedural skill (adhesiolysis). We selected 36 individual metrics on the four tasks that assessed six different aspects of performance, including speed, motion path length, respect for tissue, accuracy, task-specific errors, and successful task completion. Four of seven individual metrics assessed for peg transfer, six of ten metrics for cutting, four of nine metrics for clipping, and three of ten metrics for adhesiolysis discriminated between experience levels. Time and motion path length were significant on all four tasks. We used the validated individual metrics to create summary equations for each task, which successfully distinguished between the different experience levels. Educators should maintain some skepticism when reviewing the plethora of metrics captured by computer-based simulators, as some but not all are valid. We showed the construct validity of a limited number of individual metrics and developed summary metrics for the LapVR. The summary metrics provide a succinct way of assessing skill with a single metric for each task, but require further validation.
Performance metrics for the evaluation of hyperspectral chemical identification systems
NASA Astrophysics Data System (ADS)
Truslow, Eric; Golowich, Steven; Manolakis, Dimitris; Ingle, Vinay
2016-02-01
Remote sensing of chemical vapor plumes is a difficult but important task for many military and civilian applications. Hyperspectral sensors operating in the long-wave infrared regime have well-demonstrated detection capabilities. However, the identification of a plume's chemical constituents, based on a chemical library, is a multiple hypothesis testing problem which standard detection metrics do not fully describe. We propose using an additional performance metric for identification based on the so-called Dice index. Our approach partitions and weights a confusion matrix to develop both the standard detection metrics and identification metric. Using the proposed metrics, we demonstrate that the intuitive system design of a detector bank followed by an identifier is indeed justified when incorporating performance information beyond the standard detection metrics.
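For readers unfamiliar with the Dice index mentioned in this abstract, the sketch below computes the standard set-based Dice coefficient between the chemicals an identifier reports and the chemicals actually present. This is only the textbook definition; the partitioning and weighting of the confusion matrix described by the authors is not reproduced, and the function name and chemical labels are hypothetical.

```python
def dice_index(reported, truth):
    """Standard Dice coefficient 2|A & B| / (|A| + |B|) between two sets.

    Returns 0.0 when both sets are empty (a convention chosen for this
    sketch; other conventions exist).
    """
    reported = set(reported)
    truth = set(truth)
    denominator = len(reported) + len(truth)
    if denominator == 0:
        return 0.0
    return 2.0 * len(reported & truth) / denominator


# Hypothetical library entries: one correct identification, one false
# alarm, and one missed constituent.
print(dice_index({"NH3", "SF6"}, {"NH3", "TEP"}))  # 0.5
```

The index is 1 when the reported set matches the true set exactly and falls toward 0 as false alarms and misses accumulate, which is why it can describe identification quality beyond a detect/no-detect decision.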
Performance assessment in brain-computer interface-based augmentative and alternative communication
2013-01-01
A large number of incommensurable metrics are currently used to report the performance of brain-computer interfaces (BCI) used for augmentative and alternative communication (AAC). The lack of standard metrics precludes the comparison of different BCI-based AAC systems, hindering rapid growth and development of this technology. This paper presents a review of the metrics that have been used to report performance of BCIs used for AAC from January 2005 to January 2012. We distinguish between Level 1 metrics used to report performance at the output of the BCI Control Module, which translates brain signals into logical control output, and Level 2 metrics at the Selection Enhancement Module, which translates logical control to semantic control. We recommend that: (1) the commensurate metrics Mutual Information or Information Transfer Rate (ITR) be used to report Level 1 BCI performance, as these metrics represent information throughput, which is of interest in BCIs for AAC; (2) the BCI-Utility metric be used to report Level 2 BCI performance, as it is capable of handling all current methods of improving BCI performance; (3) these metrics should be supplemented by information specific to each unique BCI configuration; and (4) studies involving Selection Enhancement Modules should report performance at both Level 1 and Level 2 in the BCI system. Following these recommendations will enable efficient comparison between both BCI Control and Selection Enhancement Modules, accelerating research and development of BCI-based AAC systems. PMID:23680020
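To make the Level 1 recommendation concrete, the sketch below evaluates one widely used definition of information transfer rate, the Wolpaw formula, which gives bits per selection as log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1)) for N equally likely targets and accuracy P. The abstract does not say which ITR definition the review adopts, so treating it as the Wolpaw form is an assumption, as are the function names and sample numbers.

```python
import math


def wolpaw_bits_per_selection(n_choices, accuracy):
    """Bits conveyed per selection under the Wolpaw ITR formula.

    Assumes n_choices equally likely targets and errors spread evenly
    over the remaining n_choices - 1 targets.
    """
    if n_choices < 2:
        raise ValueError("need at least two possible selections")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_choices)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_choices - 1))
    return bits


def itr_bits_per_minute(n_choices, accuracy, selections_per_minute):
    """Scale bits per selection by the selection rate."""
    return wolpaw_bits_per_selection(n_choices, accuracy) * selections_per_minute


# Hypothetical 4-target speller at perfect accuracy and 10 selections/min.
print(itr_bits_per_minute(4, 1.0, 10))  # 20.0
```

With two targets and chance-level accuracy (P = 0.5) the formula yields zero bits, which matches the intuition that a coin flip conveys no information.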
Grading the Metrics: Performance-Based Funding in the Florida State University System
ERIC Educational Resources Information Center
Cornelius, Luke M.; Cavanaugh, Terence W.
2016-01-01
A policy analysis of Florida's 10-factor Performance-Based Funding system for state universities. The focus of the article is on the system of performance metrics developed by the state Board of Governors and their impact on institutions and their missions. The paper also discusses problems and issues with the metrics, their ongoing evolution, and…
Gamut Volume Index: a color preference metric based on meta-analysis and optimized colour samples.
Liu, Qiang; Huang, Zheng; Xiao, Kaida; Pointer, Michael R; Westland, Stephen; Luo, M Ronnier
2017-07-10
A novel metric named Gamut Volume Index (GVI) is proposed for evaluating the colour preference of lighting. This metric is based on the absolute gamut volume of optimized colour samples. The optimal colour set of the proposed metric was obtained by optimizing the weighted average correlation between the metric predictions and the subjective ratings for 8 psychophysical studies. The performance of 20 typical colour metrics was also investigated, which included colour difference based metrics, gamut based metrics, memory based metrics as well as combined metrics. It was found that the proposed GVI outperformed the existing counterparts, especially for the conditions where correlated colour temperatures differed.
Performance regression manager for large scale systems
Faraj, Daniel A.
2017-10-17
System and computer program product to perform an operation comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.
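The comparison described in this abstract can be pictured with a small sketch: metrics from two runs of the same command are parsed from a fixed text format and compared value by value. The `name=value` format, the 5% tolerance, the metric names, and the function names are all assumptions invented for this illustration; the patent text does not specify them.

```python
def parse_metrics(text):
    """Parse lines of the hypothetical form 'name=value' into a dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        metrics[name.strip()] = float(value)
    return metrics


def compare_runs(first, second, tolerance=0.05):
    """Label each metric present in both runs as 'unchanged' or 'changed'.

    A metric counts as unchanged when the second value lies within
    tolerance (relative) of the first value.
    """
    report = {}
    for name in sorted(first.keys() & second.keys()):
        old, new = first[name], second[name]
        within = abs(new - old) <= tolerance * abs(old)
        report[name] = "unchanged" if within else "changed"
    return report


# Hypothetical outputs of two execution instances of the same command.
first_output = "latency_us=100\nbandwidth_gbps=50\n"
second_output = "latency_us=120\nbandwidth_gbps=50\n"
print(compare_runs(parse_metrics(first_output), parse_metrics(second_output)))
```

Whether a change is a regression or an improvement depends on the metric (lower latency is better, higher bandwidth is better), so this sketch only flags that a value moved and leaves the interpretation to the reader.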
Performance regression manager for large scale systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faraj, Daniel A.
Methods comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.
NASA Astrophysics Data System (ADS)
Jimenez, Edward S.; Goodman, Eric L.; Park, Ryeojin; Orr, Laurel J.; Thompson, Kyle R.
2014-09-01
This paper will investigate energy-efficiency for various real-world industrial computed-tomography reconstruction algorithms, both CPU- and GPU-based implementations. This work shows that the energy required for a given reconstruction is based on performance and problem size. There are many ways to describe performance and energy efficiency, thus this work will investigate multiple metrics including performance-per-watt, energy-delay product, and energy consumption. This work found that irregular GPU-based approaches realized tremendous savings in energy consumption when compared to CPU implementations while also significantly improving the performance-per-watt and energy-delay product metrics. Additional energy savings and other metric improvements were realized on the GPU-based reconstructions by improving storage I/O through a parallel MIMD-like modularization of the compute and I/O tasks.
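The three metrics named in this abstract have simple textbook definitions, sketched below under the assumption of a constant average power draw over the run: energy is power times runtime, performance-per-watt is throughput divided by power, and energy-delay product is energy times runtime. The function names, the notion of a "work unit", and the sample numbers are hypothetical and are not taken from the paper.

```python
def energy_joules(avg_power_watts, runtime_seconds):
    """Energy consumed, assuming constant average power over the run."""
    return avg_power_watts * runtime_seconds


def performance_per_watt(work_units, runtime_seconds, avg_power_watts):
    """Throughput (work units per second) divided by average power."""
    return (work_units / runtime_seconds) / avg_power_watts


def energy_delay_product(avg_power_watts, runtime_seconds):
    """Energy multiplied by runtime; lower is better."""
    return energy_joules(avg_power_watts, runtime_seconds) * runtime_seconds


# Hypothetical reconstruction: 1000 work units in 10 s at 200 W.
print(energy_joules(200.0, 10.0))                 # 2000.0 J
print(performance_per_watt(1000.0, 10.0, 200.0))  # 0.5 units/s per W
print(energy_delay_product(200.0, 10.0))          # 20000.0 J*s
```

Because the energy-delay product weights runtime twice, a faster implementation can score better on it even when its instantaneous power draw is higher.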
On Applying the Prognostic Performance Metrics
NASA Technical Reports Server (NTRS)
Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai
2009-01-01
Prognostics performance evaluation has gained significant attention in the past few years. As prognostics technology matures and more sophisticated methods for prognostic uncertainty management are developed, a standardized methodology for performance evaluation becomes extremely important to guide improvement efforts in a constructive manner. This paper is in continuation of previous efforts where several new evaluation metrics tailored for prognostics were introduced and were shown to effectively evaluate various algorithms as compared to other conventional metrics. Specifically, this paper presents a detailed discussion on how these metrics should be interpreted and used. Several shortcomings identified, while applying these metrics to a variety of real applications, are also summarized along with discussions that attempt to alleviate these problems. Further, these metrics have been enhanced to include the capability of incorporating probability distribution information from prognostic algorithms as opposed to evaluation based on point estimates only. Several methods have been suggested and guidelines have been provided to help choose one method over another based on probability distribution characteristics. These approaches also offer a convenient and intuitive visualization of algorithm performance with respect to some of these new metrics like prognostic horizon and alpha-lambda performance, and also quantify the corresponding performance while incorporating the uncertainty information.
Caverzagie, Kelly J; Lane, Susan W; Sharma, Niraj; Donnelly, John; Jaeger, Jeffrey R; Laird-Fick, Heather; Moriarty, John P; Moyer, Darilyn V; Wallach, Sara L; Wardrop, Richard M; Steinmann, Alwin F
2017-12-12
Graduate medical education (GME) in the United States is financed by contributions from both federal and state entities that total over $15 billion annually. Within institutions, these funds are distributed with limited transparency to achieve ill-defined outcomes. To address this, the Institute of Medicine convened a committee on the governance and financing of GME to recommend finance reform that would promote a physician training system that meets society's current and future needs. The resulting report provided several recommendations regarding the oversight and mechanisms of GME funding, including implementation of performance-based GME payments, but did not provide specific details about the content and development of metrics for these payments. To initiate a national conversation about performance-based GME funding, the authors asked: What should GME be held accountable for in exchange for public funding? In answer to this question, the authors propose 17 potential performance-based metrics for GME funding that could inform future funding decisions. Eight of the metrics are described as exemplars to add context and to help readers obtain a deeper understanding of the inherent complexities of performance-based GME funding. The authors also describe considerations and precautions for metric implementation.
Multi-objective optimization for generating a weighted multi-model ensemble
NASA Astrophysics Data System (ADS)
Lee, H.
2017-12-01
Many studies have demonstrated that multi-model ensembles generally show better skill than each ensemble member. When generating weighted multi-model ensembles, the first step is measuring the performance of individual model simulations using observations. There is a consensus on the assignment of weighting factors based on a single evaluation metric. When considering only one evaluation metric, the weighting factor for each model is proportional to a performance score or inversely proportional to an error for the model. While this conventional approach can provide appropriate combinations of multiple models, the approach confronts a big challenge when there are multiple metrics under consideration. When considering multiple evaluation metrics, it is obvious that a simple averaging of multiple performance scores or model ranks does not address the trade-off problem between conflicting metrics. So far, there seems to be no best method to generate weighted multi-model ensembles based on multiple performance metrics. The current study applies the multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combining multiple performance metrics for the global climate models and their dynamically downscaled regional climate simulations over North America and generating a weighted multi-model ensemble. NASA satellite data and the Regional Climate Model Evaluation System (RCMES) software toolkit are used for assessment of the climate simulations. Overall, the performance of each model differs markedly with strong seasonal dependence. Because of the considerable variability across the climate simulations, it is important to evaluate models systematically and make future projections by assigning optimized weighting factors to the models with relatively good performance. Our results indicate that the optimally weighted multi-model ensemble always shows better performance than an arithmetic ensemble mean and may provide reliable future projections.
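The conventional single-metric weighting that this abstract describes (weights inversely proportional to each model's error) can be sketched as follows. This illustrates only that baseline, not the multi-objective optimization the study applies; the function names and numbers are hypothetical.

```python
def weights_from_errors(errors):
    """Normalized weights inversely proportional to each model's error."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [value / total for value in inverse]


def weighted_ensemble(predictions, weights):
    """Weighted mean across models at every point.

    predictions is a list with one equal-length series per model.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    n_points = len(predictions[0])
    return [
        sum(w * series[i] for w, series in zip(weights, predictions))
        for i in range(n_points)
    ]


# Hypothetical errors for three models against observations.
weights = weights_from_errors([1.0, 2.0, 4.0])
print(weights)  # the model with the smallest error gets the largest weight
print(weighted_ensemble([[1.0, 1.0], [2.0, 2.0], [4.0, 4.0]], weights))
```

With a second, conflicting metric there is no single error value per model to invert, which is the trade-off problem that motivates the multi-objective approach.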
GPS Device Testing Based on User Performance Metrics
DOT National Transportation Integrated Search
2015-10-02
1. Rationale for a Test Program Based on User Performance Metrics; 2. Roberson and Associates Test Program; 3. Status of, and Revisions to, the Roberson and Associates Test Program; 4. Comparison of Roberson and DOT/Volpe Programs
Real-time performance monitoring and management system
Budhraja, Vikram S [Los Angeles, CA; Dyer, James D [La Mirada, CA; Martinez Morales, Carlos A [Upland, CA
2007-06-19
A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a data base, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.
Performance metrics for the assessment of satellite data products: an ocean color case study
Performance assessment of ocean color satellite data has generally relied on statistical metrics chosen for their common usage and the rationale for selecting certain metrics is infrequently explained. Commonly reported statistics based on mean squared errors, such as the coeffic...
Nindl, Bradley C; Jaffin, Dianna P; Dretsch, Michael N; Cheuvront, Samuel N; Wesensten, Nancy J; Kent, Michael L; Grunberg, Neil E; Pierce, Joseph R; Barry, Erin S; Scott, Jonathan M; Young, Andrew J; OʼConnor, Francis G; Deuster, Patricia A
2015-11-01
Human performance optimization (HPO) is defined as "the process of applying knowledge, skills and emerging technologies to improve and preserve the capabilities of military members, and organizations to execute essential tasks." The lack of consensus for operationally relevant and standardized metrics that meet joint military requirements has been identified as the single most important gap for research and application of HPO. In 2013, the Consortium for Health and Military Performance hosted a meeting to develop a toolkit of standardized HPO metrics for use in military and civilian research, and potentially for field applications by commanders, units, and organizations. Performance was considered from a holistic perspective as being influenced by various behaviors and barriers. To accomplish the goal of developing a standardized toolkit, key metrics were identified and evaluated across a spectrum of domains that contribute to HPO: physical performance, nutritional status, psychological status, cognitive performance, environmental challenges, sleep, and pain. These domains were chosen based on relevant data with regard to performance enhancers and degraders. The specific objectives at this meeting were to (a) identify and evaluate current metrics for assessing human performance within selected domains; (b) prioritize metrics within each domain to establish a human performance assessment toolkit; and (c) identify scientific gaps and the needed research to more effectively assess human performance across domains. This article provides a summary of 150 total HPO metrics across multiple domains that can be used as a starting point, the beginning of an HPO toolkit: physical fitness (29 metrics), nutrition (24 metrics), psychological status (36 metrics), cognitive performance (35 metrics), environment (12 metrics), sleep (9 metrics), and pain (5 metrics). These metrics can be particularly valuable as the military emphasizes a renewed interest in Human Dimension efforts, and leverages science, resources, programs, and policies to optimize the performance capacities of all Service members.
Geospace Environment Modeling 2008-2009 Challenge: Ground Magnetic Field Perturbations
NASA Technical Reports Server (NTRS)
Pulkkinen, A.; Kuznetsova, M.; Ridley, A.; Raeder, J.; Vapirev, A.; Weimer, D.; Weigel, R. S.; Wiltberger, M.; Millward, G.; Rastatter, L.;
2011-01-01
Acquiring quantitative metrics-based knowledge about the performance of various space physics modeling approaches is central for the space weather community. Quantification of the performance helps the users of the modeling products to better understand the capabilities of the models and to choose the approach that best suits their specific needs. Further, metrics-based analyses are important for addressing the differences between various modeling approaches and for measuring and guiding the progress in the field. In this paper, the metrics-based results of the ground magnetic field perturbation part of the Geospace Environment Modeling 2008-2009 Challenge are reported. Predictions made by 14 different models, including an ensemble model, are compared to geomagnetic observatory recordings from 12 different northern hemispheric locations. Five different metrics are used to quantify the model performances for four storm events. It is shown that the ranking of the models is strongly dependent on the type of metric used to evaluate the model performance. None of the models rank near or at the top systematically for all used metrics. Consequently, one cannot pick the absolute winner: the choice for the best model depends on the characteristics of the signal one is interested in. Model performances vary also from event to event. This is particularly clear for root-mean-square difference and utility metric-based analyses. Further, analyses indicate that for some of the models, increasing the global magnetohydrodynamic model spatial resolution and the inclusion of the ring current dynamics improve the models' capability to generate more realistic ground magnetic field fluctuations.
NASA Astrophysics Data System (ADS)
Kireeva, Natalia V.; Ovchinnikova, Svetlana I.; Kuznetsov, Sergey L.; Kazennov, Andrey M.; Tsivadze, Aslan Yu.
2014-02-01
This study concerns the large margin nearest neighbors classifier and its multi-metric extension as efficient approaches to metric learning, which aims to learn an appropriate distance/similarity function for the case studies considered. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve the performance in classification, clustering and retrieval tasks. The paper describes application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in the drug discovery process; in silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, distance-based metric learning procedures have been applied to in silico assessment of chemical liabilities, the impact of metric learning on structure-activity landscapes and on the predictive performance of the developed models has been analyzed, and the learned metric has been used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest neighbors relations and descriptor space.
Metric for evaluation of filter efficiency in spectral cameras.
Nahavandi, Alireza Mahmoudi; Tehran, Mohammad Amani
2016-11-10
Although metric functions that show the performance of a colorimetric imaging device have been investigated, a metric for performance analysis of a set of filters in wideband filter-based spectral cameras has rarely been studied. Based on a generalization of Vora's Measure of Goodness (MOG) and the spanning theorem, a single function metric that estimates the effectiveness of a filter set is introduced. The improved metric, named MMOG, varies between one, for a perfect, and zero, for the worst possible set of filters. Results showed that MMOG exhibits a trend that is more similar to the mean square of spectral reflectance reconstruction errors than does Vora's MOG index, and it is robust to noise in the imaging system. MMOG as a single metric could be exploited for further analysis of manufacturing errors.
An Evaluation of the IntelliMetric[SM] Essay Scoring System
ERIC Educational Resources Information Center
Rudner, Lawrence M.; Garcia, Veronica; Welch, Catherine
2006-01-01
This report provides a two-part evaluation of the IntelliMetric[SM] automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test[TM] (GMAT[TM]). The IntelliMetric system performance is first compared to that of individual human raters, a Bayesian system…
Rudnick, Paul A.; Clauser, Karl R.; Kilpatrick, Lisa E.; Tchekhovskoi, Dmitrii V.; Neta, Pedatsur; Blonder, Nikša; Billheimer, Dean D.; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Ham, Amy-Joan L.; Jaffe, Jacob D.; Kinsinger, Christopher R.; Mesri, Mehdi; Neubert, Thomas A.; Schilling, Birgit; Tabb, David L.; Tegeler, Tony J.; Vega-Montoto, Lorenzo; Variyath, Asokan Mulayath; Wang, Mu; Wang, Pei; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Paulovich, Amanda G.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Tempst, Paul; Liebler, Daniel C.; Stein, Stephen E.
2010-01-01
A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlaboratory studies conducted under a common standard operating procedure identified outlier data and provided clues to specific causes. Moreover, interlaboratory variation reflected by the metrics indicates which system components vary the most between laboratories. Application of these metrics enables rational, quantitative quality assessment for proteomics and other LC-MS/MS analytical applications. PMID:19837981
Measuring Distribution Performance? Benchmarking Warrants Your Attention
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ericson, Sean J; Alvarez, Paul
Identifying, designing, and measuring performance metrics is critical to securing customer value, but can be a difficult task. This article examines the use of benchmarks based on publicly available performance data to set challenging, yet fair, metrics and targets.
Sakieh, Yousef; Salmanmahiny, Abdolrassoul
2016-03-01
Performance evaluation is a critical step when developing land-use and cover change (LUCC) models. The present study proposes a spatially explicit model performance evaluation method, adopting a landscape metric-based approach. To quantify GEOMOD model performance, a set of composition- and configuration-based landscape metrics including number of patches, edge density, mean Euclidean nearest neighbor distance, largest patch index, class area, landscape shape index, and splitting index were employed. The model takes advantage of three decision rules including neighborhood effect, persistence of change direction, and urbanization suitability values. According to the results, while class area, largest patch index, and splitting indices demonstrated insignificant differences between spatial pattern of ground truth and simulated layers, there was a considerable inconsistency between simulation results and real dataset in terms of the remaining metrics. Specifically, simulation outputs were simplistic and the model tended to underestimate number of developed patches by producing a more compact landscape. Landscape-metric-based performance evaluation produces more detailed information (compared to conventional indices such as the Kappa index and overall accuracy) on the model's behavior in replicating spatial heterogeneity features of a landscape such as frequency, fragmentation, isolation, and density. Finally, as the main characteristic of the proposed method, landscape metrics employ the maximum potential of observed and simulated layers for a performance evaluation procedure, provide a basis for more robust interpretation of a calibration process, and also deepen modeler insight into the main strengths and pitfalls of a specific land-use change model when simulating a spatiotemporal phenomenon.
DOT National Transportation Integrated Search
2013-04-01
"This report provides a Quick Guide to the concept of asset sustainability metrics. Such metrics address the long-term performance of highway assets based upon expected expenditure levels. : It examines how such metrics are used in Australia, Britain...
Adaptive Distance Metric Learning for Diffusion Tensor Image Segmentation
Kong, Youyong; Wang, Defeng; Shi, Lin; Hui, Steve C. N.; Chu, Winnie C. W.
2014-01-01
High quality segmentation of diffusion tensor images (DTI) is of key interest in biomedical research and clinical application. In previous studies, most efforts have been made to construct predefined metrics for different DTI segmentation tasks. These methods require adequate prior knowledge and tuning parameters. To overcome these disadvantages, we proposed to automatically learn an adaptive distance metric by a graph based semi-supervised learning model for DTI segmentation. An original discriminative distance vector was first formulated by combining both geometry and orientation distances derived from diffusion tensors. The kernel metric over the original distance and labels of all voxels were then simultaneously optimized in a graph based semi-supervised learning approach. Finally, the optimization task was efficiently solved with an iterative gradient descent method to achieve the optimal solution. With our approach, an adaptive distance metric could be available for each specific segmentation task. Experiments on synthetic and real brain DTI datasets were performed to demonstrate the effectiveness and robustness of the proposed distance metric learning approach. The performance of our approach was compared with three classical metrics in the graph based semi-supervised learning framework. PMID:24651858
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morrissey, Elmer; O'Donnell, James; Keane, Marcus
2004-03-29
Minimizing building life cycle energy consumption is becoming of paramount importance. Performance metrics tracking offers a clear and concise manner of relating design intent in a quantitative form. A methodology is discussed for storage and utilization of these performance metrics through an Industry Foundation Classes (IFC) instantiated Building Information Model (BIM). The paper focuses on storage of three sets of performance data from three distinct sources. An example of a performance metrics programming hierarchy is displayed for a heat pump and a solar array. Utilizing the sets of performance data, two discrete performance effectiveness ratios may be computed, thus offering an accurate method of quantitatively assessing building performance.
Climate Classification is an Important Factor in Assessing Hospital Performance Metrics
NASA Astrophysics Data System (ADS)
Boland, M. R.; Parhi, P.; Gentine, P.; Tatonetti, N. P.
2017-12-01
Context/Purpose: Climate is a known modulator of disease, but its impact on hospital performance metrics remains unstudied. Methods: We assess the relationship between Köppen-Geiger climate classification and hospital performance metrics, specifically 30-day mortality, as reported in Hospital Compare, and collected for the period July 2013 through June 2014 (7/1/2013 - 06/30/2014). A hospital-level multivariate linear regression analysis was performed while controlling for known socioeconomic factors to explore the relationship between all-cause mortality and climate. Hospital performance scores were obtained from 4,524 hospitals belonging to 15 distinct Köppen-Geiger climates and 2,373 unique counties. Results: Model results revealed that hospital performance metrics for mortality showed significant climate dependence (p<0.001) after adjusting for socioeconomic factors. Interpretation: Currently, hospitals are reimbursed by government agencies using 30-day mortality rates along with 30-day readmission rates. These metrics allow government agencies to rank hospitals according to their 'performance' on these metrics. Various socioeconomic factors are taken into consideration when determining individual hospitals' performance. However, no climate-based adjustment is made within the existing framework. Our results indicate that climate-based variability in 30-day mortality rates does exist even after socioeconomic confounder adjustment. Standardized high-level climate classification systems (such as Köppen-Geiger) would be useful to incorporate in future metrics. Conclusion: Climate is a significant factor in evaluating hospital 30-day mortality rates. These results demonstrate that climate classification is an important factor when comparing hospital performance across the United States.
Partially supervised speaker clustering.
Tang, Hao; Chu, Stephen Mingyu; Hasegawa-Johnson, Mark; Huang, Thomas S
2012-05-01
Content-based multimedia indexing, retrieval, and processing as well as multimedia databases demand the structuring of the media content (image, audio, video, text, etc.), one significant goal being to associate the identity of the content to the individual segments of the signals. In this paper, we specifically address the problem of speaker clustering, the task of assigning every speech utterance in an audio stream to its speaker. We offer a complete treatment to the idea of partially supervised speaker clustering, which refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. By means of an independent training data set, we encode the prior knowledge at the various stages of the speaker clustering pipeline via 1) learning a speaker-discriminative acoustic feature transformation, 2) learning a universal speaker prior model, and 3) learning a discriminative speaker subspace, or equivalently, a speaker-discriminative distance metric. We study the directional scattering property of the Gaussian mixture model (GMM) mean supervector representation of utterances in the high-dimensional space, and advocate exploiting this property by using the cosine distance metric instead of the euclidean distance metric for speaker clustering in the GMM mean supervector space. We propose to perform discriminant analysis based on the cosine distance metric, which leads to a novel distance metric learning algorithm—linear spherical discriminant analysis (LSDA). We show that the proposed LSDA formulation can be systematically solved within the elegant graph embedding general dimensionality reduction framework. 
Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the “bag of acoustic features” representation and statistical model-based distance metrics, 2) our advocated use of the cosine distance metric yields consistent increases in the speaker clustering performance as compared to the commonly used euclidean distance metric, 3) our partially supervised speaker clustering concept and strategies significantly improve the speaker clustering performance over the baselines, and 4) our proposed LSDA algorithm further leads to state-of-the-art speaker clustering performance.
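The contrast between the cosine and Euclidean distance metrics described in the abstract above can be illustrated with a minimal sketch. This is not the authors' LSDA algorithm; the function names and the toy vectors are assumptions made purely for illustration. It only shows that cosine distance ignores vector length and compares direction, while Euclidean distance does not.

```python
import math

def cosine_distance(u, v):
    """1 minus the cosine of the angle between u and v (direction only)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v (direction and length)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical toy vectors: b points the same way as a but is twice as long.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
print(cosine_distance(a, b))     # close to zero: same direction
print(euclidean_distance(a, b))  # clearly non-zero: different lengths
```

Two vectors that differ only in scale are treated as near-identical by the cosine distance, which is the kind of directional behaviour the abstract argues for in the supervector space.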
Wide-area, real-time monitoring and visualization system
Budhraja, Vikram S.; Dyer, James D.; Martinez Morales, Carlos A.
2013-03-19
A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a data base, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.
Wide-area, real-time monitoring and visualization system
Budhraja, Vikram S [Los Angeles, CA; Dyer, James D [La Mirada, CA; Martinez Morales, Carlos A [Upland, CA
2011-11-15
A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a data base, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.
Evaluation of image deblurring methods via a classification metric
NASA Astrophysics Data System (ADS)
Perrone, Daniele; Humphreys, David; Lamb, Robert A.; Favaro, Paolo
2012-09-01
The performance of single image deblurring algorithms is typically evaluated via a certain discrepancy measure between the reconstructed image and the ideal sharp image. The choice of metric, however, has been a source of debate and has also led to alternative metrics based on human visual perception. While fixed metrics may fail to capture some small but visible artifacts, perception-based metrics may favor reconstructions with artifacts that are visually pleasant. To overcome these limitations, we propose to assess the quality of reconstructed images via a task-driven metric. In this paper we consider object classification as the task and therefore use the rate of classification as the metric to measure deblurring performance. In our evaluation we use data with different types of blur in two cases: Optical Character Recognition (OCR), where the goal is to recognise characters in a black and white image, and object classification with no restrictions on pose, illumination and orientation. Finally, we show how off-the-shelf classification algorithms benefit from working with deblurred images.
Fusion set selection with surrogate metric in multi-atlas based image segmentation
NASA Astrophysics Data System (ADS)
Zhao, Tingting; Ruan, Dan
2016-02-01
Multi-atlas based image segmentation sees unprecedented opportunities but also demanding challenges in the big data era. Relevant atlas selection before label fusion plays a crucial role in reducing potential performance loss from heterogeneous data quality and high computation cost from extensive data. This paper starts with investigating the image similarity metric (termed ‘surrogate’), an alternative to the inaccessible geometric agreement metric (termed ‘oracle’) in atlas relevance assessment, and probes into the problem of how to select the ‘most-relevant’ atlases and how many such atlases to incorporate. We propose an inference model to relate the surrogates and the oracle geometric agreement metrics. Based on this model, we quantify the behavior of the surrogates in mimicking oracle metrics for atlas relevance ordering. Finally, analytical insights on the choice of fusion set size are presented from a probabilistic perspective, with the integrated goal of including the most relevant atlases and excluding the irrelevant ones. Empirical evidence and performance assessment are provided based on prostate and corpus callosum segmentation.
Hybrid monitoring scheme for end-to-end performance enhancement of multicast-based real-time media
NASA Astrophysics Data System (ADS)
Park, Ju-Won; Kim, JongWon
2004-10-01
As real-time media applications based on IP multicast networks spread widely, end-to-end QoS (quality of service) provisioning for these applications has become very important. To guarantee the end-to-end QoS of multi-party media applications, it is essential to monitor the time-varying status of both network metrics (i.e., delay, jitter and loss) and system metrics (i.e., CPU and memory utilization). In this paper, targeting the multicast-enabled AG (Access Grid), a next-generation group collaboration tool based on multi-party media services, the applicability of a hybrid monitoring scheme that combines active and passive monitoring is investigated. The active monitoring measures network-layer metrics (i.e., network condition) with probe packets, while the passive monitoring checks both application-layer metrics (i.e., user traffic condition by analyzing RTCP packets) and system metrics. By comparing these hybrid results, we attempt to pinpoint the causes of performance degradation and explore corresponding reactions to improve the end-to-end performance. The experimental results show that the proposed hybrid monitoring can provide useful information to coordinate the performance improvement of multi-party real-time media applications.
Calderon, Lindsay E; Kavanagh, Kevin T; Rice, Mara K
2015-10-01
Catheter-associated urinary tract infections (CAUTIs) occur in 290,000 US hospital patients annually, with an estimated cost of $290 million. Two different measurement systems are being used to track the US health care system's performance in lowering the rate of CAUTIs. Since 2010, the Agency for Healthcare Research and Quality (AHRQ) metric has shown a 28.2% decrease in CAUTI, whereas the Centers for Disease Control and Prevention metric has shown a 3%-6% increase in CAUTI since 2009. Differences in data acquisition and the definition of the denominator may explain this discrepancy. The AHRQ metric analyzes chart-audited data and reflects both catheter use and care. The Centers for Disease Control and Prevention metric analyzes self-reported data and primarily reflects catheter care. Because analysis of the AHRQ metric showed a progressive change in performance over time and the scientific literature supports the importance of catheter use in the prevention of CAUTI, it is suggested that risk-adjusted catheter-use data be incorporated into metrics that are used for determining facility performance and for value-based purchasing initiatives. Copyright © 2015 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Kwakkel, Jan; Haasnoot, Marjolijn
2015-04-01
In response to climate and socio-economic change, in various policy domains there is increasingly a call for robust plans or policies. That is, plans or policies that perform well in a very large range of plausible futures. In the literature, a wide range of alternative robustness metrics can be found. The relative merit of these alternative conceptualizations of robustness has, however, received less attention. Evidently, different robustness metrics can result in different plans or policies being adopted. This paper investigates the consequences of several robustness metrics on decision making, illustrated here by the design of a flood risk management plan. A fictitious case, inspired by a river reach in the Netherlands is used. The performance of this system in terms of casualties, damages, and costs for flood and damage mitigation actions is explored using a time horizon of 100 years, and accounting for uncertainties pertaining to climate change and land use change. A set of candidate policy options is specified up front. This set of options includes dike raising, dike strengthening, creating more space for the river, and flood-proof building and evacuation options. The overarching aim is to design an effective flood risk mitigation strategy that is designed from the outset to be adapted over time in response to how the future actually unfolds. To this end, the plan will be based on the dynamic adaptive policy pathway approach (Haasnoot, Kwakkel et al. 2013) being used in the Dutch Delta Program. The policy problem is formulated as a multi-objective robust optimization problem (Kwakkel, Haasnoot et al. 2014). We solve the multi-objective robust optimization problem using several alternative robustness metrics, including both satisficing robustness metrics and regret based robustness metrics. Satisficing robustness metrics focus on the performance of candidate plans across a large ensemble of plausible futures.
Regret based robustness metrics compare the performance of a candidate plan with the performance of other candidate plans across a large ensemble of plausible futures. Initial results suggest that the simplest satisficing metric, inspired by the signal to noise ratio, results in very risk averse solutions. Other satisficing metrics, which handle the average performance and the dispersion around the average separately, provide substantial additional insights into the trade off between the average performance, and the dispersion around this average. In contrast, the regret-based metrics enhance insight into the relative merits of candidate plans, while being less clear on the average performance or the dispersion around this performance. These results suggest that it is beneficial to use multiple robustness metrics when doing a robust decision analysis study. Haasnoot, M., J. H. Kwakkel, W. E. Walker and J. Ter Maat (2013). "Dynamic Adaptive Policy Pathways: A New Method for Crafting Robust Decisions for a Deeply Uncertain World." Global Environmental Change 23(2): 485-498. Kwakkel, J. H., M. Haasnoot and W. E. Walker (2014). "Developing Dynamic Adaptive Policy Pathways: A computer-assisted approach for developing adaptive strategies for a deeply uncertain world." Climatic Change.
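The difference between a satisficing and a regret-based robustness metric, as described in the abstract above, might be sketched as follows. The exact formulas used in the study are not given in the abstract, so everything here is an assumption: performance is treated as "higher is better", the signal-to-noise metric is taken as mean divided by standard deviation, regret is taken as the gap to the best plan in each future, and the plan names and numbers are invented for illustration.

```python
from statistics import mean, pstdev

def signal_to_noise(outcomes):
    """Satisficing-style metric (assumed form): mean over spread.

    Looks only at one plan's own outcomes across the futures.
    Undefined when all outcomes are identical (zero spread).
    """
    return mean(outcomes) / pstdev(outcomes)

def max_regret(performance, plan):
    """Regret-style metric (assumed form): worst gap to the best plan.

    For each future, regret is the best performance achieved by any
    plan in that future minus this plan's performance.
    """
    n_futures = len(performance[plan])
    regrets = []
    for i in range(n_futures):
        best = max(outcomes[i] for outcomes in performance.values())
        regrets.append(best - performance[plan][i])
    return max(regrets)

# Hypothetical performance scores for two plans in three futures.
performance = {
    "dike_raising": [8.0, 6.0, 7.0],
    "room_for_river": [9.0, 3.0, 6.0],
}
for plan, outcomes in performance.items():
    print(plan, signal_to_noise(outcomes), max_regret(performance, plan))
```

The first function never looks at other plans, while the second is defined only relative to them, which mirrors the distinction the abstract draws between the two families of metrics.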
Tide or Tsunami? The Impact of Metrics on Scholarly Research
ERIC Educational Resources Information Center
Bonnell, Andrew G.
2016-01-01
Australian universities are increasingly resorting to the use of journal metrics such as impact factors and ranking lists in appraisal and promotion processes, and are starting to set quantitative "performance expectations" which make use of such journal-based metrics. The widespread use and misuse of research metrics is leading to…
Ranking streamflow model performance based on Information theory metrics
NASA Astrophysics Data System (ADS)
Martinez, Gonzalo; Pachepsky, Yakov; Pan, Feng; Wagener, Thorsten; Nicholson, Thomas
2016-04-01
The accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to use the information theory-based metrics to see whether they can be used as a complementary tool for hydrologic model evaluation and selection. We simulated 10-year streamflow time series in five watersheds located in Texas, North Carolina, Mississippi, and West Virginia. Eight models of different complexity were applied. The information-theory based metrics were obtained after representing the time series as strings of symbols where different symbols corresponded to different quantiles of the probability distribution of streamflow. The symbol alphabet was used. Three metrics were computed for those strings: mean information gain, which measures the randomness of the signal; effective measure complexity, which characterizes predictability; and fluctuation complexity, which characterizes the presence of a pattern in the signal. The observed streamflow time series has smaller information content and larger complexity metrics than the precipitation time series. Watersheds served as information filters, and streamflow time series were less random and more complex than the ones of precipitation. This reflects the fact that the watershed acts as the information filter in the hydrologic conversion process from precipitation to streamflow. The Nash Sutcliffe efficiency metric increased as the complexity of models increased, but in many cases several models had efficiency values that were not statistically significantly different from each other. In such cases, ranking models by the closeness of the information-theory based parameters in simulated and measured streamflow time series can provide an additional criterion for the evaluation of hydrologic model performance.
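The symbolization step and the mean information gain mentioned in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' code: the alphabet size is not stated in the abstract, so `n_symbols=4` is an assumption, ties are split by rank, and mean information gain is computed here as the difference between block entropies of word lengths L+1 and L, which is one common definition.

```python
import math
from collections import Counter

def symbolize(series, n_symbols=4):
    """Replace each value by the index of its quantile bin (0..n_symbols-1)."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    symbols = [0] * len(series)
    for rank, i in enumerate(order):
        symbols[i] = rank * n_symbols // len(series)
    return symbols

def block_entropy(symbols, length):
    """Shannon entropy (bits) of overlapping words of the given length."""
    words = [tuple(symbols[i:i + length])
             for i in range(len(symbols) - length + 1)]
    total = len(words)
    return -sum(c / total * math.log2(c / total)
                for c in Counter(words).values())

def mean_information_gain(symbols, length=1):
    """Average new information per symbol given the preceding word."""
    return block_entropy(symbols, length + 1) - block_entropy(symbols, length)

# Hypothetical flow values, split into two quantile bins.
flows = [10.0, 1.0, 7.0, 3.0]
print(symbolize(flows, n_symbols=2))              # [1, 0, 1, 0]
print(mean_information_gain([0, 1] * 50))         # near zero: fully predictable
```

A strictly alternating string is almost perfectly predictable from the previous symbol, so its mean information gain is close to zero; a random string over the same alphabet would score close to the logarithm of the alphabet size.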
Video-Based Method of Quantifying Performance and Instrument Motion During Simulated Phonosurgery
Conroy, Ellen; Surender, Ketan; Geng, Zhixian; Chen, Ting; Dailey, Seth; Jiang, Jack
2015-01-01
Objectives/Hypothesis: To investigate the use of the Video-Based Phonomicrosurgery Instrument Tracking System to collect instrument position data during simulated phonomicrosurgery and calculate motion metrics using these data. We used this system to determine if novice subject motion metrics improved over 1 week of training. Study Design: Prospective cohort study. Methods: Ten subjects performed simulated surgical tasks once per day for 5 days. Instrument position data were collected and used to compute motion metrics (path length, depth perception, and motion smoothness). Data were analyzed to determine if motion metrics improved with practice time. Task outcome was also determined each day, and relationships between task outcome and motion metrics were used to evaluate the validity of motion metrics as indicators of surgical performance. Results: Significant decreases over time were observed for path length (P <.001), depth perception (P <.001), and task outcome (P <.001). No significant change was observed for motion smoothness. Significant relationships were observed between task outcome and path length (P <.001), depth perception (P <.001), and motion smoothness (P <.001). Conclusions: Our system can estimate instrument trajectory and provide quantitative descriptions of surgical performance. It may be useful for evaluating phonomicrosurgery performance. Path length and depth perception may be particularly useful indicators. PMID:24737286
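Of the motion metrics named in the abstract above, path length is the simplest to illustrate. The sketch below assumes path length means the summed straight-line distance between consecutive tracked instrument positions; the abstract does not give the exact definition, and the function name and sample coordinates are hypothetical.

```python
import math

def path_length(positions):
    """Total distance travelled along a sequence of tracked positions."""
    return sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))

# Hypothetical 2-D instrument tip positions (arbitrary units).
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
print(path_length(track))  # 5 units for the first segment plus 4 for the second
```

Under this reading, a shorter path for the same task indicates more economical instrument movement, which is consistent with the decrease over training days reported in the abstract.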
Shwartz, Michael; Peköz, Erol A; Burgess, James F; Christiansen, Cindy L; Rosen, Amy K; Berlowitz, Dan
2014-12-01
Two approaches are commonly used for identifying high-performing facilities on a performance measure: one, that the facility is in a top quantile (eg, quintile or quartile); and two, that a confidence interval is below (or above) the average of the measure for all facilities. This type of yes/no designation often does not do well in distinguishing high-performing from average-performing facilities. To illustrate an alternative continuous-valued metric for profiling facilities--the probability a facility is in a top quantile--and show the implications of using this metric for profiling and pay-for-performance. We created a composite measure of quality from fiscal year 2007 data based on 28 quality indicators from 112 Veterans Health Administration nursing homes. A Bayesian hierarchical multivariate normal-binomial model was used to estimate shrunken rates of the 28 quality indicators, which were combined into a composite measure using opportunity-based weights. Rates were estimated using Markov Chain Monte Carlo methods as implemented in WinBUGS. The probability metric was calculated from the simulation replications. Our probability metric allowed better discrimination of high performers than the point or interval estimate of the composite score. In a pay-for-performance program, a smaller top quantile (eg, a quintile) resulted in more resources being allocated to the highest performers, whereas a larger top quantile (eg, being above the median) distinguished less among high performers and allocated more resources to average performers. The probability metric has potential but needs to be evaluated by stakeholders in different types of delivery systems.
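The probability metric described in the abstract above, the probability that a facility is in a top quantile, can be computed from simulation replications by counting. The sketch below is an assumption-laden illustration rather than the authors' WinBUGS workflow: it takes already-simulated composite scores as input, assumes a higher score is better, and uses made-up facility names and values.

```python
def prob_in_top_quantile(draws, top_fraction=0.2):
    """Share of replications in which each facility ranks in the top quantile.

    draws maps a facility name to its simulated composite scores,
    one score per replication (higher is assumed to be better).
    """
    facilities = list(draws)
    n_reps = len(draws[facilities[0]])
    n_top = max(1, round(top_fraction * len(facilities)))
    counts = {f: 0 for f in facilities}
    for r in range(n_reps):
        ranked = sorted(facilities, key=lambda f: draws[f][r], reverse=True)
        for f in ranked[:n_top]:
            counts[f] += 1
    return {f: counts[f] / n_reps for f in facilities}

# Hypothetical draws: five facilities, four replications each.
draws = {
    "A": [0.9, 0.8, 0.7, 0.9],
    "B": [0.8, 0.9, 0.6, 0.5],
    "C": [0.1, 0.2, 0.3, 0.2],
    "D": [0.3, 0.3, 0.3, 0.3],
    "E": [0.2, 0.1, 0.2, 0.1],
}
print(prob_in_top_quantile(draws, top_fraction=0.2))
```

With five facilities and a top quintile, exactly one facility is in the top group in each replication, so the result is a continuous value per facility instead of a yes/no designation.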
Young, Laura K; Love, Gordon D; Smithson, Hannah E
2013-09-20
Advances in ophthalmic instrumentation have allowed high order aberrations to be measured in vivo. These measurements describe the distortions to a plane wavefront entering the eye, but not the effect they have on visual performance. One metric for predicting visual performance from a wavefront measurement uses the visual Strehl ratio, calculated in the optical transfer function (OTF) domain (VSOTF) (Thibos et al., 2004). We considered how well such a metric captures empirical measurements of the effects of defocus, coma and secondary astigmatism on letter identification and on reading. We show that predictions using the visual Strehl ratio can be significantly improved by weighting the OTF by the spatial frequency band that mediates letter identification and further improved by considering the orientation of phase and contrast changes imposed by the aberration. We additionally showed that these altered metrics compare well to a cross-correlation-based metric. We suggest a version of the visual Strehl ratio, VScombined, that incorporates primarily those phase disruptions and contrast changes that have been shown independently to affect object recognition processes. This metric compared well to VSOTF for letter identification and was the best predictor of reading performance, having a higher correlation with the data than either the VSOTF or cross-correlation-based metric. Copyright © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
Gibbons, Theodore R; Mount, Stephen M; Cooper, Endymion D; Delwiche, Charles F
2015-07-10
Clustering protein sequences according to inferred homology is a fundamental step in the analysis of many large data sets. Since the publication of the Markov Clustering (MCL) algorithm in 2002, it has been the centerpiece of several popular applications. Each of these approaches generates an undirected graph that represents sequences as nodes connected to each other by edges weighted with a BLAST-based metric. MCL is then used to infer clusters of homologous proteins by analyzing these graphs. The various approaches differ only by how they weight the edges, yet there has been very little direct examination of the relative performance of alternative edge-weighting metrics. This study compares the performance of four BLAST-based edge-weighting metrics: the bit score, bit score ratio (BSR), bit score over anchored length (BAL), and negative common log of the expectation value (NLE). Performance is tested using the Extended CEGMA KOGs (ECK) database, which we introduce here. All metrics performed similarly when analyzing full-length sequences, but dramatic differences emerged as progressively larger fractions of the test sequences were split into fragments. The BSR and BAL successfully rescued subsets of clusters by strengthening certain types of alignments between fragmented sequences, but also shifted the largest correct scores down near the range of scores generated from spurious alignments. This penalty outweighed the benefits in most test cases, and was greatly exacerbated by increasing the MCL inflation parameter, making these metrics less robust than the bit score or the more popular NLE. Notably, the bit score performed as well or better than the other three metrics in all scenarios. The results provide a strong case for use of the bit score, which appears to offer equivalent or superior performance to the more popular NLE. 
The insight that MCL-based clustering methods can be improved using a more tractable edge-weighting metric will greatly simplify future implementations. We demonstrate this with our own minimalist Python implementation: Porthos, which uses only standard libraries and can process a graph with 25 M+ edges connecting the 60 k+ KOG sequences in half a minute using less than half a gigabyte of memory.
Binary sensitivity and specificity metrics are not adequate to describe the performance of quantitative microbial source tracking methods because the estimates depend on the amount of material tested and limit of detection. We introduce a new framework to compare the performance ...
Improving Department of Defense Global Distribution Performance Through Network Analysis
2016-06-01
network performance increase. 14. SUBJECT TERMS supply chain metrics, distribution networks, requisition shipping time, strategic distribution database...peace and war” (p. 4). USTRANSCOM Metrics and Analysis Branch defines, develops, tracks, and maintains outcomes- based supply chain metrics to...2014a, p. 8). The Joint Staff defines a TDD standard as the maximum number of days the supply chain can take to deliver requisitioned materiel
Performance metrics for the assessment of satellite data products: an ocean color case study
Seegers, Bridget N.; Stumpf, Richard P.; Schaeffer, Blake A.; Loftin, Keith A.; Werdell, P. Jeremy
2018-01-01
Performance assessment of ocean color satellite data has generally relied on statistical metrics chosen for their common usage and the rationale for selecting certain metrics is infrequently explained. Commonly reported statistics based on mean squared errors, such as the coefficient of determination (r²), root mean square error, and regression slopes, are most appropriate for Gaussian distributions without outliers and, therefore, are often not ideal for ocean color algorithm performance assessment, which is often limited by sample availability. In contrast, metrics based on simple deviations, such as bias and mean absolute error, as well as pair-wise comparisons, often provide more robust and straightforward quantities for evaluating ocean color algorithms with non-Gaussian distributions and outliers. This study uses a SeaWiFS chlorophyll-a validation data set to demonstrate a framework for satellite data product assessment and recommends a multi-metric and user-dependent approach that can be applied within science, modeling, and resource management communities. PMID:29609296
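The contrast drawn in the abstract above between squared-error statistics and simple-deviation statistics can be shown with a small sketch. The plain, untransformed definitions below are an assumption (the study may apply a transformation before computing them), and the observed and estimated values are invented to include one outlier.

```python
import math

def bias(estimated, observed):
    """Mean signed deviation of the estimates from the observations."""
    return sum(e - o for e, o in zip(estimated, observed)) / len(observed)

def mean_absolute_error(estimated, observed):
    """Mean unsigned deviation; each error counts in proportion to its size."""
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(observed)

def root_mean_square_error(estimated, observed):
    """Squared-error statistic; large errors are weighted disproportionately."""
    mse = sum((e - o) ** 2 for e, o in zip(estimated, observed)) / len(observed)
    return math.sqrt(mse)

# Hypothetical matchups; the last estimate is an outlier.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 1.5, 3.5, 14.0]
print(bias(estimated, observed))                    # 2.625
print(mean_absolute_error(estimated, observed))     # 2.875
print(root_mean_square_error(estimated, observed))  # larger than the MAE
```

Three of the four errors have magnitude 0.5, yet the single outlier pulls the root mean square error well above the mean absolute error, which is the sensitivity to outliers that the abstract cautions against.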
Performance Benchmarks for Scholarly Metrics Associated with Fisheries and Wildlife Faculty
Swihart, Robert K.; Sundaram, Mekala; Höök, Tomas O.; DeWoody, J. Andrew; Kellner, Kenneth F.
2016-01-01
Research productivity and impact are often considered in professional evaluations of academics, and performance metrics based on publications and citations increasingly are used in such evaluations. To promote evidence-based and informed use of these metrics, we collected publication and citation data for 437 tenure-track faculty members at 33 research-extensive universities in the United States belonging to the National Association of University Fisheries and Wildlife Programs. For each faculty member, we computed 8 commonly used performance metrics based on numbers of publications and citations, and recorded covariates including academic age (time since Ph.D.), sex, percentage of appointment devoted to research, and the sub-disciplinary research focus. Standardized deviance residuals from regression models were used to compare faculty after accounting for variation in performance due to these covariates. We also aggregated residuals to enable comparison across universities. Finally, we tested for temporal trends in citation practices to assess whether the “law of constant ratios”, used to enable comparison of performance metrics between disciplines that differ in citation and publication practices, applied to fisheries and wildlife sub-disciplines when mapped to Web of Science Journal Citation Report categories. Our regression models reduced deviance by ¼ to ½. Standardized residuals for each faculty member, when combined across metrics as a simple average or weighted via factor analysis, produced similar results in terms of performance based on percentile rankings. Significant variation was observed in scholarly performance across universities, after accounting for the influence of covariates. In contrast to findings for other disciplines, normalized citation ratios for fisheries and wildlife sub-disciplines increased across years. Increases were comparable for all sub-disciplines except ecology. 
We discuss the advantages and limitations of our methods, illustrate their use when applied to new data, and suggest future improvements. Our benchmarking approach may provide a useful tool to augment detailed, qualitative assessment of performance. PMID:27152838
Measuring β-diversity with species abundance data.
Barwell, Louise J; Isaac, Nick J B; Kunin, William E
2015-07-01
In 2003, 24 presence-absence β-diversity metrics were reviewed and a number of trade-offs and redundancies identified. We present a parallel investigation into the performance of abundance-based metrics of β-diversity. β-diversity is a multi-faceted concept, central to spatial ecology. There are multiple metrics available to quantify it: the choice of metric is an important decision. We test 16 conceptual properties and two sampling properties of a β-diversity metric: metrics should be 1) independent of α-diversity and 2) cumulative along a gradient of species turnover. Similarity should be 3) probabilistic when assemblages are independently and identically distributed. Metrics should have 4) a minimum of zero and increase monotonically with the degree of 5) species turnover, 6) decoupling of species ranks and 7) evenness differences. However, complete species turnover should always generate greater values of β than extreme 8) rank shifts or 9) evenness differences. Metrics should 10) have a fixed upper limit, 11) symmetry (βA,B = βB,A ), 12) double-zero asymmetry for double absences and double presences and 13) not decrease in a series of nested assemblages. Additionally, metrics should be independent of 14) species replication 15) the units of abundance and 16) differences in total abundance between sampling units. When samples are used to infer β-diversity, metrics should be 1) independent of sample sizes and 2) independent of unequal sample sizes. We test 29 metrics for these properties and five 'personality' properties. Thirteen metrics were outperformed or equalled across all conceptual and sampling properties. Differences in sensitivity to species' abundance lead to a performance trade-off between sample size bias and the ability to detect turnover among rare species. In general, abundance-based metrics are substantially less biased in the face of undersampling, although the presence-absence metric, βsim , performed well overall. 
Only βBaselga R turn , βBaselga B-C turn and βsim measured purely species turnover and were independent of nestedness. Among the other metrics, sensitivity to nestedness varied >4-fold. Our results indicate large amounts of redundancy among existing β-diversity metrics, whilst the estimation of unseen shared and unshared species is lacking and should be addressed in the design of new abundance-based metrics. © 2015 The Authors. Journal of Animal Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society.
Up Periscope! Designing a New Perceptual Metric for Imaging System Performance
NASA Technical Reports Server (NTRS)
Watson, Andrew B.
2016-01-01
Modern electronic imaging systems include optics, sensors, sampling, noise, processing, compression, transmission and display elements, and are viewed by the human eye. Many of these elements cannot be assessed by traditional imaging system metrics such as the MTF. More complex metrics such as NVTherm do address these elements, but do so largely through parametric adjustment of an MTF-like metric. The parameters are adjusted through subjective testing of human observers identifying specific targets in a set of standard images. We have designed a new metric that is based on a model of human visual pattern classification. In contrast to previous metrics, ours simulates the human observer identifying the standard targets. One application of this metric is to quantify performance of modern electronic periscope systems on submarines.
Garfjeld Roberts, Patrick; Guyver, Paul; Baldwin, Mathew; Akhtar, Kash; Alvand, Abtin; Price, Andrew J; Rees, Jonathan L
2017-02-01
The primary aim was to assess the construct and face validity of ArthroS, a passive haptic VR simulator. A secondary aim was to evaluate the novel performance metrics produced by this simulator. Two groups of 30 participants, each divided into novice, intermediate or expert based on arthroscopic experience, completed three separate tasks on either the knee or shoulder module of the simulator. Performance was recorded using 12 automatically generated performance metrics and video footage of the arthroscopic procedures. The videos were blindly assessed using a validated global rating scale (GRS). Participants completed a survey about the simulator's realism and training utility. This new simulator demonstrated construct validity of its tasks when evaluated against a GRS (p ≤ 0.003 in all cases). Regarding its automatically generated performance metrics, established outputs such as time taken (p ≤ 0.001) and instrument path length (p ≤ 0.007) also demonstrated good construct validity. However, two-thirds of the proposed 'novel metrics' the simulator reports could not distinguish participants based on arthroscopic experience. Face validity assessment rated the simulator as a realistic and useful tool for trainees, but the passive haptic feedback (a key feature of this simulator) was rated as less realistic. The ArthroS simulator has good task construct validity based on established objective outputs, but some of the novel performance metrics could not distinguish between levels of surgical experience. The passive haptic feedback of the simulator also needs improvement. If simulators could offer automated and validated performance feedback, this would facilitate improvements in the delivery of training by allowing trainees to practise and self-assess.
NASA Astrophysics Data System (ADS)
Gide, Milind S.; Karam, Lina J.
2016-08-01
With the increased focus on visual attention (VA) in the last decade, a large number of computational visual saliency methods have been developed over the past few years. These models are traditionally evaluated by using performance evaluation metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Though a considerable number of such metrics have been proposed in the literature, there are notable problems in them. In this work, we discuss shortcomings in existing metrics through illustrative examples and propose a new metric that uses local weights based on fixation density which overcomes these flaws. To compare the performance of our proposed metric at assessing the quality of saliency prediction with other existing metrics, we construct a ground-truth subjective database in which saliency maps obtained from 17 different VA models are evaluated by 16 human observers on a 5-point categorical scale in terms of their visual resemblance with corresponding ground-truth fixation density maps obtained from eye-tracking data. The metrics are evaluated by correlating metric scores with the human subjective ratings. The correlation results show that the proposed evaluation metric outperforms all other popular existing metrics. Additionally, the constructed database and corresponding subjective ratings provide an insight into which of the existing metrics and future metrics are better at estimating the quality of saliency prediction and can be used as a benchmark.
An exploratory survey of methods used to develop measures of performance
NASA Astrophysics Data System (ADS)
Hamner, Kenneth L.; Lafleur, Charles A.
1993-09-01
Nonmanufacturing organizations are being challenged to provide high-quality products and services to their customers, with an emphasis on continuous process improvement. Measures of performance, referred to as metrics, can be used to foster process improvement. The application of performance measurement to nonmanufacturing processes can be very difficult. This research explored methods used to develop metrics in nonmanufacturing organizations. Several methods were formally defined in the literature, and the researchers used a two-step screening process to determine the OMB Generic Method was most likely to produce high-quality metrics. The OMB Generic Method was then used to develop metrics. A few other metric development methods were found in use at nonmanufacturing organizations. The researchers interviewed participants in metric development efforts to determine their satisfaction and to have them identify the strengths and weaknesses of, and recommended improvements to, the metric development methods used. Analysis of participants' responses allowed the researchers to identify the key components of a sound metrics development method. Those components were incorporated into a proposed metric development method that was based on the OMB Generic Method, and should be more likely to produce high-quality metrics that will result in continuous process improvement.
A neural net-based approach to software metrics
NASA Technical Reports Server (NTRS)
Boetticher, G.; Srinivas, Kankanahalli; Eichmann, David A.
1992-01-01
Software metrics provide an effective method for characterizing software. Metrics have traditionally been composed through the definition of an equation. This approach is limited by the requirement that all the interrelationships among the parameters be fully understood. This paper explores an alternative, neural network approach to modeling metrics. Experiments performed on two widely accepted metrics, McCabe and Halstead, indicate that the approach is sound, thus serving as the groundwork for further exploration into the analysis and design of software metrics.
A GPS Phase-Locked Loop Performance Metric Based on the Phase Discriminator Output
Stevanovic, Stefan; Pervan, Boris
2018-01-01
We propose a novel GPS phase-lock loop (PLL) performance metric based on the standard deviation of tracking error (defined as the discriminator’s estimate of the true phase error), and explain its advantages over the popular phase jitter metric using theory, numerical simulation, and experimental results. We derive an augmented GPS phase-lock loop (PLL) linear model, which includes the effect of coherent averaging, to be used in conjunction with this proposed metric. The augmented linear model allows more accurate calculation of tracking error standard deviation in the presence of additive white Gaussian noise (AWGN) as compared to traditional linear models. The standard deviation of tracking error, with a threshold corresponding to half of the arctangent discriminator pull-in region, is shown to be a more reliable/robust measure of PLL performance under interference conditions than the phase jitter metric. In addition, the augmented linear model is shown to be valid up until this threshold, which facilitates efficient performance prediction, so that time-consuming direct simulations and costly experimental testing can be reserved for PLL designs that are much more likely to be successful. The effect of varying receiver reference oscillator quality on the tracking error metric is also considered. PMID:29351250
Zone calculation as a tool for assessing performance outcome in laparoscopic suturing.
Buckley, Christina E; Kavanagh, Dara O; Nugent, Emmeline; Ryan, Donncha; Traynor, Oscar J; Neary, Paul C
2015-06-01
Simulator performance is measured by metrics, which are valued as an objective way of assessing trainees. Certain procedures such as laparoscopic suturing, however, may not be suitable for assessment under traditionally formulated metrics. Our aim was to assess if our new metric is a valid method of assessing laparoscopic suturing. A software program was developed in order to create a new metric, which would calculate the percentage of time spent operating within pre-defined areas called "zones." Twenty-five candidates (medical students N = 10, surgical residents N = 10, and laparoscopic experts N = 5) performed the laparoscopic suturing task on the ProMIS III® simulator. New metrics of "in-zone" and "out-zone" scores as well as traditional metrics of time, path length, and smoothness were generated. Performance was also assessed by two blinded observers using the OSATS and FLS rating scales. This novel metric was evaluated by comparing it to both traditional metrics and subjective scores. There was a significant difference in the average in-zone and out-zone scores between all three experience groups (p < 0.05). The new zone metric scores correlated significantly with the subjective blinded-observer scores of OSATS and FLS (p = 0.0001). The new zone metric scores also correlated significantly with the traditional metrics of path length, time, and smoothness (p < 0.05). The new metric is a valid tool for assessing laparoscopic suturing objectively. This could be incorporated into a competency-based curriculum to monitor resident progression in the simulated setting.
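As an illustration of the "percentage of time spent within zones" idea described above, the following is a minimal sketch only. It assumes uniformly sampled instrument-tip positions and axis-aligned box zones; the function name `zone_scores` and the zone representation are hypothetical and are not taken from the study's software.

```python
def zone_scores(positions, zones):
    """Return (in_zone_pct, out_zone_pct) for uniformly sampled positions.

    positions: list of (x, y, z) instrument-tip samples (assumed uniform in time).
    zones: list of ((xmin, ymin, zmin), (xmax, ymax, zmax)) boxes (hypothetical format).
    """
    if not positions:
        raise ValueError("positions must not be empty")

    def inside(point, box):
        lower, upper = box
        return all(lo <= c <= hi for c, lo, hi in zip(point, lower, upper))

    # With uniform sampling, the share of samples in a zone equals the share of time.
    in_count = sum(1 for p in positions if any(inside(p, b) for b in zones))
    in_pct = 100.0 * in_count / len(positions)
    return in_pct, 100.0 - in_pct


samples = [(0, 0, 0), (1, 1, 1), (5, 5, 5), (0.5, 0.5, 0.5)]
unit_box = [((0, 0, 0), (1, 1, 1))]
in_pct, out_pct = zone_scores(samples, unit_box)
print(in_pct, out_pct)
```

With non-uniform sampling, each sample would instead need to be weighted by the time interval it represents.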
Metrics for Offline Evaluation of Prognostic Performance
NASA Technical Reports Server (NTRS)
Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai
2010-01-01
Prognostic performance evaluation has gained significant attention in the past few years. Currently, prognostics concepts lack standard definitions and suffer from ambiguous and inconsistent interpretations. This lack of standards is in part due to the varied end-user requirements for different applications, time scales, available information, domain dynamics, etc. The research community has used a variety of metrics largely based on convenience and their respective requirements. Very little attention has been focused on establishing a standardized approach to compare different efforts. This paper presents several new evaluation metrics tailored for prognostics that were recently introduced and were shown to effectively evaluate various algorithms as compared to other conventional metrics. Specifically, this paper presents a detailed discussion on how these metrics should be interpreted and used. These metrics have the capability of incorporating probabilistic uncertainty estimates from prognostic algorithms. In addition to quantitative assessment they also offer a comprehensive visual perspective that can be used in designing the prognostic system. Several methods are suggested to customize these metrics for different applications. Guidelines are provided to help choose one method over another based on distribution characteristics. Various issues faced by prognostics and its performance evaluation are discussed followed by a formal notational framework to help standardize subsequent developments.
Creating "Intelligent" Climate Model Ensemble Averages Using a Process-Based Framework
NASA Astrophysics Data System (ADS)
Baker, N. C.; Taylor, P. C.
2014-12-01
The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is often used to add value to model projections: consensus projections have been shown to consistently outperform individual models. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, certain models reproduce climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting multi-model ensembles. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument and surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing weighted and unweighted model ensembles. For example, one tested metric weights the ensemble by how well models reproduce the time-series probability distribution of the cloud forcing component of reflected shortwave radiation. 
The weighted ensemble for this metric indicates lower simulated precipitation (up to 0.7 mm/day) in tropical regions than the unweighted ensemble: since CMIP5 models have been shown to overproduce precipitation, this result could indicate that the metric is effective in identifying models which simulate more realistic precipitation. Ultimately, the goal of the framework is to identify performance metrics for advising better methods for ensemble averaging models and to create better climate predictions.
NASA Astrophysics Data System (ADS)
Trimborn, Barbara; Wolf, Ivo; Abu-Sammour, Denis; Henzler, Thomas; Schad, Lothar R.; Zöllner, Frank G.
2017-03-01
Image registration of preprocedural contrast-enhanced CTs to intraprocedural cone-beam computed tomography (CBCT) can provide additional information for interventional liver oncology procedures such as transcatheter arterial chemoembolisation (TACE). In this paper, a novel similarity metric for gradient-based image registration is proposed. The metric relies on the patch-based computation of histograms of oriented gradients (HOG) building the basis for a feature descriptor. The metric was implemented in a framework for rigid 3D-3D-registration of pre-interventional CT with intra-interventional CBCT data obtained during the workflow of a TACE. To evaluate the performance of the new metric, the capture range was estimated based on the calculation of the mean target registration error and compared to the results obtained with a normalized cross correlation metric. The results show that 3D HOG feature descriptors are suitable as an image-similarity metric and that the novel metric can compete with established methods in terms of registration accuracy.
Algal bioassessment metrics for wadeable streams and rivers of Maine, USA
Danielson, Thomas J.; Loftin, Cynthia S.; Tsomides, Leonidas; DiFranco, Jeanne L.; Connors, Beth
2011-01-01
Many state water-quality agencies use biological assessment methods based on lotic fish and macroinvertebrate communities, but relatively few states have incorporated algal multimetric indices into monitoring programs. Algae are good indicators for monitoring water quality because they are sensitive to many environmental stressors. We evaluated benthic algal community attributes along a landuse gradient affecting wadeable streams and rivers in Maine, USA, to identify potential bioassessment metrics. We collected epilithic algal samples from 193 locations across the state. We computed weighted-average optima for common taxa for total P, total N, specific conductance, % impervious cover, and % developed watershed, which included all land use that is no longer forest or wetland. We assigned Maine stream tolerance values and categories (sensitive, intermediate, tolerant) to taxa based on their optima and responses to watershed disturbance. We evaluated performance of algal community metrics used in multimetric indices from other regions and novel metrics based on Maine data. Metrics specific to Maine data, such as the relative richness of species characterized as being sensitive in Maine, were more correlated with % developed watershed than most metrics used in other regions. Few community-structure attributes (e.g., species richness) were useful metrics in Maine. Performance of algal bioassessment models would be improved if metrics were evaluated with attributes of local data before inclusion in multimetric indices or statistical models. © 2011 by The North American Benthological Society.
Developing a Security Metrics Scorecard for Healthcare Organizations.
Elrefaey, Heba; Borycki, Elizabeth; Kushniruk, Andrea
2015-01-01
In healthcare, information security is a key aspect of protecting a patient's privacy and ensuring systems availability to support patient care. Security managers need to measure the performance of security systems and this can be achieved by using evidence-based metrics. In this paper, we describe the development of an evidence-based security metrics scorecard specific to healthcare organizations. Study participants were asked to comment on the usability and usefulness of a prototype of a security metrics scorecard that was developed based on current research in the area of general security metrics. Study findings revealed that scorecards need to be customized for the healthcare setting in order for the security information to be useful and usable in healthcare organizations. The study findings resulted in the development of a security metrics scorecard that matches the healthcare security experts' information requirements.
DOT National Transportation Integrated Search
2016-06-01
Traditional highway safety performance metrics have been largely based on fatal crashes and more recently serious injury crashes. In the near future however, there may be less severe motor vehicle crashes due to advances in driver assistance systems,...
NASA Astrophysics Data System (ADS)
Camp, H. A.; Moyer, Steven; Moore, Richard K.
2010-04-01
The Night Vision and Electronic Sensors Directorate's (NVESD) current time-limited search (TLS) model, which makes use of the targeting task performance (TTP) metric to describe image quality, does not explicitly account for the effects of visual clutter on observer performance. The TLS model is currently based on empirical fits to describe human performance for a time of day, spectrum and environment. Incorporating a clutter metric into the TLS model may reduce the number of these empirical fits needed. The masked target transform volume (MTTV) clutter metric has been previously presented and compared to other clutter metrics. Using real infrared imagery of rural scenes with varying levels of clutter, NVESD is currently evaluating the appropriateness of the MTTV metric. NVESD had twenty subject matter experts (SMEs) rank the amount of clutter in each scene in a series of pair-wise comparisons. MTTV metric values were calculated and then compared to the SME observers' rankings. The MTTV metric ranked the clutter in a similar manner to the SME evaluation, suggesting that the MTTV metric may emulate SME response. This paper is a first step in quantifying clutter and measuring the agreement to subjective human evaluation.
Guidelines for evaluating performance of oyster habitat restoration
Baggett, Lesley P.; Powers, Sean P.; Brumbaugh, Robert D.; Coen, Loren D.; DeAngelis, Bryan M.; Greene, Jennifer K.; Hancock, Boze T.; Morlock, Summer M.; Allen, Brian L.; Breitburg, Denise L.; Bushek, David; Grabowski, Jonathan H.; Grizzle, Raymond E.; Grosholz, Edwin D.; LaPeyre, Megan K.; Luckenbach, Mark W.; McGraw, Kay A.; Piehler, Michael F.; Westby, Stephanie R.; zu Ermgassen, Philine S. E.
2015-01-01
Restoration of degraded ecosystems is an important societal goal, yet inadequate monitoring and the absence of clear performance metrics are common criticisms of many habitat restoration projects. Funding limitations can prevent adequate monitoring, but we suggest that the lack of accepted metrics to address the diversity of restoration objectives also presents a serious challenge to the monitoring of restoration projects. A working group with experience in designing and monitoring oyster reef projects was used to develop standardized monitoring metrics, units, and performance criteria that would allow for comparison among restoration sites and projects of various construction types. A set of four universal metrics (reef areal dimensions, reef height, oyster density, and oyster size–frequency distribution) and a set of three universal environmental variables (water temperature, salinity, and dissolved oxygen) are recommended to be monitored for all oyster habitat restoration projects regardless of their goal(s). In addition, restoration goal-based metrics specific to four commonly cited ecosystem service-based restoration goals are recommended, along with an optional set of seven supplemental ancillary metrics that could provide information useful to the interpretation of prerestoration and postrestoration monitoring data. Widespread adoption of a common set of metrics with standardized techniques and units to assess well-defined goals not only allows practitioners to gauge the performance of their own projects but also allows for comparison among projects, which is both essential to the advancement of the field of oyster restoration and can provide new knowledge about the structure and ecological function of oyster reef ecosystems.
Using Publication Metrics to Highlight Academic Productivity and Research Impact
Carpenter, Christopher R.; Cone, David C.; Sarli, Cathy C.
2016-01-01
This article provides a broad overview of widely available measures of academic productivity and impact using publication data and highlights uses of these metrics for various purposes. Metrics based on publication data include measures such as number of publications, number of citations, the journal impact factor score, and the h-index, as well as emerging metrics based on document-level metrics. Publication metrics can be used for a variety of purposes for tenure and promotion, grant applications and renewal reports, benchmarking, recruiting efforts, and administrative purposes for departmental or university performance reports. The authors also highlight practical applications of measuring and reporting academic productivity and impact to emphasize and promote individual investigators, grant applications, or department output. PMID:25308141
Quality of service routing in the differentiated services framework
NASA Astrophysics Data System (ADS)
Oliveira, Marilia C.; Melo, Bruno; Quadros, Goncalo; Monteiro, Edmundo
2001-02-01
In this paper we present a quality of service routing strategy for networks where traffic differentiation follows the class-based paradigm, as in the Differentiated Services framework. This routing strategy is based on a quality of service metric that represents the impact of the delay and losses observed at each router in the network on application performance. Based on this metric, a path is selected for each class according to the class's sensitivity to delay and losses. The distribution of the metric is triggered by a relative criterion with two thresholds, and the values advertised are the moving average of the last values measured.
ERIC Educational Resources Information Center
Travis, James L., III
2014-01-01
This study investigated how and to what extent the development and use of the OV-5a operational architecture decomposition tree (OADT) from the Department of Defense (DoD) Architecture Framework (DoDAF) affects requirements analysis with respect to complete performance metrics for performance-based services acquisition of ICT under rigid…
DeJournett, Jeremy; DeJournett, Leon
2017-11-01
Effective glucose control in the intensive care unit (ICU) setting has the potential to decrease morbidity and mortality rates and thereby decrease health care expenditures. To evaluate what constitutes effective glucose control, typically several metrics are reported, including time in range, time in mild and severe hypoglycemia, coefficient of variation, and others. To date, there is no one metric that combines all of these individual metrics to give a number indicative of overall performance. We proposed a composite metric that combines 5 commonly reported metrics, and we used this composite metric to compare 6 glucose controllers. We evaluated the following controllers: Ideal Medical Technologies (IMT) artificial-intelligence-based controller, Yale protocol, Glucommander, Wintergerst et al PID controller, GRIP, and NICE-SUGAR. We evaluated each controller across 80 simulated patients, 4 clinically relevant exogenous dextrose infusions, and one nonclinical infusion as a test of the controller's ability to handle difficult situations. This gave a total of 2400 5-day simulations, and 585 604 individual glucose values for analysis. We used a random walk sensor error model that gave a 10% MARD. For each controller, we calculated severe hypoglycemia (<40 mg/dL), mild hypoglycemia (40-69 mg/dL), normoglycemia (70-140 mg/dL), hyperglycemia (>140 mg/dL), and coefficient of variation (CV), as well as our novel controller metric. For the controllers tested, we achieved the following median values for our novel controller scoring metric: IMT: 88.1, YALE: 46.7, GLUC: 47.2, PID: 50, GRIP: 48.2, NICE: 46.4. The novel scoring metric employed in this study shows promise as a means for evaluating new and existing ICU-based glucose controllers, and it could be used in the future to compare results of glucose control studies in critical care. The IMT AI-based glucose controller demonstrated the most consistent performance results based on this new metric.
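The individual metrics listed in this abstract can be computed from a series of glucose readings as sketched below. This is an illustration under stated assumptions, not the authors' composite score, whose formula is not given in the abstract. The function name `glucose_summary` is hypothetical, readings are assumed to be equally spaced in time, the mild hypoglycemia band is treated as 40 to just under 70 mg/dL, and the population standard deviation is assumed for the CV.

```python
from statistics import mean, pstdev


def glucose_summary(values_mg_dl):
    """Percent of readings in each glycemic band, plus coefficient of variation.

    Assumes equally spaced readings, so percent of readings approximates
    percent of time. Band edges follow the ranges quoted in the abstract.
    """
    if not values_mg_dl:
        raise ValueError("values_mg_dl must not be empty")
    n = len(values_mg_dl)

    def pct(predicate):
        return 100.0 * sum(1 for g in values_mg_dl if predicate(g)) / n

    return {
        "severe_hypo_pct": pct(lambda g: g < 40),
        "mild_hypo_pct": pct(lambda g: 40 <= g < 70),  # assumed half-open band
        "normo_pct": pct(lambda g: 70 <= g <= 140),
        "hyper_pct": pct(lambda g: g > 140),
        # CV as a percentage; population standard deviation is an assumption.
        "cv_pct": 100.0 * pstdev(values_mg_dl) / mean(values_mg_dl),
    }


summary = glucose_summary([35, 60, 100, 120, 150])
print(summary)
```

The four band percentages always sum to 100 because the bands partition the range of possible readings.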
Value-based metrics and Internet-based enterprises
NASA Astrophysics Data System (ADS)
Gupta, Krishan M.
2001-10-01
Within the last few years, a host of value-based metrics like EVA, MVA, TBR, CFORI, and TSR have evolved. This paper attempts to analyze the validity and applicability of EVA and Balanced Scorecard for Internet-based organizations. Despite the collapse of the dot-com model, firms engaged in e-commerce continue to struggle to find new ways to account for customer base, technology, employees, knowledge, etc., as part of the value of the firm. While some metrics, like the Balanced Scorecard, are geared towards internal use, others, like EVA, are for external use. Value-based metrics are used for performing internal audits as well as comparing firms against one another, and can also be effectively utilized by individuals outside the firm looking to determine if the firm is creating value for its stakeholders.
Evaluating true BCI communication rate through mutual information and language models.
Speier, William; Arnold, Corey; Pouratian, Nader
2013-01-01
Brain-computer interface (BCI) systems are a promising means for restoring communication to patients suffering from "locked-in" syndrome. Research to improve system performance primarily focuses on means to overcome the low signal-to-noise ratio of electroencephalographic (EEG) recordings. However, the literature and methods are difficult to compare due to the array of evaluation metrics and assumptions underlying them, including that: 1) all characters are equally probable, 2) character selection is memoryless, and 3) errors occur completely at random. The standardization of evaluation metrics that more accurately reflect the amount of information contained in BCI language output is critical to make progress. We present a mutual information-based metric that incorporates prior information and a model of systematic errors. The parameters of a system used in one study were re-optimized, showing that the metric used in optimization significantly affects the parameter values chosen and the resulting system performance. The results of 11 BCI communication studies were then evaluated using different metrics, including those previously used in BCI literature and the newly advocated metric. Six studies' results varied based on the metric used for evaluation and the proposed metric produced results that differed from those originally published in two of the studies. Standardizing metrics to accurately reflect the rate of information transmission is critical to properly evaluate and compare BCI communication systems and advance the field in an unbiased manner.
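To make the idea of a mutual information-based metric concrete, the sketch below computes I(X;Y) in bits per selection from a character prior and a confusion matrix. It is a generic textbook calculation under stated assumptions, not the authors' metric: the function name and argument layout are hypothetical, and the language-model memory described in the abstract is not modeled.

```python
import math

def mutual_information_bits(prior, confusion):
    """I(X;Y) in bits, where prior[i] = P(X=i) and confusion[i][j] = P(Y=j | X=i)."""
    n_out = len(confusion[0])
    # Marginal probability of each selected (output) symbol.
    p_y = [sum(prior[i] * confusion[i][j] for i in range(len(prior)))
           for j in range(n_out)]
    mi = 0.0
    for i, p_x in enumerate(prior):
        for j in range(n_out):
            p_xy = p_x * confusion[i][j]
            if p_xy > 0:  # zero-probability cells contribute nothing
                mi += p_xy * math.log2(confusion[i][j] / p_y[j])
    return mi
```

With a non-uniform prior, even an error-free system conveys less than log2(N) bits per selection, which is why the equal-probability assumption can overstate the communication rate.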
Performance Metrics, Error Modeling, and Uncertainty Quantification
NASA Technical Reports Server (NTRS)
Tian, Yudong; Nearing, Grey S.; Peters-Lidard, Christa D.; Harrison, Kenneth W.; Tang, Ling
2016-01-01
A common set of statistical metrics has been used to summarize the performance of models or measurements, the most widely used ones being bias, mean square error, and linear correlation coefficient. They assume linear, additive, Gaussian errors, and they are interdependent, incomplete, and incapable of directly quantifying uncertainty. The authors demonstrate that these metrics can be directly derived from the parameters of the simple linear error model. Since a correct error model captures the full error information, it is argued that the specification of a parametric error model should be an alternative to the metrics-based approach. The error-modeling methodology is applicable to both linear and nonlinear errors, while the metrics are only meaningful for linear errors. In addition, the error model expresses the error structure more naturally, and directly quantifies uncertainty. This argument is further explained by highlighting the intrinsic connections between the performance metrics, the error model, and the joint distribution between the data and the reference.
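The claim that the three conventional metrics follow from the parameters of a linear error model can be illustrated with a short derivation. The notation below is an assumption made for illustration (the abstract does not give symbols): X is the reference with mean \mu_X and variance \sigma_X^2, Y is the model or measurement, and the noise term is independent of X.

```latex
% Assumed additive linear error model (illustrative notation):
%   Y = a + b X + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2),
%   with \varepsilon independent of X.
\begin{align}
\text{bias} &= E[Y - X] = a + (b - 1)\,\mu_X \\
\text{MSE}  &= E[(Y - X)^2]
             = \bigl(a + (b - 1)\,\mu_X\bigr)^2 + (b - 1)^2 \sigma_X^2 + \sigma^2 \\
\rho_{XY}   &= \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}
             = \frac{b\,\sigma_X}{\sqrt{b^2 \sigma_X^2 + \sigma^2}}
\end{align}
```

All three quantities are functions of the same parameters (a, b, \sigma) and the reference statistics, which is one way to see why they are interdependent.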
Designing a Robust Micromixer Based on Fluid Stretching
NASA Astrophysics Data System (ADS)
Mott, David; Gautam, Dipesh; Voth, Greg; Oran, Elaine
2010-11-01
A metric for measuring fluid stretching based on finite-time Lyapunov exponents is described, and the use of this metric for optimizing mixing in microfluidic components is explored. The metric is implemented within an automated design approach called the Computational Toolbox (CTB). The CTB designs components by adding geometric features, such as grooves of various shapes, to a microchannel. The transport produced by each of these features in isolation was pre-computed and stored as an "advection map" for that feature, and the flow through a composite geometry that combines these features is calculated rapidly by applying the corresponding maps in sequence. A genetic algorithm search then chooses the feature combination that optimizes a user-specified metric. Metrics based on the variance of concentration generally require the user to specify the fluid distributions at inflow, which leads to different mixer designs for different inflow arrangements. The stretching metric is independent of the fluid arrangement at inflow. Mixers designed using the stretching metric are compared to those designed using a variance of concentration metric and show excellent performance across a variety of inflow distributions and diffusivities.
Multidisciplinary life cycle metrics and tools for green buildings.
Helgeson, Jennifer F; Lippiatt, Barbara C
2009-07-01
Building sector stakeholders need compelling metrics, tools, data, and case studies to support major investments in sustainable technologies. Proponents of green building widely claim that buildings integrating sustainable technologies are cost effective, but often these claims are based on incomplete, anecdotal evidence that is difficult to reproduce and defend. The claims suffer from 2 main weaknesses: 1) buildings on which claims are based are not necessarily "green" in a science-based, life cycle assessment (LCA) sense and 2) measures of cost effectiveness often are not based on standard methods for measuring economic worth. Yet, the building industry demands compelling metrics to justify sustainable building designs. The problem is hard to solve because, until now, neither methods nor robust data supporting defensible business cases were available. The US National Institute of Standards and Technology (NIST) Building and Fire Research Laboratory is beginning to address these needs by developing metrics and tools for assessing the life cycle economic and environmental performance of buildings. Economic performance is measured with the use of standard life cycle costing methods. Environmental performance is measured by LCA methods that assess the "carbon footprint" of buildings, as well as 11 other sustainability metrics, including fossil fuel depletion, smog formation, water use, habitat alteration, indoor air quality, and effects on human health. Carbon efficiency ratios and other eco-efficiency metrics are established to yield science-based measures of the relative worth, or "business cases," for green buildings. Here, the approach is illustrated through a realistic building case study focused on different heating, ventilation, air conditioning technology energy efficiency. Additionally, the evolution of the Building for Environmental and Economic Sustainability multidisciplinary team and future plans in this area are described.
Detecting population recovery using gametic disequilibrium-based effective population size estimates
David A. Tallmon; Robin S. Waples; Dave Gregovich; Michael K. Schwartz
2012-01-01
Recovering populations often must meet specific growth rate or abundance targets before their legal status can be changed from endangered or threatened. While the efficacy, power, and performance of population metrics to infer trends in declining populations has received considerable attention, how these same metrics perform when populations are increasing is less...
Quality Measures for Dialysis: Time for a Balanced Scorecard
2016-01-01
Recent federal legislation establishes a merit-based incentive payment system for physicians, with a scorecard for each professional. The Centers for Medicare and Medicaid Services evaluate quality of care with clinical performance measures and have used these metrics for public reporting and payment to dialysis facilities. Similar metrics may be used for the future merit-based incentive payment system. In nephrology, most clinical performance measures measure processes and intermediate outcomes of care. These metrics were developed from population studies of best practice and do not identify opportunities for individualizing care on the basis of patient characteristics and individual goals of treatment. The In-Center Hemodialysis (ICH) Consumer Assessment of Healthcare Providers and Systems (CAHPS) survey examines patients' perception of care and has entered the arena to evaluate quality of care. A balanced scorecard of quality performance should include three elements: population-based best clinical practice, patient perceptions, and individually crafted patient goals of care. PMID:26316622
Tang, Tao; Stevenson, R Jan; Infante, Dana M
2016-10-15
Regional variation in both natural environment and human disturbance can influence performance of ecological assessments. In this study we calculated 5 types of benthic diatom multimetric indices (MMIs) with 3 different approaches to account for variation in ecological assessments. We used: site groups defined by ecoregions or diatom typologies; the same or different sets of metrics among site groups; and unmodeled or modeled MMIs, where models accounted for natural variation in metrics within site groups by calculating an expected reference condition for each metric and each site. We used data from the USEPA's National Rivers and Streams Assessment to calculate the MMIs and evaluate changes in MMI performance. MMI performance was evaluated with indices of precision, bias, responsiveness, sensitivity and relevancy which were respectively measured as MMI variation among reference sites, effects of natural variables on MMIs, difference between MMIs at reference and highly disturbed sites, percent of highly disturbed sites properly classified, and relation of MMIs to human disturbance and stressors. All 5 types of MMIs showed considerable discrimination ability. Using different metrics among ecoregions sometimes reduced precision, but it consistently increased responsiveness, sensitivity, and relevancy. Site specific metric modeling reduced bias and increased responsiveness. Combined use of different metrics among site groups and site specific modeling significantly improved MMI performance irrespective of site grouping approach. Compared to ecoregion site classification, grouping sites based on diatom typologies improved precision, but did not improve overall performance of MMIs if we accounted for natural variation in metrics with site specific models. We conclude that using different metrics among ecoregions and site specific metric modeling improve MMI performance, particularly when used together. 
Applications of these MMI approaches in ecological assessments introduced a tradeoff with assessment consistency when metrics differed across site groups, but they justified the convenient and consistent use of ecoregions. Copyright © 2016 Elsevier B.V. All rights reserved.
Cognitive skills assessment during robot-assisted surgery: separating the wheat from the chaff.
Guru, Khurshid A; Esfahani, Ehsan T; Raza, Syed J; Bhat, Rohit; Wang, Katy; Hammond, Yana; Wilding, Gregory; Peabody, James O; Chowriappa, Ashirwad J
2015-01-01
To investigate the utility of cognitive assessment during robot-assisted surgery (RAS) to define skills in terms of cognitive engagement, mental workload, and mental state; while objectively differentiating between novice and expert surgeons. In all, 10 surgeons with varying operative experience were assigned to beginner (BG), combined competent and proficient (CPG), and expert (EG) groups based on the Dreyfus model. The participants performed tasks for basic, intermediate and advanced skills on the da Vinci Surgical System. Participant performance was assessed using both tool-based and cognitive metrics. Tool-based metrics showed significant differences between the BG vs CPG and the BG vs EG, in basic skills. While performing intermediate skills, there were significant differences only on the instrument-to-instrument collisions between the BG vs CPG (2.0 vs 0.2, P = 0.028), and the BG vs EG (2.0 vs 0.1, P = 0.018). There were no significant differences between the CPG and EG for both basic and intermediate skills. However, using cognitive metrics, there were significant differences between all groups for the basic and intermediate skills. In advanced skills, there were no significant differences between the CPG and the EG except time (1116 vs 599.6 s), using tool-based metrics. However, cognitive metrics revealed significant differences between both groups. Cognitive assessment of surgeons may aid in defining levels of expertise performing complex surgical tasks once competence is achieved. Cognitive assessment may be used as an adjunct to the traditional methods for skill assessment during RAS. © 2014 The Authors. BJU International © 2014 BJU International.
Automated Metrics in a Virtual-Reality Myringotomy Simulator: Development and Construct Validity.
Huang, Caiwen; Cheng, Horace; Bureau, Yves; Ladak, Hanif M; Agrawal, Sumit K
2018-06-15
The objectives of this study were: 1) to develop and implement a set of automated performance metrics into the Western myringotomy simulator, and 2) to establish construct validity. Prospective simulator-based assessment study. The Auditory Biophysics Laboratory at Western University, London, Ontario, Canada. Eleven participants were recruited from the Department of Otolaryngology-Head & Neck Surgery at Western University: four senior otolaryngology consultants and seven junior otolaryngology residents. Educational simulation. Discrimination between expert and novice participants on five primary automated performance metrics: 1) time to completion, 2) surgical errors, 3) incision angle, 4) incision length, and 5) the magnification of the microscope. Automated performance metrics were developed, programmed, and implemented into the simulator. Participants were given a standardized simulator orientation and instructions on myringotomy and tube placement. Each participant then performed 10 procedures and automated metrics were collected. The metrics were analyzed using the Mann-Whitney U test with Bonferroni correction. All metrics discriminated senior otolaryngologists from junior residents with a significance of p < 0.002. Junior residents had 2.8 times more errors compared with the senior otolaryngologists. Senior otolaryngologists took significantly less time to completion compared with junior residents. The senior group also had significantly longer incision lengths, more accurate incision angles, and lower magnification keeping both the umbo and annulus in view. Automated quantitative performance metrics were successfully developed and implemented, and construct validity was established by discriminating between expert and novice participants.
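The statistical comparison described here can be sketched in two small pieces: the Mann-Whitney U statistic for two groups, and the Bonferroni-adjusted significance threshold for several metrics tested together. Function names are hypothetical and the sample values in the usage note are made up; obtaining a p-value from U would normally be left to a statistics library (for example `scipy.stats.mannwhitneyu`).

```python
def mann_whitney_u(a, b):
    """U statistic for sample a: number of pairs (x in a, y in b) with x > y.

    Tied pairs count as one half, following the usual convention.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests
```

For instance, with five metrics tested at a family-wise level of 0.05, each individual test would be judged against 0.01. A U of 0 (or of len(a) * len(b)) means the two groups do not overlap at all on that metric.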
Distributed Space Mission Design for Earth Observation Using Model-Based Performance Evaluation
NASA Technical Reports Server (NTRS)
Nag, Sreeja; LeMoigne-Stewart, Jacqueline; Cervantes, Ben; DeWeck, Oliver
2015-01-01
Distributed Space Missions (DSMs) are gaining momentum in their application to earth observation missions owing to their unique ability to increase observation sampling in multiple dimensions. DSM design is a complex problem with many design variables, multiple objectives determining performance and cost and emergent, often unexpected, behaviors. There are very few open-access tools available to explore the tradespace of variables, minimize cost and maximize performance for pre-defined science goals, and therefore select the most optimal design. This paper presents a software tool that can generate multiple DSM architectures based on pre-defined design variable ranges and size those architectures in terms of predefined science and cost metrics. The tool will help a user select Pareto optimal DSM designs based on design of experiments techniques. The tool will be applied to some earth observation examples to demonstrate its applicability in making some key decisions between different performance metrics and cost metrics early in the design lifecycle.
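Selecting Pareto optimal designs from a set of candidate architectures can be sketched as a simple dominance filter. This is a generic illustration, not the tool described in the abstract: the tuple layout (name, cost, performance) and the function name are assumptions, with lower cost and higher performance taken as better.

```python
def pareto_front(designs):
    """Return names of non-dominated designs.

    designs: list of (name, cost, performance) tuples.
    A design is dominated if another one is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for name, cost, perf in designs:
        dominated = any(
            c <= cost and p >= perf and (c < cost or p > perf)
            for _, c, p in designs
        )
        if not dominated:
            front.append(name)
    return front
```

The quadratic scan is adequate for a few thousand candidate architectures; larger tradespaces would usually call for a sort-based method.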
DOE Office of Scientific and Technical Information (OSTI.GOV)
Owen, D; Anderson, C; Mayo, C
Purpose: To extend the functionality of a commercial treatment planning system (TPS) to support (i) direct use of quantitative image-based metrics within treatment plan optimization and (ii) evaluation of dose-functional volume relationships to assist in functional image adaptive radiotherapy. Methods: A script was written that interfaces with a commercial TPS via an Application Programming Interface (API). The script executes a program that performs dose-functional volume analyses. Written in C#, the script reads the dose grid and correlates it with image data on a voxel-by-voxel basis through API extensions that can access registration transforms. A user interface was designed through WinForms to input parameters and display results. To test the performance of this program, image- and dose-based metrics computed from perfusion SPECT images aligned to the treatment planning CT were generated, validated, and compared. Results: The integration of image analysis information was successfully implemented as a plug-in to a commercial TPS. Perfusion SPECT images were used to validate the calculation and display of image-based metrics as well as dose-intensity metrics and histograms for defined structures on the treatment planning CT. Various biological dose correction models, custom image-based metrics, dose-intensity computations, and dose-intensity histograms were applied to analyze the image-dose profile. Conclusion: It is possible to add image analysis features to commercial TPSs through custom scripting applications. A tool was developed to enable the evaluation of image-intensity-based metrics in the context of functional targeting and avoidance.
In addition to providing dose-intensity metrics and histograms that can be easily extracted from a plan database and correlated with outcomes, the system can also be extended to a plug-in optimization system, which can directly use the computed metrics for optimization of post-treatment tumor or normal tissue response models. Supported by NIH - P01 - CA059827.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, T; Ruan, D
Purpose: The growing size and heterogeneity in training atlas necessitates sophisticated schemes to identify only the most relevant atlases for the specific multi-atlas-based image segmentation problem. This study aims to develop a model to infer the inaccessible oracle geometric relevance metric from surrogate image similarity metrics, and based on such model, provide guidance to atlas selection in multi-atlas-based image segmentation. Methods: We relate the oracle geometric relevance metric in label space to the surrogate metric in image space, by a monotonically non-decreasing function with additive random perturbations. Subsequently, a surrogate’s ability to prognosticate the oracle order for atlas subset selection is quantified probabilistically. Finally, important insights and guidance are provided for the design of fusion set size, balancing the competing demands to include the most relevant atlases and to exclude the most irrelevant ones. A systematic solution is derived based on an optimization framework. Model verification and performance assessment is performed based on clinical prostate MR images. Results: The proposed surrogate model was exemplified by a linear map with normally distributed perturbation, and verified with several commonly-used surrogates, including MSD, NCC and (N)MI. The derived behaviors of different surrogates in atlas selection and their corresponding performance in ultimate label estimate were validated. The performance of NCC and (N)MI was similarly superior to MSD, with a 10% higher atlas selection probability and a segmentation performance increase in DSC by 0.10 with the first and third quartiles of (0.83, 0.89), compared to (0.81, 0.89). The derived optimal fusion set size, valued at 7/8/8/7 for MSD/NCC/MI/NMI, agreed well with the appropriate range [4, 9] from empirical observation.
Conclusion: This work has developed an efficacious probabilistic model to characterize the image-based surrogate metric on atlas selection. Analytical insights lead to valid guiding principles on fusion set size design.
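Two quantities from this abstract lend themselves to a brief sketch: the Dice similarity coefficient (DSC) used to score segmentations, and the selection of a fusion set of size k by ranking atlases on a surrogate similarity score. The function names, the set-of-voxel-indices representation, and the atlas identifiers in the usage note are all hypothetical; the probabilistic model from the abstract is not reproduced.

```python
def dice(a, b):
    """Dice similarity coefficient between two label masks given as sets of voxel indices."""
    if not a and not b:
        return 1.0  # two empty masks agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))

def select_atlases(surrogate_scores, k):
    """Return the k atlas names with the highest surrogate similarity.

    surrogate_scores: dict mapping atlas name -> similarity (higher is more
    relevant, as with NCC or mutual information; for a distance such as MSD
    the scores would need to be negated first).
    """
    ranked = sorted(surrogate_scores, key=surrogate_scores.get, reverse=True)
    return ranked[:k]
```

For example, `select_atlases({"a1": 0.9, "a2": 0.5, "a3": 0.7}, 2)` keeps `a1` and `a3`; whether those are also the best atlases by the oracle metric is exactly the question the abstract's model addresses.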
Yeung, Dit-Yan; Chang, Hong; Dai, Guang
2008-11-01
In recent years, metric learning in the semisupervised setting has aroused a lot of research interest. One type of semisupervised metric learning utilizes supervisory information in the form of pairwise similarity or dissimilarity constraints. However, most methods proposed so far are either limited to linear metric learning or unable to scale well with the data set size. In this letter, we propose a nonlinear metric learning method based on the kernel approach. By applying low-rank approximation to the kernel matrix, our method can handle significantly larger data sets. Moreover, our low-rank approximation scheme can naturally lead to out-of-sample generalization. Experiments performed on both artificial and real-world data show very promising results.
Cognitive context detection in UAS operators using eye-gaze patterns on computer screens
NASA Astrophysics Data System (ADS)
Mannaru, Pujitha; Balasingam, Balakumar; Pattipati, Krishna; Sibley, Ciara; Coyne, Joseph
2016-05-01
In this paper, we demonstrate the use of eye-gaze metrics of unmanned aerial systems (UAS) operators as effective indices of their cognitive workload. Our analyses are based on an experiment where twenty participants performed pre-scripted UAS missions of three different difficulty levels by interacting with two custom designed graphical user interfaces (GUIs) that are displayed side by side. First, we compute several eye-gaze metrics, traditional eye movement metrics as well as newly proposed ones, and analyze their effectiveness as cognitive classifiers. Most of the eye-gaze metrics are computed by dividing the computer screen into "cells". Then, we perform several analyses in order to select metrics for effective cognitive context classification related to our specific application; the objectives of these analyses are to (i) identify appropriate ways to divide the screen into cells; (ii) select appropriate metrics for training and classification of cognitive features; and (iii) identify a suitable classification method.
Accelerating Time-Varying Hardware Volume Rendering Using TSP Trees and Color-Based Error Metrics
NASA Technical Reports Server (NTRS)
Ellsworth, David; Chiang, Ling-Jen; Shen, Han-Wei; Kwak, Dochan (Technical Monitor)
2000-01-01
This paper describes a new hardware volume rendering algorithm for time-varying data. The algorithm uses the Time-Space Partitioning (TSP) tree data structure to identify regions within the data that have spatial or temporal coherence. By using this coherence, the rendering algorithm can improve performance when the volume data is larger than the texture memory capacity by decreasing the amount of textures required. This coherence can also allow improved speed by appropriately rendering flat-shaded polygons instead of textured polygons, and by not rendering transparent regions. To reduce the polygonization overhead caused by the use of the hierarchical data structure, we introduce an optimization method using polygon templates. The paper also introduces new color-based error metrics, which more accurately identify coherent regions compared to the earlier scalar-based metrics. By showing experimental results from runs using different data sets and error metrics, we demonstrate that the new methods give substantial improvements in volume rendering performance.
Duran, Cassidy; Estrada, Sean; O'Malley, Marcia; Sheahan, Malachi G; Shames, Murray L; Lee, Jason T; Bismuth, Jean
2015-12-01
Fundamental skills testing is now required for certification in general surgery. No model for assessing fundamental endovascular skills exists. Our objective was to develop a model that tests the fundamental endovascular skills and differentiates competent from noncompetent performance. The Fundamentals of Endovascular Surgery model was developed in silicon and virtual-reality versions. Twenty individuals (with a range of experience) performed four tasks on each model in three separate sessions. Tasks on the silicon model were performed under fluoroscopic guidance, and electromagnetic tracking captured motion metrics for catheter tip position. Image processing captured tool tip position and motion on the virtual model. Performance was evaluated using a global rating scale, blinded video assessment of error metrics, and catheter tip movement and position. Motion analysis was based on derivations of speed and position that define proficiency of movement (spectral arc length, duration of submovement, and number of submovements). Performance was significantly different between competent and noncompetent interventionalists for the three performance measures of motion metrics, error metrics, and global rating scale. The mean error metric score was 6.83 for noncompetent individuals and 2.51 for the competent group (P < .0001). Median global rating scores were 2.25 for the noncompetent group and 4.75 for the competent users (P < .0001). The Fundamentals of Endovascular Surgery model successfully differentiates competent and noncompetent performance of fundamental endovascular skills based on a series of objective performance measures. This model could serve as a platform for skills testing for all trainees. Copyright © 2015 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
Yu, Zhan; Li, Yuanyang; Liu, Lisheng; Guo, Jin; Wang, Tingfeng; Yang, Guoqing
2017-11-10
The speckle pattern (line-by-line) sequential extraction (SPSE) metric is proposed on the basis of one-dimensional speckle intensity level-crossing theory. Through sequential extraction of the received speckle information, speckle metrics for estimating the variation of the focusing spot size on a remote diffuse target are obtained. Based on simulation, we discuss the range of application of the SPSE metric under theoretical conditions and how the aperture size of the observation system affects the metric's performance. The results of the analyses are verified by experiment. The method applies to the detection of relatively static targets (speckle jitter frequency lower than the CCD sampling frequency). The SPSE metric can determine the variation of the focusing spot size over a long distance; moreover, it can estimate the spot size under some conditions. Monitoring and feedback of the far-field spot can therefore be implemented in laser focusing system applications to help the system optimize its focusing performance.
NASA Astrophysics Data System (ADS)
Stisen, S.; Demirel, C.; Koch, J.
2017-12-01
Evaluation of performance is an integral part of model development and calibration, and it is of paramount importance when communicating modelling results to stakeholders and the scientific community. There exists a comprehensive and well tested toolbox of metrics to assess temporal model performance in the hydrological modelling community. In contrast, experience in evaluating spatial performance has not kept pace with the wide availability of spatial observations or with the sophisticated model codes that simulate the spatial variability of complex hydrological processes. This study aims at making a contribution towards advancing spatial pattern oriented model evaluation for distributed hydrological models. This is achieved by introducing a novel spatial performance metric which provides robust pattern performance during model calibration. The promoted SPAtial EFficiency (spaef) metric reflects three equally weighted components: correlation, coefficient of variation and histogram overlap. This multi-component approach is necessary in order to adequately compare spatial patterns. spaef, its three components individually and two alternative spatial performance metrics, i.e. connectivity analysis and fractions skill score, are tested in a spatial pattern oriented model calibration of a catchment model in Denmark. The calibration is constrained by a remote sensing based spatial pattern of evapotranspiration and discharge timeseries at two stations. Our results stress that stand-alone metrics tend to fail to provide holistic pattern information to the optimizer, which underlines the importance of multi-component metrics. The three spaef components are independent, which allows them to complement each other in a meaningful way.
This study promotes the use of bias insensitive metrics which allow comparing variables which are related but may differ in unit in order to optimally exploit spatial observations made available by remote sensing platforms. We see great potential of spaef across environmental disciplines dealing with spatially distributed modelling.
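A three-component metric of the kind described above can be sketched in plain Python. The abstract names the components (correlation, coefficient of variation, histogram overlap) but does not give the formula, so the combination used here, 1 - sqrt((alpha-1)^2 + (beta-1)^2 + (gamma-1)^2), the CV ratio taken as simulated over observed, the histogram computed on z-scores, and the bin count are all assumptions; function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def _pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def _histogram(values, lo, hi, bins):
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the maximum value into the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts

def spaef(obs, sim, bins=10):
    """Sketch of a spatial efficiency score for two equally sized, flattened patterns."""
    m_obs, s_obs = mean(obs), pstdev(obs)
    m_sim, s_sim = mean(sim), pstdev(sim)
    alpha = _pearson(obs, sim)                   # pattern correlation
    beta = (s_sim / m_sim) / (s_obs / m_obs)     # ratio of coefficients of variation
    z_obs = [(v - m_obs) / s_obs for v in obs]   # z-scores remove bias and units
    z_sim = [(v - m_sim) / s_sim for v in sim]
    lo, hi = min(z_obs + z_sim), max(z_obs + z_sim)
    h_obs = _histogram(z_obs, lo, hi, bins)
    h_sim = _histogram(z_sim, lo, hi, bins)
    gamma = sum(min(a, b) for a, b in zip(h_obs, h_sim)) / sum(h_obs)  # histogram overlap
    return 1.0 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```

Under these assumptions a perfect match scores 1, and a pattern with the right value distribution but reversed spatial arrangement is penalized through the correlation term alone.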
Improving Climate Projections Using "Intelligent" Ensembles
NASA Technical Reports Server (NTRS)
Baker, Noel C.; Taylor, Patrick C.
2015-01-01
Recent changes in the climate system have led to growing concern, especially in communities which are highly vulnerable to resource shortages and weather extremes. There is an urgent need for better climate information to develop solutions and strategies for adapting to a changing climate. Climate models provide excellent tools for studying the current state of climate and making future projections. However, these models are subject to biases created by structural uncertainties. Performance metrics, or the systematic determination of model biases, succinctly quantify aspects of climate model behavior. Efforts to standardize climate model experiments and collect simulation data, such as the Coupled Model Intercomparison Project (CMIP), provide the means to directly compare and assess model performance. Performance metrics have been used to show that some models reproduce present-day climate better than others. Simulation data from multiple models are often used to add value to projections by creating a consensus projection from the model ensemble, in which each model is given an equal weight. It has been shown that the ensemble mean generally outperforms any single model. It is possible to use unequal weights to produce ensemble means, in which models are weighted based on performance (called "intelligent" ensembles). Can performance metrics be used to improve climate projections? Previous work introduced a framework for comparing the utility of model performance metrics, showing that the best metrics are related to the variance of top-of-atmosphere outgoing longwave radiation. These metrics improve present-day climate simulations of Earth's energy budget using the "intelligent" ensemble method. The current project identifies several approaches for testing whether performance metrics can be applied to future simulations to create "intelligent" ensemble-mean climate projections.
It is shown that certain performance metrics test key climate processes in the models, and that these metrics can be used to evaluate model quality in both current and future climate states. This information will be used to produce new consensus projections and provide communities with improved climate projections for urgent decision-making.
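The difference between an equal-weight ensemble mean and a performance-weighted one can be sketched as follows. The inverse-error weighting is a hypothetical choice made for illustration; the abstract does not say how the "intelligent" weights are derived from the performance metrics, and the function names are assumptions.

```python
def ensemble_mean(projections, weights=None):
    """Weighted mean of per-model projections; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(projections)
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, projections)) / total

def skill_weights(errors):
    """Hypothetical weighting: inverse of each model's performance-metric error."""
    return [1.0 / e for e in errors]
```

For two models projecting 1.0 and 3.0, the equal-weight mean is 2.0; if the first model's metric error is a third of the second's, `ensemble_mean([1.0, 3.0], skill_weights([1.0, 3.0]))` moves the consensus toward the first model.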
Garcia Castro, Leyla Jael; Berlanga, Rafael; Garcia, Alexander
2015-10-01
Although full-text articles are provided by the publishers in electronic formats, it remains a challenge to find related work beyond the title and abstract context. Identifying related articles based on their abstract is indeed a good starting point; this process is straightforward and does not consume as many resources as full-text based similarity would require. However, further analyses may require in-depth understanding of the full content. Two articles with highly related abstracts can be substantially different regarding the full content. How similarity differs when considering title-and-abstract versus full-text and which semantic similarity metric provides better results when dealing with full-text articles are the main issues addressed in this manuscript. We have benchmarked three similarity metrics - BM25, PMRA, and Cosine, in order to determine which one performs best when using concept-based annotations on full-text documents. We also evaluated variations in similarity values based on title-and-abstract against those relying on full-text. Our test dataset comprises the Genomics track article collection from the 2005 Text Retrieval Conference. Initially, we used an entity recognition software to semantically annotate titles and abstracts as well as full-text with concepts defined in the Unified Medical Language System (UMLS®). For each article, we created a document profile, i.e., a set of identified concepts, term frequency, and inverse document frequency; we then applied various similarity metrics to those document profiles. We considered correlation, precision, recall, and F1 in order to determine which similarity metric performs best with concept-based annotations. For those full-text articles available in PubMed Central Open Access (PMC-OA), we also performed dispersion analyses in order to understand how similarity varies when considering full-text articles. 
We have found that the PubMed Related Articles similarity metric is the most suitable for full-text articles annotated with UMLS concepts. For similarity values above 0.8, all metrics exhibited an F1 around 0.2 and a recall around 0.1; BM25 showed the highest precision close to 1; in all cases the concept-based metrics performed better than the word-stem-based one. Our experiments show that similarity values vary when considering only title-and-abstract versus full-text similarity. Therefore, analyses based on full-text become useful when a given research requires going beyond title and abstract, particularly regarding connectivity across articles. Visualization available at ljgarcia.github.io/semsim.benchmark/, data available at http://dx.doi.org/10.5281/zenodo.13323. Copyright © 2015 Elsevier Inc. All rights reserved.
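The document-profile idea in the abstract above (concepts weighted by term frequency and inverse document frequency, then compared with a similarity metric) can be illustrated with a minimal Python sketch of the cosine variant. The concept identifiers, the three toy profiles, and the helper names are hypothetical, and the plain logarithmic IDF is an assumption; the abstract does not specify the exact weighting the authors used.

```python
import math
from collections import Counter


def idf_table(profiles):
    """Inverse document frequency for every concept seen in the corpus."""
    n_docs = len(profiles)
    doc_freq = Counter(concept for p in profiles for concept in set(p))
    return {c: math.log(n_docs / df) for c, df in doc_freq.items()}


def tfidf_vector(profile, idf):
    """Weight each concept by (term frequency) x (inverse document frequency)."""
    tf = Counter(profile)
    return {c: count * idf[c] for c, count in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical concept annotations for three articles (not real UMLS ids).
docs = {
    "a": ["C001", "C002", "C002", "C003"],
    "b": ["C002", "C003", "C004"],
    "c": ["C005", "C006"],
}
idf = idf_table(list(docs.values()))
vectors = {name: tfidf_vector(p, idf) for name, p in docs.items()}
sim_ab = cosine(vectors["a"], vectors["b"])  # shared concepts, so positive
sim_ac = cosine(vectors["a"], vectors["c"])  # no shared concepts, so zero
```

Articles that share no concepts score zero, which is one reason a profile built from full text can rank neighbours differently from one built from title and abstract alone.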
Toward objective image quality metrics: the AIC Eval Program of the JPEG
NASA Astrophysics Data System (ADS)
Richter, Thomas; Larabi, Chaker
2008-08-01
Objective quality assessment of lossy image compression codecs is an important part of the recent call of the JPEG for Advanced Image Coding. The target of the AIC ad-hoc group is twofold: First, to receive state-of-the-art still image codecs and to propose suitable technology for standardization; and second, to study objective image quality metrics to evaluate the performance of such codecs. Even though the performance of an objective metric is defined by how well it predicts the outcome of a subjective assessment, one can also study the usefulness of a metric in a non-traditional way indirectly, namely by measuring the subjective quality improvement of a codec that has been optimized for a specific objective metric. This approach shall be demonstrated here on the recently proposed HDPhoto format [14] introduced by Microsoft and a SSIM-tuned [17] version of it by one of the authors. We compare these two implementations with JPEG [1] in two variations and a visual and PSNR optimal JPEG2000 [13] implementation. To this end, we use subjective and objective tests based on the multiscale SSIM and a new DCT based metric.
Information Geometry for Landmark Shape Analysis: Unifying Shape Representation and Deformation
Peter, Adrian M.; Rangarajan, Anand
2010-01-01
Shape matching plays a prominent role in the comparison of similar structures. We present a unifying framework for shape matching that uses mixture models to couple both the shape representation and deformation. The theoretical foundation is drawn from information geometry wherein information matrices are used to establish intrinsic distances between parametric densities. When a parameterized probability density function is used to represent a landmark-based shape, the modes of deformation are automatically established through the information matrix of the density. We first show that given two shapes parameterized by Gaussian mixture models (GMMs), the well-known Fisher information matrix of the mixture model is also a Riemannian metric (actually, the Fisher-Rao Riemannian metric) and can therefore be used for computing shape geodesics. The Fisher-Rao metric has the advantage of being an intrinsic metric and invariant to reparameterization. The geodesic—computed using this metric—establishes an intrinsic deformation between the shapes, thus unifying both shape representation and deformation. A fundamental drawback of the Fisher-Rao metric is that it is not available in closed form for the GMM. Consequently, shape comparisons are computationally very expensive. To address this, we develop a new Riemannian metric based on generalized ϕ-entropy measures. In sharp contrast to the Fisher-Rao metric, the new metric is available in closed form. Geodesic computations using the new metric are considerably more efficient. We validate the performance and discriminative capabilities of these new information geometry-based metrics by pairwise matching of corpus callosum shapes. We also study the deformations of fish shapes that have various topological properties. 
A comprehensive comparative analysis is also provided using other landmark-based distances, including the Hausdorff distance, the Procrustes metric, landmark-based diffeomorphisms, and the bending energies of the thin-plate (TPS) and Wendland splines. PMID:19110497
Riato, Luisa; Leira, Manel; Della Bella, Valentina; Oberholster, Paul J
2018-01-15
Acid mine drainage (AMD) from coal mining in the Mpumalanga Highveld region of South Africa has caused severe chemical and biological degradation of aquatic habitats, specifically depressional wetlands, as mines use these wetlands for storage of AMD. Diatom-based multimetric indices (MMIs) to assess wetland condition have mostly been developed to assess agricultural and urban land use impacts. No diatom MMI of wetland condition has been developed to assess AMD impacts related to mining activities. Previous approaches to diatom-based MMI development in wetlands have not accounted for natural variability. Natural variability among depressional wetlands may influence the accuracy of MMIs. Epiphytic diatom MMIs sensitive to AMD were developed for a range of depressional wetland types to account for natural variation in biological metrics. For this, we classified wetland types based on diatom typologies. A range of 4-15 final metrics were selected from a pool of ~140 candidate metrics to develop the MMIs based on their: (1) broad range, (2) high separation power and (3) low correlation among metrics. Final metrics were selected from three categories: similarity to reference sites, functional groups, and taxonomic composition, which represent different aspects of diatom assemblage structure and function. MMI performances were evaluated according to their precision in distinguishing reference sites, responsiveness to discriminate reference and disturbed sites, sensitivity to human disturbances and relevancy to AMD-related stressors. Each MMI showed excellent discriminatory power, whether or not it accounted for natural variation. However, accounting for variation by grouping sites based on diatom typologies improved overall performance of MMIs. Our study highlights the usefulness of diatom-based metrics and provides a model for the biological assessment of depressional wetland condition in South Africa and elsewhere. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhou, Rurui; Li, Yu; Lu, Di; Liu, Haixing; Zhou, Huicheng
2016-09-01
This paper investigates the use of an epsilon-dominance non-dominated sorted genetic algorithm II (ɛ-NSGAII) as a sampling approach with the aim of improving sampling efficiency for multiple metrics uncertainty analysis using Generalized Likelihood Uncertainty Estimation (GLUE). The effectiveness of ɛ-NSGAII based sampling is demonstrated compared with Latin hypercube sampling (LHS) through analyzing sampling efficiency, multiple metrics performance, parameter uncertainty and flood forecasting uncertainty with a case study of flood forecasting uncertainty evaluation based on the Xinanjiang model (XAJ) for Qing River reservoir, China. Results obtained demonstrate the following advantages of the ɛ-NSGAII based sampling approach in comparison to LHS: (1) The former is more effective and efficient than LHS; for example, the simulation time required to generate 1000 behavioral parameter sets is shorter by a factor of 9; (2) The Pareto tradeoffs between metrics are demonstrated clearly with the solutions from ɛ-NSGAII based sampling, and their Pareto optimal values are better than those of LHS, which means better forecasting accuracy of ɛ-NSGAII parameter sets; (3) The parameter posterior distributions from ɛ-NSGAII based sampling are concentrated in the appropriate ranges rather than uniform, which accords with their physical significance, and parameter uncertainties are reduced significantly; (4) The forecasted floods are close to the observations as evaluated by three measures: the normalized total flow outside the uncertainty intervals (FOUI), average relative band-width (RB) and average deviation amplitude (D). The flood forecasting uncertainty is also considerably reduced with ɛ-NSGAII based sampling. This study provides a new sampling approach to improve multiple metrics uncertainty analysis under the framework of GLUE, and could be used to reveal the underlying mechanisms of parameter sets under multiple conflicting metrics in the uncertainty analysis process.
NASA Technical Reports Server (NTRS)
Jones, Harry
2003-01-01
The Advanced Life Support (ALS) has used a single number, Equivalent System Mass (ESM), for both reporting progress and technology selection. ESM is the launch mass required to provide a space system. ESM indicates launch cost. ESM alone is inadequate for technology selection, which should include other metrics such as Technology Readiness Level (TRL) and Life Cycle Cost (LCC) and also consider performance and risk. ESM has proven difficult to implement as a reporting metric, partly because it includes non-mass technology selection factors. Since it will not be used exclusively for technology selection, a new reporting metric can be made easier to compute and explain. Systems design trades off performance, cost, and risk, but a risk weighted cost/benefit metric would be too complex to report. Since life support has fixed requirements, different systems usually have roughly equal performance. Risk is important since failure can harm the crew, but it is difficult to treat simply. Cost is not easy to estimate, but preliminary space system cost estimates are usually based on mass, which is better estimated than cost. A mass-based cost estimate, similar to ESM, would be a good single reporting metric. The paper defines and compares four mass-based cost estimates, Equivalent Mass (EM), Equivalent System Mass (ESM), Life Cycle Mass (LCM), and System Mass (SM). EM is traditional in life support and includes mass, volume, power, cooling and logistics. ESM is the specifically defined ALS metric, which adds crew time and possibly other cost factors to EM. LCM is a new metric, a mass-based estimate of LCC measured in mass units. SM includes only the factors of EM that are originally measured in mass, the hardware and logistics mass. All four mass-based metrics usually give similar comparisons. SM is by far the simplest to compute and easiest to explain.
Cuesta-Frau, David; Miró-Martínez, Pau; Jordán Núñez, Jorge; Oltra-Crespo, Sandra; Molina Picó, Antonio
2017-08-01
This paper evaluates the performance of first generation entropy metrics, featured by the well known and widely used Approximate Entropy (ApEn) and Sample Entropy (SampEn) metrics, and what can be considered an evolution from these, Fuzzy Entropy (FuzzyEn), in the Electroencephalogram (EEG) signal classification context. The study uses the commonest artifacts found in real EEGs, such as white noise, and muscular, cardiac, and ocular artifacts. Using two different sets of publicly available EEG records, and a realistic range of amplitudes for interfering artifacts, this work optimises and assesses the robustness of these metrics against artifacts in terms of class segmentation probability. The results show that the qualitative behaviour of the two datasets is similar, with SampEn and FuzzyEn performing the best, and the noise and muscular artifacts are the most confounding factors. On the contrary, there is a wide variability as regards initialization parameters. The poor performance achieved by ApEn suggests that this metric should not be used in these contexts. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Marshak, William P.; Darkow, David J.; Wesler, Mary M.; Fix, Edward L.
2000-08-01
Computer-based display designers have more sensory modes and more dimensions within sensory modality with which to encode information in a user interface than ever before. This elaboration of information presentation has made measurement of display/format effectiveness and predicting display/format performance extremely difficult. A multivariate method has been devised which isolates critical information, physically measures its signal strength, and compares it with other elements of the display, which act like background noise. This Common Metric relates signal-to-noise ratios (SNRs) within each stimulus dimension, then combines SNRs among display modes, dimensions and cognitive factors to predict display format effectiveness. Examples with their Common Metric assessment and validation in performance will be presented along with the derivation of the metric. Implications of the Common Metric in display design and evaluation will be discussed.
Orbit design and optimization based on global telecommunication performance metrics
NASA Technical Reports Server (NTRS)
Lee, Seungwon; Lee, Charles H.; Kerridge, Stuart; Cheung, Kar-Ming; Edwards, Charles D.
2006-01-01
The orbit selection of telecommunications orbiters is one of the critical design processes and should be guided by global telecom performance metrics and mission-specific constraints. In order to aid the orbit selection, we have coupled the Telecom Orbit Analysis and Simulation Tool (TOAST) with genetic optimization algorithms. As a demonstration, we have applied the developed tool to select an optimal orbit for general Mars telecommunications orbiters with the constraint of being a frozen orbit. While a typical optimization goal is to minimize telecommunications downtime, several relevant performance metrics are examined: 1) area-weighted average gap time, 2) global maximum of local maximum gap time, 3) global maximum of local minimum gap time. Optimal solutions are found with each of the metrics. Common and different features among the optimal solutions as well as the advantages and disadvantages of each metric are presented. The optimal solutions are compared with several candidate orbits that were considered during the development of Mars Telecommunications Orbiter.
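The three gap-time metrics named in the abstract above can be sketched in Python as follows. This is only an illustration under stated assumptions: the input layout (a list of coverage-gap durations per latitude band), the cosine-of-latitude area weighting for equally spaced bands, and the function name `gap_metrics` are hypothetical and are not taken from TOAST.

```python
import math


def gap_metrics(gaps_by_latitude):
    """Summarise coverage gaps over a planet's surface.

    gaps_by_latitude maps a latitude in degrees to the list of gap
    durations (hours) seen by a surface point in that latitude band.
    Assumption: bands are equally spaced, so the area of each band is
    proportional to cos(latitude).
    """
    weights = {lat: math.cos(math.radians(lat)) for lat in gaps_by_latitude}
    total_weight = sum(weights.values())

    # 1) Area-weighted average of each location's mean gap time.
    area_weighted_avg = sum(
        weights[lat] * (sum(gaps) / len(gaps))
        for lat, gaps in gaps_by_latitude.items()
    ) / total_weight

    # 2) Worst location judged by its longest gap.
    max_of_local_max = max(max(gaps) for gaps in gaps_by_latitude.values())

    # 3) Worst location judged by its shortest gap.
    max_of_local_min = max(min(gaps) for gaps in gaps_by_latitude.values())

    return area_weighted_avg, max_of_local_max, max_of_local_min


# Hypothetical gap durations for two latitude bands.
sample = {0.0: [1.0, 3.0], 60.0: [2.0, 6.0]}
avg_gap, worst_max_gap, worst_min_gap = gap_metrics(sample)
```

The first metric rewards good average service, while the second and third penalise the worst-served location, which is why an optimiser driven by each can settle on a different orbit.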
Xie, Y; Zhang, Y; Qin, W; Lu, S; Ni, C; Zhang, Q
2017-03-01
Increasing DTI studies have demonstrated that white matter microstructural abnormalities play an important role in type 2 diabetes mellitus-related cognitive impairment. In this study, the diffusional kurtosis imaging method was used to investigate WM microstructural alterations in patients with type 2 diabetes mellitus and to detect associations between diffusional kurtosis imaging metrics and clinical/cognitive measurements. Diffusional kurtosis imaging and cognitive assessments were performed on 58 patients with type 2 diabetes mellitus and 58 controls. Voxel-based intergroup comparisons of diffusional kurtosis imaging metrics were conducted, and ROI-based intergroup comparisons were further performed. Correlations between the diffusional kurtosis imaging metrics and cognitive/clinical measurements were assessed after controlling for age, sex, and education in both patients and controls. Altered diffusion metrics were observed in the corpus callosum, the bilateral frontal WM, the right superior temporal WM, the left external capsule, and the pons in patients with type 2 diabetes mellitus compared with controls. The splenium of the corpus callosum and the pons had abnormal kurtosis metrics in patients with type 2 diabetes mellitus. Additionally, altered diffusion metrics in the right prefrontal WM were significantly correlated with disease duration and attention task performance in patients with type 2 diabetes mellitus. With both conventional diffusion and additional kurtosis metrics, diffusional kurtosis imaging can provide additional information on WM microstructural abnormalities in patients with type 2 diabetes mellitus. Our results indicate that WM microstructural abnormalities occur before cognitive decline and may be used as neuroimaging markers for predicting the early cognitive impairment in patients with type 2 diabetes mellitus. © 2017 by American Journal of Neuroradiology.
A condition metric for Eucalyptus woodland derived from expert evaluations.
Sinclair, Steve J; Bruce, Matthew J; Griffioen, Peter; Dodd, Amanda; White, Matthew D
2018-02-01
The evaluation of ecosystem quality is important for land-management and land-use planning. Evaluation is unavoidably subjective, and robust metrics must be based on consensus and the structured use of observations. We devised a transparent and repeatable process for building and testing ecosystem metrics based on expert data. We gathered quantitative evaluation data on the quality of hypothetical grassy woodland sites from experts. We used these data to train a model (an ensemble of 30 bagged regression trees) capable of predicting the perceived quality of similar hypothetical woodlands based on a set of 13 site variables as inputs (e.g., cover of shrubs, richness of native forbs). These variables can be measured at any site and the model implemented in a spreadsheet as a metric of woodland quality. We also investigated the number of experts required to produce an opinion data set sufficient for the construction of a metric. The model produced evaluations similar to those provided by experts, as shown by assessing the model's quality scores of expert-evaluated test sites not used to train the model. We applied the metric to 13 woodland conservation reserves and asked managers of these sites to independently evaluate their quality. To assess metric performance, we compared the model's evaluation of site quality with the managers' evaluations through multidimensional scaling. The metric performed relatively well, plotting close to the center of the space defined by the evaluators. Given the method provides data-driven consensus and repeatability, which no single human evaluator can provide, we suggest it is a valuable tool for evaluating ecosystem quality in real-world contexts. We believe our approach is applicable to any ecosystem. © 2017 State of Victoria.
NASA Astrophysics Data System (ADS)
Grieggs, Samuel M.; McLaughlin, Michael J.; Ezekiel, Soundararajan; Blasch, Erik
2015-06-01
As technology and internet use grow at an exponential rate, video and imagery data is becoming increasingly important. Various techniques such as Wide Area Motion Imagery (WAMI), Full Motion Video (FMV), and Hyperspectral Imaging (HSI) are used to collect motion data and extract relevant information. Detecting and identifying a particular object in imagery data is an important step in understanding visual imagery, such as content-based image retrieval (CBIR). Imagery data is segmented and automatically analyzed and stored in a dynamic and robust database. In our system, we seek to utilize image fusion methods, which require quality metrics. Many Image Fusion (IF) algorithms have been proposed, but only a few metrics are used to evaluate the performance of these algorithms. In this paper, we seek a robust, objective metric to evaluate the performance of IF algorithms which compares the outcome of a given algorithm to ground truth and reports several types of errors. Given the ground truth of motion imagery data, it computes detection failure, false alarm, precision and recall metrics, background and foreground region statistics, as well as split and merge of foreground regions. Using the Structural Similarity Index (SSIM), Mutual Information (MI), and entropy metrics, experimental results demonstrate the effectiveness of the proposed methodology for object detection, activity exploitation, and CBIR.
NASA Astrophysics Data System (ADS)
McPhail, C.; Maier, H. R.; Kwakkel, J. H.; Giuliani, M.; Castelletti, A.; Westra, S.
2018-02-01
Robustness is being used increasingly for decision analysis in relation to deep uncertainty and many metrics have been proposed for its quantification. Recent studies have shown that the application of different robustness metrics can result in different rankings of decision alternatives, but there has been little discussion of what potential causes for this might be. To shed some light on this issue, we present a unifying framework for the calculation of robustness metrics, which assists with understanding how robustness metrics work, when they should be used, and why they sometimes disagree. The framework categorizes the suitability of metrics to a decision-maker based on (1) the decision-context (i.e., the suitability of using absolute performance or regret), (2) the decision-maker's preferred level of risk aversion, and (3) the decision-maker's preference toward maximizing performance, minimizing variance, or some higher-order moment. This article also introduces a conceptual framework describing when relative robustness values of decision alternatives obtained using different metrics are likely to agree and disagree. This is used as a measure of how "stable" the ranking of decision alternatives is when determined using different robustness metrics. The framework is tested on three case studies, including water supply augmentation in Adelaide, Australia, the operation of a multipurpose regulated lake in Italy, and flood protection for a hypothetical river based on a reach of the river Rhine in the Netherlands. The proposed conceptual framework is confirmed by the case study results, providing insight into the reasons for disagreements between rankings obtained using different robustness metrics.
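To make the disagreement between robustness metrics described above concrete, here is a minimal Python sketch that compares two common metrics, maximin (absolute performance, highly risk averse) and minimax regret. The two alternatives and their scores across three scenarios are invented for illustration and do not come from the paper's case studies; higher performance is assumed to be better.

```python
def maximin(performance):
    """Worst-case performance of each alternative (higher is more robust)."""
    return {alt: min(values) for alt, values in performance.items()}


def max_regret(performance):
    """Largest regret of each alternative (lower is more robust).

    Regret in a scenario is the shortfall from the best alternative
    in that same scenario.
    """
    n_scenarios = len(next(iter(performance.values())))
    best = [
        max(values[s] for values in performance.values())
        for s in range(n_scenarios)
    ]
    return {
        alt: max(b - v for b, v in zip(best, values))
        for alt, values in performance.items()
    }


# Hypothetical performance of two decision alternatives in three scenarios.
performance = {"A": [5.0, 5.0, 5.0], "B": [4.0, 9.0, 9.0]}

maximin_scores = maximin(performance)
regret_scores = max_regret(performance)
best_by_maximin = max(maximin_scores, key=maximin_scores.get)
best_by_regret = min(regret_scores, key=regret_scores.get)
```

Here maximin prefers the steady alternative "A" while minimax regret prefers "B", so the ranking flips with the choice between absolute performance and regret, which is the first of the three factors in the framework.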
Quality Measures for Dialysis: Time for a Balanced Scorecard.
Kliger, Alan S
2016-02-05
Recent federal legislation establishes a merit-based incentive payment system for physicians, with a scorecard for each professional. The Centers for Medicare and Medicaid Services evaluate quality of care with clinical performance measures and have used these metrics for public reporting and payment to dialysis facilities. Similar metrics may be used for the future merit-based incentive payment system. In nephrology, most clinical performance measures measure processes and intermediate outcomes of care. These metrics were developed from population studies of best practice and do not identify opportunities for individualizing care on the basis of patient characteristics and individual goals of treatment. The In-Center Hemodialysis (ICH) Consumer Assessment of Healthcare Providers and Systems (CAHPS) survey examines patients' perception of care and has entered the arena to evaluate quality of care. A balanced scorecard of quality performance should include three elements: population-based best clinical practice, patient perceptions, and individually crafted patient goals of care. Copyright © 2016 by the American Society of Nephrology.
Performance comparison of optical interference cancellation system architectures.
Lu, Maddie; Chang, Matt; Deng, Yanhua; Prucnal, Paul R
2013-04-10
The performance of three optics-based interference cancellation systems is compared and contrasted with each other, and with traditional electronic techniques for interference cancellation. The comparison is based on a set of common performance metrics that we have developed for this purpose. It is shown that thorough evaluation of our optical approaches takes into account the traditional notions of depth of cancellation and dynamic range, along with notions of link loss and uniformity of cancellation. Our evaluation shows that our use of optical components affords performance that surpasses traditional electronic approaches, and that the optimal choice for an optical interference canceller requires taking into account the performance metrics discussed in this paper.
Energy-Based Metrics for Arthroscopic Skills Assessment.
Poursartip, Behnaz; LeBel, Marie-Eve; McCracken, Laura C; Escoto, Abelardo; Patel, Rajni V; Naish, Michael D; Trejos, Ana Luisa
2017-08-05
Minimally invasive skills assessment methods are essential in developing efficient surgical simulators and implementing consistent skills evaluation. Although numerous methods have been investigated in the literature, there is still a need to further improve the accuracy of surgical skills assessment. Energy expenditure can be an indication of motor skills proficiency. The goals of this study are to develop objective metrics based on energy expenditure, normalize these metrics, and investigate classifying trainees using these metrics. To this end, different forms of energy consisting of mechanical energy and work were considered and their values were divided by the related value of an ideal performance to develop normalized metrics. These metrics were used as inputs for various machine learning algorithms including support vector machines (SVM) and neural networks (NNs) for classification. The accuracy of the combination of the normalized energy-based metrics with these classifiers was evaluated through a leave-one-subject-out cross-validation. The proposed method was validated using 26 subjects at two experience levels (novices and experts) in three arthroscopic tasks. The results showed that there are statistically significant differences between novices and experts for almost all of the normalized energy-based metrics. The accuracy of classification using SVM and NN methods was between 70% and 95% for the various tasks. The results show that the normalized energy-based metrics and their combination with SVM and NN classifiers are capable of providing accurate classification of trainees. The assessment method proposed in this study can enhance surgical training by providing appropriate feedback to trainees about their level of expertise and can be used in the evaluation of proficiency.
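Two steps from the abstract above lend themselves to a short Python sketch: dividing an energy value by the value of an ideal performance, and splitting the data so that each subject is held out in turn. The sample values, subject identifiers, and function names are hypothetical, and the classifier itself (SVM or NN in the study) is left out.

```python
def normalize(metric_value, ideal_value):
    """Express an energy-based metric relative to an ideal performance."""
    return metric_value / ideal_value


def leave_one_subject_out(samples):
    """Yield (held_out_subject, train, test) splits.

    samples is a list of (subject_id, features, label) tuples. All trials
    from one subject go to the test set together, so the classifier is
    never evaluated on a subject it has already seen.
    """
    subjects = sorted({subject for subject, _, _ in samples})
    for held_out in subjects:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Hypothetical trials: (subject id, [normalized work], experience label).
ideal_work = 2.0
samples = [
    ("s1", [normalize(5.0, ideal_work)], "novice"),
    ("s1", [normalize(4.6, ideal_work)], "novice"),
    ("s2", [normalize(2.2, ideal_work)], "expert"),
    ("s3", [normalize(2.4, ideal_work)], "expert"),
]
folds = list(leave_one_subject_out(samples))
```

Holding out whole subjects rather than individual trials avoids overly optimistic accuracy estimates when one subject contributes several trials.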
Physician-Pharmacist collaboration in a pay for performance healthcare environment.
Farley, T M; Izakovic, M
2015-01-01
Healthcare is becoming more complex and costly in both European (Slovak) and American models. Healthcare in the United States (U.S.) is undergoing a particularly dramatic change. Physician and hospital reimbursement are becoming less procedure focused and increasingly outcome focused. Efforts at Mercy Hospital have shown promise in terms of collaborative team based care improving performance on glucose control outcome metrics, linked to reimbursement. Our performance on the Centers for Medicare and Medicaid Services (CMS) post-operative glucose control metric for cardiac surgery patients increased from a 63.6% pass rate to a 95.1% pass rate after implementing interventions involving physician-pharmacist team based care. Having a multidisciplinary team that is able to adapt quickly to changing expectations in the healthcare environment has aided our institution. As healthcare becomes increasingly saturated with technology, data and quality metrics, collaborative efforts resulting in increased quality and physician efficiency are desirable. Multidisciplinary collaboration (including physician-pharmacist collaboration) appears to be a viable route to improved performance in an outcome based healthcare system (Fig. 2, Ref. 12).
Narayan, Anand; Cinelli, Christina; Carrino, John A; Nagy, Paul; Coresh, Josef; Riese, Victoria G; Durand, Daniel J
2015-11-01
As the US health care system transitions toward value-based reimbursement, there is an increasing need for metrics to quantify health care quality. Within radiology, many quality metrics are in use, and still more have been proposed, but there have been limited attempts to systematically inventory these measures and classify them using a standard framework. The purpose of this study was to develop an exhaustive inventory of public and private sector imaging quality metrics classified according to the classic Donabedian framework (structure, process, and outcome). A systematic review was performed in which eligibility criteria included published articles (from 2000 onward) from multiple databases. Studies were double-read, with discrepancies resolved by consensus. For the radiology benefit management group (RBM) survey, the six known companies nationally were surveyed. Outcome measures were organized on the basis of standard categories (structure, process, and outcome) and reported using Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The search strategy yielded 1,816 citations; review yielded 110 reports (29 included for final analysis). Three of six RBMs (50%) responded to the survey; the websites of the other RBMs were searched for additional metrics. Seventy-five unique metrics were reported: 35 structure (46%), 20 outcome (27%), and 20 process (27%) metrics. For RBMs, 35 metrics were reported: 27 structure (77%), 4 process (11%), and 4 outcome (11%) metrics. The most commonly cited structure, process, and outcome metrics included ACR accreditation (37%), ACR Appropriateness Criteria (85%), and peer review (95%), respectively. Imaging quality metrics are more likely to be structural (46%) than process (27%) or outcome (27%) based (P < .05). 
As national value-based reimbursement programs increasingly emphasize outcome-based metrics, radiologists must keep pace by developing the data infrastructure required to collect outcome-based quality metrics. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Colyvas, Jeannette A.
2012-01-01
Our current educational environment is subject to persistent calls for accountability, evidence-based practice, and data use for improvement, which largely take the form of performance metrics (PMs). This rapid proliferation of PMs has profoundly influenced the ways in which scholars and practitioners think about their own practices and the larger…
An Opportunistic Routing Mechanism Combined with Long-Term and Short-Term Metrics for WMN
Piao, Xianglan; Qiu, Tie
2014-01-01
WMN (wireless mesh network) is a useful wireless multihop network with tremendous research value. The routing strategy decides the performance of network and the quality of transmission. A good routing algorithm will use the whole bandwidth of network and assure the quality of service of traffic. Since the routing metric ETX (expected transmission count) does not assure good quality of wireless links, to improve the routing performance, an opportunistic routing mechanism combined with long-term and short-term metrics for WMN based on OLSR (optimized link state routing) and ETX is proposed in this paper. This mechanism always chooses the highest throughput links to improve the performance of routing over WMN and then reduces the energy consumption of mesh routers. The simulations and analyses show that the opportunistic routing mechanism is better than the mechanism with the metric of ETX. PMID:25250379
An opportunistic routing mechanism combined with long-term and short-term metrics for WMN.
Sun, Weifeng; Wang, Haotian; Piao, Xianglan; Qiu, Tie
2014-01-01
WMN (wireless mesh network) is a useful wireless multihop network with tremendous research value. The routing strategy decides the performance of network and the quality of transmission. A good routing algorithm will use the whole bandwidth of network and assure the quality of service of traffic. Since the routing metric ETX (expected transmission count) does not assure good quality of wireless links, to improve the routing performance, an opportunistic routing mechanism combined with long-term and short-term metrics for WMN based on OLSR (optimized link state routing) and ETX is proposed in this paper. This mechanism always chooses the highest throughput links to improve the performance of routing over WMN and then reduces the energy consumption of mesh routers. The simulations and analyses show that the opportunistic routing mechanism is better than the mechanism with the metric of ETX.
Creating "Intelligent" Ensemble Averages Using a Process-Based Framework
NASA Astrophysics Data System (ADS)
Baker, Noel; Taylor, Patrick
2014-05-01
The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is used to add value to individual model projections and construct a consensus projection. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, individual models reproduce certain climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting multi-model ensembles. The intention is to produce improved ("intelligent") unequal-weight ensemble averages. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Several climate process metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument in combination with surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing the equal-weighted ensemble average and an ensemble weighted using the process-based metric. 
Additionally, this study investigates the dependence of the metric weighting scheme on the climate state using a combination of model simulations including a non-forced preindustrial control experiment, historical simulations, and several radiative forcing Representative Concentration Pathway (RCP) scenarios. Ultimately, the goal of the framework is to advise better methods for ensemble averaging models and create better climate predictions.
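The difference between an equal-weighted and a performance-weighted ensemble average can be sketched in a few lines. This is an illustration only: the model names, projected values, and skill scores below are invented, and the abstract does not specify how the process-based metric is converted into weights.

```python
def ensemble_mean(projections, weights=None):
    """Weighted average of per-model projections.

    projections: dict mapping model name -> projected value.
    weights: dict mapping model name -> non-negative weight;
             None means every model gets the same weight.
    """
    if weights is None:
        weights = {name: 1.0 for name in projections}
    total = sum(weights[name] for name in projections)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * projections[name] for name in projections) / total


# Hypothetical projected warming (K) and hypothetical metric-derived skill scores.
projections = {"model_a": 2.0, "model_b": 3.0, "model_c": 4.0}
skill = {"model_a": 0.5, "model_b": 0.3, "model_c": 0.2}

equal_weighted = ensemble_mean(projections)
skill_weighted = ensemble_mean(projections, skill)
```

Normalising by the sum of the weights means the skill scores need not add up to one, so any non-negative metric can be plugged in directly.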
Research on quality metrics of wireless adaptive video streaming
NASA Astrophysics Data System (ADS)
Li, Xuefei
2018-04-01
With the development of wireless networks and intelligent terminals, video traffic has increased dramatically. Adaptive video streaming has become one of the most promising video transmission technologies. For this type of service, good QoS (Quality of Service) in the wireless network does not always guarantee that all customers have a good experience. Thus, new quality metrics have been widely studied recently. Taking this into account, the objective of this paper is to investigate the quality metrics of wireless adaptive video streaming. In this paper, a wireless video streaming simulation platform with a DASH mechanism and a multi-rate video generator is established. Based on this platform, a PSNR model, an SSIM model and a Quality Level model are implemented. The Quality Level model considers QoE (Quality of Experience) factors such as image quality, stalling and switching frequency, while the PSNR and SSIM models mainly consider the quality of the video. To evaluate the performance of these QoE models, three performance metrics (SROCC, PLCC and RMSE), which are used to compare subjective and predicted MOS (Mean Opinion Score), are calculated. From these performance metrics, the monotonicity, linearity and accuracy of these quality metrics can be observed.
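The three performance metrics named above can be computed as follows. This is a minimal standard-library sketch with invented MOS values; it assumes the usual readings of SROCC (Spearman rank-order correlation, monotonicity), PLCC (Pearson linear correlation, linearity) and RMSE (accuracy), and it omits any nonlinear mapping that a study may apply before computing PLCC.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson applied to ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(x, y):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical subjective and predicted MOS for five test sequences.
subjective_mos = [1.5, 2.0, 3.0, 4.0, 4.5]
predicted_mos = [1.0, 2.5, 2.8, 4.6, 4.4]

srocc = spearman(subjective_mos, predicted_mos)
plcc = pearson(subjective_mos, predicted_mos)
error = rmse(subjective_mos, predicted_mos)
```

Reporting all three together is useful because a model can rank sequences correctly (high SROCC) while still being badly calibrated in absolute terms (high RMSE).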
Metrics for measuring performance of market transformation initiatives
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gordon, F.; Schlegel, J.; Grabner, K.
1998-07-01
Regulators have traditionally rewarded utility efficiency programs based on energy and demand savings. Now, many regulators are encouraging utilities and other program administrators to save energy by transforming markets. Prior to achieving sustainable market transformation, the program administrators often must take actions to understand the markets, establish baselines for success, reduce market barriers, build alliances, and build market momentum. Because these activities often precede savings, year-by-year measurement of savings can be an inappropriate measure of near-term success. Because ultimate success in transforming markets is defined in terms of sustainable changes in market structure and practice, traditional measures of success can also be misleading as initiatives reach maturity. This paper reviews early efforts in Massachusetts to develop metrics, or yardsticks, to gauge regulatory rewards for utility market transformation initiatives. From experience in multiparty negotiations, the authors review options for metrics based alternatively on market effects, outcomes, and good faith implementation. Additionally, alternative approaches are explored, based on end-results, interim results, and initial results. The political and practical constraints are described which have thus far led to a preference for one-year metrics, based primarily on good faith implementation. Strategies are offered for developing useful metrics which might be acceptable to regulators, advocates, and program administrators. Finally, they emphasize that the use of market transformation performance metrics is in its infancy. Both regulators and program administrators are encouraged to advance into this area with an experimental mind-set; don't put all the money on one horse until there's more of a track record.
Voxel-based statistical analysis of uncertainties associated with deformable image registration
NASA Astrophysics Data System (ADS)
Li, Shunshan; Glide-Hurst, Carri; Lu, Mei; Kim, Jinkoo; Wen, Ning; Adams, Jeffrey N.; Gordon, James; Chetty, Indrin J.; Zhong, Hualiang
2013-09-01
Deformable image registration (DIR) algorithms have inherent uncertainties in their displacement vector fields (DVFs). The purpose of this study is to develop an optimal metric to estimate DIR uncertainties. Six computational phantoms have been developed from the CT images of lung cancer patients using a finite element method (FEM). The FEM generated DVFs were used as a standard for registrations performed on each of these phantoms. A mechanics-based metric, unbalanced energy (UE), was developed to evaluate these registration DVFs. The potential correlation between UE and DIR errors was explored using multivariate analysis, and the results were validated by landmark approach and compared with two other error metrics: DVF inverse consistency (IC) and image intensity difference (ID). Landmark-based validation was performed using the POPI-model. The results show that the Pearson correlation coefficient between UE and DIR error is r(UE, error) = 0.50. This is higher than r(IC, error) = 0.29 for IC and DIR error and r(ID, error) = 0.37 for ID and DIR error. The Pearson correlation coefficient between UE and the product of the DIR displacements and errors is r(UE, error × DVF) = 0.62 for the six patients and r(UE, error × DVF) = 0.73 for the POPI-model data. It has been demonstrated that UE has a strong correlation with DIR errors, and the UE metric outperforms the IC and ID metrics in estimating DIR uncertainties. The quantified UE metric can be a useful tool for adaptive treatment strategies, including probability-based adaptive treatment planning.
Increasing Army Supply Chain Performance: Using an Integrated End to End Metrics System
2017-01-01
Sched Deliver Sched Delinquent Contracts Current Metrics PQDR/SDRs Forecasting Accuracy Reliability Demand Management Asset Mgmt Strategies Pipeline...are identified and characterized by statistical analysis. The study proposed a framework and tool for inventory management based on factors such as
U50: A New Metric for Measuring Assembly Output Based on Non-Overlapping, Target-Specific Contigs.
Castro, Christina J; Ng, Terry Fei Fan
2017-11-01
Advances in next-generation sequencing technologies enable routine genome sequencing, generating millions of short reads. A crucial step for full genome analysis is the de novo assembly, and currently, performance of different assembly methods is measured by a metric called N50. However, the N50 value can produce skewed, inaccurate results when complex data are analyzed, especially for viral and microbial datasets. To provide a better assessment of assembly output, we developed a new metric called U50. The U50 identifies unique, target-specific contigs by using a reference genome as baseline, aiming at circumventing some limitations that are inherent to the N50 metric. Specifically, the U50 program removes overlapping sequence of multiple contigs by utilizing a mask array, so the performance of the assembly is only measured by unique contigs. We compared simulated and real datasets by using U50 and N50, and our results demonstrated that U50 has the following advantages over N50: (1) reducing erroneously large N50 values due to a poor assembly, (2) eliminating overinflated N50 values caused by large measurements from overlapping contigs, (3) eliminating diminished N50 values caused by an abundance of small contigs, and (4) allowing comparisons across different platforms or samples based on the new percentage-based metric UG50%. The use of the U50 metric allows for a more accurate measure of assembly performance by analyzing only the unique, non-overlapping contigs. In addition, most viral and microbial sequencing have high background noise (i.e., host and other non-targets), which contributes to having a skewed, misrepresented N50 value; this is corrected by U50. Also, the UG50% can be used to compare assembly results from different samples or studies, the cross-comparisons of which cannot be performed with N50.
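The contrast between the two metrics can be illustrated with a small sketch. The `n50` function follows the standard definition. The `u50_sketch` function is a simplified reading of the mask-array idea described above, not the authors' program: it assumes each contig is already aligned to the reference as a half-open `(start, end)` interval, processes the longest contigs first, and credits each contig only with reference positions that no earlier contig has covered. All names and coordinates are hypothetical.

```python
def n50(lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downward, first reaches half of the summed length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def unique_lengths(alignments, reference_length):
    """Per-contig count of reference positions not already masked.

    alignments: list of half-open (start, end) intervals on the reference.
    Contigs that contribute no new positions are dropped.
    """
    mask = [False] * reference_length
    unique = []
    by_length = sorted(alignments, key=lambda a: a[1] - a[0], reverse=True)
    for start, end in by_length:
        newly_covered = 0
        for pos in range(start, end):
            if not mask[pos]:
                mask[pos] = True
                newly_covered += 1
        if newly_covered:
            unique.append(newly_covered)
    return unique


def u50_sketch(alignments, reference_length):
    """N50-style statistic computed on unique, non-overlapping lengths only."""
    return n50(unique_lengths(alignments, reference_length))


# Hypothetical assembly: two contigs are fully contained in the first one.
alignments = [(0, 60), (10, 50), (20, 60), (60, 80), (80, 90)]
raw_lengths = [end - start for start, end in alignments]

raw_n50 = n50(raw_lengths)            # 40
masked_u50 = u50_sketch(alignments, 100)  # 60
```

In this toy case the two redundant 40-base contigs pull the N50 down to 40, while the masked statistic reports 60 because only the first contig contributes new sequence.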
Context and meter enhance long-range planning in music performance
Mathias, Brian; Pfordresher, Peter Q.; Palmer, Caroline
2015-01-01
Neural responses demonstrate evidence of resonance, or oscillation, during the production of periodic auditory events. Music contains periodic auditory events that give rise to a sense of beat, which in turn generates a sense of meter on the basis of multiple periodicities. Metrical hierarchies may aid memory for music by facilitating similarity-based associations among sequence events at different periodic distances that unfold in longer contexts. A fundamental question is how metrical associations arising from a musical context influence memory during music performance. Longer contexts may facilitate metrical associations at higher hierarchical levels more than shorter contexts, a prediction of the range model, a formal model of planning processes in music performance (Palmer and Pfordresher, 2003; Pfordresher et al., 2007). Serial ordering errors, in which intended sequence events are produced in incorrect sequence positions, were measured as skilled pianists performed musical pieces that contained excerpts embedded in long or short musical contexts. Pitch errors arose from metrically similar positions and further sequential distances more often when the excerpt was embedded in long contexts compared to short contexts. Musicians’ keystroke intensities and error rates also revealed influences of metrical hierarchies, which differed for performances in long and short contexts. The range model accounted for contextual effects and provided better fits to empirical findings when metrical associations between sequence events were included. Longer sequence contexts may facilitate planning during sequence production by increasing conceptual similarity between hierarchically associated events. These findings are consistent with the notion that neural oscillations at multiple periodicities may strengthen metrical associations across sequence events during planning. PMID:25628550
NASA Astrophysics Data System (ADS)
Demirel, Mehmet C.; Mai, Juliane; Mendiguren, Gorka; Koch, Julian; Samaniego, Luis; Stisen, Simon
2018-02-01
Satellite-based earth observations offer great opportunities to improve spatial model predictions by means of spatial-pattern-oriented model evaluations. In this study, observed spatial patterns of actual evapotranspiration (AET) are utilised for spatial model calibration tailored to target the pattern performance of the model. The proposed calibration framework combines temporally aggregated observed spatial patterns with a new spatial performance metric and a flexible spatial parameterisation scheme. The mesoscale hydrologic model (mHM) is used to simulate streamflow and AET and has been selected due to its soil parameter distribution approach based on pedo-transfer functions and its built-in multi-scale parameter regionalisation. In addition, two new spatial parameter distribution options have been incorporated in the model in order to increase the flexibility of root fraction coefficient and potential evapotranspiration correction parameterisations, based on soil type and vegetation density. These parameterisations are utilised as they are most relevant for simulated AET patterns from the hydrologic model. Due to the fundamental challenges encountered when evaluating spatial pattern performance using standard metrics, we developed a simple but highly discriminative spatial metric, i.e. one comprised of three easily interpretable components measuring co-location, variation and distribution of the spatial data. The study shows that with flexible spatial model parameterisation used in combination with the appropriate objective functions, the simulated spatial patterns of actual evapotranspiration become substantially more similar to the satellite-based estimates. Overall, 26 parameters are identified for calibration through a sequential screening approach based on a combination of streamflow and spatial pattern metrics.
The robustness of the calibrations is tested using an ensemble of nine calibrations based on different seed numbers using the shuffled complex evolution optimiser. The calibration results reveal a limited trade-off between streamflow dynamics and spatial patterns, illustrating the benefit of combining separate observation types and objective functions. At the same time, the simulated spatial patterns of AET improved significantly, compared to traditional streamflow-only calibration, when an objective function based on observed AET patterns and a novel spatial performance metric was included. Since the overall water balance is usually a crucial goal in hydrologic modelling, spatial-pattern-oriented optimisation should always be accompanied by traditional discharge measurements. In such a multi-objective framework, the current study promotes the use of a novel bias-insensitive spatial pattern metric, which exploits the key information contained in the observed patterns while allowing the water balance to be informed by discharge observations.
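A three-component metric of the kind described above can be sketched as follows. The abstract does not give the formula, so everything here is an assumption: co-location is taken as the Pearson correlation, variation as the ratio of the coefficients of variation, and distribution as the overlap of histograms of z-scored values, with the three combined as one minus their Euclidean distance from the ideal point (1, 1, 1). Function names and the bin count are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def _pearson(a, b):
    ma, mb = _mean(a), _mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (len(a) * _std(a) * _std(b))


def _zscores(values):
    m, s = _mean(values), _std(values)
    return [(v - m) / s for v in values]


def histogram_overlap(a, b, bins=10):
    """Fraction of cells shared by the histograms of the z-scored fields."""
    za, zb = _zscores(a), _zscores(b)
    lo, hi = min(za + zb), max(za + zb)
    width = (hi - lo) / bins

    def counts(z):
        c = [0] * bins
        for v in z:
            c[min(int((v - lo) / width), bins - 1)] += 1
        return c

    ca, cb = counts(za), counts(zb)
    return sum(min(p, q) for p, q in zip(ca, cb)) / len(a)


def spatial_pattern_score(observed, simulated, bins=10):
    """1 is a perfect pattern match; lower values are worse.

    Both inputs are flattened grids of equal length with non-zero mean
    and non-zero spread (assumed, not checked).
    """
    alpha = _pearson(observed, simulated)                  # co-location
    beta = (_std(simulated) / _mean(simulated)) / (
        _std(observed) / _mean(observed))                  # variation
    gamma = histogram_overlap(observed, simulated, bins)   # distribution
    return 1 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
doubled = [2 * v for v in observed]        # same pattern, multiplicative bias
reversed_field = list(reversed(observed))  # same values, wrong locations

score_biased = spatial_pattern_score(observed, doubled)
score_misplaced = spatial_pattern_score(observed, reversed_field)
```

Under these assumptions a purely multiplicative bias leaves the score at 1, which matches the bias-insensitive behaviour the abstract emphasises, whereas a field with the right values in the wrong places is penalised through the correlation term alone.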
Mao, Shasha; Xiong, Lin; Jiao, Licheng; Feng, Tian; Yeung, Sai-Kit
2017-05-01
Riemannian optimization has been widely used to deal with the fixed low-rank matrix completion problem, and the Riemannian metric is a crucial factor in obtaining the search direction in Riemannian optimization. This paper proposes a new Riemannian metric that simultaneously considers the Riemannian geometry structure and the scaling information, and that is smoothly varying and invariant along the equivalence class. The proposed metric can make an effective tradeoff between the Riemannian geometry structure and the scaling information. Essentially, it can be viewed as a generalization of some existing metrics. Based on the proposed Riemannian metric, we also design a Riemannian nonlinear conjugate gradient algorithm, which can efficiently solve the fixed low-rank matrix completion problem. Experiments on fixed low-rank matrix completion, collaborative filtering, and image and video recovery illustrate that the proposed method is superior to the state-of-the-art methods in convergence efficiency and numerical performance.
A Validation of Object-Oriented Design Metrics as Quality Indicators
NASA Technical Reports Server (NTRS)
Basili, Victor R.; Briand, Lionel C.; Melo, Walcelio
1997-01-01
This paper presents the results of a study in which we empirically investigated the suite of object-oriented (OO) design metrics introduced in another work. More specifically, our goal is to assess these metrics as predictors of fault-prone classes and, therefore, determine whether they can be used as early quality indicators. This study is complementary to the work described where the same suite of metrics had been used to assess frequencies of maintenance changes to classes. To perform our validation accurately, we collected data on the development of eight medium-sized information management systems based on identical requirements. All eight projects were developed using a sequential life cycle model, a well-known OO analysis/design method and the C++ programming language. Based on empirical and quantitative analysis, the advantages and drawbacks of these OO metrics are discussed. Several of Chidamber and Kemerer's OO metrics appear to be useful to predict class fault-proneness during the early phases of the life-cycle. Also, on our data set, they are better predictors than 'traditional' code metrics, which can only be collected at a later phase of the software development processes.
A Validation of Object-Oriented Design Metrics
NASA Technical Reports Server (NTRS)
Basili, Victor R.; Briand, Lionel; Melo, Walcelio L.
1995-01-01
This paper presents the results of a study conducted at the University of Maryland in which we experimentally investigated the suite of Object-Oriented (OO) design metrics introduced by [Chidamber and Kemerer, 1994]. In order to do this, we assessed these metrics as predictors of fault-prone classes. This study is complementary to [Li and Henry, 1993] where the same suite of metrics had been used to assess frequencies of maintenance changes to classes. To perform our validation accurately, we collected data on the development of eight medium-sized information management systems based on identical requirements. All eight projects were developed using a sequential life cycle model, a well-known OO analysis/design method and the C++ programming language. Based on experimental results, the advantages and drawbacks of these OO metrics are discussed and suggestions for improvement are provided. Several of Chidamber and Kemerer's OO metrics appear to be adequate to predict class fault-proneness during the early phases of the life-cycle. We also showed that they are, on our data set, better predictors than "traditional" code metrics, which can only be collected at a later phase of the software development processes.
Health and Well-Being Metrics in Business: The Value of Integrated Reporting.
Pronk, Nicolaas P; Malan, Daniel; Christie, Gillian; Hajat, Cother; Yach, Derek
2018-01-01
Health and well-being (HWB) are material to sustainable business performance. Yet, corporate reporting largely lacks the intentional inclusion of HWB metrics. This brief report presents an argument for inclusion of HWB metrics into existing standards for corporate reporting. A Core Scorecard and a Comprehensive Scorecard, designed by a team of subject matter experts, based on available evidence of effectiveness, and organized around the categories of Governance, Management, and Evidence of Success, may be integrated into corporate reporting efforts. Pursuit of corporate integrated reporting requires corporate governance and ethical leadership and values that ultimately align with environmental, social, and economic performance. Agreement on metrics that intentionally include HWB may allow for integrated reporting that has the potential to yield significant value for business and society alike.
New Decentralized Algorithms for Spacecraft Formation Control Based on a Cyclic Approach
2010-06-01
space framework. As metric of performance, a common quadratic norm that weights the performance error and the control effort is traded with the cost...R = DTD, then the metric of interest is (’J)",,, the square of the 2-norm from input w to output z. Given a system G with state space description A ... spaced logarithmic spiral formation. These results are derived for
Relevance of motion-related assessment metrics in laparoscopic surgery.
Oropesa, Ignacio; Chmarra, Magdalena K; Sánchez-González, Patricia; Lamata, Pablo; Rodrigues, Sharon P; Enciso, Silvia; Sánchez-Margallo, Francisco M; Jansen, Frank-Willem; Dankelman, Jenny; Gómez, Enrique J
2013-06-01
Motion metrics have become an important source of information when addressing the assessment of surgical expertise. However, their direct relationship with the different surgical skills has not been fully explored. The purpose of this study is to investigate the relevance of motion-related metrics in the evaluation processes of basic psychomotor laparoscopic skills and their correlation with the different abilities sought to measure. A framework for task definition and metric analysis is proposed. An explorative survey was first conducted with a board of experts to identify metrics to assess basic psychomotor skills. Based on the output of that survey, 3 novel tasks for surgical assessment were designed. Face and construct validation was performed, with focus on motion-related metrics. Tasks were performed by 42 participants (16 novices, 22 residents, and 4 experts). Movements of the laparoscopic instruments were registered with the TrEndo tracking system and analyzed. Time, path length, and depth showed construct validity for all 3 tasks. Motion smoothness and idle time also showed validity for tasks involving bimanual coordination and tasks requiring a more tactical approach, respectively. Additionally, motion smoothness and average speed showed a high internal consistency, proving them to be the most task-independent of all the metrics analyzed. Motion metrics are complementary and valid for assessing basic psychomotor skills, and their relevance depends on the skill being evaluated. A larger clinical implementation, combined with quality performance information, will give more insight on the relevance of the results shown in this study.
Jensen, Katrine; Bjerrum, Flemming; Hansen, Henrik Jessen; Petersen, René Horsleben; Pedersen, Jesper Holst; Konge, Lars
2017-06-01
The societies of thoracic surgery are working to incorporate simulation and competency-based assessment into specialty training. One challenge is the development of a simulation-based test, which can be used as an assessment tool. The study objective was to establish validity evidence for a virtual reality simulator test of a video-assisted thoracoscopic surgery (VATS) lobectomy of a right upper lobe. Participants with varying experience in VATS lobectomy were included. They were familiarized with a virtual reality simulator (LapSim®) and introduced to the steps of the procedure for a VATS right upper lobe lobectomy. The participants performed two VATS lobectomies on the simulator with a 5-min break between attempts. Nineteen pre-defined simulator metrics were recorded. Fifty-three participants from nine different countries were included. High internal consistency was found for the metrics, with a Cronbach's alpha coefficient for standardized items of 0.91. Significant test-retest reliability was found for 15 of the metrics (p-values <0.05). Significant correlations between the metrics and the participants' VATS lobectomy experience were identified for seven metrics (p-values <0.001), and 10 metrics showed significant differences between novices (0 VATS lobectomies performed) and experienced surgeons (>50 VATS lobectomies performed). A pass/fail level defined as approximately one standard deviation from the mean metric scores for experienced surgeons passed none of the novices (0% false positives) and failed four of the experienced surgeons (29% false negatives). This study is the first to establish validity evidence for a VATS right upper lobe lobectomy virtual reality simulator test. Several simulator metrics demonstrated significant differences between novices and experienced surgeons, and pass/fail criteria for the test were set with acceptable consequences. This test can be used as a first step in assessing thoracic surgery trainees' VATS lobectomy competency.
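The internal-consistency statistic reported above can be illustrated with a short sketch. It uses the usual formula for Cronbach's alpha on standardized items, alpha = k·r̄ / (1 + (k − 1)·r̄), where k is the number of items and r̄ is the mean pairwise Pearson correlation between items. The scores are invented and the function names are hypothetical; the study's own computation is not described in the abstract.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def standardized_alpha(items):
    """Cronbach's alpha for standardized items.

    items: list of k lists, one per metric, each holding one score per
    participant (all lists the same length, k >= 2).
    """
    k = len(items)
    pairs = list(combinations(items, 2))
    mean_r = sum(pearson(a, b) for a, b in pairs) / len(pairs)
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical scores of three participants on two simulator metrics.
metric_one = [1, 2, 3]
metric_two = [1, 3, 2]

alpha = standardized_alpha([metric_one, metric_two])
```

Because the formula depends only on correlations, metrics recorded in different units (seconds, millimetres, counts) can be combined without rescaling, which is presumably why the standardized form suits a mixed set of simulator metrics.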
State of the States 2009. Renewable Energy Development and the Role of Policy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Doris, Elizabeth; McLaren, Joyce; Healey, Victoria
2009-10-01
This report tracks the progress of U.S. renewable energy development at the state level, with metrics on development status and reviews of relevant policies. The analysis offers state-by-state policy suggestions and develops performance-based evaluation metrics to accelerate and improve renewable energy development.
EVALUATION OF METRIC PRECISION FOR A RIPARIAN FOREST SURVEY
This paper evaluates the performance of a protocol to monitor riparian forests in western Oregon based on the quality of the data obtained from a recent field survey. Precision and accuracy are the criteria used to determine the quality of 19 field metrics. The field survey con...
Fero, Laura J; O'Donnell, John M; Zullo, Thomas G; Dabbs, Annette DeVito; Kitutu, Julius; Samosky, Joseph T; Hoffman, Leslie A
2010-10-01
This paper is a report of an examination of the relationship between metrics of critical thinking skills and performance in simulated clinical scenarios. Paper and pencil assessments are commonly used to assess critical thinking but may not reflect simulated performance. In 2007, a convenience sample of 36 nursing students participated in measurement of critical thinking skills and simulation-based performance using videotaped vignettes, high-fidelity human simulation, the California Critical Thinking Disposition Inventory and California Critical Thinking Skills Test. Simulation-based performance was rated as 'meeting' or 'not meeting' overall expectations. Test scores were categorized as strong, average, or weak. Most (75.0%) students did not meet overall performance expectations using videotaped vignettes or high-fidelity human simulation; most difficulty related to problem recognition and reporting findings to the physician. There was no difference between overall performance based on method of assessment (P = 0.277). More students met subcategory expectations for initiating nursing interventions (P ≤ 0.001) using high-fidelity human simulation. The relationship between videotaped vignette performance and critical thinking disposition or skills scores was not statistically significant, except for problem recognition and overall critical thinking skills scores (Cramer's V = 0.444, P = 0.029). There was a statistically significant relationship between overall high-fidelity human simulation performance and overall critical thinking disposition scores (Cramer's V = 0.413, P = 0.047). Students' performance reflected difficulty meeting expectations in simulated clinical scenarios. High-fidelity human simulation performance appeared to approximate scores on metrics of critical thinking best. Further research is needed to determine if simulation-based performance correlates with critical thinking skills in the clinical setting. © 2010 The Authors. 
Journal of Advanced Nursing © 2010 Blackwell Publishing Ltd.
Fero, Laura J.; O’Donnell, John M.; Zullo, Thomas G.; Dabbs, Annette DeVito; Kitutu, Julius; Samosky, Joseph T.; Hoffman, Leslie A.
2018-01-01
Aim: This paper is a report of an examination of the relationship between metrics of critical thinking skills and performance in simulated clinical scenarios. Background: Paper and pencil assessments are commonly used to assess critical thinking but may not reflect simulated performance. Methods: In 2007, a convenience sample of 36 nursing students participated in measurement of critical thinking skills and simulation-based performance using videotaped vignettes, high-fidelity human simulation, the California Critical Thinking Disposition Inventory and California Critical Thinking Skills Test. Simulation-based performance was rated as ‘meeting’ or ‘not meeting’ overall expectations. Test scores were categorized as strong, average, or weak. Results: Most (75·0%) students did not meet overall performance expectations using videotaped vignettes or high-fidelity human simulation; most difficulty related to problem recognition and reporting findings to the physician. There was no difference between overall performance based on method of assessment (P = 0·277). More students met subcategory expectations for initiating nursing interventions (P ≤ 0·001) using high-fidelity human simulation. The relationship between video-taped vignette performance and critical thinking disposition or skills scores was not statistically significant, except for problem recognition and overall critical thinking skills scores (Cramer’s V = 0·444, P = 0·029). There was a statistically significant relationship between overall high-fidelity human simulation performance and overall critical thinking disposition scores (Cramer’s V = 0·413, P = 0·047). Conclusion: Students’ performance reflected difficulty meeting expectations in simulated clinical scenarios. High-fidelity human simulation performance appeared to approximate scores on metrics of critical thinking best.
Further research is needed to determine if simulation-based performance correlates with critical thinking skills in the clinical setting. PMID:20636471
Research and development on performance models of thermal imaging systems
NASA Astrophysics Data System (ADS)
Wang, Ji-hui; Jin, Wei-qi; Wang, Xia; Cheng, Yi-nan
2009-07-01
Traditional ACQUIRE models predict the discrimination tasks of detection (target orientation, recognition and identification) for military targets based upon the minimum resolvable temperature difference (MRTD) and the Johnson criteria for thermal imaging systems (TIS). With the development of focal plane array (FPA) detectors and digital image processing technology, the Johnson criteria are generally pessimistic when predicting the performance of sampled imagers. The triangle orientation discrimination threshold (TOD) model, the minimum temperature difference perceived (MTDP)/thermal range model (TRM3) and the target task performance (TTP) metric have been developed to predict the performance of sampled imagers; the TTP metric in particular can provide better accuracy than the Johnson criteria. In this paper, the performance models above are described; channel width metrics are presented to describe overall performance, including the modulation transfer function (MTF) channel width for high signal-to-noise ratio (SNR) optoelectronic imaging systems and the MRTD channel width for low-SNR TIS; the unresolved questions in the performance assessment of TIS are indicated; and finally, the development direction of performance models for TIS is discussed.
Evaluating Algorithm Performance Metrics Tailored for Prognostics
NASA Technical Reports Server (NTRS)
Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai
2009-01-01
Prognostics has taken center stage in Condition Based Maintenance (CBM), where it is desired to estimate the Remaining Useful Life (RUL) of the system so that remedial measures may be taken in advance to avoid catastrophic events or unwanted downtimes. Validation of such predictions is an important but difficult proposition, and a lack of appropriate evaluation methods renders prognostics meaningless. Evaluation methods currently used in the research community are not standardized and in many cases do not sufficiently assess key performance aspects expected of a prognostics algorithm. In this paper we introduce several new evaluation metrics tailored for prognostics and show that they can effectively evaluate various algorithms as compared to other conventional metrics. Specifically, four algorithms, namely Relevance Vector Machine (RVM), Gaussian Process Regression (GPR), Artificial Neural Network (ANN), and Polynomial Regression (PR), are compared. These algorithms vary in complexity and in their ability to manage uncertainty around predicted estimates. Results show that the new metrics rank these algorithms in different ways, and depending on the requirements and constraints suitable metrics may be chosen. Beyond these results, these metrics offer ideas about how metrics suitable to prognostics may be designed so that the evaluation procedure can be standardized.
An objective method for a video quality evaluation in a 3DTV service
NASA Astrophysics Data System (ADS)
Wilczewski, Grzegorz
2015-09-01
This article describes a proposed objective method for 3DTV video quality evaluation, the Compressed Average Image Intensity (CAII) method. Identification of the 3DTV service's content chain nodes enables the design of a versatile, objective video quality metric. It is based on an advanced approach to stereoscopic videostream analysis. Insights into the mechanisms of the designed metric, as well as an evaluation of its performance under simulated environmental conditions, are discussed herein. As a result, the CAII metric might be used effectively in a variety of service quality assessment applications.
Comparison of Highly Resolved Model-Based Exposure ...
Human exposure to air pollution in many studies is represented by ambient concentrations from space-time kriging of observed values. Space-time kriging techniques based on a limited number of ambient monitors may fail to capture the concentration from local sources. Further, because people spend more time indoors, using ambient concentration to represent exposure may cause error. To quantify the associated exposure error, we computed a series of six different hourly-based exposure metrics at 16,095 Census blocks of three Counties in North Carolina for CO, NOx, PM2.5, and elemental carbon (EC) during 2012. These metrics include ambient background concentration from space-time ordinary kriging (STOK), ambient on-road concentration from the Research LINE source dispersion model (R-LINE), a hybrid concentration combining STOK and R-LINE, and their associated indoor concentrations from an indoor infiltration mass balance model. Using a hybrid-based indoor concentration as the standard, the comparison showed that outdoor STOK metrics yielded large error at both population (67% to 93%) and individual level (average bias between −10% to 95%). For pollutants with significant contribution from on-road emission (EC and NOx), the on-road based indoor metric performs the best at the population level (error less than 52%). At the individual level, however, the STOK-based indoor concentration performs the best (average bias below 30%). For PM2.5, due to the relatively low co
Synthesized view comparison method for no-reference 3D image quality assessment
NASA Astrophysics Data System (ADS)
Luo, Fangzhou; Lin, Chaoyi; Gu, Xiaodong; Ma, Xiaojun
2018-04-01
We develop a no-reference image quality assessment metric to evaluate the quality of synthesized view rendered from the Multi-view Video plus Depth (MVD) format. Our metric is named Synthesized View Comparison (SVC), which is designed for real-time quality monitoring at the receiver side in a 3D-TV system. The metric utilizes the virtual views in the middle which are warped from left and right views by Depth-image-based rendering algorithm (DIBR), and compares the difference between the virtual views rendered from different cameras by Structural SIMilarity (SSIM), a popular 2D full-reference image quality assessment metric. The experimental results indicate that our no-reference quality assessment metric for the synthesized images has competitive prediction performance compared with some classic full-reference image quality assessment metrics.
A New Metric for Quantifying Performance Impairment on the Psychomotor Vigilance Test
2012-01-01
used the coefficient of determination (R2) and the P-values based on Bartels's test of randomness of the residual error to quantify the goodness-of-fit ...we used the goodness-of-fit between each metric and the corresponding individualized two-process model output (Rajaraman et al., 2008, 2009) to assess...individualized two-process model fits for each of the 12 subjects using the five metrics. The P-values are for Bartels's
National Quality Forum Colon Cancer Quality Metric Performance: How Are Hospitals Measuring Up?
Mason, Meredith C; Chang, George J; Petersen, Laura A; Sada, Yvonne H; Tran Cao, Hop S; Chai, Christy; Berger, David H; Massarweh, Nader N
2017-12-01
To evaluate the impact of care at high-performing hospitals on the National Quality Forum (NQF) colon cancer metrics. The NQF endorses evaluating ≥12 lymph nodes (LNs), adjuvant chemotherapy (AC) for stage III patients, and AC within 4 months of diagnosis as colon cancer quality indicators. Data on hospital-level metric performance and the association with survival are unclear. Retrospective cohort study of 218,186 patients with resected stage I to III colon cancer in the National Cancer Data Base (2004-2012). High-performing hospitals (>75% achievement) were identified by the proportion of patients achieving each measure. The association between hospital performance and survival was evaluated using Cox shared frailty modeling. Only hospital LN performance improved (15.8% in 2004 vs 80.7% in 2012; trend test, P < 0.001), with 45.9% of hospitals performing well on all 3 measures concurrently in the most recent study year. Overall, 5-year survival was 75.0%, 72.3%, 72.5%, and 69.5% for those treated at hospitals with high performance on 3, 2, 1, and 0 metrics, respectively (log-rank, P < 0.001). Care at hospitals with high metric performance was associated with lower risk of death in a dose-response fashion [0 metrics, reference; 1, hazard ratio (HR) 0.96 (0.89-1.03); 2, HR 0.92 (0.87-0.98); 3, HR 0.85 (0.80-0.90); 2 vs 1, HR 0.96 (0.91-1.01); 3 vs 1, HR 0.89 (0.84-0.93); 3 vs 2, HR 0.95 (0.89-0.95)]. Performance on metrics in combination was associated with lower risk of death [LN + AC, HR 0.86 (0.78-0.95); AC + timely AC, HR 0.92 (0.87-0.98); LN + AC + timely AC, HR 0.85 (0.80-0.90)], whereas individual measures were not [LN, HR 0.95 (0.88-1.04); AC, HR 0.95 (0.87-1.05)]. Less than half of hospitals perform well on these NQF colon cancer metrics concurrently, and high performance on individual measures is not associated with improved survival. 
Quality improvement efforts should shift focus from individual measures to defining composite measures encompassing the overall multimodal care pathway and capturing successful transitions from one care modality to another.
Metric Learning for Hyperspectral Image Segmentation
NASA Technical Reports Server (NTRS)
Bue, Brian D.; Thompson, David R.; Gilmore, Martha S.; Castano, Rebecca
2011-01-01
We present a metric learning approach to improve the performance of unsupervised hyperspectral image segmentation. Unsupervised spatial segmentation can assist both user visualization and automatic recognition of surface features. Analysts can use spatially-continuous segments to decrease noise levels and/or localize feature boundaries. However, existing segmentation methods use task-agnostic measures of similarity. Here we learn task-specific similarity measures from training data, improving segment fidelity to classes of interest. Multiclass Linear Discriminant Analysis produces a linear transform that optimally separates a labeled set of training classes. This defines a distance metric that generalizes to new scenes, enabling graph-based segmentation that emphasizes key spectral features. We describe tests based on data from the Compact Reconnaissance Imaging Spectrometer (CRISM) in which learned metrics improve segment homogeneity with respect to mineralogical classes.
Perimal-Lewis, Lua; Teubner, David; Hakendorf, Paul; Horwood, Chris
2016-12-01
Effective and accurate use of routinely collected health data to produce Key Performance Indicator reporting is dependent on the underlying data quality. In this research, Process Mining methodology and tools were leveraged to assess the data quality of time-based Emergency Department data sourced from electronic health records. This research was done working closely with the domain experts to validate the process models. The hospital patient journey model was used to assess flow abnormalities which resulted from incorrect timestamp data used in time-based performance metrics. The research demonstrated process mining as a feasible methodology to assess data quality of time-based hospital performance metrics. The insight gained from this research enabled appropriate corrective actions to be put in place to address the data quality issues. © The Author(s) 2015.
Quality evaluation of motion-compensated edge artifacts in compressed video.
Leontaris, Athanasios; Cosman, Pamela C; Reibman, Amy R
2007-04-01
Little attention has been paid to an impairment common in motion-compensated video compression: the addition of high-frequency (HF) energy as motion compensation displaces blocking artifacts off block boundaries. In this paper, we employ an energy-based approach to measure this motion-compensated edge artifact, using both compressed bitstream information and decoded pixels. We evaluate the performance of our proposed metric, along with several blocking and blurring metrics, on compressed video in two ways. First, ordinal scales are evaluated through a series of expectations that a good quality metric should satisfy: the objective evaluation. Then, the best performing metrics are subjectively evaluated. The same subjective data set is finally used to obtain interval scales to gain more insight. Experimental results show that we accurately estimate the percentage of the added HF energy in compressed video.
Performance Evaluation of the Approaches and Algorithms for Hamburg Airport Operations
NASA Technical Reports Server (NTRS)
Zhu, Zhifan; Jung, Yoon; Lee, Hanbong; Schier, Sebastian; Okuniek, Nikolai; Gerdes, Ingrid
2016-01-01
In this work, fast-time simulations have been conducted using SARDA tools at Hamburg airport by NASA and real-time simulations using CADEO and TRACC with the NLR ATM Research Simulator (NARSIM) by DLR. The outputs are analyzed using a set of common metrics agreed between DLR and NASA. The proposed metrics are derived from the International Civil Aviation Organization (ICAO)'s Key Performance Areas (KPAs) in capability, efficiency, predictability and environment, and adapted to simulation studies. The results are examined to explore and compare the merits and shortcomings of the two approaches using the common performance metrics. Particular attention is paid to the concept of closed-loop, trajectory-based taxi as well as the application of the US concept to a European airport. Both teams consider the trajectory-based surface operation concept a critical technology advance, not only in addressing the current surface traffic management problems but also in its potential application to unmanned vehicle maneuvers on the airport surface, such as autonomous towing or TaxiBot [6][7] and even Remote Piloted Aircraft (RPA). Based on this work, a future integration of TRACC and SOSS is described, aiming at bringing the conflict-free trajectory-based operation concept to US airports.
Decision-relevant evaluation of climate models: A case study of chill hours in California
NASA Astrophysics Data System (ADS)
Jagannathan, K. A.; Jones, A. D.; Kerr, A. C.
2017-12-01
The past decade has seen a proliferation of different climate datasets with over 60 climate models currently in use. Comparative evaluation and validation of models can assist practitioners in choosing the most appropriate models for adaptation planning. However, such assessments are usually conducted for `climate metrics' such as seasonal temperature, while sectoral decisions are often based on `decision-relevant outcome metrics' such as growing degree days or chill hours. Since climate models predict different metrics with varying skill, the goal of this research is to conduct a bottom-up evaluation of model skill for `outcome-based' metrics. Using chill hours (number of hours in winter months where temperature is less than 45 deg F) in Fresno, CA as a case, we assess how well different GCMs predict the historical mean and slope of chill hours, and whether and to what extent projections differ based on model selection. We then compare our results with other climate-based evaluations of the region, to identify similarities and differences. For the model skill evaluation, historically observed chill hours were compared with simulations from 27 GCMs (and multiple ensembles). Model skill scores were generated based on a statistical hypothesis test of the comparative assessment. Future projections from RCP 8.5 runs were evaluated, and a simple bias correction was also conducted. Our analysis indicates that model skill in predicting chill hour slope is dependent on its skill in predicting mean chill hours, which results from the non-linear nature of the chill metric. However, there was no clear relationship between the models that performed well for the chill hour metric and those that performed well in other temperature-based evaluations (such as winter minimum temperature or diurnal temperature range). 
Further, contrary to conclusions from other studies, we also found that the multi-model mean or large ensemble mean results may not always be most appropriate for this outcome metric. Our assessment sheds light on key differences between global versus local skill, and broad versus specific skill of climate models, highlighting that decision-relevant model evaluation may be crucial for providing practitioners with the best available climate information for their specific needs.
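The chill-hour definition quoted in the abstract above (hours with temperature below 45 deg F) can be illustrated with a minimal Python sketch. The function name and the hourly-list input format are assumptions made for illustration; restricting the input to winter months is left to the caller, and this is not the authors' code.

```python
from typing import Iterable

# Threshold quoted in the abstract, in degrees Fahrenheit.
CHILL_THRESHOLD_F = 45.0


def count_chill_hours(hourly_temps_f: Iterable[float],
                      threshold_f: float = CHILL_THRESHOLD_F) -> int:
    """Count the hours whose temperature is strictly below the threshold.

    The caller is assumed to pass only winter-month hourly readings.
    """
    return sum(1 for temp in hourly_temps_f if temp < threshold_f)
```

Because the count is a threshold on temperature rather than a linear function of it, small shifts in mean temperature near 45 deg F can change the total disproportionately, which is consistent with the non-linearity the abstract mentions.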
Application of Support Vector Machine to Forex Monitoring
NASA Astrophysics Data System (ADS)
Kamruzzaman, Joarder; Sarker, Ruhul A.
Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.
NASA Astrophysics Data System (ADS)
Ciaramello, Francis M.; Hemami, Sheila S.
2007-02-01
For members of the Deaf Community in the United States, current communication tools include TTY/TTD services, video relay services, and text-based communication. With the growth of cellular technology, mobile sign language conversations are becoming a possibility. Proper coding techniques must be employed to compress American Sign Language (ASL) video for low-rate transmission while maintaining the quality of the conversation. In order to evaluate these techniques, an appropriate quality metric is needed. This paper demonstrates that traditional video quality metrics, such as PSNR, fail to predict subjective intelligibility scores. By considering the unique structure of ASL video, an appropriate objective metric is developed. Face and hand segmentation is performed using skin-color detection techniques. The distortions in the face and hand regions are optimally weighted and pooled across all frames to create an objective intelligibility score for a distorted sequence. The objective intelligibility metric performs significantly better than PSNR in terms of correlation with subjective responses.
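For reference, the PSNR baseline that the abstract above compares against is conventionally defined as 10·log10(MAX² / MSE). The following is a minimal Python sketch of that standard definition for flattened 8-bit frames; the function name and the flat-sequence input are illustrative assumptions, and the paper's region-weighted intelligibility metric is not reproduced here.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], distorted: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are given as flat sequences of pixel values; max_value is the
    peak pixel value (255 for 8-bit video).
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)
```

PSNR weights every pixel equally, so an error in the background counts as much as one in the face or hands, which is one plausible reading of why it tracks intelligibility poorly for sign language video.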
Computer-enhanced laparoscopic training system (CELTS): bridging the gap.
Stylopoulos, N; Cotin, S; Maithel, S K; Ottensmeye, M; Jackson, P G; Bardsley, R S; Neumann, P F; Rattner, D W; Dawson, S L
2004-05-01
There is a large and growing gap between the need for better surgical training methodologies and the systems currently available for such training. In an effort to bridge this gap and overcome the disadvantages of the training simulators now in use, we developed the Computer-Enhanced Laparoscopic Training System (CELTS). CELTS is a computer-based system capable of tracking the motion of laparoscopic instruments and providing feedback about performance in real time. CELTS consists of a mechanical interface, a customizable set of tasks, and an Internet-based software interface. The special cognitive and psychomotor skills a laparoscopic surgeon should master were explicitly defined and transformed into quantitative metrics based on kinematics analysis theory. A single global standardized and task-independent scoring system utilizing a z-score statistic was developed. Validation exercises were performed. The scoring system clearly revealed a gap between experts and trainees, irrespective of the task performed; none of the trainees obtained a score above the threshold that distinguishes the two groups. Moreover, CELTS provided educational feedback by identifying the key factors that contributed to the overall score. Among the defined metrics, depth perception, smoothness of motion, instrument orientation, and the outcome of the task are major indicators of performance and key parameters that distinguish experts from trainees. Time and path length alone, which are the most commonly used metrics in currently available systems, are not considered good indicators of performance. CELTS is a novel and standardized skills trainer that combines the advantages of computer simulation with the features of the traditional and popular training boxes. CELTS can easily be used with a wide array of tasks and ensures comparability across different training conditions. 
This report further shows that a set of appropriate and clinically relevant performance metrics can be defined and a standardized scoring system can be designed.
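The abstract above mentions a task-independent scoring system built on a z-score statistic but does not give the formula. The Python sketch below shows one common way such a score could be formed, under loudly stated assumptions: each metric is standardized against a reference group of expert values, and the per-metric z-scores are averaged with equal weights. The metric names, the expert reference group, and the equal weighting are all hypothetical and are not taken from CELTS.

```python
import statistics
from typing import Dict, Sequence


def z_score(value: float, reference: Sequence[float]) -> float:
    """Standardize a value against a reference sample (needs >= 2 values)."""
    mean = statistics.mean(reference)
    spread = statistics.stdev(reference)  # sample standard deviation
    if spread == 0:
        raise ValueError("reference sample has zero spread")
    return (value - mean) / spread


def composite_score(trainee: Dict[str, float],
                    experts: Dict[str, Sequence[float]]) -> float:
    """Equal-weight mean of per-metric z-scores (weighting is an assumption)."""
    scores = [z_score(trainee[name], experts[name]) for name in trainee]
    return statistics.mean(scores)
```

Standardizing each metric before combining is what makes a score of this kind comparable across tasks with different units and scales.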
Krieger, Jonathan D
2014-08-01
I present a protocol for creating geometric leaf shape metrics to facilitate widespread application of geometric morphometric methods to leaf shape measurement. To quantify circularity, I created a novel shape metric in the form of the vector between a circle and a line, termed geometric circularity. Using leaves from 17 fern taxa, I performed a coordinate-point eigenshape analysis to empirically identify patterns of shape covariation. I then compared the geometric circularity metric to the empirically derived shape space and the standard metric, circularity shape factor. The geometric circularity metric was consistent with empirical patterns of shape covariation and appeared more biologically meaningful than the standard approach, the circularity shape factor. The protocol described here has the potential to make geometric morphometrics more accessible to plant biologists by generalizing the approach to developing synthetic shape metrics based on classic, qualitative shape descriptors.
Defining and quantifying users' mental Imagery-based BCI skills: a first step.
Lotte, Fabien; Jeunet, Camille
2018-05-17
While promising for many applications, Electroencephalography (EEG)-based Brain-Computer Interfaces (BCIs) are still scarcely used outside laboratories, due to a poor reliability. It is thus necessary to study and fix this reliability issue. Doing so requires the use of appropriate reliability metrics to quantify both the classification algorithm and the BCI user's performances. So far, Classification Accuracy (CA) is the typical metric used for both aspects. However, we argue in this paper that CA is a poor metric to study BCI users' skills. Here, we propose a definition and new metrics to quantify such BCI skills for Mental Imagery (MI) BCIs, independently of any classification algorithm. Approach: We first show in this paper that CA is notably unspecific, discrete, training data and classifier dependent, and as such may not always reflect successful self-modulation of EEG patterns by the user. We then propose a definition of MI-BCI skills that reflects how well the user can self-modulate EEG patterns, and thus how well he could control an MI-BCI. Finally, we propose new performance metrics, classDis, restDist and classStab that specifically measure how distinct and stable the EEG patterns produced by the user are, independently of any classifier. Main results: By re-analyzing EEG data sets with such new metrics, we indeed confirmed that CA may hide some increase in MI-BCI skills or hide the user inability to self-modulate a given EEG pattern. On the other hand, our new metrics could reveal such skill improvements as well as identify when a mental task performed by a user was no different than rest EEG. Significance: Our results showed that when studying MI-BCI users' skills, CA should be used with care, and complemented with metrics such as the new ones proposed. Our results also stressed the need to redefine BCI user training by considering the different BCI subskills and their measures. 
To promote the complementary use of our new metrics, we provide the Matlab code to compute them for free and open-source. © 2018 IOP Publishing Ltd.
NASA Technical Reports Server (NTRS)
Matic, Roy M.; Mosley, Judith I.
1994-01-01
Future space-based, remote sensing systems will have data transmission requirements that exceed available downlinks necessitating the use of lossy compression techniques for multispectral data. In this paper, we describe several algorithms for lossy compression of multispectral data which combine spectral decorrelation techniques with an adaptive, wavelet-based, image compression algorithm to exploit both spectral and spatial correlation. We compare the performance of several different spectral decorrelation techniques including wavelet transformation in the spectral dimension. The performance of each technique is evaluated at compression ratios ranging from 4:1 to 16:1. Performance measures used are visual examination, conventional distortion measures, and multispectral classification results. We also introduce a family of distortion metrics that are designed to quantify and predict the effect of compression artifacts on multi spectral classification of the reconstructed data.
An Analysis of Performance-Based Funding Policies and Recommendations for the Florida College System
ERIC Educational Resources Information Center
Balog, Scott E.
2016-01-01
Nearly 30 states have adopted or are transitioning to performance-based funding programs for community colleges that allocate funding based on institutional performance according to defined metrics. While embraced by state lawmakers and promoted by outside advocacy groups as a method to improve student outcomes, enhance accountability and ensure…
Miller, Vonda H; Jansen, Ben H
2008-12-01
Computer algorithms that match human performance in recognizing written text or spoken conversation remain elusive. The reasons why the human brain far exceeds any existing recognition scheme to date in the ability to generalize and to extract invariant characteristics relevant to category matching are not clear. However, it has been postulated that the dynamic distribution of brain activity (spatiotemporal activation patterns) is the mechanism by which stimuli are encoded and matched to categories. This research focuses on supervised learning using a trajectory based distance metric for category discrimination in an oscillatory neural network model. Classification is accomplished using a trajectory based distance metric. Since the distance metric is differentiable, a supervised learning algorithm based on gradient descent is demonstrated. Classification of spatiotemporal frequency transitions and their relation to a priori assessed categories is shown along with the improved classification results after supervised training. The results indicate that this spatiotemporal representation of stimuli and the associated distance metric is useful for simple pattern recognition tasks and that supervised learning improves classification results.
Christoforou, Christoforos; Christou-Champi, Spyros; Constantinidou, Fofi; Theodorou, Maria
2015-01-01
Eye-tracking has been extensively used to quantify audience preferences in the context of marketing and advertising research, primarily in methodologies involving static images or stimuli (i.e., advertising, shelf testing, and website usability). However, these methodologies do not generalize to narrative-based video stimuli where a specific storyline is meant to be communicated to the audience. In this paper, a novel metric based on eye-gaze dispersion (both within and across viewings) that quantifies the impact of narrative-based video stimuli on the preferences of large audiences is presented. The metric is validated in predicting the performance of video advertisements aired during the 2014 Super Bowl final. In particular, the metric is shown to explain 70% of the variance in likeability scores of the 2014 Super Bowl ads as measured by the USA TODAY Ad-Meter. In addition, by comparing the proposed metric with Heart Rate Variability (HRV) indices, we have associated the metric with biological processes relating to attention allocation. The underlying idea behind the proposed metric suggests a shift in perspective when it comes to evaluating narrative-based video stimuli. In particular, it suggests that audience preferences on video are modulated by the level of viewers' lack of attention allocation. The proposed metric can be calculated on any narrative-based video stimuli (i.e., movie, narrative content, emotional content, etc.), and thus has the potential to facilitate the use of such stimuli in several contexts: prediction of audience preferences of movies, quantitative assessment of entertainment pieces, prediction of the impact of movie trailers, identification of group and individual differences in the study of attention-deficit disorders, and the study of desensitization to media violence. PMID:26029135
Fu, Lawrence D.; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F.
2011-01-01
Evaluating the biomedical literature and health-related websites for quality are challenging information retrieval tasks. Current commonly used methods include impact factor for journals, PubMed’s clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic adjusted metrics and other topic robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations. PMID:21419864
Performance evaluation of no-reference image quality metrics for face biometric images
NASA Astrophysics Data System (ADS)
Liu, Xinwei; Pedersen, Marius; Charrier, Christophe; Bours, Patrick
2018-03-01
The accuracy of face recognition systems is significantly affected by the quality of face sample images. The recently established standardization proposed several important aspects for the assessment of face sample quality. There are many existing no-reference image quality metrics (IQMs) that are able to assess natural image quality by taking into account image-based quality attributes similar to those introduced in the standardization. However, whether such metrics can assess face sample quality is rarely considered. We evaluate the performance of 13 selected no-reference IQMs on face biometrics. The experimental results show that several of them can assess face sample quality according to the system performance. We also analyze the strengths and weaknesses of different IQMs as well as why some of them failed to assess face sample quality. Retraining an original IQM using a face database can improve the performance of such a metric. In addition, the contribution of this paper can be used for the evaluation of IQMs on other biometric modalities; furthermore, it can be used for the development of multimodality biometric IQMs.
Machine characterization based on an abstract high-level language machine
NASA Technical Reports Server (NTRS)
Saavedra-Barrera, Rafael H.; Smith, Alan Jay; Miya, Eugene
1989-01-01
Measurements are presented for a large number of machines ranging from small workstations to supercomputers. The authors combine these measurements into groups of parameters which relate to specific aspects of the machine implementation, and use these groups to provide overall machine characterizations. The authors also define the concept of pershapes, which represent the level of performance of a machine for different types of computation. A metric based on pershapes is introduced that provides a quantitative way of measuring how similar two machines are in terms of their performance distributions. The metric is related to the extent to which pairs of machines have varying relative performance levels depending on which benchmark is used.
Healthcare4VideoStorm: Making Smart Decisions Based on Storm Metrics.
Zhang, Weishan; Duan, Pengcheng; Chen, Xiufeng; Lu, Qinghua
2016-04-23
Storm-based stream processing is widely used for real-time large-scale distributed processing. Knowing the run-time status and ensuring performance is critical to providing expected dependability for some applications, e.g., continuous video processing for security surveillance. The existing scheduling strategies' granularity is too coarse to have good performance, and mainly considers network resources without computing resources while scheduling. In this paper, we propose Healthcare4Storm, a framework that finds Storm insights based on Storm metrics to gain knowledge from the health status of an application, finally ending up with smart scheduling decisions. It takes into account both network and computing resources and conducts scheduling at a fine-grained level using tuples instead of topologies. The comprehensive evaluation shows that the proposed framework has good performance and can improve the dependability of the Storm-based applications.
A Methodology to Analyze Photovoltaic Tracker Uptime
DOE Office of Scientific and Technical Information (OSTI.GOV)
Muller, Matthew T; Ruth, Dan
A metric is developed to analyze the daily performance of single-axis photovoltaic (PV) trackers. The metric relies on comparing correlations between the daily time series of the PV power output and an array of simulated plane-of-array irradiances for the given day. Mathematical thresholds and a logic sequence are presented, so the daily tracking metric can be applied in an automated fashion on large-scale PV systems. The results of applying the metric are visually examined against the time series of the power output data for a large number of days and for various systems. The visual inspection results suggest that overall, the algorithm is accurate in identifying stuck or functioning trackers on clear-sky days. Visual inspection also shows that there are days that are not classified by the metric where the power output data may be sufficient to identify a stuck tracker. Based on the daily tracking metric, uptime results are calculated for 83 different inverters at 34 PV sites. The mean tracker uptime is calculated at 99% based on 2 different calculation methods. The daily tracking metric clearly has limitations, but as there are no existing metrics in the literature, it provides a valuable tool for flagging stuck trackers.
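The core idea in the abstract above, correlating a day's power output against several simulated plane-of-array irradiance profiles, can be sketched in Python as follows. This is only an illustration under assumptions: the profile labels, the use of plain Pearson correlation, and the rule of picking the highest correlation are hypothetical, and the report's actual thresholds and logic sequence are not given in the abstract and are not reproduced here.

```python
import math
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def best_matching_profile(power: Sequence[float],
                          simulated: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return the label and correlation of the best-matching simulated profile."""
    scores = {label: pearson(power, irradiance)
              for label, irradiance in simulated.items()}
    best = max(scores, key=lambda label: scores[label])
    return best, scores[best]
```

In such a scheme, a day whose power curve correlates best with a fixed-angle profile rather than the tracking profile would be a candidate for a stuck tracker, subject to whatever thresholds the full method applies.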
NASA Technical Reports Server (NTRS)
Idris, Husni; Shen, Ni; Wing, David J.
2011-01-01
The growing demand for air travel is increasing the need for mitigating air traffic congestion and complexity problems, which are already at high levels. At the same time new surveillance, navigation, and communication technologies are enabling major transformations in the air traffic management system, including net-based information sharing and collaboration, performance-based access to airspace resources, and trajectory-based rather than clearance-based operations. The new system will feature different schemes for allocating tasks and responsibilities between the ground and airborne agents and between the human and automation, with potential capacity and cost benefits. Therefore, complexity management requires new metrics and methods that can support these new schemes. This paper presents metrics and methods for preserving trajectory flexibility that have been proposed to support a trajectory-based approach for complexity management by airborne or ground-based systems. It presents extensions to these metrics as well as to the initial research conducted to investigate the hypothesis that using these metrics to guide user and service provider actions will naturally mitigate traffic complexity. The analysis showed promising results in that: (1) Trajectory flexibility preservation mitigated traffic complexity as indicated by inducing self-organization in the traffic patterns and lowering traffic complexity indicators such as dynamic density and traffic entropy. (2) Trajectory flexibility preservation reduced the potential for secondary conflicts in separation assurance. (3) Trajectory flexibility metrics showed potential application to support user and service provider negotiations for minimizing the constraints imposed on trajectories without jeopardizing their objectives.
NASA Astrophysics Data System (ADS)
Marchant, T. E.; Joshi, K. D.; Moore, C. J.
2018-03-01
Radiotherapy dose calculations based on cone-beam CT (CBCT) images can be inaccurate due to unreliable Hounsfield units (HU) in the CBCT. Deformable image registration of planning CT images to CBCT, and direct correction of CBCT image values are two methods proposed to allow heterogeneity corrected dose calculations based on CBCT. In this paper we compare the accuracy and robustness of these two approaches. CBCT images for 44 patients were used including pelvis, lung and head & neck sites. CBCT HU were corrected using a ‘shading correction’ algorithm and via deformable registration of planning CT to CBCT using either Elastix or Niftyreg. Radiotherapy dose distributions were re-calculated with heterogeneity correction based on the corrected CBCT and several relevant dose metrics for target and OAR volumes were calculated. Accuracy of CBCT based dose metrics was determined using an ‘override ratio’ method where the ratio of the dose metric to that calculated on a bulk-density assigned version of the same image is assumed to be constant for each patient, allowing comparison to the patient’s planning CT as a gold standard. Similar performance is achieved by shading corrected CBCT and both deformable registration algorithms, with mean and standard deviation of dose metric error less than 1% for all sites studied. For lung images, use of deformed CT leads to slightly larger standard deviation of dose metric error than shading corrected CBCT with more dose metric errors greater than 2% observed (7% versus 1%).
Meaningful Assessment of Robotic Surgical Style using the Wisdom of Crowds.
Ershad, M; Rege, R; Fey, A Majewicz
2018-07-01
Quantitative assessment of surgical skills is an important aspect of surgical training; however, the proposed metrics are sometimes difficult to interpret and may not capture the stylistic characteristics that define expertise. This study proposes a methodology for evaluating the surgical skill, based on metrics associated with stylistic adjectives, and evaluates the ability of this method to differentiate expertise levels. We recruited subjects from different expertise levels to perform training tasks on a surgical simulator. A lexicon of contrasting adjective pairs, based on important skills for robotic surgery, inspired by the global evaluative assessment of robotic skills tool, was developed. To validate the use of stylistic adjectives for surgical skill assessment, posture videos of the subjects performing the task, as well as videos of the task were rated by crowd-workers. Metrics associated with each adjective were found using kinematic and physiological measurements through correlation with the crowd-sourced adjective assignment ratings. To evaluate the chosen metrics' ability in distinguishing expertise levels, two classifiers were trained and tested using these metrics. Crowd-assignment ratings for all adjectives were significantly correlated with expertise levels. The results indicate that naive Bayes classifier performs the best, with an accuracy of [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text] when classifying into four, three, and two levels of expertise, respectively. The proposed method is effective at mapping understandable adjectives of expertise to the stylistic movements and physiological response of trainees.
Shackelford, Stacy; Garofalo, Evan; Shalin, Valerie; Pugh, Kristy; Chen, Hegang; Pasley, Jason; Sarani, Babak; Henry, Sharon; Bowyer, Mark; Mackenzie, Colin F
2015-07-01
Maintaining trauma-specific surgical skills is an ongoing challenge for surgical training programs. An objective assessment of surgical skills is needed. We hypothesized that a validated surgical performance assessment tool could detect differences following a training intervention. We developed surgical performance assessment metrics based on discussion with expert trauma surgeons, video review of 10 experts and 10 novice surgeons performing three vascular exposure procedures and lower extremity fasciotomy on cadavers, and validated the metrics with interrater reliability testing by five reviewers blinded to level of expertise and a consensus conference. We tested these performance metrics in 12 surgical residents (Year 3-7) before and 2 weeks after vascular exposure skills training in the Advanced Surgical Skills for Exposure in Trauma (ASSET) course. Performance was assessed in three areas as follows: knowledge (anatomic, management), procedure steps, and technical skills. Time to completion of procedures was recorded, and these metrics were combined into a single performance score, the Trauma Readiness Index (TRI). Wilcoxon matched-pairs signed-ranks test compared pretraining/posttraining effects. Mean time to complete procedures decreased by 4.3 minutes (from 13.4 minutes to 9.1 minutes). The performance component most improved by the 1-day skills training was procedure steps, completion of which increased by 21%. Technical skill scores improved by 12%. Overall knowledge improved by 3%, with 18% improvement in anatomic knowledge. TRI increased significantly from 50% to 64% with ASSET training. Interrater reliability of the surgical performance assessment metrics was validated with single intraclass correlation coefficient of 0.7 to 0.98. A trauma-relevant surgical performance assessment detected improvements in specific procedure steps and anatomic knowledge taught during a 1-day course, quantified by the TRI. 
ASSET training reduced time to complete vascular control by one third. Future applications include assessing specific skills in a larger surgeon cohort, assessing military surgical readiness, and quantifying skill degradation with time since training.
The psychometrics of mental workload: multiple measures are sensitive but divergent.
Matthews, Gerald; Reinerman-Jones, Lauren E; Barber, Daniel J; Abich, Julian
2015-02-01
A study was run to test the sensitivity of multiple workload indices to the differing cognitive demands of four military monitoring task scenarios and to investigate relationships between indices. Various psychophysiological indices of mental workload exhibit sensitivity to task factors. However, the psychometric properties of multiple indices, including the extent to which they intercorrelate, have not been adequately investigated. One hundred fifty participants performed in four task scenarios based on a simulation of unmanned ground vehicle operation. Scenarios required threat detection and/or change detection. Both single- and dual-task scenarios were used. Workload metrics for each scenario were derived from the electroencephalogram (EEG), electrocardiogram, transcranial Doppler sonography, functional near infrared, and eye tracking. Subjective workload was also assessed. Several metrics showed sensitivity to the differing demands of the four scenarios. Eye fixation duration and the Task Load Index metric derived from EEG were diagnostic of single-versus dual-task performance. Several other metrics differentiated the two single tasks but were less effective in differentiating single- from dual-task performance. Psychometric analyses confirmed the reliability of individual metrics but failed to identify any general workload factor. An analysis of difference scores between low- and high-workload conditions suggested an effort factor defined by heart rate variability and frontal cortex oxygenation. General workload is not well defined psychometrically, although various individual metrics may satisfy conventional criteria for workload assessment. Practitioners should exercise caution in using multiple metrics that may not correspond well, especially at the level of the individual operator.
Evaluation metrics for bone segmentation in ultrasound
NASA Astrophysics Data System (ADS)
Lougheed, Matthew; Fichtinger, Gabor; Ungi, Tamas
2015-03-01
Tracked ultrasound is a safe alternative to X-ray for imaging bones. The interpretation of bony structures is challenging as ultrasound has no specific intensity characteristic of bones. Several image segmentation algorithms have been devised to identify bony structures. We propose an open-source framework that would aid in the development and comparison of such algorithms by quantitatively measuring segmentation performance in the ultrasound images. True-positive and false-negative metrics used in the framework quantify algorithm performance based on correctly segmented bone and correctly segmented boneless regions. Ground truth for these metrics is defined manually and, along with the corresponding automatically segmented image, is used for the performance analysis. Manually created ground truth tests were generated to verify the accuracy of the analysis. Further evaluation metrics for determining average performance per slide and standard deviation are considered. The metrics provide a means of evaluating accuracy of frames along the length of a volume. This would aid in assessing the accuracy of the volume itself and the approach to image acquisition (positioning and frequency of frame). The framework was implemented as an open-source module of the 3D Slicer platform. The ground truth tests verified that the framework correctly calculates the implemented metrics. The developed framework provides a convenient way to evaluate bone segmentation algorithms. The implementation fits in a widely used application for segmentation algorithm prototyping. Future algorithm development will benefit by monitoring the effects of adjustments to an algorithm in a standard evaluation framework.
Task-oriented lossy compression of magnetic resonance images
NASA Astrophysics Data System (ADS)
Anderson, Mark C.; Atkins, M. Stella; Vaisey, Jacques
1996-04-01
A new task-oriented image quality metric is used to quantify the effects of distortion introduced into magnetic resonance images by lossy compression. This metric measures the similarity between a radiologist's manual segmentation of pathological features in the original images and the automated segmentations performed on the original and compressed images. The images are compressed using a general wavelet-based lossy image compression technique, embedded zerotree coding, and segmented using a three-dimensional stochastic model-based tissue segmentation algorithm. The performance of the compression system is then enhanced by compressing different regions of the image volume at different bit rates, guided by prior knowledge about the location of important anatomical regions in the image. Application of the new system to magnetic resonance images is shown to produce compression results superior to the conventional methods, both subjectively and with respect to the segmentation similarity metric.
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that, by learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on the Sleep-EDF dataset with various classification settings. The overall accuracies for the Awake/Sleep and 4-class classification settings are 98.32% and 94.49%, respectively. Furthermore, the superior accuracy is achieved by performing classification on a low-dimensional feature space derived from time and frequency domains and without the need for artifact removal as a preprocessing step.
Daluwatte, Chathuri; Vicente, Jose; Galeotti, Loriano; Johannesen, Lars; Strauss, David G; Scully, Christopher G
Performance of ECG beat detectors is traditionally assessed on long intervals (e.g.: 30min), but only incorrect detections within a short interval (e.g.: 10s) may cause incorrect (i.e., missed+false) heart rate limit alarms (tachycardia and bradycardia). We propose a novel performance metric based on distribution of incorrect beat detection over a short interval and assess its relationship with incorrect heart rate limit alarm rates. Six ECG beat detectors were assessed using performance metrics over long interval (sensitivity and positive predictive value over 30min) and short interval (Area Under empirical cumulative distribution function (AUecdf) for short interval (i.e., 10s) sensitivity and positive predictive value) on two ECG databases. False heart rate limit and asystole alarm rates calculated using a third ECG database were then correlated (Spearman's rank correlation) with each calculated performance metric. False alarm rates correlated with sensitivity calculated on long interval (i.e., 30min) (ρ=-0.8 and p<0.05) and AUecdf for sensitivity (ρ=0.9 and p<0.05) in all assessed ECG databases. Sensitivity over 30min grouped the two detectors with lowest false alarm rates while AUecdf for sensitivity provided further information to identify the two beat detectors with highest false alarm rates as well, which was inseparable with sensitivity over 30min. Short interval performance metrics can provide insights on the potential of a beat detector to generate incorrect heart rate limit alarms. Published by Elsevier Inc.
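The short-interval metric described above can be illustrated with a minimal sketch. It assumes that sensitivity is computed per 10 s window from true-positive and false-negative beat counts, and that AUecdf is the area under the empirical cumulative distribution function of those per-window values, integrated over [0, 1]. The abstract does not give the exact definition, so the function names and the handling of empty windows are assumptions, and the counts are illustrative.

```python
def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN); an empty window is treated as 1.0 (assumption)."""
    return tp / (tp + fn) if (tp + fn) else 1.0


def area_under_ecdf(values):
    """Area under the empirical CDF of values in [0, 1], integrated over [0, 1]."""
    xs = sorted(values)
    n = len(xs)
    area = 0.0
    for i, x in enumerate(xs):
        # The ECDF equals (i + 1) / n on the interval from xs[i] to the next value.
        upper = xs[i + 1] if i + 1 < n else 1.0
        area += (i + 1) / n * (upper - x)
    return area


# Illustrative (TP, FN) counts for four 10 s windows; not data from the study.
windows = [(10, 0), (10, 0), (5, 5), (0, 10)]
per_window = [sensitivity(tp, fn) for tp, fn in windows]
print(per_window, area_under_ecdf(per_window))
```

Under this definition the area equals one minus the mean per-window sensitivity, so a larger AUecdf means more windows with poor detection, which is consistent with the positive correlation with false alarm rates reported above.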
Resilience-based performance metrics for water resources management under uncertainty
NASA Astrophysics Data System (ADS)
Roach, Tom; Kapelan, Zoran; Ledbetter, Ralph
2018-06-01
This paper aims to develop new, resilience type metrics for long-term water resources management under uncertain climate change and population growth. Resilience is defined here as the ability of a water resources management system to 'bounce back', i.e. absorb and then recover from a water deficit event, restoring the normal system operation. Ten alternative metrics are proposed and analysed addressing a range of different resilience aspects including duration, magnitude, frequency and volume of related water deficit events. The metrics were analysed on a real-world case study of the Bristol Water supply system in the UK and compared with current practice. The analyses included an examination of metrics' sensitivity and correlation, as well as a detailed examination into the behaviour of metrics during water deficit periods. The results obtained suggest that multiple metrics which cover different aspects of resilience should be used simultaneously when assessing the resilience of a water resources management system, leading to a more complete understanding of resilience compared with current practice approaches. It was also observed that calculating the total duration of a water deficit period provided a clearer and more consistent indication of system performance compared to splitting the deficit periods into the time to reach and time to recover from the worst deficit events.
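As a minimal sketch of the resilience aspects named above (duration, magnitude, frequency and volume of water deficit events), the following groups consecutive time steps in which supply falls short of demand into events. The ten metrics proposed in the paper are not specified in the abstract, so the definitions, function name and numbers here are illustrative assumptions.

```python
def deficit_events(supply, demand):
    """Group consecutive time steps with supply < demand into deficit events."""
    events = []
    current = None
    for s, d in zip(supply, demand):
        shortfall = d - s
        if shortfall > 0:
            if current is None:
                current = {"duration": 0, "magnitude": 0.0, "volume": 0.0}
            current["duration"] += 1                                   # time steps in deficit
            current["magnitude"] = max(current["magnitude"], shortfall)  # worst shortfall
            current["volume"] += shortfall                             # cumulative shortfall
        elif current is not None:
            events.append(current)
            current = None
    if current is not None:
        events.append(current)
    return events


# Illustrative series (arbitrary units per time step), not Bristol Water data.
supply = [10, 8, 7, 10, 10, 9, 10]
demand = [10, 10, 10, 10, 10, 10, 10]
events = deficit_events(supply, demand)
frequency = len(events)
total_duration = sum(e["duration"] for e in events)
print(frequency, total_duration, events)
```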
PQSM-based RR and NR video quality metrics
NASA Astrophysics Data System (ADS)
Lu, Zhongkang; Lin, Weisi; Ong, Eeping; Yang, Xiaokang; Yao, Susu
2003-06-01
This paper presents a new and general concept, PQSM (Perceptual Quality Significance Map), to be used in measuring the visual distortion. It makes use of the selectivity characteristic of HVS (Human Visual System) that it pays more attention to certain area/regions of visual signal due to one or more of the following factors: salient features in image/video, cues from domain knowledge, and association of other media (e.g., speech or audio). PQSM is an array whose elements represent the relative perceptual-quality significance levels for the corresponding area/regions for images or video. Due to its generality, PQSM can be incorporated into any visual distortion metrics: to improve effectiveness or/and efficiency of perceptual metrics; or even to enhance a PSNR-based metric. A three-stage PQSM estimation method is also proposed in this paper, with an implementation of motion, texture, luminance, skin-color and face mapping. Experimental results show the scheme can improve the performance of current image/video distortion metrics.
Weber-aware weighted mutual information evaluation for infrared-visible image fusion
NASA Astrophysics Data System (ADS)
Luo, Xiaoyan; Wang, Shining; Yuan, Ding
2016-10-01
A performance metric for infrared and visible image fusion is proposed based on Weber's law. To indicate the stimulus of source images, two Weber components are provided. One is differential excitation to reflect the spectral signal of visible and infrared images, and the other is orientation to capture the scene structure feature. By comparing the corresponding Weber component in infrared and visible images, the source pixels can be marked with different dominant properties in intensity or structure. If the pixels have the same dominant property label, the pixels are grouped to calculate the mutual information (MI) on the corresponding Weber components between dominant source and fused images. Then, the final fusion metric is obtained via weighting the group-wise MI values according to the number of pixels in different groups. Experimental results demonstrate that the proposed metric performs well on popular image fusion cases and outperforms other image fusion metrics.
The compressed average image intensity metric for stereoscopic video quality assessment
NASA Astrophysics Data System (ADS)
Wilczewski, Grzegorz
2016-09-01
The following article depicts insights towards design, creation and testing of a genuine metric designed for a 3DTV video quality evaluation. The Compressed Average Image Intensity (CAII) mechanism is based upon stereoscopic video content analysis, setting its core feature and functionality to serve as a versatile tool for an effective 3DTV service quality assessment. Being an objective type of quality metric it may be utilized as a reliable source of information about the actual performance of a given 3DTV system, under strict providers evaluation. Concerning testing and the overall performance analysis of the CAII metric, the following paper presents comprehensive study of results gathered across several testing routines among selected set of samples of stereoscopic video content. As a result, the designed method for stereoscopic video quality evaluation is investigated across the range of synthetic visual impairments injected into the original video stream.
The Massachusetts Community College Performance-Based Funding Formula: A New Model for New England?
ERIC Educational Resources Information Center
Salomon-Fernandez, Yves
2014-01-01
The Massachusetts community college system is entering a second year with funding for each of its 15 schools determined using a new performance-based formula. Under the new model, 50% of each college's allocation is based on performance on metrics related to enrollment and student success, with added incentives for "at-risk" students…
On Information Metrics for Spatial Coding.
Souza, Bryan C; Pavão, Rodrigo; Belchior, Hindiael; Tort, Adriano B L
2018-04-01
The hippocampal formation is involved in navigation, and its neuronal activity exhibits a variety of spatial correlates (e.g., place cells, grid cells). The quantification of the information encoded by spikes has been standard procedure to identify which cells have spatial correlates. For place cells, most of the established metrics derive from Shannon's mutual information (Shannon, 1948), and convey information rate in bits/s or bits/spike (Skaggs et al., 1993, 1996). Despite their widespread use, the performance of these metrics in relation to the original mutual information metric has never been investigated. In this work, using simulated and real data, we find that the current information metrics correlate less with the accuracy of spatial decoding than the original mutual information metric. We also find that the top informative cells may differ among metrics, and show a surrogate-based normalization that yields comparable spatial information estimates. Since different information metrics may identify different neuronal populations, we discuss current and alternative definitions of spatially informative cells, which affect the metric choice. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
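The bits/s and bits/spike measures mentioned above are commonly computed from an occupancy map and a firing-rate map. The sketch below uses the widely cited form I = Σ p_i λ_i log2(λ_i / λ) for the information rate, divided by the mean rate λ for bits/spike. The variable names and example maps are assumptions, and the surrogate-based normalization discussed in the paper is not reproduced here.

```python
import math


def skaggs_information(occupancy, rates):
    """Return (bits_per_second, bits_per_spike) for one cell.

    occupancy: time spent in each spatial bin (normalized internally)
    rates: mean firing rate in each bin, in Hz
    """
    total = sum(occupancy)
    p = [t / total for t in occupancy]
    mean_rate = sum(pi * ri for pi, ri in zip(p, rates))
    if mean_rate == 0:
        return 0.0, 0.0
    bits_per_second = sum(
        pi * ri * math.log2(ri / mean_rate)
        for pi, ri in zip(p, rates)
        if pi > 0 and ri > 0  # bins with zero rate contribute nothing
    )
    return bits_per_second, bits_per_second / mean_rate


# Illustrative maps: a cell firing in one of four equally visited bins.
place_cell = skaggs_information([1, 1, 1, 1], [8.0, 0.0, 0.0, 0.0])
flat_cell = skaggs_information([1, 1, 1, 1], [2.0, 2.0, 2.0, 2.0])
print(place_cell, flat_cell)
```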
What are the Ingredients of a Scientifically and Policy-Relevant Hydrologic Connectivity Metric?
NASA Astrophysics Data System (ADS)
Ali, G.; English, C.; McCullough, G.; Stainton, M.
2014-12-01
While the concept of hydrologic connectivity is of significant importance to both researchers and policy makers, there is no consensus on how to express it in quantitative terms. This lack of consensus was further exacerbated by recent rulings of the U.S. Supreme Court that rely on the idea of "significant nexuses": critical degrees of landscape connectivity now have to be demonstrated to warrant environmental protection under the Clean Water Act. Several indicators of connectivity have been suggested in the literature, but they are often computationally intensive and require soil water content information, a requirement that makes them inapplicable over large, data-poor areas for which management decisions are needed. Here our objective was to assess the extent to which the concept of connectivity could become more operational by: 1) drafting a list of potential, watershed-scale connectivity metrics; 2) establishing a list of criteria for ranking the performance of those metrics; 3) testing them in various landscapes. Our focus was on a dozen agricultural Prairie watersheds where the interaction between near-level topography, perennial and intermittent streams, pothole wetlands and man-made drains renders the estimation of connectivity difficult. A simple procedure was used to convert RADARSAT images, collected between 1997 and 2011, into binary maps of saturated versus non-saturated areas. Several pattern-based and graph-theoretic metrics were then computed for a dynamic assessment of connectivity. The metrics performance was compared with regards to their sensitivity to antecedent precipitation, their correlation with watershed discharge, and their ability to portray aggregation effects. Results show that no single connectivity metric could satisfy all our performance criteria. Graph-theoretic metrics however seemed to perform better in pothole-dominated watersheds, thus highlighting the need for region-specific connectivity assessment frameworks.
NASA Technical Reports Server (NTRS)
Lee, P. J.
1985-01-01
For a frequency-hopped noncoherent MFSK communication system without jammer state information (JSI) in a worst case partial band jamming environment, it is well known that the use of a conventional unquantized metric results in very poor performance. In this paper, a 'normalized' unquantized energy metric is suggested for such a system. It is shown that with this metric, one can save 2-3 dB in required signal energy over the system with hard decision metric without JSI for the same desired performance. When this very robust metric is compared to the conventional unquantized energy metric with JSI, the loss in required signal energy is shown to be small. Thus, the use of this normalized metric provides performance comparable to systems for which JSI is known. Cutoff rate and bit error rate with dual-k coding are used for the performance measures.
Localized Multi-Model Extremes Metrics for the Fourth National Climate Assessment
NASA Astrophysics Data System (ADS)
Thompson, T. R.; Kunkel, K.; Stevens, L. E.; Easterling, D. R.; Biard, J.; Sun, L.
2017-12-01
We have performed localized analysis of scenario-based datasets for the Fourth National Climate Assessment (NCA4). These datasets include CMIP5-based Localized Constructed Analogs (LOCA) downscaled simulations at daily temporal resolution and 1/16th-degree spatial resolution. Over 45 temperature and precipitation extremes metrics have been processed using LOCA data, including threshold, percentile, and degree-days calculations. The localized analysis calculates trends in the temperature and precipitation extremes metrics for relatively small regions such as counties, metropolitan areas, climate zones, administrative areas, or economic zones. For NCA4, we are currently addressing metropolitan areas as defined by U.S. Census Bureau Metropolitan Statistical Areas. Such localized analysis provides essential information for adaptation planning at scales relevant to local planning agencies and businesses. Nearly 30 such regions have been analyzed to date. Each locale is defined by a closed polygon that is used to extract LOCA-based extremes metrics specific to the area. For each metric, single-model data at each LOCA grid location are first averaged over several 30-year historical and future periods. Then, for each metric, the spatial average across the region is calculated using model weights based on both model independence and reproducibility of current climate conditions. The range of single-model results is also captured on the same localized basis, and then combined with the weighted ensemble average for each region and each metric. For example, Boston-area cooling degree days and maximum daily temperature is shown below for RCP8.5 (red) and RCP4.5 (blue) scenarios. We also discuss inter-regional comparison of these metrics, as well as their relevance to risk analysis for adaptation planning.
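Two of the steps described above, a degree-days calculation and a weighted multi-model average, can be sketched as follows. The 65 °F base temperature is the common U.S. convention and is an assumption here, as are the function names, the temperatures and the model weights; the abstract specifies none of them.

```python
def cooling_degree_days(daily_mean_temps_f, base_f=65.0):
    """Sum of daily mean temperature excesses above the base temperature."""
    return sum(max(0.0, t - base_f) for t in daily_mean_temps_f)


def weighted_ensemble_mean(values, weights):
    """Weighted average of per-model values; weights need not sum to one."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Illustrative inputs only: three days of temperatures and two model results.
cdd = cooling_degree_days([60.0, 70.0, 80.0])              # 0 + 5 + 15
ensemble = weighted_ensemble_mean([100.0, 200.0], [3.0, 1.0])
print(cdd, ensemble)
```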
Metrics help rural hospitals achieve world-class performance.
Goodspeed, Scott W
2006-01-01
This article describes the emerging trend of using metrics in rural hospitals to achieve world-class performance. This trend is a response to the fact that rural hospitals have small patient volumes yet must maintain a profit margin in order to fulfill their mission to the community. The conceptual idea for this article is based largely on Robert Kaplan and David Norton's Balanced Scorecard articles in the Harvard Business Review. The ideas also come from the experiences of the 60-plus rural hospitals that are using the Balanced Scorecard and their implementation of metrics to influence performance and behavior. It is indeed possible for rural hospitals to meet and exceed the unique needs of patients and physicians (customers), to achieve healthy profit margins, and to be the rural hospital of choice that employees are proud to work for.
Sensor Selection for Aircraft Engine Performance Estimation and Gas Path Fault Diagnostics
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Rinehart, Aidan W.
2015-01-01
This paper presents analytical techniques for aiding system designers in making aircraft engine health management sensor selection decisions. The presented techniques, which are based on linear estimation and probability theory, are tailored for gas turbine engine performance estimation and gas path fault diagnostics applications. They enable quantification of the performance estimation and diagnostic accuracy offered by different candidate sensor suites. For performance estimation, sensor selection metrics are presented for two types of estimators including a Kalman filter and a maximum a posteriori estimator. For each type of performance estimator, sensor selection is based on minimizing the theoretical sum of squared estimation errors in health parameters representing performance deterioration in the major rotating modules of the engine. For gas path fault diagnostics, the sensor selection metric is set up to maximize correct classification rate for a diagnostic strategy that performs fault classification by identifying the fault type that most closely matches the observed measurement signature in a weighted least squares sense. Results from the application of the sensor selection metrics to a linear engine model are presented and discussed. Given a baseline sensor suite and a candidate list of optional sensors, an exhaustive search is performed to determine the optimal sensor suites for performance estimation and fault diagnostics. For any given sensor suite, Monte Carlo simulation results are found to exhibit good agreement with theoretical predictions of estimation and diagnostic accuracies.
Sensor Selection for Aircraft Engine Performance Estimation and Gas Path Fault Diagnostics
NASA Technical Reports Server (NTRS)
Simon, Donald L.; Rinehart, Aidan W.
2016-01-01
This paper presents analytical techniques for aiding system designers in making aircraft engine health management sensor selection decisions. The presented techniques, which are based on linear estimation and probability theory, are tailored for gas turbine engine performance estimation and gas path fault diagnostics applications. They enable quantification of the performance estimation and diagnostic accuracy offered by different candidate sensor suites. For performance estimation, sensor selection metrics are presented for two types of estimators including a Kalman filter and a maximum a posteriori estimator. For each type of performance estimator, sensor selection is based on minimizing the theoretical sum of squared estimation errors in health parameters representing performance deterioration in the major rotating modules of the engine. For gas path fault diagnostics, the sensor selection metric is set up to maximize correct classification rate for a diagnostic strategy that performs fault classification by identifying the fault type that most closely matches the observed measurement signature in a weighted least squares sense. Results from the application of the sensor selection metrics to a linear engine model are presented and discussed. Given a baseline sensor suite and a candidate list of optional sensors, an exhaustive search is performed to determine the optimal sensor suites for performance estimation and fault diagnostics. For any given sensor suite, Monte Carlo simulation results are found to exhibit good agreement with theoretical predictions of estimation and diagnostic accuracies.
Measuring FLOPS Using Hardware Performance Counter Technologies on LC systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahn, D H
2008-09-05
FLOPS (FLoating-point Operations Per Second) is a commonly used performance metric for scientific programs that rely heavily on floating-point (FP) calculations. The metric is based on the number of FP operations rather than instructions, thereby facilitating a fair comparison between different machines. A well-known use of this metric is the LINPACK benchmark that is used to generate the Top500 list. It measures how fast a computer solves a dense N by N system of linear equations Ax=b, which requires a known number of FP operations, and reports the result in millions of FP operations per second (MFLOPS). While running a benchmark with known FP workloads can provide insightful information about the efficiency of a machine's FP pipelines in relation to other machines, measuring FLOPS of an arbitrary scientific application in a platform-independent manner is nontrivial. The goal of this paper is twofold. First, we explore the FP microarchitectures of key processors that are underpinning the LC machines. Second, we present the hardware performance monitoring counter-based measurement techniques that a user can use to get the native FLOPS of his or her program, which are practical solutions readily available on LC platforms. By nature, however, these native FLOPS metrics are not directly comparable across different machines mainly because FP operations are not consistent across microarchitectures. Thus, the first goal of this paper represents the base reference by which a user can interpret the measured FLOPS more judiciously.
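To make the "known number of FP operations" concrete, the sketch below uses the conventional LINPACK operation count of 2/3 N^3 + 2 N^2 for solving a dense N by N system and converts a measured wall time into MFLOPS. The count is the customary benchmark convention rather than something stated in the report, and the timing value is illustrative.

```python
def linpack_flop_count(n):
    """Conventional FP operation count for solving a dense n-by-n system by LU."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def mflops(flop_count, seconds):
    """Millions of floating-point operations per second."""
    return flop_count / seconds / 1.0e6


# Illustrative run: N = 1000 solved in 0.5 s (made-up timing).
ops = linpack_flop_count(1000)
rate = mflops(ops, 0.5)
print(round(ops), round(rate, 2))
```

Counter-based measurement of an arbitrary application replaces the analytic count with a hardware-counted one, which is why, as the report notes, the resulting native FLOPS are not directly comparable across microarchitectures.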
Kumar, B Vinodh; Mohan, Thuthi
2018-01-01
Six Sigma is one of the most popular quality management system tools employed for process improvement. The Six Sigma methods are usually applied when the outcome of the process can be measured. This study was done to assess the performance of individual biochemical parameters on a Sigma Scale by calculating the sigma metrics for individual parameters and to follow the Westgard guidelines for appropriate Westgard rules and levels of internal quality control (IQC) that needs to be processed to improve target analyte performance based on the sigma metrics. This is a retrospective study, and data required for the study were extracted between July 2015 and June 2016 from a Secondary Care Government Hospital, Chennai. The data obtained for the study are IQC - coefficient of variation percentage and External Quality Assurance Scheme (EQAS) - Bias% for 16 biochemical parameters. For the level 1 IQC, four analytes (alkaline phosphatase, magnesium, triglyceride, and high-density lipoprotein-cholesterol) showed an ideal performance of ≥6 sigma level, five analytes (urea, total bilirubin, albumin, cholesterol, and potassium) showed an average performance of <3 sigma level and for level 2 IQCs, same four analytes of level 1 showed a performance of ≥6 sigma level, and four analytes (urea, albumin, cholesterol, and potassium) showed an average performance of <3 sigma level. For all analytes <6 sigma level, the quality goal index (QGI) was <0.8 indicating the area requiring improvement to be imprecision except cholesterol whose QGI >1.2 indicated inaccuracy. This study shows that sigma metrics is a good quality tool to assess the analytical performance of a clinical chemistry laboratory. Thus, sigma metric analysis provides a benchmark for the laboratory to design a protocol for IQC, address poor assay performance, and assess the efficiency of existing laboratory processes.
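The abstract does not state the formulas used, so the sketch below assumes the forms commonly applied in laboratory quality control: sigma = (TEa% - |Bias%|) / CV%, and QGI = |Bias%| / (1.5 x CV%), with QGI below 0.8 pointing to imprecision and above 1.2 to inaccuracy, as in the interpretation given above. The allowable total error (TEa) and all numeric inputs are illustrative, not values from the study.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma = (TEa% - |Bias%|) / CV% (commonly used form; assumed here)."""
    return (tea_pct - abs(bias_pct)) / cv_pct


def quality_goal_index(bias_pct, cv_pct):
    """QGI = |Bias%| / (1.5 * CV%) (commonly used form; assumed here)."""
    return abs(bias_pct) / (1.5 * cv_pct)


def qgi_problem(qgi):
    """Map a QGI value to the area needing improvement."""
    if qgi < 0.8:
        return "imprecision"
    if qgi > 1.2:
        return "inaccuracy"
    return "imprecision and inaccuracy"


# Illustrative analyte: TEa 10%, bias 2% (from EQAS), CV 4% (from IQC).
sigma = sigma_metric(tea_pct=10.0, bias_pct=2.0, cv_pct=4.0)
qgi = quality_goal_index(bias_pct=2.0, cv_pct=4.0)
print(sigma, round(qgi, 3), qgi_problem(qgi))
```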
Measures of model performance based on the log accuracy ratio
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morley, Steven Karl; Brito, Thiago Vasconcelos; Welling, Daniel T.
Quantitative assessment of modeling and forecasting of continuous quantities uses a variety of approaches. We review existing literature describing metrics for forecast accuracy and bias, concentrating on those based on relative errors and percentage errors. Of these accuracy metrics, the mean absolute percentage error (MAPE) is one of the most common across many fields and has been widely applied in recent space science literature and we highlight the benefits and drawbacks of MAPE and proposed alternatives. We then introduce the log accuracy ratio, and derive from it two metrics: the median symmetric accuracy; and the symmetric signed percentage bias. Robust methods for estimating the spread of a multiplicative linear model using the log accuracy ratio are also presented. The developed metrics are shown to be easy to interpret, robust, and to mitigate the key drawbacks of their more widely-used counterparts based on relative errors and percentage errors. Their use is illustrated with radiation belt electron flux modeling examples.
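The two metrics named in this abstract can be sketched in a few lines. The abstract gives no formulas, so the definitions below follow the forms usually quoted for them and should be checked against the paper: with Q = ln(prediction / observation), the median symmetric accuracy is 100·(exp(median|Q|) - 1) and the symmetric signed percentage bias is 100·sign(median Q)·(exp(|median Q|) - 1). The function names are hypothetical, and all values are assumed to be strictly positive.

```python
import math
from statistics import median


def log_accuracy_ratio(pred, obs):
    """Return ln(pred / obs) for each pair; inputs must be positive."""
    return [math.log(p / o) for p, o in zip(pred, obs)]


def median_symmetric_accuracy(pred, obs):
    """Assumed form: 100 * (exp(median(|Q|)) - 1), in percent."""
    q = log_accuracy_ratio(pred, obs)
    return 100.0 * (math.exp(median(abs(v) for v in q)) - 1.0)


def symmetric_signed_percentage_bias(pred, obs):
    """Assumed form: 100 * sign(M) * (exp(|M|) - 1), with M = median(Q)."""
    m = median(log_accuracy_ratio(pred, obs))
    return 100.0 * math.copysign(1.0, m) * (math.exp(abs(m)) - 1.0)


# Hypothetical data: a model that overpredicts by a factor of two,
# and one that underpredicts by the same factor.
obs = [1.0, 10.0, 100.0]
over = [2.0, 20.0, 200.0]
under = [0.5, 5.0, 50.0]
```

Under these definitions a factor-of-two overprediction and a factor-of-two underprediction receive the same accuracy score and biases of equal size and opposite sign, which is the symmetry the abstract contrasts with percentage-error metrics.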
NASA Technical Reports Server (NTRS)
Rastaetter, L.; Kuznetsova, M.; Hesse, M.; Pulkkinen, A.; Glocer, A.; Yu, Y.; Meng, X.; Raeder, J.; Wiltberger, M.; Welling, D.;
2011-01-01
In this paper the metrics-based results of the Dst part of the 2008-2009 GEM Metrics Challenge are reported. The Metrics Challenge asked modelers to submit results for 4 geomagnetic storm events and 5 different types of observations that can be modeled by statistical or climatological or physics-based (e.g. MHD) models of the magnetosphere-ionosphere system. We present the results of over 25 model settings that were run at the Community Coordinated Modeling Center (CCMC) and at the institutions of various modelers for these events. To measure the performance of each of the models against the observations we use comparisons of one-hour averaged model data with the Dst index issued by the World Data Center for Geomagnetism, Kyoto, Japan, and direct comparison of one-minute model data with the one-minute Dst index calculated by the United States Geological Survey (USGS).
Measures of model performance based on the log accuracy ratio
Morley, Steven Karl; Brito, Thiago Vasconcelos; Welling, Daniel T.
2018-01-03
Quantitative assessment of modeling and forecasting of continuous quantities uses a variety of approaches. We review existing literature describing metrics for forecast accuracy and bias, concentrating on those based on relative errors and percentage errors. Of these accuracy metrics, the mean absolute percentage error (MAPE) is one of the most common across many fields and has been widely applied in recent space science literature and we highlight the benefits and drawbacks of MAPE and proposed alternatives. We then introduce the log accuracy ratio, and derive from it two metrics: the median symmetric accuracy; and the symmetric signed percentage bias. Robust methods for estimating the spread of a multiplicative linear model using the log accuracy ratio are also presented. The developed metrics are shown to be easy to interpret, robust, and to mitigate the key drawbacks of their more widely-used counterparts based on relative errors and percentage errors. Their use is illustrated with radiation belt electron flux modeling examples.
Objectively Quantifying Radiation Esophagitis With Novel Computed Tomography–Based Metrics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niedzielski, Joshua S., E-mail: jsniedzielski@mdanderson.org; University of Texas Houston Graduate School of Biomedical Science, Houston, Texas; Yang, Jinzhong
Purpose: To study radiation-induced esophageal expansion as an objective measure of radiation esophagitis in patients with non-small cell lung cancer (NSCLC) treated with intensity modulated radiation therapy. Methods and Materials: Eighty-five patients had weekly intra-treatment CT imaging and esophagitis scoring according to Common Terminology Criteria for Adverse Events 4.0 (24 Grade 0, 45 Grade 2, and 16 Grade 3). Nineteen esophageal expansion metrics based on mean, maximum, spatial length, and volume of expansion were calculated as voxel-based relative volume change, using the Jacobian determinant from deformable image registration between the planning and weekly CTs. An anatomic variability correction method was validated and applied to these metrics to reduce uncertainty. An analysis of expansion metrics and radiation esophagitis grade was conducted using normal tissue complication probability from univariate logistic regression and Spearman rank for grade 2 and grade 3 esophagitis endpoints, as well as the timing of expansion and esophagitis grade. Metrics' performance in classifying esophagitis was tested with receiver operating characteristic analysis. Results: Expansion increased with esophagitis grade. Thirteen of 19 expansion metrics had receiver operating characteristic area under the curve values >0.80 for both grade 2 and grade 3 esophagitis endpoints, with the highest performance from maximum axial expansion (MaxExp1) and esophageal length with axial expansion ≥30% (LenExp30%) with area under the curve values of 0.93 and 0.91 for grade 2, 0.90 and 0.90 for grade 3 esophagitis, respectively. Conclusions: Esophageal expansion may be a suitable objective measure of esophagitis, particularly maximum axial esophageal expansion and esophageal length with axial expansion ≥30%, with 2.1 Jacobian value and 98.6 mm as the metric value for 50% probability of grade 3 esophagitis. The uncertainty in esophageal Jacobian calculations can be reduced with anatomic correction methods.
Resilience Metrics for the Electric Power System: A Performance-Based Approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vugrin, Eric D.; Castillo, Andrea R; Silva-Monroy, Cesar Augusto
Grid resilience is a concept related to a power system's ability to continue operating and delivering power even in the event that low-probability, high-consequence disruptions such as hurricanes, earthquakes, and cyber-attacks occur. Grid resilience objectives focus on managing and, ideally, minimizing potential consequences that occur as a result of these disruptions. Currently, no formal grid resilience definitions, metrics, or analysis methods have been universally accepted. This document describes an effort to develop and describe grid resilience metrics and analysis methods. The metrics and methods described herein extend upon the Resilience Analysis Process (RAP) developed by Watson et al. for the 2015 Quadrennial Energy Review. The extension allows for both outputs from system models and for historical data to serve as the basis for creating grid resilience metrics and informing grid resilience planning and response decision-making. This document describes the grid resilience metrics and analysis methods. Demonstration of the metrics and methods is shown through a set of illustrative use cases.
Towards a Visual Quality Metric for Digital Video
NASA Technical Reports Server (NTRS)
Watson, Andrew B.
1998-01-01
The advent of widespread distribution of digital video creates a need for automated methods for evaluating visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. The challenge of video quality metrics is to extend these simplified models to temporal signals as well. In this presentation I will discuss a number of the issues that must be resolved in the design of effective video quality metrics. Among these are spatial, temporal, and chromatic sensitivity and their interactions, visual masking, and implementation complexity. I will also touch on the question of how to evaluate the performance of these metrics.
NASA Astrophysics Data System (ADS)
Papacharalampous, Georgia; Tyralis, Hristos; Koutsoyiannis, Demetris
2017-04-01
Machine learning (ML) is considered to be a promising approach to hydrological processes forecasting. We conduct a comparison between several stochastic and ML point estimation methods by performing large-scale computational experiments based on simulations. The purpose is to provide generalized results, while the respective comparisons in the literature are usually based on case studies. The stochastic methods used include simple methods, models from the frequently used families of Autoregressive Moving Average (ARMA), Autoregressive Fractionally Integrated Moving Average (ARFIMA) and Exponential Smoothing models. The ML methods used are Random Forests (RF), Support Vector Machines (SVM) and Neural Networks (NN). The comparison refers to the multi-step ahead forecasting properties of the methods. A total of 20 methods are used, among which 9 are the ML methods. 12 simulation experiments are performed, while each of them uses 2 000 simulated time series of 310 observations. The time series are simulated using stochastic processes from the families of ARMA and ARFIMA models. Each time series is split into a fitting (first 300 observations) and a testing set (last 10 observations). The comparative assessment of the methods is based on 18 metrics, that quantify the methods' performance according to several criteria related to the accurate forecasting of the testing set, the capturing of its variation and the correlation between the testing and forecasted values. The most important outcome of this study is that there is not a uniformly better or worse method. However, there are methods that are regularly better or worse than others with respect to specific metrics. It appears that, although a general ranking of the methods is not possible, their classification based on their similar or contrasting performance in the various metrics is possible to some extent. 
Another important conclusion is that more sophisticated methods do not necessarily provide better forecasts compared to simpler methods. It is pointed out that the ML methods do not differ dramatically from the stochastic methods, while it is interesting that the NN, RF and SVM algorithms used in this study offer potentially very good performance in terms of accuracy. It should be noted that, although this study focuses on hydrological processes, the results are of general scientific interest. Another important point in this study is the use of several methods and metrics. Using fewer methods and fewer metrics would have led to a very different overall picture, particularly if those fewer metrics corresponded to fewer criteria. For this reason, we consider that the proposed methodology is appropriate for the evaluation of forecasting methods.
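The fit/test protocol described in this abstract (310 observations, the first 300 for fitting and the last 10 for testing) can be sketched as below. The last-value forecast and the RMSE score are illustrative stand-ins chosen here; the abstract does not name its simple methods or list its 18 metrics, and the synthetic series is hypothetical.

```python
import math


def split_series(series, n_fit=300, n_test=10):
    """Split a series into a fitting set and a trailing testing set."""
    if len(series) != n_fit + n_test:
        raise ValueError("series length must equal n_fit + n_test")
    return series[:n_fit], series[n_fit:]


def naive_forecast(fit, horizon):
    """Illustrative simple method: repeat the last fitted value."""
    return [fit[-1]] * horizon


def rmse(actual, forecast):
    """Root-mean-square error between testing and forecasted values."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical series of 310 observations.
series = [float(i % 7) for i in range(310)]
fit, test = split_series(series)
forecast = naive_forecast(fit, len(test))
error = rmse(test, forecast)
```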
Stability and Performance Metrics for Adaptive Flight Control
NASA Technical Reports Server (NTRS)
Stepanyan, Vahram; Krishnakumar, Kalmanje; Nguyen, Nhan; VanEykeren, Luarens
2009-01-01
This paper addresses the problem of verifying adaptive control techniques for enabling safe flight in the presence of adverse conditions. Since the adaptive systems are non-linear by design, the existing control verification metrics are not applicable to adaptive controllers. Moreover, these systems are in general highly uncertain. Hence, the system's characteristics cannot be evaluated by relying on the available dynamical models. This necessitates the development of control verification metrics based on the system's input-output information. From this point of view, a set of metrics is introduced that compares the uncertain aircraft's input-output behavior under the action of an adaptive controller to that of a closed-loop linear reference model to be followed by the aircraft. This reference model is constructed for each specific maneuver using the exact aerodynamic and mass properties of the aircraft to meet the stability and performance requirements commonly accepted in flight control. The proposed metrics are unified in the sense that they are model independent and not restricted to any specific adaptive control methods. As an example, we present simulation results for a wing-damaged generic transport aircraft with several existing adaptive controllers.
Evaluation schemes for video and image anomaly detection algorithms
NASA Astrophysics Data System (ADS)
Parameswaran, Shibin; Harguess, Josh; Barngrover, Christopher; Shafer, Scott; Reese, Michael
2016-05-01
Video anomaly detection is a critical research area in computer vision. It is a natural first step before applying object recognition algorithms. There are many algorithms that detect anomalies (outliers) in videos and images that have been introduced in recent years. However, these algorithms behave and perform differently based on differences in domains and tasks to which they are subjected. In order to better understand the strengths and weaknesses of outlier algorithms and their applicability in a particular domain/task of interest, it is important to measure and quantify their performance using appropriate evaluation metrics. There are many evaluation metrics that have been used in the literature such as precision curves, precision-recall curves, and receiver operating characteristic (ROC) curves. In order to construct these different metrics, it is also important to choose an appropriate evaluation scheme that decides when a proposed detection is considered a true or a false detection. Choosing the right evaluation metric and the right scheme is very critical since the choice can introduce positive or negative bias in the measuring criterion and may favor (or work against) a particular algorithm or task. In this paper, we review evaluation metrics and popular evaluation schemes that are used to measure the performance of anomaly detection algorithms on videos and imagery with one or more anomalies. We analyze the biases introduced by these by measuring the performance of an existing anomaly detection algorithm.
The data quality analyzer: A quality control program for seismic data
NASA Astrophysics Data System (ADS)
Ringler, A. T.; Hagerty, M. T.; Holland, J.; Gonzales, A.; Gee, L. S.; Edwards, J. D.; Wilson, D.; Baker, A. M.
2015-03-01
The U.S. Geological Survey's Albuquerque Seismological Laboratory (ASL) has several initiatives underway to enhance and track the quality of data produced from ASL seismic stations and to improve communication about data problems to the user community. The Data Quality Analyzer (DQA) is one such development and is designed to characterize seismic station data quality in a quantitative and automated manner. The DQA consists of a metric calculator, a PostgreSQL database, and a Web interface: The metric calculator, SEEDscan, is a Java application that reads and processes miniSEED data and generates metrics based on a configuration file. SEEDscan compares hashes of metadata and data to detect changes in either and performs subsequent recalculations as needed. This ensures that the metric values are up to date and accurate. SEEDscan can be run as a scheduled task or on demand. The PostgreSQL database acts as a central hub where metric values and limited station descriptions are stored at the channel level with one-day granularity. The Web interface dynamically loads station data from the database and allows the user to make requests for time periods of interest, review specific networks and stations, plot metrics as a function of time, and adjust the contribution of various metrics to the overall quality grade of the station. The quantification of data quality is based on the evaluation of various metrics (e.g., timing quality, daily noise levels relative to long-term noise models, and comparisons between broadband data and event synthetics). Users may select which metrics contribute to the assessment and those metrics are aggregated into a "grade" for each station. 
The DQA is being actively used for station diagnostics and evaluation based on the completed metrics (availability, gap count, timing quality, deviation from a global noise model, deviation from a station noise model, coherence between co-located sensors, and comparison between broadband data and synthetics for earthquakes) on stations in the Global Seismographic Network and Advanced National Seismic System.
Multiple symbol partially coherent detection of MPSK
NASA Technical Reports Server (NTRS)
Simon, M. K.; Divsalar, D.
1992-01-01
It is shown that by using the known (or estimated) value of carrier tracking loop signal to noise ratio (SNR) in the decision metric, it is possible to improve the error probability performance of a partially coherent multiple phase-shift-keying (MPSK) system relative to that corresponding to the commonly used ideal coherent decision rule. Using a maximum-likelihood approach, an optimum decision metric is derived and shown to take the form of a weighted sum of the ideal coherent decision metric (i.e., correlation) and the noncoherent decision metric which is optimum for differential detection of MPSK. The performance of a receiver based on this optimum decision rule is derived and shown to provide continued improvement with increasing length of observation interval (data symbol sequence length). Unfortunately, increasing the observation length does not eliminate the error floor associated with the finite loop SNR. Nevertheless, in the limit of infinite observation length, the average error probability performance approaches the algebraic sum of the error floor and the performance of ideal coherent detection, i.e., at any error probability above the error floor, there is no degradation due to the partial coherence. It is shown that this limiting behavior is virtually achievable with practical size observation lengths. Furthermore, the performance is quite insensitive to mismatch between the estimate of loop SNR (e.g., obtained from measurement) fed to the decision metric and its true value. These results may be of use in low-cost Earth-orbiting or deep-space missions employing coded modulations.
Assessment of various supervised learning algorithms using different performance metrics
NASA Astrophysics Data System (ADS)
Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.
2017-11-01
Our work compares the performance of supervised machine learning algorithms on a binary classification task. The algorithms considered are Support Vector Machine (SVM), Decision Tree (DT), K Nearest Neighbour (KNN), Naïve Bayes (NB) and Random Forest (RF). The paper focuses on comparing these algorithms on one binary classification task by analysing metrics such as Accuracy, F-Measure, G-Measure, Precision, Misclassification Rate, False Positive Rate, True Positive Rate, Specificity and Prevalence.
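All of the metrics listed in this abstract can be derived from the four cells of a binary confusion matrix, as the sketch below shows. The definitions are the standard ones rather than ones taken from the paper; in particular, the G-Measure is assumed to be the geometric mean of precision and recall, and the counts are hypothetical.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common binary-classification metrics from confusion counts.

    Assumes at least one predicted positive, one actual positive and
    one actual negative, so that no denominator is zero.
    """
    total = tp + fp + fn + tn
    if min(tp + fp, tp + fn, fp + tn) == 0:
        raise ValueError("a required denominator is zero")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also the true positive rate
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "precision": precision,
        "true_positive_rate": recall,
        "false_positive_rate": fp / (fp + tn),
        "specificity": tn / (fp + tn),
        "prevalence": (tp + fn) / total,
        "f_measure": 2 * precision * recall / (precision + recall),
        # Assumed definition: geometric mean of precision and recall.
        "g_measure": math.sqrt(precision * recall),
    }


# Hypothetical confusion counts for one classifier.
m = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```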
Assessing Upper Extremity Motor Function in Practice of Virtual Activities of Daily Living
Adams, Richard J.; Lichter, Matthew D.; Krepkovich, Eileen T.; Ellington, Allison; White, Marga; Diamond, Paul T.
2015-01-01
A study was conducted to investigate the criterion validity of measures of upper extremity (UE) motor function derived during practice of virtual activities of daily living (ADLs). Fourteen hemiparetic stroke patients employed a Virtual Occupational Therapy Assistant (VOTA), consisting of a high-fidelity virtual world and a Kinect™ sensor, in four sessions of approximately one hour in duration. An Unscented Kalman Filter-based human motion tracking algorithm estimated UE joint kinematics in real-time during performance of virtual ADL activities, enabling both animation of the user’s avatar and automated generation of metrics related to speed and smoothness of motion. These metrics, aggregated over discrete sub-task elements during performance of virtual ADLs, were compared to scores from an established assessment of UE motor performance, the Wolf Motor Function Test (WMFT). Spearman’s rank correlation analysis indicates a moderate correlation between VOTA-derived metrics and the time-based WMFT assessments, supporting the criterion validity of VOTA measures as a means of tracking patient progress during an UE rehabilitation program that includes practice of virtual ADLs. PMID:25265612
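The correlation analysis mentioned in this abstract can be illustrated with a small standard-library implementation of Spearman's rank correlation: rank both variables (averaging ranks for ties) and take the Pearson correlation of the ranks. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical scores: a motion-speed metric against task completion time.
speed_metric = [1.0, 2.0, 3.0, 4.0, 5.0]
task_time = [10.0, 9.0, 8.0, 7.0, 6.0]
rho = spearman_rho(speed_metric, task_time)
```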
Assessing upper extremity motor function in practice of virtual activities of daily living.
Adams, Richard J; Lichter, Matthew D; Krepkovich, Eileen T; Ellington, Allison; White, Marga; Diamond, Paul T
2015-03-01
A study was conducted to investigate the criterion validity of measures of upper extremity (UE) motor function derived during practice of virtual activities of daily living (ADLs). Fourteen hemiparetic stroke patients employed a Virtual Occupational Therapy Assistant (VOTA), consisting of a high-fidelity virtual world and a Kinect™ sensor, in four sessions of approximately one hour in duration. An unscented Kalman Filter-based human motion tracking algorithm estimated UE joint kinematics in real-time during performance of virtual ADL activities, enabling both animation of the user's avatar and automated generation of metrics related to speed and smoothness of motion. These metrics, aggregated over discrete sub-task elements during performance of virtual ADLs, were compared to scores from an established assessment of UE motor performance, the Wolf Motor Function Test (WMFT). Spearman's rank correlation analysis indicates a moderate correlation between VOTA-derived metrics and the time-based WMFT assessments, supporting the criterion validity of VOTA measures as a means of tracking patient progress during an UE rehabilitation program that includes practice of virtual ADLs.
Local coding based matching kernel method for image classification.
Song, Yan; McLoughlin, Ian Vince; Dai, Li-Rong
2014-01-01
This paper mainly focuses on how to effectively and efficiently measure visual similarity for local feature based representation. Among existing methods, metrics based on Bag of Visual Word (BoV) techniques are efficient and conceptually simple, at the expense of effectiveness. By contrast, kernel based metrics are more effective, but at the cost of greater computational complexity and increased storage requirements. We show that a unified visual matching framework can be developed to encompass both BoV and kernel based metrics, in which local kernel plays an important role between feature pairs or between features and their reconstruction. Generally, local kernels are defined using Euclidean distance or its derivatives, based either explicitly or implicitly on an assumption of Gaussian noise. However, local features such as SIFT and HoG often follow a heavy-tailed distribution which tends to undermine the motivation behind Euclidean metrics. Motivated by recent advances in feature coding techniques, a novel efficient local coding based matching kernel (LCMK) method is proposed. This exploits the manifold structures in Hilbert space derived from local kernels. The proposed method combines advantages of both BoV and kernel based metrics, and achieves a linear computational complexity. This enables efficient and scalable visual matching to be performed on large scale image sets. To evaluate the effectiveness of the proposed LCMK method, we conduct extensive experiments with widely used benchmark datasets, including 15-Scenes, Caltech101/256, PASCAL VOC 2007 and 2011 datasets. Experimental results confirm the effectiveness of the relatively efficient LCMK method.
Sigma Routing Metric for RPL Protocol.
Sanmartin, Paul; Rojas, Aldo; Fernandez, Luis; Avila, Karen; Jabba, Daladier; Valle, Sebastian
2018-04-21
This paper presents the adaptation of a specific metric for the RPL protocol in the objective function MRHOF. Among the functions standardized by IETF, we find OF0, which is based on the minimum hop count, as well as MRHOF, which is based on the Expected Transmission Count (ETX). However, when the network becomes denser or the number of nodes increases, both OF0 and MRHOF introduce long hops, which can generate a bottleneck that restricts the network. The adaptation is proposed to optimize both OFs through a new routing metric. To solve the above problem, the metrics of the minimum number of hops and the ETX are combined by designing a new routing metric called SIGMA-ETX, in which the best route is calculated using the standard deviation of ETX values between each node, as opposed to working with the ETX average along the route. This method ensures a better routing performance in dense sensor networks. The simulations are done through the Cooja simulator, based on the Contiki operating system. The simulations showed that the proposed optimization outperforms both OF0 and MRHOF by a wide margin in terms of network latency, packet delivery ratio, lifetime, and power consumption.
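The idea of ranking routes by the spread of their ETX values rather than by the average can be sketched as below. This is only an interpretation of the abstract: it takes the population standard deviation of per-link ETX values along each candidate route and prefers the smallest, whereas the exact SIGMA-ETX definition in the paper may differ. The function names, route names and ETX values are hypothetical.

```python
from statistics import mean, pstdev


def sigma_etx(link_etx):
    """Population standard deviation of the per-link ETX values of a route."""
    return pstdev(link_etx)


def pick_route(routes):
    """Return the name of the route with the smallest ETX spread."""
    return min(routes, key=lambda name: sigma_etx(routes[name]))


# Two hypothetical routes with the same average ETX (2.0).
# Route "A" hides one poor link (ETX 4.0); route "B" is uniform.
routes = {
    "A": [1.0, 1.0, 4.0],
    "B": [2.0, 2.0, 2.0],
}
best = pick_route(routes)
```

An average-based comparison cannot separate these two routes, while the spread-based one prefers the route without the single weak link.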
Sigma Routing Metric for RPL Protocol
Rojas, Aldo; Fernandez, Luis
2018-01-01
This paper presents the adaptation of a specific metric for the RPL protocol in the objective function MRHOF. Among the functions standardized by IETF, we find OF0, which is based on the minimum hop count, as well as MRHOF, which is based on the Expected Transmission Count (ETX). However, when the network becomes denser or the number of nodes increases, both OF0 and MRHOF introduce long hops, which can generate a bottleneck that restricts the network. The adaptation is proposed to optimize both OFs through a new routing metric. To solve the above problem, the metrics of the minimum number of hops and the ETX are combined by designing a new routing metric called SIGMA-ETX, in which the best route is calculated using the standard deviation of ETX values between each node, as opposed to working with the ETX average along the route. This method ensures a better routing performance in dense sensor networks. The simulations are done through the Cooja simulator, based on the Contiki operating system. The simulations showed that the proposed optimization outperforms both OF0 and MRHOF by a wide margin in terms of network latency, packet delivery ratio, lifetime, and power consumption. PMID:29690524
Fu, Lawrence D; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F
2011-08-01
Evaluating the biomedical literature and health-related websites for quality are challenging information retrieval tasks. Current commonly used methods include impact factor for journals, PubMed's clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic-adjusted metrics and other topic robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations. Copyright © 2011 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Ribeiro, M. Gabriela T. C.; Machado, Adelio A. S. C.
2013-01-01
Two new semiquantitative green chemistry metrics, the green circle and the green matrix, have been developed for quick assessment of the greenness of a chemical reaction or process, even without performing the experiment from a protocol if enough detail is provided in it. The evaluation is based on the 12 principles of green chemistry. The…
Designing Industrial Networks Using Ecological Food Web Metrics.
Layton, Astrid; Bras, Bert; Weissburg, Marc
2016-10-18
Biologically Inspired Design (biomimicry) and Industrial Ecology both look to natural systems to enhance the sustainability and performance of engineered products, systems and industries. Bioinspired design (BID) traditionally has focused on a unit operation and single product level. In contrast, this paper describes how principles of network organization derived from analysis of ecosystem properties can be applied to industrial system networks. Specifically, this paper examines the applicability of particular food web matrix properties as design rules for economically and biologically sustainable industrial networks, using an optimization model developed for a carpet recycling network. Carpet recycling network designs based on traditional cost and emissions based optimization are compared to designs obtained using optimizations based solely on ecological food web metrics. The analysis suggests that networks optimized using food web metrics also were superior from a traditional cost and emissions perspective; correlations between optimization using ecological metrics and traditional optimization ranged generally from 0.70 to 0.96, with flow-based metrics being superior to structural parameters. Four structural food parameters provided correlations nearly the same as that obtained using all structural parameters, but individual structural parameters provided much less satisfactory correlations. The analysis indicates that bioinspired design principles from ecosystems can lead to both environmentally and economically sustainable industrial resource networks, and represent guidelines for designing sustainable industry networks.
Tools for monitoring system suitability in LC MS/MS centric proteomic experiments.
Bereman, Michael S
2015-03-01
With advances in liquid chromatography coupled to tandem mass spectrometry technologies combined with the continued goals of biomarker discovery, clinical applications of established biomarkers, and integrating large multiomic datasets (i.e. "big data"), there remains an urgent need for robust tools to assess instrument performance (i.e. system suitability) in proteomic workflows. To this end, several freely available tools have been introduced that monitor a number of peptide identification (ID) and/or peptide ID free metrics. Peptide ID metrics include numbers of proteins, peptides, or peptide spectral matches identified from a complex mixture. Peptide ID free metrics include retention time reproducibility, full width half maximum, ion injection times, and integrated peptide intensities. The main driving force in the development of these tools is to monitor both intra- and interexperiment performance variability and to identify sources of variation. The purpose of this review is to summarize and evaluate these tools based on versatility, automation, vendor neutrality, metrics monitored, and visualization capabilities. In addition, the implementation of a robust system suitability workflow is discussed in terms of metrics, type of standard, and frequency of evaluation along with the obstacles to overcome prior to incorporating a more proactive approach to overall quality control in liquid chromatography coupled to tandem mass spectrometry based proteomic workflows. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
An Underwater Color Image Quality Evaluation Metric.
Yang, Miao; Sowmya, Arcot
2015-12-01
Quality evaluation of underwater images is a key goal of underwater video image retrieval and intelligent processing. To date, no metric has been proposed for underwater color image quality evaluation (UCIQE). The special absorption and scattering characteristics of the water medium do not allow direct application of natural color image quality metrics especially to different underwater environments. In this paper, subjective testing for underwater image quality has been organized. The statistical distribution of the underwater image pixels in the CIELab color space related to subjective evaluation indicates the sharpness and colorful factors correlate well with subjective image quality perception. Based on these, a new UCIQE metric, which is a linear combination of chroma, saturation, and contrast, is proposed to quantify the non-uniform color cast, blurring, and low-contrast that characterize underwater engineering and monitoring images. Experiments are conducted to illustrate the performance of the proposed UCIQE metric and its capability to measure the underwater image enhancement results. They show that the proposed metric has comparable performance to the leading natural color image quality metrics and the underwater grayscale image quality metrics available in the literature, and can predict with higher accuracy the relative amount of degradation with similar image content in underwater environments. Importantly, UCIQE is a simple and fast solution for real-time underwater video processing. The effectiveness of the presented measure is also demonstrated by subjective evaluation. The results show better correlation between the UCIQE and the subjective mean opinion score.
Towards New Metrics for High-Performance Computing Resilience
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hukerikar, Saurabh; Ashraf, Rizwan A; Engelmann, Christian
Ensuring the reliability of applications is becoming an increasingly important challenge as high-performance computing (HPC) systems experience an ever-growing number of faults, errors and failures. While the HPC community has made substantial progress in developing various resilience solutions, it continues to rely on platform-based metrics to quantify application resiliency improvements. The resilience of an HPC application is concerned with the reliability of the application outcome as well as the fault handling efficiency. To understand the scope of impact, effective coverage and performance efficiency of existing and emerging resilience solutions, there is a need for new metrics. In this paper, we develop new ways to quantify resilience that consider both the reliability and the performance characteristics of the solutions from the perspective of HPC applications. As HPC systems continue to evolve in terms of scale and complexity, it is expected that applications will experience various types of faults, errors and failures, which will require applications to apply multiple resilience solutions across the system stack. The proposed metrics are intended to be useful for understanding the combined impact of these solutions on an application's ability to produce correct results and to evaluate their overall impact on an application's performance in the presence of various modes of faults.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klawikowski, S; Christian, J; Schott, D
Purpose: Pilot study developing a CT-texture based model for early assessment of treatment response during the delivery of chemoradiation therapy (CRT) for pancreatic cancer. Methods: Daily CT data acquired for 24 pancreatic head cancer patients using CT-on-rails, during the routine CT-guided CRT delivery with a radiation dose of 50.4 Gy in 28 fractions, were analyzed. The pancreas head was contoured on each daily CT. Texture analysis was performed within the pancreas head contour using a research tool (IBEX). Over 1300 texture metrics, including grey level co-occurrence, run-length, histogram, neighborhood intensity difference, and geometrical shape features, were calculated for each daily CT. Metric-trend information was established by finding the best fit of either a linear, quadratic, or exponential function for each metric value versus accumulated dose. Thus all the daily CT texture information was consolidated into a best-fit trend type for a given patient and texture metric. Linear correlation was performed between the patient histological response vector (good, medium, poor) and all combinations of 23 patient subgroups (statistical jackknife), determining which metrics were most correlated to response and repeatedly reliable across most patients. Control correlations against CT scanner, reconstruction kernel, and gated/nongated CT images were also calculated. A Euclidean distance measure was used to group/sort patient vectors based on the data of these trend-response metrics. Results: We found four specific trend-metrics (Gray Level Coocurence Matrix311-1InverseDiffMomentNorm, Gray Level Coocurence Matrix311-1InverseDiffNorm, Gray Level Coocurence Matrix311-1 Homogeneity2, and Intensity Direct Local StdMean) that were highly correlated with patient response and repeatedly reliable. Our four trend-metric model successfully ordered our pilot response dataset (p=0.00070). We found no significant correlation to our control parameters: gating (p=0.7717), scanner (p=0.9741), and kernel (p=0.8586). Conclusion: We have successfully created a CT-texture based early treatment response prediction model using the CTs acquired during the delivery of chemoradiation therapy for pancreatic cancer. Future testing is required to validate the model with more patient data.
Analysis of Subjects' Vulnerability in a Touch Screen Game Using Behavioral Metrics.
Parsinejad, Payam; Sipahi, Rifat
2017-12-01
In this article, we report results on an experimental study conducted with volunteer subjects playing a touch-screen game with two unique difficulty levels. Subjects have knowledge about the rules of both game levels, but only sufficient playing experience with the easy level of the game, making them vulnerable with the difficult level. Several behavioral metrics associated with subjects' playing the game are studied in order to assess subjects' mental-workload changes induced by their vulnerability. Specifically, these metrics are calculated based on subjects' finger kinematics and decision making times, which are then compared with baseline metrics, namely, performance metrics pertaining to how well the game is played and a physiological metric called pnn50 extracted from heart rate measurements. In balanced experiments and supported by comparisons with baseline metrics, it is found that some of the studied behavioral metrics have the potential to be used to infer subjects' mental workload changes through different levels of the game. These metrics, which are decoupled from task specifics, relate to subjects' ability to develop strategies to play the game, and hence have the advantage of offering insight into subjects' task-load and vulnerability assessment across various experimental settings.
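As an illustration of the physiological baseline mentioned above, pNN50 is commonly defined as the percentage of successive heartbeat (RR) intervals that differ by more than 50 ms. The following is a minimal sketch under that common definition; the function name, the input format (RR intervals in milliseconds) and the choice of denominator (number of successive differences) are assumptions for illustration, not details taken from the study.

```python
def pnn50(rr_intervals_ms):
    """Percentage of successive RR-interval differences larger than 50 ms.

    rr_intervals_ms: sequence of RR intervals in milliseconds (hypothetical input format).
    """
    # Absolute differences between each pair of consecutive intervals.
    diffs = [abs(b - a) for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    # Share of differences strictly above the 50 ms threshold, as a percentage.
    return 100.0 * sum(1 for d in diffs if d > 50) / len(diffs)


# Illustrative values only: differences are 60, 10, 60 and 5 ms.
rr = [800, 860, 850, 910, 905]
print(pnn50(rr))  # 50.0
```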
Guiding Principles and Checklist for Population-Based Quality Metrics
Brunelli, Steven M.; Maddux, Franklin W.; Parker, Thomas F.; Johnson, Douglas; Nissenson, Allen R.; Collins, Allan; Lacson, Eduardo
2014-01-01
The Centers for Medicare and Medicaid Services oversees the ESRD Quality Incentive Program to ensure that the highest quality of health care is provided by outpatient dialysis facilities that treat patients with ESRD. To that end, Centers for Medicare and Medicaid Services uses clinical performance measures to evaluate quality of care under a pay-for-performance or value-based purchasing model. Now more than ever, the ESRD therapeutic area serves as the vanguard of health care delivery. By translating medical evidence into clinical performance measures, the ESRD Prospective Payment System became the first disease-specific sector using the pay-for-performance model. A major challenge for the creation and implementation of clinical performance measures is the adjustments that are necessary to transition from taking care of individual patients to managing the care of patient populations. The National Quality Forum and others have developed effective and appropriate population-based clinical performance measures quality metrics that can be aggregated at the physician, hospital, dialysis facility, nursing home, or surgery center level. Clinical performance measures considered for endorsement by the National Quality Forum are evaluated using five key criteria: evidence, performance gap, and priority (impact); reliability; validity; feasibility; and usability and use. We have developed a checklist of special considerations for clinical performance measure development according to these National Quality Forum criteria. Although the checklist is focused on ESRD, it could also have broad application to chronic disease states, where health care delivery organizations seek to enhance quality, safety, and efficiency of their services. Clinical performance measures are likely to become the norm for tracking performance for health care insurers. 
Thus, it is critical that the methodologies used to develop such metrics serve the payer and the provider and most importantly, reflect what represents the best care to improve patient outcomes. PMID:24558050
Pichler, Peter; Mazanek, Michael; Dusberger, Frederico; Weilnböck, Lisa; Huber, Christian G; Stingl, Christoph; Luider, Theo M; Straube, Werner L; Köcher, Thomas; Mechtler, Karl
2012-11-02
While the performance of liquid chromatography (LC) and mass spectrometry (MS) instrumentation continues to increase, applications such as analyses of complete or near-complete proteomes and quantitative studies require constant and optimal system performance. For this reason, research laboratories and core facilities alike are recommended to implement quality control (QC) measures as part of their routine workflows. Many laboratories perform sporadic quality control checks. However, successive and systematic longitudinal monitoring of system performance would be facilitated by dedicated automatic or semiautomatic software solutions that aid an effortless analysis and display of QC metrics over time. We present the software package SIMPATIQCO (SIMPle AuTomatIc Quality COntrol) designed for evaluation of data from LTQ Orbitrap, Q-Exactive, LTQ FT, and LTQ instruments. A centralized SIMPATIQCO server can process QC data from multiple instruments. The software calculates QC metrics supervising every step of data acquisition from LC and electrospray to MS. For each QC metric the software learns the range indicating adequate system performance from the uploaded data using robust statistics. Results are stored in a database and can be displayed in a comfortable manner from any computer in the laboratory via a web browser. QC data can be monitored for individual LC runs as well as plotted over time. SIMPATIQCO thus assists the longitudinal monitoring of important QC metrics such as peptide elution times, peak widths, intensities, total ion current (TIC) as well as sensitivity, and overall LC–MS system performance; in this way the software also helps identify potential problems. The SIMPATIQCO software package is available free of charge. PMID:23088386
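The abstract states that the acceptable range of each QC metric is learned from uploaded data using robust statistics, but does not say which estimator is used. One plausible way to do this is with the median and the median absolute deviation (MAD); the sketch below assumes that approach, and the function name, the cutoff k = 3 and the sample values are hypothetical rather than taken from SIMPATIQCO.

```python
import statistics


def robust_range(values, k=3.0):
    """Return (low, high) bounds as median +/- k * scaled MAD.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data. Both k and the estimator are assumptions.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    spread = 1.4826 * mad
    return med - k * spread, med + k * spread


# Hypothetical peak-width history with one obvious outlier (50).
history = [10, 11, 9, 10, 12, 10, 50]
low, high = robust_range(history)
flagged = [v for v in history if not (low <= v <= high)]
print(flagged)  # [50]
```

Because the median and MAD are barely affected by a single extreme run, the outlier does not widen the learned range the way it would with a mean and standard deviation.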
Evaluation of image quality metrics for the prediction of subjective best focus.
Kilintari, Marina; Pallikaris, Aristophanis; Tsiklis, Nikolaos; Ginis, Harilaos S
2010-03-01
Seven existing and three new image quality metrics were evaluated in terms of their effectiveness in predicting subjective cycloplegic refraction. Monochromatic wavefront aberrations (WA) were measured in 70 eyes using a Shack-Hartmann based device (Complete Ophthalmic Analysis System; Wavefront Sciences). Subjective cycloplegic spherocylindrical correction was obtained using a standard manifest refraction procedure. The dioptric amount required to optimize each metric was calculated and compared with the subjective refraction result. Metrics included monochromatic and polychromatic variants, as well as variants taking into consideration the Stiles and Crawford effect (SCE). WA measurements were performed using infrared light and converted to visible before all calculations. The mean difference between subjective cycloplegic and WA-derived spherical refraction ranged from 0.17 to 0.36 diopters (D), while paraxial curvature resulted in a difference of 0.68 D. Monochromatic metrics exhibited smaller mean differences between subjective cycloplegic and objective refraction. Consideration of the SCE reduced the standard deviation (SD) of the difference between subjective and objective refraction. All metrics exhibited similar performance in terms of accuracy and precision. We hypothesize that errors pertaining to the conversion between infrared and visible wavelengths rather than calculation method may be the limiting factor in determining objective best focus from near infrared WA measurements.
A proteomics performance standard to support measurement quality in proteomics.
Beasley-Green, Ashley; Bunk, David; Rudnick, Paul; Kilpatrick, Lisa; Phinney, Karen
2012-04-01
The emergence of MS-based proteomic platforms as a prominent technology utilized in biochemical and biomedical research has increased the need for high-quality MS measurements. To address this need, National Institute of Standards and Technology (NIST) reference material (RM) 8323 yeast protein extract is introduced as a proteomics quality control material for benchmarking the preanalytical and analytical performance of proteomics-based experimental workflows. RM 8323 yeast protein extract is based upon the well-characterized eukaryote Saccharomyces cerevisiae and can be utilized in the design and optimization of proteomics-based methodologies from sample preparation to data analysis. To demonstrate its utility as a proteomics quality control material, we coupled LC-MS/MS measurements of RM 8323 with the NIST MS Quality Control (MSQC) performance metrics to quantitatively assess the LC-MS/MS instrumentation parameters that influence measurement accuracy, repeatability, and reproducibility. Due to the complexity of the yeast proteome, we also demonstrate how NIST RM 8323, along with the NIST MSQC performance metrics, can be used in the evaluation and optimization of proteomics-based sample preparation methods. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Shaikh, Faiq; Hendrata, Kenneth; Kolowitz, Brian; Awan, Omer; Shrestha, Rasu; Deible, Christopher
2017-06-01
In the era of value-based healthcare, many aspects of medical care are being measured and assessed to improve quality and reduce costs. Radiology adds enormously to health care costs and is under pressure to adopt a more efficient system that incorporates essential metrics to assess its value and impact on outcomes. Most current systems tie radiologists' incentives and evaluations to RVU-based productivity metrics and peer-review-based quality metrics. In a new potential model, a radiologist's performance will have to increasingly depend on a number of parameters that define "value," beginning with peer review metrics that include referrer satisfaction and feedback from radiologists to the referring physician that evaluates the potency and validity of clinical information provided for a given study. These new dimensions of value measurement will directly impact the cascade of further medical management. We share our continued experience with this project that had two components: RESP (Referrer Evaluation System Pilot) and FRACI (Feedback from Radiologist Addressing Confounding Issues), which were introduced to the clinical radiology workflow in order to capture referrer-based and radiologist-based feedback on radiology reporting. We also share our insight into the principles of design thinking as applied in its planning and execution.
[Predictive model based multimetric index of macroinvertebrates for river health assessment].
Chen, Kai; Yu, Hai Yan; Zhang, Ji Wei; Wang, Bei Xin; Chen, Qiu Wen
2017-06-18
Improving the stability of the index of biotic integrity (IBI; i.e., multi-metric indices, MMI) across temporal and spatial scales is one of the most important issues in water ecosystem integrity bioassessment and water environment management. Using datasets of field-based macroinvertebrate and physicochemical variables and GIS-based natural predictors (e.g., geomorphology and climate) and land use variables collected at 227 river sites from 2004 to 2011 across the Zhejiang Province, China, we used random forests (RF) to adjust the effects of natural variations at temporal and spatial scales on macroinvertebrate metrics. We then developed natural variations adjusted (predictive) and unadjusted (null) MMIs and compared performance between them. The core metrics selected for predictive and null MMIs were different from each other, and natural variations within core metrics in predictive MMI explained by RF models ranged between 11.4% and 61.2%. The predictive MMI was more precise and accurate, but less responsive and sensitive than null MMI. The multivariate nearest-neighbor test determined that 9 test sites and 1 most degraded site were flagged outside of the environmental space of the reference site network. We found that the combination of the predictive MMI developed by using a predictive model and the nearest-neighbor test performed best and decreased the risks of inferring type I (designating a water body as being in poor biological condition, when it was actually in good condition) and type II (designating a water body as being in good biological condition, when it was actually in poor condition) errors. Our results provided an effective method to improve the stability and performance of the index of biotic integrity.
Throughput analysis for the National Airspace System
NASA Astrophysics Data System (ADS)
Sureshkumar, Chandrasekar
The United States National Airspace System (NAS) network performance is currently measured using a variety of metrics based on delay. Developments in the fields of wireless communication, manufacturing and other modes of transportation like road, freight, etc. have explored various metrics that complement the delay metric. In this work, we develop a throughput concept for both the terminal and en-route phases of flight inspired by studies in the above areas and explore the applications of throughput metrics for the en-route airspace of the NAS. These metrics can be applied to the NAS performance at each hierarchical level—the sector, center, regional and national and will consist of multiple layers of networks with the bottom level comprising the traffic pattern modelled as a network of individual sectors acting as nodes. This hierarchical approach is especially suited for executive level decision making as it gives an overall picture of not just the inefficiencies but also the aspects where the NAS has performed well in a given situation from which specific information about the effects of a policy change on the NAS performance at each level can be determined. These metrics are further validated with real traffic data using the Future Air Traffic Management Concepts Evaluation Tool (FACET) for three en-route sectors and an Air Route Traffic Control Center (ARTCC). Further, this work proposes a framework to compute the minimum makespan and the capacity of a runway system in any configuration. Towards this, an algorithm for optimal arrival and departure flight sequencing is proposed. The proposed algorithm is based on a branch-and-bound technique and allows for the efficient computation of the best runway assignment and sequencing of arrival and departure operations that minimize the makespan at a given airport. 
The lower and upper bounds of the cost of each branch for the best-first search in the branch-and-bound algorithm are computed based on the minimum separation standards between arrival and departure operations set by the Federal Aviation Administration. The optimal objective value is mathematically proved to lie between these bounds, and the algorithm uses these bounds to efficiently find promising branches, discard all others, and terminate with at least one sequence with the minimal makespan. The proposed algorithm is analyzed and validated through real traffic operations data at the Hartsfield-Jackson Atlanta international airport.
Metric-driven harm: an exploration of unintended consequences of performance measurement.
Rambur, Betty; Vallett, Carol; Cohen, Judith A; Tarule, Jill Mattuck
2013-11-01
Performance measurement is an increasingly common element of the US health care system. Although performance metrics typically serve as a proxy for high quality outcomes, there has been little systematic investigation of their potential negative unintended consequences, including metric-driven harm. This case study details an incident of post-surgical metric-driven harm and offers Smith's 1995 work and a patient centered, context sensitive metric model for potential adoption by nurse researchers and clinicians. Implications for further research are discussed. © 2013.
Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J
2017-05-01
Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary based feature sourcing approaches and "off the shelf" tools could predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary based models to models built using features derived from medical dictionaries. We evaluated the detection of cancer cases from free text pathology reports using decision models built with combinations of dictionary or non-dictionary based feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70 and 90%. The source of features and feature subset size had no impact on the performance of a decision model. Our study suggests there is little value in leveraging medical dictionaries for extracting features for decision model building. Decision models built using features extracted from the plaintext reports themselves achieve comparable results to those built using medical dictionaries. Overall, this suggests that existing "off the shelf" approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection and modeling approaches. Copyright © 2017 Elsevier Inc. All rights reserved.
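Four of the performance metrics named above follow directly from the counts in a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not results from the study. The area under the ROC curve is omitted because it needs per-report scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                    # positive predictive value
    }


# Hypothetical counts: 100 cancer reports and 100 non-cancer reports.
m = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```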
Bayesian performance metrics of binary sensors in homeland security applications
NASA Astrophysics Data System (ADS)
Jannson, Tomasz P.; Forrester, Thomas C.
2008-04-01
Bayesian performance metrics, based on such parameters as prior probability, probability of detection (or accuracy), false alarm rate, and positive predictive value, characterize the performance of binary sensors, i.e., sensors that have only a binary response: true target/false target. Such binary sensors, very common in Homeland Security, produce an alarm that can be true or false. They include X-ray airport inspection, IED inspections, product quality control, cancer medical diagnosis, part of ATR, and many others. In this paper, we analyze direct and inverse conditional probabilities in the context of Bayesian inference and binary sensors, using X-ray luggage inspection statistical results as a guideline.
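The relation between the direct probabilities (probability of detection, false alarm rate) and the inverse one (positive predictive value) is Bayes' theorem. The sketch below works through it for a binary sensor; the function name and the numerical values are illustrative assumptions, not figures from the paper.

```python
def positive_predictive_value(prior, p_detect, false_alarm_rate):
    """P(target | alarm) from P(target), P(alarm | target) and P(alarm | no target)."""
    true_alarms = p_detect * prior
    false_alarms = false_alarm_rate * (1.0 - prior)
    return true_alarms / (true_alarms + false_alarms)


# Hypothetical sensor: 1 target per 1000 items, 99% detection, 1% false alarms.
ppv = positive_predictive_value(prior=0.001, p_detect=0.99, false_alarm_rate=0.01)
print(f"{ppv:.4f}")  # 0.0902
```

With a rare target, even a sensor with a high probability of detection and a low false alarm rate produces mostly false alarms, which is why the prior probability belongs among the performance parameters.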
PET and MRI image fusion based on combination of 2-D Hilbert transform and IHS method.
Haddadpour, Mozhdeh; Daneshvar, Sabalan; Seyedarabi, Hadi
2017-08-01
Medical image fusion combines two or more medical images, such as Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) images, and maps them to a single fused image. The purpose of our study is to assist physicians in diagnosing and treating diseases in the least possible time. We used MRI and PET images as inputs and fused them using a combination of the two-dimensional Hilbert transform (2-D HT) and the Intensity Hue Saturation (IHS) method. The evaluation metrics we apply are Discrepancy (D_k), which assesses spectral features, Average Gradient (AG_k), which evaluates spatial features, and Overall Performance (O.P), which verifies the overall quality of the proposed method. These three common metrics, with the lowest Discrepancy being the most favorable, are used to evaluate the performance of our method. Simulated and numerical results show the desired performance of the proposed method. Since the main purpose of medical image fusion is to preserve both the spatial and spectral features of the input images, the numerical results of the evaluation metrics (AG_k, D_k and O.P), together with the simulated results, indicate that our proposed method can preserve both spatial and spectral features of the input images. Copyright © 2017 Chang Gung University. Published by Elsevier B.V. All rights reserved.
Determining GPS average performance metrics
NASA Technical Reports Server (NTRS)
Moore, G. V.
1995-01-01
Analytic and semi-analytic methods are used to show that users of the GPS constellation can expect performance variations based on their location. Specifically, performance is shown to be a function of both altitude and latitude. These results stem from the fact that the GPS constellation is itself non-uniform. For example, GPS satellites are over four times as likely to be directly over Tierra del Fuego than over Hawaii or Singapore. Inevitable performance variations due to user location occur for ground, sea, air and space GPS users. These performance variations can be studied in an average relative sense. A semi-analytic tool which symmetrically allocates GPS satellite latitude belt dwell times among longitude points is used to compute average performance metrics. These metrics include average number of GPS vehicles visible, relative average accuracies in the radial, intrack and crosstrack (or radial, north/south, east/west) directions, and relative average PDOP or GDOP. The tool can be quickly changed to incorporate various user antenna obscuration models and various GPS constellation designs. Among other applications, tool results can be used in studies to: predict locations and geometries of best/worst case performance, design GPS constellations, determine optimal user antenna location and understand performance trends among various users.
Person Re-Identification via Distance Metric Learning With Latent Variables.
Sun, Chong; Wang, Dong; Lu, Huchuan
2017-01-01
In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning the effective metric matrix, which can be solved via an iterative manner: once latent information is specified, the metric matrix can be obtained based on some typical metric learning methods; with the computed metric matrix, the latent variables can be determined by searching the state space exhaustively. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.
Automated Assessment of Visual Quality of Digital Video
NASA Technical Reports Server (NTRS)
Watson, Andrew B.; Ellis, Stephen R. (Technical Monitor)
1997-01-01
The advent of widespread distribution of digital video creates a need for automated methods for evaluating visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images[1-4]. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. The challenge of video quality metrics is to extend these simplified models to temporal signals as well. In this presentation I will discuss a number of the issues that must be resolved in the design of effective video quality metrics. Among these are spatial, temporal, and chromatic sensitivity and their interactions, visual masking, and implementation complexity. I will also touch on the question of how to evaluate the performance of these metrics.
Best Practices Handbook: Traffic Engineering in Range Networks
2016-03-01
units of measurement. Measurement Methodology - A repeatable measurement technique used to derive one or more metrics of interest. Network...Performance measures - Metrics that provide quantitative or qualitative measures of the performance of systems or subsystems of interest. Performance Metric
Substantial Progress Yet Significant Opportunity for Improvement in Stroke Care in China.
Li, Zixiao; Wang, Chunjuan; Zhao, Xingquan; Liu, Liping; Wang, Chunxue; Li, Hao; Shen, Haipeng; Liang, Li; Bettger, Janet; Yang, Qing; Wang, David; Wang, Anxin; Pan, Yuesong; Jiang, Yong; Yang, Xiaomeng; Zhang, Changqing; Fonarow, Gregg C; Schwamm, Lee H; Hu, Bo; Peterson, Eric D; Xian, Ying; Wang, Yilong; Wang, Yongjun
2016-11-01
Stroke is a leading cause of death in China. Yet the adherence to guideline-recommended ischemic stroke performance metrics in the past decade has been previously shown to be suboptimal. Since then, several nationwide stroke quality management initiatives have been conducted in China. We sought to determine whether adherence had improved since then. Data were obtained from the 2 phases of China National Stroke Registries, which included 131 hospitals (12 173 patients with acute ischemic stroke) in China National Stroke Registries phase 1 from 2007 to 2008 versus 219 hospitals (19 604 patients) in China National Stroke Registries phase 2 from 2012 to 2013. Multiple regression models were developed to evaluate the difference in adherence to performance measures between the 2 study periods. The overall quality of care has improved over time, as reflected by the higher composite score of 0.76 in 2012 to 2013 versus 0.63 in 2007 to 2008. Nine of 13 individual performance metrics improved. However, there were no significant improvements in the rates of intravenous thrombolytic therapy and anticoagulation for atrial fibrillation. After multivariate analysis, there remained a significant 1.17-fold (95% confidence interval, 1.14-1.21) increase in the odds of delivering evidence-based performance metrics in the more recent time period versus the older data. The performance metrics with the most significantly increased odds included stroke education, dysphagia screening, smoking cessation, and antithrombotics at discharge. Adherence to stroke performance metrics has increased over time, but significant opportunities remain for further improvement. A continuous stroke quality improvement program should be developed as a national priority in China. © 2016 American Heart Association, Inc.
A New Metric for Land-Atmosphere Coupling Strength: Applications on Observations and Modeling
NASA Astrophysics Data System (ADS)
Tang, Q.; Xie, S.; Zhang, Y.; Phillips, T. J.; Santanello, J. A., Jr.; Cook, D. R.; Riihimaki, L.; Gaustad, K.
2017-12-01
A new metric is proposed to quantify the land-atmosphere (LA) coupling strength and is elaborated by correlating the surface evaporative fraction and impacting land and atmosphere variables (e.g., soil moisture, vegetation, and radiation). Based upon multiple linear regression, this approach simultaneously considers multiple factors and thus represents complex LA coupling mechanisms better than existing single variable metrics. The standardized regression coefficients quantify the relative contributions from individual drivers in a consistent manner, avoiding the potential inconsistency in relative influence of conventional metrics. Moreover, the unique expandable feature of the new method allows us to verify and explore potentially important coupling mechanisms. Our observation-based application of the new metric shows moderate coupling with large spatial variations at the U.S. Southern Great Plains. The relative importance of soil moisture vs. vegetation varies by location. We also show that LA coupling strength is generally underestimated by single variable methods due to their incompleteness. We also apply this new metric to evaluate the representation of LA coupling in the Accelerated Climate Modeling for Energy (ACME) V1 Contiguous United States (CONUS) regionally refined model (RRM). This work is performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-734201
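The idea of reading relative driver importance from standardized regression coefficients can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the synthetic data and the true coefficients (0.6, 0.3, 0.1) are all assumptions made for the example.

```python
import random
import statistics


def zscore(values):
    """Standardize a sequence to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def standardized_coefficients(predictors, response):
    """Least-squares coefficients after z-scoring every predictor and the response.

    No intercept is needed because all standardized variables have zero mean.
    """
    zs = [zscore(p) for p in predictors]
    zy = zscore(response)
    gram = [[sum(a * b for a, b in zip(zi, zj)) for zj in zs] for zi in zs]
    rhs = [sum(a * b for a, b in zip(zi, zy)) for zi in zs]
    return solve_linear_system(gram, rhs)


# Synthetic (hypothetical) data: evaporative fraction driven by three variables.
rng = random.Random(0)
n = 500
soil_moisture = [rng.gauss(0, 1) for _ in range(n)]
vegetation = [rng.gauss(0, 1) for _ in range(n)]
radiation = [rng.gauss(0, 1) for _ in range(n)]
evap_fraction = [
    0.6 * s + 0.3 * v + 0.1 * r + rng.gauss(0, 0.2)
    for s, v, r in zip(soil_moisture, vegetation, radiation)
]

coeffs = standardized_coefficients(
    [soil_moisture, vegetation, radiation], evap_fraction
)
print(dict(zip(["soil_moisture", "vegetation", "radiation"], coeffs)))
```

Because every variable is put on the same unit-variance scale, the coefficients can be compared directly, which is what makes the relative contributions of the drivers consistent across sites.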
Online kinematic regulation by visual feedback for grasp versus transport during reach-to-pinch
Nataraj, Raviraj; Pasluosta, Cristian; Li, Zong-Ming
2014-01-01
Purpose: This study investigated novel kinematic performance parameters to understand regulation by visual feedback (VF) of the reaching hand on the grasp and transport components during the reach-to-pinch maneuver. Conventional metrics often signify discrete movement features to postulate sensory-based control effects (e.g., time for maximum velocity to signify feedback delay). The presented metrics of this study were devised to characterize relative vision-based control of the sub-movements across the entire maneuver. Methods: Movement performance was assessed according to reduced variability and increased efficiency of kinematic trajectories. Variability was calculated as the standard deviation about the observed mean trajectory for a given subject and VF condition across kinematic derivatives for sub-movements of inter-pad grasp (distance between thumb and index finger-pads; relative orientation of finger-pads) and transport (distance traversed by wrist). A Markov analysis then examined the probabilistic effect of VF on which movement component exhibited higher variability over phases of the complete maneuver. Jerk-based metrics of smoothness (minimal jerk) and energy (integrated jerk-squared) were applied to indicate total movement efficiency with VF. Results/Discussion: The reductions in grasp variability metrics with VF were significantly greater (p<0.05) compared to transport for velocity, acceleration, and jerk, suggesting separate control pathways for each component. The Markov analysis indicated that VF preferentially regulates grasp over transport when continuous control is modeled probabilistically during the movement. Efficiency measures demonstrated VF to be more integral for early motor planning of grasp than transport in producing greater increases in smoothness and trajectory adjustments (i.e., jerk-energy) early compared to late in the movement cycle.
Conclusions: These findings demonstrate the greater regulation by VF on kinematic performance of grasp compared to transport and how particular features of this relativistic control occur continually over the maneuver. Utilizing the advanced performance metrics presented in this study facilitated characterization of VF effects continuously across the entire movement in corroborating the notion of separate control pathways for each component. PMID:24968371
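An integrated jerk-squared measure of the kind mentioned in the Methods can be sketched with finite differences. The abstract does not give the study's exact definitions, so the function names, the sampling step and the two test trajectories below are assumptions chosen only to illustrate the calculation.

```python
import math


def finite_difference(samples, dt):
    """Forward finite-difference derivative of uniformly sampled values."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def integrated_squared_jerk(position, dt):
    """Approximate the time integral of jerk squared for a sampled 1-D position."""
    jerk = position
    for _ in range(3):  # position -> velocity -> acceleration -> jerk
        jerk = finite_difference(jerk, dt)
    return sum(j * j for j in jerk) * dt


dt = 0.001
times = [i * dt for i in range(1001)]

# Classic minimum-jerk profile for a unit-distance, unit-duration reach.
smooth = [10 * t**3 - 15 * t**4 + 6 * t**5 for t in times]
# Same reach with a small 8 Hz oscillation superimposed (hypothetical tremor).
shaky = [x + 0.01 * math.sin(2 * math.pi * 8 * t) for x, t in zip(smooth, times)]

smooth_cost = integrated_squared_jerk(smooth, dt)
shaky_cost = integrated_squared_jerk(shaky, dt)
print(smooth_cost, shaky_cost)
```

A lower value indicates a smoother, more efficient trajectory; even a tremor that is barely visible in the position trace dominates the jerk-based cost because differentiation amplifies high-frequency content.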
Tang, Junqing; Heinimann, Hans Rudolf
2018-01-01
Traffic congestion brings not only delay and inconvenience, but also associated national concerns, such as greenhouse gases, air pollutants, road safety issues and risks. Identification, measurement, tracking, and control of urban recurrent congestion are vital for building a livable and smart community. A considerable body of work has contributed to tackling the problem. Several methods, such as time-based approaches and level of service, can be effective for characterizing congestion on urban streets. However, few studies have taken a systemic perspective on congestion quantification. Resilience, on the other hand, is an emerging concept that focuses on comprehensive systemic performance and characterizes the ability of a system to cope with disturbance and to recover its functionality. In this paper, we represented recurrent congestion as an internal disturbance and proposed a modified metric inspired by the well-applied "R4" resilience-triangle framework. We constructed the metric with generic dimensions from both resilience engineering and transport science to quantify recurrent congestion based on spatial-temporal traffic patterns and compared it with two other approaches in freeway and signal-controlled arterial cases. Results showed that the metric can effectively capture congestion patterns in the study area and provides a quantitative benchmark for comparison. The results also suggested that the proposed metric performs well comparatively in measuring strength and can account for the discharging process in congestion. The sensitivity tests showed that the proposed metric is robust against parameter perturbation within the Robustness Range (RR), but the number of identified congestion patterns can be influenced by the existence of ϵ. In addition, the Elasticity Threshold (ET) and the spatial dimension of the cell-based platform significantly affect both the number and the intensity of detected congestion.
By tackling this conventional problem with an emerging concept, our metric provides a systemic alternative approach and enriches the toolbox for congestion assessment. Future work will be conducted on a larger scale with multiplex scenarios in various traffic conditions.
Quality of Information Approach to Improving Source Selection in Tactical Networks
2017-02-01
consider the performance of this process based on metrics relating to quality of information: accuracy, timeliness, completeness and reliability. These...that are indicators that the network is meeting these quality requirements. We study effective data rate, social distance, link integrity and the...utility of information as metrics within a multi-genre network to determine the quality of information of its available sources. This paper proposes a
An optimization-based framework for anisotropic simplex mesh adaptation
NASA Astrophysics Data System (ADS)
Yano, Masayuki; Darmofal, David L.
2012-09-01
We present a general framework for anisotropic h-adaptation of simplex meshes. Given a discretization and any element-wise, localizable error estimate, our adaptive method iterates toward a mesh that minimizes error for a given number of degrees of freedom. Utilizing mesh-metric duality, we consider a continuous optimization problem of the Riemannian metric tensor field that provides an anisotropic description of element sizes. First, our method performs a series of local solves to survey the behavior of the local error function. This information is then synthesized using an affine-invariant tensor manipulation framework to reconstruct an approximate gradient of the error function with respect to the metric tensor field. Finally, we perform gradient descent in the metric space to drive the mesh toward optimality. The method is first demonstrated to produce optimal anisotropic meshes minimizing the L2 projection error for a pair of canonical problems containing a singularity and a singular perturbation. The effectiveness of the framework is then demonstrated in the context of output-based adaptation for the advection-diffusion equation using a high-order discontinuous Galerkin discretization and the dual-weighted residual (DWR) error estimate. The method presented provides a unified framework for optimizing both the element size and anisotropy distribution using an a posteriori error estimate and enables efficient adaptation of anisotropic simplex meshes for high-order discretizations.
Methodology, Methods, and Metrics for Testing and Evaluating Augmented Cognition Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greitzer, Frank L.
The augmented cognition research community seeks cognitive neuroscience-based solutions to improve warfighter performance by applying and managing mitigation strategies to reduce workload and improve the throughput and quality of decisions. The focus of augmented cognition mitigation research is to define, demonstrate, and exploit neuroscience and behavioral measures that support inferences about the warfighter’s cognitive state that prescribe the nature and timing of mitigation. A research challenge is to develop valid evaluation methodologies, metrics and measures to assess the impact of augmented cognition mitigations. Two considerations are external validity, which is the extent to which the results apply to operational contexts; and internal validity, which reflects the reliability of performance measures and the conclusions based on analysis of results. The scientific rigor of the research methodology employed in conducting empirical investigations largely affects the validity of the findings. External validity requirements also compel us to demonstrate operational significance of mitigations. Thus it is important to demonstrate effectiveness of mitigations under specific conditions. This chapter reviews some cognitive science and methodological considerations in designing augmented cognition research studies and associated human performance metrics and analysis methods to assess the impact of augmented cognition mitigations.
Kumar, B. Vinodh; Mohan, Thuthi
2018-01-01
OBJECTIVE: Six Sigma is one of the most popular quality management system tools employed for process improvement. The Six Sigma methods are usually applied when the outcome of the process can be measured. This study was done to assess the performance of individual biochemical parameters on a Sigma Scale by calculating the sigma metrics for individual parameters and to follow the Westgard guidelines for appropriate Westgard rules and levels of internal quality control (IQC) that need to be processed to improve target analyte performance based on the sigma metrics. MATERIALS AND METHODS: This is a retrospective study, and data required for the study were extracted between July 2015 and June 2016 from a Secondary Care Government Hospital, Chennai. The data obtained for the study are IQC - coefficient of variation percentage and External Quality Assurance Scheme (EQAS) - Bias% for 16 biochemical parameters. RESULTS: For the level 1 IQC, four analytes (alkaline phosphatase, magnesium, triglyceride, and high-density lipoprotein-cholesterol) showed an ideal performance of ≥6 sigma level, and five analytes (urea, total bilirubin, albumin, cholesterol, and potassium) showed an average performance of <3 sigma level. For the level 2 IQC, the same four analytes as in level 1 showed a performance of ≥6 sigma level, and four analytes (urea, albumin, cholesterol, and potassium) showed an average performance of <3 sigma level. For all analytes <6 sigma level, the quality goal index (QGI) was <0.8, indicating the area requiring improvement to be imprecision, except cholesterol, whose QGI >1.2 indicated inaccuracy. CONCLUSION: This study shows that sigma metrics are a good quality tool to assess the analytical performance of a clinical chemistry laboratory. Thus, sigma metric analysis provides a benchmark for the laboratory to design a protocol for IQC, address poor assay performance, and assess the efficiency of existing laboratory processes. PMID:29692587
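The sigma metric and quality goal index used in this kind of analysis are commonly computed from the CV% and Bias% described above together with a total allowable error (TEa%). The abstract does not state its TEa source or formulas, so the conventional definitions below, the function names and the example numbers are assumptions for illustration only.

```python
import math


def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Conventional sigma metric: (TEa% - |Bias%|) / CV%."""
    return (tea_pct - abs(bias_pct)) / cv_pct


def quality_goal_index(bias_pct, cv_pct):
    """Conventional quality goal index: |Bias%| / (1.5 * CV%)."""
    return abs(bias_pct) / (1.5 * cv_pct)


def qgi_problem(qgi):
    """Interpret the QGI with the 0.8 and 1.2 cut-offs quoted in the abstract."""
    if qgi < 0.8:
        return "imprecision"
    if qgi > 1.2:
        return "inaccuracy"
    return "imprecision and inaccuracy"


# Hypothetical analytes: (TEa%, Bias%, CV%); not data from the study.
examples = {
    "analyte_a": (10.0, 2.0, 4.0),
    "analyte_b": (10.0, 6.0, 2.0),
}
report = {}
for name, (tea, bias, cv) in examples.items():
    sigma = sigma_metric(tea, bias, cv)
    qgi = quality_goal_index(bias, cv)
    report[name] = (sigma, qgi, qgi_problem(qgi))
    print(name, round(sigma, 2), round(qgi, 2), qgi_problem(qgi))
```

Both hypothetical analytes land at the same sigma level, yet the QGI points to different remedies: the first needs better precision, the second needs its bias corrected.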
DOE Office of Scientific and Technical Information (OSTI.GOV)
Desai, V; Labby, Z; Culberson, W
Purpose: To determine whether body site-specific treatment plans form unique “plan class” clusters in a multi-dimensional analysis of plan complexity metrics such that a single beam quality correction determined for a representative plan could be universally applied within the “plan class”, thereby increasing the dosimetric accuracy of a detector’s response within a subset of similarly modulated nonstandard deliveries. Methods: We collected 95 clinical volumetric modulated arc therapy (VMAT) plans from four body sites (brain, lung, prostate, and spine). The lung data was further subdivided into SBRT and non-SBRT data for a total of five plan classes. For each control point in each plan, a variety of aperture-based complexity metrics were calculated and stored as unique characteristics of each patient plan. A multiple comparison of means analysis was performed such that every plan class was compared to every other plan class for every complexity metric in order to determine which groups could be considered different from one another. Statistical significance was assessed after correcting for multiple hypothesis testing. Results: Six out of a possible 10 pairwise plan class comparisons were uniquely distinguished based on at least nine out of 14 of the proposed metrics (Brain/Lung, Brain/SBRT lung, Lung/Prostate, Lung/SBRT Lung, Lung/Spine, Prostate/SBRT Lung). Eight out of 14 of the complexity metrics could distinguish at least six out of the possible 10 pairwise plan class comparisons. Conclusion: Aperture-based complexity metrics could prove to be useful tools to quantitatively describe a distinct class of treatment plans. Certain plan-averaged complexity metrics could be considered unique characteristics of a particular plan.
A new approach to generating plan-class specific reference (pcsr) fields could be established through a targeted preservation of select complexity metrics or a clustering algorithm that identifies plans exhibiting similar modulation characteristics. Measurements and simulations will better elucidate potential plan-class specific dosimetry correction factors.
A Classification Scheme for Smart Manufacturing Systems’ Performance Metrics
Lee, Y. Tina; Kumaraguru, Senthilkumaran; Jain, Sanjay; Robinson, Stefanie; Helu, Moneer; Hatim, Qais Y.; Rachuri, Sudarsan; Dornfeld, David; Saldana, Christopher J.; Kumara, Soundar
2017-01-01
This paper proposes a classification scheme for performance metrics for smart manufacturing systems. The discussion focuses on three such metrics: agility, asset utilization, and sustainability. For each of these metrics, we discuss classification themes, which we then use to develop a generalized classification scheme. In addition to the themes, we discuss a conceptual model that may form the basis for the information necessary for performance evaluations. Finally, we present future challenges in developing robust, performance-measurement systems for real-time, data-intensive enterprises. PMID:28785744
Evaluation of Vehicle-Based Crash Severity Metrics.
Tsoi, Ada H; Gabler, Hampton C
2015-01-01
Vehicle change in velocity (delta-v) is a widely used crash severity metric used to estimate occupant injury risk. Despite its widespread use, delta-v has several limitations. Of most concern, delta-v is a vehicle-based metric which does not consider the crash pulse or the performance of occupant restraints, e.g. seatbelts and airbags. Such criticisms have prompted the search for alternative impact severity metrics based upon vehicle kinematics. The purpose of this study was to assess the ability of the occupant impact velocity (OIV), acceleration severity index (ASI), vehicle pulse index (VPI), and maximum delta-v (delta-v) to predict serious injury in real world crashes. The study was based on the analysis of event data recorders (EDRs) downloaded from the National Automotive Sampling System / Crashworthiness Data System (NASS-CDS) 2000-2013 cases. All vehicles in the sample were GM passenger cars and light trucks involved in a frontal collision. Rollover crashes were excluded. Vehicles were restricted to single-event crashes that caused an airbag deployment. All EDR data were checked for a successful, completed recording of the event and that the crash pulse was complete. The maximum abbreviated injury scale (MAIS) was used to describe occupant injury outcome. Drivers were categorized into either non-seriously injured group (MAIS2-) or seriously injured group (MAIS3+), based on the severity of any injuries to the thorax, abdomen, and spine. ASI and OIV were calculated according to the Manual for Assessing Safety Hardware. VPI was calculated according to ISO/TR 12353-3, with vehicle-specific parameters determined from U.S. New Car Assessment Program crash tests. Using binary logistic regression, the cumulative probability of injury risk was determined for each metric and assessed for statistical significance, goodness-of-fit, and prediction accuracy. The dataset included 102,744 vehicles. 
A Wald chi-square test showed each vehicle-based crash severity metric estimate to be a significant predictor in the model (p < 0.05). For the belted drivers, both OIV and VPI were significantly better predictors of serious injury than delta-v (p < 0.05). For the unbelted drivers, there was no statistically significant difference between delta-v, OIV, VPI, and ASI. The broad findings of this study suggest it is feasible to improve injury prediction if we consider adding restraint performance to classic measures, e.g. delta-v. Applications, such as advanced automatic crash notification, should consider the use of different metrics for belted versus unbelted occupants.
NASA Astrophysics Data System (ADS)
Boé, Julien; Terray, Laurent
2014-05-01
Ensemble approaches for climate change projections have become ubiquitous. Because of large model-to-model variations and, generally, lack of rationale for the choice of a particular climate model against others, it is widely accepted that future climate change and its impacts should not be estimated based on a single climate model. Generally, as a default approach, the multi-model ensemble mean (MMEM) is considered to provide the best estimate of climate change signals. The MMEM approach is based on the implicit hypothesis that all the models provide equally credible projections of future climate change. This hypothesis is unlikely to be true and ideally one would want to give more weight to more realistic models. A major issue with this alternative approach lies in the assessment of the relative credibility of future climate projections from different climate models, as they can only be evaluated against present-day observations: which present-day metric(s) should be used to decide which models are "good" and which models are "bad" in the future climate? Once a supposedly informative metric has been found, other issues arise. What is the best statistical method to combine multiple model results taking into account their relative credibility measured by a given metric? How to be sure in the end that the metric-based estimate of future climate change is not in fact less realistic than the MMEM? It is impossible to provide strict answers to those questions in the climate change context. Yet, in this presentation, we propose a methodological approach based on a perfect model framework that could bring some useful elements of answer to the questions previously mentioned. The basic idea is to take a random climate model in the ensemble and treat it as if it were the truth (results of this model, in both past and future climate, are called "synthetic observations").
Then, all the other members from the multi-model ensemble are used to derive, through a metric-based approach, a posterior estimate of climate change, based on the synthetic observation of the metric. Finally, it is possible to compare the posterior estimate to the synthetic observation of future climate change to evaluate the skill of the method. The main objective of this presentation is to describe and apply this perfect model framework to test different methodological issues associated with non-uniform model weighting and similar metric-based approaches. The methodology presented is general, but will be applied to the specific case of summer temperature change in France, for which previous works have suggested potentially useful metrics associated with soil-atmosphere and cloud-temperature interactions. The relative performances of different simple statistical approaches to combine multiple model results based on metrics will be tested. The impact of ensemble size, observational errors, internal variability, and model similarity will be characterized. The potential improvements associated with metric-based approaches compared to the MMEM in terms of errors and uncertainties will be quantified.
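The perfect model framework described above can be sketched as a leave-one-out loop over a toy ensemble. Everything specific here is an assumption made for illustration: the Gaussian weighting kernel, the bandwidth, the ensemble size and the synthetic relation between the present-day metric and the future change are not taken from the presentation.

```python
import math
import random


def perfect_model_test(metric, change, bandwidth):
    """Leave-one-out test in which each model in turn plays the role of the truth.

    Returns (rmse_weighted, rmse_mmem): errors of a metric-weighted estimate and
    of the plain multi-model ensemble mean against the synthetic observations.
    """
    n = len(metric)
    sq_err_weighted = 0.0
    sq_err_mmem = 0.0
    for truth in range(n):
        others = [i for i in range(n) if i != truth]
        # Weight each remaining model by how close its present-day metric is
        # to the synthetic observation of that metric.
        weights = [
            math.exp(-0.5 * ((metric[i] - metric[truth]) / bandwidth) ** 2)
            for i in others
        ]
        weighted = sum(w * change[i] for w, i in zip(weights, others)) / sum(weights)
        mmem = sum(change[i] for i in others) / len(others)
        sq_err_weighted += (weighted - change[truth]) ** 2
        sq_err_mmem += (mmem - change[truth]) ** 2
    return math.sqrt(sq_err_weighted / n), math.sqrt(sq_err_mmem / n)


# Toy ensemble of 20 models whose future change depends on the present-day metric.
rng = random.Random(42)
metric = [rng.gauss(0.0, 1.0) for _ in range(20)]
change = [2.0 + 1.5 * m + rng.gauss(0.0, 0.2) for m in metric]

rmse_weighted, rmse_mmem = perfect_model_test(metric, change, bandwidth=0.5)
print(rmse_weighted, rmse_mmem)
```

In this toy setting the metric is informative by construction, so weighting beats the ensemble mean; rerunning with a `change` that is independent of `metric` shows the opposite risk that the abstract raises, where a metric-based estimate is no better than, or worse than, the MMEM.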
Long-term & short-term measures of roadway snow and ice control performance : final report.
DOT National Transportation Integrated Search
2016-07-08
The primary performance measures for RSIC programs by state DOTs are 1) operating speed recovery time and 2) time to achieve bare pavement. There is a continued need for objective, outcome-based performance metrics for RSIC operations. The overall go...
The metric on field space, functional renormalization, and metric–torsion quantum gravity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reuter, Martin, E-mail: reuter@thep.physik.uni-mainz.de; Schollmeyer, Gregor M., E-mail: schollmeyer@thep.physik.uni-mainz.de
Searching for new non-perturbatively renormalizable quantum gravity theories, functional renormalization group (RG) flows are studied on a theory space of action functionals depending on the metric and the torsion tensor, the latter parameterized by three irreducible component fields. A detailed comparison with Quantum Einstein–Cartan Gravity (QECG), Quantum Einstein Gravity (QEG), and “tetrad-only” gravity, all based on different theory spaces, is performed. It is demonstrated that, over a generic theory space, the construction of a functional RG equation (FRGE) for the effective average action requires the specification of a metric on the infinite-dimensional field manifold as an additional input. A modified FRGE is obtained if this metric is scale-dependent, as it happens in the metric–torsion system considered.
Visible contrast energy metrics for detection and discrimination
NASA Astrophysics Data System (ADS)
Ahumada, Albert J.; Watson, Andrew B.
2013-03-01
Contrast energy was proposed by Watson, Barlow, and Robson (Science, 1983) as a useful metric for representing luminance contrast target stimuli because it represents the detectability of the stimulus in photon noise for an ideal observer. We propose here the use of visible contrast energy metrics for detection and discrimination among static luminance patterns. The visibility is approximated with spatial frequency sensitivity weighting and eccentricity sensitivity weighting. The suggested weighting functions revise the Standard Spatial Observer (Watson and Ahumada, J. Vision, 2005) for luminance contrast detection, extend it into the near periphery, and provide compensation for duration. Under the assumption that the detection is limited only by internal noise, both detection and discrimination performance can be predicted by metrics based on the visible energy of the difference images.
NASA Astrophysics Data System (ADS)
Schunert, Sebastian
In this work we develop a quantitative decision metric for spatial discretization methods of the SN equations. The quantitative decision metric utilizes performance data from selected test problems for computing a fitness score that is used for the selection of the most suitable discretization method for a particular SN transport application. The fitness score is aggregated as a weighted geometric mean of single performance indicators representing various performance aspects relevant to the user. Thus, the fitness function can be adjusted to the particular needs of the code practitioner by adding/removing single performance indicators or changing their importance via the supplied weights. Within this work a special, broad class of methods is considered, referred to as nodal methods. This class is naturally comprised of the DGFEM methods of all function space families. Within this work it is also shown that the Higher Order Diamond Difference (HODD) method is a nodal method. Building on earlier findings that the Arbitrarily High Order Method of the Nodal type (AHOTN) is also a nodal method, a generalized finite-element framework is created to yield as special cases various methods that were developed independently using profoundly different formalisms. A selection of test problems, each related to a certain performance aspect, is considered: a Method of Manufactured Solutions (MMS) test suite for assessing accuracy and execution time, Lathrop's test problem for assessing resilience against occurrence of negative fluxes, and a simple, homogeneous cube test problem to verify if a method possesses the thick diffusive limit. The contending methods are implemented as efficiently as possible under a common SN transport code framework to level the playing field for a fair comparison of their computational load.
Numerical results are presented for all three test problems and a qualitative rating of each method's performance is provided for each aspect: accuracy/efficiency, resilience against negative fluxes, and possession of the thick diffusion limit, separately. The choice of the most efficient method depends on the utilized error norm: in Lp error norms higher order methods such as the AHOTN method of order three perform best, while for computing integral quantities the linear nodal (LN) method is most efficient. The most resilient method against occurrence of negative fluxes is the simple corner balance (SCB) method. A validation of the quantitative decision metric is performed based on the NEA box-in-box suite of test problems. The validation exercise comprises two stages: first prediction of the contending methods' performance via the decision metric and second computing the actual scores based on data obtained from the NEA benchmark problem. The comparison of predicted and actual scores via a penalty function (ratio of predicted best performer's score to actual best score) completes the validation exercise. It is found that the decision metric is capable of very accurate predictions (penalty < 10%) in more than 83% of the considered cases and features penalties up to 20% for the remaining cases. An exception to this rule is the third test case NEA-III intentionally set up to incorporate a poor match of the benchmark with the "data" problems. However, even under these worst case conditions the decision metric's suggestions are never detrimental. Suggestions for improving the decision metric's accuracy are to increase the pool of employed data, to refine the mapping of a given configuration to a case in the database, and to better characterize the desired target quantities.
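The aggregation of single performance indicators into a fitness score by a weighted geometric mean can be sketched as below. The indicator names, their values and the weights are hypothetical; the work's actual indicators and normalization are not given in the abstract.

```python
import math


def fitness_score(indicators, weights):
    """Weighted geometric mean of strictly positive performance indicators.

    indicators and weights are dicts keyed by indicator name; only indicators
    that have a weight contribute, so dropping a key removes that aspect.
    """
    total_weight = sum(weights.values())
    log_sum = sum(w * math.log(indicators[name]) for name, w in weights.items())
    return math.exp(log_sum / total_weight)


# Hypothetical normalized indicators (1.0 = best) for two discretization methods.
method_a = {"accuracy": 0.9, "speed": 0.5, "positivity": 0.8}
method_b = {"accuracy": 0.6, "speed": 0.9, "positivity": 0.9}

# A user who mostly cares about accuracy, and one who mostly cares about speed.
accuracy_first = {"accuracy": 3.0, "speed": 1.0, "positivity": 1.0}
speed_first = {"accuracy": 1.0, "speed": 3.0, "positivity": 1.0}

score_a_acc = fitness_score(method_a, accuracy_first)
score_b_acc = fitness_score(method_b, accuracy_first)
score_a_spd = fitness_score(method_a, speed_first)
score_b_spd = fitness_score(method_b, speed_first)
print(score_a_acc, score_b_acc, score_a_spd, score_b_spd)
```

Changing only the weights flips which method scores higher, which mirrors how the fitness function is meant to be tuned to a practitioner's priorities. A geometric mean also penalizes a method that is very poor in any one weighted aspect more strongly than an arithmetic mean would.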
Decomposition-based transfer distance metric learning for image classification.
Luo, Yong; Liu, Tongliang; Tao, Dacheng; Xu, Chao
2014-09-01
Distance metric learning (DML) is a critical factor for image analysis and pattern recognition. To learn a robust distance metric for a target task, we need abundant side information (i.e., the similarity/dissimilarity pairwise constraints over the labeled data), which is usually unavailable in practice due to the high labeling cost. This paper considers the transfer learning setting by exploiting the large quantity of side information from certain related, but different source tasks to help with target metric learning (with only a little side information). The state-of-the-art metric learning algorithms usually fail in this setting because the data distributions of the source task and target task are often quite different. We address this problem by assuming that the target distance metric lies in the space spanned by the eigenvectors of the source metrics (or other randomly generated bases). The target metric is represented as a combination of the base metrics, which are computed using the decomposed components of the source metrics (or simply a set of random bases); we call the proposed method decomposition-based transfer DML (DTDML). In particular, DTDML learns a sparse combination of the base metrics to construct the target metric by forcing the target metric to be close to an integration of the source metrics. The main advantage of the proposed method compared with existing transfer metric learning approaches is that we directly learn the base metric coefficients instead of the target metric. As a result, far fewer variables need to be learned. We therefore obtain more reliable solutions given the limited side information, and the optimization tends to be faster. Experiments on the popular handwritten image (digit, letter) classification and challenging natural image annotation tasks demonstrate the effectiveness of the proposed method.
Evaluating Modeled Impact Metrics for Human Health, Agriculture Growth, and Near-Term Climate
NASA Astrophysics Data System (ADS)
Seltzer, K. M.; Shindell, D. T.; Faluvegi, G.; Murray, L. T.
2017-12-01
Simulated metrics that assess impacts on human health, agriculture growth, and near-term climate were evaluated using ground-based and satellite observations. The NASA GISS ModelE2 and GEOS-Chem models were used to simulate the near-present chemistry of the atmosphere. A suite of simulations that varied by model, meteorology, horizontal resolution, emissions inventory, and emissions year was performed, enabling an analysis of metric sensitivities to various model components. All simulations utilized consistent anthropogenic global emissions inventories (ECLIPSE V5a or CEDS), and an evaluation of simulated results was carried out for 2004-2006 and 2009-2011 over the United States and 2014-2015 over China. Results for O3- and PM2.5-based metrics featured minor differences due to the model resolutions considered here (2.0° × 2.5° and 0.5° × 0.666°), and model, meteorology, and emissions inventory each played larger roles in variances. Surface metrics related to O3 were consistently high biased, though to varying degrees, demonstrating the need to evaluate particular modeling frameworks before O3 impacts are quantified. Surface metrics related to PM2.5 were diverse, indicating that a multimodel mean with robust results is a valuable tool in predicting PM2.5-related impacts. Oftentimes, the configuration that captured the change of a metric best over time differed from the configuration that captured the magnitude of the same metric best, demonstrating the challenge in skillfully simulating impacts. These results highlight the strengths and weaknesses of these models in simulating impact metrics related to air quality and near-term climate. With such information, the reliability of historical and future simulations can be better understood.
NASA Astrophysics Data System (ADS)
Khobragade, P.; Fan, Jiahua; Rupcich, Franco; Crotty, Dominic J.; Gilat Schmidt, Taly
2016-03-01
This study quantitatively evaluated the performance of the exponential transformation of the free-response operating characteristic curve (EFROC) metric, with the Channelized Hotelling Observer (CHO) as a reference. The CHO has been used for image quality assessment of reconstruction algorithms and imaging systems and often it is applied to study the signal-location-known cases. The CHO also requires a large set of images to estimate the covariance matrix. In terms of clinical applications, this assumption and requirement may be unrealistic. The newly developed location-unknown EFROC detectability metric is estimated from the confidence scores reported by a model observer. Unlike the CHO, EFROC does not require a channelization step and is a non-parametric detectability metric. There are few quantitative studies available on application of the EFROC metric, most of which are based on simulation data. This study investigated the EFROC metric using experimental CT data. A phantom with four low contrast objects, 3 mm (14 HU), 5 mm (7 HU), 7 mm (5 HU), and 10 mm (3 HU), was scanned at dose levels ranging from 25 mAs to 270 mAs and reconstructed using filtered backprojection. The area under the curve values for CHO (AUC) and EFROC (AFE) were plotted with respect to different dose levels. The number of images required to estimate the non-parametric AFE metric was calculated for varying tasks and found to be less than the number of images required for parametric CHO estimation. The AFE metric was found to be more sensitive to changes in dose than the CHO metric. This increased sensitivity and the assumption of unknown signal location may be useful for investigating and optimizing CT imaging methods. Future work is required to validate the AFE metric against human observers.
Jarc, Anthony M; Curet, Myriam J
2017-03-01
Effective visualization of the operative field is vital to surgical safety and education. However, additional metrics for visualization are needed to complement other common measures of surgeon proficiency, such as time or errors. Unlike other surgical modalities, robot-assisted minimally invasive surgery (RAMIS) enables data-driven feedback to trainees through measurement of camera adjustments. The purpose of this study was to validate and quantify the importance of novel camera metrics during RAMIS. New (n = 18), intermediate (n = 8), and experienced (n = 13) surgeons completed 25 virtual reality simulation exercises on the da Vinci Surgical System. Three camera metrics were computed for all exercises and compared to conventional efficiency measures. Both camera metrics and efficiency metrics showed construct validity (p < 0.05) across most exercises (camera movement frequency 23/25, camera movement duration 22/25, camera movement interval 19/25, overall score 24/25, completion time 25/25). Camera metrics differentiated new and experienced surgeons across all tasks as well as efficiency metrics. Finally, camera metrics significantly (p < 0.05) correlated with completion time (camera movement frequency 21/25, camera movement duration 21/25, camera movement interval 20/25) and overall score (camera movement frequency 20/25, camera movement duration 19/25, camera movement interval 20/25) for most exercises. We demonstrate construct validity of novel camera metrics and correlation between camera metrics and efficiency metrics across many simulation exercises. We believe camera metrics could be used to improve RAMIS proficiency-based curricula.
Kelbe, David; Oak Ridge National Lab.; van Aardt, Jan; ...
2016-10-18
Terrestrial laser scanning has demonstrated increasing potential for rapid comprehensive measurement of forest structure, especially when multiple scans are spatially registered in order to reduce the limitations of occlusion. Although marker-based registration techniques (based on retro-reflective spherical targets) are commonly used in practice, a blind marker-free approach is preferable, insofar as it supports rapid operational data acquisition. To support these efforts, we extend the pairwise registration approach of our earlier work, and develop a graph-theoretical framework to perform blind marker-free global registration of multiple point cloud data sets. Pairwise pose estimates are weighted based on their estimated error, in order to overcome pose conflict while exploiting redundant information and improving precision. The proposed approach was tested for eight diverse New England forest sites, with 25 scans collected at each site. Quantitative assessment was provided via a novel embedded confidence metric, with a mean estimated root-mean-square error of 7.2 cm and 89% of scans connected to the reference node. Lastly, this paper assesses the validity of the embedded multiview registration confidence metric and evaluates the performance of the proposed registration algorithm.
NASA Technical Reports Server (NTRS)
Shim, J. S.; Kuznetsova, M.; Rastatter, L.; Hesse, M.; Bilitza, D.; Butala, M.; Codrescu, M.; Emery, B.; Foster, B.; Fuller-Rowell, T.;
2011-01-01
Objective quantification of model performance based on metrics helps us evaluate the current state of space physics modeling capability, address differences among various modeling approaches, and track model improvements over time. The Coupling, Energetics, and Dynamics of Atmospheric Regions (CEDAR) Electrodynamics Thermosphere Ionosphere (ETI) Challenge was initiated in 2009 to assess accuracy of various ionosphere/thermosphere models in reproducing ionosphere and thermosphere parameters. A total of nine events and five physical parameters were selected to compare between model outputs and observations. The nine events included two strong and one moderate geomagnetic storm events from GEM Challenge events and three moderate storms and three quiet periods from the first half of the International Polar Year (IPY) campaign, which lasted for 2 years, from March 2007 to March 2009. The five physical parameters selected were NmF2 and hmF2 from ISRs and LEO satellites such as CHAMP and COSMIC, vertical drifts at Jicamarca, and electron and neutral densities along the track of the CHAMP satellite. For this study, four different metrics and up to 10 models were used. In this paper, we focus on preliminary results of the study using ground-based measurements, which include NmF2 and hmF2 from Incoherent Scatter Radars (ISRs), and vertical drifts at Jicamarca. The results show that the model performance strongly depends on the type of metrics used, and thus no model is ranked top for all used metrics. The analysis further indicates that performance of the model also varies with latitude and geomagnetic activity level.
Johnson, Robin R.; Stone, Bradly T.; Miranda, Carrie M.; Vila, Bryan; James, Lois; James, Stephen M.; Rubio, Roberto F.; Berka, Chris
2014-01-01
Objective: To demonstrate that psychophysiology may have applications for objective assessment of expertise development in deadly force judgment and decision making (DFJDM). Background: Modern training techniques focus on improving decision-making skills with participative assessment between trainees and subject matter experts primarily through subjective observation. Objective metrics need to be developed. The current proof of concept study explored the potential for psychophysiological metrics in deadly force judgment contexts. Method: Twenty-four participants (novice, expert) were recruited. All wore a wireless Electroencephalography (EEG) device to collect psychophysiological data during high-fidelity simulated deadly force judgment and decision-making scenarios using a modified Glock firearm. Participants were exposed to 27 video scenarios, one-third of which would have justified use of deadly force. Pass/fail was determined by whether the participant used deadly force appropriately. Results: Experts had a significantly higher pass rate compared to novices (p < 0.05). Multiple metrics were shown to distinguish novices from experts. Hierarchical regression analyses indicate that psychophysiological variables are able to explain 72% of the variability in expert performance, but only 37% in novices. Discriminant function analysis (DFA) using psychophysiological metrics was able to discern between experts and novices with 72.6% accuracy. Conclusion: While limited due to small sample size, the results suggest that psychophysiology may be developed for use as an objective measure of expertise in DFJDM. Specifically, discriminant function measures may have the potential to objectively identify expert skill acquisition. Application: Psychophysiological metrics may create a performance model with the potential to optimize simulator-based DFJDM training.
These performance models could be used for trainee feedback, and/or by the instructor to assess performance objectively. PMID:25100966
NASA Astrophysics Data System (ADS)
De Luccia, Frank J.; Houchin, Scott; Porter, Brian C.; Graybill, Justin; Haas, Evan; Johnson, Patrick D.; Isaacson, Peter J.; Reth, Alan D.
2016-05-01
The GOES-R Flight Project has developed an Image Navigation and Registration (INR) Performance Assessment Tool Set (IPATS) for measuring Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) INR performance metrics in the post-launch period for performance evaluation and long term monitoring. For ABI, these metrics are the 3-sigma errors in navigation (NAV), channel-to-channel registration (CCR), frame-to-frame registration (FFR), swath-to-swath registration (SSR), and within frame registration (WIFR) for the Level 1B image products. For GLM, the single metric of interest is the 3-sigma error in the navigation of background images (GLM NAV) used by the system to navigate lightning strikes. 3-sigma errors are estimates of the 99.73rd percentile of the errors accumulated over a 24 hour data collection period. IPATS utilizes a modular algorithmic design to allow user selection of data processing sequences optimized for generation of each INR metric. This novel modular approach minimizes duplication of common processing elements, thereby maximizing code efficiency and speed. Fast processing is essential given the large number of sub-image registrations required to generate INR metrics for the many images produced over a 24 hour evaluation period. Another aspect of the IPATS design that vastly reduces execution time is the off-line propagation of Landsat based truth images to the fixed grid coordinates system for each of the three GOES-R satellite locations, operational East and West and initial checkout locations. This paper describes the algorithmic design and implementation of IPATS and provides preliminary test results.
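The 3-sigma definition above (the 99.73rd percentile of accumulated errors) can be sketched as follows. This is only an illustration, not the IPATS implementation: taking the absolute value of each error and using linear interpolation between order statistics are assumptions, and the error values are synthetic.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (q / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def three_sigma_error(errors):
    """Estimate the 3-sigma error as the 99.73rd percentile of |error|.

    Using magnitudes is an assumption made for this sketch.
    """
    return percentile([abs(e) for e in errors], 99.73)


# Synthetic registration errors (arbitrary units) with alternating sign,
# standing in for errors accumulated over one collection period.
errors = [(-1) ** i * i / 1000.0 for i in range(10001)]
estimate = three_sigma_error(errors)
print(estimate)
```

For normally distributed errors, 99.73% of values fall within three standard deviations of the mean, which is why this percentile is referred to as a 3-sigma error.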
NASA Technical Reports Server (NTRS)
DeLuccia, Frank J.; Houchin, Scott; Porter, Brian C.; Graybill, Justin; Haas, Evan; Johnson, Patrick D.; Isaacson, Peter J.; Reth, Alan D.
2016-01-01
The GOES-R Flight Project has developed an Image Navigation and Registration (INR) Performance Assessment Tool Set (IPATS) for measuring Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) INR performance metrics in the post-launch period for performance evaluation and long term monitoring. For ABI, these metrics are the 3-sigma errors in navigation (NAV), channel-to-channel registration (CCR), frame-to-frame registration (FFR), swath-to-swath registration (SSR), and within frame registration (WIFR) for the Level 1B image products. For GLM, the single metric of interest is the 3-sigma error in the navigation of background images (GLM NAV) used by the system to navigate lightning strikes. 3-sigma errors are estimates of the 99.73rd percentile of the errors accumulated over a 24 hour data collection period. IPATS utilizes a modular algorithmic design to allow user selection of data processing sequences optimized for generation of each INR metric. This novel modular approach minimizes duplication of common processing elements, thereby maximizing code efficiency and speed. Fast processing is essential given the large number of sub-image registrations required to generate INR metrics for the many images produced over a 24 hour evaluation period. Another aspect of the IPATS design that vastly reduces execution time is the off-line propagation of Landsat based truth images to the fixed grid coordinates system for each of the three GOES-R satellite locations, operational East and West and initial checkout locations. This paper describes the algorithmic design and implementation of IPATS and provides preliminary test results.
NASA Technical Reports Server (NTRS)
De Luccia, Frank J.; Houchin, Scott; Porter, Brian C.; Graybill, Justin; Haas, Evan; Johnson, Patrick D.; Isaacson, Peter J.; Reth, Alan D.
2016-01-01
The GOES-R Flight Project has developed an Image Navigation and Registration (INR) Performance Assessment Tool Set (IPATS) for measuring Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) INR performance metrics in the post-launch period for performance evaluation and long term monitoring. For ABI, these metrics are the 3-sigma errors in navigation (NAV), channel-to-channel registration (CCR), frame-to-frame registration (FFR), swath-to-swath registration (SSR), and within frame registration (WIFR) for the Level 1B image products. For GLM, the single metric of interest is the 3-sigma error in the navigation of background images (GLM NAV) used by the system to navigate lightning strikes. 3-sigma errors are estimates of the 99.73rd percentile of the errors accumulated over a 24-hour data collection period. IPATS utilizes a modular algorithmic design to allow user selection of data processing sequences optimized for generation of each INR metric. This novel modular approach minimizes duplication of common processing elements, thereby maximizing code efficiency and speed. Fast processing is essential given the large number of sub-image registrations required to generate INR metrics for the many images produced over a 24-hour evaluation period. Another aspect of the IPATS design that vastly reduces execution time is the off-line propagation of Landsat based truth images to the fixed grid coordinates system for each of the three GOES-R satellite locations, operational East and West and initial checkout locations. This paper describes the algorithmic design and implementation of IPATS and provides preliminary test results.
75 FR 7581 - RTO/ISO Performance Metrics; Notice Requesting Comments on RTO/ISO Performance Metrics
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-22
... performance communicate about the benefits of RTOs and, where appropriate, (2) changes that need to be made to... of staff from all the jurisdictional ISOs/RTOs to develop a set of performance metrics that the ISOs/RTOs will use to report annually to the Commission. Commission staff and representatives from the ISOs...
Geospace environment modeling 2008--2009 challenge: Dst index
Rastätter, L.; Kuznetsova, M.M.; Glocer, A.; Welling, D.; Meng, X.; Raeder, J.; Wittberger, M.; Jordanova, V.K.; Yu, Y.; Zaharia, S.; Weigel, R.S.; Sazykin, S.; Boynton, R.; Wei, H.; Eccles, V.; Horton, W.; Mays, M.L.; Gannon, J.
2013-01-01
This paper reports the metrics-based results of the Dst index part of the 2008–2009 GEM Metrics Challenge. The 2008–2009 GEM Metrics Challenge asked modelers to submit results for four geomagnetic storm events and five different types of observations that can be modeled by statistical, climatological or physics-based models of the magnetosphere-ionosphere system. We present the results of 30 model settings that were run at the Community Coordinated Modeling Center and at the institutions of various modelers for these events. To measure the performance of each of the models against the observations, we use comparisons of 1 hour averaged model data with the Dst index issued by the World Data Center for Geomagnetism, Kyoto, Japan, and direct comparison of 1 minute model data with the 1 minute Dst index calculated by the United States Geological Survey. The latter index can be used to calculate spectral variability of model outputs in comparison to the index. We find that model rankings vary widely with the skill score used. None of the models consistently performs best for all events. We find that empirical models perform well in general. Magnetohydrodynamics-based models of the global magnetosphere with inner magnetosphere physics (ring current model) included and stand-alone ring current models with properly defined boundary conditions perform well and are able to match or surpass results from empirical models. Unlike in similar studies, the statistical models used in this study found their challenge in the weakest events rather than the strongest events.
An Adaptive Handover Prediction Scheme for Seamless Mobility Based Wireless Networks
Safa Sadiq, Ali; Fisal, Norsheila Binti; Ghafoor, Kayhan Zrar; Lloret, Jaime
2014-01-01
We propose an adaptive handover prediction (AHP) scheme for seamless mobility based wireless networks. That is, the AHP scheme incorporates fuzzy logic into the access point (AP) prediction process in order to lend cognitive capability to handover decision making. Selection metrics, including received signal strength (RSS), mobile node (MN) relative direction towards the access points in the vicinity, and access point load, are collected and considered as inputs of the fuzzy decision making system in order to select the most preferable AP among the surrounding WLANs. The obtained handover decision, which is based on the quality cost calculated using a fuzzy inference system, also relies on adaptable coefficients instead of fixed coefficients. In other words, the mean and the standard deviation of the normalized network prediction metrics of the fuzzy inference system, which are collected from available WLANs, are obtained adaptively. Accordingly, they are applied as statistical information to adjust or adapt the coefficients of the membership functions. In addition, we propose an adjustable weight vector concept for input metrics in order to cope with the continuous, unpredictable variation in their membership degrees. Furthermore, handover decisions are performed in each MN independently after knowing RSS, direction toward APs, and AP load. Finally, performance evaluation of the proposed scheme shows its superiority compared with representative prediction approaches. PMID:25574490
An adaptive handover prediction scheme for seamless mobility based wireless networks.
Sadiq, Ali Safa; Fisal, Norsheila Binti; Ghafoor, Kayhan Zrar; Lloret, Jaime
2014-01-01
We propose an adaptive handover prediction (AHP) scheme for seamless mobility based wireless networks. That is, the AHP scheme incorporates fuzzy logic into the access point (AP) prediction process in order to lend cognitive capability to handover decision making. Selection metrics, including received signal strength (RSS), mobile node (MN) relative direction towards the access points in the vicinity, and access point load, are collected and considered as inputs of the fuzzy decision making system in order to select the most preferable AP among the surrounding WLANs. The obtained handover decision, which is based on the quality cost calculated using a fuzzy inference system, also relies on adaptable coefficients instead of fixed coefficients. In other words, the mean and the standard deviation of the normalized network prediction metrics of the fuzzy inference system, which are collected from available WLANs, are obtained adaptively. Accordingly, they are applied as statistical information to adjust or adapt the coefficients of the membership functions. In addition, we propose an adjustable weight vector concept for input metrics in order to cope with the continuous, unpredictable variation in their membership degrees. Furthermore, handover decisions are performed in each MN independently after knowing RSS, direction toward APs, and AP load. Finally, performance evaluation of the proposed scheme shows its superiority compared with representative prediction approaches.
Funding Ohio Community Colleges: An Analysis of the Performance Funding Model
ERIC Educational Resources Information Center
Krueger, Cynthia A.
2013-01-01
This study examined Ohio's community college performance funding model that is based on seven student success metrics. A percentage of the regular state subsidy is withheld from institutions; funding is earned back based on the three-year average of success points achieved in comparison to other community colleges in the state. Analysis of…
Robotics-based synthesis of human motion.
Khatib, O; Demircan, E; De Sapio, V; Sentis, L; Besier, T; Delp, S
2009-01-01
The synthesis of human motion is a complex procedure that involves accurate reconstruction of movement sequences, modeling of musculoskeletal kinematics, dynamics and actuation, and characterization of reliable performance criteria. Many of these processes have much in common with the problems found in robotics research. Task-based methods used in robotics may be leveraged to provide novel musculoskeletal modeling methods and physiologically accurate performance predictions. In this paper, we present (i) a new method for the real-time reconstruction of human motion trajectories using direct marker tracking, (ii) a task-driven muscular effort minimization criterion and (iii) new human performance metrics for dynamic characterization of athletic skills. Dynamic motion reconstruction is achieved through the control of a simulated human model to follow the captured marker trajectories in real-time. The operational space control and real-time simulation provide human dynamics at any configuration of the performance. A new criterion of muscular effort minimization has been introduced to analyze human static postures. Extensive motion capture experiments were conducted to validate the new minimization criterion. Finally, new human performance metrics were introduced to study an athletic skill in detail. These metrics include the effort expenditure and the feasible set of operational space accelerations during the performance of the skill. The dynamic characterization takes into account skeletal kinematics as well as muscle routing kinematics and force generating capacities. The developments draw upon an advanced musculoskeletal modeling platform and a task-oriented framework for the effective integration of biomechanics and robotics methods.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problems in speech analysis today. The task is to infer gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems across research domains, and several performance metrics are available to evaluate the algorithms. We present a comparative model that evaluates five different machine learning algorithms on eight different metrics for gender classification from acoustic data. The aim is to identify gender with five algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. In classification problems the misclassification rate must be low, which means that the accuracy must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
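The relation between accuracy and misclassification rate mentioned above can be made explicit with a short sketch. The labels and predictions below are made up for illustration, the classifier names `model_a` and `model_b` are placeholders rather than any of the five algorithms in the study, and only one of the eight metrics is shown.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def misclassification_rate(y_true, y_pred):
    """Complement of accuracy: the fraction of wrongly labelled samples."""
    return 1.0 - accuracy(y_true, y_pred)


def rank_by_accuracy(y_true, predictions):
    """Sort classifiers from most to least accurate."""
    scores = {name: accuracy(y_true, pred) for name, pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up ground truth and predictions from two placeholder classifiers.
y_true = ["female", "male", "female", "male", "male", "female", "male", "female"]
predictions = {
    "model_a": ["female", "male", "female", "male", "male", "female", "male", "male"],
    "model_b": ["male", "male", "female", "male", "male", "female", "female", "female"],
}

ranking = rank_by_accuracy(y_true, predictions)
print(ranking)
```

Ranking on a single metric is the simplest case; comparing several metrics at once, as the abstract describes, would need a rule for combining or prioritising them.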
On the new metrics for IMRT QA verification.
Garcia-Romero, Alejandro; Hernandez-Vitoria, Araceli; Millan-Cebrian, Esther; Alba-Escorihuela, Veronica; Serrano-Zabaleta, Sonia; Ortega-Pardina, Pablo
2016-11-01
The aim of this work is to search for new metrics that could give more reliable acceptance/rejection criteria on the IMRT verification process and to offer solutions to the discrepancies found among different conventional metrics. Therefore, besides conventional metrics, new ones are proposed and evaluated with new tools to find correlations among them. These new metrics are based on the processing of the dose-volume histogram information, evaluating the absorbed dose differences, the dose constraint fulfillment, or modified biomathematical treatment outcome models such as tumor control probability (TCP) and normal tissue complication probability (NTCP). An additional purpose is to establish whether the new metrics yield the same acceptance/rejection plan distribution as the conventional ones. Fifty-eight treatment plans concerning several patient locations are analyzed. All of them were verified prior to the treatment, using conventional metrics, and retrospectively after the treatment with the new metrics. These new metrics include the definition of three continuous functions, based on dose-volume histograms resulting from measurements evaluated with a reconstructed dose system and also with a Monte Carlo redundant calculation. The 3D gamma function for every volume of interest is also calculated. The information is also processed to obtain ΔTCP or ΔNTCP for the considered volumes of interest. These biomathematical treatment outcome models have been modified to increase their sensitivity to dose changes. A robustness index from a radiobiological point of view is defined to classify plans by their robustness against dose changes. Dose difference metrics can be condensed in a single parameter: the dose difference global function, with an optimal cutoff that can be determined from a receiver operating characteristics (ROC) analysis of the metric. It is not always possible to correlate differences in biomathematical treatment outcome models with dose difference metrics. 
This is due to the fact that the dose constraint is often far from the dose that has an actual impact on the radiobiological model, and therefore, biomathematical treatment outcome models are insensitive to big dose differences between the verification system and the treatment planning system. As an alternative, the use of modified radiobiological models, which provide a better correlation, is proposed. In any case, it is better to choose robust plans from a radiobiological point of view. The robustness index defined in this work is a good predictor of the plan rejection probability according to metrics derived from modified radiobiological models. The global 3D gamma-based metric calculated for each plan volume shows a good correlation with the dose difference metrics and presents a good performance in the acceptance/rejection process. Some discrepancies have been found in dose reconstruction depending on the algorithm employed. Significant and unavoidable discrepancies were found between the conventional metrics and the new ones. The dose difference global function and the 3D gamma for each plan volume are good classifiers regarding dose difference metrics. ROC analysis is useful to evaluate the predictive power of the new metrics. The correlation between biomathematical treatment outcome models and the dose difference-based metrics is enhanced by using modified TCP and NTCP functions that take into account the dose constraints for each plan. The robustness index is useful to evaluate if a plan is likely to be rejected. Conventional verification should be replaced by the new metrics, which are clinically more relevant.
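One common way to pick a cutoff from a ROC analysis is to maximise Youden's J (sensitivity minus false positive rate). The abstract does not say which criterion the authors used, so the sketch below is an assumption, and the scores and labels are synthetic stand-ins for a per-plan metric value and a reject/accept reference decision.

```python
def roc_points(scores, labels):
    """Return (threshold, tpr, fpr) for every distinct score used as a cutoff.

    A plan is flagged when its score is >= threshold.
    labels: 1 = plan should be rejected, 0 = plan should be accepted.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def youden_cutoff(scores, labels):
    """Threshold that maximises J = TPR - FPR (first one wins on ties)."""
    best = max(roc_points(scores, labels), key=lambda p: p[1] - p[2])
    return best[0]


# Synthetic metric values and reference decisions for eight plans.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

cutoff = youden_cutoff(scores, labels)
print(cutoff)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a cost-weighted choice, can give a different cutoff on the same data.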
NASA Astrophysics Data System (ADS)
Eum, H. I.; Cannon, A. J.
2015-12-01
Climate models are a key tool for investigating the impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch of spatial resolution between GCMs and regional applications, in particular for a region characterized by complex terrain such as the Korean peninsula. Therefore, a downscaling procedure is essential to assess regional impacts of climate change. Numerous statistical downscaling methods have been used, mainly due to their computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis (CFSR) data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. Using a split sampling scheme, all methods are calibrated with observational station data for 19 years from 1973 to 1991 and tested for the recent 19 years from 1992 to 2010. To assess the skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure the ability to reproduce temporal correlation, distribution, spatial correlation, and extreme events. In addition, we employ the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR and all methods lead to large improvements in representing all performance metrics. According to the seasonal performance metrics evaluated, when TOPSIS is applied, MACA is identified as the most reliable and robust method for all variables and seasons. Note that this result is derived from CFSR output, which is recognized as near perfect climate data in climate studies.
Therefore, the ranking of this study may be changed when various GCMs are downscaled and evaluated. Nevertheless, it may be informative for end-users (i.e. modelers or water resources managers) to understand and select more suitable downscaling methods corresponding to priorities on regional applications.
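The TOPSIS step can be sketched in its textbook form: normalise each criterion, apply weights, and score each alternative by its relative distance to the ideal and anti-ideal points. The weights, criteria, and numbers below are made up, and `method_a` to `method_c` are placeholders rather than the four downscaling methods in the study, whose exact TOPSIS configuration the abstract does not give.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness score in [0, 1] per alternative (row).

    matrix  : rows are alternatives, columns are criteria
    weights : one non-negative weight per criterion
    benefit : True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
              for row in matrix]
    columns = list(zip(*scaled))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    scores = []
    for row in scaled:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        # If every alternative is identical, no ranking is possible.
        scores.append(d_worst / total if total > 0 else 0.5)
    return scores


# Made-up criteria: a correlation (higher is better) and an error (lower is better).
names = ["method_a", "method_b", "method_c"]
matrix = [[0.9, 1.0], [0.6, 2.0], [0.8, 1.5]]
scores = topsis(matrix, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

Because the score depends on the weights and on which alternatives are included, adding or removing a method, or changing the data being downscaled, can change the ranking, in line with the caveat in the abstract.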
Adapting the ISO 20462 softcopy ruler method for online image quality studies
NASA Astrophysics Data System (ADS)
Burns, Peter D.; Phillips, Jonathan B.; Williams, Don
2013-01-01
In this paper we address the problem of Image Quality Assessment of no reference metrics, focusing on JPEG corrupted images. In general no reference metrics are not able to measure with the same performance the distortions within their possible range and with respect to different image contents. The crosstalk between content and distortion signals influences the human perception. We here propose two strategies to improve the correlation between subjective and objective quality data. The first strategy is based on grouping the images according to their spatial complexity. The second one is based on a frequency analysis. Both the strategies are tested on two databases available in the literature. The results show an improvement in the correlations between no reference metrics and psycho-visual data, evaluated in terms of the Pearson Correlation Coefficient.
Learning Compositional Shape Models of Multiple Distance Metrics by Information Projection.
Luo, Ping; Lin, Liang; Liu, Xiaobai
2016-07-01
This paper presents a novel compositional contour-based shape model by incorporating multiple distance metrics to account for varying shape distortions or deformations. Our approach contains two key steps: 1) contour feature generation and 2) generative model pursuit. For each category, we first densely sample an ensemble of local prototype contour segments from a few positive shape examples and describe each segment using three different types of distance metrics. These metrics are diverse and complementary to each other, so they capture various shape deformations. We regard the parameterized contour segment plus an additive residual ϵ as a basic subspace, namely, an ϵ-ball, in the sense that it represents local shape variance under a certain distance metric. Using these ϵ-balls as features, we then propose a generative learning algorithm to pursue the compositional shape model, which greedily selects the most representative features under the information projection principle. In experiments, we evaluate our model on several challenging public data sets, and demonstrate that the integration of multiple shape distance metrics is capable of dealing with various shape deformations, articulations, and background clutter, hence boosting system performance.
Spatial-temporal distortion metric for in-service quality monitoring of any digital video system
NASA Astrophysics Data System (ADS)
Wolf, Stephen; Pinson, Margaret H.
1999-11-01
Many organizations have focused on developing digital video quality metrics which produce results that accurately emulate subjective responses. However, to be widely applicable a metric must also work over a wide range of quality, and be useful for in-service quality monitoring. The Institute for Telecommunication Sciences (ITS) has developed spatial-temporal distortion metrics that meet all of these requirements. These objective metrics are described in detail and have a number of interesting properties, including utilization of (1) spatial activity filters which emphasize long edges on the order of 10 arc min while simultaneously performing large amounts of noise suppression, (2) the angular direction of the spatial gradient, (3) spatial-temporal compression factors of at least 384:1 (spatial compression of at least 64:1 and temporal compression of at least 6:1), and (4) simple perceptibility thresholds and spatial-temporal masking functions. Results are presented that compare the objective metric values with mean opinion scores from a wide range of subjective databases spanning many different scenes, systems, bit-rates, and applications.
Lopes, Julio Cesar Dias; Dos Santos, Fábio Mendes; Martins-José, Andrelly; Augustyns, Koen; De Winter, Hans
2017-01-01
A new metric for the evaluation of model performance in the field of virtual screening and quantitative structure-activity relationship applications is described. This metric has been termed the power metric and is defined as the fraction of the true positive rate divided by the sum of the true positive and false positive rates, for a given cutoff threshold. The performance of this metric is compared with alternative metrics such as the enrichment factor, the relative enrichment factor, the receiver operating curve enrichment factor, the correct classification rate, Matthews correlation coefficient and Cohen's kappa coefficient. The performance of this new metric is found to be quite robust with respect to variations in the applied cutoff threshold and ratio of the number of active compounds to the total number of compounds, and at the same time being sensitive to variations in model quality. It possesses the correct characteristics for its application in early-recognition virtual screening problems.
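The definition given in this abstract, the true positive rate divided by the sum of the true positive and false positive rates at a given cutoff, can be written out as a short sketch. The function name and the confusion-matrix counts are hypothetical and chosen only to illustrate the arithmetic.

```python
def power_metric(tp, fp, tn, fn):
    """Power metric = TPR / (TPR + FPR) for one cutoff threshold.

    tp, fp, tn, fn are confusion-matrix counts at that cutoff.
    Returns NaN when both rates are zero, where the ratio is undefined.
    """
    tpr = tp / (tp + fn)   # true positive rate (sensitivity)
    fpr = fp / (fp + tn)   # false positive rate
    if tpr + fpr == 0:
        return float("nan")
    return tpr / (tpr + fpr)

# Hypothetical screen: 10 actives and 100 inactives, with 8 actives and
# 10 inactives scoring above the cutoff.
value = power_metric(tp=8, fp=10, tn=90, fn=2)
print(round(value, 3))
```

Because both terms are rates rather than counts, the value does not change when the number of inactives is scaled while the false positive rate stays fixed, which is consistent with the robustness to the active/total ratio described in the abstract.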
Uncooperative target-in-the-loop performance with backscattered speckle-field effects
NASA Astrophysics Data System (ADS)
Kansky, Jan E.; Murphy, Daniel V.
2007-09-01
Systems utilizing target-in-the-loop (TIL) techniques for adaptive optics phase compensation rely on a metric sensor to perform a hill climbing algorithm that maximizes the far-field Strehl ratio. In uncooperative TIL, the metric signal is derived from the light backscattered from a target. In cases where the target is illuminated with a laser with sufficiently long coherence length, the potential exists for the validity of the metric sensor to be compromised by speckle-field effects. We report experimental results from a scaled laboratory designed to evaluate TIL performance in atmospheric turbulence and thermal blooming conditions where the metric sensors are influenced by varying degrees of backscatter speckle. We compare performance of several TIL configurations and metrics for cases with static speckle, and for cases with speckle fluctuations within the frequency range that the TIL system operates. The roles of metric sensor filtering and system bandwidth are discussed.
Impact of Different Economic Performance Metrics on the Perceived Value of Solar Photovoltaics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Drury, E.; Denholm, P.; Margolis, R.
2011-10-01
Photovoltaic (PV) systems are installed by several types of market participants, ranging from residential customers to large-scale project developers and utilities. Each type of market participant frequently uses a different economic performance metric to characterize PV value because they are looking for different types of returns from a PV investment. This report finds that different economic performance metrics frequently show different price thresholds for when a PV investment becomes profitable or attractive. Several project parameters, such as financing terms, can have a significant impact on some metrics [e.g., internal rate of return (IRR), net present value (NPV), and benefit-to-cost (B/C) ratio] while having a minimal impact on other metrics (e.g., simple payback time). As such, the choice of economic performance metric by different customer types can significantly shape each customer's perception of PV investment value and ultimately their adoption decision.
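A small sketch can show why the metrics named in this abstract may disagree. It uses the standard textbook definitions of NPV, IRR, and simple payback, with the discount rate standing in for financing terms; that stand-in, the function names, and the cash flows are assumptions made for illustration and are not taken from the report.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs now, cash_flows[t] after t years."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def simple_payback_years(upfront_cost, annual_saving):
    """Years needed to recover the upfront cost, ignoring discounting."""
    return upfront_cost / annual_saving

def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-9):
    """Internal rate of return by bisection.

    Assumes NPV changes sign exactly once between lo and hi.
    """
    f_lo = npv(lo, cash_flows)
    if f_lo * npv(hi, cash_flows) > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return (lo + hi) / 2

# Hypothetical system: 10,000 upfront, 1,200 saved per year for 20 years.
flows = [-10000.0] + [1200.0] * 20
payback = simple_payback_years(10000.0, 1200.0)
npv_low_rate = npv(0.05, flows)    # cheap financing
npv_high_rate = npv(0.12, flows)   # expensive financing
rate_of_return = irr(flows)
print(payback, npv_low_rate > 0, npv_high_rate > 0)
```

In this made-up case the simple payback time is the same under both discount rates, while the NPV is positive at 5% and negative at 12%, so a customer who looks at payback and a customer who looks at NPV could reach different conclusions about the same system.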
Categorization of hyperspectral information (HSI) based on the distribution of spectra in hyperspace
NASA Astrophysics Data System (ADS)
Resmini, Ronald G.
2003-09-01
Hyperspectral information (HSI) data are commonly categorized by a description of the dominant physical geographic background captured in the image cube. In other words, HSI categorization is commonly based on a cursory, visual assessment of whether the data are of desert, forest, urban, littoral, jungle, alpine, etc., terrains. Additionally, often the design of HSI collection experiments is based on the acquisition of data of the various backgrounds or of objects of interest within the various terrain types. These data are for assessing and quantifying algorithm performance as well as for algorithm development activities. Here, results of an investigation into the validity of the backgrounds-driven mode of characterizing the diversity of hyperspectral data are presented. HSI data are described quantitatively, in the space where most algorithms operate: n-dimensional (n-D) hyperspace, where n is the number of bands in an HSI data cube. Nineteen metrics designed to probe hyperspace are applied to 14 HYDICE HSI data cubes that represent nine different backgrounds. Each of the 14 sets (one for each HYDICE cube) of 19 metric values was analyzed for clustering. With the present set of data and metrics, there is no clear, unambiguous break-out of metrics based on the nine different geographic backgrounds. The break-outs clump seemingly unrelated data types together; e.g., littoral and urban/residential. Most metrics are normally distributed and indicate no clustering; one metric is one outlier away from normal (i.e., two clusters); and five are comprised of two distributions (i.e., two clusters). Overall, there are three different break-outs that do not correspond to conventional background categories. Implications of these preliminary results are discussed as are recommendations for future work.
Guiding principles and checklist for population-based quality metrics.
Krishnan, Mahesh; Brunelli, Steven M; Maddux, Franklin W; Parker, Thomas F; Johnson, Douglas; Nissenson, Allen R; Collins, Allan; Lacson, Eduardo
2014-06-06
The Centers for Medicare and Medicaid Services oversees the ESRD Quality Incentive Program to ensure that the highest quality of health care is provided by outpatient dialysis facilities that treat patients with ESRD. To that end, Centers for Medicare and Medicaid Services uses clinical performance measures to evaluate quality of care under a pay-for-performance or value-based purchasing model. Now more than ever, the ESRD therapeutic area serves as the vanguard of health care delivery. By translating medical evidence into clinical performance measures, the ESRD Prospective Payment System became the first disease-specific sector using the pay-for-performance model. A major challenge for the creation and implementation of clinical performance measures is the adjustments that are necessary to transition from taking care of individual patients to managing the care of patient populations. The National Quality Forum and others have developed effective and appropriate population-based clinical performance measures quality metrics that can be aggregated at the physician, hospital, dialysis facility, nursing home, or surgery center level. Clinical performance measures considered for endorsement by the National Quality Forum are evaluated using five key criteria: evidence, performance gap, and priority (impact); reliability; validity; feasibility; and usability and use. We have developed a checklist of special considerations for clinical performance measure development according to these National Quality Forum criteria. Although the checklist is focused on ESRD, it could also have broad application to chronic disease states, where health care delivery organizations seek to enhance quality, safety, and efficiency of their services. Clinical performance measures are likely to become the norm for tracking performance for health care insurers. 
Thus, it is critical that the methodologies used to develop such metrics serve the payer and the provider and most importantly, reflect what represents the best care to improve patient outcomes. Copyright © 2014 by the American Society of Nephrology.
Return on investment in healthcare leadership development programs.
Jeyaraman, Maya M; Qadar, Sheikh Muhammad Zeeshan; Wierzbowski, Aleksandra; Farshidfar, Farnaz; Lys, Justin; Dickson, Graham; Grimes, Kelly; Phillips, Leah A; Mitchell, Jonathan I; Van Aerde, John; Johnson, Dave; Krupka, Frank; Zarychanski, Ryan; Abou-Setta, Ahmed M
2018-02-05
Purpose Strong leadership has been shown to foster change, including loyalty, improved performance and decreased error rates, but there is a dearth of evidence on effectiveness of leadership development programs. To ensure a return on the huge investments made, evidence-based approaches are needed to assess the impact of leadership on health-care establishments. As a part of a pan-Canadian initiative to design an effective evaluative instrument, the purpose of this paper was to identify and summarize evidence on health-care outcomes/return on investment (ROI) indicators and metrics associated with leadership quality, leadership development programs and existing evaluative instruments. Design/methodology/approach The authors performed a scoping review using the Arksey and O'Malley framework, searching eight databases from 2006 through June 2016. Findings Of 11,868 citations screened, the authors included 223 studies reporting on health-care outcomes/ROI indicators and metrics associated with leadership quality (73 studies), leadership development programs (138 studies) and existing evaluative instruments (12 studies). The extracted ROI indicators and metrics have been summarized in detail. Originality/value This review provides a snapshot in time of the current evidence on ROI indicators and metrics associated with leadership. Summarized ROI indicators and metrics can be used to design an effective evaluative instrument to assess the impact of leadership on health-care organizations.
McLellan, Tom M; Pasiakos, Stefan M; Lieberman, Harris R
2014-04-01
Protein supplements are consumed frequently by athletes and recreationally active adults for various reasons, including improved exercise performance and recovery after exercise. Yet, far too often, the decision to purchase and consume protein supplements is based on marketing claims rather than available evidence-based research. The purpose of this review was to provide a systematic and comprehensive analysis of the literature that tested the hypothesis that protein supplements, when combined with carbohydrate, directly enhance endurance performance by sparing muscle glycogen during exercise and increasing the rate of glycogen restoration during recovery. The analysis was used to create evidence statements based on an accepted strength of recommendation taxonomy. English language articles were searched with PubMed and Google Scholar using protein and supplements together with performance, exercise, competition, and muscle, alone or in combination as keywords. Additional articles were retrieved from reference lists found in these papers. Inclusion criteria specified recruiting healthy active adults less than 50 years of age and evaluating the effects of protein supplements in combination with carbohydrate on endurance performance metrics such as time-to-exhaustion, time-trial, or total power output during sprint intervals. The literature search identified 28 articles, of which 26 incorporated test metrics that permitted exclusive categorization into one of the following sections: ingestion during an acute bout of exercise (n = 11) and ingestion during and after exercise to affect subsequent endurance performance (n = 15). The remaining two articles contained performance metrics that spanned both categories. 
All papers were read in detail and searched for experimental design confounders such as energy content of the supplements, dietary control, use of trained or untrained participants, number of subjects recruited, direct measures of muscle glycogen utilization and restoration, and the sensitivity of the test metrics to explain the discrepant findings. Our evidence statements assert that when carbohydrate supplementation was delivered at optimal rates during or after exercise, protein supplements provided no further ergogenic effect, regardless of the performance metric used. In addition, the limited data available suggested recovery of muscle glycogen stores together with subsequent rate of utilization during exercise is not related to the potential ergogenic effect of protein supplements. Many studies lacked ability to measure direct effects of protein supplementation on muscle metabolism through determination of muscle glycogen, kinetic assessments of protein turnover, or changes in key signaling proteins, and therefore could not substantiate changes in rates of synthesis or degradation of protein. As a result, the interpretation of their data was often biased and inconclusive since they lacked ability to test the proposed underlying mechanism of action. When carbohydrate is delivered at optimal rates during or after endurance exercise, protein supplements appear to have no direct endurance performance enhancing effect.
Sánchez-Margallo, Juan A; Sánchez-Margallo, Francisco M; Oropesa, Ignacio; Enciso, Silvia; Gómez, Enrique J
2017-02-01
The aim of this study is to present the construct and concurrent validity of a motion-tracking method of laparoscopic instruments based on an optical pose tracker and determine its feasibility as an objective assessment tool of psychomotor skills during laparoscopic suturing. A group of novice ([Formula: see text] laparoscopic procedures), intermediate (11-100 laparoscopic procedures) and experienced ([Formula: see text] laparoscopic procedures) surgeons performed three intracorporeal sutures on an ex vivo porcine stomach. Motion analysis metrics were recorded using the proposed tracking method, which employs an optical pose tracker to determine the laparoscopic instruments' position. Construct validation was measured for all 10 metrics across the three groups and between pairs of groups. Concurrent validation was measured against a previously validated suturing checklist. Checklists were completed by two independent surgeons over blinded video recordings of the task. Eighteen novices, 15 intermediates and 11 experienced surgeons took part in this study. Execution time and path length travelled by the laparoscopic dissector presented construct validity. Experienced surgeons required significantly less time ([Formula: see text]), travelled less distance using both laparoscopic instruments ([Formula: see text]) and made more efficient use of the work space ([Formula: see text]) compared with novice and intermediate surgeons. Concurrent validation showed strong correlation between both the execution time and path length and the checklist score ([Formula: see text] and [Formula: see text], [Formula: see text]). The suturing performance was successfully assessed by the motion analysis method. Construct and concurrent validity of the motion-based assessment method has been demonstrated for the execution time and path length metrics. This study demonstrates the efficacy of the presented method for objective evaluation of psychomotor skills in laparoscopic suturing. 
However, this method does not take into account the quality of the suture. Thus, future works will focus on developing new methods combining motion analysis and qualitative outcome evaluation to provide a complete performance assessment to trainees.
NASA Astrophysics Data System (ADS)
Moslehi, M.; de Barros, F.
2017-12-01
Complexity of hydrogeological systems arises from the multi-scale heterogeneity and insufficient measurements of their underlying parameters such as hydraulic conductivity and porosity. An inadequate characterization of hydrogeological properties can significantly decrease the trustworthiness of numerical models that predict groundwater flow and solute transport. Therefore, a variety of data assimilation methods have been proposed in order to estimate hydrogeological parameters from spatially scarce data by incorporating the governing physical models. In this work, we propose a novel framework for evaluating the performance of these estimation methods. We focus on the Ensemble Kalman Filter (EnKF) approach that is a widely used data assimilation technique. It reconciles multiple sources of measurements to sequentially estimate model parameters such as the hydraulic conductivity. Several methods have been used in the literature to quantify the accuracy of the estimations obtained by EnKF, including Rank Histograms, RMSE and Ensemble Spread. However, these commonly used methods do not regard the spatial information and variability of geological formations. This can cause hydraulic conductivity fields with very different spatial structures to have similar histograms or RMSE. We propose a vision-based approach that can quantify the accuracy of estimations by considering the spatial structure embedded in the estimated fields. Our new approach consists of adapting a new metric, Color Coherent Vectors (CCV), to evaluate the accuracy of estimated fields achieved by EnKF. CCV is a histogram-based technique for comparing images that incorporate spatial information. We represent estimated fields as digital three-channel images and use CCV to compare and quantify the accuracy of estimations. The sensitivity of CCV to spatial information makes it a suitable metric for assessing the performance of spatial data assimilation techniques. 
Under various factors of data assimilation methods such as number, layout, and type of measurements, we compare the performance of CCV with other metrics such as RMSE. By simulating hydrogeological processes using estimated and true fields, we observe that CCV outperforms other existing evaluation metrics.
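The idea that a coherence-based comparison can separate fields that a histogram cannot may be sketched as follows. This is a simplified single-channel version of a color coherence vector: the abstract describes three-channel images, and the 4-connectivity, the size threshold tau, and the toy 4x4 fields are assumptions chosen for brevity rather than the authors' configuration.

```python
from collections import deque

def ccv(field, tau):
    """Coherence vector of a quantized 2-D grid.

    Returns {value: [coherent, incoherent]} pixel counts, where a pixel is
    coherent if its 4-connected same-value region has at least tau pixels.
    """
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    out = {}
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value = field[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            # Breadth-first flood fill over the same-value region.
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and field[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            counts = out.setdefault(value, [0, 0])
            counts[0 if size >= tau else 1] += size
    return out

def ccv_distance(a, b):
    """Sum of absolute differences of coherent and incoherent counts."""
    total = 0
    for key in set(a) | set(b):
        a_coh, a_inc = a.get(key, [0, 0])
        b_coh, b_inc = b.get(key, [0, 0])
        total += abs(a_coh - b_coh) + abs(a_inc - b_inc)
    return total

# Two toy fields with identical histograms (8 zeros, 8 ones each).
blocky = [[0, 0, 1, 1] for _ in range(4)]
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
same_histogram = sorted(sum(blocky, [])) == sorted(sum(checker, []))
distance = ccv_distance(ccv(blocky, tau=4), ccv(checker, tau=4))
print(same_histogram, distance)
```

The two toy fields contain exactly the same values, so any purely histogram-based score treats them as equal, while the coherence vector separates the blocky field from the checkerboard.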
The five traps of performance measurement.
Likierman, Andrew
2009-10-01
Evaluating a company's performance often entails wading through a thicket of numbers produced by a few simple metrics, writes the author, and senior executives leave measurement to those whose specialty is spreadsheets. To take ownership of performance assessment, those executives should find qualitative, forward-looking measures that will help them avoid five common traps: Measuring against yourself. Find data from outside the company, and reward relative, rather than absolute, performance. Enterprise Rent-A-Car uses a service quality index to measure customers' repeat purchase intentions. Looking backward. Use measures that lead rather than lag the profits in your business. Humana, a health insurer, found that the sickest 10% of its patients account for 80% of its costs; now it offers customers incentives for early screening. Putting your faith in numbers. The soft drinks company Britvic evaluates its executive coaching program not by trying to assign it an ROI number but by tracking participants' careers for a year. Gaming your metrics. The law firm Clifford Chance replaced its single, easy-to-game metric of billable hours with seven criteria on which to base bonuses. Sticking to your numbers too long. Be precise about what you want to assess and explicit about what metrics are assessing it. Such clarity would have helped investors interpret the AAA ratings involved in the financial meltdown. Really good assessment will combine finance managers' relative independence with line managers' expertise.
Dean, Jamie A; Wong, Kee H; Welsh, Liam C; Jones, Ann-Britt; Schick, Ulrike; Newbold, Kate L; Bhide, Shreerang A; Harrington, Kevin J; Nutting, Christopher M; Gulliford, Sarah L
2016-07-01
Severe acute mucositis commonly results from head and neck (chemo)radiotherapy. A predictive model of mucositis could guide clinical decision-making and inform treatment planning. We aimed to generate such a model using spatial dose metrics and machine learning. Predictive models of severe acute mucositis were generated using radiotherapy dose (dose-volume and spatial dose metrics) and clinical data. Penalised logistic regression, support vector classification and random forest classification (RFC) models were generated and compared. Internal validation was performed (with 100-iteration cross-validation), using multiple metrics, including area under the receiver operating characteristic curve (AUC) and calibration slope, to assess performance. Associations between covariates and severe mucositis were explored using the models. The dose-volume-based models (standard) performed equally to those incorporating spatial information. Discrimination was similar between models, but the RFCstandard had the best calibration. The mean AUC and calibration slope for this model were 0.71 (s.d.=0.09) and 3.9 (s.d.=2.2), respectively. The volumes of oral cavity receiving intermediate and high doses were associated with severe mucositis. The RFCstandard model performance is modest-to-good, but should be improved, and requires external validation. Reducing the volumes of oral cavity receiving intermediate and high doses may reduce mucositis incidence. Copyright © 2016 The Author(s). Published by Elsevier Ireland Ltd.. All rights reserved.
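For readers unfamiliar with the discrimination measure used in this abstract, the AUC can be computed directly from its pairwise interpretation: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below uses that standard definition; the labels and scores are invented and are unrelated to the study's data.

```python
def auc(labels, scores):
    """Area under the ROC curve from pairwise comparisons.

    labels : 1 for the event (e.g. severe toxicity), 0 otherwise
    scores : predicted risk, higher meaning more likely to be an event
    Assumes at least one positive and one negative case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predictions for six patients.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
value = auc(labels, scores)
print(round(value, 3))
```

This pairwise form is quadratic in the number of cases, which is acceptable for small cohorts; rank-based formulas give the same value more efficiently.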
A comparison of proxy performance in coral biodiversity monitoring
NASA Astrophysics Data System (ADS)
Richards, Zoe T.
2013-03-01
The productivity and health of coral reef habitat is diminishing worldwide; however, the effect that habitat declines have on coral reef biodiversity is not known. Logistical and financial constraints mean that surveys of hard coral communities rarely collect data at the species level; hence it is important to know if there are proxy metrics that can reliably predict biodiversity. Here, the performances of six proxy metrics are compared using regression analyses on survey data from a location in the northern Great Barrier Reef. Results suggest generic richness is a strong explanatory variable for spatial patterns in species richness (explaining 82 % of the variation when measured on a belt transect). The most commonly used metric of reef health, percentage live coral cover, is not positively or linearly related to hard coral species richness. This result raises doubt as to whether management actions based on such reefscape information will be effective for the conservation of coral biodiversity.
Target detection cycle criteria when using the targeting task performance metric
NASA Astrophysics Data System (ADS)
Hixson, Jonathan G.; Jacobs, Eddie L.; Vollmerhausen, Richard H.
2004-12-01
The Night Vision and Electronic Sensors Directorate (NVESD) of the US Army RDECOM CERDEC has developed a new target acquisition metric, the targeting task performance (TTP) metric, to better predict the performance of modern electro-optical imagers. The TTP metric replaces the Johnson criteria. One problem with transitioning to the new model is that the difficulty of searching in a terrain has traditionally been quantified by an "N50." The N50 is the number of Johnson criteria cycles needed for the observer to detect the target half the time, assuming that the observer is not time limited. In order to make use of this empirical database, a conversion must be found relating Johnson cycles for detection to TTP cycles for detection. This paper describes how that relationship is established. We have found that the relationship between Johnson and TTP is 1:2.7 for the recognition and identification tasks.
Supply chain value creation methodology under BSC approach
NASA Astrophysics Data System (ADS)
Golrizgashti, Seyedehfatemeh
2014-06-01
The objective of this paper is to propose a developed balanced scorecard (BSC) approach to measure supply chain performance with the aim of creating more value in manufacturing and business operations. The most important metrics have been selected based on experts' opinions, acquired through in-depth interviews focused on creating more value for stakeholders. Survey research and factor analysis have been used to categorize the selected metrics into balanced scorecard perspectives. The results identify the intensity of correlation between perspectives and the cause-and-effect chains among them, using a statistical method based on a real case study in the home appliance manufacturing industry.
NASA Astrophysics Data System (ADS)
Holobar, A.; Minetto, M. A.; Farina, D.
2014-02-01
Objective. A signal-based metric for assessment of accuracy of motor unit (MU) identification from high-density surface electromyograms (EMG) is introduced. This metric, so-called pulse-to-noise-ratio (PNR), is computationally efficient, does not require any additional experimental costs and can be applied to every MU that is identified by the previously developed convolution kernel compensation technique. Approach. The analytical derivation of the newly introduced metric is provided, along with its extensive experimental validation on both synthetic and experimental surface EMG signals with signal-to-noise ratios ranging from 0 to 20 dB and muscle contraction forces from 5% to 70% of the maximum voluntary contraction. Main results. In all the experimental and simulated signals, the newly introduced metric correlated significantly with both sensitivity and false alarm rate in identification of MU discharges. Practically all the MUs with PNR > 30 dB exhibited sensitivity >90% and false alarm rates <2%. Therefore, a threshold of 30 dB in PNR can be used as a simple method for selecting only reliably decomposed units. Significance. The newly introduced metric is considered a robust and reliable indicator of accuracy of MU identification. The study also shows that high-density surface EMG can be reliably decomposed at contraction forces as high as 70% of the maximum.
DOT National Transportation Integrated Search
2016-09-02
Public transportation agencies can obtain large amounts of information regarding timeliness, efficiency, cleanliness, ridership, and other : performance measures. However, these metrics are based on the interests of these agencies and do not necessar...
Luenser, Arne; Schurkus, Henry F; Ochsenfeld, Christian
2017-04-11
A reformulation of the random phase approximation within the resolution-of-the-identity (RI) scheme is presented that is competitive with canonical molecular orbital RI-RPA already for small- to medium-sized molecules. For electronically sparse systems, drastic speedups due to the reduced scaling behavior compared to the molecular orbital formulation are demonstrated. Our reformulation is based on two ideas, which are independently useful: First, a Cholesky decomposition of density matrices that reduces the scaling with basis set size for a fixed-size molecule by one order, leading to massive performance improvements. Second, replacement of the overlap RI metric used in the original AO-RPA by an attenuated Coulomb metric. Accuracy is significantly improved compared to the overlap metric, while locality and sparsity of the integrals are retained, as is the effective linear scaling behavior.
Regime-based evaluation of cloudiness in CMIP5 models
NASA Astrophysics Data System (ADS)
Jin, Daeho; Oreopoulos, Lazaros; Lee, Dongmin
2017-01-01
The concept of cloud regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 fifth Coupled Model Intercomparison Project (CMIP5) models. Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator generating in each grid cell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product [long-term average total cloud amount (TCA)], cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies yield also substantial TCA errors. Our results support previous findings that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is still not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer cloud observations evaluated against ISCCP like another model output. Lastly, contrasting cloud simulation performance against each model's equilibrium climate sensitivity in order to gain insight on whether good cloud simulation pairs with particular values of this parameter, yields no clear conclusions.
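The relation stated in this abstract, that the product of a regime's cloud fraction (CF) and its relative frequency of occurrence (RFO) gives its long-term total cloud amount (TCA), can be illustrated with a short sketch. Summing the per-regime products into one total is an assumption made here for illustration, and the regime names and all numbers are invented rather than taken from the paper.

```python
def tca_by_regime(cf, rfo):
    """Per-regime contribution to total cloud amount: CF times RFO."""
    return {name: cf[name] * rfo[name] for name in cf}

# Hypothetical regimes. CF is held identical for "observed" and "model",
# mimicking a case where only the frequency of occurrence differs.
cf = {"CR1": 0.9, "CR2": 0.5, "CR3": 0.2}
rfo_observed = {"CR1": 0.2, "CR2": 0.3, "CR3": 0.5}
rfo_model = {"CR1": 0.1, "CR2": 0.3, "CR3": 0.6}

tca_observed = sum(tca_by_regime(cf, rfo_observed).values())
tca_model = sum(tca_by_regime(cf, rfo_model).values())
tca_error = tca_model - tca_observed
print(round(tca_observed, 2), round(tca_model, 2), round(tca_error, 2))
```

With CF fixed, the shift of occurrence from a cloudy regime to a less cloudy one is the only source of the negative TCA error in this toy case, which mirrors the abstract's remark that RFO discrepancies carry over into TCA errors.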
SU-E-I-71: Quality Assessment of Surrogate Metrics in Multi-Atlas-Based Image Segmentation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, T; Ruan, D
Purpose: With the ever-growing data of heterogeneous quality, relevance assessment of atlases becomes increasingly critical for multi-atlas-based image segmentation. However, there is no universally recognized best relevance metric and even a standard to compare amongst candidates remains elusive. This study, for the first time, designs a quantification to assess relevance metrics’ quality, based on a novel perspective of the metric as surrogate for inferring the inaccessible oracle geometric agreement. Methods: We first develop an inference model to relate surrogate metrics in image space to the underlying oracle relevance metric in segmentation label space, with a monotonically non-decreasing function subject to random perturbations. Subsequently, we investigate model parameters to reveal key contributing factors to surrogates’ ability in prognosticating the oracle relevance value, for the specific task of atlas selection. Finally, we design an effective contrast-to-noise ratio (eCNR) to quantify surrogates’ quality based on insights from these analyses and empirical observations. Results: The inference model was specialized to a linear function with normally distributed perturbations, with surrogate metric exemplified by several widely-used image similarity metrics, i.e., MSD/NCC/(N)MI. Surrogates’ behaviors in selecting the most relevant atlases were assessed under varying eCNR, showing that surrogates with high eCNR dominated those with low eCNR in retaining the most relevant atlases. In an end-to-end validation, NCC/(N)MI with eCNR of 0.12 compared to MSD with eCNR of 0.10 resulted in statistically better segmentation with mean DSC of about 0.85 and the first and third quartiles of (0.83, 0.89), compared to MSD with mean DSC of 0.84 and the first and third quartiles of (0.81, 0.89). Conclusion: The designed eCNR is capable of characterizing surrogate metrics’ quality in prognosticating the oracle relevance value. It has been demonstrated to be correlated with the performance of relevant atlas selection and ultimate label fusion.
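The DSC values quoted in the abstract above are Dice similarity coefficients, which score the overlap between two segmentations. A minimal sketch follows, assuming each binary label mask is given as a collection of voxel indices; the function name and the empty-mask convention are illustrative assumptions, not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for two voxel-index collections."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

A value of 1 means the two masks coincide and 0 means they share no voxel.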
Mutual Information in Frequency and Its Application to Measure Cross-Frequency Coupling in Epilepsy
NASA Astrophysics Data System (ADS)
Malladi, Rakesh; Johnson, Don H.; Kalamangalam, Giridhar P.; Tandon, Nitin; Aazhang, Behnaam
2018-06-01
We define a metric, mutual information in frequency (MI-in-frequency), to detect and quantify the statistical dependence between different frequency components in the data, referred to as cross-frequency coupling, and apply it to electrophysiological recordings from the brain to infer cross-frequency coupling. The current metrics used to quantify the cross-frequency coupling in neuroscience cannot detect if two frequency components in non-Gaussian brain recordings are statistically independent or not. Our MI-in-frequency metric, based on Shannon's mutual information between the Cramer's representation of stochastic processes, overcomes this shortcoming and can detect statistical dependence in frequency between non-Gaussian signals. We then describe two data-driven estimators of MI-in-frequency: one based on kernel density estimation and the other based on the nearest neighbor algorithm and validate their performance on simulated data. We then use MI-in-frequency to estimate mutual information between two data streams that are dependent across time, without making any parametric model assumptions. Finally, we use the MI-in-frequency metric to investigate the cross-frequency coupling in seizure onset zone from electrocorticographic recordings during seizures. The inferred cross-frequency coupling characteristics are essential to optimize the spatial and spectral parameters of electrical stimulation based treatments of epilepsy.
Compression performance comparison in low delay real-time video for mobile applications
NASA Astrophysics Data System (ADS)
Bivolarski, Lazar
2012-10-01
This article compares the performance of several current video coding standards under low-delay, real-time conditions in a resource-constrained environment. The comparison is performed using the same content and a mix of objective and perceptual quality metrics. The metric results for the different coding schemes are analyzed from the point of view of user perception and quality of service. The standards compared are MPEG-2, MPEG-4 and MPEG-AVC, as well as H.263. The metrics used in the comparison include SSIM, VQM and DVQ. Subjective evaluation and quality of service are discussed from the point of view of perceptual metrics and their incorporation in the coding scheme development process. The performance and the correlation of results are presented as a predictor of the performance of video compression schemes.
SU-E-T-436: Fluence-Based Trajectory Optimization for Non-Coplanar VMAT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smyth, G; Bamber, JC; Bedford, JL
2015-06-15
Purpose: To investigate a fluence-based trajectory optimization technique for non-coplanar VMAT for brain cancer. Methods: Single-arc non-coplanar VMAT trajectories were determined using a heuristic technique for five patients. Organ at risk (OAR) volume intersected during raytracing was minimized for two cases: absolute volume and the sum of relative volumes weighted by OAR importance. These trajectories and coplanar VMAT formed starting points for the fluence-based optimization method. Iterative least squares optimization was performed on control points 24° apart in gantry rotation. Optimization minimized the root-mean-square (RMS) deviation of PTV dose from the prescription (relative importance 100), maximum dose to the brainstem (10), optic chiasm (5), globes (5) and optic nerves (5), plus mean dose to the lenses (5), hippocampi (3), temporal lobes (2), cochleae (1) and brain excluding other regions of interest (1). Control point couch rotations were varied in steps of up to 10° and accepted if the cost function improved. Final treatment plans were optimized with the same objectives in an in-house planning system and evaluated using a composite metric - the sum of optimization metrics weighted by importance. Results: The composite metric decreased with fluence-based optimization in 14 of the 15 plans. In the remaining case its overall value, and the PTV and OAR components, were unchanged but the balance of OAR sparing differed. PTV RMS deviation was improved in 13 cases and unchanged in two. The OAR component was reduced in 13 plans. In one case the OAR component increased but the composite metric decreased - a 4 Gy increase in OAR metrics was balanced by a reduction in PTV RMS deviation from 2.8% to 2.6%. Conclusion: Fluence-based trajectory optimization improved plan quality as defined by the composite metric. While dose differences were case specific, fluence-based optimization improved both PTV and OAR dosimetry in 80% of cases.
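The composite metric described above is a sum of per-objective values weighted by relative importance. The sketch below uses the importance weights listed in the abstract; the dictionary keys, function names, and the units or normalization of each term are assumptions, since the abstract does not specify them.

```python
import math

# Relative importance weights as listed in the abstract; the key names are hypothetical.
WEIGHTS = {
    "ptv_rms_deviation": 100,
    "brainstem_max": 10,
    "optic_chiasm_max": 5,
    "globes_max": 5,
    "optic_nerves_max": 5,
    "lenses_mean": 5,
    "hippocampi_mean": 3,
    "temporal_lobes_mean": 2,
    "cochleae_mean": 1,
    "brain_other_mean": 1,
}


def rms_deviation(doses, prescription):
    """Root-mean-square deviation of sampled doses from the prescription dose."""
    return math.sqrt(sum((d - prescription) ** 2 for d in doses) / len(doses))


def composite_metric(terms, weights=WEIGHTS):
    """Sum of the supplied objective values, each multiplied by its importance weight."""
    return sum(weights[name] * value for name, value in terms.items())
```

Because the PTV term carries a weight of 100, a small change in PTV RMS deviation can outweigh a larger change in a low-weight OAR term, which is consistent with the one case reported where the OAR component rose but the composite metric fell.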
Climate Data Analytics Workflow Management
NASA Astrophysics Data System (ADS)
Zhang, J.; Lee, S.; Pan, L.; Mattmann, C. A.; Lee, T. J.
2016-12-01
In this project we aim to pave a novel path to create a sustainable building block toward Earth science big data analytics and knowledge sharing. Closely studying how Earth scientists conduct data analytics research in their daily work, we have developed a provenance model to record their activities, and a technology to automatically generate workflows for scientists from the provenance. On top of it, we have built the prototype of a data-centric provenance repository, and established a PDSW (People, Data, Service, Workflow) knowledge network to support workflow recommendation. To ensure the scalability and performance of the expected recommendation system, we have leveraged the Apache OODT system technology. The community-approved, metrics-based performance evaluation web-service will allow a user to select a metric from the list of several community-approved metrics and to evaluate model performance using the metric as well as the reference dataset. This service will facilitate the use of reference datasets that are generated in support of the model-data intercomparison projects such as Obs4MIPs and Ana4MIPs. The data-centric repository infrastructure will allow us to capture richer provenance to further facilitate knowledge sharing and scientific collaboration in the Earth science community. This project is part of the Apache incubator CMDA project.
Nyflot, Matthew J.; Yang, Fei; Byrd, Darrin; Bowen, Stephen R.; Sandison, George A.; Kinahan, Paul E.
2015-01-01
Abstract. Image heterogeneity metrics such as textural features are an active area of research for evaluating clinical outcomes with positron emission tomography (PET) imaging and other modalities. However, the effects of stochastic image acquisition noise on these metrics are poorly understood. We performed a simulation study by generating 50 statistically independent PET images of the NEMA IQ phantom with realistic noise and resolution properties. Heterogeneity metrics based on gray-level intensity histograms, co-occurrence matrices, neighborhood difference matrices, and zone size matrices were evaluated within regions of interest surrounding the lesions. The impact of stochastic variability was evaluated with percent difference from the mean of the 50 realizations, coefficient of variation and estimated sample size for clinical trials. Additionally, sensitivity studies were performed to simulate the effects of patient size and image reconstruction method on the quantitative performance of these metrics. Complex trends in variability were revealed as a function of textural feature, lesion size, patient size, and reconstruction parameters. In conclusion, the sensitivity of PET textural features to normal stochastic image variation and imaging parameters can be large and is feature-dependent. Standards are needed to ensure that prospective studies that incorporate textural features are properly designed to measure true effects that may impact clinical outcomes. PMID:26251842
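Two of the variability measures named above, percent difference from the mean of the realizations and coefficient of variation, can be sketched as follows. This is a minimal illustration, assuming one scalar feature value per noise realization; the function names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def percent_difference_from_mean(values):
    """Percent difference of each realization from the mean over all realizations."""
    mean = statistics.fmean(values)
    return [100.0 * (v - mean) / mean for v in values]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, expressed in percent."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)
```

In the study's setting, `values` would hold one textural-feature value for each of the 50 simulated images.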
Orbit Clustering Based on Transfer Cost
NASA Technical Reports Server (NTRS)
Gustafson, Eric D.; Arrieta-Camacho, Juan J.; Petropoulos, Anastassios E.
2013-01-01
We propose using cluster analysis to perform quick screening for combinatorial global optimization problems. The key missing component currently preventing cluster analysis from use in this context is the lack of a useable metric function that defines the cost to transfer between two orbits. We study several proposed metrics and clustering algorithms, including k-means and the expectation maximization algorithm. We also show that proven heuristic methods such as the Q-law can be modified to work with cluster analysis.
An Instrumented Glove to Assess Manual Dexterity in Simulation-Based Neurosurgical Education
Lemos, Juan Diego; Hernandez, Alher Mauricio; Soto-Romero, Georges
2017-01-01
The traditional neurosurgical apprenticeship scheme includes the assessment of trainees’ manual skills by experienced surgeons. However, the introduction of surgical simulation technology presents a new paradigm where residents can refine surgical techniques on a simulator before putting them into practice in real patients. Unfortunately, in this new scheme, an experienced surgeon will not always be available to evaluate a trainee’s performance. For this reason, it is necessary to develop automatic mechanisms to estimate metrics for assessing manual dexterity in a quantitative way. Authors have proposed some hardware-software approaches to evaluate manual dexterity on surgical simulators. This paper presents IGlove, a wearable device that uses inertial sensors embedded in an elastic glove to capture hand movements. Metrics to assess manual dexterity are estimated from sensor signals using data processing and information analysis algorithms. It has been designed to be used with a neurosurgical simulator called Daubara NS Trainer, but can be easily adapted to other benchtop- and manikin-based medical simulators. The system was tested with a sample of 14 volunteers who performed a test that was designed to simultaneously evaluate their fine motor skills and the IGlove’s functionalities. The metrics obtained by each participant are presented as results in this work; it is also shown how these metrics are used to automatically evaluate the level of manual dexterity of each volunteer. PMID:28468268
Model Performance Evaluation and Scenario Analysis ...
This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors. The performance measures include error analysis, coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components and the reconstruction back to time series provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify if the source of uncertainty in the simulated data is due to the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify if mismatches between observed and simulated data result from magnitude or sequence related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the too
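Of the goodness-of-fit measures listed in the record above, the Nash-Sutcliffe efficiency has a standard closed form: one minus the ratio of the squared simulation error to the variance of the observations about their mean. The sketch below is a generic implementation of that textbook definition, not MPESA's own code; the function name is hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equally long")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series has zero variance; efficiency is undefined")
    return 1.0 - error / spread
```

A value of 1 indicates a perfect match, and 0 means the simulation is no better than predicting the observed mean at every time step.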
Combining control input with flight path data to evaluate pilot performance in transport aircraft.
Ebbatson, Matt; Harris, Don; Huddlestone, John; Sears, Rodney
2008-11-01
When deriving an objective assessment of piloting performance from flight data records, it is common to employ metrics which purely evaluate errors in flight path parameters. The adequacy of pilot performance is evaluated from the flight path of the aircraft. However, in large jet transport aircraft these measures may be insensitive and require supplementing with frequency-based measures of control input parameters. Flight path and control input data were collected from pilots undertaking a jet transport aircraft conversion course during a series of symmetric and asymmetric approaches in a flight simulator. The flight path data were analyzed for deviations around the optimum flight path while flying an instrument landing approach. Manipulation of the flight controls was subject to analysis using a series of power spectral density measures. The flight path metrics showed no significant differences in performance between the symmetric and asymmetric approaches. However, control input frequency domain measures revealed that the pilots employed highly different control strategies in the pitch and yaw axes. The results demonstrate that to evaluate pilot performance fully in large aircraft, it is necessary to employ performance metrics targeted at both the outer control loop (flight path) and the inner control loop (flight control) parameters in parallel, evaluating both the product and process of a pilot's performance.
Integrating automated support for a software management cycle into the TAME system
NASA Technical Reports Server (NTRS)
Sunazuka, Toshihiko; Basili, Victor R.
1989-01-01
Software managers are interested in the quantitative management of software quality, cost and progress. An integrated software management methodology, which can be applied throughout the software life cycle for any number of purposes, is required. The TAME (Tailoring A Measurement Environment) methodology is based on the improvement paradigm and the goal/question/metric (GQM) paradigm. This methodology helps generate a software engineering process and measurement environment based on the project characteristics. The SQMAR (software quality measurement and assurance technology) is a software quality metric system and methodology applied to the development processes. It is based on the feed forward control principle. Quality target setting is carried out before the plan-do-check-action activities are performed. These methodologies are integrated to realize goal oriented measurement, process control and visual management. A metric setting procedure based on the GQM paradigm, a management system called the software management cycle (SMC), and its application to a case study based on NASA/SEL data are discussed. The expected effects of SMC are quality improvement, managerial cost reduction, accumulation and reuse of experience, and a highly visual management reporting system.
Strategy quantification using body worn inertial sensors in a reactive agility task.
Eke, Chika U; Cain, Stephen M; Stirling, Leia A
2017-11-07
Agility performance is often evaluated using time-based metrics, which provide little information about which factors aid or limit success. The objective of this study was to better understand agility strategy by identifying biomechanical metrics that were sensitive to performance speed, which were calculated with data from an array of body-worn inertial sensors. Five metrics were defined (normalized number of foot contacts, stride length variance, arm swing variance, mean normalized stride frequency, and number of body rotations) that corresponded to agility terms defined by experts working in athletic, clinical, and military environments. Eighteen participants donned 13 sensors to complete a reactive agility task, which involved navigating a set of cones in response to a vocal cue. Participants were grouped into fast, medium, and slow performance based on their completion time. Participants in the fast group had the smallest number of foot contacts (normalizing by height), highest stride length variance (normalizing by height), highest forearm angular velocity variance, and highest stride frequency (normalizing by height). The number of body rotations was not sensitive to speed and may have been determined by hand and foot dominance while completing the agility task. The results of this study have the potential to inform the development of a composite agility score constructed from the list of significant metrics. By quantifying the agility terms previously defined by expert evaluators through an agility score, this study can assist in strategy development for training and rehabilitation across athletic, clinical, and military domains. Copyright © 2017 Elsevier Ltd. All rights reserved.
Subjective Performance Evaluation in the Public Sector: Evidence from School Inspections. CEE DP 135
ERIC Educational Resources Information Center
Hussain, Iftikhar
2012-01-01
Performance measurement in the public sector is largely based on objective metrics, which may be subject to gaming behaviour. This paper investigates a novel subjective performance evaluation system where independent inspectors visit schools at very short notice, publicly disclose their findings and sanction schools rated fail. First, I…
Evaluating which plan quality metrics are appropriate for use in lung SBRT.
Yaparpalvi, Ravindra; Garg, Madhur K; Shen, Jin; Bodner, William R; Mynampati, Dinesh K; Gafar, Aleiya; Kuo, Hsiang-Chi; Basavatia, Amar K; Ohri, Nitin; Hong, Linda X; Kalnicki, Shalom; Tome, Wolfgang A
2018-02-01
Several dose metrics in the categories of homogeneity, coverage, conformity and gradient have been proposed in the literature for evaluating treatment plan quality. In this study, we applied these metrics to characterize and identify the plan quality metrics that would merit plan quality assessment in lung stereotactic body radiation therapy (SBRT) dose distributions. Treatment plans of 90 lung SBRT patients, comprising 91 targets, treated in our institution were retrospectively reviewed. Dose calculations were performed using anisotropic analytical algorithm (AAA) with heterogeneity correction. A literature review on published plan quality metrics in the categories of coverage, homogeneity, conformity and gradient was performed. For each patient, using dose-volume histogram data, plan quality metric values were quantified and analysed. For the study, the radiation therapy oncology group (RTOG) defined plan quality metrics were: coverage (0.90 ± 0.08); homogeneity (1.27 ± 0.07); conformity (1.03 ± 0.07) and gradient (4.40 ± 0.80). Geometric conformity strongly correlated with conformity index (p < 0.0001). Gradient measures strongly correlated with target volume (p < 0.0001). The RTOG lung SBRT protocol advocated conformity guidelines for prescribed dose in all categories were met in ≥94% of cases. The proportion of total lung volume receiving doses of 20 Gy and 5 Gy (V20 and V5) were mean 4.8% (±3.2) and 16.4% (±9.2), respectively. Based on our study analyses, we recommend the following metrics as appropriate surrogates for establishing SBRT lung plan quality guidelines: coverage % (ICRU 62), conformity (CN or CI Paddick) and gradient (R50%). Furthermore, we strongly recommend that RTOG lung SBRT protocols adopt either CN or CI Paddick in place of prescription isodose to target volume ratio for conformity index evaluation. Advances in knowledge: Our study metrics are valuable tools for establishing lung SBRT plan quality guidelines.
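The conformity and gradient measures recommended above reduce to simple volume ratios. The sketch below uses the commonly cited definitions (the Paddick conformity index, the prescription isodose to target volume ratio, and R50%); the function and parameter names are hypothetical, and whether these match the study's exact computation is an assumption.

```python
def paddick_conformity_index(target_volume, prescription_isodose_volume, overlap_volume):
    """Paddick CI = TV_PIV^2 / (TV * PIV), with TV_PIV the target volume covered by the isodose."""
    return overlap_volume ** 2 / (target_volume * prescription_isodose_volume)


def isodose_to_target_ratio(target_volume, prescription_isodose_volume):
    """Prescription isodose volume divided by target volume (PIV / TV)."""
    return prescription_isodose_volume / target_volume


def r50(target_volume, half_prescription_isodose_volume):
    """Gradient measure R50%: volume of the 50% prescription isodose divided by target volume."""
    return half_prescription_isodose_volume / target_volume
```

Unlike PIV / TV, the Paddick index depends on the overlap volume, so a prescription isodose of the right size placed in the wrong location still scores poorly.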
EEG amplitude modulation analysis for semi-automated diagnosis of Alzheimer's disease
NASA Astrophysics Data System (ADS)
Falk, Tiago H.; Fraga, Francisco J.; Trambaiolli, Lucas; Anghinah, Renato
2012-12-01
Recent experimental evidence has suggested a neuromodulatory deficit in Alzheimer's disease (AD). In this paper, we present a new electroencephalogram (EEG) based metric to quantitatively characterize neuromodulatory activity. More specifically, the short-term EEG amplitude modulation rate-of-change (i.e., modulation frequency) is computed for five EEG subband signals. To test the performance of the proposed metric, a classification task was performed on a database of 32 participants partitioned into three groups of approximately equal size: healthy controls, patients diagnosed with mild AD, and those with moderate-to-severe AD. To gauge the benefits of the proposed metric, performance results were compared with those obtained using EEG spectral peak parameters which were recently shown to outperform other conventional EEG measures. Using a simple feature selection algorithm based on area-under-the-curve maximization and a support vector machine classifier, the proposed parameters resulted in accuracy gains, relative to spectral peak parameters, of 21.3% when discriminating between the three groups and by 50% when mild and moderate-to-severe groups were merged into one. The preliminary findings reported herein provide promising insights that automated tools may be developed to assist physicians in very early diagnosis of AD as well as provide researchers with a tool to automatically characterize cross-frequency interactions and their changes with disease.
Human-centric predictive model of task difficulty for human-in-the-loop control tasks
Majewicz Fey, Ann
2018-01-01
Quantitatively measuring the difficulty of a manipulation task in human-in-the-loop control systems is ill-defined. Currently, systems are typically evaluated through task-specific performance measures and post-experiment user surveys; however, these methods do not capture the real-time experience of human users. In this study, we propose to analyze and predict the difficulty of a bivariate pointing task, with a haptic device interface, using human-centric measurement data in terms of cognition, physical effort, and motion kinematics. Noninvasive sensors were used to record the multimodal responses of 14 subjects performing the task. A data-driven approach for predicting task difficulty was implemented based on several task-independent metrics. We compare four possible models for predicting task difficulty to evaluate the roles of the various types of metrics, including: (I) a movement time model, (II) a fusion model using both physiological and kinematic metrics, (III) a model only with kinematic metrics, and (IV) a model only with physiological metrics. The results show significant correlation between task difficulty and the user sensorimotor response. The fusion model, integrating user physiology and motion kinematics, provided the best estimate of task difficulty (R² = 0.927), followed by a model using only kinematic metrics (R² = 0.921). Both models were better predictors of task difficulty than the movement time model (R² = 0.847), derived from Fitts’s law, a well-studied difficulty model for human psychomotor control. PMID:29621301
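The movement time baseline above is derived from Fitts's law, which relates movement time linearly to an index of difficulty computed from target distance and width. The sketch below uses the classic one-dimensional form, ID = log2(2D / W) and MT = a + b · ID. The abstract does not say which formulation or bivariate extension the study used, so this form, the function names, and the coefficients `a` and `b` are all assumptions.

```python
import math


def index_of_difficulty(distance, width):
    """Classic Fitts index of difficulty in bits: log2(2 * distance / width)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


def movement_time(distance, width, a, b):
    """Predicted movement time a + b * ID, with a and b fitted empirically per user or device."""
    return a + b * index_of_difficulty(distance, width)
```

For example, a target of width 2 at distance 8 (same units) has an index of difficulty of 3 bits.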
Performance evaluation of a distance learning program.
Dailey, D J; Eno, K R; Brinkley, J F
1994-01-01
This paper presents a performance metric which uses a single number to characterize the response time for a non-deterministic client-server application operating over the Internet. When applied to a Macintosh-based distance learning application called the Digital Anatomist Browser, the metric allowed us to observe that "A typical student doing a typical mix of Browser commands on a typical data set will experience the same delay if they use a slow Macintosh on a local network or a fast Macintosh on the other side of the country accessing the data over the Internet." The methodology presented is applicable to other client-server applications that are rapidly appearing on the Internet.
Blind Source Parameters for Performance Evaluation of Despeckling Filters.
Biradar, Nagashettappa; Dewal, M L; Rohit, ManojKumar; Gowre, Sanjaykumar; Gundge, Yogesh
2016-01-01
The speckle noise is inherent to transthoracic echocardiographic images. A standard noise-free reference echocardiographic image does not exist. The evaluation of filters based on the traditional parameters such as peak signal-to-noise ratio, mean square error, and structural similarity index may not reflect the true filter performance on echocardiographic images. Therefore, the performance of despeckling can be evaluated using blind assessment metrics like the speckle suppression index, speckle suppression and mean preservation index (SMPI), and beta metric. The need for noise-free reference image is overcome using these three parameters. This paper presents a comprehensive analysis and evaluation of eleven types of despeckling filters for echocardiographic images in terms of blind and traditional performance parameters along with clinical validation. The noise is effectively suppressed using the logarithmic neighborhood shrinkage (NeighShrink) embedded with Stein's unbiased risk estimation (SURE). The SMPI is three times more effective compared to the wavelet based generalized likelihood estimation approach. The quantitative evaluation and clinical validation reveal that the filters such as the nonlocal mean, posterior sampling based Bayesian estimation, hybrid median, and probabilistic patch based filters are acceptable whereas median, anisotropic diffusion, fuzzy, and Ripplet nonlinear approximation filters have limited applications for echocardiographic images.
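The blind metrics named above need only the noisy image and the filtered image, not a noise-free reference. The sketch below follows the definitions commonly cited for the speckle suppression index, SSI = (σ_f / μ_f) · (μ_o / σ_o), and for SMPI = (1 + |μ_o − μ_f|) · σ_f / σ_o, where o is the original image and f the filtered one. Treating images as flat pixel lists, the function names, and the assumption that these definitions match the paper's are all illustrative.

```python
import statistics


def speckle_suppression_index(original, filtered):
    """SSI: coefficient of variation of the filtered image over that of the original."""
    cv_filtered = statistics.pstdev(filtered) / statistics.fmean(filtered)
    cv_original = statistics.pstdev(original) / statistics.fmean(original)
    return cv_filtered / cv_original


def smpi(original, filtered):
    """SMPI: standard-deviation ratio penalized by the shift in mean intensity."""
    q = 1.0 + abs(statistics.fmean(original) - statistics.fmean(filtered))
    return q * statistics.pstdev(filtered) / statistics.pstdev(original)
```

Under both definitions lower values indicate stronger speckle suppression, and SMPI additionally penalizes filters that change the mean intensity.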
A novel critical infrastructure resilience assessment approach using dynamic Bayesian networks
NASA Astrophysics Data System (ADS)
Cai, Baoping; Xie, Min; Liu, Yonghong; Liu, Yiliu; Ji, Renjie; Feng, Qiang
2017-10-01
The word resilience originates from the Latin word "resiliere", which means to "bounce back". The concept has been used in various fields, such as ecology, economics, psychology, and society, with different definitions. In the field of critical infrastructure, although some resilience metrics have been proposed, they differ substantially from each other because each is determined by the performance of the object being evaluated. Here we bridge the gap by developing a universal critical infrastructure resilience metric from the perspective of reliability engineering. A dynamic Bayesian networks-based assessment approach is proposed to calculate the resilience value. A series, parallel and voting system is used to demonstrate the application of the developed resilience metric and assessment approach.
Boareto, Marcelo; Cesar, Jonatas; Leite, Vitor B P; Caticha, Nestor
2015-01-01
We introduce Supervised Variational Relevance Learning (Suvrel), a variational method to determine metric tensors to define distance based similarity in pattern classification, inspired by relevance learning. The variational method is applied to a cost function that penalizes large intraclass distances and favors large interclass distances. We find analytically the metric tensor that minimizes the cost function. Preprocessing the patterns by doing linear transformations using the metric tensor yields a dataset which can be more efficiently classified. We test our methods using publicly available datasets, for some standard classifiers. Among these datasets, two were tested by the MAQC-II project and, even without the use of further preprocessing, our results improve on their performance.
Varshney, Rickul; Frenkiel, Saul; Nguyen, Lily H P; Young, Meredith; Del Maestro, Rolando; Zeitouni, Anthony; Tewfik, Marc A
2014-01-01
The technical challenges of endoscopic sinus surgery (ESS) and the high risk of complications support the development of alternative modalities to train residents in these procedures. Virtual reality simulation is becoming a useful tool for training the skills necessary for minimally invasive surgery; however, there are currently no ESS virtual reality simulators available with valid evidence supporting their use in resident education. Our aim was to develop a new rhinology simulator, as well as to define potential performance metrics for trainee assessment. The McGill simulator for endoscopic sinus surgery (MSESS), a new sinus surgery virtual reality simulator with haptic feedback, was developed (a collaboration between the McGill University Department of Otolaryngology-Head and Neck Surgery, the Montreal Neurologic Institute Simulation Lab, and the National Research Council of Canada). A panel of experts in education, performance assessment, rhinology, and skull base surgery convened to identify core technical abilities that would need to be taught by the simulator, as well as performance metrics to be developed and captured. The MSESS allows the user to perform basic sinus surgery skills, such as an ethmoidectomy and sphenoidotomy, through the use of endoscopic tools in a virtual nasal model. The performance metrics were developed by an expert panel and include measurements of safety, quality, and efficiency of the procedure. The MSESS incorporates novel technological advancements to create a realistic platform for trainees. To our knowledge, this is the first simulator to combine novel tools such as the endonasal wash and elaborate anatomic deformity with advanced performance metrics for ESS.
Three validation metrics for automated probabilistic image segmentation of brain tumours
Zou, Kelly H.; Wells, William M.; Kikinis, Ron; Warfield, Simon K.
2005-01-01
SUMMARY The validity of brain tumour segmentation is an important issue in image processing because it has a direct impact on surgical planning. We examined the segmentation accuracy based on three two-sample validation metrics against the estimated composite latent gold standard, which was derived from several experts’ manual segmentations by an EM algorithm. The distribution functions of the tumour and control pixel data were parametrically assumed to be a mixture of two beta distributions with different shape parameters. We estimated the corresponding receiver operating characteristic curve, Dice similarity coefficient, and mutual information, over all possible decision thresholds. Based on each validation metric, an optimal threshold was then computed via maximization. We illustrated these methods on MR imaging data from nine brain tumour cases of three different tumour types, each consisting of a large number of pixels. The automated segmentation yielded satisfactory accuracy with varied optimal thresholds. The performances of these validation metrics were also investigated via Monte Carlo simulation. Extensions of incorporating spatial correlation structures using a Markov random field model were considered. PMID:15083482
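The threshold selection described above can be illustrated with a small empirical sketch: compute the Dice similarity coefficient at each candidate threshold and keep the maximizer. This is not the authors' procedure, which fits a parametric beta mixture; here the probability map is simply thresholded and compared with a binary gold standard. The function names are hypothetical.

```python
import numpy as np

def dice(pred_mask, truth_mask):
    """Dice similarity coefficient: 2|A and B| / (|A| + |B|)."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    truth_mask = np.asarray(truth_mask, dtype=bool)
    denom = pred_mask.sum() + truth_mask.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred_mask, truth_mask).sum() / denom

def best_dice_threshold(prob_map, truth_mask, thresholds):
    """Return (threshold, score) maximizing Dice over the candidates."""
    prob_map = np.asarray(prob_map, dtype=float)
    scores = [dice(prob_map >= t, truth_mask) for t in thresholds]
    best = int(np.argmax(scores))
    return thresholds[best], scores[best]
```

The same loop could be reused with mutual information or an ROC-based criterion in place of `dice`.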
Johnson, S J; Hunt, C M; Woolnough, H M; Crawshaw, M; Kilkenny, C; Gould, D A; England, A; Sinha, A; Villard, P F
2012-05-01
The aim of this article was to identify and prospectively investigate simulated ultrasound-guided targeted liver biopsy performance metrics as differentiators between levels of expertise in interventional radiology. Task analysis produced detailed procedural step documentation allowing identification of critical procedure steps and performance metrics for use in a virtual reality ultrasound-guided targeted liver biopsy procedure. Consultant (n=14; male=11, female=3) and trainee (n=26; male=19, female=7) scores on the performance metrics were compared. Ethical approval was granted by the Liverpool Research Ethics Committee (UK). Independent t-tests and analysis of variance (ANOVA) investigated differences between groups. Independent t-tests revealed significant differences between trainees and consultants on three performance metrics: targeting, p=0.018, t=-2.487 (-2.040 to -0.207); probe usage time, p = 0.040, t=2.132 (11.064 to 427.983); mean needle length in beam, p=0.029, t=-2.272 (-0.028 to -0.002). ANOVA reported significant differences across years of experience (0-1, 1-2, 3+ years) on seven performance metrics: no-go area touched, p=0.012; targeting, p=0.025; length of session, p=0.024; probe usage time, p=0.025; total needle distance moved, p=0.038; number of skin contacts, p<0.001; total time in no-go area, p=0.008. More experienced participants consistently received better performance scores on all 19 performance metrics. It is possible to measure and monitor performance using simulation, with performance metrics providing feedback on skill level and differentiating levels of expertise. However, a transfer of training study is required.
Towards Principled Experimental Study of Autonomous Mobile Robots
NASA Technical Reports Server (NTRS)
Gat, Erann
1995-01-01
We review the current state of research in autonomous mobile robots and conclude that there is an inadequate basis for predicting the reliability and behavior of robots operating in unengineered environments. We present a new approach to the study of autonomous mobile robot performance based on formal statistical analysis of independently reproducible experiments conducted on real robots. Simulators serve as models rather than experimental surrogates. We demonstrate three new results: 1) two commonly used performance metrics (time and distance) are not as well correlated as is often tacitly assumed; 2) the probability distributions of these performance metrics are exponential rather than normal; and 3) a modular, object-oriented simulation accurately predicts the behavior of the real robot in a statistically significant manner.
Analysis of Skeletal Muscle Metrics as Predictors of Functional Task Performance
NASA Technical Reports Server (NTRS)
Ryder, Jeffrey W.; Buxton, Roxanne E.; Redd, Elizabeth; Scott-Pandorf, Melissa; Hackney, Kyle J.; Fiedler, James; Ploutz-Snyder, Robert J.; Bloomberg, Jacob J.; Ploutz-Snyder, Lori L.
2010-01-01
PURPOSE: The ability to predict task performance using physiological performance metrics is vital to ensure that astronauts can execute their jobs safely and effectively. This investigation used a weighted suit to evaluate task performance at various ratios of strength, power, and endurance to body weight. METHODS: Twenty subjects completed muscle performance tests and functional tasks representative of those that would be required of astronauts during planetary exploration (see table for specific tests/tasks). Subjects performed functional tasks while wearing a weighted suit with additional loads ranging from 0-120% of initial body weight. Performance metrics were time to completion for all tasks except hatch opening, which consisted of total work. Task performance metrics were plotted against muscle metrics normalized to "body weight" (subject weight + external load; BW) for each trial. Fractional polynomial regression was used to model the relationship between muscle and task performance. CONCLUSION: LPMIF/BW is the best predictor of performance for predominantly lower-body tasks that are ambulatory and of short duration. LPMIF/BW is a very practical predictor of occupational task performance as it is quick and relatively safe to perform. Accordingly, bench press work best predicts hatch-opening work performance.
Contrast-based sensorless adaptive optics for retinal imaging.
Zhou, Xiaolin; Bedggood, Phillip; Bui, Bang; Nguyen, Christine T O; He, Zheng; Metha, Andrew
2015-09-01
Conventional adaptive optics ophthalmoscopes use wavefront sensing methods to characterize ocular aberrations for real-time correction. However, there are important situations in which the wavefront sensing step is susceptible to difficulties that affect the accuracy of the correction. To circumvent these, wavefront sensorless adaptive optics (or non-wavefront sensing AO; NS-AO) imaging has recently been developed and has been applied to point-scanning based retinal imaging modalities. In this study we show, for the first time, contrast-based NS-AO ophthalmoscopy for full-frame in vivo imaging of human and animal eyes. We suggest a robust image quality metric that could be used for any imaging modality, and test its performance against other metrics using (physical) model eyes.
Multi-scale functional mapping of tidal marsh vegetation for restoration monitoring
NASA Astrophysics Data System (ADS)
Tuxen Bettman, Karin
2007-12-01
Nearly half of the world's natural wetlands have been destroyed or degraded, and in recent years, there have been significant endeavors to restore wetland habitat throughout the world. Detailed mapping of restoring wetlands can offer valuable information about changes in vegetation and geomorphology, which can inform the restoration process and ultimately help to improve chances of restoration success. I studied six tidal marshes in the San Francisco Estuary, CA, US, between 2003 and 2004 in order to develop techniques for mapping tidal marshes at multiple scales by incorporating specific restoration objectives for improved longer term monitoring. I explored a "pixel-based" remote sensing image analysis method for mapping vegetation in restored and natural tidal marshes, describing the benefits and limitations of this type of approach (Chapter 2). I also performed a multi-scale analysis of vegetation pattern metrics for a recently restored tidal marsh in order to target the metrics that are consistent across scales and will be robust measures of marsh vegetation change (Chapter 3). Finally, I performed an "object-based" image analysis using the same remotely sensed imagery, which maps vegetation type and specific wetland functions at multiple scales (Chapter 4). The combined results of my work highlight important trends and management implications for monitoring wetland restoration using remote sensing, and will better enable restoration ecologists to use remote sensing for tidal marsh monitoring. Several findings important for tidal marsh restoration monitoring were made. Overall results showed that pixel-based methods are effective at quantifying landscape changes in composition and diversity in recently restored marshes, but are limited in their use for quantifying smaller, more fine-scale changes. 
While pattern metrics can highlight small but important changes in vegetation composition and configuration across years, scientists should exercise caution when using metrics in their studies or to validate restoration management decisions, and multi-scale analyses should be performed before metrics are used in restoration science for important management decisions. Lastly, restoration objectives, ecosystem function, and scale can each be integrated into monitoring techniques using remote sensing for improved restoration monitoring.
Advanced Life Support System Value Metric
NASA Technical Reports Server (NTRS)
Jones, Harry W.; Rasky, Daniel J. (Technical Monitor)
1999-01-01
The NASA Advanced Life Support (ALS) Program is required to provide a performance metric to measure its progress in system development. Extensive discussions within the ALS program have led to the following approach. The Equivalent System Mass (ESM) metric has been traditionally used and provides a good summary of the weight, size, and power cost factors of space life support equipment. But ESM assumes that all the systems being traded off exactly meet a fixed performance requirement, so that the value and benefit (readiness, performance, safety, etc.) of all the different systems designs are considered to be exactly equal. This is too simplistic. Actual system design concepts are selected using many cost and benefit factors and the system specification is defined after many trade-offs. The ALS program needs a multi-parameter metric including both the ESM and a System Value Metric (SVM). The SVM would include safety, maintainability, reliability, performance, use of cross cutting technology, and commercialization potential. Another major factor in system selection is technology readiness level (TRL), a familiar metric in ALS. The overall ALS system metric that is suggested is a benefit/cost ratio, SVM/[ESM + function (TRL)], with appropriate weighting and scaling. The total value is given by SVM. Cost is represented by higher ESM and lower TRL. The paper provides a detailed description and example application of a suggested System Value Metric and an overall ALS system metric.
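As a rough illustration of the benefit/cost ratio SVM/[ESM + function (TRL)] described above, the sketch below uses a made-up TRL term. The abstract does not specify the function of TRL, the weighting, or the scaling, so `trl_penalty` and its `mass_per_level` parameter are purely hypothetical placeholders.

```python
def trl_penalty(trl, mass_per_level=100.0):
    """Hypothetical cost term: equivalent mass added for each level
    below TRL 9. The real function of TRL is not given in the abstract."""
    return mass_per_level * (9 - trl)

def als_metric(svm, esm, trl, penalty=trl_penalty):
    """Benefit/cost ratio: SVM / (ESM + f(TRL))."""
    return svm / (esm + penalty(trl))
```

With this placeholder, a less mature technology (lower TRL) raises the denominator and lowers the overall metric, matching the statement that cost is represented by higher ESM and lower TRL.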
Conditional anomaly detection methods for patient–management alert systems
Valko, Michal; Cooper, Gregory; Seybert, Amy; Visweswaran, Shyam; Saul, Melissa; Hauskrecht, Milos
2010-01-01
Anomaly detection methods can be very useful in identifying unusual or interesting patterns in data. A recently proposed conditional anomaly detection framework extends anomaly detection to the problem of identifying anomalous patterns on a subset of attributes in the data. The anomaly always depends (is conditioned) on the value of remaining attributes. The work presented in this paper focuses on instance–based methods for detecting conditional anomalies. The methods rely on the distance metric to identify examples in the dataset that are most critical for detecting the anomaly. We investigate various metrics and metric learning methods to optimize the performance of the instance–based anomaly detection methods. We show the benefits of the instance–based methods on two real–world detection problems: detection of unusual admission decisions for patients with the community–acquired pneumonia and detection of unusual orders of an HPF4 test that is used to confirm Heparin induced thrombocytopenia — a life–threatening condition caused by the Heparin therapy. PMID:25392850
R&D100: Lightweight Distributed Metric Service
Gentile, Ann; Brandt, Jim; Tucker, Tom; Showerman, Mike
2018-06-12
On today's High Performance Computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS provides collection, transport, and storage of data from extreme-scale systems at fidelities and timescales to provide understanding of application and system performance with no statistically significant impact on application performance.
R&D100: Lightweight Distributed Metric Service
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gentile, Ann; Brandt, Jim; Tucker, Tom
2015-11-19
On today's High Performance Computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS provides collection, transport, and storage of data from extreme-scale systems at fidelities and timescales to provide understanding of application and system performance with no statistically significant impact on application performance.
NASA Astrophysics Data System (ADS)
Selvam, Kayalvizhi; Vinod Kumar, D. M.; Siripuram, Ramakanth
2017-04-01
In this paper, an optimization technique called the peer enhanced teaching learning based optimization (PeTLBO) algorithm is used in the multi-objective problem domain. The PeTLBO algorithm is parameter-less, which reduces the computational burden. The proposed peer enhanced multi-objective TLBO (PeMOTLBO) algorithm has been utilized to find a set of non-dominated optimal solutions [distributed generation (DG) location and sizing in a distribution network]. The objectives considered are real power loss and voltage deviation, subject to voltage limits and the maximum penetration level of DG in the distribution network. Since the DG considered is capable of injecting real and reactive power into the distribution network, the power factor is taken as 0.85 leading. The proposed peer enhanced multi-objective optimization technique provides different trade-off solutions; to find the best compromise solution, a fuzzy set theory approach has been used. The effectiveness of the proposed PeMOTLBO is tested on the IEEE 33-bus and Indian 85-bus distribution systems. The performance is validated with Pareto fronts and two performance metrics (C-metric and S-metric) by comparing with a robust multi-objective technique called non-dominated sorting genetic algorithm-II and also with the basic TLBO.
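The two selection steps mentioned above, extracting non-dominated solutions and picking a best compromise with fuzzy membership, can be sketched as follows. This is not the PeMOTLBO algorithm itself. It assumes all objectives are minimized and uses a linear membership function, a common choice that the abstract does not confirm. The function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def best_compromise(front):
    """Pick the point with the highest summed linear fuzzy membership,
    where membership is 1 at the best value of an objective and 0 at
    the worst value found on the front."""
    n_obj = len(front[0])
    lows = [min(p[k] for p in front) for k in range(n_obj)]
    highs = [max(p[k] for p in front) for k in range(n_obj)]

    def membership(p):
        total = 0.0
        for k in range(n_obj):
            span = highs[k] - lows[k]
            total += 1.0 if span == 0 else (highs[k] - p[k]) / span
        return total

    return max(front, key=membership)
```

For example, with objective pairs (power loss, voltage deviation), `best_compromise(non_dominated(points))` returns one trade-off solution from the Pareto front.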
Favazza, Christopher P; Fetterly, Kenneth A; Hangiandreou, Nicholas J; Leng, Shuai; Schueler, Beth A
2015-01-01
Evaluation of flat-panel angiography equipment through conventional image quality metrics is limited by the scope of standard spatial-domain image quality metric(s), such as contrast-to-noise ratio and spatial resolution, or by restricted access to appropriate data to calculate Fourier domain measurements, such as modulation transfer function, noise power spectrum, and detective quantum efficiency. Observer models have been shown capable of overcoming these limitations and are able to comprehensively evaluate medical-imaging systems. We present a spatial domain-based channelized Hotelling observer model to calculate the detectability index (DI) of our different sized disks and compare the performance of different imaging conditions and angiography systems. When appropriate, changes in DIs were compared to expectations based on the classical Rose model of signal detection to assess linearity of the model with quantum signal-to-noise ratio (SNR) theory. For these experiments, the estimated uncertainty of the DIs was less than 3%, allowing for precise comparison of imaging systems or conditions. For most experimental variables, DI changes were linear with expectations based on quantum SNR theory. DIs calculated for the smallest objects demonstrated nonlinearity with quantum SNR theory due to system blur. Two angiography systems with different detector element sizes were shown to perform similarly across the majority of the detection tasks.
HealthTrust: a social network approach for retrieving online health videos.
Fernandez-Luque, Luis; Karlsen, Randi; Melton, Genevieve B
2012-01-31
Social media are becoming mainstream in the health domain. Despite the large volume of accurate and trustworthy health information available on social media platforms, finding good-quality health information can be difficult. Misleading health information can often be popular (eg, antivaccination videos) and therefore highly rated by general search engines. We believe that community wisdom about the quality of health information can be harnessed to help create tools for retrieving good-quality social media content. Our objective was to explore approaches for extracting metrics about authoritativeness in online health communities and how these metrics positively correlate with the quality of the content. We designed a metric, called HealthTrust, that estimates the trustworthiness of social media content (eg, blog posts or videos) in a health community. The HealthTrust metric calculates reputation in an online health community based on link analysis. We used the metric to retrieve YouTube videos and channels about diabetes. In two different experiments, health consumers provided 427 ratings of 17 videos and professionals gave 162 ratings of 23 videos. In addition, two professionals reviewed 30 diabetes channels. HealthTrust may be used for retrieving online videos on diabetes, since it performed better than YouTube Search in most cases. Overall, of 20 potential channels, HealthTrust's filtering allowed only 3 bad channels (15%) versus 8 (40%) on the YouTube list. Misleading and graphic videos (eg, featuring amputations) were more commonly found by YouTube Search than by searches based on HealthTrust. However, some videos from trusted sources had low HealthTrust scores, mostly from general health content providers, and therefore not highly connected in the diabetes community. 
When comparing video ratings from our reviewers, we found that HealthTrust achieved a positive and statistically significant correlation with professionals (Pearson r₁₀ = .65, P = .02) and a trend toward significance with health consumers (r₇ = .65, P = .06) with videos on hemoglobinA(1c), but it did not perform as well with diabetic foot videos. The trust-based metric HealthTrust showed promising results when used to retrieve diabetes content from YouTube. Our research indicates that social network analysis may be used to identify trustworthy social media in health communities.
Advanced Life Support System Value Metric
NASA Technical Reports Server (NTRS)
Jones, Harry W.; Arnold, James O. (Technical Monitor)
1999-01-01
The NASA Advanced Life Support (ALS) Program is required to provide a performance metric to measure its progress in system development. Extensive discussions within the ALS program have reached a consensus. The Equivalent System Mass (ESM) metric has been traditionally used and provides a good summary of the weight, size, and power cost factors of space life support equipment. But ESM assumes that all the systems being traded off exactly meet a fixed performance requirement, so that the value and benefit (readiness, performance, safety, etc.) of all the different systems designs are exactly equal. This is too simplistic. Actual system design concepts are selected using many cost and benefit factors and the system specification is then set accordingly. The ALS program needs a multi-parameter metric including both the ESM and a System Value Metric (SVM). The SVM would include safety, maintainability, reliability, performance, use of cross cutting technology, and commercialization potential. Another major factor in system selection is technology readiness level (TRL), a familiar metric in ALS. The overall ALS system metric that is suggested is a benefit/cost ratio, [SVM + TRL]/ESM, with appropriate weighting and scaling. The total value is the sum of SVM and TRL. Cost is represented by ESM. The paper provides a detailed description and example application of the suggested System Value Metric.
Oropesa, Ignacio; Sánchez-González, Patricia; Chmarra, Magdalena K; Lamata, Pablo; Fernández, Alvaro; Sánchez-Margallo, Juan A; Jansen, Frank Willem; Dankelman, Jenny; Sánchez-Margallo, Francisco M; Gómez, Enrique J
2013-03-01
The EVA (Endoscopic Video Analysis) tracking system is a new system for extracting motions of laparoscopic instruments based on nonobtrusive video tracking. The feasibility of using EVA in laparoscopic settings has been tested in a box trainer setup. EVA makes use of an algorithm that employs information of the laparoscopic instrument's shaft edges in the image, the instrument's insertion point, and the camera's optical center to track the three-dimensional position of the instrument tip. A validation study of EVA comprised a comparison of the measurements achieved with EVA and the TrEndo tracking system. To this end, 42 participants (16 novices, 22 residents, and 4 experts) were asked to perform a peg transfer task in a box trainer. Ten motion-based metrics were used to assess their performance. Construct validation of the EVA has been obtained for seven motion-based metrics. Concurrent validation revealed that there is a strong correlation between the results obtained by EVA and the TrEndo for metrics, such as path length (ρ = 0.97), average speed (ρ = 0.94), or economy of volume (ρ = 0.85), proving the viability of EVA. EVA has been successfully validated in a box trainer setup, showing the potential of endoscopic video analysis to assess laparoscopic psychomotor skills. The results encourage further implementation of video tracking in training setups and image-guided surgery.
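Two of the motion-based metrics named above, path length and average speed, are simple to compute once the instrument tip has been tracked. The sketch below assumes the tracker yields an array of 3D tip positions with matching timestamps. The function names and input layout are hypothetical, not taken from EVA or TrEndo.

```python
import numpy as np

def path_length(positions):
    """Total distance travelled by the tip: sum of the Euclidean
    distances between consecutive 3D samples."""
    positions = np.asarray(positions, dtype=float)
    steps = np.diff(positions, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())

def average_speed(positions, timestamps):
    """Path length divided by the elapsed time of the recording."""
    duration = float(timestamps[-1] - timestamps[0])
    return path_length(positions) / duration
```

Concurrent validation, as in the abstract, would then correlate these values across participants for the two tracking systems.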
Using community-level metrics to monitor the effects of marine protected areas on biodiversity.
Soykan, Candan U; Lewison, Rebecca L
2015-06-01
Marine protected areas (MPAs) are used to protect species, communities, and their associated habitats, among other goals. Measuring MPA efficacy can be challenging, however, particularly when considering responses at the community level. We gathered 36 abundance and 14 biomass data sets on fish assemblages and used meta-analysis to evaluate the ability of 22 distinct community diversity metrics to detect differences in community structure between MPAs and nearby control sites. We also considered the effects of 6 covariates (MPA size and age, MPA size and age interaction, latitude, total species richness, and level of protection) on each metric. Some common metrics, such as species richness and Shannon diversity, did not differ consistently between MPA and control sites, whereas other metrics, such as total abundance and biomass, were consistently different across studies. Metric responses derived from the biomass data sets were more consistent than those based on the abundance data sets, suggesting that community-level biomass differs more predictably than abundance between MPA and control sites. Covariate analyses indicated that level of protection, latitude, MPA size, and the interaction between MPA size and age affect metric performance. These results highlight a handful of metrics, several of which are little known, that could be used to meet the increasing demand for community-level indicators of MPA effectiveness. © 2015 Society for Conservation Biology.
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale
Kobourov, Stephen; Gallant, Mike; Börner, Katy
2016-01-01
Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. 
We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters. PMID:27391786
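For readers unfamiliar with conductance, the stand-alone metric singled out above, here is a minimal sketch for a single cluster in an undirected, unweighted graph. It uses the textbook definition (cut edges divided by the smaller of the two volumes, lower is better); the paper may use a rescaled or averaged variant, so treat the exact form as an assumption. The function name and edge-list input are hypothetical.

```python
def conductance(edges, cluster):
    """Conductance of `cluster`: edges leaving the cluster divided by
    min(volume inside, volume outside), where volume is the sum of
    node degrees. `edges` is an iterable of undirected (u, v) pairs."""
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in cluster, v in cluster
        if u_in != v_in:
            cut += 1  # edge crosses the cluster boundary
        # each endpoint adds one unit of degree to its side's volume
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0
```

Unlike the information recovery metrics, this needs no ground-truth labels, which is what makes it a stand-alone metric.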
Comparison of 3D displays using objective metrics
NASA Astrophysics Data System (ADS)
Havig, Paul; McIntire, John; Dixon, Sharon; Moore, Jason; Reis, George
2008-04-01
Previously, we (Havig, Aleva, Reis, Moore, and McIntire, 2007) presented a taxonomy for the development of three-dimensional (3D) displays. We proposed three levels of metrics: objective (in which physical measurements are made of the display), subjective (Likert-type rating scales to show preferences of the display), and subjective-objective (performance metrics in which one shows how the 3D display may be more or less useful than a 2D display or a different 3D display). We concluded that for each level of metric, drawing practical comparisons among currently disparate 3D displays is difficult. In this paper we attempt to define more clearly the objective metrics for 3D displays. We set out to collect and measure physical attributes of several 3D displays and compare the results. We discuss our findings in terms of both the difficulties in making the measurements in the first place, due to the physical set-up of the display, and the issues in comparing the results and judging how similar (or dissimilar) two 3D displays may be. We conclude by discussing the next steps in creating objective metrics for three-dimensional displays as well as a proposed way ahead for the other two levels of metrics based on our findings.
Performance Metrics for Soil Moisture Retrievals and Applications Requirements
USDA-ARS's Scientific Manuscript database
Quadratic performance metrics such as root-mean-square error (RMSE) and time series correlation are often used to assess the accuracy of geophysical retrievals and true fields. These metrics are generally related; nevertheless each has advantages and disadvantages. In this study we explore the relat...
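One well-known way in which RMSE and correlation are related is the decomposition of mean-square error into bias, the two standard deviations, and the Pearson correlation. The abstract is truncated, so whether this is the relationship the study explores is an assumption. The sketch below simply checks the identity numerically, with hypothetical function names.

```python
import numpy as np

def rmse(estimate, truth):
    """Root-mean-square error between two equal-length series."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

def rmse_from_components(estimate, truth):
    """Rebuild RMSE from bias, population standard deviations, and
    Pearson correlation:
    MSE = bias^2 + s_e^2 + s_t^2 - 2 * r * s_e * s_t."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    bias = estimate.mean() - truth.mean()
    s_e, s_t = estimate.std(), truth.std()
    r = np.corrcoef(estimate, truth)[0, 1]
    return float(np.sqrt(bias**2 + s_e**2 + s_t**2 - 2.0 * r * s_e * s_t))
```

The identity shows that a high correlation alone does not guarantee a low RMSE: bias and a mismatch in variability also contribute.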
A biologically plausible computational model for auditory object recognition.
Larson, Eric; Billimoria, Cyrus P; Sen, Kamal
2009-01-01
Object recognition is a task of fundamental importance for sensory systems. Although this problem has been intensively investigated in the visual system, relatively little is known about the recognition of complex auditory objects. Recent work has shown that spike trains from individual sensory neurons can be used to discriminate between and recognize stimuli. Multiple groups have developed spike similarity or dissimilarity metrics to quantify the differences between spike trains. Using a nearest-neighbor approach the spike similarity metrics can be used to classify the stimuli into groups used to evoke the spike trains. The nearest prototype spike train to the tested spike train can then be used to identify the stimulus. However, how biological circuits might perform such computations remains unclear. Elucidating this question would facilitate the experimental search for such circuits in biological systems, as well as the design of artificial circuits that can perform such computations. Here we present a biologically plausible model for discrimination inspired by a spike distance metric using a network of integrate-and-fire model neurons coupled to a decision network. We then apply this model to the birdsong system in the context of song discrimination and recognition. We show that the model circuit is effective at recognizing individual songs, based on experimental input data from field L, the avian primary auditory cortex analog. We also compare the performance and robustness of this model to two alternative models of song discrimination: a model based on coincidence detection and a model based on firing rate.
Li, Guang; Greene, Travis C; Nishino, Thomas K; Willis, Charles E
2016-09-08
The purpose of this study was to evaluate several of the standardized image quality metrics proposed by the American Association of Physicists in Medicine (AAPM) Task Group 150. The task group suggested region-of-interest (ROI)-based techniques to measure nonuniformity, minimum signal-to-noise ratio (SNR), number of anomalous pixels, and modulation transfer function (MTF). This study evaluated the effects of ROI size and layout on the image metrics by using four different ROI sets, assessed result uncertainty by repeating measurements, and compared results with two commercially available quality control tools, namely the Carestream DIRECTVIEW Total Quality Tool (TQT) and the GE Healthcare Quality Assurance Process (QAP). Seven Carestream DRX-1C (CsI) detectors on mobile DR systems and four GE FlashPad detectors in radiographic rooms were tested. Images were analyzed using MATLAB software that had been previously validated and reported. Our values for signal and SNR nonuniformity and MTF agree with values published by other investigators. Our results show that ROI size affects nonuniformity and minimum SNR measurements, but not detection of anomalous pixels. Exposure geometry affects all tested image metrics except for the MTF. TG-150 metrics in general agree with the TQT, but agree with the QAP only for local and global signal nonuniformity. The difference in SNR nonuniformity and MTF values between the TG-150 and QAP may be explained by differences in the calculation of noise and acquisition beam quality, respectively. TG-150's SNR nonuniformity metrics are also more sensitive to detector nonuniformity than those of the QAP. Our results suggest that fixed ROI size should be used for consistency because nonuniformity metrics depend on ROI size. Ideally, detector tests should be performed at the exact calibration position. If not feasible, a baseline should be established from the mean of several repeated measurements.
Our study indicates that the TG-150 tests can be used as an independent standardized procedure for detector performance assessment. © 2016 The Authors.
Greene, Travis C.; Nishino, Thomas K.; Willis, Charles E.
2016-01-01
The purpose of this study was to evaluate several of the standardized image quality metrics proposed by the American Association of Physicists in Medicine (AAPM) Task Group 150. The task group suggested region‐of‐interest (ROI)‐based techniques to measure nonuniformity, minimum signal‐to‐noise ratio (SNR), number of anomalous pixels, and modulation transfer function (MTF). This study evaluated the effects of ROI size and layout on the image metrics by using four different ROI sets, assessed result uncertainty by repeating measurements, and compared results with two commercially available quality control tools, namely the Carestream DIRECTVIEW Total Quality Tool (TQT) and the GE Healthcare Quality Assurance Process (QAP). Seven Carestream DRX‐1C (CsI) detectors on mobile DR systems and four GE FlashPad detectors in radiographic rooms were tested. Images were analyzed using MATLAB software that had been previously validated and reported. Our values for signal and SNR nonuniformity and MTF agree with values published by other investigators. Our results show that ROI size affects nonuniformity and minimum SNR measurements, but not detection of anomalous pixels. Exposure geometry affects all tested image metrics except for the MTF. TG‐150 metrics in general agree with the TQT, but agree with the QAP only for local and global signal nonuniformity. The difference in SNR nonuniformity and MTF values between the TG‐150 and QAP may be explained by differences in the calculation of noise and acquisition beam quality, respectively. TG‐150's SNR nonuniformity metrics are also more sensitive to detector nonuniformity than those of the QAP. Our results suggest that fixed ROI size should be used for consistency because nonuniformity metrics depend on ROI size. Ideally, detector tests should be performed at the exact calibration position. If not feasible, a baseline should be established from the mean of several repeated measurements.
Our study indicates that the TG‐150 tests can be used as an independent standardized procedure for detector performance assessment. PACS number(s): 87.57.‐s, 87.57.C PMID:27685102
De Keersmaecker, Wanda; Lhermitte, Stef; Honnay, Olivier; Farifteh, Jamshid; Somers, Ben; Coppin, Pol
2014-07-01
Increasing frequency of extreme climate events is likely to impose increased stress on ecosystems and to jeopardize the services that ecosystems provide. Therefore, it is of major importance to assess the effects of extreme climate events on the temporal stability (i.e., the resistance, the resilience, and the variance) of ecosystem properties. Most time series of ecosystem properties are, however, affected by varying data characteristics, uncertainties, and noise, which complicate the comparison of ecosystem stability metrics (ESMs) between locations. Therefore, there is a strong need for a more comprehensive understanding regarding the reliability of stability metrics and how they can be used to compare ecosystem stability globally. The objective of this study was to evaluate the performance of temporal ESMs based on time series of the Moderate Resolution Imaging Spectroradiometer derived Normalized Difference Vegetation Index of 15 global land-cover types. We provide a framework (i) to assess the reliability of ESMs as a function of data characteristics, uncertainties, and noise and (ii) to integrate reliability estimates in future global ecosystem stability studies against climate disturbances. The performance of our framework was tested through (i) a global ecosystem comparison and (ii) a comparison of ecosystem stability in response to the 2003 drought. The results show the influence of data quality on the accuracy of ecosystem stability. White noise, biased noise, and trends have a stronger effect on the accuracy of stability metrics than the length of the time series, temporal resolution, or amount of missing values. Moreover, we demonstrate the importance of integrating reliability estimates to interpret stability metrics within confidence limits.
Based on these confidence limits, other studies dealing with specific ecosystem types or locations can be put into context, and a more reliable assessment of ecosystem stability against environmental disturbances can be obtained. © 2013 John Wiley & Sons Ltd.
IT Metrics and Money: One Approach to Public Accountability
ERIC Educational Resources Information Center
Daigle, Stephen L.
2004-01-01
Performance measurement can be a difficult political as well as technical challenge for educational institutions at all levels. Performance-based budgeting can raise the stakes still higher by linking resource allocation to a public "report card." The 23-campus system of the California State University (CSU) accepted each of these…
Johnson, S J; Hunt, C M; Woolnough, H M; Crawshaw, M; Kilkenny, C; Gould, D A; England, A; Sinha, A; Villard, P F
2012-01-01
Objectives The aim of this article was to identify and prospectively investigate simulated ultrasound-guided targeted liver biopsy performance metrics as differentiators between levels of expertise in interventional radiology. Methods Task analysis produced detailed procedural step documentation allowing identification of critical procedure steps and performance metrics for use in a virtual reality ultrasound-guided targeted liver biopsy procedure. Consultant (n=14; male=11, female=3) and trainee (n=26; male=19, female=7) scores on the performance metrics were compared. Ethical approval was granted by the Liverpool Research Ethics Committee (UK). Independent t-tests and analysis of variance (ANOVA) investigated differences between groups. Results Independent t-tests revealed significant differences between trainees and consultants on three performance metrics: targeting, p=0.018, t=−2.487 (−2.040 to −0.207); probe usage time, p = 0.040, t=2.132 (11.064 to 427.983); mean needle length in beam, p=0.029, t=−2.272 (−0.028 to −0.002). ANOVA reported significant differences across years of experience (0–1, 1–2, 3+ years) on seven performance metrics: no-go area touched, p=0.012; targeting, p=0.025; length of session, p=0.024; probe usage time, p=0.025; total needle distance moved, p=0.038; number of skin contacts, p<0.001; total time in no-go area, p=0.008. More experienced participants consistently received better performance scores on all 19 performance metrics. Conclusion It is possible to measure and monitor performance using simulation, with performance metrics providing feedback on skill level and differentiating levels of expertise. However, a transfer of training study is required. PMID:21304005
Gahm, Jin Kyu; Shi, Yonggang
2018-01-01
Surface mapping methods play an important role in various brain imaging studies from tracking the maturation of adolescent brains to mapping gray matter atrophy patterns in Alzheimer’s disease. Popular surface mapping approaches based on spherical registration, however, have inherent numerical limitations when severe metric distortions are present during the spherical parameterization step. In this paper, we propose a novel computational framework for intrinsic surface mapping in the Laplace-Beltrami (LB) embedding space based on Riemannian metric optimization on surfaces (RMOS). Given a diffeomorphism between two surfaces, an isometry can be defined using the pullback metric, which in turn results in identical LB embeddings from the two surfaces. The proposed RMOS approach builds upon this mathematical foundation and achieves general feature-driven surface mapping in the LB embedding space by iteratively optimizing the Riemannian metric defined on the edges of triangular meshes. At the core of our framework is an optimization engine that converts an energy function for surface mapping into a distance measure in the LB embedding space, which can be effectively optimized using gradients of the LB eigen-system with respect to the Riemannian metrics. In the experimental results, we compare the RMOS algorithm with spherical registration using large-scale brain imaging data, and show that RMOS achieves superior performance in the prediction of hippocampal subfields and cortical gyral labels, and the holistic mapping of striatal surfaces for the construction of a striatal connectivity atlas from substantia nigra. PMID:29574399
A causal examination of the effects of confounding factors on multimetric indices
Schoolmaster, Donald R.; Grace, James B.; Schweiger, E. William; Mitchell, Brian R.; Guntenspergen, Glenn R.
2013-01-01
The development of multimetric indices (MMIs) as a means of providing integrative measures of ecosystem condition is becoming widespread. An increasingly recognized problem for the interpretability of MMIs is controlling for the potentially confounding influences of environmental covariates. Most common approaches to handling covariates are based on simple notions of statistical control, leaving the causal implications of covariates and their adjustment unstated. In this paper, we use graphical models to examine some of the potential impacts of environmental covariates on the observed signals between human disturbance and potential response metrics. Using simulations based on various causal networks, we show how environmental covariates can both obscure and exaggerate the effects of human disturbance on individual metrics. We then examine from a causal interpretation standpoint the common practice of adjusting ecological metrics for environmental influences using only the set of sites deemed to be in reference condition. We present and examine the performance of an alternative approach to metric adjustment that uses the whole set of sites and models both environmental and human disturbance effects simultaneously. The findings from our analyses indicate that failing to model and adjust metrics can result in a systematic bias towards those metrics in which environmental covariates function to artificially strengthen the metric–disturbance relationship resulting in MMIs that do not accurately measure impacts of human disturbance. We also find that a “whole-set modeling approach” requires fewer assumptions and is more efficient with the given information than the more commonly applied “reference-set” approach.
Metrics for evaluating performance and uncertainty of Bayesian network models
Bruce G. Marcot
2012-01-01
This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...
Relational Agreement Measures for Similarity Searching of Cheminformatic Data Sets.
Rivera-Borroto, Oscar Miguel; García-de la Vega, José Manuel; Marrero-Ponce, Yovani; Grau, Ricardo
2016-01-01
Research on similarity searching of cheminformatic data sets has been focused on similarity measures using fingerprints. However, nominal scales are the least informative of all metric scales, increasing the tied similarity scores, and decreasing the effectiveness of the retrieval engines. Tanimoto's coefficient has been claimed to be the most prominent measure for this task. Nevertheless, this field is far from being exhausted since the no-free-lunch theorem of computer science predicts that "no similarity measure has overall superiority over the population of data sets". We introduce 12 relational agreement (RA) coefficients for seven metric scales, which are integrated within a group fusion-based similarity searching algorithm. These similarity measures are compared to a reference panel of 21 proximity quantifiers over 17 benchmark data sets (MUV), by using informative descriptors, a feature selection stage, a suitable performance metric, and powerful comparison tests. In this stage, RA coefficients perform favourably with respect to the state-of-the-art proximity measures. Afterward, the RA-based method outperforms four other nearest neighbor searching algorithms over the same data domains. In a third validation stage, RA measures are successfully applied to the virtual screening of the NCI data set. Finally, we discuss a possible molecular interpretation for these similarity variants.
Sweet-spot training for early esophageal cancer detection
NASA Astrophysics Data System (ADS)
van der Sommen, Fons; Zinger, Svitlana; Schoon, Erik J.; de With, Peter H. N.
2016-03-01
Over the past decade, the imaging tools for endoscopists have improved drastically. This has enabled physicians to visually inspect the intestinal tissue for early signs of malignant lesions. Besides this, recent studies show the feasibility of supportive image analysis for endoscopists, but the analysis problem is typically approached as a segmentation task where binary ground truth is employed. In this study, we show that the detection of early cancerous tissue in the gastrointestinal tract cannot be approached as a binary segmentation problem and it is crucial and clinically relevant to involve multiple experts for annotating early lesions. By employing the so-called sweet spot for training purposes as a metric, a much better detection performance can be achieved. Furthermore, a multi-expert-based ground truth, i.e. a golden standard, enables an improved validation of the resulting delineations. For this purpose, besides the sweet spot we also propose another novel metric, the Jaccard Golden Standard (JIGS) that can handle multiple ground-truth annotations. Our experiments involving these new metrics and based on the golden standard show that the performance of a detection algorithm of early neoplastic lesions in Barrett's esophagus can be increased significantly, demonstrating a 10 percent point increase in the resulting F1 detection score.
A relationship between eye movement patterns and performance in a precognitive tracking task
NASA Technical Reports Server (NTRS)
Repperger, D. W.; Hartzell, E. J.
1977-01-01
Eye movements made by various subjects in the performance of a precognitive tracking task are studied. The tracking task presented by an antiaircraft artillery (AAA) simulator has an input forcing function represented by a deterministic aircraft fly-by. The performance of subjects is ranked by two metrics. Good, mediocre, and poor trackers are selected for analysis based on performance during the difficult segment of the tracking task and over replications. Using phase planes to characterize both the eye movement patterns and the displayed error signal, a simple metric is developed to study these patterns. Two characterizations of eye movement strategies are defined and quantified. Using these two types of eye strategies, two conclusions are obtained about good, mediocre, and poor trackers. First, the eye tracker who used a fixed strategy will consistently perform better. Secondly, the best fixed strategy is defined as a Crosshair Fixator.
Regime-Based Evaluation of Cloudiness in CMIP5 Models
NASA Technical Reports Server (NTRS)
Jin, Daeho; Oraiopoulos, Lazaros; Lee, Dong Min
2016-01-01
The concept of Cloud Regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 fifth Coupled Model Intercomparison Project (CMIP5) models. Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator generating for each gridcell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product (long-term average total cloud amount [TCA]), cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies yield also substantial TCA errors. Our findings support previous studies showing that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite their shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer (MODIS) cloud observations evaluated against ISCCP as if they were another model output. Lastly, cloud simulation performance is contrasted with each model's equilibrium climate sensitivity (ECS) in order to gain insight on whether good cloud simulation pairs with particular values of this parameter.
Vector-based navigation using grid-like representations in artificial agents.
Banino, Andrea; Barry, Caswell; Uria, Benigno; Blundell, Charles; Lillicrap, Timothy; Mirowski, Piotr; Pritzel, Alexander; Chadwick, Martin J; Degris, Thomas; Modayil, Joseph; Wayne, Greg; Soyer, Hubert; Viola, Fabio; Zhang, Brian; Goroshin, Ross; Rabinowitz, Neil; Pascanu, Razvan; Beattie, Charlie; Petersen, Stig; Sadik, Amir; Gaffney, Stephen; King, Helen; Kavukcuoglu, Koray; Hassabis, Demis; Hadsell, Raia; Kumaran, Dharshan
2018-05-01
Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go 1,2 . Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning 3-5 failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex 6 . Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space 7,8 and is critical for integrating self-motion (path integration) 6,7,9 and planning direct trajectories to goals (vector-based navigation) 7,10,11 . Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types 12 . We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments-optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. 
As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation 7,10,11 , demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.
Separation Assurance and Scheduling Coordination in the Arrival Environment
NASA Technical Reports Server (NTRS)
Aweiss, Arwa S.; Cone, Andrew C.; Holladay, Joshua J.; Munoz, Epifanio; Lewis, Timothy A.
2016-01-01
Separation assurance (SA) automation has been proposed as either a ground-based or airborne paradigm. The arrival environment is complex because aircraft are being sequenced and spaced to the arrival fix. This paper examines the effect of the allocation of the SA and scheduling functions on the performance of the system. Two coordination configurations between an SA and an arrival management system are tested using both ground and airborne implementations. All configurations have a conflict detection and resolution (CD&R) system and either an integrated or separated scheduler. Performance metrics are presented for the ground and airborne systems based on arrival traffic headed to Dallas/Fort Worth International Airport. The total delay, time-spacing conformance, and schedule conformance are used to measure efficiency. The goal of the analysis is to use the metrics to identify performance differences between the configurations that are based on different function allocations. A surveillance range limitation of 100 nmi and a time delay for sharing updated trajectory intent of 30 seconds were implemented for the airborne system. Overall, these results indicate that the surveillance range and the sharing of trajectories and aircraft schedules are important factors in determining the efficiency of an airborne arrival management system. These parameters are not relevant to the ground-based system as modeled for this study because it has instantaneous access to all aircraft trajectories and intent. Creating a schedule external to the CD&R and the scheduling conformance system was seen to reduce total delays for the airborne system, and had a minor effect on the ground-based system. The effect of an external scheduler on other metrics was mixed.
A Three-Dimensional Receiver Operator Characteristic Surface Diagnostic Metric
NASA Technical Reports Server (NTRS)
Simon, Donald L.
2011-01-01
Receiver Operator Characteristic (ROC) curves are commonly applied as metrics for quantifying the performance of binary fault detection systems. An ROC curve provides a visual representation of a detection system's True Positive Rate versus False Positive Rate sensitivity as the detection threshold is varied. The area under the curve provides a measure of fault detection performance independent of the applied detection threshold. While the standard ROC curve is well suited for quantifying binary fault detection performance, it is not suitable for quantifying the classification performance of multi-fault classification problems. Furthermore, it does not provide a measure of diagnostic latency. To address these shortcomings, a novel three-dimensional receiver operator characteristic (3D ROC) surface metric has been developed. This is done by generating and applying two separate curves: the standard ROC curve reflecting fault detection performance, and a second curve reflecting fault classification performance. A third dimension, diagnostic latency, is added giving rise to 3D ROC surfaces. Applying numerical integration techniques, the volumes under and between the surfaces are calculated to produce metrics of the diagnostic system's detection and classification performance. This paper will describe the 3D ROC surface metric in detail, and present an example of its application for quantifying the performance of aircraft engine gas path diagnostic methods. Metric limitations and potential enhancements are also discussed.
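As an illustration of the standard two-dimensional case described above, the following minimal sketch sweeps a detection threshold over scored samples to trace the ROC curve and then integrates it with the trapezoidal rule. It does not reproduce the paper's 3D ROC surface; the function names and the sample scores are hypothetical.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists as the detection threshold sweeps from high to low.

    scores: detector outputs, higher means more fault-like.
    labels: 1 for a true fault, 0 for a nominal case.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Samples with tied scores cross the threshold together.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by trapezoidal integration."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )

# Hypothetical detector scores and ground-truth labels.
fpr, tpr = roc_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(auc(fpr, tpr))
```

An area of 1.0 corresponds to perfect separation of faults from nominal cases, and 0.5 to a detector that is no better than chance.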
Correlation of admissions statistics to graduate student success in medical physics
McSpadden, Erin; Rakowski, Joseph; Nalichowski, Adrian; Yudelev, Mark; Snyder, Michael
2014-01-01
The purpose of this work is to develop metrics for evaluation of medical physics graduate student performance, assess relationships between success and other quantifiable factors, and determine whether graduate student performance can be accurately predicted by admissions statistics. A cohort of 108 medical physics graduate students from a single institution were rated for performance after matriculation based on final scores in specific courses, first year graduate Grade Point Average (GPA), performance on the program exit exam, performance in oral review sessions, and faculty rating. Admissions statistics including matriculating program (MS vs. PhD); undergraduate degree type, GPA, and country; graduate degree; general and subject GRE scores; traditional vs. nontraditional status; and ranking by admissions committee were evaluated for potential correlation with the performance metrics. GRE verbal and quantitative scores were correlated with higher scores in the most difficult courses in the program and with the program exit exam; however, the GRE section most correlated with overall faculty rating was the analytical writing section. Students with undergraduate degrees in engineering had a higher faculty rating than those from other disciplines and faculty rating was strongly correlated with undergraduate country. Undergraduate GPA was not statistically correlated with any success metrics investigated in this study. However, the high degree of selection on GPA and quantitative GRE scores during the admissions process results in relatively narrow ranges for these quantities. As such, these results do not necessarily imply that one should not strongly consider traditional metrics, such as undergraduate GPA and quantitative GRE score, during the admissions process. They suggest that once applicants have been initially filtered by these metrics, additional selection should be performed via the other metrics shown here to be correlated with success. 
The parameters used to make admissions decisions for our program are accurate in predicting student success, as illustrated by the very strong statistical correlation between admissions rank and course average, first year graduate GPA, and faculty rating (p<0.002). Overall, this study indicates that an undergraduate degree in physics should not be considered a fundamental requirement for entry into our program and that within the relatively narrow range of undergraduate GPA and quantitative GRE scores of those admitted into our program, additional variations in these metrics are not important predictors of success. While the high degree of selection on particular statistics involved in the admissions process, along with the relatively small sample size, makes it difficult to draw concrete conclusions about the meaning of correlations here, these results suggest that success in medical physics is based on more than quantitative capabilities. Specifically, they indicate that analytical and communication skills play a major role in student success in our program, as well as predicted future success by program faculty members. Finally, this study confirms that our current admissions process is effective in identifying candidates who will be successful in our program and are expected to be successful after graduation, and provides additional insight useful in improving our admissions selection process. PACS number: 01.40.‐d PMID:24423842
NASA Astrophysics Data System (ADS)
Dostal, P.; Krasula, L.; Klima, M.
2012-06-01
Various image processing techniques in multimedia technology are optimized using the visual attention feature of the human visual system (HVS). Because of spatial non-uniformity, different locations in an image differ in their importance for the perception of the image. In other words, the perceived image quality depends mainly on the quality of important locations known as regions of interest (ROI). The performance of such techniques is measured by subjective evaluation or objective image quality criteria. Many state-of-the-art objective metrics are based on HVS properties: SSIM and MS-SSIM are based on image structural information, VIF on the information that the human brain can ideally gain from the reference image, and FSIM uses low-level features to assign a different importance to each location in the image. Still, none of these objective metrics utilizes the analysis of regions of interest. We address the question of whether these objective metrics can be used for effective evaluation of images reconstructed by processing techniques based on ROI analysis utilizing high-level features. In this paper the authors show that the state-of-the-art objective metrics do not correlate well with subjective evaluation when demosaicing based on ROI analysis is used for reconstruction. The ROI were computed from "ground truth" visual attention data. An algorithm combining two known demosaicing techniques on the basis of ROI location is proposed to reconstruct the ROI in fine quality while the rest of the image is reconstructed with low quality. The color image reconstructed by this ROI approach was compared with selected demosaicing techniques by objective criteria and subjective testing. The qualitative comparison of the objective and subjective results indicates that the state-of-the-art objective metrics are still not suitable for evaluating image processing techniques based on ROI analysis and that new criteria are needed.
A laser beam quality definition based on induced temperature rise.
Miller, Harold C
2012-12-17
Laser beam quality metrics like M² can be used to describe the spot sizes and propagation behavior of a wide variety of non-ideal laser beams. However, for beams that have been diffracted by limiting apertures in the near-field, or those with unusual near-field profiles, the conventional metrics can lead to an inconsistent or incomplete description of far-field performance. This paper motivates an alternative laser beam quality definition that can be used with any beam. The approach uses a consideration of the intrinsic ability of a laser beam profile to heat a material. Comparisons are made with conventional beam quality metrics. An analysis on an asymmetric Gaussian beam is used to establish a connection with the invariant beam propagation ratio.
Meoded, Avner; Kwan, Justin Y.; Peters, Tracy L.; Huey, Edward D.; Danielian, Laura E.; Wiggs, Edythe; Morrissette, Arthur; Wu, Tianxia; Russell, James W.; Bayat, Elham; Grafman, Jordan; Floeter, Mary Kay
2013-01-01
Introduction Executive dysfunction occurs in many patients with amyotrophic lateral sclerosis (ALS), but it has not been well studied in primary lateral sclerosis (PLS). The aims of this study were to (1) compare cognitive function in PLS to that in ALS patients, (2) explore the relationship between performance on specific cognitive tests and diffusion tensor imaging (DTI) metrics of white matter tracts and gray matter volumes, and (3) compare DTI metrics in patients with and without cognitive and behavioral changes. Methods The Delis-Kaplan Executive Function System (D-KEFS), the Mattis Dementia Rating Scale (DRS-2), and other behavior and mood scales were administered to 25 ALS patients and 25 PLS patients. Seventeen of the PLS patients, 13 of the ALS patients, and 17 healthy controls underwent structural magnetic resonance imaging (MRI) and DTI. Atlas-based analysis using MRI Studio software was used to measure fractional anisotropy, and axial and radial diffusivity of selected white matter tracts. Voxel-based morphometry was used to assess gray matter volumes. The relationship between diffusion properties of selected association and commissural white matter and performance on executive function and memory tests was explored using a linear regression model. Results More ALS than PLS patients had abnormal scores on the DRS-2. DRS-2 and D-KEFS scores were related to DTI metrics in several long association tracts and the callosum. Reduced gray matter volumes in motor and perirolandic areas were not associated with cognitive scores. Conclusion The changes in diffusion metrics of white matter long association tracts suggest that the loss of integrity of the networks connecting fronto-temporal areas to parietal and occipital areas contributes to cognitive impairment. PMID:24052798
Csapo, Peter; Raab, Markus
2014-01-01
The hot-hand phenomenon, according to which a player’s performance is significantly elevated during certain phases relative to the expected performance based on the player’s base rate, has left many researchers and fans in basketball puzzled: The vast majority of players, coaches and fans believe in its existence but statistical evidence supporting this belief has been scarce. It has frequently been argued that the hot hand in basketball is unobservable because of strategic adjustments and defensive interference of the opposing team. We use a dataset with novel metrics, such as the number of defenders and the defensive intensity for each shot attempt, which enable us to directly measure defensive pressure. First, we examine how the shooting percentage of NBA players changes relative to the attributes of each metric. We find that it is of lesser importance by how many defenders a player is guarded but that defensive intensity, e.g., whether a defender raises his hand when his opponent shoots, has a larger impact on shot difficulty. Second, we explore how the underlying metrics and shooting accuracy change as a function of streak length. Our results indicate that defensive pressure and shot difficulty increase (decrease) during hot (cold) streaks, so that defenders seem to behave according to the hot-hand belief and try to force hot players into more difficult shots. However, we find that shooting percentages of presumably hot players do not increase and that shooting performance is not related to streakiness, so that the defenders’ hot-hand behavior cannot be considered ecologically rational. Therefore, we are unable to find evidence in favor of the hot-hand effect even when accounting for defensive pressure. PMID:25474443
Feng, Fred; Bao, Shan; Sayer, James R; Flannagan, Carol; Manser, Michael; Wunderlich, Robert
2017-07-01
This paper investigated the characteristics of vehicle longitudinal jerk (change rate of acceleration with respect to time) by using vehicle sensor data from an existing naturalistic driving study. The main objective was to examine whether vehicle jerk contains useful information that could be potentially used to identify aggressive drivers. Initial investigation showed that there are unique characteristics of vehicle jerk in drivers' gas and brake pedal operations. Thus two jerk-based metrics were examined: (1) driver's frequency of using large positive jerk when pressing the gas pedal, and (2) driver's frequency of using large negative jerk when pressing the brake pedal. To validate the performance of the two metrics, drivers were firstly divided into an aggressive group and a normal group using three classification methods (1) traveling at excessive speed (speeding), (2) following too closely to a front vehicle (tailgating), and (3) their association with crashes or near-crashes in the dataset. The results show that those aggressive drivers defined using any of the three methods above were associated with significantly higher values of the two jerk-based metrics. Between the two metrics the frequency of using large negative jerk seems to have better performance in identifying aggressive drivers. A sensitivity analysis shows the findings were largely consistent with varying parameters in the analysis. The potential applications of this work include developing quantitative surrogate safety measures to identify aggressive drivers and aggressive driving, which could be potentially used to, for example, provide real-time or post-ride performance feedback to the drivers, or warn the surrounding drivers or vehicles using the connected vehicle technologies. Copyright © 2017 Elsevier Ltd. All rights reserved.
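The second jerk-based metric can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes uniformly sampled longitudinal acceleration, a per-sample brake-pedal flag, a simple finite-difference jerk estimate, and a hypothetical threshold for what counts as "large negative jerk". The function names and the threshold value are assumptions, since the abstract does not give them.

```python
def longitudinal_jerk(accel, dt):
    """Finite-difference jerk (m/s^3) from uniformly sampled acceleration (m/s^2)."""
    return [(a1 - a0) / dt for a0, a1 in zip(accel, accel[1:])]


def large_negative_jerk_rate(accel, brake_on, dt, threshold=-1.0):
    """Fraction of braking samples whose jerk falls below `threshold`.

    `threshold` is a hypothetical cut-off; the source does not state one.
    `brake_on[i]` is True when the brake pedal is pressed at sample i.
    """
    jerk = longitudinal_jerk(accel, dt)
    # jerk[i] describes the change from sample i to i + 1, so it is paired
    # with the brake flag at sample i + 1.
    braking = [j for j, pressed in zip(jerk, brake_on[1:]) if pressed]
    if not braking:
        return 0.0
    return sum(1 for j in braking if j < threshold) / len(braking)
```

A driver-level score would then be this rate computed over all of a driver's trips; the positive-jerk metric for the gas pedal follows by flipping the comparison and the pedal flag.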
Stikic, Maja; Berka, Chris; Levendowski, Daniel J.; Rubio, Roberto F.; Tan, Veasna; Korszen, Stephanie; Barba, Douglas; Wurzer, David
2014-01-01
The objective of this study was to investigate the feasibility of physiological metrics such as ECG-derived heart rate and EEG-derived cognitive workload and engagement as potential predictors of performance on different training tasks. An unsupervised approach based on self-organizing neural network (NN) was utilized to model cognitive state changes over time. The feature vector comprised EEG-engagement, EEG-workload, and heart rate metrics, all self-normalized to account for individual differences. During the competitive training process, a linear topology was developed where the feature vectors similar to each other activated the same NN nodes. The NN model was trained and auto-validated on combat marksmanship training data from 51 participants that were required to make “deadly force decisions” in challenging combat scenarios. The trained NN model was cross validated using 10-fold cross-validation. It was also validated on a golf study in which additional 22 participants were asked to complete 10 sessions of 10 putts each. Temporal sequences of the activated nodes for both studies followed the same pattern of changes, demonstrating the generalization capabilities of the approach. Most node transition changes were local, but important events typically caused significant changes in the physiological metrics, as evidenced by larger state changes. This was investigated by calculating a transition score as the sum of subsequent state transitions between the activated NN nodes. Correlation analysis demonstrated statistically significant correlations between the transition scores and subjects' performances in both studies. This paper explored the hypothesis that temporal sequences of physiological changes comprise the discriminative patterns for performance prediction. These physiological markers could be utilized in future training improvement systems (e.g., through neurofeedback), and applied across a variety of training environments. PMID:25414629
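The transition score described above can be illustrated with a short sketch. It assumes that the nodes of the linear topology are indexed by consecutive integers and that the size of one state transition is the absolute difference between consecutive node indices; both are assumptions made for illustration, and the function name is hypothetical.

```python
def transition_score(node_sequence):
    """Sum of absolute jumps between consecutively activated node indices.

    Assumes a linear (1-D) node topology, so the distance between two
    nodes is the difference of their integer indices.
    """
    return sum(abs(b - a) for a, b in zip(node_sequence, node_sequence[1:]))
```

Under this reading, a sequence that stays on neighbouring nodes gives a small score, while a jump across the map (a large state change) dominates the sum.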
Control algorithms and applications of the wavefront sensorless adaptive optics
NASA Astrophysics Data System (ADS)
Ma, Liang; Wang, Bin; Zhou, Yuanshen; Yang, Huizhen
2017-10-01
Compared with the conventional adaptive optics (AO) system, the wavefront sensorless (WFSless) AO system does not need to measure the wavefront and reconstruct it. It has a simpler system architecture than conventional AO and can be applied under complex conditions. Based on an analysis of the principle and system model of the WFSless AO system, wavefront correction methods of the WFSless AO system were divided into two categories: model-free-based and model-based control algorithms. A WFSless AO system based on model-free-based control algorithms commonly treats the performance metric as a function of the control parameters and then uses a control algorithm to improve the performance metric. The model-based control algorithms include modal control algorithms, nonlinear control algorithms and control algorithms based on geometrical optics. Following a brief description of these typical control algorithms, hybrid methods combining the model-free-based control algorithm with the model-based control algorithm were generalized. Additionally, characteristics of various control algorithms were compared and analyzed. We also discussed the extensive applications of the WFSless AO system in free space optical communication (FSO), retinal imaging in the human eye, confocal microscopy, coherent beam combination (CBC) techniques and extended objects.
A Surgical Business Composite Score for Army Medicine.
Stoddard, Douglas R; Robinson, Andrew B; Comer, Tracy A; Meno, Jenifer A; Welder, Matthew D
2016-06-01
Measuring surgical business performance for Army military treatment facilities is currently done through 6 business metrics developed by the Army Medical Command (MEDCOM) Surgical Services Service Line (3SL). Development of a composite score for business performance has the potential to simplify and synthesize measurement, improving focus for strategic goal setting and implementation. However, several considerations, ranging from data availability to submetric selection, must be addressed to ensure the score is accurate and representative. This article presents the methodology used in the composite score's creation and presents a metric based on return on investment and a measure of cases recaptured from private networks. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.
The use of vision-based image quality metrics to predict low-light performance of camera phones
NASA Astrophysics Data System (ADS)
Hultgren, B.; Hertel, D.
2010-01-01
Small digital camera modules such as those in mobile phones have become ubiquitous. Their low-light performance is of utmost importance since a high percentage of images are made under low lighting conditions where image quality failure may occur due to blur, noise, and/or underexposure. These modes of image degradation are not mutually exclusive: they share common roots in the physics of the imager, the constraints of image processing, and the general trade-off situations in camera design. A comprehensive analysis of failure modes is needed in order to understand how their interactions affect overall image quality. Low-light performance is reported for DSLR, point-and-shoot, and mobile phone cameras. The measurements target blur, noise, and exposure error. Image sharpness is evaluated from three different physical measurements: static spatial frequency response, handheld motion blur, and statistical information loss due to image processing. Visual metrics for sharpness, graininess, and brightness are calculated from the physical measurements, and displayed as orthogonal image quality metrics to illustrate the relative magnitude of image quality degradation as a function of subject illumination. The impact of each of the three sharpness measurements on overall sharpness quality is displayed for different light levels. The power spectrum of the statistical information target is a good representation of natural scenes, thus providing a defined input signal for the measurement of power-spectrum based signal-to-noise ratio to characterize overall imaging performance.
Compensation of chief executive officers at nonprofit US hospitals.
Joynt, Karen E; Le, Sidney T; Orav, E John; Jha, Ashish K
2014-01-01
Hospital chief executive officers (CEOs) can shape the priorities and performance of their organizations. The degree to which their compensation is based on their hospitals' quality performance is not well known. To characterize CEO compensation and examine its relation with quality metrics. Retrospective observational study. Participants included 1877 CEOs at 2681 private, nonprofit US hospitals. We used linear regression to identify hospital structural characteristics associated with CEO pay. We then determined the degree to which a hospital's performance on financial metrics, technologic metrics, quality metrics, and community benefit in 2008 was associated with CEO pay in 2009. The CEOs in our sample had a mean compensation of $595,781 (median, $404,938) in 2009. In multivariate analyses, CEO pay was associated with the number of hospital beds overseen ($550 for each additional bed; 95% CI, 429-671; P < .001), teaching status ($425,078 more at major teaching vs nonteaching hospitals; 95% CI, 315,238-534,918; P < .001), and urban location. Hospitals with high levels of advanced technologic capabilities compensated their CEOs $135,862 more (95% CI, 80,744-190,990; P < .001) than did hospitals with low levels of technology. Hospitals with high performance on patient satisfaction compensated their CEOs $51,706 more than did those with low performance on patient satisfaction (95% CI, 15,166-88,247; P = .006). We found no association between CEO pay and hospitals' margins, liquidity, capitalization, occupancy rates, process quality performance, mortality rates, readmission rates, or measures of community benefit. Compensation of CEOs at nonprofit hospitals was highly variable across the country. Compensation was associated with technology and patient satisfaction but not with processes of care, patient outcomes, or community benefit.
Contrast-based sensorless adaptive optics for retinal imaging
Zhou, Xiaolin; Bedggood, Phillip; Bui, Bang; Nguyen, Christine T.O.; He, Zheng; Metha, Andrew
2015-01-01
Conventional adaptive optics ophthalmoscopes use wavefront sensing methods to characterize ocular aberrations for real-time correction. However, there are important situations in which the wavefront sensing step is susceptible to difficulties that affect the accuracy of the correction. To circumvent these, wavefront sensorless adaptive optics (or non-wavefront sensing AO; NS-AO) imaging has recently been developed and has been applied to point-scanning based retinal imaging modalities. In this study we show, for the first time, contrast-based NS-AO ophthalmoscopy for full-frame in vivo imaging of human and animal eyes. We suggest a robust image quality metric that could be used for any imaging modality, and test its performance against other metrics using (physical) model eyes. PMID:26417525
Mohamed, Abdallah S. R.; Ruangskul, Manee-Naad; Awan, Musaddiq J.; Baron, Charles A.; Kalpathy-Cramer, Jayashree; Castillo, Richard; Castillo, Edward; Guerrero, Thomas M.; Kocak-Uzel, Esengul; Yang, Jinzhong; Court, Laurence E.; Kantor, Michael E.; Gunn, G. Brandon; Colen, Rivka R.; Frank, Steven J.; Garden, Adam S.; Rosenthal, David I.
2015-01-01
Purpose To develop a quality assurance (QA) workflow by using a robust, curated, manually segmented anatomic region-of-interest (ROI) library as a benchmark for quantitative assessment of different image registration techniques used for head and neck radiation therapy–simulation computed tomography (CT) with diagnostic CT coregistration. Materials and Methods Radiation therapy–simulation CT images and diagnostic CT images in 20 patients with head and neck squamous cell carcinoma treated with curative-intent intensity-modulated radiation therapy between August 2011 and May 2012 were retrospectively retrieved with institutional review board approval. Sixty-eight reference anatomic ROIs with gross tumor and nodal targets were then manually contoured on images from each examination. Diagnostic CT images were registered with simulation CT images rigidly and by using four deformable image registration (DIR) algorithms: atlas based, B-spline, demons, and optical flow. The resultant deformed ROIs were compared with manually contoured reference ROIs by using similarity coefficient metrics (ie, Dice similarity coefficient) and surface distance metrics (ie, 95% maximum Hausdorff distance). The nonparametric Steel test with control was used to compare different DIR algorithms with rigid image registration (RIR) by using the post hoc Wilcoxon signed-rank test for stratified metric comparison. Results A total of 2720 anatomic and 50 tumor and nodal ROIs were delineated. All DIR algorithms showed improved performance over RIR for anatomic and target ROI conformance, as shown for most comparison metrics (Steel test, P < .008 after Bonferroni correction). The performance of different algorithms varied substantially with stratification by specific anatomic structures or category and simulation CT section thickness. Conclusion Development of a formal ROI-based QA workflow for registration assessment demonstrated improved performance with DIR techniques over RIR. 
After QA, DIR implementation should be the standard for head and neck diagnostic CT and simulation CT allineation, especially for target delineation. © RSNA, 2014 Online supplemental material is available for this article. PMID:25380454
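The two comparison metrics named in this abstract can be sketched as below. This is a minimal illustration, not the study's pipeline: ROIs are assumed to be given as collections of voxel (or surface-point) coordinate tuples, the 95% Hausdorff distance is taken as the larger of the two directed 95th-percentile nearest-neighbour distances, and a nearest-rank percentile is used. Conventions for the 95th-percentile variant differ between tools, so those choices and the function names are assumptions.

```python
import math


def dice(roi_a, roi_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(roi_a), set(roi_b)
    if not a and not b:
        return 1.0  # two empty ROIs are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def _directed_percentile(src, dst, q):
    """q-th (nearest-rank) percentile of distances from src points to dst."""
    dists = sorted(min(math.dist(p, r) for r in dst) for p in src)
    rank = max(1, math.ceil(q * len(dists)))
    return dists[rank - 1]


def hausdorff95(points_a, points_b):
    """Symmetric 95th-percentile Hausdorff distance (both inputs non-empty)."""
    return max(
        _directed_percentile(points_a, points_b, 0.95),
        _directed_percentile(points_b, points_a, 0.95),
    )
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a sketch but would normally be replaced by a spatial index for full-resolution contours.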
Memory colours and colour quality evaluation of conventional and solid-state lamps.
Smet, Kevin A G; Ryckaert, Wouter R; Pointer, Michael R; Deconinck, Geert; Hanselaer, Peter
2010-12-06
A colour quality metric based on memory colours is presented. The basic idea is simple. The colour quality of a test source is evaluated as the degree of similarity between the colour appearance of a set of familiar objects and their memory colours. The closer the match, the better the colour quality. This similarity was quantified using a set of similarity distributions obtained by Smet et al. in a previous study. The metric was validated by calculating the Pearson and Spearman correlation coefficients between the metric predictions and the visual appreciation results obtained in a validation experiment conducted by the authors as well as those obtained in two independent studies. The metric was found to correlate well with the visual appreciation of the lighting quality of the sources used in the three experiments. Its performance was also compared with that of the CIE colour rendering index and the NIST colour quality scale. For all three experiments, the metric was found to be significantly better at predicting the correct visual rank order of the light sources (p < 0.1).
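The validation step, correlating metric predictions with visual appreciation scores, can be illustrated with a small sketch of the two coefficients. The code uses the textbook definitions (Spearman as the Pearson correlation of average ranks); the function names and any input data are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Spearman depends only on rank order, which is why it is the natural check for whether a metric predicts the visual ranking of light sources even when the relation is not linear.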
Validation metrics for turbulent plasma transport
Holland, C.
2016-06-22
Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.
Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial
This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...
NASA Astrophysics Data System (ADS)
Zhang, Lulu; Liu, Jingling; Li, Yi
2015-03-01
The influence of spatial differences, which are caused by different anthropogenic disturbances, and temporal changes, which are caused by natural conditions, on macroinvertebrate and periphyton communities in Baiyangdian Lake was compared. Periphyton and macrobenthos assemblage samples were simultaneously collected on four occasions during 2009 and 2010. Based on the physical and chemical attributes of the water and sediment, the 8 sampling sites can be divided into 5 habitat types by using cluster analysis. According to the coefficient of variation (CV) analysis, three primary conclusions can be drawn: (1) the metrics of Hilsenhoff Biotic Index (HBI), Percent Tolerant Taxa (PTT), Percent Dominant Taxon (PDT), and community loss index (CLI), based on macroinvertebrates, and the metrics of algal density (AD), the proportion of chlorophyta (CHL), and the proportion of cyanophyta (CYA), based on periphyton, were mostly constant throughout our study; (2) in terms of spatial variation, the CV values in the macroinvertebrate-based metrics were lower than the CV values in the periphyton-based metrics, and these findings may be caused by the effects of changes in environmental factors; whereas, in terms of temporal variation, the CV values in the macroinvertebrate-based metrics were higher than those in the periphyton-based metrics, and these results may be linked to the influences of phenology and life history patterns of the macroinvertebrate individuals; and (3) the CV values for the functional-based metrics were higher than those for the structural-based metrics. Therefore, spatial and temporal variation in metrics should be considered when applying the biometrics in assessments.
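The coefficient of variation used to compare metric stability can be sketched as follows. The abstract does not say whether a sample or population standard deviation was used, so the population form below is an assumption, as are the function name and any example values.

```python
import statistics


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population SD assumed here).

    Only meaningful for ratio-scale metrics with a non-zero mean.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(values) / mean
```

Applied to one biometric, spatial CV would be computed across sites within a sampling occasion and temporal CV across occasions within a site; a lower CV indicates a more stable metric along that axis.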
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.
Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy
2016-01-01
Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
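Two of the stand-alone quality metrics named here, conductance and modularity, can be sketched for a small graph. The code uses the standard textbook definitions for an undirected, unweighted graph given as an edge list; the paper may aggregate conductance over clusters differently, so treat this as an assumption-laden illustration with hypothetical function names.

```python
def conductance(edges, cluster):
    """Cut edges divided by the smaller of the two sides' total degree."""
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in cluster, v in cluster
        if u_in != v_in:
            cut += 1
        # Each edge adds one to the degree of each endpoint.
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def modularity(edges, communities):
    """Newman modularity: sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    label = {node: i for i, com in enumerate(communities) for node in com}
    internal = [0] * len(communities)  # L_c: edges inside community c
    degree = [0] * len(communities)    # d_c: total degree of community c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree))
```

For two triangles joined by a single edge, splitting at the bridge gives a low conductance (few cut edges relative to volume) and a clearly positive modularity, which is the pattern both metrics reward.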
Quantification of Dynamic Model Validation Metrics Using Uncertainty Propagation from Requirements
NASA Technical Reports Server (NTRS)
Brown, Andrew M.; Peck, Jeffrey A.; Stewart, Eric C.
2018-01-01
The Space Launch System, NASA's new large launch vehicle for long range space exploration, is presently in the final design and construction phases, with the first launch scheduled for 2019. A dynamic model of the system has been created and is critical for calculation of interface loads and natural frequencies and mode shapes for guidance, navigation, and control (GNC). Because of the program and schedule constraints, a single modal test of the SLS will be performed while bolted down to the Mobile Launch Pad just before the first launch. A Monte Carlo and optimization scheme will be performed to create thousands of possible models based on given dispersions in model properties and to determine which model best fits the natural frequencies and mode shapes from modal test. However, the question still remains as to whether this model is acceptable for the loads and GNC requirements. An uncertainty propagation and quantification (UP and UQ) technique to develop a quantitative set of validation metrics that is based on the flight requirements has therefore been developed and is discussed in this paper. There has been considerable research on UQ and UP and validation in the literature, but very little on propagating the uncertainties from requirements, so most validation metrics are "rules-of-thumb;" this research seeks to come up with more reason-based metrics. One of the main assumptions used to achieve this task is that the uncertainty in the modeling of the fixed boundary condition is accurate, so therefore that same uncertainty can be used in propagating the fixed-test configuration to the free-free actual configuration. The second main technique applied here is the usage of the limit-state formulation to quantify the final probabilistic parameters and to compare them with the requirements. These techniques are explored with a simple lumped spring-mass system and a simplified SLS model. 
When completed, it is anticipated that this requirements-based validation metric will provide a quantified confidence and probability of success for the final SLS dynamics model, which will be critical for a successful launch program, and can be applied in the many other industries where an accurate dynamic model is required.
A guide to calculating habitat-quality metrics to inform conservation of highly mobile species
Bieri, Joanna A.; Sample, Christine; Thogmartin, Wayne E.; Diffendorfer, James E.; Earl, Julia E.; Erickson, Richard A.; Federico, Paula; Flockhart, D. T. Tyler; Nicol, Sam; Semmens, Darius J.; Skraber, T.; Wiederholt, Ruscena; Mattsson, Brady J.
2018-01-01
Many metrics exist for quantifying the relative value of habitats and pathways used by highly mobile species. Properly selecting and applying such metrics requires substantial background in mathematics and understanding the relevant management arena. To address this multidimensional challenge, we demonstrate and compare three measurements of habitat quality: graph-, occupancy-, and demographic-based metrics. Each metric provides insights into system dynamics, at the expense of increasing amounts and complexity of data and models. Our descriptions and comparisons of diverse habitat-quality metrics provide means for practitioners to overcome the modeling challenges associated with management or conservation of such highly mobile species. Whereas previous guidance for applying habitat-quality metrics has been scattered in diversified tracks of literature, we have brought this information together into an approachable format including accessible descriptions and a modeling case study for a typical example that conservation professionals can adapt for their own decision contexts and focal populations.
Considerations for Resource Managers
Management objectives, proposed actions, data availability and quality, and model assumptions are all relevant considerations when applying and interpreting habitat-quality metrics.
Graph-based metrics answer questions related to habitat centrality and connectivity, are suitable for populations with any movement pattern, quantify basic spatial and temporal patterns of occupancy and movement, and require the least data.
Occupancy-based metrics answer questions about likelihood of persistence or colonization, are suitable for populations that undergo localized extinctions, quantify spatial and temporal patterns of occupancy and movement, and require a moderate amount of data.
Demographic-based metrics answer questions about relative or absolute population size, are suitable for populations with any movement pattern, quantify demographic processes and population dynamics, and require the most data.
More real-world examples applying occupancy-based, agent-based, and continuous-based metrics to seasonally migratory species are needed to better understand challenges and opportunities for applying these metrics more broadly.
Implementing the Data Center Energy Productivity Metric in a High Performance Computing Data Center
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sego, Landon H.; Marquez, Andres; Rawson, Andrew
2013-06-30
As data centers proliferate in size and number, the improvement of their energy efficiency and productivity has become an economic and environmental imperative. Making these improvements requires metrics that are robust, interpretable, and practical. We discuss the properties of a number of the proposed metrics of energy efficiency and productivity. In particular, we focus on the Data Center Energy Productivity (DCeP) metric, which is the ratio of useful work produced by the data center to the energy consumed performing that work. We describe our approach for using DCeP as the principal outcome of a designed experiment using a highly instrumented, high-performance computing data center. We found that DCeP was successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve energy productivity. We also discuss some of the challenges and benefits associated with implementing the DCeP metric, and we examine the efficacy of the metric in making comparisons within a data center and between data centers.
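The DCeP definition in the abstract above is a plain ratio, which can be sketched as follows. This is only an illustration: the function name, the argument names, and the example numbers are hypothetical, and the abstract does not say how "useful work" was quantified in the experiment.

```python
def dcep(useful_work: float, energy_consumed: float) -> float:
    """Data Center Energy Productivity: useful work per unit of energy consumed.

    Both arguments are placeholders; the units of "useful work" are an
    assumption left to whoever defines the workload.
    """
    if energy_consumed <= 0:
        raise ValueError("energy consumed must be positive")
    return useful_work / energy_consumed


# Hypothetical example: the same workload completed in two operational states.
baseline = dcep(useful_work=1200.0, energy_consumed=400.0)
tuned = dcep(useful_work=1200.0, energy_consumed=300.0)
```

A higher value means more work was delivered for the same energy, so under these made-up numbers the tuned state would be preferred.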
A mechanical argument for the differential performance of coronary artery grafts.
Prim, David A; Zhou, Boran; Hartstone-Rose, Adam; Uline, Mark J; Shazly, Tarek; Eberth, John F
2016-02-01
Coronary artery bypass grafting (CABG) acutely disturbs the homeostatic state of the transplanted vessel making retention of graft patency dependent on chronic remodeling processes. The time course and extent to which remodeling restores vessel homeostasis will depend, in part, on the nature and magnitude of the mechanical disturbances induced upon transplantation. In this investigation, biaxial mechanical testing and histology were performed on the porcine left anterior descending artery (LAD) and analogs of common autografts, including the internal thoracic artery (ITA), radial artery (RA), great saphenous vein (GSV) and lateral saphenous vein (LSV). Experimental data were used to quantify the parameters of a structure-based constitutive model enabling prediction of the acute vessel mechanical response pre-transplantation and under coronary loading conditions. A novel metric Ξ was developed to quantify mechanical differences between each graft vessel in situ and the LAD in situ, while a second metric Ω compares the graft vessels in situ to their state under coronary loading. The relative values of these metrics among candidate autograft sources are consistent with vessel-specific variations in CABG clinical success rates with the ITA as the superior and GSV the inferior graft choices based on mechanical performance. This approach can be used to evaluate other candidate tissues for grafting or to aid in the development of synthetic and tissue engineered alternatives. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Cao, Jianwei; Khan, Bilal; Hervey, Nathan; Tian, Fenghua; Delgado, Mauricio R.; Clegg, Nancy J.; Smith, Linsley; Roberts, Heather; Tulchin-Francis, Kirsten; Shierk, Angela; Shagman, Laura; MacFarlane, Duncan; Liu, Hanli; Alexandrakis, George
2015-04-01
Sensorimotor cortex plasticity induced by constraint-induced movement therapy (CIMT) in six children (10.2±2.1 years old) with hemiplegic cerebral palsy was assessed by functional near-infrared spectroscopy (fNIRS). The activation laterality index and time-to-peak/duration during a finger-tapping task and the resting-state functional connectivity were quantified before, immediately after, and 6 months after CIMT. These fNIRS-based metrics were used to help explain changes in clinical scores of manual performance obtained concurrently with imaging time points. Five age-matched healthy children (9.8±1.3 years old) were also imaged to provide comparative activation metrics for normal controls. Interestingly, the activation time-to-peak/duration for all sensorimotor centers displayed significant normalization immediately after CIMT that persisted 6 months later. In contrast to this improved localized activation response, the laterality index and resting-state connectivity metrics that depended on communication between sensorimotor centers improved immediately after CIMT, but relapsed 6 months later. In addition, for the subjects measured in this work, there was either a trade-off between improving unimanual versus bimanual performance when sensorimotor activation patterns normalized after CIMT, or an improvement occurred in both unimanual and bimanual performance but at the cost of very abnormal plastic changes in sensorimotor activity.
Immersive training and mentoring for laparoscopic surgery
NASA Astrophysics Data System (ADS)
Nistor, Vasile; Allen, Brian; Dutson, E.; Faloutsos, P.; Carman, G. P.
2007-04-01
We describe in this paper a training system for minimally invasive surgery (MIS) that creates an immersive training simulation by recording the pathways of the instruments from an expert surgeon while performing an actual training task. Instrument spatial pathway data is stored and later accessed at the training station in order to visualize the ergonomic experience of the expert surgeon and trainees. Our system is based on tracking the spatial position and orientation of the instruments on the console for both the expert surgeon and the trainee. The technology is the result of recent developments in miniaturized position sensors that can be integrated seamlessly into the MIS instruments without compromising functionality. In order to continuously monitor the positions of laparoscopic tool tips, DC magnetic tracking sensors are used. A hardware-software interface transforms the coordinate data points into instrument pathways, while an intuitive graphic user interface displays the instruments' spatial position and orientation for the mentor/trainee, and endoscopic video information. These data are recorded and saved in a database for subsequent immersive training and training performance analysis. We use two 6 DOF DC magnetic trackers with a sensor diameter of just 1.3 mm - small enough for insertion into 4 French catheters, embedded in the shaft of an endoscopic grasper and a needle driver. One sensor is located at the distal end of the shaft while the second sensor is located at the proximal end of the shaft. The placement of these sensors does not impede the functionality of the instrument. Since the sensors are located inside the shaft there are no sealing issues between the valve of the trocar and the instrument. We devised a peg transfer training task in accordance with validated training procedures, and tested our system on its ability to differentiate between the expert surgeon and the novices, based on a set of performance metrics. These performance metrics (motion smoothness, total path length, and time to completion) are derived from the kinematics of the instrument. An affine combination of the above-mentioned metrics is provided to give a general score for the training performance. Clear differentiation between the expert surgeons and the novice trainees is visible in the test results. Strictly kinematics-based performance metrics can be used to evaluate the training progress of MIS trainees in the context of UCLA - LTS.
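Two of the ideas in the abstract above, total path length from tracked positions and an affine combination of metrics, can be sketched as below. This is an illustration under stated assumptions, not the authors' scoring method: the weights, the metric names, and the assumption that each metric has already been normalized to a common scale are all hypothetical.

```python
import math


def total_path_length(positions):
    """Sum of straight-line distances between consecutive tracked positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def affine_score(metrics, weights):
    """Affine combination: a weighted sum whose weights add up to 1.

    `metrics` and `weights` are dicts keyed by metric name (hypothetical names).
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights of an affine combination must sum to 1")
    return sum(weights[name] * metrics[name] for name in weights)


# Hypothetical tool-tip samples (x, y, z); the two segments are 5 and 12 long.
path = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
length = total_path_length(path)

# Hypothetical normalized metric values and weights.
score = affine_score(
    {"smoothness": 0.8, "path_length": 0.6, "time": 0.5},
    {"smoothness": 0.5, "path_length": 0.25, "time": 0.25},
)
```

Requiring the weights to sum to 1 keeps the combined score on the same scale as the individual metrics, which makes scores comparable across trainees.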
Lin, Meihua; Li, Haoli; Zhao, Xiaolei; Qin, Jiheng
2013-01-01
Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue to identify the missing genetic components that cannot be detected by using current single-point association analysis. Recently, several model-free methods (e.g. the commonly used information based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they are potentially at risk of inflated false positive error, in particular when the main effects at one or both loci are salient. In this study, we proposed two conditional entropy-based metrics to challenge this limitation. Extensive simulations demonstrated that the two proposed metrics, provided the disease is rare, could maintain consistently correct false positive rate. In the scenarios for a common disease, our proposed metrics achieved better or comparable control of false positive error, compared to four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results or the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising as alternatives to current model-based approaches for detecting genuine epistatic effects. PMID:24339984
NASA Astrophysics Data System (ADS)
Asadzadeh, M.; Maclean, A.; Tolson, B. A.; Burn, D. H.
2009-05-01
Hydrologic model calibration aims to find a set of parameters that adequately simulates observations of watershed behavior, such as streamflow, or a state variable, such as snow water equivalent (SWE). There are different metrics for evaluating calibration effectiveness that involve quantifying prediction errors, such as the Nash-Sutcliffe (NS) coefficient and bias evaluated for the entire calibration period, on a seasonal basis, for low flows, or for high flows. Many of these metrics are conflicting such that the set of parameters that maximizes the high flow NS differs from the set of parameters that maximizes the low flow NS. Conflicting objectives are very likely when different calibration objectives are based on different fluxes and/or state variables (e.g., NS based on streamflow versus SWE). One of the most popular ways to balance different metrics is to aggregate them based on their importance and find the set of parameters that optimizes a weighted sum of the efficiency metrics. Comparing alternative hydrologic models (e.g., assessing model improvement when a process or more detail is added to the model) based on the aggregated objective might be misleading since it represents one point on the tradeoff of desired error metrics. To derive a more comprehensive model comparison, we solved a bi-objective calibration problem to estimate the tradeoff between two error metrics for each model. Although this approach is computationally more expensive than the aggregation approach, it results in a better understanding of the effectiveness of selected models at each level of every error metric and therefore provides a better rationale for judging relative model quality. The two alternative models used in this study are two MESH hydrologic models (version 1.2) of the Wolf Creek Research basin that differ in their watershed spatial discretization (a single Grouped Response Unit, GRU, versus multiple GRUs). 
The MESH model, currently under development by Environment Canada, is a coupled land-surface and hydrologic model. Results will demonstrate the conclusions a modeller might make regarding the value of additional watershed spatial discretization under both an aggregated (single-objective) and multi-objective model comparison framework.
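The Nash-Sutcliffe coefficient and the weighted-sum aggregation mentioned in the abstract above can be sketched as follows. The formula is the standard Nash-Sutcliffe efficiency; the function names, the weights, and the toy series are hypothetical and are not taken from the MESH study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared error
    to the variance of the observations around their mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / spread


def weighted_objective(scores, weights):
    """Aggregate several efficiency metrics into one value by a weighted sum."""
    return sum(w * s for w, s in zip(weights, scores))


# Toy observations and two toy simulations (hypothetical values).
obs = [1.0, 2.0, 3.0, 4.0]
ns_perfect = nash_sutcliffe(obs, obs)          # simulation matches exactly
ns_mean = nash_sutcliffe(obs, [2.5] * 4)       # simulation is just the mean
aggregate = weighted_objective([ns_perfect, ns_mean], [0.5, 0.5])
```

A single aggregated value such as `aggregate` corresponds to one point on the tradeoff between the two metrics, which is the limitation the abstract raises in favor of a bi-objective comparison.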
Visual tuning and metrical perception of realistic point-light dance movements.
Su, Yi-Huang
2016-03-07
Humans move to music spontaneously, and this sensorimotor coupling underlies musical rhythm perception. The present research proposed that, based on common action representation, different metrical levels as in auditory rhythms could emerge visually when observing structured dance movements. Participants watched a point-light figure performing basic steps of Swing dance cyclically in different tempi, whereby the trunk bounced vertically at every beat and the limbs moved laterally at every second beat, yielding two possible metrical periodicities. In Experiment 1, participants freely identified a tempo of the movement and tapped along. While some observers only tuned to the bounce and some only to the limbs, the majority tuned to one level or the other depending on the movement tempo, which was also associated with individuals' preferred tempo. In Experiment 2, participants reproduced the tempo of leg movements by four regular taps, and showed a slower perceived leg tempo with than without the trunk bouncing simultaneously in the stimuli. This mirrors previous findings of an auditory 'subdivision effect', suggesting the leg movements were perceived as beat while the bounce as subdivisions. Together these results support visual metrical perception of dance movements, which may employ similar action-based mechanisms to those underpinning auditory rhythm perception.
Cho, Woon; Jang, Jinbeum; Koschan, Andreas; Abidi, Mongi A; Paik, Joonki
2016-11-28
A fundamental limitation of hyperspectral imaging is the inter-band misalignment correlated with subject motion during data acquisition. One way of resolving this problem is to assess the alignment quality of hyperspectral image cubes derived from the state-of-the-art alignment methods. In this paper, we present an automatic selection framework for the optimal alignment method to improve the performance of face recognition. Specifically, we develop two qualitative prediction models based on: 1) a principal curvature map for evaluating the similarity index between sequential target bands and a reference band in the hyperspectral image cube as a full-reference metric; and 2) the cumulative probability of target colors in the HSV color space for evaluating the alignment index of a single sRGB image rendered using all of the bands of the hyperspectral image cube as a no-reference metric. We verify the efficacy of the proposed metrics on a new large-scale database, demonstrating a higher prediction accuracy in determining improved alignment compared to two full-reference and five no-reference image quality metrics. We also validate the ability of the proposed framework to improve hyperspectral face recognition.
Tracking and Data Relay Satellite System (TDRSS) navigation with DSN radio metric data
NASA Technical Reports Server (NTRS)
Ellis, J.
1981-01-01
The use of DSN radiometric data for enhancing the orbit determination capability for TDRS is examined. Results of a formal covariance analysis are presented which establish the nominal TDRS navigation performance and assess the performance improvement based on augmenting the nominal TDRS data strategy with radiometric data from DSN sites.
High performance batteries with carbon nanomaterials and ionic liquids
Lu, Wen [Littleton, CO
2012-08-07
The present invention is directed to lithium-ion batteries in general and more particularly to lithium-ion batteries based on aligned graphene ribbon anodes, V.sub.2O.sub.5 graphene ribbon composite cathodes, and ionic liquid electrolytes. The lithium-ion batteries have excellent performance metrics of cell voltages, energy densities, and power densities.
Too Little and Too Much Trust: Performance Measurement in Australian Higher Education
ERIC Educational Resources Information Center
Woelert, Peter; Yates, Lyn
2015-01-01
A striking feature of contemporary Australian higher education governance is the strong emphasis on centralized, template style, metric-based, and consequential forms of performance measurement. Such emphasis is indicative of a low degree of political trust among the central authorities in Australia in the intrinsic capacity of universities and…
Miller, Anna N; Kozar, Rosemary; Wolinsky, Philip
2017-06-01
Reproducible metrics are needed to evaluate the delivery of orthopaedic trauma care, national care, norms, and outliers. The American College of Surgeons (ACS) is uniquely positioned to collect and evaluate the data needed to evaluate orthopaedic trauma care via the Committee on Trauma and the Trauma Quality Improvement Project. We evaluated the first quality metrics the ACS has collected for orthopaedic trauma surgery to determine whether these metrics can be appropriately collected with accuracy and completeness. The metrics include the time to administration of the first dose of antibiotics for open fractures, the time to surgical irrigation and débridement of open tibial fractures, and the percentage of patients who undergo stabilization of femoral fractures at trauma centers nationwide. These metrics were analyzed to evaluate for variances in the delivery of orthopaedic care across the country. The data showed wide variances for all metrics, and many centers had incomplete ability to collect the orthopaedic trauma care metrics. There was a large variability in the results of the metrics collected among different trauma center levels, as well as among centers of a particular level. The ACS has successfully begun tracking orthopaedic trauma care performance measures, which will help inform reevaluation of the goals and continued work on data collection and improvement of patient care. Future areas of research may link these performance measures with patient outcomes, such as long-term tracking, to assess nonunion and function. This information can provide insight into center performance and its effect on patient outcomes. The ACS was able to successfully collect and evaluate the data for three metrics used to assess the quality of orthopaedic trauma care. However, additional research is needed to determine whether these metrics are suitable for evaluating orthopaedic trauma care and cutoff values for each metric.
Curriculum-Based Measures in Writing: A School-Based Evaluation of Predictive Validity
ERIC Educational Resources Information Center
Terenzi, Christina M.
2009-01-01
Recent research in the area of Curriculum-Based Measures (CBM) in writing has shown that traditionally used metrics, such as total words written and total words correct, may not be the best tools for measuring writing performance, for both secondary and elementary aged children (e.g., Gansle, Noell, VanDerHeyden, Naquin, & Slider, 2002; Tindal…
Interaction Metrics for Feedback Control of Sound Radiation from Stiffened Panels
NASA Technical Reports Server (NTRS)
Cabell, Randolph H.; Cox, David E.; Gibbs, Gary P.
2003-01-01
Interaction metrics developed for the process control industry are used to evaluate decentralized control of sound radiation from bays on an aircraft fuselage. The metrics are applied to experimentally measured frequency response data from a model of an aircraft fuselage. The purpose is to understand how coupling between multiple bays of the fuselage can destabilize or limit the performance of a decentralized active noise control system. The metrics quantitatively verify observations from a previous experiment, in which decentralized controllers performed worse than centralized controllers. The metrics do not appear to be useful for explaining control spillover which was observed in a previous experiment.
Detection of periodicity based on independence tests - III. Phase distance correlation periodogram
NASA Astrophysics Data System (ADS)
Zucker, Shay
2018-02-01
I present the Phase Distance Correlation (PDC) periodogram - a new periodicity metric, based on the Distance Correlation concept of Gábor Székely. For each trial period, PDC calculates the distance correlation between the data samples and their phases. PDC requires adaptation of Székely's distance correlation to circular variables (phases). The resulting periodicity metric is best suited to sparse data sets, and it performs better than other methods for sawtooth-like periodicities. These include Cepheid and RR-Lyrae light curves, as well as radial velocity curves of eccentric spectroscopic binaries. The performance of the PDC periodogram in other contexts is almost as good as that of the Generalized Lomb-Scargle periodogram. The concept of phase distance correlation can be adapted also to astrometric data, and it has the potential to be suitable also for large evenly spaced data sets, after some algorithmic perfection.
On Adding Structure to Unstructured Overlay Networks
NASA Astrophysics Data System (ADS)
Leitão, João; Carvalho, Nuno A.; Pereira, José; Oliveira, Rui; Rodrigues, Luís
Unstructured peer-to-peer overlay networks are very resilient to churn and topology changes, while requiring little maintenance cost. Therefore, they are an infrastructure to build highly scalable large-scale services in dynamic networks. Typically, the overlay topology is defined by a peer sampling service that aims at maintaining, in each process, a random partial view of peers in the system. The resulting random unstructured topology is suboptimal when a specific performance metric is considered. On the other hand, structured approaches (for instance, a spanning tree) may optimize a given target performance metric but are highly fragile. In fact, the cost for maintaining structures with strong constraints may easily become prohibitive in highly dynamic networks. This chapter discusses different techniques that aim at combining the advantages of unstructured and structured networks. Namely we focus on two distinct approaches, one based on optimizing the overlay and another based on optimizing the gossip mechanism itself.
Accuracy optimization with wavelength tunability in overlay imaging technology
NASA Astrophysics Data System (ADS)
Lee, Honggoo; Kang, Yoonshik; Han, Sangjoon; Shim, Kyuchan; Hong, Minhyung; Kim, Seungyoung; Lee, Jieun; Lee, Dongyoung; Oh, Eungryong; Choi, Ahlin; Kim, Youngsik; Marciano, Tal; Klein, Dana; Hajaj, Eitan M.; Aharon, Sharon; Ben-Dov, Guy; Lilach, Saltoun; Serero, Dan; Golotsvan, Anna
2018-03-01
As semiconductor manufacturing technology progresses and the dimensions of integrated circuit elements shrink, overlay budget is accordingly being reduced. Overlay budget closely approaches the scale of measurement inaccuracies due to both optical imperfections of the measurement system and the interaction of light with geometrical asymmetries of the measured targets. Measurement inaccuracies can no longer be ignored due to their significant effect on the resulting device yield. In this paper we investigate a new approach for imaging based overlay (IBO) measurements by optimizing accuracy rather than contrast precision, including its effect over the total target performance, using wavelength tunable overlay imaging metrology. We present new accuracy metrics based on theoretical development and present their quality in identifying the measurement accuracy when compared to CD-SEM overlay measurements. The paper presents the theoretical considerations and simulation work, as well as measurement data, for which tunability combined with the new accuracy metrics is shown to improve accuracy performance.
NASA Astrophysics Data System (ADS)
Che, Chang; Yu, Xiaoyang; Sun, Xiaoming; Yu, Boyang
2017-12-01
In recent years, Scalable Vocabulary Tree (SVT) has been shown to be effective in image retrieval. However, for general images where the foreground is the object to be recognized while the background is cluttered, the performance of the current SVT framework is restricted. In this paper, a new image retrieval framework that incorporates a robust distance metric and information fusion is proposed, which improves the retrieval performance relative to the baseline SVT approach. First, the visual words that represent the background are diminished by using a robust Hausdorff distance between different images. Second, image matching results based on three image signature representations are fused, which enhances the retrieval precision. We conducted intensive experiments on small-scale to large-scale image datasets: Corel-9, Corel-48, and PKU-198, where the proposed Hausdorff metric and information fusion outperforms the state-of-the-art methods by about 13, 15, and 15%, respectively.
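For readers unfamiliar with the Hausdorff distance named in the abstract above, a minimal sketch of the classical (non-robust) form between two point sets follows. The abstract does not specify which robust variant the authors use, so this is only the textbook definition, and the point sets are hypothetical.

```python
import math


def directed_hausdorff(set_a, set_b):
    """Largest distance from a point in set_a to its nearest point in set_b."""
    return max(min(math.dist(a, b) for b in set_b) for a in set_a)


def hausdorff(set_a, set_b):
    """Classical symmetric Hausdorff distance: the larger of the two
    directed distances."""
    return max(directed_hausdorff(set_a, set_b), directed_hausdorff(set_b, set_a))


# Hypothetical 2-D feature locations from two images.
points_a = [(0.0, 0.0), (1.0, 0.0)]
points_b = [(0.0, 0.0), (4.0, 0.0)]
d_ab = directed_hausdorff(points_a, points_b)
d_ba = directed_hausdorff(points_b, points_a)
d_sym = hausdorff(points_a, points_b)
```

Because the classical form takes a maximum, a single outlying point dominates the result; robust variants typically replace that maximum with a rank or average statistic, which is presumably why a robust form suits cluttered backgrounds.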
Structural texture similarity metrics for image analysis and retrieval.
Zujovic, Jana; Pappas, Thrasyvoulos N; Neuhoff, David L
2013-07-01
We develop new metrics for texture similarity that account for human visual perception and the stochastic nature of textures. The metrics rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are essentially identical. The proposed metrics extend the ideas of structural similarity and are guided by research in texture analysis-synthesis. They are implemented using a steerable filter decomposition and incorporate a concise set of subband statistics, computed globally or in sliding windows. We conduct systematic tests to investigate metric performance in the context of "known-item search," the retrieval of textures that are "identical" to the query texture. This eliminates the need for cumbersome subjective tests, thus enabling comparisons with human performance on a large database. Our experimental results indicate that the proposed metrics outperform peak signal-to-noise ratio (PSNR), structural similarity metric (SSIM) and its variations, as well as state-of-the-art texture classification metrics, using standard statistical measures.
Performance Evaluation in Network-Based Parallel Computing
NASA Technical Reports Server (NTRS)
Dezhgosha, Kamyar
1996-01-01
Network-based parallel computing is emerging as a cost-effective alternative for solving many problems which require use of supercomputers or massively parallel computers. The primary objective of this project has been to conduct experimental research on performance evaluation for clustered parallel computing. First, a testbed was established by augmenting our existing SUNSPARCs' network with PVM (Parallel Virtual Machine) which is a software system for linking clusters of machines. Second, a set of three basic applications were selected. The applications consist of a parallel search, a parallel sort, a parallel matrix multiplication. These application programs were implemented in C programming language under PVM. Third, we conducted performance evaluation under various configurations and problem sizes. Alternative parallel computing models and workload allocations for application programs were explored. The performance metric was limited to elapsed time or response time which in the context of parallel computing can be expressed in terms of speedup. The results reveal that the overhead of communication latency between processes in many cases is the restricting factor to performance. That is, coarse-grain parallelism which requires less frequent communication between processes will result in higher performance in network-based computing. Finally, we are in the final stages of installing an Asynchronous Transfer Mode (ATM) switch and four ATM interfaces (each 155 Mbps) which will allow us to extend our study to newer applications, performance metrics, and configurations.
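The abstract above expresses elapsed time in terms of speedup. A minimal sketch of that relation, together with the related parallel efficiency, is given below; the function names and the timings are hypothetical and are not results from the PVM testbed.

```python
def speedup(serial_time: float, parallel_time: float) -> float:
    """Speedup: elapsed time on one processor divided by elapsed time in parallel."""
    if parallel_time <= 0:
        raise ValueError("parallel time must be positive")
    return serial_time / parallel_time


def efficiency(serial_time: float, parallel_time: float, processors: int) -> float:
    """Parallel efficiency: speedup per processor (1.0 would be ideal scaling)."""
    return speedup(serial_time, parallel_time) / processors


# Hypothetical timings in seconds for a job run on 8 workstations.
s = speedup(serial_time=100.0, parallel_time=25.0)
e = efficiency(serial_time=100.0, parallel_time=25.0, processors=8)
```

An efficiency well below 1.0, as in this made-up case, is the kind of gap that communication latency between processes would produce.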
McAdams, Harley; AlQuraishi, Mohammed
2015-04-21
Techniques for determining values for a metric of microscale interactions include determining a mesoscale metric for a plurality of mesoscale interaction types, wherein a value of the mesoscale metric for each mesoscale interaction type is based on a corresponding function of values of the microscale metric for the plurality of the microscale interaction types. A plurality of observations that indicate the values of the mesoscale metric are determined for the plurality of mesoscale interaction types. Values of the microscale metric are determined for the plurality of microscale interaction types based on the plurality of observations and the corresponding functions and compressed sensing.
An MILP-Based Cross-Layer Optimization for a Multi-Reader Arbitration in the UHF RFID System
Choi, Jinchul; Lee, Chaewoo
2011-01-01
In RFID systems, the performance of each reader such as interrogation range and tag recognition rate may suffer from interferences from other readers. Since the reader interference can be mitigated by output signal power control, spectral and/or temporal separation among readers, the system performance depends on how to adapt the various reader arbitration metrics such as time, frequency, and output power to the system environment. However, complexity and difficulty of the optimization problem increase with respect to the variety of the arbitration metrics. Thus, most proposals in previous study have been suggested to primarily prevent the reader collision with consideration of one or two arbitration metrics. In this paper, we propose a novel cross-layer optimization design based on the concept of combining time division, frequency division, and power control not only to solve the reader interference problem, but also to achieve the multiple objectives such as minimum interrogation delay, maximum reader utilization, and energy efficiency. Based on the priority of the multiple objectives, our cross-layer design optimizes the system sequentially by means of the mixed-integer linear programming. In spite of the multi-stage optimization, the optimization design is formulated as a concise single mathematical form by properly assigning a weight to each objective. Numerical results demonstrate the effectiveness of the proposed optimization design. PMID:22163743
Favazza, Christopher P.; Fetterly, Kenneth A.; Hangiandreou, Nicholas J.; Leng, Shuai; Schueler, Beth A.
2015-01-01
Abstract. Evaluation of flat-panel angiography equipment through conventional image quality metrics is limited by the scope of standard spatial-domain image quality metric(s), such as contrast-to-noise ratio and spatial resolution, or by restricted access to appropriate data to calculate Fourier domain measurements, such as modulation transfer function, noise power spectrum, and detective quantum efficiency. Observer models have been shown capable of overcoming these limitations and are able to comprehensively evaluate medical-imaging systems. We present a spatial domain-based channelized Hotelling observer model to calculate the detectability index (DI) of our different sized disks and compare the performance of different imaging conditions and angiography systems. When appropriate, changes in DIs were compared to expectations based on the classical Rose model of signal detection to assess linearity of the model with quantum signal-to-noise ratio (SNR) theory. For these experiments, the estimated uncertainty of the DIs was less than 3%, allowing for precise comparison of imaging systems or conditions. For most experimental variables, DI changes were linear with expectations based on quantum SNR theory. DIs calculated for the smallest objects demonstrated nonlinearity with quantum SNR theory due to system blur. Two angiography systems with different detector element sizes were shown to perform similarly across the majority of the detection tasks. PMID:26158086
Polak, Rainer; London, Justin; Jacoby, Nori
2016-01-01
Most approaches to musical rhythm, whether in music theory, music psychology, or musical neuroscience, presume that musical rhythms are based on isochronous (temporally equidistant) beats and/or beat subdivisions. However, rhythms that are based on non-isochronous, or unequal patterns of time are prominent in the music of Southeast Europe, the Near East and Southern Asia, and in the music of Africa and the African diaspora. The present study examines one such style found in contemporary Malian jembe percussion music. A corpus of 15 representative performances of three different pieces (“Manjanin,” “Maraka,” and “Woloso”) containing ~43,000 data points was analyzed. Manjanin and Woloso are characterized by non-isochronous beat subdivisions (a short IOI followed by two longer IOIs), while Maraka subdivisions are quasi-isochronous. Analyses of onsets and asynchronies show no significant differences in timing precision and coordination between the isochronously timed Maraka vs. the non-isochronously timed Woloso performances, though both pieces were slightly less variable than non-isochronous Manjanin. Thus, the precision and stability of rhythm and entrainment in human music does not necessarily depend on metric isochrony, consistent with the hypothesis that isochrony is not a biologically-based constraint on human rhythmic behavior. Rather, it may represent a historically popular option within a variety of culturally contingent options for metric organization. PMID:27445659
Feng, Zhaozhong; Calatayud, Vicent; Zhu, Jianguo; Kobayashi, Kazuhiko
2018-04-01
Five winter wheat cultivars were exposed to ambient (A-O3) and elevated (E-O3, 1.5 × ambient) O3 in a fully open-air fumigation system in China. Ozone exposure- and flux-based response relationships were established for seven physiological variables related to photosynthesis. The performance of the fitting of the regressions in terms of R^2 increased when second order regressions instead of first order ones were used, suggesting that effects of O3 were more pronounced towards the last developmental stages of the wheat. The more robust indicators were those related with CO2 assimilation, Rubisco activity and RuBP regeneration capacity (A_sat, J_max and Vc_max), and chlorophyll content (Chl). Flux-based metrics (POD_y, Phytotoxic O3 Dose over a threshold y nmol O3 m^-2 s^-1) predicted slightly better the responses to O3 than exposure metrics (AOTX, Accumulated O3 exposure over an hourly Threshold of X ppb) for most of the variables. The best performance was observed for metrics POD_1 (A_sat, J_max and Vc_max) and POD_3 (Chl). For this crop, the proposed response functions could be used for O3 risk assessment based on physiological effects and also to include the influence of O3 on yield or other variables in models with a photosynthetic component. Copyright © 2017 Elsevier B.V. All rights reserved.
Gahm, Jin Kyu; Shi, Yonggang
2018-05-01
Surface mapping methods play an important role in various brain imaging studies from tracking the maturation of adolescent brains to mapping gray matter atrophy patterns in Alzheimer's disease. Popular surface mapping approaches based on spherical registration, however, have inherent numerical limitations when severe metric distortions are present during the spherical parameterization step. In this paper, we propose a novel computational framework for intrinsic surface mapping in the Laplace-Beltrami (LB) embedding space based on Riemannian metric optimization on surfaces (RMOS). Given a diffeomorphism between two surfaces, an isometry can be defined using the pullback metric, which in turn results in identical LB embeddings from the two surfaces. The proposed RMOS approach builds upon this mathematical foundation and achieves general feature-driven surface mapping in the LB embedding space by iteratively optimizing the Riemannian metric defined on the edges of triangular meshes. At the core of our framework is an optimization engine that converts an energy function for surface mapping into a distance measure in the LB embedding space, which can be effectively optimized using gradients of the LB eigen-system with respect to the Riemannian metrics. In the experimental results, we compare the RMOS algorithm with spherical registration using large-scale brain imaging data, and show that RMOS achieves superior performance in the prediction of hippocampal subfields and cortical gyral labels, and the holistic mapping of striatal surfaces for the construction of a striatal connectivity atlas from substantia nigra. Copyright © 2018 Elsevier B.V. All rights reserved.
Classification of forest land attributes using multi-source remotely sensed data
NASA Astrophysics Data System (ADS)
Pippuri, Inka; Suvanto, Aki; Maltamo, Matti; Korhonen, Kari T.; Pitkänen, Juho; Packalen, Petteri
2016-02-01
The aim of the study was to (1) examine the classification of forest land using airborne laser scanning (ALS) data, satellite images and sample plots of the Finnish National Forest Inventory (NFI) as training data and to (2) identify best performing metrics for classifying forest land attributes. Six different schemes of forest land classification were studied: land use/land cover (LU/LC) classification using both national classes and FAO (Food and Agricultural Organization of the United Nations) classes, main type, site type, peat land type and drainage status. Special interest was to test different ALS-based surface metrics in classification of forest land attributes. Field data consisted of 828 NFI plots collected in 2008-2012 in southern Finland and remotely sensed data was from summer 2010. Multinomial logistic regression was used as the classification method. Classification of LU/LC classes was highly accurate (kappa-values 0.90 and 0.91) but also the classification of site type, peat land type and drainage status succeeded moderately well (kappa-values 0.51, 0.69 and 0.52). ALS-based surface metrics were found to be the most important predictor variables in classification of LU/LC class, main type and drainage status. In best classification models of forest site types both spectral metrics from satellite data and point cloud metrics from ALS were used. In turn, in the classification of peat land types ALS point cloud metrics played the most important role. Results indicated that the prediction of site type and forest land category could be incorporated into stand level forest management inventory system in Finland.
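The kappa values quoted above measure agreement beyond what chance alone would give. As a minimal sketch of how such a value is obtained from a confusion matrix, the following computes Cohen's kappa; the function name and the 2x2 toy matrix are illustrative assumptions and are not taken from the study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_tot[k] * col_tot[k] for k in range(n)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical toy matrix (not data from the study).
toy = [[45, 5],
       [5, 45]]
print(round(cohens_kappa(toy), 2))  # 0.8
```

Here 90% of the samples agree while 50% agreement would be expected by chance, which gives kappa = (0.9 - 0.5) / (1 - 0.5) = 0.8.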
Assessing the quality of restored images in optical long-baseline interferometry
NASA Astrophysics Data System (ADS)
Gomes, Nuno; Garcia, Paulo J. V.; Thiébaut, Éric
2017-03-01
Assessing the quality of aperture synthesis maps is relevant for benchmarking image reconstruction algorithms, for the scientific exploitation of data from optical long-baseline interferometers, and for the design/upgrade of new/existing interferometric imaging facilities. Although metrics have been proposed in these contexts, no systematic study has been conducted on the selection of a robust metric for quality assessment. This article addresses the question: what is the best metric to assess the quality of a reconstructed image? It starts by considering several metrics and selecting a few based on general properties. Then, a variety of image reconstruction cases are considered. The observational scenarios are phase closure and phase referencing at the Very Large Telescope Interferometer (VLTI), for a combination of two, three, four and six telescopes. End-to-end image reconstruction is accomplished with the MIRA software, and several merit functions are put to test. It is found that convolution by an effective point spread function is required for proper image quality assessment. The effective angular resolution of the images is superior to naive expectation based on the maximum frequency sampled by the array. This is due to the prior information used in the aperture synthesis algorithm and to the nature of the objects considered. The ℓ1-norm is the most robust of all considered metrics, because being linear it is less sensitive to image smoothing by high regularization levels. For the cases considered, this metric allows the implementation of automatic quality assessment of reconstructed images, with a performance similar to human selection.
Gouda, Hebe N; Critchley, Julia; Powles, John; Capewell, Simon
2012-01-28
Reasons for the widespread declines in coronary heart disease (CHD) mortality in high income countries are controversial. Here we explore how the type of metric chosen for the analyses of these declines affects the answer obtained. The analyses we reviewed were performed using IMPACT, a large Excel based model of the determinants of temporal change in mortality from CHD. Assessments of the decline in CHD mortality in the USA between 1980 and 2000 served as the central case study. Analyses based on the metric of number of deaths prevented attributed about half the decline to treatments (including preventive medications) and half to favourable shifts in risk factors. However, when mortality change was expressed in the metric of life-years-gained, the share attributed to risk factor change rose to 65%. This happened because risk factor changes were modelled as slowing disease progression, such that the hypothetical deaths averted resulted in longer average remaining lifetimes gained than the deaths averted by better treatments. This result was robust to a range of plausible assumptions on the relative effect sizes of changes in treatments and risk factors. Time-based metrics (such as life years) are generally preferable because they direct attention to the changes in the natural history of disease that are produced by changes in key health determinants. The life-years attached to each death averted will also weight deaths in a way that better reflects social preferences.
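The shift in attribution described above is a matter of weighting each averted death by the life-years it gains. The sketch below illustrates only that arithmetic; every number in it is a hypothetical placeholder, not an IMPACT input or result, and the helper name `shares` is an assumption.

```python
def shares(values):
    """Return each entry's fraction of the total."""
    total = sum(values.values())
    return {name: v / total for name, v in values.items()}


# Hypothetical inputs, chosen only to show the mechanism.
deaths_averted = {"treatments": 50, "risk_factors": 50}
years_per_death = {"treatments": 10.0, "risk_factors": 20.0}

# Weight each averted death by the remaining lifetime it gains.
life_years = {k: deaths_averted[k] * years_per_death[k] for k in deaths_averted}

print(shares(deaths_averted)["risk_factors"])        # 0.5
print(round(shares(life_years)["risk_factors"], 3))  # 0.667
```

With equal numbers of deaths averted, the category whose averted deaths carry longer remaining lifetimes takes the larger share once the metric is life-years gained.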
Teixeira, Andreia Sofia; Monteiro, Pedro T; Carriço, João A; Ramirez, Mário; Francisco, Alexandre P
2015-01-01
Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. But, for the research community, it may be unclear that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs where a given edge is present. The metric provides a per edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well known Kirchhoff's matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric concerning both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing data (MLST) and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selection of the edge to be represented using bootstrap could lead to unreliable results since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm that must be based on biologically plausible models.
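To make the definition of spanning edge betweenness concrete, here is a simplified sketch that assumes all edge weights are equal, so that every spanning tree is an MST. Under that assumption Kirchhoff's matrix tree theorem counts the trees directly, and the fraction containing an edge follows from the count with that edge removed. This is not the authors' algorithm for general weighted graphs, and the function names are illustrative.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Matrix tree theorem: the count equals any cofactor of the Laplacian."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Drop the last row and column, then take an exact determinant.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return int(det)


def spanning_edge_fraction(n, edges, edge):
    """Fraction of spanning trees of a connected graph that contain `edge`."""
    total = count_spanning_trees(n, edges)
    remaining = list(edges)
    remaining.remove(edge)
    without = count_spanning_trees(n, remaining)
    return Fraction(total - without, total)


# Toy graph: a 4-cycle with one chord (equal weights assumed).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(count_spanning_trees(4, edges))            # 8
print(spanning_edge_fraction(4, edges, (0, 2)))  # 1/2
print(spanning_edge_fraction(4, edges, (0, 1)))  # 5/8
```

As a sanity check, the fractions over all five edges sum to 3, the number of edges in any spanning tree of a four-node graph.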
Assessing deep and shallow learning methods for quantitative prediction of acute chemical toxicity.
Liu, Ruifeng; Madore, Michael; Glover, Kyle P; Feasel, Michael G; Wallqvist, Anders
2018-05-02
Animal-based methods for assessing chemical toxicity are struggling to meet testing demands. In silico approaches, including machine-learning methods, are promising alternatives. Recently, deep neural networks (DNNs) were evaluated and reported to outperform other machine-learning methods for quantitative structure-activity relationship modeling of molecular properties. However, most of the reported performance evaluations relied on global performance metrics, such as the root mean squared error (RMSE) between the predicted and experimental values of all samples, without considering the impact of sample distribution across the activity spectrum. Here, we carried out an in-depth analysis of DNN performance for quantitative prediction of acute chemical toxicity using several datasets. We found that the overall performance of DNN models on datasets of up to 30,000 compounds was similar to that of random forest (RF) models, as measured by the RMSE and correlation coefficients between the predicted and experimental results. However, our detailed analyses demonstrated that global performance metrics are inappropriate for datasets with a highly uneven sample distribution, because they show a strong bias for the most populous compounds along the toxicity spectrum. For highly toxic compounds, DNN and RF models trained on all samples performed much worse than the global performance metrics indicated. Surprisingly, our variable nearest neighbor method, which utilizes only structurally similar compounds to make predictions, performed reasonably well, suggesting that information of close near neighbors in the training sets is a key determinant of acute toxicity predictions.
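The point about global metrics hiding errors in sparsely populated ranges can be illustrated with a small sketch that reports RMSE both globally and per bin of the experimental value. The data and bin edges below are hypothetical and the function names are assumptions; this is not the authors' analysis code.

```python
import math


def rmse(pairs):
    """Root mean squared error over (predicted, observed) pairs."""
    return math.sqrt(sum((p - o) ** 2 for p, o in pairs) / len(pairs))


def rmse_by_bin(pairs, edges):
    """RMSE per bin of the observed value; bins are [edges[i], edges[i+1])."""
    result = {}
    for lo, hi in zip(edges, edges[1:]):
        members = [(p, o) for p, o in pairs if lo <= o < hi]
        if members:
            result[(lo, hi)] = rmse(members)
    return result


# Hypothetical data: many well-predicted samples in the populous range,
# and a few poorly predicted samples in the sparse range.
pairs = [(1.1, 1.0)] * 8 + [(2.0, 3.0)] * 2

print(round(rmse(pairs), 3))
for bounds, value in rmse_by_bin(pairs, [0, 2, 4]).items():
    print(bounds, round(value, 3))
```

The global figure is dominated by the eight well-predicted samples, while the per-bin view shows that the error in the sparse bin is ten times larger than in the populous one.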
Lee, Kam L; Bernardo, Michael; Ireland, Timothy A
2016-06-01
This is part two of a two-part study in benchmarking system performance of fixed digital radiographic systems. The study compares the system performance of seven fixed digital radiography systems based on quantitative metrics like modulation transfer function (sMTF), normalised noise power spectrum (sNNPS), detective quantum efficiency (sDQE) and entrance surface air kerma (ESAK). It was found that the most efficient image receptors (greatest sDQE) were not necessarily operating at the lowest ESAK. In part one of this study, sMTF is shown to depend on system configuration while sNNPS is shown to be relatively consistent across systems. Systems are ranked on their signal-to-noise ratio efficiency (sDQE) and their ESAK. Systems using the same equipment configuration do not necessarily have the same system performance. This implies radiographic practice at the site will have an impact on the overall system performance. In general, systems are more dose efficient at low dose settings.
The Albuquerque Seismological Laboratory Data Quality Analyzer
NASA Astrophysics Data System (ADS)
Ringler, A. T.; Hagerty, M.; Holland, J.; Gee, L. S.; Wilson, D.
2013-12-01
The U.S. Geological Survey's Albuquerque Seismological Laboratory (ASL) has several efforts underway to improve data quality at its stations. The Data Quality Analyzer (DQA) is one such development. The DQA is designed to characterize station data quality in a quantitative and automated manner. Station quality is based on the evaluation of various metrics, such as timing quality, noise levels, sensor coherence, and so on. These metrics are aggregated into a measurable grade for each station. The DQA consists of a website, a metric calculator (Seedscan), and a PostgreSQL database. The website allows the user to make requests for various time periods, review specific networks and stations, adjust weighting of the station's grade, and plot metrics as a function of time. The website dynamically loads all station data from a PostgreSQL database. The database is central to the application; it acts as a hub where metric values and limited station descriptions are stored. Data is stored at the level of one sensor's channel per day. The database is populated by Seedscan. Seedscan reads and processes miniSEED data, to generate metric values. Seedscan, written in Java, compares hashes of metadata and data to detect changes and perform subsequent recalculations. This ensures that the metric values are up to date and accurate. Seedscan can be run in a scheduled task or on demand by way of a config file. It will compute metrics specified in its configuration file. While many metrics are currently in development, some are completed and being actively used. These include: availability, timing quality, gap count, deviation from the New Low Noise Model, deviation from a station's noise baseline, inter-sensor coherence, and data-synthetic fits. In all, 20 metrics are planned, but any number could be added. ASL is actively using the DQA on a daily basis for station diagnostics and evaluation. 
As Seedscan is scheduled to run every night, data quality analysts are able to then use the website to diagnose changes in noise levels or other anomalous data. This allows for errors to be corrected quickly and efficiently. The code is designed to be flexible for adding metrics and portable for use in other networks. We anticipate further development of the DQA by improving the existing web-interface, adding more metrics, adding an interface to facilitate the verification of historic station metadata and performance, and an interface to allow better monitoring of data quality goals.
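The change-detection idea described above, comparing hashes of metadata and data to decide whether a metric needs recalculating, can be sketched as follows. Seedscan itself is written in Java; this Python fragment is only an illustration of the general technique, and the function names, the in-memory dictionary standing in for the database, and the station key are all hypothetical.

```python
import hashlib


def digest(data: bytes, metadata: bytes) -> str:
    """Hash metadata and data together; a length prefix keeps the boundary unambiguous."""
    h = hashlib.sha256()
    h.update(len(metadata).to_bytes(8, "big"))
    h.update(metadata)
    h.update(data)
    return h.hexdigest()


def needs_recalculation(stored: dict, key, data: bytes, metadata: bytes) -> bool:
    """Return True (and record the new hash) when the inputs for `key` changed."""
    current = digest(data, metadata)
    if stored.get(key) == current:
        return False
    stored[key] = current
    return True


# Hypothetical key: one sensor channel for one day.
stored_hashes = {}
key = ("XX.STA.CH1", "2013-12-01")
print(needs_recalculation(stored_hashes, key, b"samples", b"response-v1"))  # True
print(needs_recalculation(stored_hashes, key, b"samples", b"response-v1"))  # False
print(needs_recalculation(stored_hashes, key, b"samples", b"response-v2"))  # True
```

Keying the stored hash on one channel per day mirrors the storage granularity mentioned in the abstract, so a metadata correction triggers recomputation only for the affected channel-days.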
NASA Technical Reports Server (NTRS)
McFarland, Shane M.; Norcross, Jason
2016-01-01
Existing methods for evaluating EVA suit performance and mobility have historically concentrated on isolated joint range of motion and torque. However, these techniques do little to evaluate how well a suited crewmember can actually perform during an EVA. An alternative method of characterizing suited mobility through measurement of metabolic cost to the wearer has been evaluated at Johnson Space Center over the past several years. The most recent study involved six test subjects completing multiple trials of various functional tasks in each of three different space suits; the results indicated it was often possible to discern between different suit designs on the basis of metabolic cost alone. However, other variables may have an effect on real-world suited performance; namely, completion time of the task, the gravity field in which the task is completed, etc. While previous results have analyzed completion time, metabolic cost, and metabolic cost normalized to system mass individually, it is desirable to develop a single metric comprising these (and potentially other) performance metrics. This paper outlines the background upon which this single-score metric is determined to be feasible, and initial efforts to develop such a metric. Forward work includes variable coefficient determination and verification of the metric through repeated testing.
Woods, Carl T; Veale, James P; Collier, Neil; Robertson, Sam
2017-02-01
This study investigated the extent to which position in the Australian Football League (AFL) national draft is associated with individual game performance metrics. Physical/technical skill performance metrics were collated from all participants in the 2014 national under 18 (U18) championships (18 games) drafted into the AFL (n = 65; 17.8 ± 0.5 y); 232 observations. Players were subdivided into draft position (ranked 1-65) and then draft round (1-4). Here, earlier draft selection (i.e., closer to 1) reflects a more desirable player. Microtechnology and a commercial provider facilitated the quantification of individual game performance metrics (n = 16). Linear mixed models were fitted to data, modelling the extent to which draft position was associated with these metrics. Draft position in the first/second round was negatively associated with "contested possessions" and "contested marks", respectively. Physical performance metrics were positively associated with draft position in these rounds. Correlations weakened for the third/fourth rounds. Contested possessions/marks were associated with an earlier draft selection. Physical performance metrics were associated with a later draft selection. Recruiters change the type of U18 player they draft as the selection pool reduces. Juniors with contested skill appear prioritised.
Evaluation Metrics for Biostatistical and Epidemiological Collaborations
Rubio, Doris McGartland; del Junco, Deborah J.; Bhore, Rafia; Lindsell, Christopher J.; Oster, Robert A.; Wittkowski, Knut M.; Welty, Leah J.; Li, Yi-Ju; DeMets, Dave
2011-01-01
Increasing demands for evidence-based medicine and for the translation of biomedical research into individual and public health benefit have been accompanied by the proliferation of special units that offer expertise in biostatistics, epidemiology, and research design (BERD) within academic health centers. Objective metrics that can be used to evaluate, track, and improve the performance of these BERD units are critical to their successful establishment and sustainable future. To develop a set of reliable but versatile metrics that can be adapted easily to different environments and evolving needs, we consulted with members of BERD units from the consortium of academic health centers funded by the Clinical and Translational Science Award Program of the National Institutes of Health. Through a systematic process of consensus building and document drafting, we formulated metrics that covered the three identified domains of BERD practices: the development and maintenance of collaborations with clinical and translational science investigators, the application of BERD-related methods to clinical and translational research, and the discovery of novel BERD-related methodologies. In this article, we describe the set of metrics and advocate their use for evaluating BERD practices. The routine application, comparison of findings across diverse BERD units, and ongoing refinement of the metrics will identify trends, facilitate meaningful changes, and ultimately enhance the contribution of BERD activities to biomedical research. PMID:21284015
Sketch Matching on Topology Product Graph.
Liang, Shuang; Luo, Jun; Liu, Wenyin; Wei, Yichen
2015-08-01
Sketch matching is the fundamental problem in sketch based interfaces. After years of study, it remains challenging when there exists large irregularity and variations in the hand drawn sketch shapes. While most existing works exploit topology relations and graph representations for this problem, they are usually limited by the coarse topology exploration and heuristic (thus suboptimal) similarity metrics between graphs. We present a new sketch matching method with two novel contributions. We introduce a comprehensive definition of topology relations, which results in a rich and informative graph representation of sketches. For graph matching, we propose topology product graph that retains the full correspondence for matching two graphs. Based on it, we derive an intuitive sketch similarity metric whose exact solution is easy to compute. In addition, the graph representation and new metric naturally support partial matching, an important practical problem that received less attention in the literature. Extensive experimental results on a real challenging dataset and the superior performance of our method show that it outperforms the state-of-the-art.
Neural decoding with kernel-based metric learning.
Brockmeier, Austin J; Choi, John S; Kriminger, Evan G; Francis, Joseph T; Principe, Jose C
2014-06-01
In studies of the nervous system, the choice of metric for the neural responses is a pivotal assumption. For instance, a well-suited distance metric enables us to gauge the similarity of neural responses to various stimuli and assess the variability of responses to a repeated stimulus (exploratory steps in understanding how the stimuli are encoded neurally). Here we introduce an approach where the metric is tuned for a particular neural decoding task. Neural spike train metrics have been used to quantify the information content carried by the timing of action potentials. While a number of metrics for individual neurons exist, a method to optimally combine single-neuron metrics into multineuron, or population-based, metrics is lacking. We pose the problem of optimizing multineuron metrics and other metrics using centered alignment, a kernel-based dependence measure. The approach is demonstrated on invasively recorded neural data consisting of both spike trains and local field potentials. The experimental paradigm consists of decoding the location of tactile stimulation on the forepaws of anesthetized rats. We show that the optimized metrics highlight the distinguishing dimensions of the neural response, significantly increase the decoding accuracy, and improve nonlinear dimensionality reduction methods for exploratory neural analysis.
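Centered alignment, the dependence measure named above, is commonly defined as the normalised Frobenius inner product of two centered kernel matrices. The sketch below follows that standard definition in plain Python; it is not the authors' implementation, and the toy kernels (linear kernels on made-up one-dimensional values) are assumptions for illustration.

```python
import math


def center(kernel):
    """Double-center a kernel matrix: subtract row and column means, add the grand mean."""
    n = len(kernel)
    row_mean = [sum(row) / n for row in kernel]
    col_mean = [sum(kernel[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[kernel[i][j] - row_mean[i] - col_mean[j] + grand
             for j in range(n)] for i in range(n)]


def frobenius_inner(a, b):
    """Sum of elementwise products of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def centered_alignment(k, l):
    """Alignment of two kernel matrices after centering; at most 1, and 1 when they match up to a positive scale."""
    kc, lc = center(k), center(l)
    return frobenius_inner(kc, lc) / math.sqrt(
        frobenius_inner(kc, kc) * frobenius_inner(lc, lc))


# Toy linear kernels on hypothetical 1-D values.
x = [0.0, 1.0, 2.0]
y = [0.0, 2.0, 4.0]   # y is a rescaled copy of x
k = [[a * b for b in x] for a in x]
l = [[a * b for b in y] for a in y]
print(round(centered_alignment(k, l), 6))  # 1.0
```

Tuning a metric for a decoding task then amounts to adjusting the kernel on the neural responses so that its alignment with a kernel on the stimulus labels increases.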
DOT National Transportation Integrated Search
2018-02-02
The objective of this study is to develop an evidence-based research implementation database and tool to support research implementation at the Georgia Department of Transportation (GDOT). A review was conducted drawing from the (1) implementati...
Red Dragon-MSL Hybrid Landing Architecture for 2018
NASA Astrophysics Data System (ADS)
Grover, M. R.; Sklyanskiy, E.; Stelzner, A. D.; Sherwood, B.
2012-06-01
Hybridizing modern developments at SpaceX and JPL could enable landing 1 metric ton-class payloads on Mars for on the order of $250M, beginning in 2018. Near term, OCT could perform Earth-based flight demonstration of supersonic retropropulsion.
Metrics for Evaluation of Student Models
ERIC Educational Resources Information Center
Pelanek, Radek
2015-01-01
Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…
A comparison of quantum limited dose and noise equivalent dose
NASA Astrophysics Data System (ADS)
Job, Isaias D.; Boyce, Sarah J.; Petrillo, Michael J.; Zhou, Kungang
2016-03-01
Quantum-limited-dose (QLD) and noise-equivalent-dose (NED) are performance metrics often used interchangeably. Although the metrics are related, they are not equivalent unless the treatment of electronic noise is carefully considered. These metrics are increasingly important to properly characterize the low-dose performance of flat panel detectors (FPDs). A system can be said to be quantum-limited when the Signal-to-noise-ratio (SNR) is proportional to the square-root of x-ray exposure. Recent experiments utilizing three methods to determine the quantum-limited dose range yielded inconsistent results. To investigate the deviation in results, generalized analytical equations are developed to model the image processing and analysis of each method. We test the generalized expression for both radiographic and fluoroscopic detectors. The resulting analysis shows that total noise content of the images processed by each method are inherently different based on their readout scheme. Finally, it will be shown that the NED is equivalent to the instrumentation-noise-equivalent-exposure (INEE) and furthermore that the NED is derived from the quantum-noise-only method of determining QLD. Future investigations will measure quantum-limited performance of radiographic panels with a modified readout scheme to allow for noise improvements similar to measurements performed with fluoroscopic detectors.
A support vector machine for predicting defibrillation outcomes from waveform metrics.
Howe, Andrew; Escalona, Omar J; Di Maio, Rebecca; Massot, Bertrand; Cromie, Nick A; Darragh, Karen M; Adgey, Jennifer; McEneaney, David J
2014-03-01
Algorithms to predict shock success based on VF waveform metrics could significantly enhance resuscitation by optimising the timing of defibrillation. To investigate robust methods of predicting defibrillation success in VF cardiac arrest patients, by using a support vector machine (SVM) optimisation approach. Frequency-domain (AMSA, dominant frequency and median frequency) and time-domain (slope and RMS amplitude) VF waveform metrics were calculated in a 4.1Y window prior to defibrillation. Conventional prediction test validity of each waveform parameter was conducted and used AUC>0.6 as the criterion for inclusion as a corroborative attribute processed by the SVM classification model. The latter used a Gaussian radial-basis-function (RBF) kernel and the error penalty factor C was fixed to 1. A two-fold cross-validation resampling technique was employed. A total of 41 patients had 115 defibrillation instances. AMSA, slope and RMS waveform metrics performed test validation with AUC>0.6 for predicting termination of VF and return-to-organised rhythm. Predictive accuracy of the optimised SVM design for termination of VF was 81.9% (± 1.24 SD); positive and negative predictivity were respectively 84.3% (± 1.98 SD) and 77.4% (± 1.24 SD); sensitivity and specificity were 87.6% (± 2.69 SD) and 71.6% (± 9.38 SD) respectively. AMSA, slope and RMS were the best VF waveform frequency-time parameters predictors of termination of VF according to test validity assessment. This a priori can be used for a simplified SVM optimised design that combines the predictive attributes of these VF waveform metrics for improved prediction accuracy and generalisation performance without requiring the definition of any threshold value on waveform metrics. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
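The accuracy, predictivity, sensitivity and specificity figures reported above are all derived from the four cells of a binary confusion matrix. A minimal sketch of those definitions follows; the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary test metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictivity
        "npv": tn / (tn + fn),           # negative predictivity
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only.
for name, value in diagnostic_metrics(tp=40, fp=10, tn=30, fn=20).items():
    print(f"{name}: {value:.3f}")
```

In a cross-validated design such as the one described, these quantities would be computed per fold and then summarised as a mean and standard deviation.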
NASA Astrophysics Data System (ADS)
de Barros, Felipe P. J.; Ezzedine, Souheil; Rubin, Yoram
2012-02-01
The significance of conditioning predictions of environmental performance metrics (EPMs) on hydrogeological data in heterogeneous porous media is addressed. Conditioning EPMs on available data reduces uncertainty and increases the reliability of model predictions. We present a rational and concise approach to investigate the impact of conditioning EPMs on data as a function of the location of the environmentally sensitive target receptor, data types and spacing between measurements. We illustrate how the concept of comparative information yield curves introduced in de Barros et al. [de Barros FPJ, Rubin Y, Maxwell R. The concept of comparative information yield curves and its application to risk-based site characterization. Water Resour Res 2009;45:W06401. doi:10.1029/2008WR007324] could be used to assess site characterization needs as a function of flow and transport dimensionality and EPMs. For a given EPM, we show how alternative uncertainty reduction metrics yield distinct gains of information from a variety of sampling schemes. Our results show that uncertainty reduction is EPM dependent (e.g., travel times) and does not necessarily indicate uncertainty reduction in an alternative EPM (e.g., human health risk). The results show how the position of the environmental target, flow dimensionality and the choice of the uncertainty reduction metric can be used to assist in field sampling campaigns.
Unbiased Estimation of Refractive State of Aberrated Eyes
Martin, Jesson; Vasudevan, Balamurali; Himebaugh, Nikole; Bradley, Arthur; Thibos, Larry
2011-01-01
To identify unbiased methods for estimating the target vergence required to maximize visual acuity based on wavefront aberration measurements. Experiments were designed to minimize the impact of confounding factors that have hampered previous research. Objective wavefront refractions and subjective acuity refractions were obtained for the same monochromatic wavelength. Accommodation and pupil fluctuations were eliminated by cycloplegia. Unbiased subjective refractions that maximize visual acuity for high contrast letters were performed with a computer controlled forced choice staircase procedure, using 0.125 diopter steps of defocus. All experiments were performed for two pupil diameters (3mm and 6mm). As reported in the literature, subjective refractive error does not change appreciably when the pupil dilates. For 3 mm pupils most metrics yielded objective refractions that were about 0.1D more hyperopic than subjective acuity refractions. When pupil diameter increased to 6 mm, this bias changed in the myopic direction and the variability between metrics also increased. These inaccuracies were small compared to the precision of the measurements, which implies that most metrics provided unbiased estimates of refractive state for medium and large pupils. A variety of image quality metrics may be used to determine ocular refractive state for monochromatic (635nm) light, thereby achieving accurate results without the need for empirical correction factors. PMID:21777601
NASA Astrophysics Data System (ADS)
Kierkels, R. G. J.; den Otter, L. A.; Korevaar, E. W.; Langendijk, J. A.; van der Schaaf, A.; Knopf, A. C.; Sijtsema, N. M.
2018-02-01
A prerequisite for adaptive dose-tracking in radiotherapy is the assessment of the deformable image registration (DIR) quality. In this work, various metrics that quantify DIR uncertainties are investigated using realistic deformation fields of 26 head and neck and 12 lung cancer patients. Metrics related to the physiological feasibility (the Jacobian determinant, harmonic energy (HE), and octahedral shear strain (OSS)) and numerical robustness of the deformation (the inverse consistency error (ICE), transitivity error (TE), and distance discordance metric (DDM)) were investigated. The deformable registrations were performed using a B-spline transformation model. The DIR error metrics were log-transformed and correlated (Pearson) against the log-transformed ground-truth error on a voxel level. Correlations of r ⩾ 0.5 were found for the DDM and HE. Given a DIR tolerance threshold of 2.0 mm and a negative predictive value of 0.90, the DDM and HE thresholds were 0.49 mm and 0.014, respectively. In conclusion, the log-transformed DDM and HE can be used to identify voxels at risk for large DIR errors with a large negative predictive value. The HE and/or DDM can therefore be used to perform automated quality assurance of each CT-based DIR for head and neck and lung cancer patients.
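The correlation step described in this abstract (log-transform both quantities, then compute Pearson's r) can be sketched in a few lines of Python. This is an illustration only: the function names and the per-voxel values are hypothetical, all inputs are assumed to be strictly positive, and the natural logarithm is an assumption (Pearson's r does not depend on the log base).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_pearson(metric_values, error_values):
    """Correlate log-transformed metric values with log-transformed errors."""
    log_metric = [math.log(v) for v in metric_values]
    log_error = [math.log(v) for v in error_values]
    return pearson_r(log_metric, log_error)


# Hypothetical per-voxel values; here the error is exactly proportional to the metric.
metric = [0.1, 0.2, 0.4, 0.8]
error = [0.5, 1.0, 2.0, 4.0]
r = log_pearson(metric, error)
```

Because the example error is a constant multiple of the metric, the two log-transformed series differ only by an offset and the correlation is 1 up to floating-point rounding.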
Alternative Indices of Performance: An Exploration of Eye Gaze Metrics in a Visual Puzzle Task
2014-07-01
strategy in which participants anchor their search on pieces in the correct positions results. Although there were significant effects for both...
NASA Technical Reports Server (NTRS)
McConnell, Joshua B.
2000-01-01
The scientific exploration of Mars will require the collection and return of subterranean samples to Earth for examination. This necessitates the use of some type of device or devices that possess the ability to effectively penetrate the Martian surface, collect suitable samples and return them to the surface in a manner consistent with imposed scientific constraints. The first opportunity for such a device will occur on the 2003 and 2005 Mars Sample Return missions, being performed by NASA. This paper reviews the work completed on the compilation of a database containing viable penetrating and sampling devices, the performance of a system level trade study comparing selected devices to a set of prescribed parameters and the employment of a metric for the evaluation and ranking of the traded penetration and sampling devices, with respect to possible usage on the 03 and 05 sample return missions. The trade study performed is based on a select set of scientific, engineering, programmatic and socio-political criteria. The use of a metric for the various penetration and sampling devices will act to expedite current and future device selection.
Cultural Issues in Psychiatric Administration and Leadership
Aggarwal, Neil Krishan
2016-01-01
This paper addresses cultural issues in psychiatric administration and leadership through two issues: (1) the changing culture of psychiatric practice based on new clinician performance metrics and (2) the culture of psychiatric administration and leadership in light of organizational cultural competence. Regarding the first issue, some observers have discussed the challenges of creating novel practice environments that balance business values of efficient performance with fiduciary values of treatment competence. This paper expands upon this discussion, demonstrating that some metrics from the Centers for Medicare & Medicaid Services, the nation’s largest funder of postgraduate medical training, may penalize clinicians for patient medication behaviors that are unrelated to clinician performance. A focus on pharmacotherapy over psychotherapy in these metrics has unclear consequences for the future of psychiatric training. Regarding the second issue, studies of psychiatric administration and leadership reveal a disproportionate influence of older men in positions of power despite efforts to recruit women, minorities, and immigrants who increasingly constitute the psychiatric workforce. Organizational cultural competence initiatives can diversify institutional cultures so that psychiatric leaders better reflect the populations they serve. In both cases, psychiatric administrators and leaders play critical roles in ensuring that their organizations respond to social challenges. PMID:26071640
Cultural Issues in Psychiatric Administration and Leadership.
Aggarwal, Neil Krishan
2015-09-01
This paper addresses cultural issues in psychiatric administration and leadership through two issues: (1) the changing culture of psychiatric practice based on new clinician performance metrics and (2) the culture of psychiatric administration and leadership in light of organizational cultural competence. Regarding the first issue, some observers have discussed the challenges of creating novel practice environments that balance business values of efficient performance with fiduciary values of treatment competence. This paper expands upon this discussion, demonstrating that some metrics from the Centers for Medicare & Medicaid Services, the nation's largest funder of postgraduate medical training, may penalize clinicians for patient medication behaviors that are unrelated to clinician performance. A focus on pharmacotherapy over psychotherapy in these metrics has unclear consequences for the future of psychiatric training. Regarding the second issue, studies of psychiatric administration and leadership reveal a disproportionate influence of older men in positions of power despite efforts to recruit women, minorities, and immigrants who increasingly constitute the psychiatric workforce. Organizational cultural competence initiatives can diversify institutional cultures so that psychiatric leaders better reflect the populations they serve. In both cases, psychiatric administrators and leaders play critical roles in ensuring that their organizations respond to social challenges.
Color preservation for tone reproduction and image enhancement
NASA Astrophysics Data System (ADS)
Hsin, Chengho; Lee, Zong Wei; Lee, Zheng Zhan; Shin, Shaw-Jyh
2014-01-01
Applications based on luminance processing often face the problem of recovering the original chrominance in the output color image. A common approach to reconstructing a color image from the luminance output is to preserve the original hue and saturation. However, this approach often produces an excessively colorful image, which is undesirable. We develop a color preservation method that not only retains the ratios of the input tri-chromatic values but also adjusts the output chroma in an appropriate way. Linearizing the output luminance is the key idea behind this method. In addition, a lightness difference metric and a colorfulness difference metric are proposed to evaluate the performance of color preservation methods. The evaluation shows that the proposed method performs consistently better than the existing approaches.
HealthTrust: A Social Network Approach for Retrieving Online Health Videos
Karlsen, Randi; Melton, Genevieve B
2012-01-01
Background Social media are becoming mainstream in the health domain. Despite the large volume of accurate and trustworthy health information available on social media platforms, finding good-quality health information can be difficult. Misleading health information can often be popular (eg, antivaccination videos) and therefore highly rated by general search engines. We believe that community wisdom about the quality of health information can be harnessed to help create tools for retrieving good-quality social media content. Objectives To explore approaches for extracting metrics about authoritativeness in online health communities and how these metrics positively correlate with the quality of the content. Methods We designed a metric, called HealthTrust, that estimates the trustworthiness of social media content (eg, blog posts or videos) in a health community. The HealthTrust metric calculates reputation in an online health community based on link analysis. We used the metric to retrieve YouTube videos and channels about diabetes. In two different experiments, health consumers provided 427 ratings of 17 videos and professionals gave 162 ratings of 23 videos. In addition, two professionals reviewed 30 diabetes channels. Results HealthTrust may be used for retrieving online videos on diabetes, since it performed better than YouTube Search in most cases. Overall, of 20 potential channels, HealthTrust’s filtering allowed only 3 bad channels (15%) versus 8 (40%) on the YouTube list. Misleading and graphic videos (eg, featuring amputations) were more commonly found by YouTube Search than by searches based on HealthTrust. However, some videos from trusted sources had low HealthTrust scores, mostly from general health content providers, and therefore not highly connected in the diabetes community. 
When comparing video ratings from our reviewers, we found that HealthTrust achieved a positive and statistically significant correlation with professionals (Pearson r₁₀ = .65, P = .02) and a trend toward significance with health consumers (r₇ = .65, P = .06) with videos on hemoglobin A1c, but it did not perform as well with diabetic foot videos. Conclusions The trust-based metric HealthTrust showed promising results when used to retrieve diabetes content from YouTube. Our research indicates that social network analysis may be used to identify trustworthy social media in health communities. PMID:22356723
Image sharpness assessment based on wavelet energy of edge area
NASA Astrophysics Data System (ADS)
Li, Jin; Zhang, Hong; Zhang, Lei; Yang, Yifan; He, Lei; Sun, Mingui
2018-04-01
Image quality assessment is needed in many image processing areas, and blur is one of the key causes of image deterioration. Although effective full-reference image quality assessment metrics have been proposed in the past few years, no-reference assessment is still an area of active research. To address this problem, this paper proposes a no-reference sharpness assessment method based on the wavelet transform that focuses on the edge area of the image. Based on two simple characteristics of the human visual system, weights are introduced to calculate the weighted log-energy of each wavelet sub-band. The final score is given by the ratio of high-frequency energy to total energy. The algorithm is tested on multiple databases. Compared with several state-of-the-art metrics, the proposed algorithm achieves better performance with less runtime.
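The core idea of scoring sharpness as the ratio of high-frequency energy to total energy can be illustrated with a one-level 2D Haar transform. The Python sketch below is a deliberately simplified assumption: it uses plain (unweighted, non-logarithmic) sub-band energy over the whole image, so it omits the edge-area selection and the human-vision weights described in the abstract. The function names are hypothetical.

```python
def haar2d(img):
    """One-level orthonormal 2D Haar transform.

    `img` is a list of rows with even height and width.
    Returns the approximation band followed by three detail bands.
    """
    approx, det_a, det_b, det_c = [], [], [], []
    for i in range(0, len(img), 2):
        rows = ([], [], [], [])
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # local average (low frequency)
            rows[1].append((a - b + c - d) / 2)  # column-to-column differences
            rows[2].append((a + b - c - d) / 2)  # row-to-row differences
            rows[3].append((a - b - c + d) / 2)  # diagonal differences
        approx.append(rows[0])
        det_a.append(rows[1])
        det_b.append(rows[2])
        det_c.append(rows[3])
    return approx, det_a, det_b, det_c


def band_energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)


def sharpness_score(img):
    """Ratio of detail-band (high-frequency) energy to total energy."""
    approx, *details = haar2d(img)
    high = sum(band_energy(b) for b in details)
    total = high + band_energy(approx)
    return high / total if total else 0.0


flat = [[5, 5, 5, 5] for _ in range(4)]                        # no detail at all
checker = [[(i + j) % 2 for j in range(4)] for i in range(4)]  # maximal fine detail
flat_score = sharpness_score(flat)
checker_score = sharpness_score(checker)
```

A uniform image yields a score of zero because all detail coefficients vanish, while the checkerboard places half of its energy in the diagonal detail band.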
Applying Sigma Metrics to Reduce Outliers.
Litten, Joseph
2017-03-01
Sigma metrics can be used to predict assay quality, allowing easy comparison of instrument quality and predicting which tests will require minimal quality control (QC) rules to monitor the performance of the method. A Six Sigma QC program can result in fewer controls and fewer QC failures for methods with a sigma metric of 5 or better. The higher the number of methods with a sigma metric of 5 or better, the lower the costs for reagents, supplies, and control material required to monitor the performance of the methods. Copyright © 2016 Elsevier Inc. All rights reserved.
Poisson, Sharon N.; Josephson, S. Andrew
2011-01-01
Stroke is a major public health burden, and accounts for many hospitalizations each year. Due to gaps in practice and recommended guidelines, there has been a recent push toward implementing quality measures to be used for improving patient care, comparing institutions, as well as for rewarding or penalizing physicians through pay-for-performance. This article reviews the major organizations involved in implementing quality metrics for stroke, and the 10 major metrics currently being tracked. We also discuss possible future metrics and the implications of public reporting and using metrics for pay-for-performance. PMID:23983840
Christoforou, Christoforos; Papadopoulos, Timothy C.; Constantinidou, Fofi; Theodorou, Maria
2017-01-01
The ability to anticipate the population-wide response of a target audience to a new movie or TV series, before its release, is critical to the film industry. Equally important is the ability to understand the underlying factors that drive or characterize viewers' decisions to watch a movie. Traditional approaches (which involve pilot test-screenings, questionnaires, and focus groups) have reached a plateau in their ability to predict the population-wide responses to new movies. In this study, we develop a novel computational approach for extracting neurophysiological electroencephalography (EEG) and eye-gaze based metrics to predict the population-wide behavior of moviegoers. We further explore the connection of the derived metrics to the underlying cognitive processes that might drive moviegoers' decision to watch a movie. To that end, we recorded neural activity (through the use of EEG) and eye-gaze activity from a group of naive individuals while they watched trailers of pre-selected movies for which the population-wide preference is captured by the movie's market performance (i.e., box-office ticket sales in the US). Our findings show that the neural based metrics, derived using the proposed methodology, carry predictive information about the broader audience decisions to watch a movie, above and beyond traditional methods. In particular, neural metrics are shown to predict up to 72% of the variance of the films' performance at their premiere and up to 67% of the variance at following weekends, which corresponds to a 23-fold increase in prediction accuracy compared to current neurophysiological or traditional methods. We discuss our findings in the context of existing literature and hypothesize on the possible connection of the derived neurophysiological metrics to cognitive states of focused attention, the encoding of long-term memory, and the synchronization of different components of the brain's rewards network. 
Beyond the practical implication in predicting and understanding the behavior of moviegoers, the proposed approach can facilitate the use of video stimuli in neuroscience research; such as the study of individual differences in attention-deficit disorders, and the study of desensitization to media violence. PMID:29311885
Virtual reality simulator training for laparoscopic colectomy: what metrics have construct validity?
Shanmugan, Skandan; Leblanc, Fabien; Senagore, Anthony J; Ellis, C Neal; Stein, Sharon L; Khan, Sadaf; Delaney, Conor P; Champagne, Bradley J
2014-02-01
Virtual reality simulation for laparoscopic colectomy has been used for training of surgical residents and has been considered as a model for technical skills assessment of board-eligible colorectal surgeons. However, construct validity (the ability to distinguish between skill levels) must be confirmed before widespread implementation. This study was designed to specifically determine which metrics for laparoscopic sigmoid colectomy have evidence of construct validity. General surgeons who had performed fewer than 30 laparoscopic colon resections and laparoscopic colorectal experts (>200 laparoscopic colon resections) performed laparoscopic sigmoid colectomy on the LAP Mentor model. All participants received a 15-minute instructional warm-up and had never used the simulator before the study. Performance was then compared between the two groups for 21 metrics (procedural, 14; intraoperative errors, 7) to determine specifically which measurements demonstrate construct validity. Performance was compared with the Mann-Whitney U-test (p < 0.05 was significant). Fifty-three surgeons (29 general surgeons and 24 colorectal surgeons) enrolled in the study. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 of 14 procedural metrics by distinguishing levels of surgical experience (p < 0.05). The most discriminatory procedural metrics (p < 0.01) favoring experts were reduced instrument path length, accuracy of the peritoneal/medial mobilization, and dissection of the inferior mesenteric artery. Intraoperative errors were not discriminatory for most metrics and favored general surgeons for colonic wall injury (general surgeons, 0.7; colorectal surgeons, 3.5; p = 0.045). Individual variability within the general surgeon and colorectal surgeon groups was not accounted for. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 procedure-specific metrics. 
However, using virtual reality simulator metrics to detect intraoperative errors did not discriminate between groups. If the virtual reality simulator continues to be used for the technical assessment of trainees and board-eligible surgeons, the evaluation of performance should be limited to procedural metrics.
ERIC Educational Resources Information Center
Whitesell, Emilyn Ruble
2015-01-01
School accountability systems are a popular approach to improving education outcomes in the United States. These systems intend to "hold schools accountable" by assessing school performance on specific metrics, publishing accountability reports, and some combination of rewarding and sanctioning schools based on performance. Additionally,…
ERIC Educational Resources Information Center
Lauen, Douglas Lee
2011-01-01
This study examines the incentive effects of North Carolina's practice of awarding performance bonuses on test score achievement on the state tests. Bonuses were awarded based solely on whether a school exceeds a threshold on a continuous performance metric. The study uses a sharp regression discontinuity design, an approach with strong internal…
Nanthagopal, A Padma; Rajamony, R Sukanesh
2012-07-01
The proposed system provides new textural information for segmenting tumours efficiently, accurately and with less computational time from benign and malignant tumour images, especially for small tumour regions in computed tomography (CT) images. Region-based segmentation of tumour from brain CT image data is an important but time-consuming task performed manually by medical experts. The objective of this work is to segment brain tumour from CT images using combined grey and texture features with new edge features and a nonlinear support vector machine (SVM) classifier. The selected optimal features are used to model and train the nonlinear SVM classifier to segment the tumour from CT images, and the segmentation accuracies are evaluated for each slice of the tumour image. The method is applied to real data of 80 benign and malignant tumour images. The results are compared with the radiologist-labelled ground truth. Quantitative analysis between ground truth and the segmented tumour is presented in terms of segmentation accuracy and the overlap similarity measure, the Dice metric. From the analysis and performance measures such as segmentation accuracy and Dice metric, it is inferred that better segmentation accuracy and higher Dice metric are achieved with the normalized cut segmentation method than with the fuzzy c-means clustering method.
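The Dice metric mentioned in this abstract measures the overlap between a predicted segmentation and the ground truth as twice the size of their intersection divided by the sum of their sizes. A minimal Python sketch follows, assuming masks flattened to 0/1 sequences; the function name and the example masks are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as equal-length 0/1 sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match by convention.
    return 2 * intersection / size_sum if size_sum else 1.0


# Hypothetical flattened masks: 1 marks a tumour pixel.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
dice = dice_coefficient(predicted, reference)
```

Here the masks share two tumour pixels and each contains three, so the overlap is 2·2 / (3 + 3) ≈ 0.667; identical masks give 1 and disjoint masks give 0.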
An Exploratory Study of OEE Implementation in Indian Manufacturing Companies
NASA Astrophysics Data System (ADS)
Kumar, J.; Soni, V. K.
2015-04-01
Globally, the implementation of overall equipment effectiveness (OEE) has proven to be highly effective in improving availability, performance rate and quality rate while reducing unscheduled breakdowns and equipment-related wastage. This paper investigates the present status and future scope of OEE metrics in Indian manufacturing companies through an extensive survey. In this survey, the opinions of production and maintenance managers were analyzed statistically to explore the relationship between factors, perspectives on OEE and the potential use of OEE metrics. Although the sample is diverse in terms of product, process type, size, and geographic location of the companies, all of them are compelled to implement improvement techniques such as OEE metrics to improve performance. The findings reveal that OEE metrics have considerable potential and scope to improve performance. Responses indicate that Indian companies are aware of OEE but are not utilizing the full potential of OEE metrics.
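OEE is conventionally computed as the product of the three rates named in this abstract: availability, performance rate and quality rate. The Python sketch below uses the commonly cited textbook definitions of each rate, which are an assumption here because the abstract does not state them; the function name and the shift figures are hypothetical.

```python
def oee(planned_time, run_time, ideal_cycle_time, total_count, good_count):
    """Return (availability, performance, quality, oee) as fractions.

    Times must share one unit (e.g. minutes); ideal_cycle_time is time per unit.
    """
    availability = run_time / planned_time                     # share of planned time actually run
    performance = (ideal_cycle_time * total_count) / run_time  # actual speed versus ideal speed
    quality = good_count / total_count                         # share of output that is good
    return availability, performance, quality, availability * performance * quality


# Hypothetical shift: 480 min planned, 400 min running, 1 min ideal cycle,
# 360 units produced, 342 of them good.
availability, performance, quality, score = oee(480, 400, 1.0, 360, 342)
```

For this made-up shift the three rates are about 0.833, 0.90 and 0.95, giving an OEE of 0.7125, which shows how moderate losses in each factor compound.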
Concussion classification via deep learning using whole-brain white matter fiber strains
Cai, Yunliang; Wu, Shaoju; Zhao, Wei; Li, Zhigang; Wu, Zheyang
2018-01-01
Developing an accurate and reliable injury predictor is central to the biomechanical studies of traumatic brain injury. State-of-the-art efforts continue to rely on empirical, scalar metrics based on kinematics or model-estimated tissue responses explicitly pre-defined in a specific brain region of interest. They could suffer from loss of information. A single training dataset has also been used to evaluate performance but without cross-validation. In this study, we developed a deep learning approach for concussion classification using implicit features of the entire voxel-wise white matter fiber strains. Using reconstructed American National Football League (NFL) injury cases, leave-one-out cross-validation was employed to objectively compare injury prediction performances against two baseline machine learning classifiers (support vector machine (SVM) and random forest (RF)) and four scalar metrics via univariate logistic regression (Brain Injury Criterion (BrIC), cumulative strain damage measure of the whole brain (CSDM-WB) and the corpus callosum (CSDM-CC), and peak fiber strain in the CC). Feature-based machine learning classifiers including deep learning, SVM, and RF consistently outperformed all scalar injury metrics across all performance categories (e.g., leave-one-out accuracy of 0.828–0.862 vs. 0.690–0.776, and .632+ error of 0.148–0.176 vs. 0.207–0.292). Further, deep learning achieved the best cross-validation accuracy, sensitivity, AUC, and .632+ error. These findings demonstrate the superior performances of deep learning in concussion prediction and suggest its promise for future applications in biomechanical investigations of traumatic brain injury. PMID:29795640
Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A
2013-12-01
A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks from the previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety-determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum, and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop the Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in the operative field, critical error, economy, bimanual dexterity, and time. Subsequently, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
Concussion classification via deep learning using whole-brain white matter fiber strains.
Cai, Yunliang; Wu, Shaoju; Zhao, Wei; Li, Zhigang; Wu, Zheyang; Ji, Songbai
2018-01-01
Developing an accurate and reliable injury predictor is central to the biomechanical studies of traumatic brain injury. State-of-the-art efforts continue to rely on empirical, scalar metrics based on kinematics or model-estimated tissue responses explicitly pre-defined in a specific brain region of interest. They could suffer from loss of information. A single training dataset has also been used to evaluate performance but without cross-validation. In this study, we developed a deep learning approach for concussion classification using implicit features of the entire voxel-wise white matter fiber strains. Using reconstructed American National Football League (NFL) injury cases, leave-one-out cross-validation was employed to objectively compare injury prediction performances against two baseline machine learning classifiers (support vector machine (SVM) and random forest (RF)) and four scalar metrics via univariate logistic regression (Brain Injury Criterion (BrIC), cumulative strain damage measure of the whole brain (CSDM-WB) and the corpus callosum (CSDM-CC), and peak fiber strain in the CC). Feature-based machine learning classifiers including deep learning, SVM, and RF consistently outperformed all scalar injury metrics across all performance categories (e.g., leave-one-out accuracy of 0.828-0.862 vs. 0.690-0.776, and .632+ error of 0.148-0.176 vs. 0.207-0.292). Further, deep learning achieved the best cross-validation accuracy, sensitivity, AUC, and .632+ error. These findings demonstrate the superior performances of deep learning in concussion prediction and suggest its promise for future applications in biomechanical investigations of traumatic brain injury.
Metrication report to the Congress
NASA Technical Reports Server (NTRS)
1991-01-01
NASA's principal metrication accomplishments for FY 1990 were establishment of metrication policy for major programs, development of an implementing instruction for overall metric policy and initiation of metrication planning for the major program offices. In FY 1991, development of an overall NASA plan and individual program office plans will be completed, requirement assessments will be performed for all support areas, and detailed assessment and transition planning will be undertaken at the institutional level. Metric feasibility decisions on a number of major programs are expected over the next 18 months.
Foresters' Metric Conversions program (version 1.0). [Computer program]
Jefferson A. Palmer
1999-01-01
The conversion of scientific measurements has become commonplace in the fields of engineering, research, and forestry. Foresters' Metric Conversions is a Windows-based computer program that quickly converts user-defined measurements from English to metric and from metric to English. Foresters' Metric Conversions was derived from the publication "Metric...
Benefits of utilizing CellProfiler as a characterization tool for U–10Mo nuclear fuel
DOE Office of Scientific and Technical Information (OSTI.GOV)
Collette, R.; Douglas, J.; Patterson, L.
2015-07-15
Automated image processing techniques have the potential to aid in the performance evaluation of nuclear fuels by eliminating judgment calls that may vary from person-to-person or sample-to-sample. Analysis of in-core fuel performance is required for design and safety evaluations related to almost every aspect of the nuclear fuel cycle. This study presents a methodology for assessing the quality of uranium–molybdenum fuel images and describes image analysis routines designed for the characterization of several important microstructural properties. The analyses are performed in CellProfiler, an open-source program designed to enable biologists without training in computer vision or programming to automatically extract cellular measurements from large image sets. The quality metric scores an image based on three parameters: the illumination gradient across the image, the overall focus of the image, and the fraction of the image that contains scratches. The metric presents the user with the ability to ‘pass’ or ‘fail’ an image based on a reproducible quality score. Passable images may then be characterized through a separate CellProfiler pipeline, which enlists a variety of common image analysis techniques. The results demonstrate the ability to reliably pass or fail images based on the illumination, focus, and scratch fraction of the image, followed by automatic extraction of morphological data with respect to fission gas voids, interaction layers, and grain boundaries. Highlights: • A technique is developed to score U–10Mo FIB-SEM image quality using CellProfiler. • The pass/fail metric is based on image illumination, focus, and area scratched. • Automated image analysis is performed in pipeline fashion to characterize images. • Fission gas void, interaction layer, and grain boundary coverage data is extracted. • Preliminary characterization results demonstrate consistency of the algorithm.
Assessing precision, bias and sigma-metrics of 53 measurands of the Alinity ci system.
Westgard, Sten; Petrides, Victoria; Schneider, Sharon; Berman, Marvin; Herzogenrath, Jörg; Orzechowski, Anthony
2017-12-01
Assay performance is dependent on the accuracy and precision of a given method. These attributes can be combined into an analytical Sigma-metric, providing a simple value for laboratorians to use in evaluating a test method's capability to meet its analytical quality requirements. Sigma-metrics were determined for 37 clinical chemistry assays, 13 immunoassays, and 3 ICT methods on the Alinity ci system. Analytical Performance Specifications were defined for the assays, following a rationale of using CLIA goals first, then Ricos Desirable goals when CLIA did not regulate the method, and then other sources if the Ricos Desirable goal was unrealistic. A precision study was conducted at Abbott on each assay using the Alinity ci system following the CLSI EP05-A2 protocol. Bias was estimated following the CLSI EP09-A3 protocol using samples with concentrations spanning the assay's measuring interval tested in duplicate on the Alinity ci system and ARCHITECT c8000 and i2000 SR systems, where testing was also performed at Abbott. Using the regression model, the %bias was estimated at an important medical decision point. Then the Sigma-metric was estimated for each assay and was plotted on a method decision chart. The Sigma-metric was calculated using the equation: Sigma-metric = (%TEa - |%bias|) / %CV. The Sigma-metrics and Normalized Method Decision charts demonstrate that a majority of the Alinity assays perform at five Sigma or higher, at or near critical medical decision levels. More than 90% of the assays performed at Five and Six Sigma. None performed below Three Sigma. Sigma-metrics plotted on Normalized Method Decision charts provide useful evaluations of performance. The majority of Alinity ci system assays had sigma values >5 and thus laboratories can expect excellent or world class performance. Laboratorians can use these tools as aids in choosing high-quality products, further contributing to the delivery of excellent quality healthcare for patients.
Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
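The Sigma-metric equation quoted in the abstract above can be illustrated with a minimal sketch. The function name and the sample values are hypothetical and are not taken from the study; all three inputs are assumed to be percentages at the same decision level.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma-metric = (%TEa - |%bias|) / %CV, as stated in the abstract.

    tea_pct  : allowable total error, in percent
    bias_pct : estimated bias, in percent (sign is ignored)
    cv_pct   : imprecision (coefficient of variation), in percent
    """
    if cv_pct <= 0:
        raise ValueError("%CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# Hypothetical assay: TEa = 10%, bias = -1%, CV = 1.5%
print(sigma_metric(10.0, -1.0, 1.5))  # (10 - 1) / 1.5
```

Because the bias enters as an absolute value, a negative and a positive bias of the same size give the same Sigma-metric.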
Maximizing your return on people.
Bassi, Laurie; McMurrer, Daniel
2007-03-01
Though most traditional HR performance metrics don't predict organizational performance, alternatives simply have not existed--until now. During the past ten years, researchers Laurie Bassi and Daniel McMurrer have worked to develop a system that allows executives to assess human capital management (HCM) and to use those metrics both to predict organizational performance and to guide organizations' investments in people. The new framework is based on a core set of HCM drivers that fall into five major categories: leadership practices, employee engagement, knowledge accessibility, workforce optimization, and organizational learning capacity. By employing rigorously designed surveys to score a company on the range of HCM practices across the five categories, it's possible to benchmark organizational HCM capabilities, identify HCM strengths and weaknesses, and link improvements or back-sliding in specific HCM practices with improvements or shortcomings in organizational performance. The process requires determining a "maturity" score for each practice, based on a scale of 1 (low) to 5 (high). Over time, evolving maturity scores from multiple surveys can reveal progress in each of the HCM practices and help a company decide where to focus improvement efforts that will have a direct impact on performance. The authors draw from their work with American Standard, South Carolina's Beaufort County School District, and a bevy of financial firms to show how improving HCM scores led to increased sales, safety, academic test scores, and stock returns. Bassi and McMurrer urge HR departments to move beyond the usual metrics and begin using HCM measurement tools to gauge how well people are managed and developed throughout the organization. In this new role, according to the authors, HR can take on strategic responsibility and ensure that superior human capital management becomes central to the organization's culture.
Patel, Sajan; Rajkomar, Alvin; Harrison, James D; Prasad, Priya A; Valencia, Victoria; Ranji, Sumant R; Mourad, Michelle
2018-03-05
Audit and feedback improves clinical care by highlighting the gap between current and ideal practice. We combined best practices of audit and feedback with continuously generated electronic health record data to improve performance on quality metrics in an inpatient setting. We conducted a cluster randomised controlled trial comparing intensive audit and feedback with usual audit and feedback from February 2016 to June 2016. The study subjects were internal medicine teams on the teaching service at an urban tertiary care hospital. Teams in the intensive feedback arm received access to a daily-updated team-based data dashboard as well as weekly in-person review of performance data ('STAT rounds'). The usual feedback arm received ongoing twice-monthly emails with graphical depictions of team performance on selected quality metrics. The primary outcome was performance on a composite discharge metric (Discharge Mix Index, 'DMI'). A washout period occurred at the end of the trial (from May through June 2016) during which STAT rounds were removed from the intensive feedback arm. A total of 40 medicine teams participated in the trial. During the intervention period, the primary outcome of completion of the DMI was achieved on 79.3% (426/537) of patients in the intervention group compared with 63.2% (326/516) in the control group (P<0.0001). During the washout period, there was no significant difference in performance between the intensive and usual feedback groups. Intensive audit and feedback using timely data and STAT rounds significantly increased performance on a composite discharge metric compared with usual feedback. With the cessation of STAT rounds, performance between the intensive and usual feedback groups did not differ significantly, highlighting the importance of feedback delivery on effecting change. The trial was registered with ClinicalTrials.gov (NCT02593253). © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018.
All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Restaurant Energy Use Benchmarking Guideline
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hedrick, R.; Smith, V.; Field, K.
2011-07-01
A significant operational challenge for food service operators is defining energy use benchmark metrics to compare against the performance of individual stores. Without metrics, multiunit operators and managers have difficulty identifying which stores in their portfolios require extra attention to bring their energy performance in line with expectations. This report presents a method whereby multiunit operators may use their own utility data to create suitable metrics for evaluating their operations.
On Railroad Tank Car Puncture Performance: Part I - Considering Metrics
DOT National Transportation Integrated Search
2016-04-12
This paper is the first in a two-part series on the puncture performance of railroad tank cars carrying hazardous materials in the event of an accident. Various metrics are often mentioned in the open literature to characterize the structural perform...
Efficient dual approach to distance metric learning.
Shen, Chunhua; Kim, Junae; Liu, Fayao; Wang, Lei; van den Hengel, Anton
2014-02-01
Distance metric learning is of fundamental interest in machine learning because the employed distance metric can significantly affect the performance of many learning methods. Quadratic Mahalanobis metric learning is a popular approach to the problem, but typically requires solving a semidefinite programming (SDP) problem, which is computationally expensive. The worst case complexity of solving an SDP problem involving a matrix variable of size D×D with O(D) linear constraints is about O(D^6.5) using interior-point methods, where D is the dimension of the input data. Thus, the interior-point methods only practically solve problems exhibiting fewer than a few thousand variables. Because the number of variables is D(D+1)/2, this implies a practical limit of around a few hundred dimensions on the size of problem that can be solved. The complexity of the popular quadratic Mahalanobis metric learning approach thus limits the size of problem to which metric learning can be applied. Here, we propose a significantly more efficient and scalable approach to the metric learning problem based on the Lagrange dual formulation of the problem. The proposed formulation is much simpler to implement, and therefore allows much larger Mahalanobis metric learning problems to be solved. The time complexity of the proposed method is roughly O(D^3), which is significantly lower than that of the SDP approach. Experiments on a variety of data sets demonstrate that the proposed method achieves an accuracy comparable with the state of the art, but is applicable to significantly larger problems. We also show that the proposed method can be applied to solve more general Frobenius norm regularized SDP problems approximately.
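As a rough illustration of the quantities mentioned in the abstract above, the sketch below evaluates a Mahalanobis distance for a given matrix M and counts the D(D+1)/2 free entries of a symmetric D×D matrix. It does not implement the authors' dual solver; the function names are hypothetical, and M is assumed to be symmetric positive semidefinite.

```python
import math


def num_metric_variables(dim):
    """Number of free entries in a symmetric dim x dim matrix: D(D+1)/2."""
    return dim * (dim + 1) // 2


def mahalanobis_distance(x, y, m):
    """sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M (list of rows)."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    # Quadratic form diff^T M diff, written out as a double sum.
    quad = sum(diff[i] * m[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)


# With M equal to the identity, the distance reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(mahalanobis_distance([0.0, 0.0], [3.0, 4.0], identity))
print(num_metric_variables(100))  # variables to learn for 100-dimensional data
```

The variable count grows quadratically with D, which is why the abstract ties a limit of a few thousand variables to a few hundred input dimensions.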
Validation metrics for turbulent plasma transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holland, C., E-mail: chholland@ucsd.edu
Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. The utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak [J. L. Luxon, Nucl. Fusion 42, 614 (2002)], as part of a multi-year transport model validation activity.
Lahiri, Uttama; Bekele, Esubalew; Dohrmann, Elizabeth; Warren, Zachary; Sarkar, Nilanjan
2015-04-01
Clinical applications of advanced technology may hold promise for addressing impairments associated with autism spectrum disorders (ASD). This project evaluated the application of a novel physiologically responsive virtual reality based technological system for conversation skills in a group of adolescents with ASD. The system altered components of conversation based on (1) performance alone or (2) the composite effect of performance and physiological metrics of predicted engagement (e.g., gaze pattern, pupil dilation, blink rate). Participants showed improved performance and looking pattern within the physiologically sensitive system as compared to the performance based system. This suggests that physiologically informed technologies may have the potential of being an effective tool in the hands of interventionists.
Tracking occupational hearing loss across global industries: A comparative analysis of metrics
Rabinowitz, Peter M.; Galusha, Deron; McTague, Michael F.; Slade, Martin D.; Wesdock, James C.; Dixon-Ernst, Christine
2013-01-01
Occupational hearing loss is one of the most prevalent occupational conditions; yet, there is no acknowledged international metric to allow comparisons of risk between different industries and regions. In order to make recommendations for an international standard of occupational hearing loss, members of an international industry group (the International Aluminium Association) submitted details of different hearing loss metrics currently in use by members. We compared the performance of these metrics using an audiometric data set for over 6000 individuals working in 10 locations of one member company. We calculated rates for each metric at each location from 2002 to 2006. For comparison, we calculated the difference of observed–expected (for age) binaural high frequency hearing loss (in dB/year) for each location over the same time period. We performed linear regression to determine the correlation between each metric and the observed–expected rate of hearing loss. The different metrics produced discrepant results, with annual rates ranging from 0.0% for a less-sensitive metric to more than 10% for a highly sensitive metric. At least two metrics, a 10 dB age-corrected threshold shift from baseline and a 15 dB nonage-corrected shift metric, correlated well with the difference of observed–expected high-frequency hearing loss. This study suggests that it is feasible to develop an international standard for tracking occupational hearing loss in industrial working populations. PMID:22387709
Do Your Students Measure Up Metrically?
ERIC Educational Resources Information Center
Taylor, P. Mark; Simms, Ken; Kim, Ok-Kyeong; Reys, Robert E.
2001-01-01
Examines released metric items from the Third International Mathematics and Science Study (TIMSS) and the 3rd and 4th grade results. Recommends refocusing instruction on the metric system to improve student performance in measurement. (KHR)
An Examination of Diameter Density Prediction with k-NN and Airborne Lidar
Strunk, Jacob L.; Gould, Peter J.; Packalen, Petteri; ...
2017-11-16
While lidar-based forest inventory methods have been widely demonstrated, the performance of methods to predict tree diameters with airborne lidar (lidar) is not well understood. One cause for this is that the performance metrics typically used in studies for prediction of diameters can be difficult to interpret, and may not support comparative inferences between sampling designs and study areas. To help with this problem we propose two indices and use them to evaluate a variety of lidar and k nearest neighbor (k-NN) strategies for prediction of tree diameter distributions. The indices are based on the coefficient of determination (R²) and root mean square deviation (RMSD). Both of the indices are highly interpretable, and the RMSD-based index facilitates comparisons with alternative (non-lidar) inventory strategies, and with projects in other regions. K-NN diameter distribution prediction strategies were examined using auxiliary lidar for 190 training plots distributed across the 800 km² Savannah River Site in South Carolina, USA. In conclusion, we evaluate the performance of k-NN with respect to distance metrics, number of neighbors, predictor sets, and response sets. K-NN and lidar explained 80% of variability in diameters, and Mahalanobis distance with k = 3 neighbors performed best according to a number of criteria.
SU-E-J-128: Two-Stage Atlas Selection in Multi-Atlas-Based Image Segmentation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, T; Ruan, D
2015-06-15
Purpose: In the new era of big data, multi-atlas-based image segmentation is challenged by heterogeneous atlas quality and high computation burden from extensive atlas collection, demanding efficient identification of the most relevant atlases. This study aims to develop a two-stage atlas selection scheme to achieve computational economy with performance guarantee. Methods: We develop a low-cost fusion set selection scheme by introducing a preliminary selection to trim full atlas collection into an augmented subset, alleviating the need for extensive full-fledged registrations. More specifically, fusion set selection is performed in two successive steps: preliminary selection and refinement. An augmented subset is first roughly selected from the whole atlas collection with a simple registration scheme and the corresponding preliminary relevance metric; the augmented subset is further refined into the desired fusion set size, using full-fledged registration and the associated relevance metric. The main novelty of this work is the introduction of an inference model to relate the preliminary and refined relevance metrics, based on which the augmented subset size is rigorously derived to ensure the desired atlases survive the preliminary selection with high probability. Results: The performance and complexity of the proposed two-stage atlas selection method were assessed using a collection of 30 prostate MR images. It achieved segmentation accuracy comparable to that of the conventional one-stage method with full-fledged registration, but significantly reduced computation time to 1/3 (from 30.82 to 11.04 min per segmentation). Compared with an alternative one-stage cost-saving approach, the proposed scheme yielded superior performance, with mean and median DSC of (0.83, 0.85) compared to (0.74, 0.78). Conclusion: This work has developed a model-guided two-stage atlas selection scheme to achieve significant cost reduction while guaranteeing high segmentation accuracy.
The benefit in both complexity and performance is expected to be most pronounced with large-scale heterogeneous data.
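The DSC values reported above refer to the Dice similarity coefficient, a standard overlap score for segmentations. A minimal sketch of how it is computed for two binary masks follows; the function name and the flat-list mask representation are assumptions made for illustration, and the convention for two empty masks is a choice, not something the abstract specifies.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks.

    Masks are flat sequences of truthy/falsy values of equal length.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of elements")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)


# Two 4-voxel masks that share one of their two foreground voxels.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # 2*1 / (2+2)
```

A score of 1 means the two masks coincide exactly and 0 means they share no foreground voxel.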
Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial
The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors.
Greenroads : a sustainability performance metric for roadway design and construction.
DOT National Transportation Integrated Search
2009-11-01
Greenroads is a performance metric for quantifying sustainable practices associated with roadway design and construction. Sustainability is defined as having seven key components: ecology, equity, economy, extent, expectations, experience and exposur...
Performance metrics used by freight transport providers.
DOT National Transportation Integrated Search
2008-09-30
The newly-established National Cooperative Freight Research Program (NCFRP) has allocated $300,000 in funding to a project entitled Performance Metrics for Freight Transportation (NCFRP 03). The project is scheduled for completion in September ...
Doğanay Erdoğan, Beyza; Elhan, Atilla Halİl; Kaskatı, Osman Tolga; Öztuna, Derya; Küçükdeveci, Ayşe Adile; Kutlay, Şehim; Tennant, Alan
2017-10-01
This study aimed to explore the potential of an inclusive and fully integrated measurement system for the Activities component of the International Classification of Functioning, Disability and Health (ICF), incorporating four classical scales, including the Health Assessment Questionnaire (HAQ), and Computerized Adaptive Testing (CAT). Three hundred patients with rheumatoid arthritis (RA) answered relevant questions from four questionnaires. Rasch analysis was performed to create an item bank using this item pool. A further 100 RA patients were recruited for a CAT application. Both real and simulated CATs were applied and the agreement between these CAT-based scores and 'paper-pencil' scores was evaluated with intraclass correlation coefficient (ICC). Anchoring strategies were used to obtain a direct translation from the item bank common metric to the HAQ score. Mean age of 300 patients was 52.3 ± 11.7 years; disease duration was 11.3 ± 8.0 years; 74.7% were women. After testing for the assumptions of Rasch analysis, a 28-item Activities item bank was created. The agreement between CAT-based scores and paper-pencil scores was high (ICC = 0.993). Using those HAQ items in the item bank as anchoring items, another Rasch analysis was performed with HAQ-8 scores as separate items together with anchoring items. Finally a conversion table of the item bank common metric to the HAQ scores was created. A fully integrated and inclusive health assessment system, illustrating the Activities component of the ICF, was built to assess RA patients. Raw score to metric conversions and vice versa were available, giving access to the metric by a simple look-up table. © 2015 Asia Pacific League of Associations for Rheumatology and Wiley Publishing Asia Pty Ltd.
NASA Astrophysics Data System (ADS)
Evans, Garrett Nolan
In this work, I present two projects that both contribute to the aim of discovering how intelligence manifests in the brain. The first project is a method for analyzing recorded neural signals, which takes the form of a convolution-based metric on neural membrane potential recordings. Relying only on integral and algebraic operations, the metric compares the timing and number of spikes within recordings as well as the recordings' subthreshold features: summarizing differences in these with a single "distance" between the recordings. Like van Rossum's (2001) metric for spike trains, the metric is based on a convolution operation that it performs on the input data. The kernel used for the convolution is carefully chosen such that it produces a desirable frequency space response and, unlike van Rossum's kernel, causes the metric to be first order both in differences between nearby spike times and in differences between same-time membrane potential values: an important trait. The second project is a combinatorial syntax method for connectionist semantic network encoding. Combinatorial syntax has been a point on which those who support a symbol-processing view of intelligent processing and those who favor a connectionist view have had difficulty seeing eye-to-eye. Symbol-processing theorists have persuasively argued that combinatorial syntax is necessary for certain intelligent mental operations, such as reasoning by analogy. Connectionists have focused on the versatility and adaptability offered by self-organizing networks of simple processing units. With this project, I show that there is a way to reconcile the two perspectives and to ascribe a combinatorial syntax to a connectionist network. The critical principle is to interpret nodes, or units, in the connectionist network as bound integrations of the interpretations for nodes that they share links with. 
Nodes need not correspond exactly to neurons and may correspond instead to distributed sets, or assemblies, of neurons.
Assessment of six dissimilarity metrics for climate analogues
NASA Astrophysics Data System (ADS)
Grenier, Patrick; Parent, Annie-Claude; Huard, David; Anctil, François; Chaumont, Diane
2013-04-01
Spatial analogue techniques consist in identifying locations whose recent-past climate is similar in some aspects to the future climate anticipated at a reference location. When identifying analogues, one key step is the quantification of the dissimilarity between two climates separated in time and space, which involves the choice of a metric. In this communication, spatial analogues and their usefulness are briefly discussed. Next, six metrics are presented (the standardized Euclidean distance, the Kolmogorov-Smirnov statistic, the nearest-neighbor distance, the Zech-Aslan energy statistic, the Friedman-Rafsky runs statistic and the Kullback-Leibler divergence), along with a set of criteria used for their assessment. The related case study involves the use of numerical simulations performed with the Canadian Regional Climate Model (CRCM-v4.2.3), from which three annual indicators (total precipitation, heating degree-days and cooling degree-days) are calculated over 30-year periods (1971-2000 and 2041-2070). Results indicate that the six metrics identify comparable analogue regions at a relatively large scale, but best analogues may differ substantially. For best analogues, it is also shown that the uncertainty stemming from the metric choice does generally not exceed that stemming from the simulation or model choice. A synthesis of the advantages and drawbacks of each metric is finally presented, in which the Zech-Aslan energy statistic stands out as the most recommended metric for analogue studies, whereas the Friedman-Rafsky runs statistic is the least recommended, based on this case study.
Nicol, Sam; Wiederholt, Ruscena; Diffendorfer, James E.; Mattsson, Brady; Thogmartin, Wayne E.; Semmens, Darius J.; Lopez-Hoffman, Laura; Norris, Ryan
2016-01-01
Mobile species with complex spatial dynamics can be difficult to manage because their population distributions vary across space and time, and because the consequences of managing particular habitats are uncertain when evaluated at the level of the entire population. Metrics to assess the importance of habitats and pathways connecting habitats in a network are necessary to guide a variety of management decisions. Given the many metrics developed for spatially structured models, it can be challenging to select the most appropriate one for a particular decision. To guide the management of spatially structured populations, we define three classes of metrics describing habitat and pathway quality based on their data requirements (graph-based, occupancy-based, and demographic-based metrics) and synopsize the ecological literature relating to these classes. Applying the first steps of a formal decision-making approach (problem framing, objectives, and management actions), we assess the utility of metrics for particular types of management decisions. Our framework can help managers with framing problems, choosing metrics of habitat and pathway quality, and elucidating the data needs for a particular metric. Our goal is to help managers narrow the range of suitable metrics for a management project, and to aid decision-making so that limited resources are put to the best use.
Raison, Nicholas; Ahmed, Kamran; Fossati, Nicola; Buffi, Nicolò; Mottrie, Alexandre; Dasgupta, Prokar; Van Der Poel, Henk
2017-05-01
To develop benchmark scores of competency for use within a competency based virtual reality (VR) robotic training curriculum. This longitudinal, observational study analysed results from nine European Association of Urology hands-on-training courses in VR simulation. In all, 223 participants ranging from novice to expert robotic surgeons completed 1565 exercises. Competency was set at 75% of the mean expert score. Benchmark scores for all general performance metrics generated by the simulator were calculated. Assessment exercises were selected by expert consensus and through learning-curve analysis. Three basic skill and two advanced skill exercises were identified. Benchmark scores based on expert performance offered viable targets for novice and intermediate trainees in robotic surgery. Novice participants met the competency standards for most basic skill exercises; however, advanced exercises were significantly more challenging. Intermediate participants performed better across the seven metrics but still did not achieve the benchmark standard in the more difficult exercises. Benchmark scores derived from expert performances offer relevant and challenging scores for trainees to achieve during VR simulation training. Objective feedback allows both participants and trainers to monitor educational progress and ensures that training remains effective. Furthermore, the well-defined goals set through benchmarking offer clear targets for trainees and enable training to move to a more efficient competency based curriculum. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
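The benchmarking rule stated in the abstract above (competency set at 75% of the mean expert score) can be sketched as follows. The function names and sample scores are hypothetical, and the sketch assumes a metric where a higher score is better; the abstract does not say how metrics with the opposite direction were handled.

```python
def competency_benchmark(expert_scores, fraction=0.75):
    """Benchmark = fraction x mean expert score (0.75 per the abstract)."""
    if not expert_scores:
        raise ValueError("at least one expert score is required")
    return fraction * sum(expert_scores) / len(expert_scores)


def meets_benchmark(trainee_score, expert_scores, fraction=0.75):
    """True if a trainee's score reaches the benchmark (higher-is-better metric)."""
    return trainee_score >= competency_benchmark(expert_scores, fraction)


# Hypothetical expert scores for one exercise.
experts = [80.0, 90.0, 100.0]
print(competency_benchmark(experts))   # 0.75 * 90
print(meets_benchmark(70.0, experts))
print(meets_benchmark(60.0, experts))
```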
Systematic methods for knowledge acquisition and expert system development
NASA Technical Reports Server (NTRS)
Belkin, Brenda L.; Stengel, Robert F.
1991-01-01
Nine cooperating rule-based systems, collectively called AUTOCREW, were designed to automate functions and decisions associated with a combat aircraft's subsystem. The organization of tasks within each system is described; performance metrics were developed to evaluate the workload of each rule base, and to assess the cooperation between the rule-bases. Each AUTOCREW subsystem is composed of several expert systems that perform specific tasks. AUTOCREW's NAVIGATOR was analyzed in detail to understand the difficulties involved in designing the system and to identify tools and methodologies that ease development. The NAVIGATOR determines optimal navigation strategies from a set of available sensors. A Navigation Sensor Management (NSM) expert system was systematically designed from Kalman filter covariance data; four ground-based, a satellite-based, and two on-board INS-aiding sensors were modeled and simulated to aid an INS. The NSM Expert was developed using the Analysis of Variance (ANOVA) and the ID3 algorithm. Navigation strategy selection is based on an RSS position error decision metric, which is computed from the covariance data. Results show that the NSM Expert predicts position error correctly between 45 and 100 percent of the time for a specified navaid configuration and aircraft trajectory. The NSM Expert adapts to new situations, and provides reasonable estimates of hybrid performance. The systematic nature of the ANOVA/ID3 method makes it broadly applicable to expert system design when experimental or simulation data is available.
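The abstract above says the RSS position error decision metric is computed from Kalman filter covariance data but gives no formula. One plausible reading, sketched here as an assumption only, is the root-sum-square of the position variances taken from the covariance diagonal; the function name and sample values are hypothetical.

```python
import math


def rss_position_error(position_variances):
    """Root-sum-square position error from per-axis position variances.

    Assumes the inputs are the position entries of the covariance diagonal,
    so the result is sqrt(var_x + var_y + ...).
    """
    if any(v < 0 for v in position_variances):
        raise ValueError("variances must be non-negative")
    return math.sqrt(sum(position_variances))


# Hypothetical per-axis position variances, in squared distance units.
print(rss_position_error([9.0, 16.0, 0.0]))
```

Under this reading, a navaid configuration with a smaller RSS value would be preferred when selecting a navigation strategy.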
NASA Technical Reports Server (NTRS)
Bole, Brian; Goebel, Kai; Vachtsevanos, George
2012-01-01
This paper introduces a novel Markov process formulation of stochastic fault growth modeling, in order to facilitate the development and analysis of prognostics-based control adaptation. A metric representing the relative deviation between the nominal output of a system and the net output that is actually enacted by an implemented prognostics-based control routine will be used to define the action space of the formulated Markov process. The state space of the Markov process will be defined in terms of an abstracted metric representing the relative health remaining in each of the system's components. The proposed formulation of component fault dynamics will conveniently relate feasible system output performance modifications to predictions of future component health deterioration.
Assessment of read and write stability for 6T SRAM cell based on charge plasma DLTFET
NASA Astrophysics Data System (ADS)
Anju; Yadav, Shivendra; Sharma, Dheeraj
2018-03-01
To overcome process variations due to random dopant fluctuations (RDFs) and complex annealing techniques, a charge plasma based doping-less TFET (CP-DLTFET) device has been proposed for the design of a 6T SRAM cell. The proposed device also benefits from its subthreshold slope, low leakage current, and low power supply. In this paper, to avoid the dependency of the SRAM cell's stability parameters on the supply voltage (Vdd), N-curve metrics are analyzed to determine read and write stability, because the N-curve provides stability analysis in terms of both voltage and current and gives a combined stability analysis with the facility of an inline tester. Further, analyzing the N-curve metrics for different Vdd, cell ratio, and pull-up ratio assists in designing the transistor configuration for better read and write stability. The power metrics of the N-curve give knowledge about read and write stability without using the four N-curve metrics (SINM, SVNM, WTV, and WTI). Finally, in the 6T CP-DLTFET SRAM cell, read and write stability is tested with interface trap charges (ITCs). The 6T CP-DLTFET SRAM cell provides considerable read and write stability with less fabrication complexity.
Visual tuning and metrical perception of realistic point-light dance movements
Su, Yi-Huang
2016-01-01
Humans move to music spontaneously, and this sensorimotor coupling underlies musical rhythm perception. The present research proposed that, based on common action representation, different metrical levels as in auditory rhythms could emerge visually when observing structured dance movements. Participants watched a point-light figure performing basic steps of Swing dance cyclically in different tempi, whereby the trunk bounced vertically at every beat and the limbs moved laterally at every second beat, yielding two possible metrical periodicities. In Experiment 1, participants freely identified a tempo of the movement and tapped along. While some observers only tuned to the bounce and some only to the limbs, the majority tuned to one level or the other depending on the movement tempo, which was also associated with individuals’ preferred tempo. In Experiment 2, participants reproduced the tempo of leg movements by four regular taps, and showed a slower perceived leg tempo with than without the trunk bouncing simultaneously in the stimuli. This mirrors previous findings of an auditory ‘subdivision effect’, suggesting the leg movements were perceived as beat while the bounce as subdivisions. Together these results support visual metrical perception of dance movements, which may employ similar action-based mechanisms to those underpinning auditory rhythm perception. PMID:26947252
NASA Technical Reports Server (NTRS)
Heath, Bruce E.; Khan, M. Javed; Rossi, Marcia; Ali, Syed Firasat
2005-01-01
The rising cost of flight training and the low cost of powerful computers have resulted in increasing use of PC-based flight simulators. This has prompted FAA standards regulating such use and allowing aspects of training on simulators meeting these standards to be substituted for flight time. However, the FAA regulations require an authorized flight instructor as part of the training environment. Thus, while costs associated with flight time have been reduced, the cost associated with the need for a flight instructor still remains. The obvious area of research, therefore, has been to develop intelligent simulators. However, the two main challenges of such attempts have been training strategies and assessment. The research reported in this paper was conducted to evaluate various performance metrics of a straight-in landing approach by 33 novice pilots flying a light single engine aircraft simulation. These metrics were compared to assessments of these flights by two flight instructors to establish a correlation between the two techniques in an attempt to determine a composite performance metric for this flight maneuver.
Gish, Ryan
2002-08-01
Strategic triggers and metrics help healthcare providers achieve financial success. Metrics help assess progress toward long-term goals. Triggers signal market changes requiring a change in strategy. All metrics may not move in concert. Organizations need to identify indicators, monitor performance.
The brain of René Descartes (1650): A neuro-anatomical analysis.
Philippe, Charlier; Isabelle, Huynh-Charlier; Philippe, Froesch; Russell, Shorto; Nadia, Benmoussa; Alain, Froment; Dominique, Grimaud-Hervé; Saudamini, Deo; Anaïs, Augias; Lou, Albessard; Antoine, Balzeau
2017-07-15
The skull of René Descartes has been held in the National Museum of Natural History since the 19th century. To date, only anthropological examinations have been carried out, focusing on the cranial capacity and phrenological interpretation of the skull morphology. Using CT-scan based 3D technology, a reconstruction of the endocast was performed, allowing for its first complete description and inter-disciplinary analysis: assessment of metrical and non-metrical features, retrospective diagnosis of anatomical anomalies, and confrontation with neuro-psychological abilities of this well-identified individual. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Gulliver, John; de Hoogh, Kees; Fecht, Daniela; Vienneau, Danielle; Briggs, David
2011-12-01
The development of geographical information system techniques has opened up a wide array of methods for air pollution exposure assessment. The extent to which these provide reliable estimates of air pollution concentrations is nevertheless not clearly established. Nor is it clear which methods or metrics should be preferred in epidemiological studies. This paper compares the performance of ten different methods and metrics in terms of their ability to predict mean annual PM10 concentrations across 52 monitoring sites in London, UK. Metrics analysed include indicators (distance to nearest road, traffic volume on nearest road, heavy duty vehicle (HDV) volume on nearest road, road density within 150 m, traffic volume within 150 m and HDV volume within 150 m) and four modelling approaches: based on the nearest monitoring site, kriging, dispersion modelling and land use regression (LUR). Measures were computed in a GIS, and resulting metrics calibrated and validated against monitoring data using a form of grouped jack-knife analysis. The results show that PM10 concentrations across London show little spatial variation. As a consequence, most methods can predict the average without serious bias. Few of the approaches, however, show good correlations with monitored PM10 concentrations, and most predict no better than a simple classification based on site type. Only land use regression reaches acceptable levels of correlation (R² = 0.47), though this can be improved by also including information on site type. This might therefore be taken as a recommended approach in many studies, though care is needed in developing meaningful land use regression models, and like any method they need to be validated against local data before their application as part of epidemiological studies.
Foul tip impact attenuation of baseball catcher masks using head impact metrics
White, Terrance R.; Cutcliffe, Hattie C.; Shridharani, Jay K.; Wood, Garrett W.; Bass, Cameron R.
2018-01-01
Currently, no scientific consensus exists on the relative safety of catcher mask styles and materials. Due to differences in mass and material properties, the style and material of a catcher mask influences the impact metrics observed during simulated foul ball impacts. The catcher surrogate was a Hybrid III head and neck equipped with a six degree of freedom sensor package to obtain linear accelerations and angular rates. Four mask styles were impacted using an air cannon for six 30 m/s and six 35 m/s impacts to the nasion. To quantify impact severity, the metrics peak linear acceleration, peak angular acceleration, Head Injury Criterion, Head Impact Power, and Gadd Severity Index were used. An Analysis of Covariance and a Tukey’s HSD Test were conducted to compare the least squares mean between masks for each head injury metric. For each injury metric a P-Value less than 0.05 was found indicating a significant difference in mask performance. Tukey’s HSD test found for each metric, the traditional style titanium mask fell in the lowest performance category while the hockey style mask was in the highest performance category. Limitations of this study prevented a direct correlation from mask testing performance to mild traumatic brain injury. PMID:29856814
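Two of the impact metrics named above have simple closed forms, and the sketch below computes them from a sampled resultant head acceleration: the Gadd Severity Index as the time integral of a(t)^2.5, and the Head Injury Criterion as the maximum over time windows of (t2 - t1) times the mean acceleration raised to the power 2.5. Acceleration is assumed to be a non-negative resultant in g and time is in seconds. The 15 ms window limit, the function names and the rectangular test pulse are assumptions for illustration; the abstract does not state which window or processing the authors used.

```python
def gadd_severity_index(accel_g, dt):
    """Trapezoidal integral of a(t)**2.5 (a in g, dt in seconds)."""
    powered = [a ** 2.5 for a in accel_g]
    return sum(0.5 * (powered[i] + powered[i + 1]) * dt
               for i in range(len(powered) - 1))

def head_injury_criterion(accel_g, dt, max_window=0.015):
    """Brute-force HIC over all windows no longer than max_window seconds."""
    n = len(accel_g)
    # Cumulative trapezoidal integral of a(t), so window means are cheap.
    cum = [0.0]
    for i in range(n - 1):
        cum.append(cum[-1] + 0.5 * (accel_g[i] + accel_g[i + 1]) * dt)
    best = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            width = (j - i) * dt
            if width > max_window + 1e-12:
                break
            mean_a = (cum[j] - cum[i]) / width
            best = max(best, width * mean_a ** 2.5)
    return best

# Hypothetical rectangular pulse: 100 g held for 10 ms, sampled at 1 kHz.
pulse = [100.0] * 11
print(gadd_severity_index(pulse, 0.001))    # approximately 1000
print(head_injury_criterion(pulse, 0.001))  # approximately 1000
```

For a constant pulse both metrics reduce to a^2.5 times the duration, which makes this a convenient sanity check before applying the functions to measured traces.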
The importance of metrics for evaluating scientific performance
NASA Astrophysics Data System (ADS)
Miyakawa, Tsuyoshi
Evaluation of scientific performance is a major factor that determines the behavior of both individual researchers and the academic institutes to which they belong. Because the number of researchers heavily outweighs the number of available research posts, and the competitive funding accounts for an ever-increasing proportion of research budget, some objective indicators of research performance have gained recognition for increasing transparency and openness. It is common practice to use metrics and indices to evaluate a researcher's performance or the quality of their grant applications. Such measures include the number of publications, the number of times these papers are cited and, more recently, the h-index, which measures the number of highly-cited papers the researcher has written. However, academic institutions and funding agencies in Japan have been rather slow to adopt such metrics. In this article, I will outline some of the currently available metrics, and discuss why we need to use such objective indicators of research performance more often in Japan. I will also discuss how to promote the use of metrics and what we should keep in mind when using them, as well as their potential impact on the research community in Japan.
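For readers unfamiliar with the h-index mentioned above, a minimal sketch of its standard definition follows: a researcher has index h if h of their papers have at least h citations each. The function name and the citation counts are illustrative only.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

The example also shows one caveat of the metric: a single very highly cited paper raises the index no more than a paper that just clears the threshold.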
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aho, Jacob; Pao, Lucy Y.; Fleming, Paul
2014-11-13
As wind energy becomes a larger portion of the world's energy portfolio there has been an increased interest for wind turbines to control their active power output to provide ancillary services which support grid reliability. One of these ancillary services is the provision of frequency regulation, also referred to as secondary frequency control or automatic generation control (AGC), which is often procured through markets which recently adopted performance-based compensation. A wind turbine with a control system developed to provide active power ancillary services can be used to provide frequency regulation services. Simulations have been performed to determine the AGC tracking performance at various power schedule set-points, participation levels, and wind conditions. The performance metrics used in this study are based on those used by several system operators in the US. Another metric that is analyzed is the damage equivalent loads (DELs) on turbine structural components, though the impacts on the turbine electrical components are not considered. The results of these single-turbine simulations show that high performance scores can be achieved when there are sufficient wind resources available. The capability of a wind turbine to rapidly and accurately follow power commands allows for high performance even when tracking rapidly changing AGC signals. As the turbine de-rates to meet decreased power schedule set-points there is a reduction in the DELs, and the participation in frequency regulation has a negligible impact on these loads.
NASA Astrophysics Data System (ADS)
Kastor, David; Ray, Sourya; Traschen, Jennie
2017-10-01
We study the problem of finding brane-like solutions to Lovelock gravity, adopting a general approach to establish conditions that a lower dimensional base metric must satisfy in order that a solution to a given Lovelock theory can be constructed in one higher dimension. We find that for Lovelock theories with generic values of the coupling constants, the Lovelock tensors (higher curvature generalizations of the Einstein tensor) of the base metric must all be proportional to the metric. Hence, allowed base metrics form a subclass of Einstein metrics. This subclass includes so-called ‘universal metrics’, which have been previously investigated as solutions to quantum-corrected field equations. For specially tuned values of the Lovelock couplings, we find that the Lovelock tensors of the base metric need to satisfy fewer constraints. For example, for Lovelock theories with a unique vacuum there is only a single such constraint, a case previously identified in the literature, and brane solutions can be straightforwardly constructed.
Closed-loop, pilot/vehicle analysis of the approach and landing task
NASA Technical Reports Server (NTRS)
Anderson, M. R.; Schmidt, D. K.
1986-01-01
In the case of approach and landing, it is universally accepted that the pilot uses more than one vehicle response, or output, to close his control loops. Therefore, to model this task, a multi-loop analysis technique is required. The analysis problem has been in obtaining reasonable analytic estimates of the describing functions representing the pilot's loop compensation. Once these pilot describing functions are obtained, appropriate performance and workload metrics must then be developed for the landing task. The optimal control approach provides a powerful technique for obtaining the necessary describing functions, once the appropriate task objective is defined in terms of a quadratic objective function. An approach is presented through the use of a simple, reasonable objective function and model-based metrics to evaluate loop performance and pilot workload. The results of an analysis of the LAHOS (Landing and Approach of Higher Order Systems) study performed by R.E. Smith are also presented.
Nixon, Gavin J; Svenstrup, Helle F; Donald, Carol E; Carder, Caroline; Stephenson, Judith M; Morris-Jones, Stephen; Huggett, Jim F; Foy, Carole A
2014-12-01
Molecular diagnostic measurements are currently underpinned by the polymerase chain reaction (PCR). There are also a number of alternative nucleic acid amplification technologies, which unlike PCR, work at a single temperature. These 'isothermal' methods, reportedly offer potential advantages over PCR such as simplicity, speed and resistance to inhibitors and could also be used for quantitative molecular analysis. However there are currently limited mechanisms to evaluate their quantitative performance, which would assist assay development and study comparisons. This study uses a sexually transmitted infection diagnostic model in combination with an adapted metric termed isothermal doubling time (IDT), akin to PCR efficiency, to compare quantitative PCR and quantitative loop-mediated isothermal amplification (qLAMP) assays, and to quantify the impact of matrix interference. The performance metric described here facilitates the comparison of qLAMP assays that could assist assay development and validation activities.
More quality measures versus measuring what matters: a call for balance and parsimony
Nelson, Eugene C; Pryor, David B; James, Brent; Swensen, Stephen J; Kaplan, Gary S; Weissberg, Jed I; Bisognano, Maureen; Yates, Gary R; Hunt, Gordon C
2012-01-01
External groups requiring measures now include public and private payers, regulators, accreditors and others that certify performance levels for consumers, patients and payers. Although benefits have accrued from the growth in quality measurement, the recent explosion in the number of measures threatens to shift resources from improving quality to cover a plethora of quality-performance metrics that may have a limited impact on the things that patients and payers want and need (ie, better outcomes, better care, and lower per capita costs). Here we propose a policy that quality measurement should be: balanced to meet the need of end users to judge quality and cost performance and the need of providers to continuously improve the quality, outcomes and costs of their services; and parsimonious to measure quality, outcomes and costs with appropriate metrics that are selected based on end-user needs. PMID:22893696
More quality measures versus measuring what matters: a call for balance and parsimony.
Meyer, Gregg S; Nelson, Eugene C; Pryor, David B; James, Brent; Swensen, Stephen J; Kaplan, Gary S; Weissberg, Jed I; Bisognano, Maureen; Yates, Gary R; Hunt, Gordon C
2012-11-01
External groups requiring measures now include public and private payers, regulators, accreditors and others that certify performance levels for consumers, patients and payers. Although benefits have accrued from the growth in quality measurement, the recent explosion in the number of measures threatens to shift resources from improving quality to cover a plethora of quality-performance metrics that may have a limited impact on the things that patients and payers want and need (ie, better outcomes, better care, and lower per capita costs). Here we propose a policy that quality measurement should be: balanced to meet the need of end users to judge quality and cost performance and the need of providers to continuously improve the quality, outcomes and costs of their services; and parsimonious to measure quality, outcomes and costs with appropriate metrics that are selected based on end-user needs.
MO-AB-BRA-05: [18F]NaF PET/CT Imaging Biomarkers in Metastatic Prostate Cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harmon, S; Perk, T; Lin, C
Purpose: Clinical use of [18F]-Sodium Fluoride (NaF) PET/CT in metastatic settings often lacks technology to quantitatively measure full disease dynamics due to high tumor burden. This study assesses radiomics-based extraction of NaF PET/CT measures, including global metrics of overall burden and local metrics of disease heterogeneity, in metastatic prostate cancer for correlation to clinical outcomes. Methods: Fifty-six metastatic Castrate-Resistant Prostate Cancer (mCRPC) patients had NaF PET/CT scans performed at baseline and three cycles into chemotherapy (N=16) or androgen-receptor (AR) inhibitors (N=39). A novel technology, Quantitative Total Bone Imaging (QTBI), was used for analysis. Employing hybrid PET/CT segmentation and articulated skeletal-registration, QTBI allows for response assessment of individual lesions. Various SUV metrics were extracted from each lesion (iSUV). Global metrics were extracted from composite lesion-level statistics for each patient (pSUV). Proportion of detected lesions and those with significant response (%-increase or %-decrease) was calculated for each patient based on test-retest limits for iSUV metrics. Cox proportional hazard regression analyses were conducted between imaging metrics and progression-free survival (PFS). Results: Functional burden (pSUV_total) assessed mid-treatment was the strongest univariate predictor of PFS (HR=2.03; p<0.0001). Various global metrics outperformed baseline clinical markers, including fraction of skeletal burden, mean uptake (pSUV_mean), and heterogeneity of average lesion uptake (pSUV_hetero). Of 43 patients with paired baseline/mid-treatment imaging, 40 showed heterogeneity in lesion-level response, containing populations of lesions with both increasing/decreasing metrics. Proportion of lesions with significantly increasing iSUV_mean was highly predictive of clinical PFS (HR=2.0; p=0.0002). Patients exhibiting higher proportion of lesions with decreasing iSUV_total saw prolonged radiographic PFS (HR=0.51; p=0.02). Conclusion: Technology presented here provides comprehensive disease quantification on NaF PET/CT imaging, showing strong correlation to clinical outcomes. Total functional burden as well as proportions of similarly responding lesions was predictive of PFS. This supports ongoing development of NaF PET/CT based imaging biomarkers in mCRPC. Prostate Cancer Foundation.
Transfer of uncertainty of space-borne high resolution rainfall products at ungauged regions
NASA Astrophysics Data System (ADS)
Tang, Ling
Hydrologically relevant characteristics of high resolution (~0.25 degree, 3 hourly) satellite rainfall uncertainty were derived as a function of season and location using a six year (2002-2007) archive of National Aeronautics and Space Administration (NASA)'s Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) precipitation data. The Next Generation Radar (NEXRAD) Stage IV rainfall data over the continental United States was used as ground validation (GV) data. A geostatistical mapping scheme was developed and tested for transfer (i.e., spatial interpolation) of uncertainty information from GV regions to the vast non-GV regions by leveraging the error characterization work carried out in the earlier step. The open question explored here was, "If 'error' is defined on the basis of independent ground validation (GV) data, how are error metrics estimated for a satellite rainfall data product without the need for much extensive GV data?" After a quantitative analysis of the spatial and temporal structure of the satellite rainfall uncertainty, a proof-of-concept geostatistical mapping scheme (based on the kriging method) was evaluated. The idea was to understand how realistic the idea of 'transfer' is for the GPM era. It was found that it was indeed technically possible to transfer error metrics from a gauged to an ungauged location for certain error metrics and that a regionalized error metric scheme for GPM may be possible. The uncertainty transfer scheme based on a commonly used kriging method (ordinary kriging) was then assessed further at various timescales (climatologic, seasonal, monthly and weekly), and as a function of the density of GV coverage. The results indicated that if a transfer scheme for estimating uncertainty metrics was finer than seasonal scale (ranging from 3-6 hourly to weekly-monthly), the effectiveness for uncertainty transfer worsened significantly. 
Next, a comprehensive assessment of different kriging methods for spatial transfer (interpolation) of error metrics was performed. Three kriging methods for spatial interpolation were compared: ordinary kriging (OK), indicator kriging (IK) and disjunctive kriging (DK). Additional comparison with the simple inverse distance weighting (IDW) method was also performed to quantify the added benefit (if any) of using geostatistical methods. The overall performance ranking of the methods was found to be as follows: OK=DK > IDW > IK. Lastly, various metrics of satellite rainfall uncertainty were identified for two large continental landmasses that share many similar Köppen climate zones, United States and Australia. The dependence of uncertainty as a function of gauge density was then investigated. The investigation revealed that only the first and second ordered moments of error are most amenable to a Köppen-type climate classification in different continental landmasses.
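The inverse distance weighting baseline referred to above can be sketched in a few lines: the estimate at an ungauged point is a weighted mean of the values at gauged points, with weights proportional to 1/d^p. The function name, the exponent p = 2 and the coordinates below are assumptions chosen for illustration, not settings reported in the thesis.

```python
import math

def idw(known, target, power=2.0):
    """Inverse distance weighted estimate at target.

    known is a list of ((x, y), value) pairs at gauged locations;
    target is the (x, y) of the ungauged location.
    """
    num = 0.0
    den = 0.0
    for (x, y), value in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a gauged point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical error-metric values at two gauged grid cells.
gauged = [((0.0, 0.0), 1.0), ((2.0, 0.0), 3.0)]
print(idw(gauged, (1.0, 0.0)))  # 2.0, the midpoint gets equal weights
```

Unlike kriging, this scheme uses no fitted spatial correlation structure, which is why it serves as a simple reference for judging the added benefit of geostatistical methods.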
Conceptual model of comprehensive research metrics for improved human health and environment.
Engel-Cox, Jill A; Van Houten, Bennett; Phelps, Jerry; Rose, Shyanika W
2008-05-01
Federal, state, and private research agencies and organizations have faced increasing administrative and public demand for performance measurement. Historically, performance measurement predominantly consisted of near-term outputs measured through bibliometrics. The recent focus is on accountability for investment based on long-term outcomes. Developing measurable outcome-based metrics for research programs has been particularly challenging, because of difficulty linking research results to spatially and temporally distant outcomes. Our objective in this review is to build a logic model and associated metrics through which to measure the contribution of environmental health research programs to improvements in human health, the environment, and the economy. We used expert input and literature research on research impact assessment. With these sources, we developed a logic model that defines the components and linkages between extramural environmental health research grant programs and the outputs and outcomes related to health and social welfare, environmental quality and sustainability, economics, and quality of life. The logic model focuses on the environmental health research portfolio of the National Institute of Environmental Health Sciences (NIEHS) Division of Extramural Research and Training. The model delineates pathways for contributions by five types of institutional partners in the research process: NIEHS, other government (federal, state, and local) agencies, grantee institutions, business and industry, and community partners. The model is being applied to specific NIEHS research applications and the broader research community. We briefly discuss two examples and discuss the strengths and limits of outcome-based evaluation of research programs.
Multistressor predictive models of invertebrate condition in the Corn Belt, USA
Waite, Ian R.; Van Metre, Peter C.
2017-01-01
Understanding the complex relations between multiple environmental stressors and ecological conditions in streams can help guide resource-management decisions. During 14 weeks in spring/summer 2013, personnel from the US Geological Survey and the US Environmental Protection Agency sampled 98 wadeable streams across the Midwest Corn Belt region of the USA for water and sediment quality, physical and habitat characteristics, and ecological communities. We used these data to develop independent predictive disturbance models for 3 macroinvertebrate metrics and a multimetric index. We developed the models based on boosted regression trees (BRT) for 3 stressor categories, land use/land cover (geographic information system [GIS]), all in-stream stressors combined (nutrients, habitat, and contaminants), and for GIS plus in-stream stressors. The GIS plus in-stream stressor models had the best overall performance with an average cross-validation R² across all models of 0.41. The models were generally consistent in the explanatory variables selected within each stressor group across the 4 invertebrate metrics modeled. Variables related to riparian condition, substrate size or embeddedness, velocity and channel shape, nutrients (primarily NH3), and contaminants (pyrethroid degradates) were important descriptors of the invertebrate metrics. Models based on all measured in-stream stressors performed comparably to models based on GIS landscape variables, suggesting that the in-stream stressor characterization reasonably represents the dominant factors affecting invertebrate communities and that GIS variables are acting as surrogates for in-stream stressors that directly affect in-stream biota.
Product evaluation based in the association between intuition and tasks.
Almeida e Silva, Caio Márcio; Okimoto, Maria Lúcia L R; Albertazzi, Deise; Calixto, Cyntia; Costa, Humberto
2012-01-01
This paper explores the importance of researching intuitiveness in product use. It addresses the influence of intuitiveness on users who have already had a visual experience of the product. Finally, it suggests the use of a table that relates the tasks performed while using a product, the features for intuitive use, and the performance metric "task success".
Hybrid Control for Multi-Agent Systems in Complex Sensing Environments
2012-02-28
controllers, the overall closed-loop system is time-varying but can potentially exhibit better stability and performance... system is time-varying and yet, once 4 feedback-interconnected with a suitable controller, it can potentially yield better stability and performance...resolution Sensing, Control and Switched Systems 13 4 Metric-Based Receding Horizon Control 14 5 Decentralized Control and Finite Wordlength Channels 15
NASA Astrophysics Data System (ADS)
Xu, Jingjiang; Song, Shaozhen; Li, Yuandong; Wang, Ruikang K.
2018-01-01
Optical coherence tomography angiography (OCTA) is increasingly becoming a popular inspection tool for biomedical imaging applications. By exploring the amplitude, phase and complex information available in OCT signals, numerous algorithms have been proposed that contrast functional vessel networks within microcirculatory tissue beds. However, it is not clear which algorithm delivers optimal imaging performance. Here, we investigate systematically how amplitude and phase information have an impact on the OCTA imaging performance, to establish the relationship of amplitude and phase stability with OCT signal-to-noise ratio (SNR), time interval and particle dynamics. With either repeated A-scan or repeated B-scan imaging protocols, the amplitude noise increases with the increase of OCT SNR; however, the phase noise does the opposite, i.e. it increases with the decrease of OCT SNR. Coupled with experimental measurements, we utilize a simple Monte Carlo (MC) model to simulate the performance of amplitude-, phase- and complex-based algorithms for OCTA imaging, the results of which suggest that complex-based algorithms deliver the best performance when the phase noise is < ~40 mrad. We also conduct a series of in vivo vascular imaging in animal models and human retina to verify the findings from the MC model through assessing the OCTA performance metrics of vessel connectivity, image SNR and contrast-to-noise ratio. We show that for all the metrics assessed, the complex-based algorithm delivers better performance than either the amplitude- or phase-based algorithms for both the repeated A-scan and the B-scan imaging protocols, which agrees well with the conclusion drawn from the MC simulations.
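To make the distinction between amplitude-, phase- and complex-based contrast concrete, the sketch below compares two repeated complex OCT samples from the same location in three generic ways. These are simplified textbook-style differences with hypothetical function names; they are not the specific algorithms evaluated in the paper, and the sample values are invented.

```python
import cmath

def amplitude_contrast(c1, c2):
    """Change in magnitude between repeats; ignores phase."""
    return abs(abs(c2) - abs(c1))

def phase_contrast(c1, c2):
    """Absolute phase change between repeats, in radians."""
    return abs(cmath.phase(c2 * c1.conjugate()))

def complex_contrast(c1, c2):
    """Magnitude of the complex difference; uses amplitude and phase."""
    return abs(c2 - c1)

static = (1 + 0j, 1 + 0j)   # identical repeats, e.g. static tissue
moving = (1 + 0j, 0 + 1j)   # same amplitude, phase rotated by 90 degrees

for name, pair in (("static", static), ("moving", moving)):
    print(name,
          amplitude_contrast(*pair),
          phase_contrast(*pair),
          complex_contrast(*pair))
```

In the second pair the amplitude difference is zero although the signal has changed, while the phase and complex differences both register the change; the complex form responds to changes in either quantity, which is consistent with its sensitivity to phase noise noted in the abstract.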
Engineering performance metrics
NASA Astrophysics Data System (ADS)
Delozier, R.; Snyder, N.
1993-03-01
Implementation of a Total Quality Management (TQM) approach to engineering work required the development of a system of metrics which would serve as a meaningful management tool for evaluating effectiveness in accomplishing project objectives and in achieving improved customer satisfaction. A team effort was chartered with the goal of developing a system of engineering performance metrics which would measure customer satisfaction, quality, cost effectiveness, and timeliness. The approach to developing this system involved normal systems design phases including conceptual design, detailed design, implementation, and integration. The lessons learned from this effort will be explored in this paper. These lessons learned may provide a starting point for other large engineering organizations seeking to institute a performance measurement system for accomplishing project objectives and achieving improved customer satisfaction. To facilitate this effort, a team was chartered to assist in the development of the metrics system. This team, consisting of customers and Engineering staff members, was utilized to ensure that the needs and views of the customers were considered in the development of performance measurements. The development of a system of metrics is no different than the development of any type of system. It includes the steps of defining performance measurement requirements, measurement process conceptual design, performance measurement and reporting system detailed design, and system implementation and integration.
de los Reyes-Guzmán, Ana; Dimbwadyo-Terrer, Iris; Trincado-Alonso, Fernando; Monasterio-Huelin, Félix; Torricelli, Diego; Gil-Agudo, Angel
2014-08-01
Quantitative measures of human movement quality are important for discriminating healthy and pathological conditions and for expressing the outcomes and clinically important changes in subjects' functional state. However, the most frequently used instruments for upper extremity functional assessment are clinical scales that have previously been standardized and validated but have a high subjective component depending on the observer who scores the test. These scales are not sufficient to assess the motor strategies used during movements, and their use in combination with other, more objective measures is necessary. The objective of the present review is to provide an overview of objective metrics found in the literature with the aim of quantifying upper extremity performance during functional tasks, regardless of the equipment or system used for registering kinematic data. A search in the Medline, Google Scholar and IEEE Xplore databases was performed using a combination of keywords. The full scientific papers that fulfilled the inclusion criteria were included in the review. A set of kinematic metrics was found in the literature in relation to joint displacements, analysis of hand trajectories and velocity profiles. These metrics were classified into different categories according to the movement characteristic that was being measured. These kinematic metrics provide the starting point for a proposed set of objective metrics for the functional assessment of the upper extremity in people with movement disorders as a consequence of neurological injuries. Potential areas of future and further research are presented in the Discussion section. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Wagle, Pradeep; Bhattarai, Nishan; Gowda, Prasanna H.; Kakani, Vijaya G.
2017-06-01
Robust evapotranspiration (ET) models are required to predict water usage in a variety of terrestrial ecosystems under different geographical and agrometeorological conditions. As a result, several remote sensing-based surface energy balance (SEB) models have been developed to estimate ET over large regions. However, comparison of the performance of several SEB models at the same site is limited. In addition, none of the SEB models have been evaluated for their ability to predict ET in rain-fed high biomass sorghum grown for biofuel production. In this paper, we evaluated the performance of five widely used single-source SEB models, namely Surface Energy Balance Algorithm for Land (SEBAL), Mapping ET with Internalized Calibration (METRIC), Surface Energy Balance System (SEBS), Simplified Surface Energy Balance Index (S-SEBI), and operational Simplified Surface Energy Balance (SSEBop), for estimating ET over a high biomass sorghum field during the 2012 and 2013 growing seasons. The predicted ET values were compared against eddy covariance (EC) measured ET (ETEC) for 19 cloud-free Landsat images. In general, S-SEBI, SEBAL, and SEBS performed reasonably well for the study period, while METRIC and SSEBop performed poorly. All SEB models substantially overestimated ET under extremely dry conditions as they underestimated sensible heat (H) and overestimated latent heat (LE) fluxes under dry conditions during the partitioning of available energy. METRIC, SEBAL, and SEBS overestimated LE regardless of wet or dry periods. Consequently, predicted seasonal cumulative ET by METRIC, SEBAL, and SEBS were higher than seasonal cumulative ETEC in both seasons. In contrast, S-SEBI and SSEBop substantially underestimated ET under excessively wet conditions, and predicted seasonal cumulative ET by S-SEBI and SSEBop were lower than seasonal cumulative ETEC in the relatively wetter 2013 growing season. Our results indicate the necessity of including a soil moisture or plant water stress component in SEB models to improve their performance, especially under excessively dry or wet environments.
Evaluation metrics for biostatistical and epidemiological collaborations.
Rubio, Doris McGartland; Del Junco, Deborah J; Bhore, Rafia; Lindsell, Christopher J; Oster, Robert A; Wittkowski, Knut M; Welty, Leah J; Li, Yi-Ju; Demets, Dave
2011-10-15
Increasing demands for evidence-based medicine and for the translation of biomedical research into individual and public health benefit have been accompanied by the proliferation of special units that offer expertise in biostatistics, epidemiology, and research design (BERD) within academic health centers. Objective metrics that can be used to evaluate, track, and improve the performance of these BERD units are critical to their successful establishment and sustainable future. To develop a set of reliable but versatile metrics that can be adapted easily to different environments and evolving needs, we consulted with members of BERD units from the consortium of academic health centers funded by the Clinical and Translational Science Award Program of the National Institutes of Health. Through a systematic process of consensus building and document drafting, we formulated metrics that covered the three identified domains of BERD practices: the development and maintenance of collaborations with clinical and translational science investigators, the application of BERD-related methods to clinical and translational research, and the discovery of novel BERD-related methodologies. In this article, we describe the set of metrics and advocate their use for evaluating BERD practices. The routine application, comparison of findings across diverse BERD units, and ongoing refinement of the metrics will identify trends, facilitate meaningful changes, and ultimately enhance the contribution of BERD activities to biomedical research. Copyright © 2011 John Wiley & Sons, Ltd.
Concussions are associated with decreased batting performance among Major League Baseball players.
Wasserman, Erin B; Abar, Beau; Shah, Manish N; Wasserman, Daniel; Bazarian, Jeffrey J
2015-05-01
Concussions impair balance, visual acuity, and reaction time--all of which are required for high-level batting performance--but the effects of concussion on batting performance have not been reported. The authors examined this relationship between concussion and batting performance among Major League Baseball (MLB) players. Batting performance among concussed MLB players will be worse upon return to play than batting performance among players missing time for noninjury reasons. Cohort study; Level of evidence, 3. The authors identified MLB players who sustained a concussion between 2007 and 2013 through league disabled-list records and a Baseball Prospectus database. For a comparison group, they identified players who went on paternity or bereavement leave during the same period. Using repeated-measures generalized linear models, the authors compared 7 batting metrics between the 2 groups for the 2 weeks upon return, as well as 4 to 6 weeks after return, controlling for pre-leave batting metrics, number of days missed, and position. The authors identified 66 concussions and 68 episodes of bereavement/paternity leave to include in the analysis. In the 2 weeks after return, batting average (.235 vs .266), on-base percentage (.294 vs .326), slugging percentage (.361 vs .423), and on-base plus slugging (.650 vs .749) were significantly lower among concussed players relative to the bereavement/paternity leave players (time×group interaction, P<.05). In weeks 4 to 6 after leave, these metrics were slightly lower in concussed players but not statistically significantly so. Although concussed players may be asymptomatic upon return to play, the residual effects of concussion on the skills required for batting may still be present. Further work is needed to clarify the mechanism through which batting performance after concussion is adversely affected and to identify better measures to use for return-to-play decisions. © 2015 The Author(s).
Duncan, James R; Kline, Benjamin; Glaiberman, Craig B
2007-04-01
To create and test methods of extracting efficiency data from recordings of simulated renal stent procedures. Task analysis was performed and used to design a standardized testing protocol. Five experienced angiographers then performed 16 renal stent simulations using the Simbionix AngioMentor angiographic simulator. Audio and video recordings of these simulations were captured from multiple vantage points. The recordings were synchronized and compiled. A series of efficiency metrics (procedure time, contrast volume, and tool use) were then extracted from the recordings. The intraobserver and interobserver variability of these individual metrics was also assessed. The metrics were converted to costs and aggregated to determine the fixed and variable costs of a procedure segment or the entire procedure. Task analysis and pilot testing led to a standardized testing protocol suitable for performance assessment. Task analysis also identified seven checkpoints that divided the renal stent simulations into six segments. Efficiency metrics for these different segments were extracted from the recordings and showed excellent intra- and interobserver correlations. Analysis of the individual and aggregated efficiency metrics demonstrated large differences between segments as well as between different angiographers. These differences persisted when efficiency was expressed as either total or variable costs. Task analysis facilitated both protocol development and data analysis. Efficiency metrics were readily extracted from recordings of simulated procedures. Aggregating the metrics and dividing the procedure into segments revealed potential insights that could be easily overlooked because the simulator currently does not attempt to aggregate the metrics and only provides data derived from the entire procedure. The data indicate that analysis of simulated angiographic procedures will be a powerful method of assessing performance in interventional radiology.
NASA Astrophysics Data System (ADS)
Koch, Julian; Cüneyd Demirel, Mehmet; Stisen, Simon
2018-05-01
The process of model evaluation is not only an integral part of model development and calibration but also of paramount importance when communicating modelling results to the scientific community and stakeholders. The modelling community has a large and well-tested toolbox of metrics to evaluate temporal model performance. In contrast, spatial performance evaluation has not kept pace with the wide availability of spatial observations or with the sophisticated model codes that simulate the spatial variability of complex hydrological processes. This study makes a contribution towards advancing spatial-pattern-oriented model calibration by rigorously testing a multiple-component performance metric. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation and histogram overlap. This multiple-component approach is found to be advantageous for the complex task of comparing spatial patterns. SPAEF, its three components individually and two alternative spatial performance metrics, i.e. connectivity analysis and fractions skill score, are applied in a spatial-pattern-oriented model calibration of a catchment model in Denmark. Results suggest the importance of multiple-component metrics because stand-alone metrics tend to fail to provide holistic pattern information. The three SPAEF components are found to be independent, which allows them to complement each other in a meaningful way. In order to optimally exploit spatial observations made available by remote sensing platforms, this study suggests applying bias-insensitive metrics, which further allow for a comparison of variables that are related but may differ in unit. This study applies SPAEF in the hydrological context using the mesoscale Hydrologic Model (mHM; version 5.8), but we see great potential across disciplines related to spatially distributed earth system modelling.
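A minimal sketch of how a three-component score of this kind can be computed, assuming the commonly cited formulation SPAEF = 1 - sqrt((alpha - 1)^2 + (beta - 1)^2 + (gamma - 1)^2), where alpha is the Pearson correlation, beta is the ratio of the coefficients of variation, and gamma is the histogram overlap of the z-scored fields. The abstract names only the three components, so the exact formula, the function names, and the bin count are assumptions made for illustration and are not the authors' code.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def zscores(x):
    """Standardize values so that patterns can be compared independent of unit."""
    m, s = mean(x), pstdev(x)
    return [(v - m) / s for v in x]


def histogram(values, lo, hi, bins):
    """Count values in equally wide bins spanning [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def spaef(obs, sim, bins=10):
    """Assumed SPAEF formulation; obs and sim are flattened spatial fields."""
    alpha = pearson(obs, sim)                                   # correlation
    beta = (pstdev(sim) / mean(sim)) / (pstdev(obs) / mean(obs))  # CV ratio
    zo, zs = zscores(obs), zscores(sim)
    lo, hi = min(zo + zs), max(zo + zs)
    ho, hs = histogram(zo, lo, hi, bins), histogram(zs, lo, hi, bins)
    gamma = sum(min(a, b) for a, b in zip(ho, hs)) / sum(ho)    # histogram overlap
    return 1 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```

A perfect match scores 1. In this sketch, a field that reverses the observed pattern while keeping the same value distribution scores about -1, because only the correlation term departs from its ideal value.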
DOE Office of Scientific and Technical Information (OSTI.GOV)
Götstedt, Julia; Karlsson Hauer, Anna; Bäck, Anna, E-mail: anna.back@vgregion.se
Purpose: Complexity metrics have been suggested as a complement to measurement-based quality assurance for intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT). However, these metrics have not yet been sufficiently validated. This study develops and evaluates new aperture-based complexity metrics in the context of static multileaf collimator (MLC) openings and compares them to previously published metrics. Methods: This study develops the converted aperture metric and the edge area metric. The converted aperture metric is based on small and irregular parts within the MLC opening that are quantified as measured distances between MLC leaves. The edge area metric is based on the relative size of the region around the edges defined by the MLC. Another metric suggested in this study is the circumference/area ratio. Earlier defined aperture-based complexity metrics (the modulation complexity score, the edge metric, the ratio monitor units (MU)/Gy, the aperture area, and the aperture irregularity) are compared to the newly proposed metrics. A set of small and irregular static MLC openings are created which simulate individual IMRT/VMAT control points of various complexities. These are measured with both an amorphous silicon electronic portal imaging device and EBT3 film. The differences between calculated and measured dose distributions are evaluated using a pixel-by-pixel comparison with two global dose difference criteria of 3% and 5%. The extent of the dose differences, expressed in terms of pass rate, is used as a measure of the complexity of the MLC openings and used for the evaluation of the metrics compared in this study. The different complexity scores are calculated for each created static MLC opening. The correlation between the calculated complexity scores and the extent of the dose differences (pass rate) are analyzed in scatter plots and using Pearson’s r-values. Results: The complexity scores calculated by the edge area metric, converted aperture metric, circumference/area ratio, edge metric, and MU/Gy ratio show good linear correlation to the complexity of the MLC openings, expressed as the 5% dose difference pass rate, with Pearson’s r-values of −0.94, −0.88, −0.84, −0.89, and −0.82, respectively. The overall trends for the 3% and 5% dose difference evaluations are similar. Conclusions: New complexity metrics are developed. The calculated scores correlate to the complexity of the created static MLC openings. The complexity of the MLC opening is dependent on the penumbra region relative to the area of the opening. The aperture-based complexity metrics that combined either the distances between the MLC leaves or the MLC opening circumference with the aperture area show the best correlation with the complexity of the static MLC openings.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kelbe, David; Oak Ridge National Lab.; van Aardt, Jan
Terrestrial laser scanning has demonstrated increasing potential for rapid comprehensive measurement of forest structure, especially when multiple scans are spatially registered in order to reduce the limitations of occlusion. Although marker-based registration techniques (based on retro-reflective spherical targets) are commonly used in practice, a blind marker-free approach is preferable, insofar as it supports rapid operational data acquisition. To support these efforts, we extend the pairwise registration approach of our earlier work, and develop a graph-theoretical framework to perform blind marker-free global registration of multiple point cloud data sets. Pairwise pose estimates are weighted based on their estimated error, in order to overcome pose conflict while exploiting redundant information and improving precision. The proposed approach was tested for eight diverse New England forest sites, with 25 scans collected at each site. Quantitative assessment was provided via a novel embedded confidence metric, with a mean estimated root-mean-square error of 7.2 cm and 89% of scans connected to the reference node. Lastly, this paper assesses the validity of the embedded multiview registration confidence metric and evaluates the performance of the proposed registration algorithm.
A performance study of the time-varying cache behavior: a study on APEX, Mantevo, NAS, and PARSEC
Siddique, Nafiul A.; Grubel, Patricia A.; Badawy, Abdel-Hameed A.; ...
2017-09-20
Cache has long been used to minimize the latency of main memory accesses by storing frequently used data near the processor. Processor performance depends on the underlying cache performance. Therefore, significant research has been done to identify the most crucial metrics of cache performance. Although the majority of research focuses on measuring cache hit rates and data movement as the primary cache performance metrics, cache utilization is also important. We investigate the application’s locality using cache utilization metrics. In addition, we present cache utilization and traditional cache performance metrics as the program progresses, providing detailed insights into the dynamic application behavior of parallel applications from four benchmark suites running on multiple cores. We explore cache utilization for APEX, Mantevo, NAS, and PARSEC, mostly scientific benchmark suites. Our results indicate that 40% of the data bytes in a cache line are accessed at least once before line eviction. Also, on average a byte is accessed two times before the cache line is evicted for these applications. Moreover, we present runtime cache utilization as well as conventional performance metrics that illustrate a holistic understanding of cache behavior. To facilitate this research, we build a memory simulator incorporated into the Structural Simulation Toolkit (Rodrigues et al. in SIGMETRICS Perform Eval Rev 38(4):37–42, 2011). Finally, our results suggest that variable cache line size can result in better performance and can also conserve power.
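To make the notion of cache line utilization concrete, the sketch below tracks which bytes of each line are touched before the line is evicted from a toy direct-mapped cache. It is an illustrative assumption only: the function name, the direct-mapped organization, and the single-byte accesses are hypothetical and do not reproduce the memory simulator described in the abstract.

```python
def line_utilization(addresses, line_size=64, num_lines=4):
    """Return, for each evicted line, the fraction of its bytes that were touched.

    Toy direct-mapped cache: every address is treated as a one-byte access and
    a line is evicted when a different line maps to the same cache slot.
    """
    cache = {}     # slot index -> (line number, set of touched byte offsets)
    evicted = []   # utilization of each line at the moment it is evicted
    for addr in addresses:
        line_no = addr // line_size
        index = line_no % num_lines
        offset = addr % line_size
        entry = cache.get(index)
        if entry is None or entry[0] != line_no:
            if entry is not None:
                evicted.append(len(entry[1]) / line_size)
            cache[index] = (line_no, {offset})   # miss: load the new line
        else:
            entry[1].add(offset)                 # hit: record the touched byte
    return evicted
```

Averaging the returned fractions over a trace gives a figure comparable in spirit to the share of bytes accessed before eviction that the abstract reports, though the numbers from such a toy model are not comparable to the study's results.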
ERIC Educational Resources Information Center
Ramanarayanan, Vikram; Lange, Patrick; Evanini, Keelan; Molloy, Hillary; Tsuprun, Eugene; Qian, Yao; Suendermann-Oeft, David
2017-01-01
Predicting and analyzing multimodal dialog user experience (UX) metrics, such as overall call experience, caller engagement, and latency, among other metrics, in an ongoing manner is important for evaluating such systems. We investigate automated prediction of multiple such metrics collected from crowdsourced interactions with an open-source,…
JPDO Portfolio Analysis of NextGen
2009-09-01
runways. C. Metrics The JPDO Interagency Portfolio & Systems Analysis ( IPSA ) division continues to coordinate, develop, and refine the metrics and...targets associated with the NextGen initiatives with the partner agencies & stakeholder communities. IPSA has formulated a set of top-level metrics as...metrics are calculated from system performance measures that constitute outputs of the American Institute of Aeronautics and Astronautics 8 IPSA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ronald Boring; Roger Lew; Thomas Ulrich
2014-03-01
As control rooms are modernized with new digital systems at nuclear power plants, it is necessary to evaluate the operator performance using these systems as part of a verification and validation process. There are no standard, predefined metrics available for assessing what is satisfactory operator interaction with new systems, especially during the early design stages of a new system. This report identifies the process and metrics for evaluating human system interfaces as part of control room modernization. The report includes background information on design and evaluation, a thorough discussion of human performance measures, and a practical example of how the process and metrics have been used as part of a turbine control system upgrade during the formative stages of design. The process and metrics are geared toward generalizability to other applications and serve as a template for utilities undertaking their own control room modernization activities.
EOID System Model Validation, Metrics, and Synthetic Clutter Generation
2003-09-30
Our long-term goal is to accurately predict the capability of the current generation of laser-based underwater imaging sensors to perform Electro ... Optic Identification (EOID) against relevant targets in a variety of realistic environmental conditions. The models will predict the impact of
Efficient and Flexible Climate Analysis with Python in a Cloud-Based Distributed Computing Framework
NASA Astrophysics Data System (ADS)
Gannon, C.
2017-12-01
As climate models become progressively more advanced, and spatial resolution further improved through various downscaling projects, climate projections at a local level are increasingly insightful and valuable. However, the raw size of climate datasets presents numerous hurdles for analysts wishing to develop customized climate risk metrics or perform site-specific statistical analysis. Four Twenty Seven, a climate risk consultancy, has implemented a Python-based distributed framework to analyze large climate datasets in the cloud. With the freedom afforded by efficiently processing these datasets, we are able to customize and continually develop new climate risk metrics using the most up-to-date data. Here we outline our process for using Python packages such as XArray and Dask to evaluate netCDF files in a distributed framework, StarCluster to operate in a cluster-computing environment, cloud computing services to access publicly hosted datasets, and how this setup is particularly valuable for generating climate change indicators and performing localized statistical analysis.
Dynamics of subway networks based on vehicles operation timetable
NASA Astrophysics Data System (ADS)
Xiao, Xue-mei; Jia, Li-min; Wang, Yan-hui
2017-05-01
In this paper, a subway network is represented as a dynamic, directed and weighted graph, in which vertices represent subway stations and the weight of an edge represents the number of vehicles passing through that edge according to the vehicle operation timetable. Definitions of static and dynamic metrics that represent the local and global attributes of vertices and edges are also proposed. Based on the model and metrics, the standard deviation is further introduced to study the dynamic properties (heterogeneity and vulnerability) of subway networks. Through a detailed analysis of the Beijing subway network, we conclude that, with the existing network structure, the heterogeneity and vulnerability of the Beijing subway network vary over time when the vehicle operation timetable is taken into consideration, and that the distribution of edge weights affects the performance of the network. In other words, although the vehicle operation timetable is constrained by the physical structure of the network, it determines the performance and properties of the Beijing subway network.
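A rough sketch of the graph construction described above, assuming a timetable is available as a list of trips, each an ordered list of station names. The function names, the toy data, and the use of the standard deviation of edge weights as a heterogeneity indicator are assumptions for illustration; the paper's own metric definitions are not given in the abstract.

```python
from collections import Counter
from statistics import pstdev


def edge_weights(trips):
    """Count how many vehicles traverse each directed edge (station pair)."""
    weights = Counter()
    for stops in trips:
        for origin, destination in zip(stops, stops[1:]):
            weights[(origin, destination)] += 1
    return weights


def weight_heterogeneity(weights):
    """Population standard deviation of edge weights (assumed indicator)."""
    return pstdev(list(weights.values()))


# Hypothetical timetable for one time window: three vehicles, three stations.
trips = [["A", "B", "C"], ["A", "B"], ["C", "B"]]
weights = edge_weights(trips)
```

Restricting `trips` to the vehicles running in a given time window and recomputing the weights is one way to observe how such a statistic varies over the day.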
Evaluation of an Integrated Framework for Biodiversity with a New Metric for Functional Dispersion
Presley, Steven J.; Scheiner, Samuel M.; Willig, Michael R.
2014-01-01
Growing interest in understanding ecological patterns from phylogenetic and functional perspectives has driven the development of metrics that capture variation in evolutionary histories or ecological functions of species. Recently, an integrated framework based on Hill numbers was developed that measures three dimensions of biodiversity based on abundance, phylogeny and function of species. This framework is highly flexible, allowing comparison of those diversity dimensions, including different aspects of a single dimension and their integration into a single measure. The behavior of those metrics with regard to variation in data structure has not been explored in detail, yet is critical for ensuring an appropriate match between the concept and its measurement. We evaluated how each metric responds to particular data structures and developed a new metric for functional biodiversity. The phylogenetic metric is sensitive to variation in the topology of phylogenetic trees, including variation in the relative lengths of basal, internal and terminal branches. In contrast, the functional metric exhibited multiple shortcomings: (1) species that are functionally redundant contribute nothing to functional diversity and (2) a single highly distinct species causes functional diversity to approach the minimum possible value. We introduced an alternative, improved metric based on functional dispersion that solves both of these problems. In addition, the new metric exhibited more desirable behavior when based on multiple traits. PMID:25148103
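For orientation, the abundance-based Hill number of order q is (sum of p_i^q)^(1/(1-q)), with the limit q -> 1 given by the exponential of Shannon entropy. The sketch below covers only this abundance dimension; the phylogenetic and functional versions evaluated in the paper, and its new functional dispersion metric, are not reproduced here, and the function name is a hypothetical choice.

```python
import math


def hill_number(abundances, q):
    """Abundance-based Hill number (effective number of species) of order q."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]   # relative abundances
    if abs(q - 1) < 1e-12:
        # Limit q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))
```

For a perfectly even community of four species the result is 4 at every order q, while a community dominated by one species yields a much smaller effective number at q = 2.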
Subrandom methods for multidimensional nonuniform sampling.
Worley, Bradley
2016-08-01
Methods of nonuniform sampling that utilize pseudorandom number sequences to select points from a weighted Nyquist grid are commonplace in biomolecular NMR studies, due to the beneficial incoherence introduced by pseudorandom sampling. However, these methods require the specification of a non-arbitrary seed number in order to initialize a pseudorandom number generator. Because the performance of pseudorandom sampling schedules can substantially vary based on seed number, this can complicate the task of routine data collection. Approaches such as jittered sampling and stochastic gap sampling are effective at reducing random seed dependence of nonuniform sampling schedules, but still require the specification of a seed number. This work formalizes the use of subrandom number sequences in nonuniform sampling as a means of seed-independent sampling, and compares the performance of three subrandom methods to their pseudorandom counterparts using commonly applied schedule performance metrics. Reconstruction results using experimental datasets are also provided to validate claims made using these performance metrics. Copyright © 2016 Elsevier Inc. All rights reserved.
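As an illustration of seed-independent sampling, the sketch below uses the van der Corput sequence, a classic subrandom (low-discrepancy) sequence, to pick points from a uniform grid. The abstract does not name the three subrandom methods it compares, so this choice, the function names, and the omission of grid weighting are all assumptions.

```python
def van_der_corput(n, base=2):
    """n-th term of the van der Corput sequence in the given base, in [0, 1)."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom   # mirror the digits of n about the radix point
    return value


def subrandom_schedule(grid_size, n_points, base=2):
    """Select n_points distinct grid indices without any random seed."""
    if n_points > grid_size:
        raise ValueError("cannot select more points than the grid contains")
    chosen, seen, i = [], set(), 1
    while len(chosen) < n_points:
        idx = int(van_der_corput(i, base) * grid_size)
        if idx not in seen:      # skip grid points that were already selected
            seen.add(idx)
            chosen.append(idx)
        i += 1
    return sorted(chosen)
```

Because the sequence is fully deterministic, repeated calls with the same arguments always return the same schedule, which is the property that removes the seed dependence discussed above.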
Challenge toward the prediction of typhoon behaviour and down pour
NASA Astrophysics Data System (ADS)
Takahashi, K.; Onishi, R.; Baba, Y.; Kida, S.; Matsuda, K.; Goto, K.; Fuchigami, H.
2013-08-01
Mechanisms of interactions among different scale phenomena play important roles for forecasting of weather and climate. Multi-scale Simulator for the Geoenvironment (MSSG), which deals with multi-scale multi-physics phenomena, is a coupled non-hydrostatic atmosphere-ocean model designed to be run efficiently on the Earth Simulator. We present simulation results with the world-highest 1.9km horizontal resolution for the entire globe and regional heavy rain with 1km horizontal resolution and 5m horizontal/vertical resolution for urban area simulation. To gain high performance by exploiting the system capabilities, we propose novel performance evaluation metrics introduced in previous studies that incorporate the effects of the data caching mechanism between CPU and memory. With a useful code optimization guideline based on such metrics, we demonstrate that MSSG can achieve an excellent peak performance ratio of 32.2% on the Earth Simulator with the single-core performance found to be a key to a reduced time-to-solution.
Changing to the Metric System.
ERIC Educational Resources Information Center
Chambers, Donald L.; Dowling, Kenneth W.
This report examines educational aspects of the conversion to the metric system of measurement in the United States. Statements of positions on metrication and basic mathematical skills are given from various groups. Base units, symbols, prefixes, and style of the metric system are outlined. Guidelines for teaching metric concepts are given,…
Shaikh, M S; Moiz, B
2016-04-01
Around two-thirds of important clinical decisions about the management of patients are based on laboratory test results. Clinical laboratories are required to adopt quality control (QC) measures to ensure provision of accurate and precise results. Six sigma is a statistical tool, which provides opportunity to assess performance at the highest level of excellence. The purpose of this study was to assess performance of our hematological parameters on sigma scale in order to identify gaps and hence areas of improvement in patient care. Twelve analytes included in the study were hemoglobin (Hb), hematocrit (Hct), red blood cell count (RBC), mean corpuscular volume (MCV), red cell distribution width (RDW), total leukocyte count (TLC) with percentages of neutrophils (Neutr%) and lymphocytes (Lymph %), platelet count (Plt), mean platelet volume (MPV), prothrombin time (PT), and fibrinogen (Fbg). Internal quality control data and external quality assurance survey results were utilized for the calculation of sigma metrics for each analyte. Acceptable sigma value of ≥3 was obtained for the majority of the analytes included in the analysis. MCV, Plt, and Fbg achieved value of <3 for level 1 (low abnormal) control. PT performed poorly on both level 1 and 2 controls with sigma value of <3. Despite acceptable conventional QC tools, application of sigma metrics can identify analytical deficits and hence prospects for the improvement in clinical laboratories. © 2016 John Wiley & Sons Ltd.
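Sigma metrics in laboratory quality control are commonly computed as (TEa - |bias|) / CV, with the total allowable error (TEa), bias, and coefficient of variation all expressed in percent. The abstract does not state the formula or the TEa sources used, so the sketch below assumes this common form; the function name and the example numbers are hypothetical.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma value from total allowable error, bias and imprecision (all in %)."""
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# Hypothetical analyte: TEa 7%, bias 1%, CV 2% gives a sigma value of 3.
example_sigma = sigma_metric(7.0, 1.0, 2.0)
```

Under the acceptance threshold quoted in the abstract, a value of 3 or more would count as acceptable performance for that control level.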
Universal health coverage in Rwanda: dream or reality.
Nyandekwe, Médard; Nzayirambaho, Manassé; Baptiste Kakoma, Jean
2014-01-01
Universal Health Coverage (UHC) has been a global concern for a long time and even more so nowadays. While a number of publications are almost unanimous that Rwanda is not far from UHC, very few have focused on its financial sustainability and on its extreme external financial dependency. The objectives of this study are: (i) to assess Rwanda UHC based mainly on Community-Based Health Insurance (CBHI) from 2000 to 2012; (ii) to inform policy makers about observed gaps for a better way forward. A retrospective (2000-2012) SWOT analysis was applied to six metrics used as key indicators of UHC achievement under the WHO definition, i.e. (i) health insurance and access to care, (ii) equity, (iii) package of services, (iv) rights-based approach, (v) quality of health care, and (vi) financial-risk protection; a seventh metric, (vii) CBHI self-financing capacity (SFC), was added by the authors. The first metric shows 96.15% overall health insurance coverage and 1.07 visits per capita per year versus the 1 visit recommended by WHO; the second shows 24.8% of indigent people subsidized versus 24.1% living in extreme poverty; the third, fourth, and fifth metrics perform excellently; the sixth shows 10.80% versus ≤40% as the acceptable limit of catastrophic health spending level; and lastly the CBHI SFC, i.e. proper cost recovery, was estimated at 82.55% in 2011/2012. Rwanda UHC achievements are therefore objectively convincing. Rwanda UHC is not a dream but a reality if we consider the convincing results from all seven metrics.
No-Reference Video Quality Assessment Based on Statistical Analysis in 3D-DCT Domain.
Li, Xuelong; Guo, Qun; Lu, Xiaoqiang
2016-05-13
Designing models for universal no-reference video quality assessment (NR-VQA) is an important task in many video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types, which are often unknown in practical applications. A further deficiency is that the spatial and temporal information of videos is rarely considered simultaneously. In this paper, we propose a new NR-VQA metric based on spatiotemporal natural video statistics (NVS) in the 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features is first extracted based on a statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos in different views. These features are then used to predict the perceived video quality via an efficient linear support vector regression (SVR) model. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in the 3D-DCT domain, which has an inherent spatiotemporal encoding advantage over other widely used 2D transformations; 2) we extract a small set of simple but effective statistical features for video visual quality prediction; 3) the proposed method is universal for multiple types of distortions and robust to different databases. The proposed method is tested on four widely used video databases. Extensive experimental results demonstrate that the proposed method is competitive with the state-of-the-art NR-VQA metrics and the top-performing FR-VQA and RR-VQA metrics.
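To make the idea of statistics over 3D-DCT coefficients concrete, the sketch below applies a separable, orthonormal DCT-II along the x, y, and t axes of a small video block and derives one summary number from the coefficients. The abstract does not specify the features used, so `ac_energy_ratio` is a hypothetical stand-in, the naive transform is for readability only, and the SVR regression stage is omitted.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (naive O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * acc)
    return out


def dct_3d(block):
    """Separable 3D DCT-II of block[t][y][x]: transform x, then y, then t."""
    T, H, W = len(block), len(block[0]), len(block[0][0])
    out = [[dct_1d(row) for row in frame] for frame in block]  # along x
    for t in range(T):  # along y
        for x in range(W):
            col = dct_1d([out[t][y][x] for y in range(H)])
            for y in range(H):
                out[t][y][x] = col[y]
    for y in range(H):  # along t
        for x in range(W):
            tube = dct_1d([out[t][y][x] for t in range(T)])
            for t in range(T):
                out[t][y][x] = tube[t]
    return out


def ac_energy_ratio(coeffs):
    """Hypothetical feature: share of energy outside the DC coefficient."""
    total = sum(c * c for frame in coeffs for row in frame for c in row)
    dc = coeffs[0][0][0] ** 2
    return 0.0 if total == 0 else (total - dc) / total


# Two toy 2x2x2 blocks: one constant, one that changes between frames.
flat = [[[5.0, 5.0], [5.0, 5.0]] for _ in range(2)]
flicker = [[[0.0, 0.0], [0.0, 0.0]], [[10.0, 10.0], [10.0, 10.0]]]
flat_ratio = ac_energy_ratio(dct_3d(flat))
flicker_ratio = ac_energy_ratio(dct_3d(flicker))
```

The constant block puts essentially all of its energy in the DC coefficient, while the block that changes between frames pushes energy into a temporal-frequency coefficient, which is the kind of joint spatiotemporal behavior a 2D transform applied frame by frame would not capture.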
Development and implementation of a balanced scorecard in an academic hospitalist group.
Hwa, Michael; Sharpe, Bradley A; Wachter, Robert M
2013-03-01
Academic hospitalist groups (AHGs) are often expected to excel in multiple domains: quality improvement, patient safety, education, research, administration, and clinical care. To be successful, AHGs must develop strategies to balance their energies, resources, and performance. The balanced scorecard (BSC) is a strategic management system that enables organizations to translate their mission and vision into specific objectives and metrics across multiple domains. To date, no hospitalist group has reported on BSC implementation. We set out to develop a BSC as part of a strategic planning initiative. Based on a needs assessment of the University of California, San Francisco, Division of Hospital Medicine, mission and vision statements were developed. We engaged representative faculty to develop strategic objectives and determine performance metrics across 4 BSC perspectives. There were 41 metrics identified, and 16 were chosen for the initial BSC. The BSC allowed us to achieve several goals: 1) present a broad view of performance, 2) create transparency and accountability, 3) communicate goals and engage faculty, and 4) ensure we use data to guide strategic decisions. Several lessons were learned, including the need to build faculty consensus and to establish metrics with reliable, measurable data, and the power of the BSC to drive goals across the division. We successfully developed and implemented a BSC in an AHG as part of a strategic planning initiative. The BSC has been instrumental in allowing us to achieve balanced success in multiple domains. Academic groups should consider employing the BSC, as it allows for a data-driven strategic planning and assessment process. Copyright © 2013 Society of Hospital Medicine.
Improved Mental Acuity Forecasting with an Individualized Quantitative Sleep Model.
Winslow, Brent D; Nguyen, Nam; Venta, Kimberly E
2017-01-01
Sleep impairment significantly alters human brain structure and cognitive function, but available evidence suggests that adults in developed nations are sleeping less. A growing body of research has sought to use sleep to forecast cognitive performance by modeling the relationship between the two, but has generally focused on vigilance rather than other cognitive constructs affected by sleep, such as reaction time, executive function, and working memory. Previous modeling efforts have also utilized subjective, self-reported sleep durations and were restricted to laboratory environments. In the current effort, we addressed these limitations by employing wearable systems and mobile applications to gather objective sleep information, assess multi-construct cognitive performance, and model/predict changes to mental acuity. Thirty participants were recruited for participation in the study, which lasted 1 week. Using the Fitbit Charge HR and a mobile version of the automated neuropsychological assessment metric called CogGauge, we gathered a series of features and utilized the unified model of performance to predict mental acuity based on sleep records. Our results suggest that individuals poorly rate their sleep duration, supporting the need for objective sleep metrics to model circadian changes to mental acuity. Participant compliance in using the wearable throughout the week and responding to the CogGauge assessments was 80%. Specific biases were identified in temporal metrics across mobile devices and operating systems and were excluded from the mental acuity metric development. Individualized prediction of mental acuity consistently outperformed group modeling. This effort indicates the feasibility of creating an individualized, mobile assessment and prediction of mental acuity, compatible with the majority of current mobile devices.
Symmetry-based detection and diagnosis of DCIS in breast MRI
NASA Astrophysics Data System (ADS)
Srikantha, Abhilash; Harz, Markus T.; Newstead, Gillian; Wang, Lei; Platel, Bram; Hegenscheid, Katrin; Mann, Ritse M.; Hahn, Horst K.; Peitgen, Heinz-Otto
2013-02-01
The delineation and diagnosis of non-mass-like lesions, most notably DCIS (ductal carcinoma in situ), is among the most challenging tasks in breast MRI reading. Even for human observers, DCIS is not always easy to differentiate from patterns of active parenchymal enhancement or from benign alterations of breast tissue. In this light, it is no surprise that CADe/CADx approaches often completely fail to classify DCIS. Of the several approaches that have tried to devise such computer aid, none achieves performance similar to mass detection and classification in terms of sensitivity and specificity. In our contribution, we show a novel approach that combines a newly proposed metric of anatomical breast symmetry calculated on subtraction images of dynamic contrast-enhanced (DCE) breast MRI, descriptive kinetic parameters, and lesion candidate morphology to achieve performance comparable to computer-aided methods used for masses. We have based the development of the method on DCE-MRI data of 18 DCIS cases with hand-annotated lesions, complemented by DCE-MRI data of nine normal cases. We propose a novel metric to quantify the symmetry of contralateral breasts and derive a strong indicator for potentially malignant changes from this metric. We also propose a novel metric for the orientation of a finding towards a fixed point (the nipple). Our combined scheme then achieves a sensitivity of 89% with a specificity of 78%, matching CAD results for breast MRI on masses. The processing pipeline is intended to run on a CAD server, hence we designed all processing to be automated and free of per-case parameters. We expect that the detection results of our proposed algorithm, which is aimed at non-mass lesions, will complement other CAD algorithms, or ideally be joined with them in a voting scheme.
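For reference, sensitivity and specificity follow from confusion-matrix counts in the standard way, as in the minimal sketch below. The abstract does not report the underlying counts or say whether they are per case or per lesion, so the numbers here are hypothetical and are not meant to reproduce the reported 89% and 78%.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: detected positives over all actual positives."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True-negative rate: correctly rejected negatives over all actual negatives."""
    return tn / (tn + fp)


# Hypothetical counts, for illustration only.
example_sensitivity = sensitivity(tp=8, fn=2)
example_specificity = specificity(tn=7, fp=3)
```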