Sample records for model performance metrics

  1. Metrics for Performance Evaluation of Patient Exercises during Physical Therapy.

    PubMed

    Vakanski, Aleksandar; Ferguson, Jake M; Lee, Stephen

    2017-06-01

    The article proposes a set of metrics for evaluating patient performance in physical therapy exercises. A taxonomy is employed that classifies the metrics into quantitative and qualitative categories, based on the level of abstraction of the captured motion sequences. The quantitative metrics are further classified into model-less and model-based metrics, according to whether the evaluation uses the raw measurements of patient-performed motions or a mathematical model of the motions. The reviewed metrics include root-mean-square distance, Kullback-Leibler divergence, log-likelihood, heuristic consistency, Fugl-Meyer Assessment, and similar. The metrics are evaluated for a set of five human motions captured with a Kinect sensor. The metrics can potentially be integrated into a system that employs machine learning for modelling and assessment of the consistency of patient performance in a home-based therapy setting. Automated performance evaluation can overcome the inherent subjectivity of human-performed therapy assessment, increase adherence to prescribed therapy plans, and reduce healthcare costs.

  2. Multi-objective optimization for generating a weighted multi-model ensemble

    NASA Astrophysics Data System (ADS)

    Lee, H.

    2017-12-01

    Many studies have demonstrated that multi-model ensembles generally show better skill than each ensemble member. When generating weighted multi-model ensembles, the first step is measuring the performance of individual model simulations using observations. There is a consensus on the assignment of weighting factors based on a single evaluation metric: the weighting factor for each model is proportional to a performance score or inversely proportional to an error for the model. While this conventional approach can provide appropriate combinations of multiple models, it faces a major challenge when multiple metrics are under consideration, because a simple averaging of multiple performance scores or model ranks does not address the trade-off between conflicting metrics. So far, there seems to be no best method to generate weighted multi-model ensembles based on multiple performance metrics. The current study applies multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combine multiple performance metrics for global climate models and their dynamically downscaled regional climate simulations over North America and to generate a weighted multi-model ensemble. NASA satellite data and the Regional Climate Model Evaluation System (RCMES) software toolkit are used for assessment of the climate simulations. Overall, the performance of each model differs markedly with strong seasonal dependence. Because of the considerable variability across the climate simulations, it is important to evaluate models systematically and make future projections by assigning optimized weighting factors to the models with relatively good performance. Our results indicate that the optimally weighted multi-model ensemble always shows better performance than an arithmetic ensemble mean and may provide reliable future projections.
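
    The conventional single-metric rule mentioned in this abstract (weights inversely proportional to each model's error) might be sketched as below. This is only an illustration under stated assumptions, not the study's multi-objective method and not the RCMES API: the choice of RMSE as the error, the function names, and the toy numbers are all hypothetical.

```python
import math

def rmse(sim, obs):
    """Root-mean-square error between one simulation and the observations."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))

def inverse_error_weights(errors):
    """Weights inversely proportional to each error, normalized to sum to 1.

    Assumes every error is strictly positive (a zero error would divide by zero).
    """
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def weighted_ensemble(sims, weights):
    """Weighted mean across models at every time step or grid point."""
    return [sum(w * s[i] for w, s in zip(weights, sims))
            for i in range(len(sims[0]))]

# Toy data (hypothetical): one observed series and three model simulations.
obs = [1.0, 2.0, 3.0, 4.0]
sims = [
    [1.1, 2.1, 3.1, 4.1],  # small positive bias
    [1.5, 2.5, 3.5, 4.5],  # larger positive bias
    [0.8, 1.8, 2.8, 3.8],  # negative bias
]

errors = [rmse(s, obs) for s in sims]
weights = inverse_error_weights(errors)
ensemble = weighted_ensemble(sims, weights)
equal_mean = weighted_ensemble(sims, [1.0 / len(sims)] * len(sims))

print("weights:", [round(w, 3) for w in weights])
print("RMSE weighted:", rmse(ensemble, obs), "RMSE equal:", rmse(equal_mean, obs))
```

    With several conflicting metrics there is no single error to invert, which is the trade-off problem the abstract addresses with multi-objective optimization.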

  3. Improving Climate Projections Using "Intelligent" Ensembles

    NASA Technical Reports Server (NTRS)

    Baker, Noel C.; Taylor, Patrick C.

    2015-01-01

    Recent changes in the climate system have led to growing concern, especially in communities which are highly vulnerable to resource shortages and weather extremes. There is an urgent need for better climate information to develop solutions and strategies for adapting to a changing climate. Climate models provide excellent tools for studying the current state of climate and making future projections. However, these models are subject to biases created by structural uncertainties. Performance metrics, or the systematic determination of model biases, succinctly quantify aspects of climate model behavior. Efforts to standardize climate model experiments and collect simulation data, such as the Coupled Model Intercomparison Project (CMIP), provide the means to directly compare and assess model performance. Performance metrics have been used to show that some models reproduce present-day climate better than others. Simulation data from multiple models are often used to add value to projections by creating a consensus projection from the model ensemble, in which each model is given an equal weight. It has been shown that the ensemble mean generally outperforms any single model. It is possible to use unequal weights to produce ensemble means, in which models are weighted based on performance (called "intelligent" ensembles). Can performance metrics be used to improve climate projections? Previous work introduced a framework for comparing the utility of model performance metrics, showing that the best metrics are related to the variance of top-of-atmosphere outgoing longwave radiation. These metrics improve present-day climate simulations of Earth's energy budget using the "intelligent" ensemble method. The current project identifies several approaches for testing whether performance metrics can be applied to future simulations to create "intelligent" ensemble-mean climate projections. It is shown that certain performance metrics test key climate processes in the models, and that these metrics can be used to evaluate model quality in both current and future climate states. This information will be used to produce new consensus projections and provide communities with improved climate projections for urgent decision-making.

  4. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Science Inventory

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...

  5. Metrics for evaluating performance and uncertainty of Bayesian network models

    Treesearch

    Bruce G. Marcot

    2012-01-01

    This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...

  6. A novel spatial performance metric for robust pattern optimization of distributed hydrological models

    NASA Astrophysics Data System (ADS)

    Stisen, S.; Demirel, C.; Koch, J.

    2017-12-01

    Evaluation of performance is an integral part of model development and calibration, and it is of paramount importance when communicating modelling results to stakeholders and the scientific community. The hydrological modelling community has a comprehensive and well-tested toolbox of metrics to assess temporal model performance. In contrast, experience in evaluating spatial performance has not kept pace with the wide availability of spatial observations or with the sophisticated model codes that simulate the spatial variability of complex hydrological processes. This study aims to advance spatial-pattern-oriented model evaluation for distributed hydrological models. This is achieved by introducing a novel spatial performance metric which provides robust pattern performance during model calibration. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation and histogram overlap. This multi-component approach is necessary in order to adequately compare spatial patterns. SPAEF, its three components individually and two alternative spatial performance metrics, i.e. connectivity analysis and fractions skill score, are tested in a spatial-pattern-oriented model calibration of a catchment model in Denmark. The calibration is constrained by a remote-sensing-based spatial pattern of evapotranspiration and by discharge time series at two stations. Our results stress that stand-alone metrics tend to fail to provide holistic pattern information to the optimizer, which underlines the importance of multi-component metrics. The three SPAEF components are independent, which allows them to complement each other in a meaningful way. This study promotes the use of bias-insensitive metrics, which allow comparing variables that are related but may differ in unit, in order to optimally exploit spatial observations made available by remote sensing platforms. We see great potential for SPAEF across environmental disciplines dealing with spatially distributed modelling.

  7. Performance Metrics, Error Modeling, and Uncertainty Quantification

    NASA Technical Reports Server (NTRS)

    Tian, Yudong; Nearing, Grey S.; Peters-Lidard, Christa D.; Harrison, Kenneth W.; Tang, Ling

    2016-01-01

    A common set of statistical metrics has been used to summarize the performance of models or measurements, the most widely used ones being bias, mean square error, and linear correlation coefficient. They assume linear, additive, Gaussian errors, and they are interdependent, incomplete, and incapable of directly quantifying uncertainty. The authors demonstrate that these metrics can be directly derived from the parameters of the simple linear error model. Since a correct error model captures the full error information, it is argued that the specification of a parametric error model should be an alternative to the metrics-based approach. The error-modeling methodology is applicable to both linear and nonlinear errors, while the metrics are only meaningful for linear errors. In addition, the error model expresses the error structure more naturally, and directly quantifies uncertainty. This argument is further explained by highlighting the intrinsic connections between the performance metrics, the error model, and the joint distribution between the data and the reference.
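
    The claim that bias, mean square error, and correlation follow from the parameters of a linear error model can be illustrated with a short sketch. The abstract does not give the formulas; the closed forms below are standard algebra for an assumed model y = a + b*x + eps, with eps ~ N(0, sigma^2) independent of the reference x. The function names and parameter values are hypothetical.

```python
import math
import random

def metrics_from_error_model(a, b, sigma, mu_x, sigma_x):
    """Bias, MSE and correlation implied by y = a + b*x + eps (assumed model).

    mu_x and sigma_x are the mean and standard deviation of the reference x.
    """
    bias = a + (b - 1.0) * mu_x
    mse = bias ** 2 + (b - 1.0) ** 2 * sigma_x ** 2 + sigma ** 2
    corr = b * sigma_x / math.sqrt(b ** 2 * sigma_x ** 2 + sigma ** 2)
    return bias, mse, corr

def sample_metrics(x, y):
    """The same three metrics computed directly from paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    bias = my - mx
    mse = sum((yi - xi) ** 2 for xi, yi in zip(x, y)) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mx) ** 2 for xi in x) / n
    var_y = sum((yi - my) ** 2 for yi in y) / n
    return bias, mse, cov / math.sqrt(var_x * var_y)

# Hypothetical parameters for the error model and the reference data.
a, b, sigma = 0.5, 0.8, 0.5
mu_x, sigma_x = 3.0, 2.0

# Simulate reference values and measurements that obey the assumed model.
rng = random.Random(0)
x = [rng.gauss(mu_x, sigma_x) for _ in range(20000)]
y = [a + b * xi + rng.gauss(0.0, sigma) for xi in x]

predicted = metrics_from_error_model(a, b, sigma, mu_x, sigma_x)
observed = sample_metrics(x, y)
print("from model parameters:", predicted)
print("from samples:         ", observed)
```

    The three metrics are functions of only (a, b, sigma) and the reference statistics, which is one way to see why they are interdependent.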

  8. Geospace Environment Modeling 2008-2009 Challenge: Ground Magnetic Field Perturbations

    NASA Technical Reports Server (NTRS)

    Pulkkinen, A.; Kuznetsova, M.; Ridley, A.; Raeder, J.; Vapirev, A.; Weimer, D.; Weigel, R. S.; Wiltberger, M.; Millward, G.; Rastatter, L.

    2011-01-01

    Acquiring quantitative metrics-based knowledge about the performance of various space physics modeling approaches is central for the space weather community. Quantification of the performance helps the users of the modeling products to better understand the capabilities of the models and to choose the approach that best suits their specific needs. Further, metrics-based analyses are important for addressing the differences between various modeling approaches and for measuring and guiding the progress in the field. In this paper, the metrics-based results of the ground magnetic field perturbation part of the Geospace Environment Modeling 2008-2009 Challenge are reported. Predictions made by 14 different models, including an ensemble model, are compared to geomagnetic observatory recordings from 12 different northern hemispheric locations. Five different metrics are used to quantify the model performances for four storm events. It is shown that the ranking of the models is strongly dependent on the type of metric used to evaluate the model performance. None of the models rank near or at the top systematically for all used metrics. Consequently, one cannot pick the absolute winner: the choice for the best model depends on the characteristics of the signal one is interested in. Model performances vary also from event to event. This is particularly clear for root-mean-square difference and utility metric-based analyses. Further, analyses indicate that for some of the models, increasing the global magnetohydrodynamic model spatial resolution and the inclusion of the ring current dynamics improve the models' capability to generate more realistic ground magnetic field fluctuations.

  9. Accounting for regional variation in both natural environment and human disturbance to improve performance of multimetric indices of lotic benthic diatoms.

    PubMed

    Tang, Tao; Stevenson, R Jan; Infante, Dana M

    2016-10-15

    Regional variation in both natural environment and human disturbance can influence performance of ecological assessments. In this study we calculated 5 types of benthic diatom multimetric indices (MMIs) with 3 different approaches to account for variation in ecological assessments. We used: site groups defined by ecoregions or diatom typologies; the same or different sets of metrics among site groups; and unmodeled or modeled MMIs, where models accounted for natural variation in metrics within site groups by calculating an expected reference condition for each metric and each site. We used data from the USEPA's National Rivers and Streams Assessment to calculate the MMIs and evaluate changes in MMI performance. MMI performance was evaluated with indices of precision, bias, responsiveness, sensitivity and relevancy which were respectively measured as MMI variation among reference sites, effects of natural variables on MMIs, difference between MMIs at reference and highly disturbed sites, percent of highly disturbed sites properly classified, and relation of MMIs to human disturbance and stressors. All 5 types of MMIs showed considerable discrimination ability. Using different metrics among ecoregions sometimes reduced precision, but it consistently increased responsiveness, sensitivity, and relevancy. Site specific metric modeling reduced bias and increased responsiveness. Combined use of different metrics among site groups and site specific modeling significantly improved MMI performance irrespective of site grouping approach. Compared to ecoregion site classification, grouping sites based on diatom typologies improved precision, but did not improve overall performance of MMIs if we accounted for natural variation in metrics with site specific models. We conclude that using different metrics among ecoregions and site specific metric modeling improve MMI performance, particularly when used together. 
Applications of these MMI approaches in ecological assessments introduced a tradeoff with assessment consistency when metrics differed across site groups, but they justified the convenient and consistent use of ecoregions. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Pesticide Factsheets

    The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors.

  11. Evaluating hydrological model performance using information theory-based metrics

    USDA-ARS's Scientific Manuscript database

    The accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to use the information theory-based metrics to see whether they can be used as a complementary tool for hydrologic m...

  12. The SPAtial EFficiency metric (SPAEF): multiple-component evaluation of spatial patterns for optimization of hydrological models

    NASA Astrophysics Data System (ADS)

    Koch, Julian; Cüneyd Demirel, Mehmet; Stisen, Simon

    2018-05-01

    The process of model evaluation is not only an integral part of model development and calibration but also of paramount importance when communicating modelling results to the scientific community and stakeholders. The modelling community has a large and well-tested toolbox of metrics to evaluate temporal model performance. In contrast, spatial performance evaluation has not kept pace with the wide availability of spatial observations or with the sophisticated model codes that simulate the spatial variability of complex hydrological processes. This study makes a contribution towards advancing spatial-pattern-oriented model calibration by rigorously testing a multiple-component performance metric. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation and histogram overlap. This multiple-component approach is found to be advantageous for the complex task of comparing spatial patterns. SPAEF, its three components individually and two alternative spatial performance metrics, i.e. connectivity analysis and fractions skill score, are applied in a spatial-pattern-oriented model calibration of a catchment model in Denmark. Results suggest the importance of multiple-component metrics because stand-alone metrics tend to fail to provide holistic pattern information. The three SPAEF components are found to be independent, which allows them to complement each other in a meaningful way. In order to optimally exploit spatial observations made available by remote sensing platforms, this study suggests applying bias-insensitive metrics, which further allow for a comparison of variables that are related but may differ in unit. This study applies SPAEF in the hydrological context using the mesoscale Hydrologic Model (mHM; version 5.8), but we see great potential across disciplines related to spatially distributed earth system modelling.
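
    A rough sketch of a three-component metric of this kind is given below. The abstract names the components but not the formula, so everything here is an assumption to be checked against the paper: the components are combined as the Euclidean distance from the ideal point (1, 1, 1), the coefficient-of-variation term is taken as a ratio, and the histogram overlap is computed on z-scored values with a hypothetical default of 10 bins. Function names are illustrative only.

```python
import math
from statistics import mean, pstdev

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def zscores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def histogram(values, lo, hi, bins):
    """Counts per equal-width bin on [lo, hi]; the maximum falls in the last bin."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts

def spaef(obs, sim, bins=10):
    """Assumed form: 1 - sqrt((alpha-1)^2 + (beta-1)^2 + (gamma-1)^2).

    Requires non-constant patterns with non-zero means.
    """
    alpha = pearson(obs, sim)                                    # correlation
    beta = (pstdev(sim) / mean(sim)) / (pstdev(obs) / mean(obs))  # CV ratio
    z_obs, z_sim = zscores(obs), zscores(sim)
    lo, hi = min(z_obs + z_sim), max(z_obs + z_sim)
    h_obs = histogram(z_obs, lo, hi, bins)
    h_sim = histogram(z_sim, lo, hi, bins)
    gamma = sum(min(k, l) for k, l in zip(h_obs, h_sim)) / sum(h_obs)  # overlap
    return 1.0 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

# Toy flattened "maps" (hypothetical values).
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 12.0]
rescaled = [2.0 * v for v in obs]   # same pattern expressed in a different unit
reversed_map = list(reversed(obs))  # same values, opposite spatial pattern

print(spaef(obs, rescaled), spaef(obs, reversed_map))
```

    In this sketch a pure rescaling leaves all three components at 1, which matches the abstract's point about bias-insensitive metrics for variables that differ in unit.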

  13. Performance assessment of geospatial simulation models of land-use change--a landscape metric-based approach.

    PubMed

    Sakieh, Yousef; Salmanmahiny, Abdolrassoul

    2016-03-01

    Performance evaluation is a critical step when developing land-use and cover change (LUCC) models. The present study proposes a spatially explicit model performance evaluation method, adopting a landscape metric-based approach. To quantify GEOMOD model performance, a set of composition- and configuration-based landscape metrics including number of patches, edge density, mean Euclidean nearest neighbor distance, largest patch index, class area, landscape shape index, and splitting index were employed. The model takes advantage of three decision rules including neighborhood effect, persistence of change direction, and urbanization suitability values. According to the results, while class area, largest patch index, and splitting indices demonstrated insignificant differences between spatial pattern of ground truth and simulated layers, there was a considerable inconsistency between simulation results and real dataset in terms of the remaining metrics. Specifically, simulation outputs were simplistic and the model tended to underestimate number of developed patches by producing a more compact landscape. Landscape-metric-based performance evaluation produces more detailed information (compared to conventional indices such as the Kappa index and overall accuracy) on the model's behavior in replicating spatial heterogeneity features of a landscape such as frequency, fragmentation, isolation, and density. Finally, as the main characteristic of the proposed method, landscape metrics employ the maximum potential of observed and simulated layers for a performance evaluation procedure, provide a basis for more robust interpretation of a calibration process, and also deepen modeler insight into the main strengths and pitfalls of a specific land-use change model when simulating a spatiotemporal phenomenon.

  14. Comparing masked target transform volume (MTTV) clutter metric to human observer evaluation of visual clutter

    NASA Astrophysics Data System (ADS)

    Camp, H. A.; Moyer, Steven; Moore, Richard K.

    2010-04-01

    The Night Vision and Electronic Sensors Directorate's (NVESD) current time-limited search (TLS) model, which makes use of the targeting task performance (TTP) metric to describe image quality, does not explicitly account for the effects of visual clutter on observer performance. The TLS model is currently based on empirical fits to describe human performance for a time of day, spectrum and environment. Incorporating a clutter metric into the TLS model may reduce the number of these empirical fits needed. The masked target transform volume (MTTV) clutter metric has been previously presented and compared to other clutter metrics. Using real infrared imagery of rural scenes with varying levels of clutter, NVESD is currently evaluating the appropriateness of the MTTV metric. NVESD had twenty subject matter experts (SMEs) rank the amount of clutter in each scene in a series of pair-wise comparisons. MTTV metric values were calculated and then compared to the SME observers' rankings. The MTTV metric ranked the clutter in a similar manner to the SME evaluation, suggesting that the MTTV metric may emulate SME response. This paper is a first step in quantifying clutter and measuring the agreement to subjective human evaluation.

  15. Research on quality metrics of wireless adaptive video streaming

    NASA Astrophysics Data System (ADS)

    Li, Xuefei

    2018-04-01

    With the development of wireless networks and intelligent terminals, video traffic has increased dramatically. Adaptive video streaming has become one of the most promising video transmission technologies. For this type of service, good QoS (Quality of Service) in the wireless network does not always guarantee that all customers have a good experience. Thus, new quality metrics have been widely studied recently. Taking this into account, the objective of this paper is to investigate the quality metrics of wireless adaptive video streaming. In this paper, a wireless video streaming simulation platform with a DASH mechanism and a multi-rate video generator is established. Based on this platform, a PSNR model, an SSIM model and a Quality Level model are implemented. The Quality Level model considers QoE (Quality of Experience) factors such as image quality, stalling and switching frequency, while the PSNR and SSIM models mainly consider the quality of the video. To evaluate the performance of these QoE models, three performance metrics (SROCC, PLCC and RMSE), which compare subjective and predicted MOS (Mean Opinion Score), are calculated. From these performance metrics, the monotonicity, linearity and accuracy of these quality metrics can be observed.
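
    The three comparison metrics named in this abstract can be sketched in plain Python as follows: SROCC (Spearman rank-order correlation, monotonicity), PLCC (Pearson linear correlation, linearity) and RMSE (accuracy). The MOS values are made up for illustration, and no fitting step between predicted and subjective scores is applied here, which is a simplifying assumption.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def rmse(x, y):
    """Root-mean-square error between paired scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical subjective and model-predicted MOS for five test videos.
subjective_mos = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_mos = [1.2, 1.9, 3.5, 3.9, 4.6]

print("SROCC:", srocc(subjective_mos, predicted_mos))
print("PLCC: ", pearson(subjective_mos, predicted_mos))
print("RMSE: ", rmse(subjective_mos, predicted_mos))
```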

  16. Metrics for Evaluation of Student Models

    ERIC Educational Resources Information Center

    Pelanek, Radek

    2015-01-01

    Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…

  17. Metric-driven harm: an exploration of unintended consequences of performance measurement.

    PubMed

    Rambur, Betty; Vallett, Carol; Cohen, Judith A; Tarule, Jill Mattuck

    2013-11-01

    Performance measurement is an increasingly common element of the US health care system. Although performance metrics typically serve as a proxy for high-quality outcomes, there has been little systematic investigation of their potential negative unintended consequences, including metric-driven harm. This case study details an incident of post-surgical metric-driven harm and offers Smith's 1995 work and a patient-centered, context-sensitive metric model for potential adoption by nurse researchers and clinicians. Implications for further research are discussed. © 2013.

  18. The power metric: a new statistically robust enrichment-type metric for virtual screening applications with early recovery capability.

    PubMed

    Lopes, Julio Cesar Dias; Dos Santos, Fábio Mendes; Martins-José, Andrelly; Augustyns, Koen; De Winter, Hans

    2017-01-01

    A new metric for the evaluation of model performance in the field of virtual screening and quantitative structure-activity relationship applications is described. This metric has been termed the power metric and is defined as the fraction of the true positive rate divided by the sum of the true positive and false positive rates, for a given cutoff threshold. The performance of this metric is compared with alternative metrics such as the enrichment factor, the relative enrichment factor, the receiver operating curve enrichment factor, the correct classification rate, Matthews correlation coefficient and Cohen's kappa coefficient. The performance of this new metric is found to be quite robust with respect to variations in the applied cutoff threshold and ratio of the number of active compounds to the total number of compounds, and at the same time being sensitive to variations in model quality. It possesses the correct characteristics for its application in early-recognition virtual screening problems.
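
    The definition given in this abstract, true positive rate divided by the sum of the true positive and false positive rates at a given cutoff, can be written out directly. The sketch below takes confusion-matrix counts at one cutoff; the function name and the screening numbers are hypothetical.

```python
def power_metric(tp, fp, tn, fn):
    """TPR / (TPR + FPR) for the compounds selected at one cutoff threshold."""
    tpr = tp / (tp + fn)  # fraction of actives recovered
    fpr = fp / (fp + tn)  # fraction of inactives wrongly selected
    if tpr + fpr == 0.0:
        raise ValueError("undefined when nothing is selected at this cutoff")
    return tpr / (tpr + fpr)

# Hypothetical screen: 1000 compounds, 10 actives, top 50 selected, 8 actives found.
early_recovery = power_metric(tp=8, fp=42, tn=948, fn=2)

# A random selection recovers actives and inactives at the same rate.
random_pick = power_metric(tp=5, fp=50, tn=50, fn=5)

print(early_recovery, random_pick)
```

    Under this definition a random ranking gives 0.5 and a selection containing only actives gives 1.0.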

  19. The model for Fundamentals of Endovascular Surgery (FEVS) successfully defines the competent endovascular surgeon.

    PubMed

    Duran, Cassidy; Estrada, Sean; O'Malley, Marcia; Sheahan, Malachi G; Shames, Murray L; Lee, Jason T; Bismuth, Jean

    2015-12-01

    Fundamental skills testing is now required for certification in general surgery. No model for assessing fundamental endovascular skills exists. Our objective was to develop a model that tests the fundamental endovascular skills and differentiates competent from noncompetent performance. The Fundamentals of Endovascular Surgery model was developed in silicon and virtual-reality versions. Twenty individuals (with a range of experience) performed four tasks on each model in three separate sessions. Tasks on the silicon model were performed under fluoroscopic guidance, and electromagnetic tracking captured motion metrics for catheter tip position. Image processing captured tool tip position and motion on the virtual model. Performance was evaluated using a global rating scale, blinded video assessment of error metrics, and catheter tip movement and position. Motion analysis was based on derivations of speed and position that define proficiency of movement (spectral arc length, duration of submovement, and number of submovements). Performance was significantly different between competent and noncompetent interventionalists for the three performance measures of motion metrics, error metrics, and global rating scale. The mean error metric score was 6.83 for noncompetent individuals and 2.51 for the competent group (P < .0001). Median global rating scores were 2.25 for the noncompetent group and 4.75 for the competent users (P < .0001). The Fundamentals of Endovascular Surgery model successfully differentiates competent and noncompetent performance of fundamental endovascular skills based on a series of objective performance measures. This model could serve as a platform for skills testing for all trainees. Copyright © 2015 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.

  20. Research and development on performance models of thermal imaging systems

    NASA Astrophysics Data System (ADS)

    Wang, Ji-hui; Jin, Wei-qi; Wang, Xia; Cheng, Yi-nan

    2009-07-01

    Traditional ACQUIRE models perform the discrimination tasks of detection (target orientation, recognition and identification) for military targets based upon the minimum resolvable temperature difference (MRTD) and the Johnson criteria for thermal imaging systems (TIS). With the development of focal plane array (FPA) detectors and digital image processing technology, the Johnson criteria are generally pessimistic for performance prediction of sampled imagers. The triangle orientation discrimination threshold (TOD) model, the minimum temperature difference perceived (MTDP)/thermal range model (TRM3), and the targeting task performance (TTP) metric have been developed to predict the performance of sampled imagers; the TTP metric in particular can provide better accuracy than the Johnson criteria. In this paper, the performance models above are described; channel width metrics are presented to describe overall performance, including the modulation transfer function (MTF) channel width for high signal-to-noise ratio (SNR) optoelectronic imaging systems and the MRTD channel width for low-SNR TIS; the unresolved questions in performance assessment of TIS are indicated; and finally, the development directions of performance models for TIS are discussed.

  1. A Classification Scheme for Smart Manufacturing Systems’ Performance Metrics

    PubMed Central

    Lee, Y. Tina; Kumaraguru, Senthilkumaran; Jain, Sanjay; Robinson, Stefanie; Helu, Moneer; Hatim, Qais Y.; Rachuri, Sudarsan; Dornfeld, David; Saldana, Christopher J.; Kumara, Soundar

    2017-01-01

    This paper proposes a classification scheme for performance metrics for smart manufacturing systems. The discussion focuses on three such metrics: agility, asset utilization, and sustainability. For each of these metrics, we discuss classification themes, which we then use to develop a generalized classification scheme. In addition to the themes, we discuss a conceptual model that may form the basis for the information necessary for performance evaluations. Finally, we present future challenges in developing robust, performance-measurement systems for real-time, data-intensive enterprises. PMID:28785744

  2. Creating "Intelligent" Climate Model Ensemble Averages Using a Process-Based Framework

    NASA Astrophysics Data System (ADS)

    Baker, N. C.; Taylor, P. C.

    2014-12-01

    The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is often used to add value to model projections: consensus projections have been shown to consistently outperform individual models. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, certain models reproduce climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequally weighting multi-model ensembles. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables (e.g., outgoing longwave radiation and surface temperature). Metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and the Earth's Radiant Energy System (CERES) instrument and surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing weighted and unweighted model ensembles. For example, one tested metric weights the ensemble by how well models reproduce the time-series probability distribution of the cloud forcing component of reflected shortwave radiation. The weighted ensemble for this metric indicates lower simulated precipitation (up to 0.7 mm/day) in tropical regions than the unweighted ensemble: since CMIP5 models have been shown to overproduce precipitation, this result could indicate that the metric is effective in identifying models which simulate more realistic precipitation. Ultimately, the goal of the framework is to identify performance metrics that inform better methods for ensemble averaging of models and to create better climate predictions.

  3. Synchronization of multi-agent systems with metric-topological interactions.

    PubMed

    Wang, Lin; Chen, Guanrong

    2016-09-01

    A hybrid multi-agent systems model integrating the advantages of both metric interaction and topological interaction rules, called the metric-topological model, is developed. This model describes planar motions of mobile agents, where each agent can interact with all the agents within a circle of a constant radius, and can furthermore interact with some distant agents to reach a pre-assigned number of neighbors, if needed. Some sufficient conditions imposed only on system parameters and agent initial states are presented, which ensure achieving synchronization of the whole group of agents. It reveals the intrinsic relationships among the interaction range, the speed, the initial heading, and the density of the group. Moreover, robustness against variations of interaction range, density, and speed is investigated by comparing the motion patterns and performances of the hybrid metric-topological interaction model with the conventional metric-only and topological-only interaction models. In practically all cases, the hybrid metric-topological interaction model has the best performance in the sense of achieving the highest frequency of synchronization, the fastest convergence rate, and the smallest heading difference.
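
    The neighbor rule described above (everyone within a fixed radius, topped up with the nearest distant agents until a pre-assigned count is reached) can be sketched as follows. This is a minimal reading of the abstract, not the authors' implementation; the function and parameter names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def select_neighbors(positions, i, radius, min_neighbors):
    """Return the indices of the agents that agent i interacts with.

    positions:     list of (x, y) tuples, one per agent
    radius:        metric interaction range
    min_neighbors: pre-assigned neighbor count for the topological top-up
    """
    # Distances from agent i to every other agent, nearest first.
    by_distance = sorted(
        (math.dist(positions[i], p), j)
        for j, p in enumerate(positions)
        if j != i
    )
    # Metric rule: all agents inside the interaction circle.
    neighbors = [j for d, j in by_distance if d <= radius]
    # Topological top-up: add the nearest distant agents if too few were found.
    for d, j in by_distance:
        if len(neighbors) >= min_neighbors:
            break
        if j not in neighbors:
            neighbors.append(j)
    return neighbors


agents = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
```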

  4. A GPS Phase-Locked Loop Performance Metric Based on the Phase Discriminator Output

    PubMed Central

    Stevanovic, Stefan; Pervan, Boris

    2018-01-01

    We propose a novel GPS phase-lock loop (PLL) performance metric based on the standard deviation of tracking error (defined as the discriminator’s estimate of the true phase error), and explain its advantages over the popular phase jitter metric using theory, numerical simulation, and experimental results. We derive an augmented GPS phase-lock loop (PLL) linear model, which includes the effect of coherent averaging, to be used in conjunction with this proposed metric. The augmented linear model allows more accurate calculation of tracking error standard deviation in the presence of additive white Gaussian noise (AWGN) as compared to traditional linear models. The standard deviation of tracking error, with a threshold corresponding to half of the arctangent discriminator pull-in region, is shown to be a more reliable/robust measure of PLL performance under interference conditions than the phase jitter metric. In addition, the augmented linear model is shown to be valid up until this threshold, which facilitates efficient performance prediction, so that time-consuming direct simulations and costly experimental testing can be reserved for PLL designs that are much more likely to be successful. The effect of varying receiver reference oscillator quality on the tracking error metric is also considered. PMID:29351250

  5. Ranking streamflow model performance based on Information theory metrics

    NASA Astrophysics Data System (ADS)

    Martinez, Gonzalo; Pachepsky, Yakov; Pan, Feng; Wagener, Thorsten; Nicholson, Thomas

    2016-04-01

    The accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to use the information theory-based metrics to see whether they can be used as a complementary tool for hydrologic model evaluation and selection. We simulated 10-year streamflow time series in five watersheds located in Texas, North Carolina, Mississippi, and West Virginia. Eight models of different complexity were applied. The information-theory based metrics were obtained after representing the time series as strings of symbols where different symbols corresponded to different quantiles of the probability distribution of streamflow. The symbol alphabet was used. Three metrics were computed for those strings: mean information gain, which measures the randomness of the signal; effective measure complexity, which characterizes predictability; and fluctuation complexity, which characterizes the presence of a pattern in the signal. The observed streamflow time series has smaller information content and larger complexity metrics than the precipitation time series. Watersheds served as information filters, and streamflow time series were less random and more complex than those of precipitation. This reflects the fact that the watershed acts as the information filter in the hydrologic conversion process from precipitation to streamflow. The Nash Sutcliffe efficiency metric increased as the complexity of models increased, but in many cases several models had efficiency values that were not statistically significantly different from each other. In such cases, ranking models by the closeness of the information-theory based parameters in simulated and measured streamflow time series can provide an additional criterion for the evaluation of hydrologic model performance.
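
    The symbolization step and the first of the three metrics can be sketched as below. The alphabet size, the word length and the use of the block-entropy difference H(L+1) - H(L) as the mean information gain are assumptions; the abstract does not give these details, and the other two metrics are not shown.

```python
import math
from bisect import bisect_right
from collections import Counter


def symbolize(series, alphabet_size):
    """Map each value to a symbol 0..alphabet_size-1 by empirical quantile."""
    ordered = sorted(series)
    n = len(ordered)
    # Cut points at the empirical quantiles q/alphabet_size, q = 1..alphabet_size-1.
    cuts = [ordered[(n * q) // alphabet_size] for q in range(1, alphabet_size)]
    return [bisect_right(cuts, value) for value in series]


def block_entropy(symbols, length):
    """Shannon entropy in bits of the overlapping words of the given length."""
    words = [tuple(symbols[i:i + length]) for i in range(len(symbols) - length + 1)]
    counts = Counter(words)
    total = len(words)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def mean_information_gain(symbols, length):
    """Average new information per symbol given the preceding word (assumed definition)."""
    return block_entropy(symbols, length + 1) - block_entropy(symbols, length)


flows = [1, 2, 3, 4, 5, 6, 7, 8]
flow_symbols = symbolize(flows, alphabet_size=4)
```

    With short series the estimate is noisy, because long words are observed only a few times each.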

  6. Validity of the two-level model for Viterbi decoder gap-cycle performance

    NASA Technical Reports Server (NTRS)

    Dolinar, S.; Arnold, S.

    1990-01-01

    A two-level model has previously been proposed for approximating the performance of a Viterbi decoder which encounters data received with periodically varying signal-to-noise ratio. Such cyclically gapped data is obtained from the Very Large Array (VLA), either operating as a stand-alone system or arrayed with Goldstone. This approximate model predicts that the decoder error rate will vary periodically between two discrete levels with the same period as the gap cycle. It further predicts that the length of the gapped portion of the decoder error cycle for a constraint length K decoder will be about K-1 bits shorter than the actual duration of the gap. The two-level model for Viterbi decoder performance with gapped data is subjected to detailed validation tests. Curves showing the cyclical behavior of the decoder error burst statistics are compared with the simple square-wave cycles predicted by the model. The validity of the model depends on a parameter often considered irrelevant in the analysis of Viterbi decoder performance, the overall scaling of the received signal or the decoder's branch-metrics. Three scaling alternatives are examined: optimum branch-metric scaling and constant branch-metric scaling combined with either constant noise-level scaling or constant signal-level scaling. The simulated decoder error cycle curves roughly verify the accuracy of the two-level model for both the case of optimum branch-metric scaling and the case of constant branch-metric scaling combined with constant noise-level scaling. However, the model is not accurate for the case of constant branch-metric scaling combined with constant signal-level scaling.

  7. Impact of distance-based metric learning on classification and visualization model performance and structure-activity landscapes.

    PubMed

    Kireeva, Natalia V; Ovchinnikova, Svetlana I; Kuznetsov, Sergey L; Kazennov, Andrey M; Tsivadze, Aslan Yu

    2014-02-01

    This study concerns the large margin nearest neighbors classifier and its multi-metric extension as efficient approaches for metric learning, which aim to learn an appropriate distance/similarity function for the considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve performance in classification, clustering and retrieval tasks. The paper describes the application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in the drug discovery process; in silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities, the impact of metric learning on structure-activity landscapes and the predictive performance of the developed models has been analyzed, and the learned metric has been used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest-neighbor relations and descriptor space.

  8. Impact of distance-based metric learning on classification and visualization model performance and structure-activity landscapes

    NASA Astrophysics Data System (ADS)

    Kireeva, Natalia V.; Ovchinnikova, Svetlana I.; Kuznetsov, Sergey L.; Kazennov, Andrey M.; Tsivadze, Aslan Yu.

    2014-02-01

    This study concerns the large margin nearest neighbors classifier and its multi-metric extension as efficient approaches for metric learning, which aim to learn an appropriate distance/similarity function for the considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve performance in classification, clustering and retrieval tasks. The paper describes the application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in the drug discovery process; in silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities, the impact of metric learning on structure-activity landscapes and the predictive performance of the developed models has been analyzed, and the learned metric has been used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest-neighbor relations and descriptor space.

  9. Multi-metric calibration of hydrological model to capture overall flow regimes

    NASA Astrophysics Data System (ADS)

    Zhang, Yongyong; Shao, Quanxi; Zhang, Shifeng; Zhai, Xiaoyan; She, Dunxian

    2016-08-01

    Flow regimes (e.g., magnitude, frequency, variation, duration, timing and rate of change) play a critical role in water supply and flood control, environmental processes, as well as biodiversity and life history patterns in the aquatic ecosystem. The traditional flow magnitude-oriented calibration of hydrological models was usually inadequate to capture all the characteristics of observed flow regimes. In this study, we simulated multiple flow regime metrics simultaneously by coupling a distributed hydrological model with an equally weighted multi-objective optimization algorithm. Two headwater watersheds in the arid Hexi Corridor were selected for the case study. Sixteen metrics were selected as optimization objectives, which could represent the major characteristics of flow regimes. Model performance was compared with that of the single objective calibration. Results showed that most metrics were better simulated by the multi-objective approach than by the single objective calibration, especially the low and high flow magnitudes, frequency and variation, duration, maximum flow timing and rate of change. However, the model performance of middle flow magnitude was not significantly improved because this metric was usually well captured by single objective calibration. The timing of minimum flow was poorly predicted by both the multi-metric and single calibrations due to the uncertainties in model structure and input data. The sensitive parameter values of the hydrological model changed remarkably and the simulated hydrological processes by the multi-metric calibration became more reliable, because more flow characteristics were considered. The study is expected to provide more detailed flow information by hydrological simulation for the integrated water resources management, and to improve the simulation performances of overall flow regimes.

  10. Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes.

    PubMed

    Yates, Katherine L; Mellin, Camille; Caley, M Julian; Radford, Ben T; Meeuwig, Jessica J

    2016-01-01

    Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability.

  11. Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes

    PubMed Central

    Yates, Katherine L.; Mellin, Camille; Caley, M. Julian; Radford, Ben T.; Meeuwig, Jessica J.

    2016-01-01

    Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability. PMID:27333202

  12. Rank Order Entropy: why one metric is not enough

    PubMed Central

    McLellan, Margaret R.; Ryan, M. Dominic; Breneman, Curt M.

    2011-01-01

    The use of Quantitative Structure-Activity Relationship models to address problems in drug discovery has a mixed history, generally resulting from the mis-application of QSAR models that were either poorly constructed or used outside of their domains of applicability. This situation has motivated the development of a variety of model performance metrics (r2, PRESS r2, F-tests, etc) designed to increase user confidence in the validity of QSAR predictions. In a typical workflow scenario, QSAR models are created and validated on training sets of molecules using metrics such as Leave-One-Out or many-fold cross-validation methods that attempt to assess their internal consistency. However, few current validation methods are designed to directly address the stability of QSAR predictions in response to changes in the information content of the training set. Since the main purpose of QSAR is to quickly and accurately estimate a property of interest for an untested set of molecules, it makes sense to have a means at hand to correctly set user expectations of model performance. In fact, the numerical value of a molecular prediction is often less important to the end user than knowing the rank order of that set of molecules according to their predicted endpoint values. Consequently, a means for characterizing the stability of predicted rank order is an important component of predictive QSAR. Unfortunately, none of the many validation metrics currently available directly measure the stability of rank order prediction, making the development of an additional metric that can quantify model stability a high priority. To address this need, this work examines the stabilities of QSAR rank order models created from representative data sets, descriptor sets, and modeling methods that were then assessed using Kendall Tau as a rank order metric, upon which the Shannon Entropy was evaluated as a means of quantifying rank-order stability. 
Random removal of data from the training set, also known as Data Truncation Analysis (DTA), was used as a means for systematically reducing the information content of each training set while examining both rank order performance and rank order stability in the face of training set data loss. The premise for DTA ROE model evaluation is that the response of a model to incremental loss of training information will be indicative of the quality and sufficiency of its training set, learning method, and descriptor types to cover a particular domain of applicability. This process is termed a “rank order entropy” evaluation, or ROE. By analogy with information theory, an unstable rank order model displays a high level of implicit entropy, while a QSAR rank order model which remains nearly unchanged during training set reductions would show low entropy. In this work, the ROE metric was applied to 71 data sets of different sizes, and was found to reveal more information about the behavior of the models than traditional metrics alone. Stable, or consistently performing models, did not necessarily predict rank order well. Models that performed well in rank order did not necessarily perform well in traditional metrics. In the end, it was shown that ROE metrics suggested that some QSAR models that are typically used should be discarded. ROE evaluation helps to discern which combinations of data set, descriptor set, and modeling methods lead to usable models in prioritization schemes, and provides confidence in the use of a particular model within a specific domain of applicability. PMID:21875058
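
    A much-simplified sketch of the two ingredients named above, Kendall tau as the rank-order metric and Shannon entropy as the stability measure, is given below. How the authors combine them is not specified in the abstract; binning the tau values obtained from repeated training-set truncations over [-1, 1] and taking the entropy of the bin counts is an assumption made purely for illustration, as are the function names.

```python
import math
from collections import Counter


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs; ties score zero."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


def tau_entropy(taus, n_bins=10):
    """Shannon entropy in bits of tau values binned over [-1, 1] (assumed binning)."""
    bins = Counter(min(int((t + 1) / 2 * n_bins), n_bins - 1) for t in taus)
    total = len(taus)
    return -sum(c / total * math.log2(c / total) for c in bins.values())


# A model whose rank order barely moves across truncations has low entropy.
stable_taus = [0.9, 0.9, 0.9]
unstable_taus = [-1.0, 1.0]
```

    Note that low entropy only indicates stability: a model can be consistently wrong, which matches the abstract's remark that stable models did not necessarily predict rank order well.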

  13. Specification and implementation of IFC based performance metrics to support building life cycle assessment of hybrid energy systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morrissey, Elmer; O'Donnell, James; Keane, Marcus

    2004-03-29

    Minimizing building life cycle energy consumption is becoming of paramount importance. Performance metrics tracking offers a clear and concise manner of relating design intent in a quantitative form. A methodology is discussed for storage and utilization of these performance metrics through an Industry Foundation Classes (IFC) instantiated Building Information Model (BIM). The paper focuses on storage of three sets of performance data from three distinct sources. An example of a performance metrics programming hierarchy is displayed for a heat pump and a solar array. Utilizing the sets of performance data, two discrete performance effectiveness ratios may be computed, thus offering an accurate method of quantitatively assessing building performance.

  14. Context and meter enhance long-range planning in music performance

    PubMed Central

    Mathias, Brian; Pfordresher, Peter Q.; Palmer, Caroline

    2015-01-01

    Neural responses demonstrate evidence of resonance, or oscillation, during the production of periodic auditory events. Music contains periodic auditory events that give rise to a sense of beat, which in turn generates a sense of meter on the basis of multiple periodicities. Metrical hierarchies may aid memory for music by facilitating similarity-based associations among sequence events at different periodic distances that unfold in longer contexts. A fundamental question is how metrical associations arising from a musical context influence memory during music performance. Longer contexts may facilitate metrical associations at higher hierarchical levels more than shorter contexts, a prediction of the range model, a formal model of planning processes in music performance (Palmer and Pfordresher, 2003; Pfordresher et al., 2007). Serial ordering errors, in which intended sequence events are produced in incorrect sequence positions, were measured as skilled pianists performed musical pieces that contained excerpts embedded in long or short musical contexts. Pitch errors arose from metrically similar positions and further sequential distances more often when the excerpt was embedded in long contexts compared to short contexts. Musicians’ keystroke intensities and error rates also revealed influences of metrical hierarchies, which differed for performances in long and short contexts. The range model accounted for contextual effects and provided better fits to empirical findings when metrical associations between sequence events were included. Longer sequence contexts may facilitate planning during sequence production by increasing conceptual similarity between hierarchically associated events. These findings are consistent with the notion that neural oscillations at multiple periodicities may strengthen metrical associations across sequence events during planning. PMID:25628550

  15. Regime-based evaluation of cloudiness in CMIP5 models

    NASA Astrophysics Data System (ADS)

    Jin, Daeho; Oreopoulos, Lazaros; Lee, Dongmin

    2017-01-01

    The concept of cloud regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 fifth Coupled Model Intercomparison Project (CMIP5) models. Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator generating in each grid cell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product [long-term average total cloud amount (TCA)], cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies also yield substantial TCA errors. Our results support previous findings that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is still not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer cloud observations evaluated against ISCCP as if they were another model output. Lastly, contrasting cloud simulation performance against each model's equilibrium climate sensitivity, in order to gain insight on whether good cloud simulation pairs with particular values of this parameter, yields no clear conclusions.

  16. The use of player physical and technical skill match activity profiles to predict position in the Australian Football League draft.

    PubMed

    Woods, Carl T; Veale, James P; Collier, Neil; Robertson, Sam

    2017-02-01

    This study investigated the extent to which position in the Australian Football League (AFL) national draft is associated with individual game performance metrics. Physical/technical skill performance metrics were collated from all participants in the 2014 national under 18 (U18) championships (18 games) drafted into the AFL (n = 65; 17.8 ± 0.5 y); 232 observations. Players were subdivided into draft position (ranked 1-65) and then draft round (1-4). Here, earlier draft selection (i.e., closer to 1) reflects a more desirable player. Microtechnology and a commercial provider facilitated the quantification of individual game performance metrics (n = 16). Linear mixed models were fitted to data, modelling the extent to which draft position was associated with these metrics. Draft position in the first/second round was negatively associated with "contested possessions" and "contested marks", respectively. Physical performance metrics were positively associated with draft position in these rounds. Correlations weakened for the third/fourth rounds. Contested possessions/marks were associated with an earlier draft selection. Physical performance metrics were associated with a later draft selection. Recruiters change the type of U18 player they draft as the selection pool reduces. Juniors with contested skill appear prioritised.

  17. Normal tissue complication probability (NTCP) modelling using spatial dose metrics and machine learning methods for severe acute oral mucositis resulting from head and neck radiotherapy.

    PubMed

    Dean, Jamie A; Wong, Kee H; Welsh, Liam C; Jones, Ann-Britt; Schick, Ulrike; Newbold, Kate L; Bhide, Shreerang A; Harrington, Kevin J; Nutting, Christopher M; Gulliford, Sarah L

    2016-07-01

    Severe acute mucositis commonly results from head and neck (chemo)radiotherapy. A predictive model of mucositis could guide clinical decision-making and inform treatment planning. We aimed to generate such a model using spatial dose metrics and machine learning. Predictive models of severe acute mucositis were generated using radiotherapy dose (dose-volume and spatial dose metrics) and clinical data. Penalised logistic regression, support vector classification and random forest classification (RFC) models were generated and compared. Internal validation was performed (with 100-iteration cross-validation), using multiple metrics, including area under the receiver operating characteristic curve (AUC) and calibration slope, to assess performance. Associations between covariates and severe mucositis were explored using the models. The dose-volume-based models (standard) performed equally to those incorporating spatial information. Discrimination was similar between models, but the RFCstandard had the best calibration. The mean AUC and calibration slope for this model were 0.71 (s.d.=0.09) and 3.9 (s.d.=2.2), respectively. The volumes of oral cavity receiving intermediate and high doses were associated with severe mucositis. The RFCstandard model performance is modest-to-good, but should be improved, and requires external validation. Reducing the volumes of oral cavity receiving intermediate and high doses may reduce mucositis incidence. Copyright © 2016 The Author(s). Published by Elsevier Ireland Ltd. All rights reserved.
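
    For readers unfamiliar with the discrimination metric reported above, AUC can be computed as the fraction of (positive, negative) case pairs in which the positive case receives the higher predicted score, with ties counted as half. The sketch below uses that pairwise definition; the function name and the toy data are illustrative and are not taken from the study. The calibration slope is not shown.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: list of 0/1 outcomes (1 = event, e.g. severe mucositis)
    scores: list of predicted risks, higher meaning more likely to be an event
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    # Each correctly ordered pair scores 1, each tied pair scores 0.5.
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

    An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation, which puts the reported mean of 0.71 in context.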

  18. Human-centric predictive model of task difficulty for human-in-the-loop control tasks

    PubMed Central

    Majewicz Fey, Ann

    2018-01-01

    Quantitatively measuring the difficulty of a manipulation task in human-in-the-loop control systems is ill-defined. Currently, systems are typically evaluated through task-specific performance measures and post-experiment user surveys; however, these methods do not capture the real-time experience of human users. In this study, we propose to analyze and predict the difficulty of a bivariate pointing task, with a haptic device interface, using human-centric measurement data in terms of cognition, physical effort, and motion kinematics. Noninvasive sensors were used to record the multimodal response of human users for 14 subjects performing the task. A data-driven approach for predicting task difficulty was implemented based on several task-independent metrics. We compare four possible models for predicting task difficulty to evaluate the roles of the various types of metrics, including: (I) a movement time model, (II) a fusion model using both physiological and kinematic metrics, (III) a model only with kinematic metrics, and (IV) a model only with physiological metrics. The results show significant correlation between task difficulty and the user sensorimotor response. The fusion model, integrating user physiology and motion kinematics, provided the best estimate of task difficulty (R2 = 0.927), followed by a model using only kinematic metrics (R2 = 0.921). Both models were better predictors of task difficulty than the movement time model (R2 = 0.847), derived from Fitts's law, a well studied difficulty model for human psychomotor control. PMID:29621301

  19. Regime-Based Evaluation of Cloudiness in CMIP5 Models

    NASA Technical Reports Server (NTRS)

    Jin, Daeho; Oraiopoulos, Lazaros; Lee, Dong Min

    2016-01-01

    The concept of Cloud Regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 fifth Coupled Model Intercomparison Project (CMIP5) models. Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator generating for each gridcell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product (long-term average total cloud amount [TCA]), cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies yield also substantial TCA errors. Our findings support previous studies showing that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite their shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer (MODIS) cloud observations evaluated against ISCCP as if they were another model output. Lastly, cloud simulation performance is contrasted with each model's equilibrium climate sensitivity (ECS) in order to gain insight on whether good cloud simulation pairs with particular values of this parameter.

  20. Model Performance Evaluation and Scenario Analysis ...

    EPA Pesticide Factsheets

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude only, sequence only, and combined magnitude and sequence errors. The performance measures include error analysis, coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components and the reconstruction back to time series provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify if the source of uncertainty in the simulated data is due to the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify if mismatches between observed and simulated data result from magnitude or sequence related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the too
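
    One of the goodness-of-fit measures named in this record, the Nash-Sutcliffe efficiency, has a standard textbook definition that can be sketched briefly. This is not taken from the MPESA tool itself, and the function name is hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SS_error / SS_about_observed_mean.

    A value of 1 is a perfect match; 0 means the simulation is no better
    than predicting the observed mean; negative values are worse than that.
    """
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    if ss_obs == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - ss_err / ss_obs
```

    Because it is a single aggregate score, it cannot by itself separate magnitude errors from sequence errors, which is the gap the record says the diagnostics address.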

  1. Stability and Performance Metrics for Adaptive Flight Control

    NASA Technical Reports Server (NTRS)

    Stepanyan, Vahram; Krishnakumar, Kalmanje; Nguyen, Nhan; VanEykeren, Luarens

    2009-01-01

    This paper addresses the problem of verifying adaptive control techniques for enabling safe flight in the presence of adverse conditions. Since the adaptive systems are non-linear by design, the existing control verification metrics are not applicable to adaptive controllers. Moreover, these systems are in general highly uncertain. Hence, the system's characteristics cannot be evaluated by relying on the available dynamical models. This necessitates the development of control verification metrics based on the system's input-output information. From this point of view, a set of metrics is introduced that compares the uncertain aircraft's input-output behavior under the action of an adaptive controller to that of a closed-loop linear reference model to be followed by the aircraft. This reference model is constructed for each specific maneuver using the exact aerodynamic and mass properties of the aircraft to meet the stability and performance requirements commonly accepted in flight control. The proposed metrics are unified in the sense that they are model independent and not restricted to any specific adaptive control methods. As an example, we present simulation results for a wing damaged generic transport aircraft with several existing adaptive controllers.

  2. Up Periscope! Designing a New Perceptual Metric for Imaging System Performance

    NASA Technical Reports Server (NTRS)

    Watson, Andrew B.

    2016-01-01

    Modern electronic imaging systems include optics, sensors, sampling, noise, processing, compression, transmission and display elements, and are viewed by the human eye. Many of these elements cannot be assessed by traditional imaging system metrics such as the MTF. More complex metrics such as NVTherm do address these elements, but do so largely through parametric adjustment of an MTF-like metric. The parameters are adjusted through subjective testing of human observers identifying specific targets in a set of standard images. We have designed a new metric that is based on a model of human visual pattern classification. In contrast to previous metrics, ours simulates the human observer identifying the standard targets. One application of this metric is to quantify performance of modern electronic periscope systems on submarines.

  3. A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions

    NASA Astrophysics Data System (ADS)

    Gide, Milind S.; Karam, Lina J.

    2016-08-01

    With the increased focus on visual attention (VA) in the last decade, a large number of computational visual saliency methods have been developed over the past few years. These models are traditionally evaluated by using performance evaluation metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Though a considerable number of such metrics have been proposed in the literature, there are notable problems in them. In this work, we discuss shortcomings in existing metrics through illustrative examples and propose a new metric that uses local weights based on fixation density which overcomes these flaws. To compare the performance of our proposed metric at assessing the quality of saliency prediction with other existing metrics, we construct a ground-truth subjective database in which saliency maps obtained from 17 different VA models are evaluated by 16 human observers on a 5-point categorical scale in terms of their visual resemblance with corresponding ground-truth fixation density maps obtained from eye-tracking data. The metrics are evaluated by correlating metric scores with the human subjective ratings. The correlation results show that the proposed evaluation metric outperforms all other popular existing metrics. Additionally, the constructed database and corresponding subjective ratings provide an insight into which of the existing metrics and future metrics are better at estimating the quality of saliency prediction and can be used as a benchmark.

  4. A neural net-based approach to software metrics

    NASA Technical Reports Server (NTRS)

    Boetticher, G.; Srinivas, Kankanahalli; Eichmann, David A.

    1992-01-01

    Software metrics provide an effective method for characterizing software. Metrics have traditionally been composed through the definition of an equation. This approach is limited by the requirement that all the interrelationships among all the parameters be fully understood. This paper explores an alternative, neural network approach to modeling metrics. Experiments performed on two widely accepted metrics, McCabe and Halstead, indicate that the approach is sound, thus serving as the groundwork for further exploration into the analysis and design of software metrics.

  5. Performance of the METRIC model in estimating evapotranspiration fluxes over an irrigated field in Saudi Arabia using Landsat-8 images

    NASA Astrophysics Data System (ADS)

    Madugundu, Rangaswamy; Al-Gaadi, Khalid A.; Tola, ElKamil; Hassaballa, Abdalhaleem A.; Patil, Virupakshagouda C.

    2017-12-01

    Accurate estimation of evapotranspiration (ET) is essential for hydrological modeling and efficient crop water management in hyper-arid climates. In this study, we applied the METRIC algorithm on Landsat-8 images, acquired from June to October 2013, for the mapping of ET of a 50 ha center-pivot irrigated alfalfa field in the eastern region of Saudi Arabia. The METRIC-estimated energy balance components and ET were evaluated against the data provided by an eddy covariance (EC) flux tower installed in the field. Results indicated that the METRIC algorithm provided accurate ET estimates over the study area, with RMSE values of 0.13 and 4.15 mm d-1. The METRIC algorithm was observed to perform better in full canopy conditions compared to partial canopy conditions. On average, the METRIC algorithm overestimated the hourly ET by 6.6 % in comparison to the EC measurements; however, the daily ET was underestimated by 4.2 %.
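
    The two kinds of figures quoted in this record, an RMSE and a percentage over- or underestimate, can be computed as in the following sketch. The percent-bias definition used here (total estimated minus total observed, relative to total observed) is an assumption; the abstract does not give the authors' exact formula.

```python
import math


def rmse(estimated, observed):
    """Root-mean-square error between paired estimates and observations."""
    if len(estimated) != len(observed):
        raise ValueError("series must have the same length")
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)


def percent_bias(estimated, observed):
    """Positive values mean overestimation, negative mean underestimation."""
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("observed total must be non-zero")
    return 100.0 * (sum(estimated) - total_obs) / total_obs
```

    The two measures answer different questions: RMSE penalizes scatter, while percent bias shows only the net direction of the error, so hourly and daily results can differ in sign as they do in the abstract.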

  6. Interaction Metrics for Feedback Control of Sound Radiation from Stiffened Panels

    NASA Technical Reports Server (NTRS)

    Cabell, Randolph H.; Cox, David E.; Gibbs, Gary P.

    2003-01-01

    Interaction metrics developed for the process control industry are used to evaluate decentralized control of sound radiation from bays on an aircraft fuselage. The metrics are applied to experimentally measured frequency response data from a model of an aircraft fuselage. The purpose is to understand how coupling between multiple bays of the fuselage can destabilize or limit the performance of a decentralized active noise control system. The metrics quantitatively verify observations from a previous experiment, in which decentralized controllers performed worse than centralized controllers. The metrics do not appear to be useful for explaining control spillover which was observed in a previous experiment.

  7. A condition metric for Eucalyptus woodland derived from expert evaluations.

    PubMed

    Sinclair, Steve J; Bruce, Matthew J; Griffioen, Peter; Dodd, Amanda; White, Matthew D

    2018-02-01

    The evaluation of ecosystem quality is important for land-management and land-use planning. Evaluation is unavoidably subjective, and robust metrics must be based on consensus and the structured use of observations. We devised a transparent and repeatable process for building and testing ecosystem metrics based on expert data. We gathered quantitative evaluation data on the quality of hypothetical grassy woodland sites from experts. We used these data to train a model (an ensemble of 30 bagged regression trees) capable of predicting the perceived quality of similar hypothetical woodlands based on a set of 13 site variables as inputs (e.g., cover of shrubs, richness of native forbs). These variables can be measured at any site and the model implemented in a spreadsheet as a metric of woodland quality. We also investigated the number of experts required to produce an opinion data set sufficient for the construction of a metric. The model produced evaluations similar to those provided by experts, as shown by assessing the model's quality scores of expert-evaluated test sites not used to train the model. We applied the metric to 13 woodland conservation reserves and asked managers of these sites to independently evaluate their quality. To assess metric performance, we compared the model's evaluation of site quality with the managers' evaluations through multidimensional scaling. The metric performed relatively well, plotting close to the center of the space defined by the evaluators. Given the method provides data-driven consensus and repeatability, which no single human evaluator can provide, we suggest it is a valuable tool for evaluating ecosystem quality in real-world contexts. We believe our approach is applicable to any ecosystem. © 2017 State of Victoria.

  8. Application of Support Vector Machine to Forex Monitoring

    NASA Astrophysics Data System (ADS)

    Kamruzzaman, Joarder; Sarker, Ruhul A.

    Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.

  9. Investigation of Two Models to Set and Evaluate Quality Targets for HbA1c: Biological Variation and Sigma-metrics

    PubMed Central

    Weykamp, Cas; John, Garry; Gillery, Philippe; English, Emma; Ji, Linong; Lenters-Westra, Erna; Little, Randie R.; Roglic, Gojka; Sacks, David B.; Takei, Izumi

    2016-01-01

    Background: A major objective of the IFCC Task Force on implementation of HbA1c standardization is to develop a model to define quality targets for HbA1c. Methods: Two generic models, the Biological Variation and Sigma-metrics model, are investigated. Variables in the models were selected for HbA1c and data of EQA/PT programs were used to evaluate the suitability of the models to set and evaluate quality targets within and between laboratories. Results: In the biological variation model 48% of individual laboratories and none of the 26 instrument groups met the minimum performance criterion. In the Sigma-metrics model, with a total allowable error (TAE) set at 5 mmol/mol (0.46% NGSP), 77% of the individual laboratories and 12 of 26 instrument groups met the 2 sigma criterion. Conclusion: The Biological Variation and Sigma-metrics model were demonstrated to be suitable for setting and evaluating quality targets within and between laboratories. The Sigma-metrics model is more flexible as both the TAE and the risk of failure can be adjusted to requirements related to e.g. use for diagnosis/monitoring or requirements of (inter)national authorities. With the aim of reaching international consensus on advice regarding quality targets for HbA1c, the Task Force suggests the Sigma-metrics model as the model of choice with default values of 5 mmol/mol (0.46%) for TAE, and risk levels of 2 and 4 sigma for routine laboratories and laboratories performing clinical trials, respectively. These goals should serve as a starting point for discussion with international stakeholders in the field of diabetes. PMID:25737535
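
    The Sigma-metrics criterion in this record can be illustrated with the conventional formula sigma = (TAE - |bias|) / imprecision, with all three quantities in the same units (for example mmol/mol). The abstract does not state the formula, so this is an assumption; the function names and the bias and imprecision values are made up for the example.

```python
def sigma_metric(tae, bias, sd):
    """Conventional sigma value: (TAE - |bias|) / imprecision, same units."""
    if sd <= 0:
        raise ValueError("imprecision (sd) must be positive")
    return (tae - abs(bias)) / sd


def meets_criterion(tae, bias, sd, required_sigma=2.0):
    """True if the laboratory reaches the required sigma level."""
    return sigma_metric(tae, bias, sd) >= required_sigma


if __name__ == "__main__":
    # Hypothetical laboratory: TAE 5 mmol/mol, bias -1 mmol/mol, SD 2 mmol/mol.
    print(sigma_metric(5.0, -1.0, 2.0))
    print(meets_criterion(5.0, -1.0, 2.0, required_sigma=4.0))
```

    Making the required sigma level a parameter mirrors the flexibility the abstract describes, with 2 sigma for routine laboratories and 4 sigma for laboratories performing clinical trials.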

  10. Analysis of Skeletal Muscle Metrics as Predictors of Functional Task Performance

    NASA Technical Reports Server (NTRS)

    Ryder, Jeffrey W.; Buxton, Roxanne E.; Redd, Elizabeth; Scott-Pandorf, Melissa; Hackney, Kyle J.; Fiedler, James; Ploutz-Snyder, Robert J.; Bloomberg, Jacob J.; Ploutz-Snyder, Lori L.

    2010-01-01

    PURPOSE: The ability to predict task performance using physiological performance metrics is vital to ensure that astronauts can execute their jobs safely and effectively. This investigation used a weighted suit to evaluate task performance at various ratios of strength, power, and endurance to body weight. METHODS: Twenty subjects completed muscle performance tests and functional tasks representative of those that would be required of astronauts during planetary exploration (see table for specific tests/tasks). Subjects performed functional tasks while wearing a weighted suit with additional loads ranging from 0-120% of initial body weight. Performance metrics were time to completion for all tasks except hatch opening, which consisted of total work. Task performance metrics were plotted against muscle metrics normalized to "body weight" (subject weight + external load; BW) for each trial. Fractional polynomial regression was used to model the relationship between muscle and task performance. CONCLUSION: LPMIF/BW is the best predictor of performance for predominantly lower-body tasks that are ambulatory and of short duration. LPMIF/BW is a very practical predictor of occupational task performance as it is quick and relatively safe to perform. Accordingly, bench press work best predicts hatch-opening work performance.

  11. Metrics for linear kinematic features in sea ice

    NASA Astrophysics Data System (ADS)

    Levy, G.; Coon, M.; Sulsky, D.

    2006-12-01

    The treatment of leads as cracks or discontinuities (see Coon et al. presentation) requires some shift in the procedure of evaluation and comparison of lead-resolving models and their validation against observations. Common metrics used to evaluate ice model skills are by and large an adaptation of a least square "metric" adopted from operational numerical weather prediction data assimilation systems and are most appropriate for continuous fields and Eulerian systems where the observations and predictions are commensurate. However, this class of metrics suffers from some flaws in areas of sharp gradients and discontinuities (e.g., leads) and when Lagrangian treatments are more natural. After a brief review of these metrics and their performance in areas of sharp gradients, we present two new metrics specifically designed to measure model accuracy in representing linear features (e.g., leads). The indices developed circumvent the requirement that both the observations and model variables be commensurate (i.e., measured with the same units) by considering the frequencies of the features of interest/importance. We illustrate the metrics by scoring several hypothetical "simulated" discontinuity fields against the lead interpreted from RGPS observations.

  12. Numerical studies and metric development for validation of magnetohydrodynamic models on the HIT-SI experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hansen, C.; Victor, B.

    We present application of three scalar metrics derived from the Biorthogonal Decomposition (BD) technique to evaluate the level of agreement between macroscopic plasma dynamics in different data sets. BD decomposes large data sets, as produced by distributed diagnostic arrays, into principal mode structures without assumptions on spatial or temporal structure. These metrics have been applied to validation of the Hall-MHD model using experimental data from the Helicity Injected Torus with Steady Inductive helicity injection experiment. Each metric provides a measure of correlation between mode structures extracted from experimental data and simulations for an array of 192 surface-mounted magnetic probes. Numerical validation studies have been performed using the NIMROD code, where the injectors are modeled as boundary conditions on the flux conserver, and the PSI-TET code, where the entire plasma volume is treated. Initial results from a comprehensive validation study of high performance operation with different injector frequencies are presented, illustrating application of the BD method. Using a simplified (constant, uniform density and temperature) Hall-MHD model, simulation results agree with experimental observation for two of the three defined metrics when the injectors are driven with a frequency of 14.5 kHz.

  13. Virtual reality simulator training for laparoscopic colectomy: what metrics have construct validity?

    PubMed

    Shanmugan, Skandan; Leblanc, Fabien; Senagore, Anthony J; Ellis, C Neal; Stein, Sharon L; Khan, Sadaf; Delaney, Conor P; Champagne, Bradley J

    2014-02-01

    Virtual reality simulation for laparoscopic colectomy has been used for training of surgical residents and has been considered as a model for technical skills assessment of board-eligible colorectal surgeons. However, construct validity (the ability to distinguish between skill levels) must be confirmed before widespread implementation. This study was designed to specifically determine which metrics for laparoscopic sigmoid colectomy have evidence of construct validity. General surgeons that had performed fewer than 30 laparoscopic colon resections and laparoscopic colorectal experts (>200 laparoscopic colon resections) performed laparoscopic sigmoid colectomy on the LAP Mentor model. All participants received a 15-minute instructional warm-up and had never used the simulator before the study. Performance was then compared between each group for 21 metrics (procedural, 14; intraoperative errors, 7) to determine specifically which measurements demonstrate construct validity. Performance was compared with the Mann-Whitney U-test (p < 0.05 was significant). Fifty-three surgeons (29 general surgeons and 24 colorectal surgeons) enrolled in the study. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 of 14 procedural metrics by distinguishing levels of surgical experience (p < 0.05). The most discriminatory procedural metrics (p < 0.01) favoring experts were reduced instrument path length, accuracy of the peritoneal/medial mobilization, and dissection of the inferior mesenteric artery. Intraoperative errors were not discriminatory for most metrics and favored general surgeons for colonic wall injury (general surgeons, 0.7; colorectal surgeons, 3.5; p = 0.045). Individual variability within the general surgeon and colorectal surgeon groups was not accounted for. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 procedure-specific metrics. However, using virtual reality simulator metrics to detect intraoperative errors did not discriminate between groups. If the virtual reality simulator continues to be used for the technical assessment of trainees and board-eligible surgeons, the evaluation of performance should be limited to procedural metrics.

  14. Creating "Intelligent" Ensemble Averages Using a Process-Based Framework

    NASA Astrophysics Data System (ADS)

    Baker, Noel; Taylor, Patrick

    2014-05-01

    The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is used to add value to individual model projections and construct a consensus projection. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, individual models reproduce certain climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting multi-model ensembles. The intention is to produce improved ("intelligent") unequal-weight ensemble averages. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Several climate process metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument in combination with surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing the equal-weighted ensemble average and an ensemble weighted using the process-based metric. Additionally, this study investigates the dependence of the metric weighting scheme on the climate state using a combination of model simulations including a non-forced preindustrial control experiment, historical simulations, and several radiative forcing Representative Concentration Pathway (RCP) scenarios. Ultimately, the goal of the framework is to advise better methods for ensemble averaging models and create better climate predictions.
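
    The contrast between an equal-weighted and a performance-weighted ensemble average can be shown with a small sketch. It assumes weights proportional to a non-negative skill score for each model; the abstract does not specify how its weights are derived, and the scores and projections below are invented.

```python
def normalized_weights(scores):
    """Turn non-negative skill scores into weights that sum to one."""
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total == 0:
        raise ValueError("at least one score must be positive")
    return [s / total for s in scores]


def weighted_mean(values, weights):
    """Weighted ensemble average of one projected quantity."""
    if len(values) != len(weights):
        raise ValueError("one weight is required per model")
    return sum(v * w for v, w in zip(values, weights))


if __name__ == "__main__":
    projections = [1.0, 3.0]   # hypothetical warming from two models
    skill = [1.0, 3.0]         # hypothetical process-based skill scores
    print(weighted_mean(projections, normalized_weights([1.0, 1.0])))
    print(weighted_mean(projections, normalized_weights(skill)))
```

    The two printed averages differ whenever skill and projection are correlated across models, which is the situation in which the choice of metric matters.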

  15. CPMIP: measurements of real computational performance of Earth system models in CMIP6

    NASA Astrophysics Data System (ADS)

    Balaji, Venkatramani; Maisonnave, Eric; Zadeh, Niki; Lawrence, Bryan N.; Biercamp, Joachim; Fladrich, Uwe; Aloisio, Giovanni; Benson, Rusty; Caubel, Arnaud; Durachta, Jeffrey; Foujols, Marie-Alice; Lister, Grenville; Mocavero, Silvia; Underwood, Seth; Wright, Garrett

    2017-01-01

    A climate model represents a multitude of processes on a variety of timescales and space scales: a canonical example of multi-physics multi-scale modeling. The underlying climate system is physically characterized by sensitive dependence on initial conditions, and natural stochastic variability, so very long integrations are needed to extract signals of climate change. Algorithms generally possess weak scaling and can be I/O and/or memory-bound. Such weak-scaling, I/O, and memory-bound multi-physics codes present particular challenges to computational performance. Traditional metrics of computational efficiency such as performance counters and scaling curves do not tell us enough about real sustained performance from climate models on different machines. They also do not provide a satisfactory basis for comparative information across models. We introduce a set of metrics that can be used for the study of computational performance of climate (and Earth system) models. These measures do not require specialized software or specific hardware counters, and should be accessible to anyone. They are independent of platform and underlying parallel programming models. We show how these metrics can be used to measure actually attained performance of Earth system models on different machines, and identify the most fruitful areas of research and development for performance engineering. We present results for these measures for a diverse suite of models from several modeling centers, and propose to use these measures as a basis for a CPMIP, a computational performance model intercomparison project (MIP).

  16. Evaluating true BCI communication rate through mutual information and language models.

    PubMed

    Speier, William; Arnold, Corey; Pouratian, Nader

    2013-01-01

    Brain-computer interface (BCI) systems are a promising means for restoring communication to patients suffering from "locked-in" syndrome. Research to improve system performance primarily focuses on means to overcome the low signal to noise ratio of electroencephalographic (EEG) recordings. However, the literature and methods are difficult to compare due to the array of evaluation metrics and assumptions underlying them, including that: 1) all characters are equally probable, 2) character selection is memoryless, and 3) errors occur completely at random. The standardization of evaluation metrics that more accurately reflect the amount of information contained in BCI language output is critical to make progress. We present a mutual information-based metric that incorporates prior information and a model of systematic errors. The parameters of a system used in one study were re-optimized, showing that the metric used in optimization significantly affects the parameter values chosen and the resulting system performance. The results of 11 BCI communication studies were then evaluated using different metrics, including those previously used in BCI literature and the newly advocated metric. Six studies' results varied based on the metric used for evaluation and the proposed metric produced results that differed from those originally published in two of the studies. Standardizing metrics to accurately reflect the rate of information transmission is critical to properly evaluate and compare BCI communication systems and advance the field in an unbiased manner.
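
    The quantity behind a mutual information-based metric can be illustrated with a minimal sketch that estimates mutual information, in bits per selection, from a confusion matrix of intended versus selected characters. This is the generic definition only; it does not include the language-model prior or the systematic-error model that the authors describe, and the function name is hypothetical.

```python
import math


def mutual_information(counts):
    """Mutual information (bits) between rows (intended) and columns (selected).

    counts[i][j] is how often intended symbol i was decoded as symbol j.
    """
    total = sum(sum(row) for row in counts)
    if total <= 0:
        raise ValueError("confusion matrix must contain observations")
    n_cols = len(counts[0])
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(row[j] for row in counts) / total for j in range(n_cols)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:  # zero cells contribute nothing to the sum
                p = c / total
                mi += p * math.log2(p / (row_p[i] * col_p[j]))
    return mi
```

    Unlike a rate computed from accuracy alone, this estimate uses the full pattern of confusions, so unequal character frequencies and non-random errors change the result.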

  17. Learning Compositional Shape Models of Multiple Distance Metrics by Information Projection.

    PubMed

    Luo, Ping; Lin, Liang; Liu, Xiaobai

    2016-07-01

    This paper presents a novel compositional contour-based shape model by incorporating multiple distance metrics to account for varying shape distortions or deformations. Our approach contains two key steps: 1) contour feature generation and 2) generative model pursuit. For each category, we first densely sample an ensemble of local prototype contour segments from a few positive shape examples and describe each segment using three different types of distance metrics. These metrics are diverse and complementary with each other to capture various shape deformations. We regard the parameterized contour segment plus an additive residual ϵ as a basic subspace, namely, ϵ-ball, in the sense that it represents local shape variance under a certain distance metric. Using these ϵ-balls as features, we then propose a generative learning algorithm to pursue the compositional shape model, which greedily selects the most representative features under the information projection principle. In experiments, we evaluate our model on several public challenging data sets, and demonstrate that the integration of multiple shape distance metrics is capable of dealing with various shape deformations, articulations, and background clutter, hence boosting system performance.

  18. A New Metric for Quantifying Performance Impairment on the Psychomotor Vigilance Test

    DTIC Science & Technology

    2012-01-01

    used the coefficient of determination (R2) and the P-values based on Bartels's test of randomness of the residual error to quantify the goodness-of-fit ...we used the goodness-of-fit between each metric and the corresponding individualized two-process model output (Rajaraman et al., 2008, 2009) to assess...individualized two-process model fits for each of the 12 subjects using the five metrics. The P-values are for Bartels's

  19. Partially supervised speaker clustering.

    PubMed

    Tang, Hao; Chu, Stephen Mingyu; Hasegawa-Johnson, Mark; Huang, Thomas S

    2012-05-01

    Content-based multimedia indexing, retrieval, and processing as well as multimedia databases demand the structuring of the media content (image, audio, video, text, etc.), one significant goal being to associate the identity of the content to the individual segments of the signals. In this paper, we specifically address the problem of speaker clustering, the task of assigning every speech utterance in an audio stream to its speaker. We offer a complete treatment of the idea of partially supervised speaker clustering, which refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. By means of an independent training data set, we encode the prior knowledge at the various stages of the speaker clustering pipeline via 1) learning a speaker-discriminative acoustic feature transformation, 2) learning a universal speaker prior model, and 3) learning a discriminative speaker subspace, or equivalently, a speaker-discriminative distance metric. We study the directional scattering property of the Gaussian mixture model (GMM) mean supervector representation of utterances in the high-dimensional space, and advocate exploiting this property by using the cosine distance metric instead of the Euclidean distance metric for speaker clustering in the GMM mean supervector space. We propose to perform discriminant analysis based on the cosine distance metric, which leads to a novel distance metric learning algorithm—linear spherical discriminant analysis (LSDA). We show that the proposed LSDA formulation can be systematically solved within the elegant graph embedding general dimensionality reduction framework. Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the “bag of acoustic features” representation and statistical model-based distance metrics, 2) our advocated use of the cosine distance metric yields consistent increases in the speaker clustering performance as compared to the commonly used Euclidean distance metric, 3) our partially supervised speaker clustering concept and strategies significantly improve the speaker clustering performance over the baselines, and 4) our proposed LSDA algorithm further leads to state-of-the-art speaker clustering performance.
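
    The difference between the cosine and Euclidean distance metrics contrasted in this record can be seen in a short sketch. The vectors are toy values, not GMM mean supervectors, and the function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance; sensitive to vector length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle; depends on direction only."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)
```

    Two vectors that point the same way but differ in length are at cosine distance zero while their Euclidean distance is positive, which is why a representation whose information lies mainly in direction favors the cosine metric.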

  20. Fusion set selection with surrogate metric in multi-atlas based image segmentation

    NASA Astrophysics Data System (ADS)

    Zhao, Tingting; Ruan, Dan

    2016-02-01

    Multi-atlas based image segmentation sees unprecedented opportunities but also demanding challenges in the big data era. Relevant atlas selection before label fusion plays a crucial role in reducing potential performance loss from heterogeneous data quality and high computation cost from extensive data. This paper starts with investigating the image similarity metric (termed ‘surrogate’), an alternative to the inaccessible geometric agreement metric (termed ‘oracle’) in atlas relevance assessment, and probes into the problem of how to select the ‘most-relevant’ atlases and how many such atlases to incorporate. We propose an inference model to relate the surrogates and the oracle geometric agreement metrics. Based on this model, we quantify the behavior of the surrogates in mimicking oracle metrics for atlas relevance ordering. Finally, analytical insights on the choice of fusion set size are presented from a probabilistic perspective, with the integrated goal of including the most relevant atlases and excluding the irrelevant ones. Empirical evidence and performance assessment are provided based on prostate and corpus callosum segmentation.

  1. Evaluating Modeled Impact Metrics for Human Health, Agriculture Growth, and Near-Term Climate

    NASA Astrophysics Data System (ADS)

    Seltzer, K. M.; Shindell, D. T.; Faluvegi, G.; Murray, L. T.

    2017-12-01

    Simulated metrics that assess impacts on human health, agriculture growth, and near-term climate were evaluated using ground-based and satellite observations. The NASA GISS ModelE2 and GEOS-Chem models were used to simulate the near-present chemistry of the atmosphere. A suite of simulations that varied by model, meteorology, horizontal resolution, emissions inventory, and emissions year was performed, enabling an analysis of metric sensitivities to various model components. All simulations utilized consistent anthropogenic global emissions inventories (ECLIPSE V5a or CEDS), and an evaluation of simulated results was carried out for 2004-2006 and 2009-2011 over the United States and 2014-2015 over China. Results for O3- and PM2.5-based metrics featured minor differences due to the model resolutions considered here (2.0° × 2.5° and 0.5° × 0.666°), whereas model, meteorology, and emissions inventory each played larger roles in variances. Surface metrics related to O3 were consistently biased high, though to varying degrees, demonstrating the need to evaluate particular modeling frameworks before O3 impacts are quantified. Surface metrics related to PM2.5 were diverse, indicating that a multimodel mean with robust results is a valuable tool in predicting PM2.5-related impacts. Oftentimes, the configuration that captured the change of a metric best over time differed from the configuration that captured the magnitude of the same metric best, demonstrating the challenge in skillfully simulating impacts. These results highlight the strengths and weaknesses of these models in simulating impact metrics related to air quality and near-term climate. With such information, the reliability of historical and future simulations can be better understood.

  2. Key metrics for HFIR HEU and LEU models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ilas, Germina; Betzler, Benjamin R.; Chandler, David

    This report compares key metrics for two fuel design models of the High Flux Isotope Reactor (HFIR). The first model represents the highly enriched uranium (HEU) fuel currently in use at HFIR, and the second model considers a low-enriched uranium (LEU) interim design fuel. Except for the fuel region, the two models are consistent, and both include an experiment loading that is representative of HFIR's current operation. The considered key metrics are the neutron flux at the cold source moderator vessel, the mass of 252Cf produced in the flux trap target region as a function of cycle time, the fast neutron flux at locations of interest for material irradiation experiments, and the reactor cycle length. These key metrics are a small subset of the overall HFIR performance and safety metrics. They were defined as a means of capturing data essential for HFIR's primary missions, for use in optimization studies assessing the impact of HFIR's conversion from HEU fuel to different types of LEU fuel designs.

  3. Information-theoretic model comparison unifies saliency metrics

    PubMed Central

    Kümmerer, Matthias; Wallis, Thomas S. A.; Bethge, Matthias

    2015-01-01

    Learning the properties of an image associated with human gaze placement is important both for understanding how biological systems explore the environment and for computer vision applications. There is a large literature on quantitative eye movement models that seeks to predict fixations from images (sometimes termed “saliency” prediction). A major problem known to the field is that existing model comparison metrics give inconsistent results, causing confusion. We argue that the primary reason for these inconsistencies is because different metrics and models use different definitions of what a “saliency map” entails. For example, some metrics expect a model to account for image-independent central fixation bias whereas others will penalize a model that does. Here we bring saliency evaluation into the domain of information by framing fixation prediction models probabilistically and calculating information gain. We jointly optimize the scale, the center bias, and spatial blurring of all models within this framework. Evaluating existing metrics on these rephrased models produces almost perfect agreement in model rankings across the metrics. Model performance is separated from center bias and spatial blurring, avoiding the confounding of these factors in model comparison. We additionally provide a method to show where and how models fail to capture information in the fixations on the pixel level. These methods are readily extended to spatiotemporal models of fixation scanpaths, and we provide a software package to facilitate their use. PMID:26655340
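
The idea of scoring a probabilistic fixation model by information gain can be sketched as a mean log-likelihood difference against a baseline. This is a minimal sketch under stated assumptions: the tiny 2 × 2 densities, the fixation coordinates, and the function name are hypothetical, and this is not the software package the authors mention.

```python
import numpy as np

def information_gain(model_density, baseline_density, fixations):
    """Mean log-likelihood gain of the model over the baseline, in bits per fixation.

    Both densities are 2-D arrays that each sum to 1 over the image;
    fixations is a list of (row, col) pixel coordinates.
    """
    rows = np.array([r for r, _ in fixations])
    cols = np.array([c for _, c in fixations])
    p_model = model_density[rows, cols]
    p_base = baseline_density[rows, cols]
    return float(np.mean(np.log2(p_model) - np.log2(p_base)))

# Hypothetical 2 x 2 "image": a uniform baseline and a model favouring the top-left pixel.
baseline = np.full((2, 2), 0.25)
model = np.array([[0.5, 0.25],
                  [0.125, 0.125]])

gain_when_right = information_gain(model, baseline, [(0, 0), (0, 0)])  # fixations where the model expects them
gain_when_mixed = information_gain(model, baseline, [(0, 0), (1, 1)])  # one hit, one miss
```

A positive value means the model assigns more probability to the observed fixations than the baseline does; a value near zero means it explains them no better than the baseline.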

  4. PV System 'Availability' as a Reliability Metric -- Improving Standards, Contract Language and Performance Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klise, Geoffrey T.; Hill, Roger; Walker, Andy

    The use of the term 'availability' to describe a photovoltaic (PV) system and power plant has been fraught with confusion for many years. A term that is meant to describe equipment operational status is often omitted, misapplied or inaccurately combined with PV performance metrics due to attempts to measure performance and reliability through the lens of traditional power plant language. This paper discusses three areas where current research in standards, contract language and performance modeling is improving the way availability is used with regard to photovoltaic systems and power plants.

  5. Decision-relevant evaluation of climate models: A case study of chill hours in California

    NASA Astrophysics Data System (ADS)

    Jagannathan, K. A.; Jones, A. D.; Kerr, A. C.

    2017-12-01

    The past decade has seen a proliferation of different climate datasets with over 60 climate models currently in use. Comparative evaluation and validation of models can assist practitioners to choose the most appropriate models for adaptation planning. However, such assessments are usually conducted for 'climate metrics' such as seasonal temperature, while sectoral decisions are often based on 'decision-relevant outcome metrics' such as growing degree days or chill hours. Since climate models predict different metrics with varying skill, the goal of this research is to conduct a bottom-up evaluation of model skill for 'outcome-based' metrics. Using chill hours (number of hours in winter months where temperature is less than 45 deg F) in Fresno, CA as a case, we assess how well different GCMs predict the historical mean and slope of chill hours, and whether and to what extent projections differ based on model selection. We then compare our results with other climate-based evaluations of the region, to identify similarities and differences. For the model skill evaluation, historically observed chill hours were compared with simulations from 27 GCMs (and multiple ensembles). Model skill scores were generated based on a statistical hypothesis test of the comparative assessment. Future projections from RCP 8.5 runs were evaluated, and a simple bias correction was also conducted. Our analysis indicates that model skill in predicting chill hour slope is dependent on its skill in predicting mean chill hours, which results from the non-linear nature of the chill metric. However, there was no clear relationship between the models that performed well for the chill hour metric and those that performed well in other temperature-based evaluations (such as winter minimum temperature or diurnal temperature range). 
Further, contrary to conclusions from other studies, we also found that the multi-model mean or large ensemble mean results may not always be most appropriate for this outcome metric. Our assessment sheds light on key differences between global versus local skill, and broad versus specific skill of climate models, highlighting that decision-relevant model evaluation may be crucial for providing practitioners with the best available climate information for their specific needs.
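
The chill-hour definition in this abstract (hours with temperature below 45 deg F) and the two summary statistics it evaluates (historical mean and slope) can be sketched as follows. This is an illustrative sketch only: the data are invented, the function names are assumptions, and the slope is taken here to be an ordinary least-squares linear trend, which the abstract does not specify.

```python
import numpy as np

def chill_hours(hourly_temps_f, threshold_f=45.0):
    """Count hours whose temperature is strictly below the threshold (deg F)."""
    temps = np.asarray(hourly_temps_f, dtype=float)
    return int(np.sum(temps < threshold_f))

def mean_and_slope(years, yearly_chill_hours):
    """Return the mean and the least-squares linear trend (hours per year)."""
    slope, _intercept = np.polyfit(years, yearly_chill_hours, deg=1)
    return float(np.mean(yearly_chill_hours)), float(slope)

# Hypothetical data: six hourly readings, then four made-up winter totals.
sample_hours = [50.0, 44.9, 45.0, 30.2, 47.5, 41.0]
n_chill = chill_hours(sample_hours)  # 44.9, 30.2 and 41.0 fall below the threshold

mean_ch, slope_ch = mean_and_slope([2000, 2001, 2002, 2003],
                                   [1000, 980, 960, 940])
```

Because the count depends on a hard threshold, a small shift in temperature near 45 deg F can change the total noticeably, which is one way to read the abstract's remark about the non-linear nature of the metric.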

  6. 2008 GEM Modeling Challenge: Metrics Study of the Dst Index in Physics-Based Magnetosphere and Ring Current Models and in Statistical and Analytic Specifications

    NASA Technical Reports Server (NTRS)

    Rastaetter, L.; Kuznetsova, M.; Hesse, M.; Pulkkinen, A.; Glocer, A.; Yu, Y.; Meng, X.; Raeder, J.; Wiltberger, M.; Welling, D.

    2011-01-01

    In this paper the metrics-based results of the Dst part of the 2008-2009 GEM Metrics Challenge are reported. The Metrics Challenge asked modelers to submit results for 4 geomagnetic storm events and 5 different types of observations that can be modeled by statistical or climatological or physics-based (e.g. MHD) models of the magnetosphere-ionosphere system. We present the results of over 25 model settings that were run at the Community Coordinated Modeling Center (CCMC) and at the institutions of various modelers for these events. To measure the performance of each of the models against the observations we use comparisons of one-hour averaged model data with the Dst index issued by the World Data Center for Geomagnetism, Kyoto, Japan, and direct comparison of one-minute model data with the one-minute Dst index calculated by the United States Geological Survey (USGS).

  7. Temporal Delineation and Quantification of Short Term Clustered Mining Seismicity

    NASA Astrophysics Data System (ADS)

    Woodward, Kyle; Wesseloo, Johan; Potvin, Yves

    2017-07-01

    The assessment of the temporal characteristics of seismicity is fundamental to understanding and quantifying the seismic hazard associated with mining, the effectiveness of strategies and tactics used to manage seismic hazard, and the relationship between seismicity and changes to the mining environment. This article aims to improve the accuracy and precision with which the temporal dimension of seismic responses can be quantified and delineated. We present a review and discussion on the occurrence of time-dependent mining seismicity with a specific focus on temporal modelling and the modified Omori law (MOL). This forms the basis for the development of a simple weighted metric that allows for the consistent temporal delineation and quantification of a seismic response. The optimisation of this metric allows for the selection of the most appropriate modelling interval given the temporal attributes of time-dependent mining seismicity. We evaluate the performance of the weighted metric for the modelling of a synthetic seismic dataset. This assessment shows that seismic responses can be quantified and delineated by the MOL, with reasonable accuracy and precision, when the modelling is optimised by evaluating the weighted MLE metric. Furthermore, this assessment highlights that decreased weighted MLE metric performance can be expected if there is a lack of contrast between the temporal characteristics of events associated with different processes.
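
For orientation, the modified Omori law referred to in this abstract is commonly written as an event rate that decays with time, and its parameters are commonly fitted by maximum likelihood. The block below gives that standard form and the corresponding log-likelihood for events at times t_i observed in an interval [S, T]. The symbols K, c, p, S, and T follow common usage and are not taken from the abstract, and the weighting that the article applies to the MLE metric is not described there, so it is not reproduced here.

```latex
% Modified Omori law: event rate at time t after the start of the response
\[
  n(t) = \frac{K}{(t + c)^{p}}
\]

% Log-likelihood for N events at times t_1, ..., t_N observed in [S, T], with p \neq 1
\[
  \ln L(K, c, p)
    = N \ln K
    - p \sum_{i=1}^{N} \ln (t_i + c)
    - K \, \frac{(T + c)^{1-p} - (S + c)^{1-p}}{1 - p}
\]
```

The last term is the expected number of events in [S, T], so the choice of modelling interval enters the fit directly through S and T.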

  8. Person Re-Identification via Distance Metric Learning With Latent Variables.

    PubMed

    Sun, Chong; Wang, Dong; Lu, Huchuan

    2017-01-01

    In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning the effective metric matrix, which can be solved in an iterative manner: once latent information is specified, the metric matrix can be obtained based on some typical metric learning methods; with the computed metric matrix, the latent variables can be determined by searching the state space exhaustively. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.

  9. Climate Classification is an Important Factor in Assessing Hospital Performance Metrics

    NASA Astrophysics Data System (ADS)

    Boland, M. R.; Parhi, P.; Gentine, P.; Tatonetti, N. P.

    2017-12-01

    Context/Purpose: Climate is a known modulator of disease, but its impact on hospital performance metrics remains unstudied. Methods: We assess the relationship between Köppen-Geiger climate classification and hospital performance metrics, specifically 30-day mortality, as reported in Hospital Compare, and collected for the period July 2013 through June 2014 (7/1/2013 - 06/30/2014). A hospital-level multivariate linear regression analysis was performed while controlling for known socioeconomic factors to explore the relationship between all-cause mortality and climate. Hospital performance scores were obtained from 4,524 hospitals belonging to 15 distinct Köppen-Geiger climates and 2,373 unique counties. Results: Model results revealed that hospital performance metrics for mortality showed significant climate dependence (p<0.001) after adjusting for socioeconomic factors. Interpretation: Currently, hospitals are reimbursed by Governmental agencies using 30-day mortality rates along with 30-day readmission rates. These metrics allow Government agencies to rank hospitals according to their 'performance' along these metrics. Various socioeconomic factors are taken into consideration when determining an individual hospital's performance. However, no climate-based adjustment is made within the existing framework. Our results indicate that climate-based variability in 30-day mortality rates does exist even after socioeconomic confounder adjustment. Use of standardized high-level climate classification systems (such as Köppen-Geiger) would be useful to incorporate in future metrics. Conclusion: Climate is a significant factor in evaluating hospital 30-day mortality rates. These results demonstrate that climate classification is an important factor when comparing hospital performance across the United States.

  10. Algal bioassessment metrics for wadeable streams and rivers of Maine, USA

    USGS Publications Warehouse

    Danielson, Thomas J.; Loftin, Cynthia S.; Tsomides, Leonidas; DiFranco, Jeanne L.; Connors, Beth

    2011-01-01

    Many state water-quality agencies use biological assessment methods based on lotic fish and macroinvertebrate communities, but relatively few states have incorporated algal multimetric indices into monitoring programs. Algae are good indicators for monitoring water quality because they are sensitive to many environmental stressors. We evaluated benthic algal community attributes along a landuse gradient affecting wadeable streams and rivers in Maine, USA, to identify potential bioassessment metrics. We collected epilithic algal samples from 193 locations across the state. We computed weighted-average optima for common taxa for total P, total N, specific conductance, % impervious cover, and % developed watershed, which included all land use that is no longer forest or wetland. We assigned Maine stream tolerance values and categories (sensitive, intermediate, tolerant) to taxa based on their optima and responses to watershed disturbance. We evaluated performance of algal community metrics used in multimetric indices from other regions and novel metrics based on Maine data. Metrics specific to Maine data, such as the relative richness of species characterized as being sensitive in Maine, were more correlated with % developed watershed than most metrics used in other regions. Few community-structure attributes (e.g., species richness) were useful metrics in Maine. Performance of algal bioassessment models would be improved if metrics were evaluated with attributes of local data before inclusion in multimetric indices or statistical models. © 2011 by The North American Benthological Society.

  11. Combining satellite data and appropriate objective functions for improved spatial pattern performance of a distributed hydrologic model

    NASA Astrophysics Data System (ADS)

    Demirel, Mehmet C.; Mai, Juliane; Mendiguren, Gorka; Koch, Julian; Samaniego, Luis; Stisen, Simon

    2018-02-01

    Satellite-based earth observations offer great opportunities to improve spatial model predictions by means of spatial-pattern-oriented model evaluations. In this study, observed spatial patterns of actual evapotranspiration (AET) are utilised for spatial model calibration tailored to target the pattern performance of the model. The proposed calibration framework combines temporally aggregated observed spatial patterns with a new spatial performance metric and a flexible spatial parameterisation scheme. The mesoscale hydrologic model (mHM) is used to simulate streamflow and AET and has been selected due to its soil parameter distribution approach based on pedo-transfer functions and the built-in multi-scale parameter regionalisation. In addition, two new spatial parameter distribution options have been incorporated in the model in order to increase the flexibility of root fraction coefficient and potential evapotranspiration correction parameterisations, based on soil type and vegetation density. These parameterisations are utilised as they are most relevant for simulated AET patterns from the hydrologic model. Due to the fundamental challenges encountered when evaluating spatial pattern performance using standard metrics, we developed a simple but highly discriminative spatial metric, i.e. one comprised of three easily interpretable components measuring co-location, variation and distribution of the spatial data. The study shows that with flexible spatial model parameterisation used in combination with the appropriate objective functions, the simulated spatial patterns of actual evapotranspiration become substantially more similar to the satellite-based estimates. Overall 26 parameters are identified for calibration through a sequential screening approach based on a combination of streamflow and spatial pattern metrics. 
The robustness of the calibrations is tested using an ensemble of nine calibrations based on different seed numbers using the shuffled complex evolution optimiser. The calibration results reveal a limited trade-off between streamflow dynamics and spatial patterns, illustrating the benefit of combining separate observation types and objective functions. At the same time, compared to traditional streamflow-only calibration, the simulated spatial patterns of AET improved significantly when an objective function based on observed AET patterns and a novel spatial performance metric was included. Since the overall water balance is usually a crucial goal in hydrologic modelling, spatial-pattern-oriented optimisation should always be accompanied by traditional discharge measurements. In such a multi-objective framework, the current study promotes the use of a novel bias-insensitive spatial pattern metric, which exploits the key information contained in the observed patterns while allowing the water balance to be informed by discharge observations.
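
One way a three-component spatial metric of the kind described in this abstract (co-location, variation, distribution) could be assembled is sketched below. This is an assumption-laden illustration, not necessarily the authors' formulation: co-location is taken as the Pearson correlation, variation as the ratio of coefficients of variation, and distribution as the histogram overlap of z-scored values; the function name, the bin count, and the way the components are combined are all hypothetical choices.

```python
import numpy as np

def spatial_pattern_score(obs, sim, bins=10):
    """Combine three pattern components into one score; 1.0 is a perfect match."""
    obs = np.asarray(obs, dtype=float).ravel()
    sim = np.asarray(sim, dtype=float).ravel()

    # Co-location: Pearson correlation between the two patterns.
    alpha = np.corrcoef(obs, sim)[0, 1]

    # Variation: ratio of the coefficients of variation (simulated over observed).
    beta = (sim.std() / sim.mean()) / (obs.std() / obs.mean())

    # Distribution: overlap of the histograms of z-scored values.
    z_obs = (obs - obs.mean()) / obs.std()
    z_sim = (sim - sim.mean()) / sim.std()
    lo = min(z_obs.min(), z_sim.min())
    hi = max(z_obs.max(), z_sim.max())
    h_obs, _ = np.histogram(z_obs, bins=bins, range=(lo, hi))
    h_sim, _ = np.histogram(z_sim, bins=bins, range=(lo, hi))
    gamma = np.minimum(h_obs, h_sim).sum() / h_obs.sum()

    # Euclidean distance of the three components from their ideal value of 1.
    return float(1.0 - np.sqrt((alpha - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2))

# Hypothetical 1-D "pattern" of 50 cells.
observed = np.arange(1.0, 51.0)
score_identical = spatial_pattern_score(observed, observed)
score_reversed = spatial_pattern_score(observed, observed[::-1])
```

In this sketch an identical pattern scores 1, while a spatially reversed pattern with the same values is penalised through the co-location term alone, which shows why a value-distribution check by itself would not be discriminative.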

  12. Validation metrics for turbulent plasma transport

    DOE PAGES

    Holland, C.

    2016-06-22

    Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.

  13. Validation metrics for turbulent plasma transport

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holland, C.

    Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. Furthermore, the utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak, as part of a multi-year transport model validation activity.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, T; Ruan, D

    Purpose: The growing size and heterogeneity of training atlases necessitate sophisticated schemes to identify only the most relevant atlases for the specific multi-atlas-based image segmentation problem. This study aims to develop a model to infer the inaccessible oracle geometric relevance metric from surrogate image similarity metrics, and based on such a model, provide guidance to atlas selection in multi-atlas-based image segmentation. Methods: We relate the oracle geometric relevance metric in label space to the surrogate metric in image space, by a monotonically non-decreasing function with additive random perturbations. Subsequently, a surrogate’s ability to prognosticate the oracle order for atlas subset selection is quantified probabilistically. Finally, important insights and guidance are provided for the design of fusion set size, balancing the competing demands to include the most relevant atlases and to exclude the most irrelevant ones. A systematic solution is derived based on an optimization framework. Model verification and performance assessment are performed based on clinical prostate MR images. Results: The proposed surrogate model was exemplified by a linear map with normally distributed perturbation, and verified with several commonly-used surrogates, including MSD, NCC and (N)MI. The derived behaviors of different surrogates in atlas selection and their corresponding performance in ultimate label estimate were validated. The performance of NCC and (N)MI was similarly superior to MSD, with a 10% higher atlas selection probability and a segmentation performance increase in DSC by 0.10 with the first and third quartiles of (0.83, 0.89), compared to (0.81, 0.89). The derived optimal fusion set size, valued at 7/8/8/7 for MSD/NCC/MI/NMI, agreed well with the appropriate range [4, 9] from empirical observation. 
Conclusion: This work has developed an efficacious probabilistic model to characterize the image-based surrogate metric on atlas selection. Analytical insights lead to valid guiding principles on fusion set size design.
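
The distinction this abstract draws between surrogate metrics in image space and the oracle metric in label space can be made concrete with a small sketch. MSD, NCC, and DSC are expanded here in their usual senses (mean squared difference, normalized cross-correlation, Dice similarity coefficient), which is an assumption since the abstract does not spell them out; the arrays and function names are hypothetical.

```python
import numpy as np

def msd(a, b):
    """Mean squared difference between two images (lower means more similar)."""
    return float(np.mean((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation between two images (higher means more similar)."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float(np.sum(a0 * b0) / np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2)))

def dice(x, y):
    """Dice similarity coefficient between two binary label masks."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    return float(2.0 * np.logical_and(x, y).sum() / (x.sum() + y.sum()))

# Hypothetical target image and an atlas that is an intensity-rescaled copy of it.
target = np.array([[1.0, 2.0], [3.0, 5.0]])
atlas = 2.0 * target + 1.0

surrogate_msd = msd(target, atlas)   # penalises the intensity change
surrogate_ncc = ncc(target, atlas)   # unaffected by linear intensity rescaling

# Hypothetical label masks: the oracle agreement needs labels for the target,
# which are unavailable at selection time in practice.
oracle_dsc = dice([1, 1, 0, 0], [1, 0, 0, 0])
```

The rescaled atlas looks dissimilar under MSD yet perfectly similar under NCC, a reminder that different surrogates can order the same atlases differently.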

  15. Evaluating ET estimates from the Simplified Surface Energy Balance (SSEB) model using METRIC model output

    NASA Astrophysics Data System (ADS)

    Senay, G. B.; Budde, M. E.; Allen, R. G.; Verdin, J. P.

    2008-12-01

    Evapotranspiration (ET) is an important component of the hydrologic budget because it expresses the exchange of mass and energy between the soil-water-vegetation system and the atmosphere. Since direct measurement of ET is difficult, various modeling methods are used to estimate actual ET (ETa). Generally, the choice of method for ET estimation depends on the objective of the study and is further limited by the availability of data and desired accuracy of the ET estimate. Operational monitoring of crop performance requires processing large data sets and a quick response time. A Simplified Surface Energy Balance (SSEB) model was developed by the U.S. Geological Survey's Famine Early Warning Systems Network to estimate irrigation water use in remote places of the world. In this study, we evaluated the performance of the SSEB model with the METRIC (Mapping Evapotranspiration at high Resolution and with Internalized Calibration) model that has been evaluated by several researchers using lysimeter data. The METRIC model has been proven to provide reliable ET estimates in different regions of the world. Reference ET fractions of both models (ETrF of METRIC vs. ETf of SSEB) were generated and compared using individual Landsat thermal images collected from 2000 through 2005 in Idaho, New Mexico, and California. In addition, the models were compared using monthly and seasonal total ETa estimates. The SSEB model reproduced both the spatial and temporal variability exhibited by METRIC on land surfaces, explaining up to 80 percent of the spatial variability. However, the ETa estimates over water bodies were systematically higher in the SSEB output, which could be improved by using a correction coefficient to take into account the absorption of solar energy by deeper water layers that has little contribution to the ET process. 
This study demonstrated the usefulness of the SSEB method for large-scale agro-hydrologic applications for operational monitoring and assessing of crop performance and regional water balance dynamics.

  16. Geospace environment modeling 2008--2009 challenge: Dst index

    USGS Publications Warehouse

    Rastätter, L.; Kuznetsova, M.M.; Glocer, A.; Welling, D.; Meng, X.; Raeder, J.; Wittberger, M.; Jordanova, V.K.; Yu, Y.; Zaharia, S.; Weigel, R.S.; Sazykin, S.; Boynton, R.; Wei, H.; Eccles, V.; Horton, W.; Mays, M.L.; Gannon, J.

    2013-01-01

    This paper reports the metrics-based results of the Dst index part of the 2008–2009 GEM Metrics Challenge. The 2008–2009 GEM Metrics Challenge asked modelers to submit results for four geomagnetic storm events and five different types of observations that can be modeled by statistical, climatological or physics-based models of the magnetosphere-ionosphere system. We present the results of 30 model settings that were run at the Community Coordinated Modeling Center and at the institutions of various modelers for these events. To measure the performance of each of the models against the observations, we use comparisons of 1 hour averaged model data with the Dst index issued by the World Data Center for Geomagnetism, Kyoto, Japan, and direct comparison of 1 minute model data with the 1 minute Dst index calculated by the United States Geological Survey. The latter index can be used to calculate spectral variability of model outputs in comparison to the index. We find that model rankings vary widely by skill score used. None of the models consistently perform best for all events. We find that empirical models perform well in general. Magnetohydrodynamics-based models of the global magnetosphere with inner magnetosphere physics (ring current model) included and stand-alone ring current models with properly defined boundary conditions perform well and are able to match or surpass results from empirical models. Unlike in similar studies, the statistical models used in this study found their challenge in the weakest events rather than the strongest events.

  17. New VHP-Female v. 2.0 full-body computational phantom and its performance metrics using FEM simulator ANSYS HFSS.

    PubMed

    Yanamadala, Janakinadh; Noetscher, Gregory M; Rathi, Vishal K; Maliye, Saili; Win, Htay A; Tran, Anh L; Jackson, Xavier J; Htet, Aung T; Kozlov, Mikhail; Nazarian, Ara; Louie, Sara; Makarov, Sergey N

    2015-01-01

    Simulation of the electromagnetic response of the human body relies heavily upon efficient computational models or phantoms. The first objective of this paper is to present a new platform-independent full-body electromagnetic computational model (computational phantom), the Visible Human Project® (VHP)-Female v. 2.0 and to describe its distinct features. The second objective is to report phantom simulation performance metrics using the commercial FEM electromagnetic solver ANSYS HFSS.

  18. High fidelity quasi steady-state aerodynamic model effects on race vehicle performance predictions using multi-body simulation

    NASA Astrophysics Data System (ADS)

    Mohrfeld-Halterman, J. A.; Uddin, M.

    2016-07-01

    We describe in this paper the development of a high fidelity vehicle aerodynamic model to fit wind tunnel test data over a wide range of vehicle orientations. We also present a comparison between the effects of this proposed model and a conventional quasi steady-state aerodynamic model on race vehicle simulation results. This is done by implementing both of these models independently in multi-body quasi steady-state simulations to determine the effects of the high fidelity aerodynamic model on race vehicle performance metrics. The quasi steady-state vehicle simulation is developed with a multi-body NASCAR Truck vehicle model, and simulations are conducted for three different types of NASCAR race tracks: a short track, a one and a half mile intermediate track, and a higher speed, two mile intermediate race track. For each track simulation, the effects of the aerodynamic model on handling, maximum corner speed, and drive force metrics are analysed. The high fidelity model is shown to reduce the aerodynamic model error relative to the conventional aerodynamic model, and its increased accuracy is found to have realisable effects on the performance metric predictions on the intermediate tracks resulting from the quasi steady-state simulation.

  19. Performance of five surface energy balance models for estimating daily evapotranspiration in high biomass sorghum

    NASA Astrophysics Data System (ADS)

    Wagle, Pradeep; Bhattarai, Nishan; Gowda, Prasanna H.; Kakani, Vijaya G.

    2017-06-01

    Robust evapotranspiration (ET) models are required to predict water usage in a variety of terrestrial ecosystems under different geographical and agrometeorological conditions. As a result, several remote sensing-based surface energy balance (SEB) models have been developed to estimate ET over large regions. However, comparison of the performance of several SEB models at the same site is limited. In addition, none of the SEB models have been evaluated for their ability to predict ET in rain-fed high biomass sorghum grown for biofuel production. In this paper, we evaluated the performance of five widely used single-source SEB models, namely Surface Energy Balance Algorithm for Land (SEBAL), Mapping ET with Internalized Calibration (METRIC), Surface Energy Balance System (SEBS), Simplified Surface Energy Balance Index (S-SEBI), and operational Simplified Surface Energy Balance (SSEBop), for estimating ET over a high biomass sorghum field during the 2012 and 2013 growing seasons. The predicted ET values were compared against eddy covariance (EC) measured ET (ETEC) for 19 cloud-free Landsat images. In general, S-SEBI, SEBAL, and SEBS performed reasonably well for the study period, while METRIC and SSEBop performed poorly. All SEB models substantially overestimated ET under extremely dry conditions as they underestimated sensible heat (H) and overestimated latent heat (LE) fluxes under dry conditions during the partitioning of available energy. METRIC, SEBAL, and SEBS overestimated LE regardless of wet or dry periods. Consequently, predicted seasonal cumulative ET by METRIC, SEBAL, and SEBS were higher than seasonal cumulative ETEC in both seasons. In contrast, S-SEBI and SSEBop substantially underestimated ET under excessively wet conditions, and predicted seasonal cumulative ET by S-SEBI and SSEBop were lower than seasonal cumulative ETEC in the relatively wetter 2013 growing season.
    Our results indicate that SEB models need a soil moisture or plant water stress component to improve their performance, especially under excessively dry or wet conditions.

  20. Stream macroinvertebrate response models for bioassessment metrics: addressing the issue of spatial scale

    USGS Publications Warehouse

    Waite, Ian R.; Kennen, Jonathan G.; May, Jason T.; Brown, Larry R.; Cuffney, Thomas F.; Jones, Kimberly A.; Orlando, James L.

    2014-01-01

    We developed independent predictive disturbance models for a full regional data set and four individual ecoregions (Full Region vs. Individual Ecoregion models) to evaluate effects of spatial scale on the assessment of human landscape modification, on predicted response of stream biota, and the effect of other possible confounding factors, such as watershed size and elevation, on model performance. We selected macroinvertebrate sampling sites for model development (n = 591) and validation (n = 467) that met strict screening criteria from four proximal ecoregions in the northeastern U.S.: North Central Appalachians, Ridge and Valley, Northeastern Highlands, and Northern Piedmont. Models were developed using boosted regression tree (BRT) techniques for four macroinvertebrate metrics; results were compared among ecoregions and metrics. Comparing within a region but across the four macroinvertebrate metrics, the average richness of tolerant taxa (RichTOL) had the highest R2 for BRT models. Across the four metrics, final BRT models had between four and seven explanatory variables and always included a variable related to urbanization (e.g., population density, percent urban, or percent manmade channels), and either a measure of hydrologic runoff (e.g., minimum April, average December, or maximum monthly runoff) and(or) a natural landscape factor (e.g., riparian slope, precipitation, and elevation), or a measure of riparian disturbance. Contrary to our expectations, Full Region models explained nearly as much variance in the macroinvertebrate data as Individual Ecoregion models, and taking into account watershed size or elevation did not appear to improve model performance. As a result, it may be advantageous for bioassessment programs to develop large regional models as a preliminary assessment of overall disturbance conditions as long as the range in natural landscape variability is not excessive.

  1. Stream Macroinvertebrate Response Models for Bioassessment Metrics: Addressing the Issue of Spatial Scale

    PubMed Central

    Waite, Ian R.; Kennen, Jonathan G.; May, Jason T.; Brown, Larry R.; Cuffney, Thomas F.; Jones, Kimberly A.; Orlando, James L.

    2014-01-01

    We developed independent predictive disturbance models for a full regional data set and four individual ecoregions (Full Region vs. Individual Ecoregion models) to evaluate effects of spatial scale on the assessment of human landscape modification, on predicted response of stream biota, and the effect of other possible confounding factors, such as watershed size and elevation, on model performance. We selected macroinvertebrate sampling sites for model development (n = 591) and validation (n = 467) that met strict screening criteria from four proximal ecoregions in the northeastern U.S.: North Central Appalachians, Ridge and Valley, Northeastern Highlands, and Northern Piedmont. Models were developed using boosted regression tree (BRT) techniques for four macroinvertebrate metrics; results were compared among ecoregions and metrics. Comparing within a region but across the four macroinvertebrate metrics, the average richness of tolerant taxa (RichTOL) had the highest R2 for BRT models. Across the four metrics, final BRT models had between four and seven explanatory variables and always included a variable related to urbanization (e.g., population density, percent urban, or percent manmade channels), and either a measure of hydrologic runoff (e.g., minimum April, average December, or maximum monthly runoff) and(or) a natural landscape factor (e.g., riparian slope, precipitation, and elevation), or a measure of riparian disturbance. Contrary to our expectations, Full Region models explained nearly as much variance in the macroinvertebrate data as Individual Ecoregion models, and taking into account watershed size or elevation did not appear to improve model performance. As a result, it may be advantageous for bioassessment programs to develop large regional models as a preliminary assessment of overall disturbance conditions as long as the range in natural landscape variability is not excessive. PMID:24675770

  2. Towards a Visual Quality Metric for Digital Video

    NASA Technical Reports Server (NTRS)

    Watson, Andrew B.

    1998-01-01

    The advent of widespread distribution of digital video creates a need for automated methods for evaluating visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. The challenge of video quality metrics is to extend these simplified models to temporal signals as well. In this presentation I will discuss a number of the issues that must be resolved in the design of effective video quality metrics. Among these are spatial, temporal, and chromatic sensitivity and their interactions, visual masking, and implementation complexity. I will also touch on the question of how to evaluate the performance of these metrics.

  3. When to Make Mountains out of Molehills: The Pros and Cons of Simple and Complex Model Calibration Procedures

    NASA Astrophysics Data System (ADS)

    Smith, K. A.; Barker, L. J.; Harrigan, S.; Prudhomme, C.; Hannaford, J.; Tanguy, M.; Parry, S.

    2017-12-01

    Earth and environmental models are relied upon to investigate system responses that cannot otherwise be examined. In simulating physical processes, models have adjustable parameters which may, or may not, have a physical meaning. Determining the values to assign to these model parameters is an enduring challenge for earth and environmental modellers. Selecting different error metrics by which the model results are compared to observations will lead to different sets of calibrated model parameters, and thus different model results. Furthermore, models may exhibit 'equifinal' behaviour, where multiple combinations of model parameters lead to equally acceptable model performance against observations. These decisions in model calibration introduce uncertainty that must be considered when model results are used to inform environmental decision-making. This presentation focusses on the uncertainties that derive from the calibration of a four-parameter lumped catchment hydrological model (GR4J). The GR models contain an inbuilt automatic calibration algorithm that can satisfactorily calibrate against four error metrics in only a few seconds. However, a single, deterministic model result does not provide information on parameter uncertainty. Furthermore, a modeller interested in extreme events, such as droughts, may wish to calibrate against error metrics that are more specific to low flows. In a comprehensive assessment, the GR4J model has been run with 500,000 Latin Hypercube Sampled parameter sets across 303 catchments in the United Kingdom. These parameter sets have been assessed against six error metrics, including two drought-specific metrics. This presentation compares the two approaches, and demonstrates that the inbuilt automatic calibration can outperform the Latin Hypercube experiment approach in single-metric assessed performance.
    However, it is also shown that there are many merits of the more comprehensive assessment, which allows for probabilistic model results, multi-objective optimisation, and better tailoring to calibrate the model for specific applications such as drought event characterisation. Modellers and decision-makers may be constrained in their choice of calibration method, so it is important that they recognise the strengths and limitations of their chosen approach.
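
    The Latin Hypercube sampling mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name latin_hypercube, the seed argument, and the parameter bounds in the usage note are hypothetical and are not the real GR4J parameter ranges. The idea shown is only the stratification step: each parameter range is cut into as many equal strata as there are samples, one value is drawn per stratum, and the strata are shuffled independently per parameter.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples parameter sets with Latin Hypercube stratification.

    bounds is a list of (low, high) pairs, one per parameter.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of the n_samples equal-width strata.
        unit_points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(unit_points)
        columns.append([low + p * (high - low) for p in unit_points])
    # Transpose the per-parameter columns into per-sample tuples.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]
```

    Calling latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)]) would return five two-parameter sets in which every fifth of each range is sampled exactly once. Each set could then be passed to a model run and scored against whichever error metrics are of interest.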

  4. A probability metric for identifying high-performing facilities: an application for pay-for-performance programs.

    PubMed

    Shwartz, Michael; Peköz, Erol A; Burgess, James F; Christiansen, Cindy L; Rosen, Amy K; Berlowitz, Dan

    2014-12-01

    Two approaches are commonly used for identifying high-performing facilities on a performance measure: one, that the facility is in a top quantile (eg, quintile or quartile); and two, that a confidence interval is below (or above) the average of the measure for all facilities. This type of yes/no designation often does not do well in distinguishing high-performing from average-performing facilities. To illustrate an alternative continuous-valued metric for profiling facilities--the probability a facility is in a top quantile--and show the implications of using this metric for profiling and pay-for-performance. We created a composite measure of quality from fiscal year 2007 data based on 28 quality indicators from 112 Veterans Health Administration nursing homes. A Bayesian hierarchical multivariate normal-binomial model was used to estimate shrunken rates of the 28 quality indicators, which were combined into a composite measure using opportunity-based weights. Rates were estimated using Markov Chain Monte Carlo methods as implemented in WinBUGS. The probability metric was calculated from the simulation replications. Our probability metric allowed better discrimination of high performers than the point or interval estimate of the composite score. In a pay-for-performance program, a smaller top quantile (eg, a quintile) resulted in more resources being allocated to the highest performers, whereas a larger top quantile (eg, being above the median) distinguished less among high performers and allocated more resources to average performers. The probability metric has potential but needs to be evaluated by stakeholders in different types of delivery systems.
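
    The probability metric described in this abstract (the probability that a facility is in a top quantile, computed from simulation replications) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' WinBUGS workflow: the function name top_quantile_probability is hypothetical, each replication is assumed to be a list holding one composite score per facility, and a higher score is assumed to mean better performance.

```python
def top_quantile_probability(draws, top_fraction=0.2):
    """Estimate, per facility, the probability of ranking in the top quantile.

    draws is a list of simulation replications; each replication is a list
    with one composite score per facility (higher is assumed to be better).
    top_fraction=0.2 corresponds to the top quintile.
    """
    n_facilities = len(draws[0])
    n_top = max(1, round(n_facilities * top_fraction))
    counts = [0] * n_facilities
    for scores in draws:
        # Rank facilities within this single replication.
        ranked = sorted(range(n_facilities), key=lambda i: scores[i], reverse=True)
        for i in ranked[:n_top]:
            counts[i] += 1
    # Fraction of replications in which each facility made the top group.
    return [c / len(draws) for c in counts]
```

    Unlike a yes/no designation, the result is continuous: a facility that lands in the top quintile in 90% of replications is distinguished from one that does so in 55%, even if both would be labelled high performers by a point estimate.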

  5. The relationship between organ dose and patient size in tube current modulated adult thoracic CT scans

    NASA Astrophysics Data System (ADS)

    Khatonabadi, Maryam; Zhang, Di; Yang, Jeffrey; DeMarco, John J.; Cagnon, Chris C.; McNitt-Gray, Michael F.

    2012-03-01

    The recently published AAPM Task Group 204 report developed conversion coefficients that use scanner-reported CTDIvol to estimate dose to the center of a patient undergoing a fixed tube current body exam. However, most CT exams use tube current modulation (TCM) to reduce dose to patients. Therefore, the purpose of this study was to investigate the correlation between organ dose and a variety of patient size metrics in adult chest CT scans that use TCM. Monte Carlo simulations were performed for 32 voxelized models with contoured lungs and glandular breast tissue, consisting of females and males. These simulations made use of each patient's actual TCM data to estimate organ dose. Using image data, different size metrics were calculated; all measurements were performed on one slice, at the level of the patient's nipple. Estimated doses were normalized by scanner-reported CTDIvol and plotted versus different metrics. CTDIvol values were plotted versus different metrics to look at the scanner's output versus size. The metrics performed similarly in terms of correlating with organ dose. Looking at each gender separately, for male models normalized lung dose showed a better linear correlation (r2=0.91) with effective diameter, while female models showed higher correlation (r2=0.59) with the anterior-posterior measurement. There was essentially no correlation observed between size and CTDIvol-normalized breast dose. However, a linear relationship was observed between absolute breast dose and size. Doses to the lungs and breasts were consistently higher in females than in males of similar size, which could be due to shape and composition differences between genders in the thoracic region.

  6. Performance Benchmarks for Scholarly Metrics Associated with Fisheries and Wildlife Faculty

    PubMed Central

    Swihart, Robert K.; Sundaram, Mekala; Höök, Tomas O.; DeWoody, J. Andrew; Kellner, Kenneth F.

    2016-01-01

    Research productivity and impact are often considered in professional evaluations of academics, and performance metrics based on publications and citations increasingly are used in such evaluations. To promote evidence-based and informed use of these metrics, we collected publication and citation data for 437 tenure-track faculty members at 33 research-extensive universities in the United States belonging to the National Association of University Fisheries and Wildlife Programs. For each faculty member, we computed 8 commonly used performance metrics based on numbers of publications and citations, and recorded covariates including academic age (time since Ph.D.), sex, percentage of appointment devoted to research, and the sub-disciplinary research focus. Standardized deviance residuals from regression models were used to compare faculty after accounting for variation in performance due to these covariates. We also aggregated residuals to enable comparison across universities. Finally, we tested for temporal trends in citation practices to assess whether the “law of constant ratios”, used to enable comparison of performance metrics between disciplines that differ in citation and publication practices, applied to fisheries and wildlife sub-disciplines when mapped to Web of Science Journal Citation Report categories. Our regression models reduced deviance by ¼ to ½. Standardized residuals for each faculty member, when combined across metrics as a simple average or weighted via factor analysis, produced similar results in terms of performance based on percentile rankings. Significant variation was observed in scholarly performance across universities, after accounting for the influence of covariates. In contrast to findings for other disciplines, normalized citation ratios for fisheries and wildlife sub-disciplines increased across years. Increases were comparable for all sub-disciplines except ecology. 
    We discuss the advantages and limitations of our methods, illustrate their use when applied to new data, and suggest future improvements. Our benchmarking approach may provide a useful tool to augment detailed, qualitative assessment of performance. PMID:27152838

  7. Performance Benchmarks for Scholarly Metrics Associated with Fisheries and Wildlife Faculty.

    PubMed

    Swihart, Robert K; Sundaram, Mekala; Höök, Tomas O; DeWoody, J Andrew; Kellner, Kenneth F

    2016-01-01

    Research productivity and impact are often considered in professional evaluations of academics, and performance metrics based on publications and citations increasingly are used in such evaluations. To promote evidence-based and informed use of these metrics, we collected publication and citation data for 437 tenure-track faculty members at 33 research-extensive universities in the United States belonging to the National Association of University Fisheries and Wildlife Programs. For each faculty member, we computed 8 commonly used performance metrics based on numbers of publications and citations, and recorded covariates including academic age (time since Ph.D.), sex, percentage of appointment devoted to research, and the sub-disciplinary research focus. Standardized deviance residuals from regression models were used to compare faculty after accounting for variation in performance due to these covariates. We also aggregated residuals to enable comparison across universities. Finally, we tested for temporal trends in citation practices to assess whether the "law of constant ratios", used to enable comparison of performance metrics between disciplines that differ in citation and publication practices, applied to fisheries and wildlife sub-disciplines when mapped to Web of Science Journal Citation Report categories. Our regression models reduced deviance by ¼ to ½. Standardized residuals for each faculty member, when combined across metrics as a simple average or weighted via factor analysis, produced similar results in terms of performance based on percentile rankings. Significant variation was observed in scholarly performance across universities, after accounting for the influence of covariates. In contrast to findings for other disciplines, normalized citation ratios for fisheries and wildlife sub-disciplines increased across years. Increases were comparable for all sub-disciplines except ecology. 
    We discuss the advantages and limitations of our methods, illustrate their use when applied to new data, and suggest future improvements. Our benchmarking approach may provide a useful tool to augment detailed, qualitative assessment of performance.

  8. Developing Metrics in Systems Integration (ISS Program COTS Integration Model)

    NASA Technical Reports Server (NTRS)

    Lueders, Kathryn

    2007-01-01

    This viewgraph presentation reviews some of the complications in developing metrics for systems integration. Specifically it reviews a case study of how two programs within NASA try to develop and measure performance while meeting the encompassing organizational goals.

  9. A priori discretization error metrics for distributed hydrologic modeling applications

    NASA Astrophysics Data System (ADS)

    Liu, Hongli; Tolson, Bryan A.; Craig, James R.; Shafii, Mahyar

    2016-12-01

    Watershed spatial discretization is an important step in developing a distributed hydrologic model. A key difficulty in the spatial discretization process is maintaining a balance between the aggregation-induced information loss and the increase in computational burden caused by the inclusion of additional computational units. Objective identification of an appropriate discretization scheme still remains a challenge, in part because of the lack of quantitative measures for assessing discretization quality, particularly prior to simulation. This study proposes a priori discretization error metrics to quantify the information loss of any candidate discretization scheme without having to run and calibrate a hydrologic model. These error metrics are applicable to multi-variable and multi-site discretization evaluation and provide directly interpretable information to the hydrologic modeler about discretization quality. The first metric, a subbasin error metric, quantifies the routing information loss from discretization, and the second, a hydrological response unit (HRU) error metric, improves upon existing a priori metrics by quantifying the information loss due to changes in land cover or soil type property aggregation. The metrics are straightforward to understand and easy to recode. Informed by the error metrics, a two-step discretization decision-making approach is proposed with the advantage of reducing extreme errors and meeting the user-specified discretization error targets. The metrics and decision-making approach are applied to the discretization of the Grand River watershed in Ontario, Canada. Results show that information loss increases as discretization gets coarser. Moreover, results help to explain the modeling difficulties associated with smaller upstream subbasins since the worst discretization errors and highest error variability appear in smaller upstream areas instead of larger downstream drainage areas. 
    Hydrologic modeling experiments under candidate discretization schemes validate the strong correlation between the proposed discretization error metrics and hydrologic simulation responses. Discretization decision-making results show that the common and convenient approach of making uniform discretization decisions across the watershed performs worse than the proposed non-uniform discretization approach in terms of preserving spatial heterogeneity under the same computational cost.

  10. Detection and quantification of flow consistency in business process models.

    PubMed

    Burattin, Andrea; Bernstein, Vered; Neurauter, Manuel; Soffer, Pnina; Weber, Barbara

    2018-01-01

    Business process models abstract complex business processes by representing them as graphical models. Their layout, as determined by the modeler, may have an effect when these models are used. However, this effect is currently not fully understood. In order to systematically study this effect, a basic set of measurable key visual features is proposed, depicting the layout properties that are meaningful to the human user. The aim of this research is thus twofold: first, to empirically identify key visual features of business process models which are perceived as meaningful to the user and second, to show how such features can be quantified into computational metrics, which are applicable to business process models. We focus on one particular feature, consistency of flow direction, and show the challenges that arise when transforming it into a precise metric. We propose three different metrics addressing these challenges, each following a different view of flow consistency. We then report the results of an empirical evaluation, which indicates which metric is more effective in predicting the human perception of this feature. Moreover, two other automatic evaluations describing the performance and the computational capabilities of our metrics are reported as well.

  11. Performance metrics for the assessment of satellite data products: an ocean color case study

    PubMed Central

    Seegers, Bridget N.; Stumpf, Richard P.; Schaeffer, Blake A.; Loftin, Keith A.; Werdell, P. Jeremy

    2018-01-01

    Performance assessment of ocean color satellite data has generally relied on statistical metrics chosen for their common usage and the rationale for selecting certain metrics is infrequently explained. Commonly reported statistics based on mean squared errors, such as the coefficient of determination (r2), root mean square error, and regression slopes, are most appropriate for Gaussian distributions without outliers and, therefore, are often not ideal for ocean color algorithm performance assessment, which is often limited by sample availability. In contrast, metrics based on simple deviations, such as bias and mean absolute error, as well as pair-wise comparisons, often provide more robust and straightforward quantities for evaluating ocean color algorithms with non-Gaussian distributions and outliers. This study uses a SeaWiFS chlorophyll-a validation data set to demonstrate a framework for satellite data product assessment and recommends a multi-metric and user-dependent approach that can be applied within science, modeling, and resource management communities. PMID:29609296
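
    The contrast this abstract draws between squared-error statistics and simple-deviation statistics can be seen in a short Python sketch. The function names are hypothetical, and the use of plain untransformed differences is an assumption made for illustration; the abstract does not give the exact formulation the authors recommend.

```python
import math


def bias(predicted, observed):
    """Mean signed difference (predicted minus observed)."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


def mean_absolute_error(predicted, observed):
    """Mean of the absolute differences; each error counts linearly."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def rmse(predicted, observed):
    """Root mean square error; large errors are weighted by their square."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))
```

    With observed = [1.0, 2.0, 3.0, 4.0] and predicted = [1.0, 2.0, 3.0, 14.0], a single outlier of 10 gives a mean absolute error of 2.5 but a root mean square error of 5.0, which shows how one extreme match-up can dominate a squared-error statistic in a small validation set.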

  12. Are Current Physical Match Performance Metrics in Elite Soccer Fit for Purpose or is the Adoption of an Integrated Approach Needed?

    PubMed

    Bradley, Paul S; Ade, Jack D

    2018-01-18

    Time-motion analysis is a valuable data-collection technique used to quantify the physical match performance of elite soccer players. For over 40 years researchers have adopted a 'traditional' approach when evaluating match demands by simply reporting the distance covered or time spent along a motion continuum of walking through to sprinting. This methodology quantifies physical metrics in isolation without integrating other factors and this ultimately leads to a one-dimensional insight into match performance. Thus, this commentary proposes a novel 'integrated' approach that focuses on a sensitive physical metric such as high-intensity running but contextualizes this in relation to key tactical activities for each position and collectively for the team. In the example presented, the 'integrated' model clearly unveils the unique high-intensity profile that exists due to distinct tactical roles, rather than one-dimensional 'blind' distances produced by 'traditional' models. Intuitively this innovative concept may aid coaches' understanding of physical performance in relation to the tactical roles and instructions given to the players. Additionally, it will enable practitioners to more effectively translate match metrics into training and testing protocols. This innovative model may well aid advances in other team sports that incorporate similar intermittent movements with tactical purpose. Evidence of the merits and application of this new concept is needed before the scientific community accepts this model as it may well add complexity to an area that conceivably needs simplicity.

  13. CEDAR Electrodynamics Thermosphere Ionosphere (ETI) Challenge for Systematic Assessment of Ionosphere/Thermosphere Models: NmF2, hmF2, and Vertical Drift Using Ground-Based Observations

    NASA Technical Reports Server (NTRS)

    Shim, J. S.; Kuznetsova, M.; Rastatter, L.; Hesse, M.; Bilitza, D.; Butala, M.; Codrescu, M.; Emery, B.; Foster, B.; Fuller-Rowell, T.; et al.

    2011-01-01

    Objective quantification of model performance based on metrics helps us evaluate the current state of space physics modeling capability, address differences among various modeling approaches, and track model improvements over time. The Coupling, Energetics, and Dynamics of Atmospheric Regions (CEDAR) Electrodynamics Thermosphere Ionosphere (ETI) Challenge was initiated in 2009 to assess accuracy of various ionosphere/thermosphere models in reproducing ionosphere and thermosphere parameters. A total of nine events and five physical parameters were selected to compare between model outputs and observations. The nine events included two strong and one moderate geomagnetic storm events from GEM Challenge events and three moderate storms and three quiet periods from the first half of the International Polar Year (IPY) campaign, which lasted for 2 years, from March 2007 to March 2009. The five physical parameters selected were NmF2 and hmF2 from ISRs and LEO satellites such as CHAMP and COSMIC, vertical drifts at Jicamarca, and electron and neutral densities along the track of the CHAMP satellite. For this study, four different metrics and up to 10 models were used. In this paper, we focus on preliminary results of the study using ground-based measurements, which include NmF2 and hmF2 from Incoherent Scatter Radars (ISRs), and vertical drifts at Jicamarca. The results show that the model performance strongly depends on the type of metrics used, and thus no model is ranked top for all used metrics. The analysis further indicates that performance of the model also varies with latitude and geomagnetic activity level.

  14. Climate Data Analytics Workflow Management

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Lee, S.; Pan, L.; Mattmann, C. A.; Lee, T. J.

    2016-12-01

    In this project we aim to pave a novel path to create a sustainable building block toward Earth science big data analytics and knowledge sharing. Closely studying how Earth scientists conduct data analytics research in their daily work, we have developed a provenance model to record their activities and a technology to automatically generate workflows for scientists from the provenance. On top of this, we have built the prototype of a data-centric provenance repository and established a PDSW (People, Data, Service, Workflow) knowledge network to support workflow recommendation. To ensure the scalability and performance of the expected recommendation system, we have leveraged the Apache OODT system technology. The community-approved, metrics-based performance evaluation web service will allow a user to select a metric from a list of several community-approved metrics and to evaluate model performance using the metric as well as the reference dataset. This service will facilitate the use of reference datasets that are generated in support of model-data intercomparison projects such as Obs4MIPs and Ana4MIPs. The data-centric repository infrastructure will allow us to capture richer provenance to further facilitate knowledge sharing and scientific collaboration in the Earth science community. This project is part of the Apache incubator CMDA project.

  15. Validation metrics for turbulent plasma transport

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holland, C., E-mail: chholland@ucsd.edu

    Developing accurate models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. In modern computer science and engineering, formal verification and validation processes are used to assess model accuracy and establish confidence in the predictive capabilities of a given model. This paper provides an overview of the key guiding principles and best practices for the development of validation metrics, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is given to the importance of uncertainty quantification and its inclusion within the metrics, and the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment. As a starting point, the structure of commonly used global transport model metrics and their limitations is reviewed. An alternate approach is then presented, which focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against observation. The utility of metrics based upon these comparisons is demonstrated by applying them to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D tokamak [J. L. Luxon, Nucl. Fusion 42, 614 (2002)], as part of a multi-year transport model validation activity.

  16. Improved Mental Acuity Forecasting with an Individualized Quantitative Sleep Model.

    PubMed

    Winslow, Brent D; Nguyen, Nam; Venta, Kimberly E

    2017-01-01

    Sleep impairment significantly alters human brain structure and cognitive function, but available evidence suggests that adults in developed nations are sleeping less. A growing body of research has sought to use sleep to forecast cognitive performance by modeling the relationship between the two, but has generally focused on vigilance rather than other cognitive constructs affected by sleep, such as reaction time, executive function, and working memory. Previous modeling efforts have also utilized subjective, self-reported sleep durations and were restricted to laboratory environments. In the current effort, we addressed these limitations by employing wearable systems and mobile applications to gather objective sleep information, assess multi-construct cognitive performance, and model/predict changes to mental acuity. Thirty participants were recruited for participation in the study, which lasted 1 week. Using the Fitbit Charge HR and a mobile version of the automated neuropsychological assessment metric called CogGauge, we gathered a series of features and utilized the unified model of performance to predict mental acuity based on sleep records. Our results suggest that individuals poorly rate their sleep duration, supporting the need for objective sleep metrics to model circadian changes to mental acuity. Participant compliance in using the wearable throughout the week and responding to the CogGauge assessments was 80%. Specific biases were identified in temporal metrics across mobile devices and operating systems and were excluded from the mental acuity metric development. Individualized prediction of mental acuity consistently outperformed group modeling. This effort indicates the feasibility of creating an individualized, mobile assessment and prediction of mental acuity, compatible with the majority of current mobile devices.

  17. Toward better public health reporting using existing off the shelf approaches: The value of medical dictionaries in automated cancer detection using plaintext medical data.

    PubMed

    Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J

    2017-05-01

    Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary based feature sourcing approaches and "off the shelf" tools could predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary based models to models built using features derived from medical dictionaries. We evaluated the detection of cancer cases from free text pathology reports using decision models built with combinations of dictionary or non-dictionary based feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70 and 90%. The source of features and feature subset size had no impact on the performance of a decision model. Our study suggests there is little value in leveraging medical dictionaries for extracting features for decision model building. Decision models built using features extracted from the plaintext reports themselves achieve comparable results to those built using medical dictionaries. Overall, this suggests that existing "off the shelf" approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection and modeling approaches. Copyright © 2017 Elsevier Inc. All rights reserved.
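    The threshold-based metrics named in this abstract (sensitivity, specificity, accuracy, positive predictive value) can all be read off a confusion matrix. The following is a minimal sketch, not the authors' code: the function names and the 0/1 label encoding are assumptions made for illustration, and the area under the ROC curve is omitted because it needs ranked scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from comparing binary predictions with reference labels."""
    tp: int  # positive cases correctly flagged
    fp: int  # negative cases incorrectly flagged
    tn: int  # negative cases correctly passed
    fn: int  # positive cases missed


def count_outcomes(y_true, y_pred):
    """Tally the confusion matrix for 0/1 labels (1 = positive case)."""
    pairs = list(zip(y_true, y_pred))
    return ConfusionCounts(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def classification_metrics(c):
    """Return the four metrics as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. the data contain at least
    one positive case, one negative case, and one positive prediction.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
    }


# Hypothetical toy labels: 3 positive and 5 negative reports.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(count_outcomes(y_true, y_pred))
```

    On these toy labels one positive is missed and one negative is falsely flagged, so sensitivity and positive predictive value are both 2/3 while specificity is 0.8.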

  18. Adaptive distance metric learning for diffusion tensor image segmentation.

    PubMed

    Kong, Youyong; Wang, Defeng; Shi, Lin; Hui, Steve C N; Chu, Winnie C W

    2014-01-01

    High quality segmentation of diffusion tensor images (DTI) is of key interest in biomedical research and clinical application. In previous studies, most efforts have been made to construct predefined metrics for different DTI segmentation tasks. These methods require adequate prior knowledge and tuning parameters. To overcome these disadvantages, we proposed to automatically learn an adaptive distance metric by a graph based semi-supervised learning model for DTI segmentation. An original discriminative distance vector was first formulated by combining both geometry and orientation distances derived from diffusion tensors. The kernel metric over the original distance and labels of all voxels were then simultaneously optimized in a graph based semi-supervised learning approach. Finally, the optimization task was efficiently solved with an iterative gradient descent method to achieve the optimal solution. With our approach, an adaptive distance metric could be available for each specific segmentation task. Experiments on synthetic and real brain DTI datasets were performed to demonstrate the effectiveness and robustness of the proposed distance metric learning approach. The performance of our approach was compared with three classical metrics in the graph based semi-supervised learning framework.

  19. Adaptive Distance Metric Learning for Diffusion Tensor Image Segmentation

    PubMed Central

    Kong, Youyong; Wang, Defeng; Shi, Lin; Hui, Steve C. N.; Chu, Winnie C. W.

    2014-01-01

    High quality segmentation of diffusion tensor images (DTI) is of key interest in biomedical research and clinical application. In previous studies, most efforts have been made to construct predefined metrics for different DTI segmentation tasks. These methods require adequate prior knowledge and tuning parameters. To overcome these disadvantages, we proposed to automatically learn an adaptive distance metric by a graph based semi-supervised learning model for DTI segmentation. An original discriminative distance vector was first formulated by combining both geometry and orientation distances derived from diffusion tensors. The kernel metric over the original distance and labels of all voxels were then simultaneously optimized in a graph based semi-supervised learning approach. Finally, the optimization task was efficiently solved with an iterative gradient descent method to achieve the optimal solution. With our approach, an adaptive distance metric could be available for each specific segmentation task. Experiments on synthetic and real brain DTI datasets were performed to demonstrate the effectiveness and robustness of the proposed distance metric learning approach. The performance of our approach was compared with three classical metrics in the graph based semi-supervised learning framework. PMID:24651858

  20. Automated Assessment of Visual Quality of Digital Video

    NASA Technical Reports Server (NTRS)

    Watson, Andrew B.; Ellis, Stephen R. (Technical Monitor)

    1997-01-01

    The advent of widespread distribution of digital video creates a need for automated methods for evaluating visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images[1-4]. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. The challenge of video quality metrics is to extend these simplified models to temporal signals as well. In this presentation I will discuss a number of the issues that must be resolved in the design of effective video quality metrics. Among these are spatial, temporal, and chromatic sensitivity and their interactions, visual masking, and implementation complexity. I will also touch on the question of how to evaluate the performance of these metrics.

  1. A Literature Survey and Experimental Evaluation of the State-of-the-Art in Uplift Modeling: A Stepping Stone Toward the Development of Prescriptive Analytics.

    PubMed

    Devriendt, Floris; Moldovan, Darie; Verbeke, Wouter

    2018-03-01

    Prescriptive analytics extends predictive analytics by estimating an outcome as a function of control variables, which makes it possible to establish the level of the control variables required to realize a desired outcome. Uplift modeling is at the heart of prescriptive analytics and aims at estimating the net difference in an outcome resulting from a specific action or treatment that is applied. In this article, a structured and detailed literature survey on uplift modeling is provided by identifying and contrasting various groups of approaches. In addition, evaluation metrics for assessing the performance of uplift models are reviewed. An experimental evaluation on four real-world data sets provides further insight into their use. Uplift random forests are found to be consistently among the best performing techniques in terms of the Qini and Gini measures, although considerable variability in performance across the various data sets of the experiments is observed. In addition, uplift models are frequently observed to be unstable and display a strong variability in terms of performance across different folds in the cross-validation experimental setup. This potentially threatens their actual use for business applications. Moreover, it is found that the available evaluation metrics do not provide an intuitively understandable indication of the actual use and performance of a model. Specifically, existing evaluation metrics do not facilitate a comparison of uplift models and predictive models and evaluate performance either at an arbitrary cutoff or over the full spectrum of potential cutoffs. In conclusion, we highlight the instability of uplift models and the need for an application-oriented approach to assess uplift models as prime topics for further research.

  2. Prediction of fatigue-related driver performance from EEG data by deep Riemannian model.

    PubMed

    Hajinoroozi, Mehdi; Jianqiu Zhang; Yufei Huang

    2017-07-01

    Prediction of drivers' drowsy and alert states is important for safety purposes. The prediction of drivers' drowsy and alert states from electroencephalography (EEG) using shallow and deep Riemannian methods is presented. For shallow Riemannian methods, the minimum distance to Riemannian mean (MDM) and the Log-Euclidean metric are investigated, and it is shown that the Log-Euclidean metric outperforms the MDM algorithm. In addition, SPDNet, a deep Riemannian model that takes the EEG covariance matrix as input, is investigated. It is shown that SPDNet outperforms all tested shallow and deep classification methods. The performance of SPDNet is 6.02% and 2.86% higher than the best performance of the conventional Euclidean classifiers and shallow Riemannian models, respectively.

  3. Shipboard Electrical System Modeling for Early-Stage Design Space Exploration

    DTIC Science & Technology

    2013-04-01

    method is demonstrated in several system studies. I. INTRODUCTION The integrated engineering plant (IEP) of an electric warship can be viewed as a...which it must operate [2], [4]. The desired IEP design should be dependable [5]. The operability metric has previously been defined as a measure of...the performance of an IEP during a specific scenario [2]. Dependability metrics have been derived from the operability metric as measures of the IEP

  4. Assessing deep and shallow learning methods for quantitative prediction of acute chemical toxicity.

    PubMed

    Liu, Ruifeng; Madore, Michael; Glover, Kyle P; Feasel, Michael G; Wallqvist, Anders

    2018-05-02

    Animal-based methods for assessing chemical toxicity are struggling to meet testing demands. In silico approaches, including machine-learning methods, are promising alternatives. Recently, deep neural networks (DNNs) were evaluated and reported to outperform other machine-learning methods for quantitative structure-activity relationship modeling of molecular properties. However, most of the reported performance evaluations relied on global performance metrics, such as the root mean squared error (RMSE) between the predicted and experimental values of all samples, without considering the impact of sample distribution across the activity spectrum. Here, we carried out an in-depth analysis of DNN performance for quantitative prediction of acute chemical toxicity using several datasets. We found that the overall performance of DNN models on datasets of up to 30,000 compounds was similar to that of random forest (RF) models, as measured by the RMSE and correlation coefficients between the predicted and experimental results. However, our detailed analyses demonstrated that global performance metrics are inappropriate for datasets with a highly uneven sample distribution, because they show a strong bias for the most populous compounds along the toxicity spectrum. For highly toxic compounds, DNN and RF models trained on all samples performed much worse than the global performance metrics indicated. Surprisingly, our variable nearest neighbor method, which utilizes only structurally similar compounds to make predictions, performed reasonably well, suggesting that information of close near neighbors in the training sets is a key determinant of acute toxicity predictions.
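    The abstract's central point, that a global RMSE can hide poor accuracy in sparsely populated parts of the activity range, is easy to reproduce numerically. The sketch below is illustrative only: the bin edges, the function names, and the toy values are assumptions, not the authors' data or procedure.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over paired values."""
    if not y_true:
        raise ValueError("rmse needs at least one sample")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_by_bin(y_true, y_pred, edges):
    """RMSE within half-open bins [edges[i], edges[i+1]) of the true value.

    Bins that contain no samples are left out of the result.
    """
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if lo <= t < hi]
        if pairs:
            ts, ps = zip(*pairs)
            result[(lo, hi)] = rmse(ts, ps)
    return result


# Hypothetical toy data: eight well-predicted low-activity samples and
# two badly under-predicted high-activity samples.
y_true = [1.0] * 8 + [5.0] * 2
y_pred = [1.0] * 8 + [3.0] * 2

global_rmse = rmse(y_true, y_pred)                     # about 0.89
binned = rmse_by_bin(y_true, y_pred, [0.0, 2.0, 6.0])  # 0.0 and 2.0
```

    Here the global RMSE is about 0.89, yet the error in the sparsely populated high-activity bin is 2.0, which is the kind of discrepancy the abstract reports for highly toxic compounds.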

  5. Translation from UML to Markov Model: A Performance Modeling Framework

    NASA Astrophysics Data System (ADS)

    Khan, Razib Hayat; Heegaard, Poul E.

    Performance engineering focuses on the quantitative investigation of the behavior of a system during the early phase of the system development life cycle. Bearing this in mind, we delineate a performance modeling framework for communication systems that proposes a translation process from high-level UML notation to a Continuous Time Markov Chain (CTMC) model and solves the model for relevant performance metrics. The framework uses UML collaborations, activity diagrams and deployment diagrams to generate a performance model for a communication system. The system dynamics are captured by UML collaboration and activity diagrams as reusable specification building blocks, while the deployment diagram highlights the components of the system. The collaboration and activity diagrams show how reusable building blocks in the form of collaborations can be composed into service components through input and output pins, highlighting the behavior of the components; a mapping between the collaborations and the system components identified by the deployment diagram is then delineated. Moreover, the UML models are annotated with performance-related quality of service (QoS) information, which is necessary for solving the performance model for relevant performance metrics through our proposed framework. The applicability of our proposed performance modeling framework in performance evaluation is delineated in the context of modeling a communication system.

  6. Analysis and Modeling of Realistic Compound Channels in Transparent Relay Transmissions

    PubMed Central

    Kanjirathumkal, Cibile K.; Mohammed, Sameer S.

    2014-01-01

    Analytical approaches for the characterisation of the compound channels in transparent multihop relay transmissions over independent fading channels are considered in this paper. Compound channels with homogeneous links are considered first. Using Mellin transform technique, exact expressions are derived for the moments of cascaded Weibull distributions. Subsequently, two performance metrics, namely, coefficient of variation and amount of fade, are derived using the computed moments. These metrics quantify the possible variations in the channel gain and signal to noise ratio from their respective average values and can be used to characterise the achievable receiver performance. This approach is suitable for analysing more realistic compound channel models for scattering density variations of the environment, experienced in multihop relay transmissions. The performance metrics for such heterogeneous compound channels having distinct distribution in each hop are computed and compared with those having identical constituent component distributions. The moments and the coefficient of variation computed are then used to develop computationally efficient estimators for the distribution parameters and the optimal hop count. The metrics and estimators proposed are complemented with numerical and simulation results to demonstrate the impact of the accuracy of the approaches. PMID:24701175
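    Because the hops fade independently, the moments of the cascaded (product) channel are simply products of per-hop Weibull moments, and both metrics follow from a few of those moments. The sketch below rests on textbook definitions that are assumed here rather than taken from the paper: the n-th Weibull moment is scale^n * Gamma(1 + n/shape), the coefficient of variation is taken on the channel gain h, and the amount of fade is Var(h^2) / E[h^2]^2, i.e. computed on the SNR, which is proportional to h^2.

```python
import math


def weibull_moment(n, scale, shape):
    """n-th raw moment of a Weibull variable: scale**n * Gamma(1 + n/shape)."""
    return scale ** n * math.gamma(1.0 + n / shape)


def cascaded_moment(n, hops):
    """n-th moment of the product of independent Weibull gains.

    `hops` is a list of (scale, shape) pairs, one per hop; independence
    lets the moment of the product factor into per-hop moments.
    """
    moment = 1.0
    for scale, shape in hops:
        moment *= weibull_moment(n, scale, shape)
    return moment


def coefficient_of_variation(hops):
    """Standard deviation over mean of the cascaded channel gain."""
    m1 = cascaded_moment(1, hops)
    m2 = cascaded_moment(2, hops)
    return math.sqrt(m2 / m1 ** 2 - 1.0)


def amount_of_fade(hops):
    """Variance of the SNR (proportional to gain squared) over its squared mean."""
    m2 = cascaded_moment(2, hops)
    m4 = cascaded_moment(4, hops)
    return m4 / m2 ** 2 - 1.0


# Shape 2 is the Rayleigh special case: one hop gives an amount of fade
# of 1, and two identical hops give 3.
single_hop = [(1.0, 2.0)]
double_hop = [(1.0, 2.0), (1.0, 2.0)]
# A heterogeneous channel simply mixes shapes (values are hypothetical).
mixed = [(1.0, 2.0), (1.0, 3.5)]
af_mixed = amount_of_fade(mixed)
```

    Passing hops with different shape parameters, as in `mixed`, corresponds to the heterogeneous compound channels discussed in the abstract.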

  7. Using measures of information content and complexity of time series as hydrologic metrics

    USDA-ARS's Scientific Manuscript database

    Information theory has previously been used to develop metrics that characterize temporal patterns in soil moisture dynamics and that evaluate and compare the performance of soil water flow models. The objective of this study was to apply information and complexity measures to characte...

  8. Dataset of two experiments of the application of gamified peer assessment model into online learning environment MeuTutor.

    PubMed

    Tenório, Thyago; Bittencourt, Ig Ibert; Isotani, Seiji; Pedro, Alan; Ospina, Patrícia; Tenório, Daniel

    2017-06-01

    In this dataset, we present the data collected in two experiments that applied the gamified peer assessment model in the online learning environment MeuTutor, to allow comparison of the obtained results with other proposed models. MeuTutor is an intelligent tutoring system that aims to monitor the learning of students in a personalized way, ensuring quality education and improving the performance of its members (Tenório et al., 2016) [1]. The first experiment evaluated the effectiveness of the peer assessment model through metrics such as final grade (result), time to correct the activities, and associated costs. The second experiment evaluated the influence of gamification on the peer assessment model, analyzing metrics such as number of accesses (logins), number of performed activities, and number of performed corrections. In this article, we present in table form for each metric: the raw data of each treatment; the summarized data; the results of the Shapiro-Wilk normality test; and the results of the t-test and/or Wilcoxon statistical tests. The data presented in this article are related to the article entitled "A gamified peer assessment model for on-line learning environments in a competitive context" (Tenório et al., 2016) [1].

  9. Estimation of the fraction of absorbed photosynthetically active radiation (fPAR) in maize canopies using LiDAR data and hyperspectral imagery.

    PubMed

    Qin, Haiming; Wang, Cheng; Zhao, Kaiguang; Xi, Xiaohuan

    2018-01-01

    Accurate estimation of the fraction of absorbed photosynthetically active radiation (fPAR) for maize canopies is important for maize growth monitoring and yield estimation. The goal of this study is to explore the potential of using airborne LiDAR and hyperspectral data to better estimate maize fPAR. This study focuses on estimating maize fPAR from (1) height and coverage metrics derived from airborne LiDAR point cloud data; (2) vegetation indices derived from hyperspectral imagery; and (3) a combination of these metrics. Pearson correlation analyses were conducted to evaluate the relationships among LiDAR metrics, hyperspectral metrics, and field-measured fPAR values. Then, multiple linear regression (MLR) models were developed using these metrics. Results showed that (1) LiDAR height and coverage metrics provided good explanatory power (i.e., R² = 0.81); (2) hyperspectral vegetation indices provided moderate interpretability (i.e., R² = 0.50); and (3) the combination of LiDAR metrics and hyperspectral metrics improved the LiDAR model (i.e., R² = 0.88). These results indicate that the LiDAR model seems to offer a reliable method for estimating maize fPAR at a high spatial resolution and that it can be used for farmland management. Combining LiDAR and hyperspectral metrics led to better performance of maize fPAR estimation than LiDAR or hyperspectral metrics alone, which means that maize fPAR retrieval can benefit from the complementary nature of LiDAR-detected canopy structure characteristics and hyperspectral-captured vegetation spectral information.

  10. Nonlinear Semi-Supervised Metric Learning Via Multiple Kernels and Local Topology.

    PubMed

    Li, Xin; Bai, Yanqin; Peng, Yaxin; Du, Shaoyi; Ying, Shihui

    2018-03-01

    Changing the metric on the data may change the data distribution; hence a good distance metric can improve the performance of a learning algorithm. In this paper, we address the semi-supervised distance metric learning (ML) problem to obtain the best nonlinear metric for the data. First, we describe the nonlinear metric by the multiple kernel representation. By this approach, we project the data into a high dimensional space, where the data can be well represented by linear ML. Then, we reformulate the linear ML as a minimization problem on the positive definite matrix group. Finally, we develop a two-step algorithm for solving this model and design an intrinsic steepest descent algorithm to learn the positive definite metric matrix. Experimental results validate that our proposed method is effective and outperforms several state-of-the-art ML methods.

  11. Predicting the natural flow regime: Models for assessing hydrological alteration in streams

    USGS Publications Warehouse

    Carlisle, D.M.; Falcone, J.; Wolock, D.M.; Meador, M.R.; Norris, R.H.

    2009-01-01

    Understanding the extent to which natural streamflow characteristics have been altered is an important consideration for ecological assessments of streams. Assessing hydrologic condition requires that we quantify the attributes of the flow regime that would be expected in the absence of anthropogenic modifications. The objective of this study was to evaluate whether selected streamflow characteristics could be predicted at regional and national scales using geospatial data. Long-term, gaged river basins distributed throughout the contiguous US that had streamflow characteristics representing least disturbed or near pristine conditions were identified. Thirteen metrics of the magnitude, frequency, duration, timing and rate of change of streamflow were calculated using a 20-50 year period of record for each site. We used random forests (RF), a robust statistical modelling approach, to develop models that predicted the value for each streamflow metric using natural watershed characteristics. We compared the performance (i.e. bias and precision) of national- and regional-scale predictive models to that of models based on landscape classifications, including major river basins, ecoregions and hydrologic landscape regions (HLR). For all hydrologic metrics, landscape stratification models produced estimates that were less biased and more precise than a null model that accounted for no natural variability. Predictive models at the national and regional scale performed equally well, and substantially improved predictions of all hydrologic metrics relative to landscape stratification models. Prediction error rates ranged from 15 to 40%, but were 25% for most metrics. We selected three gaged, non-reference sites to illustrate how predictive models could be used to assess hydrologic condition. These examples show how the models accurately estimate predisturbance conditions and are sensitive to changes in streamflow variability associated with long-term land-use change. We also demonstrate how the models can be applied to predict expected natural flow characteristics at ungaged sites. © 2009 John Wiley & Sons, Ltd.

  12. Information Geometry for Landmark Shape Analysis: Unifying Shape Representation and Deformation

    PubMed Central

    Peter, Adrian M.; Rangarajan, Anand

    2010-01-01

    Shape matching plays a prominent role in the comparison of similar structures. We present a unifying framework for shape matching that uses mixture models to couple both the shape representation and deformation. The theoretical foundation is drawn from information geometry wherein information matrices are used to establish intrinsic distances between parametric densities. When a parameterized probability density function is used to represent a landmark-based shape, the modes of deformation are automatically established through the information matrix of the density. We first show that given two shapes parameterized by Gaussian mixture models (GMMs), the well-known Fisher information matrix of the mixture model is also a Riemannian metric (actually, the Fisher-Rao Riemannian metric) and can therefore be used for computing shape geodesics. The Fisher-Rao metric has the advantage of being an intrinsic metric and invariant to reparameterization. The geodesic, computed using this metric, establishes an intrinsic deformation between the shapes, thus unifying both shape representation and deformation. A fundamental drawback of the Fisher-Rao metric is that it is not available in closed form for the GMM. Consequently, shape comparisons are computationally very expensive. To address this, we develop a new Riemannian metric based on generalized ϕ-entropy measures. In sharp contrast to the Fisher-Rao metric, the new metric is available in closed form. Geodesic computations using the new metric are considerably more efficient. We validate the performance and discriminative capabilities of these new information geometry-based metrics by pairwise matching of corpus callosum shapes. We also study the deformations of fish shapes that have various topological properties. A comprehensive comparative analysis is also provided using other landmark-based distances, including the Hausdorff distance, the Procrustes metric, landmark-based diffeomorphisms, and the bending energies of the thin-plate (TPS) and Wendland splines. PMID:19110497

  13. Measures of model performance based on the log accuracy ratio

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morley, Steven Karl; Brito, Thiago Vasconcelos; Welling, Daniel T.

    Quantitative assessment of modeling and forecasting of continuous quantities uses a variety of approaches. We review existing literature describing metrics for forecast accuracy and bias, concentrating on those based on relative errors and percentage errors. Of these accuracy metrics, the mean absolute percentage error (MAPE) is one of the most common across many fields and has been widely applied in recent space science literature and we highlight the benefits and drawbacks of MAPE and proposed alternatives. We then introduce the log accuracy ratio, and derive from it two metrics: the median symmetric accuracy; and the symmetric signed percentage bias. Robust methods for estimating the spread of a multiplicative linear model using the log accuracy ratio are also presented. The developed metrics are shown to be easy to interpret, robust, and to mitigate the key drawbacks of their more widely-used counterparts based on relative errors and percentage errors. Their use is illustrated with radiation belt electron flux modeling examples.
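    The two metrics derived from the log accuracy ratio can be sketched in a few lines. The formulas below are the commonly quoted forms and are stated here as an assumption, since the abstract does not spell them out: with Q = prediction / observation, the median symmetric accuracy is 100 * (exp(median(|ln Q|)) - 1) and the symmetric signed percentage bias is 100 * sign(median(ln Q)) * (exp(|median(ln Q)|) - 1). The function names are illustrative.

```python
import math
from statistics import median


def log_accuracy_ratios(predicted, observed):
    """ln(prediction / observation); both inputs must be strictly positive."""
    return [math.log(p / o) for p, o in zip(predicted, observed)]


def median_symmetric_accuracy(predicted, observed):
    """Typical percentage error, treating over- and under-prediction alike."""
    abs_log_q = [abs(v) for v in log_accuracy_ratios(predicted, observed)]
    return 100.0 * (math.exp(median(abs_log_q)) - 1.0)


def symmetric_signed_percentage_bias(predicted, observed):
    """Signed percentage bias; positive means a tendency to over-predict."""
    m = median(log_accuracy_ratios(predicted, observed))
    return 100.0 * math.copysign(1.0, m) * (math.exp(abs(m)) - 1.0)


# Hypothetical values: every prediction is twice the observation.
observed = [1.0, 2.0, 4.0]
over = [2.0, 4.0, 8.0]
msa = median_symmetric_accuracy(over, observed)          # about 100 (%)
bias = symmetric_signed_percentage_bias(over, observed)  # about +100 (%)
```

    Halving every observation instead of doubling it yields the same accuracy of about 100% with a bias of about -100%; that symmetry between over- and under-prediction by the same factor is the property that percentage errors such as MAPE lack.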

  14. Measures of model performance based on the log accuracy ratio

    DOE PAGES

    Morley, Steven Karl; Brito, Thiago Vasconcelos; Welling, Daniel T.

    2018-01-03

    Quantitative assessment of modeling and forecasting of continuous quantities uses a variety of approaches. We review existing literature describing metrics for forecast accuracy and bias, concentrating on those based on relative errors and percentage errors. Of these accuracy metrics, the mean absolute percentage error (MAPE) is one of the most common across many fields and has been widely applied in recent space science literature and we highlight the benefits and drawbacks of MAPE and proposed alternatives. We then introduce the log accuracy ratio, and derive from it two metrics: the median symmetric accuracy; and the symmetric signed percentage bias. Robust methods for estimating the spread of a multiplicative linear model using the log accuracy ratio are also presented. The developed metrics are shown to be easy to interpret, robust, and to mitigate the key drawbacks of their more widely-used counterparts based on relative errors and percentage errors. Their use is illustrated with radiation belt electron flux modeling examples.

  15. New metric for optimizing Continuous Loop Averaging Deconvolution (CLAD) sequences under the 1/f noise model

    PubMed Central

    Peng, Xian; Yuan, Han; Chen, Wufan; Ding, Lei

    2017-01-01

    Continuous loop averaging deconvolution (CLAD) is one of the proven methods for recovering transient auditory evoked potentials (AEPs) in rapid stimulation paradigms, which requires an elaborate stimulus sequence design to attenuate impacts from noise in data. The present study aimed to develop a new metric for gauging a CLAD sequence in terms of noise gain factor (NGF), which has been proposed previously but is less effective in the presence of pink (1/f) noise. We derived the new metric by explicitly introducing the 1/f model into the proposed time-continuous sequence. We selected several representative CLAD sequences to test their noise property on typical EEG recordings, as well as on five real CLAD electroencephalogram (EEG) recordings to retrieve the middle latency responses. We also demonstrated the merit of the new metric in generating and quantifying optimized sequences using a classic genetic algorithm. The new metric shows evident improvements in measuring actual noise gains at different frequencies, and better performance than the original NGF in various aspects. The new metric is a generalized NGF measurement that can better quantify the performance of a CLAD sequence, and provide a more efficient means of generating CLAD sequences in combination with optimization algorithms. The present study can facilitate the specific application of the CLAD paradigm with desired sequences in the clinic. PMID:28414803

  16. Analyzing System on A Chip Single Event Upset Responses using Single Event Upset Data, Classical Reliability Models, and Space Environment Data

    NASA Technical Reports Server (NTRS)

    Berg, Melanie; LaBel, Kenneth; Campola, Michael; Xapsos, Michael

    2017-01-01

    We are investigating the application of classical reliability performance metrics combined with standard single event upset (SEU) analysis data. We expect to relate SEU behavior to system performance requirements. Our proposed methodology will provide better prediction of SEU responses in harsh radiation environments with confidence metrics. single event upset (SEU), single event effect (SEE), field programmable gate array devices (FPGAs)

  17. Localized Multi-Model Extremes Metrics for the Fourth National Climate Assessment

    NASA Astrophysics Data System (ADS)

    Thompson, T. R.; Kunkel, K.; Stevens, L. E.; Easterling, D. R.; Biard, J.; Sun, L.

    2017-12-01

    We have performed localized analysis of scenario-based datasets for the Fourth National Climate Assessment (NCA4). These datasets include CMIP5-based Localized Constructed Analogs (LOCA) downscaled simulations at daily temporal resolution and 1/16th-degree spatial resolution. Over 45 temperature and precipitation extremes metrics have been processed using LOCA data, including threshold, percentile, and degree-days calculations. The localized analysis calculates trends in the temperature and precipitation extremes metrics for relatively small regions such as counties, metropolitan areas, climate zones, administrative areas, or economic zones. For NCA4, we are currently addressing metropolitan areas as defined by U.S. Census Bureau Metropolitan Statistical Areas. Such localized analysis provides essential information for adaptation planning at scales relevant to local planning agencies and businesses. Nearly 30 such regions have been analyzed to date. Each locale is defined by a closed polygon that is used to extract LOCA-based extremes metrics specific to the area. For each metric, single-model data at each LOCA grid location are first averaged over several 30-year historical and future periods. Then, for each metric, the spatial average across the region is calculated using model weights based on both model independence and reproducibility of current climate conditions. The range of single-model results is also captured on the same localized basis, and then combined with the weighted ensemble average for each region and each metric. For example, Boston-area cooling degree days and maximum daily temperature is shown below for RCP8.5 (red) and RCP4.5 (blue) scenarios. We also discuss inter-regional comparison of these metrics, as well as their relevance to risk analysis for adaptation planning.
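    The final combination step described here, a weighted ensemble average reported together with the range of single-model results, can be sketched as follows. The model values and weights are hypothetical, and the way the actual weights are derived from model independence and reproduction of current climate is not reproduced.

```python
def weighted_ensemble_mean(values, weights):
    """Weighted average of per-model regional values.

    `values[i]` is one model's regional mean of a metric (for example
    cooling degree days); `weights[i]` is that model's non-negative weight.
    """
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def ensemble_range(values):
    """Smallest and largest single-model result, reported with the mean."""
    return min(values), max(values)


# Hypothetical regional means from three models and their weights.
model_values = [10.0, 20.0, 30.0]
model_weights = [1.0, 1.0, 2.0]

mean = weighted_ensemble_mean(model_values, model_weights)  # 22.5
low, high = ensemble_range(model_values)                    # (10.0, 30.0)
```

    Normalizing by the sum of the weights means the weights only need to be relative, not pre-scaled to sum to one.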

  18. A new look at mobility metrics for pyroclastic density currents: collection, interpretation, and use

    NASA Astrophysics Data System (ADS)

    Ogburn, S. E.; Lopes, D.; Calder, E. S.

    2012-12-01

    Mitigation of risk associated with pyroclastic density currents (PDCs) depends upon accurate forecasting of possible flow paths, often using empirical models that rely on mobility metrics or the stochastic application of computational flow models. Mobility metrics often inform computational models, sometimes as direct model inputs (e.g. energy cone model), or as estimates for input parameters (e.g. basal friction parameter in TITAN2D). These mobility metrics are often compiled from PDCs at many volcanoes, generalized to reveal empirical constants, or sampled for use in probabilistic models. In practice, however, there are often inconsistencies in how mobility metrics have been collected, reported, and used. For instance, the runout of PDCs often varies depending on the method used (e.g. manually measured from a paper map, automated using GIS software); and the distance traveled by the center of mass of PDCs is rarely reported due to the difficulty in locating it. This work reexamines the way we measure, report, and analyze PDC mobility metrics. Several metrics, such as the Heim coefficient (height dropped/runout, H/L) and the proportionality of inundated area to volume (A ∝ V^(2/3)) have been used successfully with PDC data (Sparks 1976; Nairn and Self 1977; Sheridan 1979; Hayashi and Self 1992; Calder et al. 1999; Widiwijayanti et al. 2008) in addition to the non-volcanic flows they were originally developed for. Other mobility metrics have been investigated by the debris avalanche community but have not yet been extensively applied to pyroclastic flows (e.g. the initial aspect ratio of collapsing pile). We investigate the relative merits and suitability of contrasting mobility metrics for different types of PDCs (e.g. dome-collapse pyroclastic flows, ash-cloud surges, pumice flows), and indicate certain circumstances under which each model performs optimally. 
We show that these metrics can be used (with varying success) to predict the runout of a PDC of given volume, or vice versa. The problem of locating the center of mass of PDCs is also investigated by comparing field measurements, geometric centroids, linear thickness models, and computational flow models. Comparing center of mass measurements with runout provides insight into the relative roles of sliding vs. spreading in PDC emplacement. The effect of topography on mobility is explored by comparing mobility metrics to valley morphology measurements, including sinuosity, cross-sectional area, and valley slope. Lastly, we examine the problem of compiling and generalizing mobility data from worldwide databases using a hierarchical Bayes model for weighting mobility metrics for use as model inputs, which offers an improved method over simple space-filling strategies. This is especially useful for calibrating models at data-sparse volcanoes.
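
    The two mobility metrics named in this abstract, the Heim coefficient H/L and the area-volume proportionality with exponent 2/3, can be sketched as simple functions. The proportionality constant `c` and all numbers below are hypothetical placeholders, not values from the study.

```python
def heim_coefficient(height_dropped_m, runout_m):
    """Heim coefficient H/L: height dropped divided by runout distance."""
    if runout_m <= 0:
        raise ValueError("runout must be positive")
    return height_dropped_m / runout_m


def runout_from_heim(height_dropped_m, h_over_l):
    """Invert H/L to estimate runout L for an assumed Heim coefficient."""
    if h_over_l <= 0:
        raise ValueError("H/L must be positive")
    return height_dropped_m / h_over_l


def inundated_area(volume_m3, c):
    """Area from volume via A = c * V**(2/3).

    c is an empirical constant that would have to be fitted to data;
    the value used in the example is a placeholder.
    """
    return c * volume_m3 ** (2.0 / 3.0)


# Hypothetical flow: 1000 m drop, 5000 m runout, 1e6 m^3 volume.
print(heim_coefficient(1000.0, 5000.0))
print(runout_from_heim(1000.0, 0.2))
print(inundated_area(1.0e6, 10.0))
```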

  19. An Investigation of the Relationship Between Automated Machine Translation Evaluation Metrics and User Performance on an Information Extraction Task

    DTIC Science & Technology

    2007-01-01

    parameter dimension between the two models). 93 were tested.3 Model 1: log(pHits / (1 − pHits)) = α + β1 × MetricScore (6.6) The results for each of the...505.67 oTERavg .357 .13 .007 log(pHits / (1 − pHits)), that is, log-odds of correct task performance, of 2.79 over the intercept only model. All... pHits / (1 − pHits)) = −1.15 − .418 × I[MT=2] − .527 × I[MT=3] + 1.78 × METEOR + 1.28 × METEOR × I[MT=2] + 1.86 × METEOR × I[MT=3] (6.7) Model 3: log(pHits / (1 − pHits
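
    To show how a log-odds model of this form is read, the sketch below evaluates the fitted equation labeled (6.7) in the snippet and converts the result to a probability. Treating MT system 1 as the reference level and passing a METEOR score between 0 and 1 are assumptions; the snippet is truncated and states neither.

```python
import math


def log_odds_eq_6_7(meteor, mt_system):
    """Log-odds of correct task performance from equation (6.7).

    mt_system is 1, 2, or 3; system 1 is assumed to be the reference level
    because only indicators for MT=2 and MT=3 appear in the equation.
    """
    i2 = 1.0 if mt_system == 2 else 0.0
    i3 = 1.0 if mt_system == 3 else 0.0
    return (-1.15 - 0.418 * i2 - 0.527 * i3
            + 1.78 * meteor
            + 1.28 * meteor * i2
            + 1.86 * meteor * i3)


def probability_from_log_odds(z):
    """Inverse logit: p = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical METEOR score of 0.5 for each MT system.
for system in (1, 2, 3):
    z = log_odds_eq_6_7(0.5, system)
    print(system, round(z, 3), round(probability_from_log_odds(z), 3))
```

    The interaction terms mean the slope on METEOR differs by MT system, so the same metric score maps to different success probabilities depending on which system produced the translation.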

  20. Identifying psychophysiological indices of expert vs. novice performance in deadly force judgment and decision making

    PubMed Central

    Johnson, Robin R.; Stone, Bradly T.; Miranda, Carrie M.; Vila, Bryan; James, Lois; James, Stephen M.; Rubio, Roberto F.; Berka, Chris

    2014-01-01

    Objective: To demonstrate that psychophysiology may have applications for objective assessment of expertise development in deadly force judgment and decision making (DFJDM). Background: Modern training techniques focus on improving decision-making skills with participative assessment between trainees and subject matter experts primarily through subjective observation. Objective metrics need to be developed. The current proof of concept study explored the potential for psychophysiological metrics in deadly force judgment contexts. Method: Twenty-four participants (novice, expert) were recruited. All wore a wireless electroencephalography (EEG) device to collect psychophysiological data during high-fidelity deadly force judgment and decision-making simulations using a modified Glock firearm. Participants were exposed to 27 video scenarios, one-third of which would have justified use of deadly force. Pass/fail was determined by whether the participant used deadly force appropriately. Results: Experts had a significantly higher pass rate compared to novices (p < 0.05). Multiple metrics were shown to distinguish novices from experts. Hierarchical regression analyses indicate that psychophysiological variables are able to explain 72% of the variability in expert performance, but only 37% in novices. Discriminant function analysis (DFA) using psychophysiological metrics was able to discern between experts and novices with 72.6% accuracy. Conclusion: While limited due to small sample size, the results suggest that psychophysiology may be developed for use as an objective measure of expertise in DFJDM. Specifically, discriminant function measures may have the potential to objectively identify expert skill acquisition. Application: Psychophysiological metrics may create a performance model with the potential to optimize simulator-based DFJDM training. 
These performance models could be used for trainee feedback, and/or by the instructor to assess performance objectively. PMID:25100966

  1. A Comparison of Evaluation Metrics for Biomedical Journals, Articles, and Websites in Terms of Sensitivity to Topic

    PubMed Central

    Fu, Lawrence D.; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F.

    2011-01-01

    Evaluating the biomedical literature and health-related websites for quality are challenging information retrieval tasks. Current commonly used methods include impact factor for journals, PubMed’s clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic adjusted metrics and other topic robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations. PMID:21419864

  2. More robust regional precipitation projection from selected CMIP5 models based on multiple-dimensional metrics

    NASA Astrophysics Data System (ADS)

    Qian, Y.; Wang, L.; Leung, L. R.; Lin, G.; Lu, J.; Gao, Y.; Zhang, Y.

    2017-12-01

    Projecting precipitation changes is challenging because of incomplete understanding of the climate system and biases and uncertainty in climate models. In East Asia, where summer precipitation is dominantly influenced by the monsoon circulation, the global models from the Coupled Model Intercomparison Project Phase 5 (CMIP5) give varying projections of precipitation change for the 21st century. It is critical for the community to know which models' projections are more reliable in response to natural and anthropogenic forcings. In this study we defined multiple-dimensional metrics measuring model performance in simulating the present-day large-scale circulation, regional precipitation, and the relationship between them. The large-scale circulation features examined in this study include the lower tropospheric southwesterly winds, the western North Pacific subtropical high, the South China Sea Subtropical High, and the East Asian westerly jet in the upper troposphere. Each of these circulation features transports moisture to East Asia, enhancing the moist static energy and strengthening the Meiyu moisture front that is the primary mechanism for precipitation generation in eastern China. Based on these metrics, 30 models in the CMIP5 ensemble are classified into three groups. Models in the top performing group projected regional precipitation patterns that are more similar to each other than those of the bottom or middle performing groups, and consistently projected statistically significant increasing trends in two of the large-scale circulation indices and precipitation. In contrast, models in the bottom or middle performing group projected small drying or no trends in precipitation. We also find that reasonably reproducing the observed precipitation climatology alone does not guarantee a more reliable projection of future precipitation, because good simulation skill could be achieved through compensating errors from multiple sources. 
Herein the potential for more robust projections of precipitation changes at regional scale is demonstrated through the use of discriminating metrics to subsample the multi-model ensemble. The results from this study provide insights into how to select models from the CMIP ensemble to project regional climate and hydrological cycle changes.

  3. Visualizing curved spacetime

    NASA Astrophysics Data System (ADS)

    Jonsson, Rickard M.

    2005-03-01

    I present a way to visualize the concept of curved spacetime. The result is a curved surface with local coordinate systems (Minkowski systems) living on it, giving the local directions of space and time. Relative to these systems, special relativity holds. The method can be used to visualize gravitational time dilation, the horizon of black holes, and cosmological models. The idea underlying the illustrations is first to specify a field of timelike four-velocities u^μ. Then, at every point, one performs a coordinate transformation to a local Minkowski system comoving with the given four-velocity. In the local system, the sign of the spatial part of the metric is flipped to create a new metric of Euclidean signature. The new positive definite metric, called the absolute metric, can be covariantly related to the original Lorentzian metric. For the special case of a two-dimensional original metric, the absolute metric may be embedded in three-dimensional Euclidean space as a curved surface.
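
    The sign flip described in this abstract can be written as one covariant expression. The form below assumes the signature convention (+, −, −, −) and a normalized four-velocity, neither of which is stated in the abstract, so it should be read as a sketch under those assumptions.

```latex
% Assumed conventions: signature (+,-,-,-), normalization u^\mu u_\mu = 1.
\bar{g}_{\mu\nu} = -g_{\mu\nu} + 2\,u_\mu u_\nu
% Check in the comoving local Minkowski system, where
% g_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1) and u_\mu = (1,0,0,0):
\bar{g}_{\mu\nu} = \mathrm{diag}(-1,1,1,1) + \mathrm{diag}(2,0,0,0)
                 = \mathrm{diag}(1,1,1,1)
```

    In the comoving frame this leaves the time component unchanged and flips the spatial components, giving a positive definite metric as the abstract describes.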

  4. Robotics-based synthesis of human motion.

    PubMed

    Khatib, O; Demircan, E; De Sapio, V; Sentis, L; Besier, T; Delp, S

    2009-01-01

    The synthesis of human motion is a complex procedure that involves accurate reconstruction of movement sequences, modeling of musculoskeletal kinematics, dynamics and actuation, and characterization of reliable performance criteria. Many of these processes have much in common with the problems found in robotics research. Task-based methods used in robotics may be leveraged to provide novel musculoskeletal modeling methods and physiologically accurate performance predictions. In this paper, we present (i) a new method for the real-time reconstruction of human motion trajectories using direct marker tracking, (ii) a task-driven muscular effort minimization criterion and (iii) new human performance metrics for dynamic characterization of athletic skills. Dynamic motion reconstruction is achieved through the control of a simulated human model to follow the captured marker trajectories in real-time. The operational space control and real-time simulation provide human dynamics at any configuration of the performance. A new criterion of muscular effort minimization has been introduced to analyze human static postures. Extensive motion capture experiments were conducted to validate the new minimization criterion. Finally, new human performance metrics were introduced to study an athletic skill in detail. These metrics include the effort expenditure and the feasible set of operational space accelerations during the performance of the skill. The dynamic characterization takes into account skeletal kinematics as well as muscle routing kinematics and force generating capacities. The developments draw upon an advanced musculoskeletal modeling platform and a task-oriented framework for the effective integration of biomechanics and robotics methods.

  5. Assessing precision, bias and sigma-metrics of 53 measurands of the Alinity ci system.

    PubMed

    Westgard, Sten; Petrides, Victoria; Schneider, Sharon; Berman, Marvin; Herzogenrath, Jörg; Orzechowski, Anthony

    2017-12-01

    Assay performance is dependent on the accuracy and precision of a given method. These attributes can be combined into an analytical Sigma-metric, providing a simple value for laboratorians to use in evaluating a test method's capability to meet its analytical quality requirements. Sigma-metrics were determined for 37 clinical chemistry assays, 13 immunoassays, and 3 ICT methods on the Alinity ci system. Analytical Performance Specifications were defined for the assays, following a rationale of using CLIA goals first, then Ricos Desirable goals when CLIA did not regulate the method, and then other sources if the Ricos Desirable goal was unrealistic. A precision study was conducted at Abbott on each assay using the Alinity ci system following the CLSI EP05-A2 protocol. Bias was estimated following the CLSI EP09-A3 protocol using samples with concentrations spanning the assay's measuring interval tested in duplicate on the Alinity ci system and ARCHITECT c8000 and i2000 SR systems, where testing was also performed at Abbott. Using the regression model, the %bias was estimated at an important medical decision point. Then the Sigma-metric was estimated for each assay and was plotted on a method decision chart. The Sigma-metric was calculated using the equation: Sigma-metric = (%TEa − |%bias|) / %CV. The Sigma-metrics and Normalized Method Decision charts demonstrate that a majority of the Alinity assays perform at five Sigma or higher, at or near critical medical decision levels. More than 90% of the assays performed at Five and Six Sigma. None performed below Three Sigma. Sigma-metrics plotted on Normalized Method Decision charts provide useful evaluations of performance. The majority of Alinity ci system assays had sigma values >5 and thus laboratories can expect excellent or world class performance. Laboratorians can use these tools as aids in choosing high-quality products, further contributing to the delivery of excellent quality healthcare for patients. 
Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
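
    The Sigma-metric equation quoted in this abstract, (%TEa − |%bias|) / %CV, can be worked through in a few lines. The function name and the example numbers are hypothetical and are not taken from the study.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma-metric = (%TEa - |%bias|) / %CV, with all inputs in percent."""
    if cv_pct <= 0:
        raise ValueError("%CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# Hypothetical assay: allowable total error 10%, bias -1%, imprecision 1.5%.
print(sigma_metric(10.0, -1.0, 1.5))
```

    Taking the absolute value of the bias means a negative bias lowers the Sigma-metric just as much as a positive one of the same size.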

  6. National Quality Forum Colon Cancer Quality Metric Performance: How Are Hospitals Measuring Up?

    PubMed

    Mason, Meredith C; Chang, George J; Petersen, Laura A; Sada, Yvonne H; Tran Cao, Hop S; Chai, Christy; Berger, David H; Massarweh, Nader N

    2017-12-01

    To evaluate the impact of care at high-performing hospitals on the National Quality Forum (NQF) colon cancer metrics. The NQF endorses evaluating ≥12 lymph nodes (LNs), adjuvant chemotherapy (AC) for stage III patients, and AC within 4 months of diagnosis as colon cancer quality indicators. Data on hospital-level metric performance and the association with survival are unclear. Retrospective cohort study of 218,186 patients with resected stage I to III colon cancer in the National Cancer Data Base (2004-2012). High-performing hospitals (>75% achievement) were identified by the proportion of patients achieving each measure. The association between hospital performance and survival was evaluated using Cox shared frailty modeling. Only hospital LN performance improved (15.8% in 2004 vs 80.7% in 2012; trend test, P < 0.001), with 45.9% of hospitals performing well on all 3 measures concurrently in the most recent study year. Overall, 5-year survival was 75.0%, 72.3%, 72.5%, and 69.5% for those treated at hospitals with high performance on 3, 2, 1, and 0 metrics, respectively (log-rank, P < 0.001). Care at hospitals with high metric performance was associated with lower risk of death in a dose-response fashion [0 metrics, reference; 1, hazard ratio (HR) 0.96 (0.89-1.03); 2, HR 0.92 (0.87-0.98); 3, HR 0.85 (0.80-0.90); 2 vs 1, HR 0.96 (0.91-1.01); 3 vs 1, HR 0.89 (0.84-0.93); 3 vs 2, HR 0.95 (0.89-0.95)]. Performance on metrics in combination was associated with lower risk of death [LN + AC, HR 0.86 (0.78-0.95); AC + timely AC, HR 0.92 (0.87-0.98); LN + AC + timely AC, HR 0.85 (0.80-0.90)], whereas individual measures were not [LN, HR 0.95 (0.88-1.04); AC, HR 0.95 (0.87-1.05)]. Less than half of hospitals perform well on these NQF colon cancer metrics concurrently, and high performance on individual measures is not associated with improved survival. 
Quality improvement efforts should shift focus from individual measures to defining composite measures encompassing the overall multimodal care pathway and capturing successful transitions from one care modality to another.

  7. f(T) gravity and energy distribution in Landau-Lifshitz prescription

    NASA Astrophysics Data System (ADS)

    Ganiou, M. G.; Houndjo, M. J. S.; Tossa, J.

    We investigate in this paper the Landau-Lifshitz energy distribution in the framework of f(T) theory, viewed as a modified version of Teleparallel theory. Starting from some important Teleparallel theory results on the localization of energy, our investigations generalize the Landau-Lifshitz prescription for the computation of the energy-momentum complex to the framework of f(T) gravity, as is done in the modified versions of General Relativity. In a first step, we compute the energy density for three plane-symmetric metrics in vacuum. We find for the second metric that the energy density vanishes independently of f(T) models. We find that the Teleparallel Landau-Lifshitz energy-momentum complex formulations for these metrics are different from those obtained in General Relativity for the same metrics. Second, the calculations are performed for the cosmic string spacetime metric. The result is that the energy distribution depends on the mass M and the radius r of the cosmic string and is strongly affected by the parameter of the considered quadratic and cubic f(T) models. Our investigation with this metric yields interesting results that could be tested against astrophysical hypotheses.

  8. On the new metrics for IMRT QA verification.

    PubMed

    Garcia-Romero, Alejandro; Hernandez-Vitoria, Araceli; Millan-Cebrian, Esther; Alba-Escorihuela, Veronica; Serrano-Zabaleta, Sonia; Ortega-Pardina, Pablo

    2016-11-01

    The aim of this work is to search for new metrics that could give more reliable acceptance/rejection criteria on the IMRT verification process and to offer solutions to the discrepancies found among different conventional metrics. Therefore, besides conventional metrics, new ones are proposed and evaluated with new tools to find correlations among them. These new metrics are based on the processing of the dose-volume histogram information, evaluating the absorbed dose differences, the dose constraint fulfillment, or modified biomathematical treatment outcome models such as tumor control probability (TCP) and normal tissue complication probability (NTCP). An additional purpose is to establish whether the new metrics yield the same acceptance/rejection plan distribution as the conventional ones. Fifty eight treatment plans concerning several patient locations are analyzed. All of them were verified prior to the treatment, using conventional metrics, and retrospectively after the treatment with the new metrics. These new metrics include the definition of three continuous functions, based on dose-volume histograms resulting from measurements evaluated with a reconstructed dose system and also with a Monte Carlo redundant calculation. The 3D gamma function for every volume of interest is also calculated. The information is also processed to obtain ΔTCP or ΔNTCP for the considered volumes of interest. These biomathematical treatment outcome models have been modified to increase their sensitivity to dose changes. A robustness index from a radiobiological point of view is defined to classify plans in robustness against dose changes. Dose difference metrics can be condensed in a single parameter: the dose difference global function, with an optimal cutoff that can be determined from a receiver operating characteristics (ROC) analysis of the metric. 
It is not always possible to correlate differences in biomathematical treatment outcome models with dose difference metrics. This is due to the fact that the dose constraint is often far from the dose that has an actual impact on the radiobiological model, and therefore, biomathematical treatment outcome models are insensitive to big dose differences between the verification system and the treatment planning system. As an alternative, the use of modified radiobiological models which provides a better correlation is proposed. In any case, it is better to choose robust plans from a radiobiological point of view. The robustness index defined in this work is a good predictor of the plan rejection probability according to metrics derived from modified radiobiological models. The global 3D gamma-based metric calculated for each plan volume shows a good correlation with the dose difference metrics and presents a good performance in the acceptance/rejection process. Some discrepancies have been found in dose reconstruction depending on the algorithm employed. Significant and unavoidable discrepancies were found between the conventional metrics and the new ones. The dose difference global function and the 3D gamma for each plan volume are good classifiers regarding dose difference metrics. ROC analysis is useful to evaluate the predictive power of the new metrics. The correlation between biomathematical treatment outcome models and the dose difference-based metrics is enhanced by using modified TCP and NTCP functions that take into account the dose constraints for each plan. The robustness index is useful to evaluate if a plan is likely to be rejected. Conventional verification should be replaced by the new metrics, which are clinically more relevant.

  9. Instrument Motion Metrics for Laparoscopic Skills Assessment in Virtual Reality and Augmented Reality.

    PubMed

    Fransson, Boel A; Chen, Chi-Ya; Noyes, Julie A; Ragle, Claude A

    2016-11-01

    To determine the construct and concurrent validity of instrument motion metrics for laparoscopic skills assessment in virtual reality and augmented reality simulators. Evaluation study. Veterinarian students (novice, n = 14) and veterinarians (experienced, n = 11) with no or variable laparoscopic experience. Participants' minimally invasive surgery (MIS) experience was determined by hospital records of MIS procedures performed in the Teaching Hospital. Basic laparoscopic skills were assessed by 5 tasks using a physical box trainer. Each participant completed 2 tasks for assessments in each type of simulator (virtual reality: bowel handling and cutting; augmented reality: object positioning and a pericardial window model). Motion metrics such as instrument path length, angle or drift, and economy of motion of each simulator were recorded. None of the motion metrics in a virtual reality simulator showed correlation with experience, or to the basic laparoscopic skills score. All metrics in augmented reality were significantly correlated with experience (time, instrument path, and economy of movement), except for the hand dominance metric. The basic laparoscopic skills score was correlated to all performance metrics in augmented reality. The augmented reality motion metrics differed between American College of Veterinary Surgeons diplomates and residents, whereas basic laparoscopic skills score and virtual reality metrics did not. Our results provide construct validity and concurrent validity for motion analysis metrics for an augmented reality system, whereas a virtual reality system was validated only for the time score. © Copyright 2016 by The American College of Veterinary Surgeons.

  10. Validation Metrics for Improving Our Understanding of Turbulent Transport - Moving Beyond Proof by Pretty Picture and Loud Assertion

    NASA Astrophysics Data System (ADS)

    Holland, C.

    2013-10-01

    Developing validated models of plasma dynamics is essential for confident predictive modeling of current and future fusion devices. This tutorial will present an overview of the key guiding principles and practices for state-of-the-art validation studies, illustrated using examples from investigations of turbulent transport in magnetically confined plasmas. The primary focus of the talk will be the development of quantitative validation metrics, which are essential for moving beyond qualitative and subjective assessments of model performance and fidelity. Particular emphasis and discussion is given to (i) the need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between simulation and experiment, and (ii) the importance of robust uncertainty quantification and its inclusion within the metrics. To illustrate these concepts, we first review the structure and key insights gained from commonly used "global" transport model metrics (e.g. predictions of incremental stored energy or radially-averaged temperature), as well as their limitations. Building upon these results, a new form of turbulent transport metrics is then proposed, which focuses upon comparisons of predicted local gradients and fluctuation characteristics against observation. We demonstrate the utility of these metrics by applying them to simulations and modeling of a newly developed "validation database" derived from the results of a systematic, multi-year turbulent transport validation campaign on the DIII-D tokamak, in which comprehensive profile and fluctuation measurements have been obtained from a wide variety of heating and confinement scenarios. Finally, we discuss extensions of these metrics and their underlying design concepts to other areas of plasma confinement research, including both magnetohydrodynamic stability and integrated scenario modeling. Supported by the US DOE under DE-FG02-07ER54917 and DE-FC02-08ER54977.

  11. ExaSAT: An exascale co-design tool for performance modeling

    DOE PAGES

    Unat, Didem; Chan, Cy; Zhang, Weiqun; ...

    2015-02-09

    One of the emerging challenges to designing HPC systems is understanding and projecting the requirements of exascale applications. In order to determine the performance consequences of different hardware designs, analytic models are essential because they can provide fast feedback to the co-design centers and chip designers without costly simulations. However, current attempts to analytically model program performance typically rely on the user manually specifying a performance model. Here we introduce the ExaSAT framework that automates the extraction of parameterized performance models directly from source code using compiler analysis. The parameterized analytic model enables quantitative evaluation of a broad range of hardware design trade-offs and software optimizations on a variety of different performance metrics, with a primary focus on data movement as a metric. Finally, we demonstrate the ExaSAT framework’s ability to perform deep code analysis of a proxy application from the Department of Energy Combustion Co-design Center to illustrate its value to the exascale co-design process. ExaSAT analysis provides insights into the hardware and software trade-offs and lays the groundwork for exploring a more targeted set of design points using cycle-accurate architectural simulators.

  12. Corroborating tomographic defect metrics with mechanical response in an additively manufactured precipitation-hardened stainless steel

    NASA Astrophysics Data System (ADS)

    Madison, Jonathan D.; Underwood, Olivia D.; Swiler, Laura P.; Boyce, Brad L.; Jared, Bradley H.; Rodelas, Jeff M.; Salzbrenner, Bradley C.

    2018-04-01

    The intrinsic relation between structure and performance is a foundational tenet of almost all materials science investigations. While the specific form of this relation is dictated by material system, processing route and performance metric of interest, it is widely agreed that appropriate characterization of a material allows for greater accuracy in understanding and/or predicting material response. However, in the context of additive manufacturing, prior models and expectations of material performance must be revisited as performance often diverges from traditional values, even among well-explored material systems. This work utilizes micro-computed tomography to quantify porosity and lack of fusion defects in an additively manufactured stainless steel and relates these metrics to performance across a statistically significant population using high-throughput mechanical testing. The degree to which performance in additively manufactured stainless steel can and cannot be correlated to detectable porosity will be presented and suggestions for performing similar experiments will be provided.

  13. Can metric-based approaches really improve multi-model climate projections? A perfect model framework applied to summer temperature change in France.

    NASA Astrophysics Data System (ADS)

    Boé, Julien; Terray, Laurent

    2014-05-01

    Ensemble approaches for climate change projections have become ubiquitous. Because of large model-to-model variations and, generally, lack of rationale for the choice of a particular climate model against others, it is widely accepted that future climate change and its impacts should not be estimated based on a single climate model. Generally, as a default approach, the multi-model ensemble mean (MMEM) is considered to provide the best estimate of climate change signals. The MMEM approach is based on the implicit hypothesis that all the models provide equally credible projections of future climate change. This hypothesis is unlikely to be true and ideally one would want to give more weight to more realistic models. A major issue with this alternative approach lies in the assessment of the relative credibility of future climate projections from different climate models, as they can only be evaluated against present-day observations: which present-day metric(s) should be used to decide which models are "good" and which models are "bad" in the future climate? Once a supposedly informative metric has been found, other issues arise. What is the best statistical method to combine multiple model results, taking into account their relative credibility measured by a given metric? How can one be sure in the end that the metric-based estimate of future climate change is not in fact less realistic than the MMEM? It is impossible to provide strict answers to those questions in the climate change context. Yet, in this presentation, we propose a methodological approach based on a perfect model framework that could provide some useful partial answers to the questions previously mentioned. The basic idea is to take a random climate model in the ensemble and treat it as if it were the truth (results of this model, in both past and future climate, are called "synthetic observations"). 
Then, all the other members from the multi-model ensemble are used to derive, through a metric-based approach, a posterior estimate of climate change, based on the synthetic observation of the metric. Finally, it is possible to compare the posterior estimate to the synthetic observation of future climate change to evaluate the skill of the method. The main objective of this presentation is to describe and apply this perfect model framework to test different methodological issues associated with non-uniform model weighting and similar metric-based approaches. The methodology presented is general, but will be applied to the specific case of summer temperature change in France, for which previous work has suggested potentially useful metrics associated with soil-atmosphere and cloud-temperature interactions. The relative performances of different simple statistical approaches to combine multiple model results based on metrics will be tested. The impact of ensemble size, observational errors, internal variability, and model similarity will be characterized. The potential improvements associated with metric-based approaches compared to the MMEM in terms of errors and uncertainties will be quantified.
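
    The perfect model procedure described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the Gaussian weighting function, the `sigma` parameter, and the choice to loop over every model as pseudo-truth (rather than drawing one at random) are all assumptions made for the example.

```python
import math


def perfect_model_test(present_metric, future_change, sigma=1.0):
    """Leave-one-out perfect-model test (illustrative sketch).

    present_metric[i]: present-day metric value of model i
    future_change[i]:  projected future change of model i
    Returns (rmse_weighted, rmse_mmem) over all pseudo-truth choices.
    """
    n = len(present_metric)
    sq_err_weighted = 0.0
    sq_err_mmem = 0.0
    for truth in range(n):
        # Treat model `truth` as the "synthetic observation".
        others = [i for i in range(n) if i != truth]
        # Assumed weighting: Gaussian in the distance between each model's
        # metric and the synthetic observation of the metric.
        weights = [
            math.exp(-((present_metric[i] - present_metric[truth]) / sigma) ** 2)
            for i in others
        ]
        total = sum(weights)
        weighted = sum(w * future_change[i] for w, i in zip(weights, others)) / total
        # Unweighted multi-model ensemble mean of the remaining members.
        mmem = sum(future_change[i] for i in others) / len(others)
        sq_err_weighted += (weighted - future_change[truth]) ** 2
        sq_err_mmem += (mmem - future_change[truth]) ** 2
    return math.sqrt(sq_err_weighted / n), math.sqrt(sq_err_mmem / n)
```

    When the present-day metric is strongly related to the future change, the weighted estimate should show a lower error than the MMEM; when it is not, the comparison indicates that the metric adds little.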

  14. Measuring the Performance and Intelligence of Systems: Proceedings of the 2002 PerMIS Workshop

    NASA Technical Reports Server (NTRS)

    Messina, E. R.; Meystel, A. M.

    2002-01-01

    Contents include the following: Performance Metrics; Performance of Multiple Agents; Performance of Mobility Systems; Performance of Planning Systems; General Discussion Panel 1; Uncertainty of Representation I; Performance of Robots in Hazardous Domains; Modeling Intelligence; Modeling of Mind; Measuring Intelligence; Grouping: A Core Procedure of Intelligence; Uncertainty in Representation II; Towards Universal Planning/Control Systems.

  15. Fronto-Temporal Connectivity Predicts ECT Outcome in Major Depression.

    PubMed

    Leaver, Amber M; Wade, Benjamin; Vasavada, Megha; Hellemann, Gerhard; Joshi, Shantanu H; Espinoza, Randall; Narr, Katherine L

    2018-01-01

    Electroconvulsive therapy (ECT) is arguably the most effective available treatment for severe depression. Recent studies have used MRI data to predict clinical outcome to ECT and other antidepressant therapies. One challenge facing such studies is selecting from among the many available metrics, which characterize complementary and sometimes non-overlapping aspects of brain function and connectomics. Here, we assessed the ability of aggregated, functional MRI metrics of basal brain activity and connectivity to predict antidepressant response to ECT using machine learning. A radial support vector machine was trained using arterial spin labeling (ASL) and blood-oxygen-level-dependent (BOLD) functional magnetic resonance imaging (fMRI) metrics from n = 46 (26 female, mean age 42) depressed patients prior to ECT (majority right-unilateral stimulation). Image preprocessing was applied using standard procedures, and metrics included cerebral blood flow in ASL, and regional homogeneity, fractional amplitude of low-frequency modulations, and graph theory metrics (strength, local efficiency, and clustering) in BOLD data. A 5-repeated 5-fold cross-validation procedure with nested feature-selection validated model performance. Linear regressions were applied post hoc to aid interpretation of discriminative features. The range of balanced accuracy in models performing statistically above chance was 58-68%. Here, prediction of non-responders was slightly higher than for responders (maximum performance 74 and 64%, respectively). Several features were consistently selected across cross-validation folds, mostly within frontal and temporal regions. Among these were connectivity strength among: a fronto-parietal network [including left dorsolateral prefrontal cortex (DLPFC)], motor and temporal networks (near ECT electrodes), and/or subgenual anterior cingulate cortex (sgACC). 
Our data indicate that pattern classification of multimodal fMRI metrics can successfully predict ECT outcome, particularly for individuals who will not respond to treatment. Notably, connectivity with networks highly relevant to ECT and depression were consistently selected as important predictive features. These included the left DLPFC and the sgACC, which are both targets of other neurostimulation therapies for depression, as well as connectivity between motor and right temporal cortices near electrode sites. Future studies that probe additional functional and structural MRI metrics and other patient characteristics may further improve the predictive power of these and similar models.
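
    For readers unfamiliar with the balanced accuracy figure reported above, a minimal sketch of how it is commonly computed for a two-class problem is given here. The label coding (1 = responder, 0 = non-responder) is an assumption for illustration; the study's own pipeline is not reproduced.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for a binary problem.

    Assumed label coding: 1 = responder, 0 = non-responder.
    """
    recalls = []
    for label in (0, 1):
        indices = [i for i, y in enumerate(y_true) if y == label]
        if not indices:
            raise ValueError(f"no samples with label {label}")
        correct = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)
```

    Because each class contributes equally, this measure is not inflated when one outcome is more frequent than the other.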

  16. Multi-linear model set design based on the nonlinearity measure and H-gap metric.

    PubMed

    Shaghaghi, Davood; Fatehi, Alireza; Khaki-Sedigh, Ali

    2017-05-01

    This paper proposes a model bank selection method for a large class of nonlinear systems with wide operating ranges. In particular, a nonlinearity measure and the H-gap metric are used to provide an effective algorithm to design a model bank for the system. Then, the proposed model bank is combined with model predictive controllers to design a high-performance advanced process controller. The advantage of this method is the reduction of excessive switching between models and a decrease in the computational complexity of the controller bank, which can lead to performance improvement of the control system. The effectiveness of the method is verified by simulations as well as experimental studies on a pH neutralization laboratory apparatus, which confirm the efficiency of the proposed algorithm. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  17. A biologically plausible computational model for auditory object recognition.

    PubMed

    Larson, Eric; Billimoria, Cyrus P; Sen, Kamal

    2009-01-01

    Object recognition is a task of fundamental importance for sensory systems. Although this problem has been intensively investigated in the visual system, relatively little is known about the recognition of complex auditory objects. Recent work has shown that spike trains from individual sensory neurons can be used to discriminate between and recognize stimuli. Multiple groups have developed spike similarity or dissimilarity metrics to quantify the differences between spike trains. Using a nearest-neighbor approach the spike similarity metrics can be used to classify the stimuli into groups used to evoke the spike trains. The nearest prototype spike train to the tested spike train can then be used to identify the stimulus. However, how biological circuits might perform such computations remains unclear. Elucidating this question would facilitate the experimental search for such circuits in biological systems, as well as the design of artificial circuits that can perform such computations. Here we present a biologically plausible model for discrimination inspired by a spike distance metric using a network of integrate-and-fire model neurons coupled to a decision network. We then apply this model to the birdsong system in the context of song discrimination and recognition. We show that the model circuit is effective at recognizing individual songs, based on experimental input data from field L, the avian primary auditory cortex analog. We also compare the performance and robustness of this model to two alternative models of song discrimination: a model based on coincidence detection and a model based on firing rate.

  18. SU-D-207B-07: Development of a CT-Radiomics Based Early Response Prediction Model During Delivery of Chemoradiation Therapy for Pancreatic Cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klawikowski, S; Christian, J; Schott, D

    Purpose: Pilot study developing a CT-texture based model for early assessment of treatment response during the delivery of chemoradiation therapy (CRT) for pancreatic cancer. Methods: Daily CT data acquired for 24 pancreatic head cancer patients using CT-on-rails, during the routine CT-guided CRT delivery with a radiation dose of 50.4 Gy in 28 fractions, were analyzed. The pancreas head was contoured on each daily CT. Texture analysis was performed within the pancreas head contour using a research tool (IBEX). Over 1300 texture metrics including: grey level co-occurrence, run-length, histogram, neighborhood intensity difference, and geometrical shape features were calculated for each daily CT. Metric-trend information was established by finding the best fit of either a linear, quadratic, or exponential function for each metric value versus accumulated dose. Thus all the daily CT texture information was consolidated into a best-fit trend type for a given patient and texture metric. Linear correlation was performed between the patient histological response vector (good, medium, poor) and all combinations of 23 patient subgroups (statistical jackknife) determining which metrics were most correlated to response and repeatedly reliable across most patients. Control correlations against CT scanner, reconstruction kernel, and gated/nongated CT images were also calculated. Euclidean distance measure was used to group/sort patient vectors based on the data of these trend-response metrics. Results: We found four specific trend-metrics (Gray Level Coocurence Matrix311-1InverseDiffMomentNorm, Gray Level Coocurence Matrix311-1InverseDiffNorm, Gray Level Coocurence Matrix311-1 Homogeneity2, and Intensity Direct Local StdMean) that were highly correlated with patient response and repeatedly reliable. Our four trend-metric model successfully ordered our pilot response dataset (p=0.00070). 
We found no significant correlation to our control parameters: gating (p=0.7717), scanner (p=0.9741), and kernel (p=0.8586). Conclusion: We have successfully created a CT-texture based early treatment response prediction model using the CTs acquired during the delivery of chemoradiation therapy for pancreatic cancer. Future testing is required to validate the model with more patient data.

  19. Model-based metrics of human-automation function allocation in complex work environments

    NASA Astrophysics Data System (ADS)

    Kim, So Young

    Function allocation is the design decision which assigns work functions to all agents in a team, both human and automated. Efforts to guide function allocation systematically have been studied in many fields such as engineering, human factors, team and organization design, management science, and cognitive systems engineering. Each field focuses on certain aspects of function allocation, but not all; thus, an independent discussion of each does not address all necessary issues with function allocation. Four distinctive perspectives emerged from a review of these fields: technology-centered, human-centered, team-oriented, and work-oriented. Each perspective focuses on different aspects of function allocation: capabilities and characteristics of agents (automation or human), team structure and processes, and work structure and the work environment. Together, these perspectives identify the following eight issues with function allocation: 1) Workload, 2) Incoherency in function allocations, 3) Mismatches between responsibility and authority, 4) Interruptive automation, 5) Automation boundary conditions, 6) Function allocation preventing human adaptation to context, 7) Function allocation destabilizing the humans' work environment, and 8) Mission Performance. Addressing these issues systematically requires formal models and simulations that include all necessary aspects of human-automation function allocation: the work environment, the dynamics inherent to the work, agents, and relationships among them. Also, addressing these issues requires not only a (static) model, but also a (dynamic) simulation that captures temporal aspects of work such as the timing of actions and their impact on the agent's work. 
Therefore, with properly modeled work as described by the work environment, the dynamics inherent to the work, agents, and relationships among them, a modeling framework developed by this thesis, which includes static work models and dynamic simulation, can capture the issues with function allocation. Then, based on the eight issues, eight types of metrics are established. The purpose of these metrics is to assess the extent to which each issue exists with a given function allocation. Specifically, the eight types of metrics assess workload, coherency of a function allocation, mismatches between responsibility and authority, interruptive automation, automation boundary conditions, human adaptation to context, stability of the human's work environment, and mission performance. Finally, to validate the modeling framework and the metrics, a case study was conducted modeling four different function allocations between a pilot and flight deck automation during the arrival and approach phases of flight. A range of pilot cognitive control modes and maximum human taskload limits were also included in the model. The metrics were assessed for these four function allocations and analyzed to validate capability of the metrics to identify important issues in given function allocations. In addition, the design insights provided by the metrics are highlighted. This thesis concludes with a discussion of mechanisms for further validating the modeling framework and function allocation metrics developed here, and highlights where these developments can be applied in research and in the design of function allocations in complex work environments such as aviation operations.

  20. Hydrologic Model Development and Calibration: Contrasting a Single- and Multi-Objective Approach for Comparing Model Performance

    NASA Astrophysics Data System (ADS)

    Asadzadeh, M.; Maclean, A.; Tolson, B. A.; Burn, D. H.

    2009-05-01

    Hydrologic model calibration aims to find a set of parameters that adequately simulates observations of watershed behavior, such as streamflow, or a state variable, such as snow water equivalent (SWE). There are different metrics for evaluating calibration effectiveness that involve quantifying prediction errors, such as the Nash-Sutcliffe (NS) coefficient and bias evaluated for the entire calibration period, on a seasonal basis, for low flows, or for high flows. Many of these metrics are conflicting such that the set of parameters that maximizes the high flow NS differs from the set of parameters that maximizes the low flow NS. Conflicting objectives are very likely when different calibration objectives are based on different fluxes and/or state variables (e.g., NS based on streamflow versus SWE). One of the most popular ways to balance different metrics is to aggregate them based on their importance and find the set of parameters that optimizes a weighted sum of the efficiency metrics. Comparing alternative hydrologic models (e.g., assessing model improvement when a process or more detail is added to the model) based on the aggregated objective might be misleading since it represents one point on the tradeoff of desired error metrics. To derive a more comprehensive model comparison, we solved a bi-objective calibration problem to estimate the tradeoff between two error metrics for each model. Although this approach is computationally more expensive than the aggregation approach, it results in a better understanding of the effectiveness of selected models at each level of every error metric and therefore provides a better rationale for judging relative model quality. The two alternative models used in this study are two MESH hydrologic models (version 1.2) of the Wolf Creek Research basin that differ in their watershed spatial discretization (a single Grouped Response Unit, GRU, versus multiple GRUs). 
The MESH model, currently under development by Environment Canada, is a coupled land-surface and hydrologic model. Results will demonstrate the conclusions a modeller might make regarding the value of additional watershed spatial discretization under both an aggregated (single-objective) and multi-objective model comparison framework.
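
    The contrast between an aggregated objective and a bi-objective tradeoff can be illustrated with a short sketch. The Nash-Sutcliffe coefficient below follows its standard definition; the Pareto filter is a generic illustration of extracting a tradeoff set when both objectives are maximized and is not the optimization algorithm used in the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared
    simulation error to the variance of the observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator


def pareto_front(points):
    """Keep the (objective_1, objective_2) pairs that no other pair dominates,
    assuming both objectives are to be maximized (e.g., high-flow NS and
    low-flow NS for candidate parameter sets)."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front
```

    A weighted-sum calibration returns a single point from such a front, which is why comparing two models on that point alone can be misleading.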

  1. A New Metric for Land-Atmosphere Coupling Strength: Applications on Observations and Modeling

    NASA Astrophysics Data System (ADS)

    Tang, Q.; Xie, S.; Zhang, Y.; Phillips, T. J.; Santanello, J. A., Jr.; Cook, D. R.; Riihimaki, L.; Gaustad, K.

    2017-12-01

    A new metric is proposed to quantify the land-atmosphere (LA) coupling strength and is elaborated by correlating the surface evaporative fraction and impacting land and atmosphere variables (e.g., soil moisture, vegetation, and radiation). Based upon multiple linear regression, this approach simultaneously considers multiple factors and thus represents complex LA coupling mechanisms better than existing single variable metrics. The standardized regression coefficients quantify the relative contributions from individual drivers in a consistent manner, avoiding the potential inconsistency in relative influence of conventional metrics. Moreover, the unique expendable feature of the new method allows us to verify and explore potentially important coupling mechanisms. Our observation-based application of the new metric shows moderate coupling with large spatial variations at the U.S. Southern Great Plains. The relative importance of soil moisture vs. vegetation varies by location. We also show that LA coupling strength is generally underestimated by single variable methods due to their incompleteness. We also apply this new metric to evaluate the representation of LA coupling in the Accelerated Climate Modeling for Energy (ACME) V1 Contiguous United States (CONUS) regionally refined model (RRM). This work is performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-734201

  2. Simulation of devices mobility to estimate wireless channel quality metrics in 5G networks

    NASA Astrophysics Data System (ADS)

    Orlov, Yu.; Fedorov, S.; Samuylov, A.; Gaidamaka, Yu.; Molchanov, D.

    2017-07-01

    The problem of channel quality estimation for devices in a wireless 5G network is formulated. As the performance metric of interest we choose the signal-to-interference-plus-noise ratio, which depends essentially on the distance between the communicating devices. A model with a plurality of moving devices in a bounded three-dimensional space and a simulation algorithm to determine the distances between the devices for a given motion model are devised.
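
    To show how the signal-to-interference-plus-noise ratio follows from inter-device distances, a minimal sketch is given below. The power-law path-loss model, equal transmit powers, and all parameter names and defaults are assumptions for illustration; the record does not state the propagation model used.

```python
import math


def sinr(tx_pos, rx_pos, interferer_positions,
         tx_power=1.0, noise=1e-9, path_loss_exp=2.0):
    """SINR at a receiver in 3-D space (illustrative sketch).

    Assumes every device transmits at `tx_power` and that received power
    decays as distance ** (-path_loss_exp). Positions are (x, y, z) tuples;
    a transmitter at zero distance from the receiver is not handled.
    """
    def received_power(source):
        distance = math.dist(source, rx_pos)
        return tx_power * distance ** (-path_loss_exp)

    signal = received_power(tx_pos)
    interference = sum(received_power(p) for p in interferer_positions)
    return signal / (interference + noise)
```

    In a mobility simulation, the positions would be updated at each time step according to the motion model and the ratio recomputed.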

  3. Human Performance Optimization Metrics: Consensus Findings, Gaps, and Recommendations for Future Research.

    PubMed

    Nindl, Bradley C; Jaffin, Dianna P; Dretsch, Michael N; Cheuvront, Samuel N; Wesensten, Nancy J; Kent, Michael L; Grunberg, Neil E; Pierce, Joseph R; Barry, Erin S; Scott, Jonathan M; Young, Andrew J; OʼConnor, Francis G; Deuster, Patricia A

    2015-11-01

    Human performance optimization (HPO) is defined as "the process of applying knowledge, skills and emerging technologies to improve and preserve the capabilities of military members, and organizations to execute essential tasks." The lack of consensus for operationally relevant and standardized metrics that meet joint military requirements has been identified as the single most important gap for research and application of HPO. In 2013, the Consortium for Health and Military Performance hosted a meeting to develop a toolkit of standardized HPO metrics for use in military and civilian research, and potentially for field applications by commanders, units, and organizations. Performance was considered from a holistic perspective as being influenced by various behaviors and barriers. To accomplish the goal of developing a standardized toolkit, key metrics were identified and evaluated across a spectrum of domains that contribute to HPO: physical performance, nutritional status, psychological status, cognitive performance, environmental challenges, sleep, and pain. These domains were chosen based on relevant data with regard to performance enhancers and degraders. The specific objectives at this meeting were to (a) identify and evaluate current metrics for assessing human performance within selected domains; (b) prioritize metrics within each domain to establish a human performance assessment toolkit; and (c) identify scientific gaps and the needed research to more effectively assess human performance across domains. This article provides a summary of 150 total HPO metrics across multiple domains that can be used as a starting point, the beginning of an HPO toolkit: physical fitness (29 metrics), nutrition (24 metrics), psychological status (36 metrics), cognitive performance (35 metrics), environment (12 metrics), sleep (9 metrics), and pain (5 metrics). 
These metrics can be particularly valuable as the military emphasizes a renewed interest in Human Dimension efforts, and leverages science, resources, programs, and policies to optimize the performance capacities of all Service members.

  4. An annual plant growth proxy in the Mojave Desert using MODIS-EVI data

    USGS Publications Warehouse

    Wallace, C.S.A.; Thomas, K.A.

    2008-01-01

    In the arid Mojave Desert, the phenological response of vegetation is largely dependent upon the timing and amount of rainfall, and maps of annual plant cover at any one point in time can vary widely. Our study developed relative annual plant growth models as proxies for annual plant cover using metrics that captured phenological variability in Moderate-Resolution Imaging Spectroradiometer (MODIS) Enhanced Vegetation Index (EVI) satellite images. We used landscape phenologies revealed in MODIS data together with ecological knowledge of annual plant seasonality to develop a suite of metrics to describe annual growth on a yearly basis. Each of these metrics was applied to temporally-composited MODIS-EVI images to develop a relative model of annual growth. Each model was evaluated by testing how well it predicted field estimates of annual cover collected during 2003 and 2005 at the Mojave National Preserve. The best performing metric was the spring difference metric, which compared the average of three spring MODIS-EVI composites of a given year to that of 2002, a year of record drought. The spring difference metric showed correlations with annual plant cover of R² = 0.61 for 2005 and R² = 0.47 for 2003. Although the correlation is moderate, we consider it supportive given the characteristics of the field data, which were collected for a different study in a localized area and are not ideal for calibration to MODIS pixels. A proxy for annual growth potential was developed from the spring difference metric of 2005 for use as an environmental data layer in desert tortoise habitat modeling. The application of the spring difference metric to other imagery years presents potential for other applications such as fuels, invasive species, and dust-emission monitoring in the Mojave Desert.

  5. An Annual Plant Growth Proxy in the Mojave Desert Using MODIS-EVI Data.

    PubMed

    Wallace, Cynthia S A; Thomas, Kathryn A

    2008-12-03

    In the arid Mojave Desert, the phenological response of vegetation is largely dependent upon the timing and amount of rainfall, and maps of annual plant cover at any one point in time can vary widely. Our study developed relative annual plant growth models as proxies for annual plant cover using metrics that captured phenological variability in Moderate-Resolution Imaging Spectroradiometer (MODIS) Enhanced Vegetation Index (EVI) satellite images. We used landscape phenologies revealed in MODIS data together with ecological knowledge of annual plant seasonality to develop a suite of metrics to describe annual growth on a yearly basis. Each of these metrics was applied to temporally-composited MODIS-EVI images to develop a relative model of annual growth. Each model was evaluated by testing how well it predicted field estimates of annual cover collected during 2003 and 2005 at the Mojave National Preserve. The best performing metric was the spring difference metric, which compared the average of three spring MODIS-EVI composites of a given year to that of 2002, a year of record drought. The spring difference metric showed correlations with annual plant cover of R² = 0.61 for 2005 and R² = 0.47 for 2003. Although the correlation is moderate, we consider it supportive given the characteristics of the field data, which were collected for a different study in a localized area and are not ideal for calibration to MODIS pixels. A proxy for annual growth potential was developed from the spring difference metric of 2005 for use as an environmental data layer in desert tortoise habitat modeling. The application of the spring difference metric to other imagery years presents potential for other applications such as fuels, invasive species, and dust-emission monitoring in the Mojave Desert.
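
    The spring difference metric can be sketched per pixel as below. The abstract says only that the spring average of a given year was "compared" to that of 2002; taking a simple subtraction is an assumption suggested by the metric's name, and the function and argument names are hypothetical.

```python
def spring_difference(spring_evi_year, spring_evi_baseline):
    """Mean of the spring EVI composites for a given year minus the mean of
    the spring composites for the baseline drought year (2002 in the study).

    Each argument is a sequence of EVI values for one pixel, one value per
    spring composite (three composites in the study).
    """
    mean_year = sum(spring_evi_year) / len(spring_evi_year)
    mean_baseline = sum(spring_evi_baseline) / len(spring_evi_baseline)
    return mean_year - mean_baseline
```

    Applying the function to every pixel of the composited images would yield the relative annual growth layer described above.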

  6. FAST COGNITIVE AND TASK ORIENTED, ITERATIVE DATA DISPLAY (FACTOID)

    DTIC Science & Technology

    2017-06-01

    approaches. As a result, the following assumptions guided our efforts in developing modeling and descriptive metrics for evaluation purposes...Application Evaluation . Our analytic workflow for evaluation is to first provide descriptive statistics about applications across metrics (performance...distributions for evaluation purposes because the goal of evaluation is accurate description , not inference (e.g., prediction). Outliers depicted

  7. Frontal Representation as a Metric of Model Performance

    NASA Astrophysics Data System (ADS)

    Douglass, E.; Mask, A. C.

    2017-12-01

    Representation of fronts detected by altimetry is used to evaluate the performance of the HYCOM global operational product. Fronts are detected and assessed in daily alongtrack altimetry. Then, modeled sea surface height is interpolated to the locations of the alongtrack observations, and the same frontal detection algorithm is applied to the interpolated model output. The percentage of fronts found in the altimetry and replicated in the model gives a score (0-100) that assesses the model's ability to replicate fronts in the proper location with the proper orientation. Further information can be obtained from determining the number of "extra" fronts found in the model but not in the altimetry, and from assessing the horizontal and vertical dimensions of the front in the model as compared to observations. Finally, the sensitivity of this metric to choices regarding the smoothing of noisy alongtrack altimetry observations, and to the minimum size of fronts being analyzed, is assessed.
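
    The score described above can be sketched as a simple set comparison. This is a deliberate simplification: fronts are represented here by hypothetical along-track location identifiers and matched exactly, whereas the actual method also checks orientation and presumably allows some tolerance in location.

```python
def frontal_score(obs_fronts, model_fronts):
    """Return (score, extra_fronts) for one comparison.

    obs_fronts:   set of location identifiers where altimetry shows a front
    model_fronts: set of location identifiers where the model shows a front
    score is the percentage (0-100) of observed fronts replicated in the
    model; extra_fronts counts model fronts absent from the altimetry.
    """
    if not obs_fronts:
        raise ValueError("no observed fronts to score against")
    matched = obs_fronts & model_fronts
    score = 100.0 * len(matched) / len(obs_fronts)
    extra_fronts = len(model_fronts - obs_fronts)
    return score, extra_fronts
```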

  8. Productivity in Pediatric Palliative Care: Measuring and Monitoring an Elusive Metric.

    PubMed

    Kaye, Erica C; Abramson, Zachary R; Snaman, Jennifer M; Friebert, Sarah E; Baker, Justin N

    2017-05-01

    Workforce productivity is poorly defined in health care. Particularly in the field of pediatric palliative care (PPC), the absence of consensus metrics impedes aggregation and analysis of data to track workforce efficiency and effectiveness. Lack of uniformly measured data also compromises the development of innovative strategies to improve productivity and hinders investigation of the link between productivity and quality of care, which are interrelated but not interchangeable. To review the literature regarding the definition and measurement of productivity in PPC; to identify barriers to productivity within traditional PPC models; and to recommend novel metrics to study productivity as a component of quality care in PPC. PubMed ® and Cochrane Database of Systematic Reviews searches for scholarly literature were performed using key words (pediatric palliative care, palliative care, team, workforce, workflow, productivity, algorithm, quality care, quality improvement, quality metric, inpatient, hospital, consultation, model) for articles published between 2000 and 2016. Organizational searches of Center to Advance Palliative Care, National Hospice and Palliative Care Organization, National Association for Home Care & Hospice, American Academy of Hospice and Palliative Medicine, Hospice and Palliative Nurses Association, National Quality Forum, and National Consensus Project for Quality Palliative Care were also performed. Additional semistructured interviews were conducted with directors from seven prominent PPC programs across the U.S. to review standard operating procedures for PPC team workflow and productivity. Little consensus exists in the PPC field regarding optimal ways to define, measure, and analyze provider and program productivity. Barriers to accurate monitoring of productivity include difficulties with identification, measurement, and interpretation of metrics applicable to an interdisciplinary care paradigm. 
In the context of inefficiencies inherent to traditional consultation models, novel productivity metrics are proposed. Further research is needed to determine optimal metrics for monitoring productivity within PPC teams. Innovative approaches should be studied with the goal of improving efficiency of care without compromising value. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  9. Retinal Vascular and Oxygen Temporal Dynamic Responses to Light Flicker in Humans

    PubMed Central

    Felder, Anthony E.; Wanek, Justin; Blair, Norman P.

    2017-01-01

    Purpose To mathematically model the temporal dynamic responses of retinal vessel diameter (D), oxygen saturation (SO2), and inner retinal oxygen extraction fraction (OEF) to light flicker and to describe their responses to its cessation in humans. Methods In 16 healthy subjects (age: 60 ± 12 years), retinal oximetry was performed before, during, and after light flicker stimulation. At each time point, five metrics were measured: retinal arterial and venous D (DA, DV) and SO2 (SO2A, SO2V), and OEF. Intra- and intersubject variability of metrics was assessed by coefficient of variation of measurements before flicker within and among subjects, respectively. Metrics during flicker were modeled by exponential functions to determine the flicker-induced steady state metric values and the time constants of changes. Metrics after the cessation of flicker were compared to those before flicker. Results Intra- and intersubject variability for all metrics were less than 6% and 16%, respectively. At the flicker-induced steady state, DA and DV increased by 5%, SO2V increased by 7%, and OEF decreased by 13%. The time constants of DA and DV (14, 15 seconds) were twofold smaller than those of SO2V and OEF (39, 34 seconds). Within 26 seconds after the cessation of flicker, all metrics were not significantly different from before flicker values (P ≥ 0.07). Conclusions Mathematical modeling revealed considerable differences in the time courses of changes among metrics during flicker, indicating flicker duration should be considered separately for each metric. Future application of this method may be useful to elucidate alterations in temporal dynamic responses to light flicker due to retinal diseases. PMID:29098297

  10. The data quality analyzer: A quality control program for seismic data

    NASA Astrophysics Data System (ADS)

    Ringler, A. T.; Hagerty, M. T.; Holland, J.; Gonzales, A.; Gee, L. S.; Edwards, J. D.; Wilson, D.; Baker, A. M.

    2015-03-01

    The U.S. Geological Survey's Albuquerque Seismological Laboratory (ASL) has several initiatives underway to enhance and track the quality of data produced from ASL seismic stations and to improve communication about data problems to the user community. The Data Quality Analyzer (DQA) is one such development and is designed to characterize seismic station data quality in a quantitative and automated manner. The DQA consists of a metric calculator, a PostgreSQL database, and a Web interface: The metric calculator, SEEDscan, is a Java application that reads and processes miniSEED data and generates metrics based on a configuration file. SEEDscan compares hashes of metadata and data to detect changes in either and performs subsequent recalculations as needed. This ensures that the metric values are up to date and accurate. SEEDscan can be run as a scheduled task or on demand. The PostgreSQL database acts as a central hub where metric values and limited station descriptions are stored at the channel level with one-day granularity. The Web interface dynamically loads station data from the database and allows the user to make requests for time periods of interest, review specific networks and stations, plot metrics as a function of time, and adjust the contribution of various metrics to the overall quality grade of the station. The quantification of data quality is based on the evaluation of various metrics (e.g., timing quality, daily noise levels relative to long-term noise models, and comparisons between broadband data and event synthetics). Users may select which metrics contribute to the assessment and those metrics are aggregated into a "grade" for each station. 
    The DQA is being actively used for station diagnostics and evaluation based on the completed metrics (availability, gap count, timing quality, deviation from a global noise model, deviation from a station noise model, coherence between co-located sensors, and comparison between broadband data and synthetics for earthquakes) on stations in the Global Seismographic Network and Advanced National Seismic System.
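
    The two mechanisms described above, hash comparison to decide when a metric must be recomputed and user-adjustable weights that roll metrics up into a station grade, can be sketched roughly as follows. This is an illustrative sketch only: the function names, the 0-100 score scale, the metric names, and the weighted-mean aggregation are assumptions, not SEEDscan's or the DQA's actual implementation (SEEDscan itself is a Java application).

```python
import hashlib


def content_hash(payload: bytes) -> str:
    """Digest used to detect whether data or metadata has changed."""
    return hashlib.sha256(payload).hexdigest()


def needs_recalculation(stored_hash: str, payload: bytes) -> bool:
    """A metric is stale when the stored digest no longer matches the input."""
    return stored_hash != content_hash(payload)


def station_grade(scores: dict, weights: dict) -> float:
    """Weighted mean of metric scores (assumed to lie on a 0-100 scale).

    Metrics missing from `weights` get weight 0 and so do not contribute.
    """
    total_weight = sum(weights.get(name, 0.0) for name in scores)
    if total_weight <= 0:
        raise ValueError("at least one metric needs a positive weight")
    weighted_sum = sum(score * weights.get(name, 0.0) for name, score in scores.items())
    return weighted_sum / total_weight


# Hypothetical scores and weights for one station (not real DQA output).
scores = {"availability": 100.0, "timing_quality": 90.0, "noise_deviation": 70.0}
weights = {"availability": 2.0, "timing_quality": 1.0, "noise_deviation": 1.0}
grade = station_grade(scores, weights)
```

    Setting a metric's weight to zero removes it from the grade, which mirrors the described ability of users to choose which metrics contribute.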

  11. Systems Engineering Metrics: Organizational Complexity and Product Quality Modeling

    NASA Technical Reports Server (NTRS)

    Mog, Robert A.

    1997-01-01

    Innovative organizational complexity and product quality models applicable to performance metrics for NASA-MSFC's Systems Analysis and Integration Laboratory (SAIL) missions and objectives are presented. An intensive research effort focuses on the synergistic combination of stochastic process modeling, nodal and spatial decomposition techniques, organizational and computational complexity, systems science and metrics, chaos, and proprietary statistical tools for accelerated risk assessment. This is followed by the development of a preliminary model, which is uniquely applicable and robust for quantitative purposes. Exercise of the preliminary model using a generic system hierarchy and the AXAF-I architectural hierarchy is provided. The Kendall test for positive dependence provides an initial verification and validation of the model. Finally, the research and development of the innovation is revisited, prior to peer review. This research and development effort results in near-term, measurable SAIL organizational and product quality methodologies, enhanced organizational risk assessment and evolutionary modeling results, and improved statistical quantification of SAIL productivity interests.

  12. Quality metrics for sensor images

    NASA Technical Reports Server (NTRS)

    Ahumada, AL

    1993-01-01

    Methods are needed for evaluating the quality of augmented visual displays (AVID). Computational quality metrics will help summarize, interpolate, and extrapolate the results of human performance tests with displays. The FLM Vision group at NASA Ames has been developing computational models of visual processing and using them to develop computational metrics for similar problems. For example, display modeling systems use metrics for comparing proposed displays, halftoning optimizing methods use metrics to evaluate the difference between the halftone and the original, and image compression methods minimize the predicted visibility of compression artifacts. The visual discrimination models take as input two arbitrary images A and B and compute an estimate of the probability that a human observer will report that A is different from B. If A is an image that one desires to display and B is the actual displayed image, such an estimate can be regarded as an image quality metric reflecting how well B approximates A. There are additional complexities associated with the problem of evaluating the quality of radar and IR enhanced displays for AVID tasks. One important problem is the question of whether intruding obstacles are detectable in such displays. Although the discrimination model can handle detection situations by making B the original image A plus the intrusion, this detection model makes the inappropriate assumption that the observer knows where the intrusion will be. Effects of signal uncertainty need to be added to our models. A pilot needs to make decisions rapidly. The models need to predict not just the probability of a correct decision, but the probability of a correct decision by the time the decision needs to be made. That is, the models need to predict latency as well as accuracy. Luce and Green have generated models for auditory detection latencies. Similar models are needed for visual detection. 
    Most image quality models are designed for static imagery. Watson has been developing a general spatial-temporal vision model to optimize video compression techniques. These models need to be adapted and calibrated for AVID applications.

  13. Interpreting lateral dynamic weight shifts using a simple inverted pendulum model.

    PubMed

    Kennedy, Michael W; Bretl, Timothy; Schmiedeler, James P

    2014-01-01

    Seventy-five young, healthy adults completed a lateral weight-shifting activity in which each shifted his/her center of pressure (CoP) to visually displayed target locations with the aid of visual CoP feedback. Each subject's CoP data were modeled using a single-link inverted pendulum system with a spring-damper at the joint. This extends the simple inverted pendulum model of static balance in the sagittal plane to lateral weight-shifting balance. The model controlled pendulum angle using PD control and a ramp setpoint trajectory, and weight-shifting was characterized by both shift speed and a non-minimum phase (NMP) behavior metric. This NMP behavior metric examines the force magnitude at shift initiation and provides weight-shifting balance performance information that parallels the examination of peak ground reaction forces in gait analysis. Control parameters were optimized on a subject-by-subject basis to match balance metrics for modeled results to metric values calculated from experimental data. Overall, the model matches experimental data well (average percent error of 0.35% for shifting speed and 0.05% for NMP behavior). These results suggest that the single-link inverted pendulum model can be used effectively to capture lateral weight-shifting balance, as it has been shown to model static balance. Copyright © 2014 Elsevier B.V. All rights reserved.

  14. Calabi-Yau metrics for quotients and complete intersections

    DOE PAGES

    Braun, Volker; Brelidze, Tamaz; Douglas, Michael R.; ...

    2008-05-22

    We extend previous computations of Calabi-Yau metrics on projective hypersurfaces to free quotients, complete intersections, and free quotients of complete intersections. In particular, we construct these metrics on generic quintics, four-generation quotients of the quintic, Schoen Calabi-Yau complete intersections and the quotient of a Schoen manifold with Z₃ x Z₃ fundamental group that was previously used to construct a heterotic standard model. Various numerical investigations into the dependence of Donaldson's algorithm on the integration scheme, as well as on the Kähler and complex structure moduli, are also performed.

  15. Thermodynamic metrics and optimal paths.

    PubMed

    Sivak, David A; Crooks, Gavin E

    2012-05-11

    A fundamental problem in modern thermodynamics is how a molecular-scale machine performs useful work, while operating away from thermal equilibrium without excessive dissipation. To this end, we derive a friction tensor that induces a Riemannian manifold on the space of thermodynamic states. Within the linear-response regime, this metric structure controls the dissipation of finite-time transformations, and bestows optimal protocols with many useful properties. We discuss the connection to the existing thermodynamic length formalism, and demonstrate the utility of this metric by solving for optimal control parameter protocols in a simple nonequilibrium model.

  16. ARM Data-Oriented Metrics and Diagnostics Package for Climate Model Evaluation Value-Added Product

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Chengzhu; Xie, Shaocheng

    A Python-based metrics and diagnostics package is currently being developed by the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) Infrastructure Team at Lawrence Livermore National Laboratory (LLNL) to facilitate the use of long-term, high-frequency measurements from the ARM Facility in evaluating the regional climate simulation of clouds, radiation, and precipitation. This metrics and diagnostics package computes climatological means of targeted climate model simulation and generates tables and plots for comparing the model simulation with ARM observational data. The Coupled Model Intercomparison Project (CMIP) model data sets are also included in the package to enable model intercomparison as demonstrated in Zhang et al. (2017). The mean of the CMIP model can serve as a reference for individual models. Basic performance metrics are computed to measure the accuracy of mean state and variability of climate models. The evaluated physical quantities include cloud fraction, temperature, relative humidity, cloud liquid water path, total column water vapor, precipitation, sensible and latent heat fluxes, and radiative fluxes, with plans to extend to more fields, such as aerosol and microphysics properties. Process-oriented diagnostics focusing on individual cloud- and precipitation-related phenomena are also being developed for the evaluation and development of specific model physical parameterizations. The version 1.0 package is designed based on data collected at ARM’s Southern Great Plains (SGP) Research Facility, with the plan to extend to other ARM sites. The metrics and diagnostics package is currently built upon standard Python libraries and additional Python packages developed by DOE (such as CDMS and CDAT). The ARM metrics and diagnostic package is available publicly with the hope that it can serve as an easy entry point for climate modelers to compare their models with ARM data.
    In this report, we first present the input data, which constitutes the core content of the metrics and diagnostics package, in section 2, and then a user's guide documenting the workflow/structure of the version 1.0 codes, including step-by-step instructions for running the package, in section 3.

  17. Metrics for assessing the performance of morphodynamic models of braided rivers at event and reach scales

    NASA Astrophysics Data System (ADS)

    Williams, Richard; Measures, Richard; Hicks, Murray; Brasington, James

    2017-04-01

    Advances in geomatics technologies have transformed the monitoring of reach-scale (10⁰-10¹ km) river morphodynamics. Hyperscale Digital Elevation Models (DEMs) can now be acquired at temporal intervals that are commensurate with the frequencies of high-flow events that force morphological change. The low vertical errors associated with such DEMs enable DEMs of Difference (DoDs) to be generated to quantify patterns of erosion and deposition, and derive sediment budgets using the morphological approach. In parallel with reach-scale observational advances, high-resolution, two-dimensional, physics-based numerical morphodynamic models are now computationally feasible for unsteady, reach-scale simulations. In light of this observational and predictive progress, there is a need to identify appropriate metrics that can be extracted from DEMs and DoDs to assess model performance. Nowhere is this more pertinent than in braided river environments, where numerous mobile channels that intertwine around mid-channel bars result in complex patterns of erosion and deposition, thus making model assessment particularly challenging. This paper identifies and evaluates a range of morphological and morphological-change metrics that can be used to assess predictions of braided river morphodynamics at the timescale of single storm events. A depth-averaged, mixed-grainsize Delft3D morphodynamic model was used to simulate morphological change during four discrete high-flow events, ranging from 91 to 403 m³ s⁻¹, along a 2.5 x 0.7 km reach of the braided, gravel-bed Rees River, New Zealand. Pre- and post-event topographic surveys, using a fusion of Terrestrial Laser Scanning and optical-empirical bathymetric mapping, were used to produce 0.5 m resolution DEMs and DoDs. The pre- and post-event DEMs for a moderate (227 m³ s⁻¹) high-flow event were used to calibrate the model. DEMs and DoDs from the other three high-flow events were used for model assessment using two approaches.
    First, "morphological" metrics were applied to compare observed and predicted post-event DEMs. These metrics include measures of confluence and bifurcation node density, bar shape, braiding intensity, and topographic comparisons using a form of the Brier Skill Score and cumulative frequency distributions of rugosity. Second, "morphological change" metrics were used to compare observed and predicted morphological change. These metrics included the extent of the morphologically active area, pairwise comparisons of morphological change (using kappa and fuzzy kappa statistics), and comparisons between vertical morphological change magnitude and elevation distribution. Results indicate that those metrics that assess characteristic features of braiding, rather than making direct comparisons, are most useful for assessing reach-scale braided river morphodynamic models. Together, the metrics indicate that there was a general affinity between observed and predicted braided river morphodynamics, both during small and large magnitude high-flow events. These results thus demonstrate how high-resolution, reach-scale, natural experiment datasets can be used to assess the efficacy of morphological models in predicting realistic patterns of erosion and deposition. This lays the foundation for the development and assessment of decadal scale morphodynamic models and their use in adaptive river basin management.
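
    As a rough illustration of the pairwise kappa comparison mentioned above, the sketch below computes Cohen's kappa between observed and predicted change classes on a cell-by-cell basis. The class labels ("E" erosion, "D" deposition, "N" no change), the toy sequences, and the function name are assumptions for illustration; the fuzzy kappa variant and the study's actual classification thresholds are not reproduced here.

```python
from collections import Counter


def cohen_kappa(observed, predicted):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(observed)
    # Fraction of cells where the two maps assign the same class.
    p_agree = sum(o == p for o, p in zip(observed, predicted)) / n
    # Agreement expected by chance from the marginal class frequencies.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_chance = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both maps contain a single identical class")
    return (p_agree - p_chance) / (1.0 - p_chance)


# Synthetic six-cell example (not data from the study).
observed_change = ["E", "E", "D", "D", "N", "N"]
predicted_change = ["E", "D", "D", "D", "N", "E"]
kappa = cohen_kappa(observed_change, predicted_change)
```

    A kappa of 1 indicates perfect cell-by-cell agreement, and 0 indicates agreement no better than chance. This strictness about exact cell locations is one reason a direct comparison of this kind can look poor even when the overall braiding character is reproduced.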

  18. A comparison of evaluation metrics for biomedical journals, articles, and websites in terms of sensitivity to topic.

    PubMed

    Fu, Lawrence D; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F

    2011-08-01

    Evaluating the biomedical literature and health-related websites for quality are challenging information retrieval tasks. Current commonly used methods include impact factor for journals, PubMed's clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic-adjusted metrics and other topic robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations. Copyright © 2011 Elsevier Inc. All rights reserved.

  19. Resilience Metrics for the Electric Power System: A Performance-Based Approach.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vugrin, Eric D.; Castillo, Andrea R; Silva-Monroy, Cesar Augusto

    Grid resilience is a concept related to a power system's ability to continue operating and delivering power even in the event that low probability, high-consequence disruptions such as hurricanes, earthquakes, and cyber-attacks occur. Grid resilience objectives focus on managing and, ideally, minimizing potential consequences that occur as a result of these disruptions. Currently, no formal grid resilience definitions, metrics, or analysis methods have been universally accepted. This document describes an effort to develop and describe grid resilience metrics and analysis methods. The metrics and methods described herein extend upon the Resilience Analysis Process (RAP) developed by Watson et al. for the 2015 Quadrennial Energy Review. The extension allows for both outputs from system models and for historical data to serve as the basis for creating grid resilience metrics and informing grid resilience planning and response decision-making. This document describes the grid resilience metrics and analysis methods. Demonstration of the metrics and methods is shown through a set of illustrative use cases.

  20. A causal examination of the effects of confounding factors on multimetric indices

    USGS Publications Warehouse

    Schoolmaster, Donald R.; Grace, James B.; Schweiger, E. William; Mitchell, Brian R.; Guntenspergen, Glenn R.

    2013-01-01

    The development of multimetric indices (MMIs) as a means of providing integrative measures of ecosystem condition is becoming widespread. An increasingly recognized problem for the interpretability of MMIs is controlling for the potentially confounding influences of environmental covariates. Most common approaches to handling covariates are based on simple notions of statistical control, leaving the causal implications of covariates and their adjustment unstated. In this paper, we use graphical models to examine some of the potential impacts of environmental covariates on the observed signals between human disturbance and potential response metrics. Using simulations based on various causal networks, we show how environmental covariates can both obscure and exaggerate the effects of human disturbance on individual metrics. We then examine from a causal interpretation standpoint the common practice of adjusting ecological metrics for environmental influences using only the set of sites deemed to be in reference condition. We present and examine the performance of an alternative approach to metric adjustment that uses the whole set of sites and models both environmental and human disturbance effects simultaneously. The findings from our analyses indicate that failing to model and adjust metrics can result in a systematic bias towards those metrics in which environmental covariates function to artificially strengthen the metric–disturbance relationship resulting in MMIs that do not accurately measure impacts of human disturbance. We also find that a “whole-set modeling approach” requires fewer assumptions and is more efficient with the given information than the more commonly applied “reference-set” approach.

  1. Identification of the ideal clutter metric to predict time dependence of human visual search

    NASA Astrophysics Data System (ADS)

    Cartier, Joan F.; Hsu, David H.

    1995-05-01

    The Army Night Vision and Electronic Sensors Directorate (NVESD) has recently performed a human perception experiment in which eye tracker measurements were made on trained military observers searching for targets in infrared images. This data offered an important opportunity to evaluate a new technique for search modeling. Following the approach taken by Jeff Nicoll, this model treats search as a random walk in which the observers are in one of two states until they quit: they are either examining a point of interest, or they are wandering around looking for a point of interest. When wandering they skip rapidly from point to point. When examining they move more slowly, reflecting the fact that target discrimination requires additional thought processes. In this paper we simulate the random walk, using a clutter metric to assign relative attractiveness to points of interest within the image which are competing for the observer's attention. The NVESD data indicates that a number of standard clutter metrics are good estimators of the apportionment of observer's time between wandering and examining. Conversely, the apportionment of observer time spent wandering and examining could be used to reverse engineer the ideal clutter metric which would most perfectly describe the behavior of the group of observers. It may be possible to use this technique to design the optimal clutter metric to predict performance of visual search.

  2. Prediction of user preference over shared-control paradigms for a robotic wheelchair.

    PubMed

    Erdogan, Ahmetcan; Argall, Brenna D

    2017-07-01

    The design of intelligent powered wheelchairs has traditionally focused heavily on providing effective and efficient navigation assistance. Significantly less attention has been given to the end-user's preference between different assistance paradigms. It is possible to include these subjective evaluations in the design process, for example by soliciting feedback in post-experiment questionnaires. However, constantly querying the user for feedback during real-world operation is not practical. In this paper, we present a model that correlates objective performance metrics and subjective evaluations of autonomous wheelchair control paradigms. Using off-the-shelf machine learning techniques, we show that it is possible to build a model that can predict the most preferred shared-control method from task execution metrics such as effort, safety, performance and utilization. We further characterize the relative contributions of each of these metrics to the individual choice of most preferred assistance paradigm. Our evaluation includes Spinal Cord Injured (SCI) and uninjured subject groups. The results show that our proposed correlation model enables the continuous tracking of user preference and offers the possibility of autonomy that is customized to each user.

  3. The Massachusetts Community College Performance-Based Funding Formula: A New Model for New England?

    ERIC Educational Resources Information Center

    Salomon-Fernandez, Yves

    2014-01-01

    The Massachusetts community college system is entering a second year with funding for each of its 15 schools determined using a new performance-based formula. Under the new model, 50% of each college's allocation is based on performance on metrics related to enrollment and student success, with added incentives for "at-risk" students…

  4. Koeppen Bioclimatic Metrics for Evaluating CMIP5 Simulations of Historical Climate

    NASA Astrophysics Data System (ADS)

    Phillips, T. J.; Bonfils, C.

    2012-12-01

    The classic Koeppen bioclimatic classification scheme associates generic vegetation types (e.g. grassland, tundra, broadleaf or evergreen forests, etc.) with regional climate zones defined by the observed amplitude and phase of the annual cycles of continental temperature (T) and precipitation (P). Koeppen classification thus can provide concise, multivariate metrics for evaluating climate model performance in simulating the regional magnitudes and seasonalities of climate variables that are of critical importance for living organisms. In this study, 14 Koeppen vegetation types are derived from annual-cycle climatologies of T and P in some 3 dozen CMIP5 simulations of 1980-1999 climate, a period when observational data provides a reliable global validation standard. Metrics for evaluating the ability of the CMIP5 models to simulate the correct locations and areas of the vegetation types, as well as measures of overall model performance, also are developed. It is found that the CMIP5 models are most deficient in simulating 1) the climates of the drier zones (e.g. desert, savanna, grassland, steppe vegetation types) that are located in the Southwestern U.S. and Mexico, Eastern Europe, Southern Africa, and Central Australia, as well as 2) the climate of regions such as Central Asia and Western South America where topography plays a central role. (Detailed analysis of regional biases in the annual cycles of T and P of selected simulations exemplifying general model performance problems also are to be presented.) The more encouraging results include evidence for a general improvement in CMIP5 performance relative to that of older CMIP3 models. Within CMIP5 also, the more complex Earth Systems Models (ESMs) with prognostic biogeochemistry perform comparably to the corresponding global models that simulate only the "physical" climate. Acknowledgments This work was funded by the U.S. Department of Energy Office of Science and was performed at the Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  5. Quantification of Dynamic Model Validation Metrics Using Uncertainty Propagation from Requirements

    NASA Technical Reports Server (NTRS)

    Brown, Andrew M.; Peck, Jeffrey A.; Stewart, Eric C.

    2018-01-01

    The Space Launch System, NASA's new large launch vehicle for long range space exploration, is presently in the final design and construction phases, with the first launch scheduled for 2019. A dynamic model of the system has been created and is critical for calculation of interface loads and natural frequencies and mode shapes for guidance, navigation, and control (GNC). Because of the program and schedule constraints, a single modal test of the SLS will be performed while bolted down to the Mobile Launch Pad just before the first launch. A Monte Carlo and optimization scheme will be performed to create thousands of possible models based on given dispersions in model properties and to determine which model best fits the natural frequencies and mode shapes from modal test. However, the question still remains as to whether this model is acceptable for the loads and GNC requirements. An uncertainty propagation and quantification (UP and UQ) technique to develop a quantitative set of validation metrics that is based on the flight requirements has therefore been developed and is discussed in this paper. There has been considerable research on UQ and UP and validation in the literature, but very little on propagating the uncertainties from requirements, so most validation metrics are "rules-of-thumb;" this research seeks to come up with more reason-based metrics. One of the main assumptions used to achieve this task is that the uncertainty in the modeling of the fixed boundary condition is accurate, so therefore that same uncertainty can be used in propagating the fixed-test configuration to the free-free actual configuration. The second main technique applied here is the usage of the limit-state formulation to quantify the final probabilistic parameters and to compare them with the requirements. These techniques are explored with a simple lumped spring-mass system and a simplified SLS model. 
    When completed, it is anticipated that this requirements-based validation metric will provide a quantified confidence and probability of success for the final SLS dynamics model, which will be critical for a successful launch program, and can be applied in the many other industries where an accurate dynamic model is required.

  6. A computational imaging target specific detectivity metric

    NASA Astrophysics Data System (ADS)

    Preece, Bradley L.; Nehmetallah, George

    2017-05-01

    Due to the large quantity of low-cost, high-speed computational processing available today, computational imaging (CI) systems are expected to have a major role for next generation multifunctional cameras. The purpose of this work is to quantify the performance of these CI systems in a standardized manner. Due to the diversity of CI system designs that are available today or proposed in the near future, there are significant challenges in modeling and calculating a standardized detection signal-to-noise ratio (SNR) to measure the performance of these systems. In this paper, we developed a path forward for a standardized detectivity metric for CI systems. The detectivity metric is designed to evaluate the performance of a CI system searching for a specific known target or signal of interest, and is defined as the optimal linear matched filter SNR, similar to the Hotelling SNR, calculated in computational space with special considerations for standardization. Therefore, the detectivity metric is designed to be flexible, in order to handle various types of CI systems and specific targets, while keeping the complexity and assumptions of the systems to a minimum.

  7. Modeling temporal sequences of cognitive state changes based on a combination of EEG-engagement, EEG-workload, and heart rate metrics

    PubMed Central

    Stikic, Maja; Berka, Chris; Levendowski, Daniel J.; Rubio, Roberto F.; Tan, Veasna; Korszen, Stephanie; Barba, Douglas; Wurzer, David

    2014-01-01

    The objective of this study was to investigate the feasibility of physiological metrics such as ECG-derived heart rate and EEG-derived cognitive workload and engagement as potential predictors of performance on different training tasks. An unsupervised approach based on self-organizing neural network (NN) was utilized to model cognitive state changes over time. The feature vector comprised EEG-engagement, EEG-workload, and heart rate metrics, all self-normalized to account for individual differences. During the competitive training process, a linear topology was developed where the feature vectors similar to each other activated the same NN nodes. The NN model was trained and auto-validated on combat marksmanship training data from 51 participants that were required to make “deadly force decisions” in challenging combat scenarios. The trained NN model was cross validated using 10-fold cross-validation. It was also validated on a golf study in which additional 22 participants were asked to complete 10 sessions of 10 putts each. Temporal sequences of the activated nodes for both studies followed the same pattern of changes, demonstrating the generalization capabilities of the approach. Most node transition changes were local, but important events typically caused significant changes in the physiological metrics, as evidenced by larger state changes. This was investigated by calculating a transition score as the sum of subsequent state transitions between the activated NN nodes. Correlation analysis demonstrated statistically significant correlations between the transition scores and subjects' performances in both studies. This paper explored the hypothesis that temporal sequences of physiological changes comprise the discriminative patterns for performance prediction. These physiological markers could be utilized in future training improvement systems (e.g., through neurofeedback), and applied across a variety of training environments. PMID:25414629
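
    The transition score described above can be sketched as follows, under the assumption that nodes in the linear topology are numbered consecutively and that the size of a transition is the absolute difference between successive node indices. The abstract does not give the exact definition, so both the distance measure and the function name are hypothetical.

```python
def transition_score(node_sequence):
    """Sum of the sizes of successive transitions between activated nodes.

    Assumes nodes of a linear (1-D) map are indexed 0, 1, 2, ... so that the
    size of a transition is the absolute index difference.
    """
    return sum(abs(b - a) for a, b in zip(node_sequence, node_sequence[1:]))


# Synthetic sequence of activated nodes over time (not study data).
# Local moves (2 -> 3) add little; the jump 3 -> 7 dominates the score.
activated_nodes = [2, 2, 3, 3, 7, 6]
score = transition_score(activated_nodes)
```

    Under this assumed definition, a sequence that stays near one node scores low, while a sequence with a large jump scores high, consistent with the observation that important events produce larger state changes.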

  8. Voxel-based statistical analysis of uncertainties associated with deformable image registration

    NASA Astrophysics Data System (ADS)

    Li, Shunshan; Glide-Hurst, Carri; Lu, Mei; Kim, Jinkoo; Wen, Ning; Adams, Jeffrey N.; Gordon, James; Chetty, Indrin J.; Zhong, Hualiang

    2013-09-01

    Deformable image registration (DIR) algorithms have inherent uncertainties in their displacement vector fields (DVFs). The purpose of this study is to develop an optimal metric to estimate DIR uncertainties. Six computational phantoms have been developed from the CT images of lung cancer patients using a finite element method (FEM). The FEM generated DVFs were used as a standard for registrations performed on each of these phantoms. A mechanics-based metric, unbalanced energy (UE), was developed to evaluate these registration DVFs. The potential correlation between UE and DIR errors was explored using multivariate analysis, and the results were validated by landmark approach and compared with two other error metrics: DVF inverse consistency (IC) and image intensity difference (ID). Landmark-based validation was performed using the POPI-model. The results show that the Pearson correlation coefficient between UE and DIR error is rUE-error = 0.50. This is higher than rIC-error = 0.29 for IC and DIR error and rID-error = 0.37 for ID and DIR error. The Pearson correlation coefficient between UE and the product of the DIR displacements and errors is rUE-error × DVF = 0.62 for the six patients and rUE-error × DVF = 0.73 for the POPI-model data. It has been demonstrated that UE has a strong correlation with DIR errors, and the UE metric outperforms the IC and ID metrics in estimating DIR uncertainties. The quantified UE metric can be a useful tool for adaptive treatment strategies, including probability-based adaptive treatment planning.
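
    The comparison above rests on the Pearson correlation coefficient between a candidate metric (UE, IC, or ID) and the registration error at each voxel. A minimal sketch of that calculation is given below; the input values are synthetic and the function name is an assumption, so this shows only the statistic, not the study's voxel-wise pipeline.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Synthetic per-voxel values: a metric that tracks the error perfectly.
metric_values = [1.0, 2.0, 3.0, 4.0]
error_values = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(metric_values, error_values)
```

    Repeating the calculation for each candidate metric against the same error values would give the kind of side-by-side coefficients reported in the abstract.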

  9. Optimal SVM parameter selection for non-separable and unbalanced datasets.

    PubMed

    Jiang, Peng; Missoum, Samy; Chen, Zhao

    2014-10-01

    This article presents a study of three validation metrics used for the selection of optimal parameters of a support vector machine (SVM) classifier in the case of non-separable and unbalanced datasets. This situation is often encountered when the data is obtained experimentally or clinically. The three metrics selected in this work are the area under the ROC curve (AUC), accuracy, and balanced accuracy. These validation metrics are tested using computational data only, which enables the creation of fully separable sets of data. This way, non-separable datasets, representative of a real-world problem, can be created by projection onto a lower dimensional sub-space. The knowledge of the separable dataset, unknown in real-world problems, provides a reference to compare the three validation metrics using a quantity referred to as the "weighted likelihood". As an application example, the study investigates a classification model for hip fracture prediction. The data is obtained from a parameterized finite element model of a femur. The performance of the various validation metrics is studied for several levels of separability, ratios of unbalance, and training set sizes.
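    The difference between the three validation metrics on unbalanced data can be illustrated with a short sketch. It uses the standard textbook definitions for binary labels 0/1 (balanced accuracy as the mean of per-class recall, AUC as the probability that a random positive is scored above a random negative); the labels and scores are invented for illustration and the study's "weighted likelihood" reference quantity is not reproduced here.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for binary labels 0 and 1."""
    recalls = []
    for cls in (0, 1):
        members = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in members) / len(members))
    return sum(recalls) / 2


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical unbalanced set: nine negatives, one positive.
y_true = [0] * 9 + [1]
y_pred = [0] * 10  # a classifier that always predicts the majority class
scores = [0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.4]

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
print(auc(y_true, scores))
```

    The always-negative classifier scores well on plain accuracy but only 0.5 on balanced accuracy, which is why the choice of validation metric matters for unbalanced datasets.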

  10. Enhancing the Simplified Surface Energy Balance (SSEB) Approach for Estimating Landscape ET: Validation with the METRIC model

    USGS Publications Warehouse

    Senay, Gabriel B.; Budde, Michael E.; Verdin, James P.

    2011-01-01

    Evapotranspiration (ET) can be derived from satellite data using surface energy balance principles. METRIC (Mapping EvapoTranspiration at high Resolution with Internalized Calibration) is one of the most widely used models available in the literature to estimate ET from satellite imagery. The Simplified Surface Energy Balance (SSEB) model is much easier and less expensive to implement. The main purpose of this research was to present an enhanced version of the Simplified Surface Energy Balance (SSEB) model and to evaluate its performance using the established METRIC model. In this study, SSEB and METRIC ET fractions were compared using 7 Landsat images acquired for south central Idaho during the 2003 growing season. The enhanced SSEB model compared well with the METRIC model output exhibiting an r2 improvement from 0.83 to 0.90 in less complex topography (elevation less than 2000 m) and with an improvement of r2 from 0.27 to 0.38 in more complex (mountain) areas with elevation greater than 2000 m. Independent evaluation showed that both models exhibited higher variation in complex topographic regions, although more with SSEB than with METRIC. The higher ET fraction variation in the complex mountainous regions highlighted the difficulty of capturing the radiation and heat transfer physics on steep slopes having variable aspect with the simple index model, and the need to conduct more research. However, the temporal consistency of the results suggests that the SSEB model can be used on a wide range of elevation (more successfully up to 2000 m) to detect anomalies in space and time for water resources management and monitoring such as for drought early warning systems in data scarce regions. SSEB has a potential for operational agro-hydrologic applications to estimate ET with inputs of surface temperature, NDVI, DEM and reference ET.

  11. Enhancing the Simplified Surface Energy Balance (SSEB) approach for estimating landscape ET: Validation with the METRIC model

    USGS Publications Warehouse

    Senay, G.B.; Budde, M.E.; Verdin, J.P.

    2011-01-01

    Evapotranspiration (ET) can be derived from satellite data using surface energy balance principles. METRIC (Mapping EvapoTranspiration at high Resolution with Internalized Calibration) is one of the most widely used models available in the literature to estimate ET from satellite imagery. The Simplified Surface Energy Balance (SSEB) model is much easier and less expensive to implement. The main purpose of this research was to present an enhanced version of the Simplified Surface Energy Balance (SSEB) model and to evaluate its performance using the established METRIC model. In this study, SSEB and METRIC ET fractions were compared using 7 Landsat images acquired for south central Idaho during the 2003 growing season. The enhanced SSEB model compared well with the METRIC model output exhibiting an r2 improvement from 0.83 to 0.90 in less complex topography (elevation less than 2000m) and with an improvement of r2 from 0.27 to 0.38 in more complex (mountain) areas with elevation greater than 2000m. Independent evaluation showed that both models exhibited higher variation in complex topographic regions, although more with SSEB than with METRIC. The higher ET fraction variation in the complex mountainous regions highlighted the difficulty of capturing the radiation and heat transfer physics on steep slopes having variable aspect with the simple index model, and the need to conduct more research. However, the temporal consistency of the results suggests that the SSEB model can be used on a wide range of elevation (more successfully up to 2000m) to detect anomalies in space and time for water resources management and monitoring such as for drought early warning systems in data scarce regions. SSEB has a potential for operational agro-hydrologic applications to estimate ET with inputs of surface temperature, NDVI, DEM and reference ET. © 2010.

  12. SU-F-J-94: Development of a Plug-in Based Image Analysis Tool for Integration Into Treatment Planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Owen, D; Anderson, C; Mayo, C

    Purpose: To extend the functionality of a commercial treatment planning system (TPS) to support (i) direct use of quantitative image-based metrics within treatment plan optimization and (ii) evaluation of dose-functional volume relationships to assist in functional image adaptive radiotherapy. Methods: A script was written that interfaces with a commercial TPS via an Application Programming Interface (API). The script executes a program that performs dose-functional volume analyses. Written in C#, the script reads the dose grid and correlates it with image data on a voxel-by-voxel basis through API extensions that can access registration transforms. A user interface was designed through WinForms to input parameters and display results. To test the performance of this program, image- and dose-based metrics computed from perfusion SPECT images aligned to the treatment planning CT were generated, validated, and compared. Results: The integration of image analysis information was successfully implemented as a plug-in to a commercial TPS. Perfusion SPECT images were used to validate the calculation and display of image-based metrics as well as dose-intensity metrics and histograms for defined structures on the treatment planning CT. Various biological dose correction models, custom image-based metrics, dose-intensity computations, and dose-intensity histograms were applied to analyze the image-dose profile. Conclusion: It is possible to add image analysis features to commercial TPSs through custom scripting applications. A tool was developed to enable the evaluation of image-intensity-based metrics in the context of functional targeting and avoidance. In addition to providing dose-intensity metrics and histograms that can be easily extracted from a plan database and correlated with outcomes, the system can also be extended to a plug-in optimization system, which can directly use the computed metrics for optimization of post-treatment tumor or normal tissue response models. Supported by NIH - P01 - CA059827.

  13. Comparison of Highly Resolved Model-Based Exposure ...

    EPA Pesticide Factsheets

    Human exposure to air pollution in many studies is represented by ambient concentrations from space-time kriging of observed values. Space-time kriging techniques based on a limited number of ambient monitors may fail to capture the concentration from local sources. Further, because people spend more time indoors, using ambient concentration to represent exposure may cause error. To quantify the associated exposure error, we computed a series of six different hourly-based exposure metrics at 16,095 Census blocks of three Counties in North Carolina for CO, NOx, PM2.5, and elemental carbon (EC) during 2012. These metrics include ambient background concentration from space-time ordinary kriging (STOK), ambient on-road concentration from the Research LINE source dispersion model (R-LINE), a hybrid concentration combining STOK and R-LINE, and their associated indoor concentrations from an indoor infiltration mass balance model. Using a hybrid-based indoor concentration as the standard, the comparison showed that outdoor STOK metrics yielded large error at both population (67% to 93%) and individual level (average bias between −10% to 95%). For pollutants with significant contribution from on-road emission (EC and NOx), the on-road based indoor metric performs the best at the population level (error less than 52%). At the individual level, however, the STOK-based indoor concentration performs the best (average bias below 30%). For PM2.5, due to the relatively low co

  14. Target Scattering Metrics: Model-Model and Model-Data Comparisons

    DTIC Science & Technology

    2017-12-13

    measured synthetic aperture sonar (SAS) data or from numerical models is investigated. Metrics are needed for quantitative comparisons for signals...candidate metrics for model-model comparisons are examined here with a goal to consider raw data prior to its reduction to data products, which may...be suitable for input to classification schemes. The investigated metrics are then applied to model-data comparisons. INTRODUCTION Metrics for

  15. Target Scattering Metrics: Model-Model and Model Data comparisons

    DTIC Science & Technology

    2017-12-13

    measured synthetic aperture sonar (SAS) data or from numerical models is investigated. Metrics are needed for quantitative comparisons for signals...candidate metrics for model-model comparisons are examined here with a goal to consider raw data prior to its reduction to data products, which may...be suitable for input to classification schemes. The investigated metrics are then applied to model-data comparisons. INTRODUCTION Metrics for

  16. [Predictive model based multimetric index of macroinvertebrates for river health assessment].

    PubMed

    Chen, Kai; Yu, Hai Yan; Zhang, Ji Wei; Wang, Bei Xin; Chen, Qiu Wen

    2017-06-18

    Improving the stability of the index of biotic integrity (IBI; i.e., multi-metric indices, MMI) across temporal and spatial scales is one of the most important issues in water ecosystem integrity bioassessment and water environment management. Using datasets of field-based macroinvertebrate and physicochemical variables and GIS-based natural predictors (e.g., geomorphology and climate) and land use variables collected at 227 river sites from 2004 to 2011 across the Zhejiang Province, China, we used random forests (RF) to adjust the effects of natural variations at temporal and spatial scales on macroinvertebrate metrics. We then developed natural variations adjusted (predictive) and unadjusted (null) MMIs and compared performance between them. The core metrics selected for predictive and null MMIs were different from each other, and natural variations within core metrics in predictive MMI explained by RF models ranged between 11.4% and 61.2%. The predictive MMI was more precise and accurate, but less responsive and sensitive than null MMI. The multivariate nearest-neighbor test determined that 9 test sites and 1 most degraded site were flagged outside of the environmental space of the reference site network. We found that the combination of the predictive MMI developed by using a predictive model and the nearest-neighbor test performed best and decreased risks of inferring type I (designating a water body as being in poor biological condition, when it was actually in good condition) and type II (designating a water body as being in good biological condition, when it was actually in poor condition) errors. Our results provided an effective method to improve the stability and performance of the index of biotic integrity.

  17. Bayesian model evidence as a model evaluation metric

    NASA Astrophysics Data System (ADS)

    Guthke, Anneli; Höge, Marvin; Nowak, Wolfgang

    2017-04-01

    When building environmental systems models, we are typically confronted with the questions of how to choose an appropriate model (i.e., which processes to include or neglect) and how to measure its quality. Various metrics have been proposed that shall guide the modeller towards a most robust and realistic representation of the system under study. Criteria for evaluation often address aspects of accuracy (absence of bias) or of precision (absence of unnecessary variance) and need to be combined in a meaningful way in order to address the inherent bias-variance dilemma. We suggest using Bayesian model evidence (BME) as a model evaluation metric that implicitly performs a tradeoff between bias and variance. BME is typically associated with model weights in the context of Bayesian model averaging (BMA). However, it can also be seen as a model evaluation metric in a single-model context or in model comparison. It combines a measure for goodness of fit with a penalty for unjustifiable complexity. Unjustifiable refers to the fact that the appropriate level of model complexity is limited by the amount of information available for calibration. Derived in a Bayesian context, BME naturally accounts for measurement errors in the calibration data as well as for input and parameter uncertainty. BME is therefore perfectly suitable to assess model quality under uncertainty. We will explain in detail and with schematic illustrations what BME measures, i.e. how complexity is defined in the Bayesian setting and how this complexity is balanced with goodness of fit. We will further discuss how BME compares to other model evaluation metrics that address accuracy and precision such as the predictive logscore or other model selection criteria such as the AIC, BIC or KIC. Although computationally more expensive than other metrics or criteria, BME represents an appealing alternative because it provides a global measure of model quality. 
Even if not applicable to each and every case, we aim at stimulating discussion about how to judge the quality of hydrological models in the presence of uncertainty in general by dissecting the mechanism behind BME.

  18. Performance of a normalized energy metric without jammer state information for an FH/MFSK system in worst case partial band jamming

    NASA Technical Reports Server (NTRS)

    Lee, P. J.

    1985-01-01

    For a frequency-hopped noncoherent MFSK communication system without jammer state information (JSI) in a worst case partial band jamming environment, it is well known that the use of a conventional unquantized metric results in very poor performance. In this paper, a 'normalized' unquantized energy metric is suggested for such a system. It is shown that with this metric, one can save 2-3 dB in required signal energy over the system with hard decision metric without JSI for the same desired performance. When this very robust metric is compared to the conventional unquantized energy metric with JSI, the loss in required signal energy is shown to be small. Thus, the use of this normalized metric provides performance comparable to systems for which JSI is known. Cutoff rate and bit error rate with dual-k coding are used for the performance measures.

  19. Similarity Metrics for Closed Loop Dynamic Systems

    NASA Technical Reports Server (NTRS)

    Whorton, Mark S.; Yang, Lee C.; Bedrossian, Naz; Hall, Robert A.

    2008-01-01

    To what extent and in what ways can two closed-loop dynamic systems be said to be "similar?" This question arises in a wide range of dynamic systems modeling and control system design applications. For example, bounds on error models are fundamental to the controller optimization with modern control design methods. Metrics such as the structured singular value are direct measures of the degree to which properties such as stability or performance are maintained in the presence of specified uncertainties or variations in the plant model. Similarly, controls-related areas such as system identification, model reduction, and experimental model validation employ measures of similarity between multiple realizations of a dynamic system. Each area has its tools and approaches, with each tool more or less suited for one application or the other. Similarity in the context of closed-loop model validation via flight test is subtly different from error measures in the typical controls oriented application. Whereas similarity in a robust control context relates to plant variation and the attendant effect on stability and performance, in this context similarity metrics are sought that assess the relevance of a dynamic system test for the purpose of validating the stability and performance of a "similar" dynamic system. Similarity in the context of system identification is much more relevant than are robust control analogies in that errors between one dynamic system (the test article) and another (the nominal "design" model) are sought for the purpose of bounding the validity of a model for control design and analysis. Yet system identification typically involves open-loop plant models which are independent of the control system (with the exception of limited developments in closed-loop system identification which is nonetheless focused on obtaining open-loop plant models from closed-loop data). 
Moreover the objectives of system identification are not the same as a flight test and hence system identification error metrics are not directly relevant. In applications such as launch vehicles where the open loop plant is unstable it is similarity of the closed-loop system dynamics of a flight test that are relevant.

  20. Implementation of a channelized Hotelling observer model to assess image quality of x-ray angiography systems.

    PubMed

    Favazza, Christopher P; Fetterly, Kenneth A; Hangiandreou, Nicholas J; Leng, Shuai; Schueler, Beth A

    2015-01-01

    Evaluation of flat-panel angiography equipment through conventional image quality metrics is limited by the scope of standard spatial-domain image quality metric(s), such as contrast-to-noise ratio and spatial resolution, or by restricted access to appropriate data to calculate Fourier domain measurements, such as modulation transfer function, noise power spectrum, and detective quantum efficiency. Observer models have been shown capable of overcoming these limitations and are able to comprehensively evaluate medical-imaging systems. We present a spatial domain-based channelized Hotelling observer model to calculate the detectability index (DI) of our different sized disks and compare the performance of different imaging conditions and angiography systems. When appropriate, changes in DIs were compared to expectations based on the classical Rose model of signal detection to assess linearity of the model with quantum signal-to-noise ratio (SNR) theory. For these experiments, the estimated uncertainty of the DIs was less than 3%, allowing for precise comparison of imaging systems or conditions. For most experimental variables, DI changes were linear with expectations based on quantum SNR theory. DIs calculated for the smallest objects demonstrated nonlinearity with quantum SNR theory due to system blur. Two angiography systems with different detector element sizes were shown to perform similarly across the majority of the detection tasks.

  1. Analysis of latency performance of bluetooth low energy (BLE) networks.

    PubMed

    Cho, Keuchul; Park, Woojin; Hong, Moonki; Park, Gisu; Cho, Wooseong; Seo, Jihoon; Han, Kijun

    2014-12-23

    Bluetooth Low Energy (BLE) is a short-range wireless communication technology aiming at low-cost and low-power communication. The performance evaluation of classical Bluetooth device discovery has been intensively studied using analytical modeling and simulation methods, but these techniques are not applicable to BLE, since BLE has a fundamental change in the design of the discovery mechanism, including the usage of three advertising channels. Recently, several works have analyzed the topic of BLE device discovery, but these studies are still far from thorough. It is thus necessary to develop a new, accurate model for the BLE discovery process. In particular, the wide range of parameter settings introduces considerable potential for BLE devices to customize their discovery performance. This motivates our study of modeling the BLE discovery process and performing intensive simulation. This paper is focused on building an analytical model to investigate the discovery probability, as well as the expected discovery latency, which are then validated via extensive experiments. Our analysis considers both continuous and discontinuous scanning modes. We analyze the sensitivity of these performance metrics to parameter settings to quantitatively examine to what extent parameters influence the performance metric of the discovery processes.

  2. Analysis of Latency Performance of Bluetooth Low Energy (BLE) Networks

    PubMed Central

    Cho, Keuchul; Park, Woojin; Hong, Moonki; Park, Gisu; Cho, Wooseong; Seo, Jihoon; Han, Kijun

    2015-01-01

    Bluetooth Low Energy (BLE) is a short-range wireless communication technology aiming at low-cost and low-power communication. The performance evaluation of classical Bluetooth device discovery has been intensively studied using analytical modeling and simulation methods, but these techniques are not applicable to BLE, since BLE has a fundamental change in the design of the discovery mechanism, including the usage of three advertising channels. Recently, several works have analyzed the topic of BLE device discovery, but these studies are still far from thorough. It is thus necessary to develop a new, accurate model for the BLE discovery process. In particular, the wide range of parameter settings introduces considerable potential for BLE devices to customize their discovery performance. This motivates our study of modeling the BLE discovery process and performing intensive simulation. This paper is focused on building an analytical model to investigate the discovery probability, as well as the expected discovery latency, which are then validated via extensive experiments. Our analysis considers both continuous and discontinuous scanning modes. We analyze the sensitivity of these performance metrics to parameter settings to quantitatively examine to what extent parameters influence the performance metric of the discovery processes. PMID:25545266

  3. Predicting the Overall Spatial Quality of Automotive Audio Systems

    NASA Astrophysics Data System (ADS)

    Koya, Daisuke

    The spatial quality of automotive audio systems is often compromised due to their unideal listening environments. Automotive audio systems need to be developed quickly due to industry demands. A suitable perceptual model could evaluate the spatial quality of automotive audio systems with similar reliability to formal listening tests but take less time. Such a model is developed in this research project by adapting an existing model of spatial quality for automotive audio use. The requirements for the adaptation were investigated in a literature review. A perceptual model called QESTRAL was reviewed, which predicts the overall spatial quality of domestic multichannel audio systems. It was determined that automotive audio systems are likely to be impaired in terms of the spatial attributes that were not considered in developing the QESTRAL model, but metrics are available that might predict these attributes. To establish whether the QESTRAL model in its current form can accurately predict the overall spatial quality of automotive audio systems, MUSHRA listening tests using headphone auralisation with head tracking were conducted to collect results to be compared against predictions by the model. Based on guideline criteria, the model in its current form could not accurately predict the overall spatial quality of automotive audio systems. To improve prediction performance, the QESTRAL model was recalibrated and modified using existing metrics of the model, those that were proposed from the literature review, and newly developed metrics. The most important metrics for predicting the overall spatial quality of automotive audio systems included those that were interaural cross-correlation (IACC) based, relate to localisation of the frontal audio scene, and account for the perceived scene width in front of the listener. Modifying the model for automotive audio systems did not invalidate its use for domestic audio systems. 
The resulting model predicts the overall spatial quality of 2- and 5-channel automotive audio systems with a cross-validation performance of R² = 0.85 and root-mean-square error (RMSE) = 11.03%.

  4. Distributed Space Mission Design for Earth Observation Using Model-Based Performance Evaluation

    NASA Technical Reports Server (NTRS)

    Nag, Sreeja; LeMoigne-Stewart, Jacqueline; Cervantes, Ben; DeWeck, Oliver

    2015-01-01

    Distributed Space Missions (DSMs) are gaining momentum in their application to earth observation missions owing to their unique ability to increase observation sampling in multiple dimensions. DSM design is a complex problem with many design variables, multiple objectives determining performance and cost and emergent, often unexpected, behaviors. There are very few open-access tools available to explore the tradespace of variables, minimize cost and maximize performance for pre-defined science goals, and therefore select the most optimal design. This paper presents a software tool that can generate multiple DSM architectures based on pre-defined design variable ranges and size those architectures in terms of predefined science and cost metrics. The tool will help a user select Pareto optimal DSM designs based on design of experiments techniques. The tool will be applied to some earth observation examples to demonstrate its applicability in making some key decisions between different performance metrics and cost metrics early in the design lifecycle.

  5. Application of process mining to assess the data quality of routinely collected time-based performance data sourced from electronic health records by validating process conformance.

    PubMed

    Perimal-Lewis, Lua; Teubner, David; Hakendorf, Paul; Horwood, Chris

    2016-12-01

    Effective and accurate use of routinely collected health data to produce Key Performance Indicator reporting is dependent on the underlying data quality. In this research, Process Mining methodology and tools were leveraged to assess the data quality of time-based Emergency Department data sourced from electronic health records. This research was done working closely with the domain experts to validate the process models. The hospital patient journey model was used to assess flow abnormalities which resulted from incorrect timestamp data used in time-based performance metrics. The research demonstrated process mining as a feasible methodology to assess data quality of time-based hospital performance metrics. The insight gained from this research enabled appropriate corrective actions to be put in place to address the data quality issues. © The Author(s) 2015.

  6. Optimizing Blasting’s Air Overpressure Prediction Model using Swarm Intelligence

    NASA Astrophysics Data System (ADS)

    Nur Asmawisham Alel, Mohd; Ruben Anak Upom, Mark; Asnida Abdullah, Rini; Hazreek Zainal Abidin, Mohd

    2018-04-01

    Air overpressure (AOp) resulting from blasting can cause damage and nuisance to nearby civilians. Thus, it is important to be able to predict AOp accurately. In this study, 8 different Artificial Neural Networks (ANNs) were developed for the purpose of predicting AOp. The ANN models were trained using different variants of the Particle Swarm Optimization (PSO) algorithm. AOp predictions were also made using an empirical equation, as suggested by the United States Bureau of Mines (USBM), to serve as a benchmark. In order to develop the models, 76 blasting operations in Hulu Langat were investigated. All the ANN models were found to outperform the USBM equation in three performance metrics: root mean square error (RMSE), mean absolute percentage error (MAPE) and coefficient of determination (R2). Using a performance ranking method, MSO-Rand-Mut was determined to be the best prediction model for AOp with a performance metric of RMSE=2.18, MAPE=1.73% and R2=0.97. The result shows that ANN models trained using PSO are capable of predicting AOp with great accuracy.
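    The three performance metrics named above can be computed as in the following sketch. It uses the common textbook definitions (R2 as one minus the ratio of residual to total sum of squares); the abstract does not state which exact formulas the authors used, and the observed and predicted values below are invented for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed must be nonzero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values (same units).
observed = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 121.0, 129.0]

print(rmse(observed, predicted))
print(mape(observed, predicted))
print(r_squared(observed, predicted))
```

    Lower RMSE and MAPE and higher R2 indicate a better fit, so a ranking over several models has to combine metrics that point in opposite directions.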

  7. Cross-layer protocol design for QoS optimization in real-time wireless sensor networks

    NASA Astrophysics Data System (ADS)

    Hortos, William S.

    2010-04-01

    The metrics of quality of service (QoS) for each sensor type in a wireless sensor network can be associated with metrics for multimedia that describe the quality of fused information, e.g., throughput, delay, jitter, packet error rate, information correlation, etc. These QoS metrics are typically set at the highest, or application, layer of the protocol stack to ensure that performance requirements for each type of sensor data are satisfied. Application-layer metrics, in turn, depend on the support of the lower protocol layers: session, transport, network, data link (MAC), and physical. The dependencies of the QoS metrics on the performance of the higher layers of the Open System Interconnection (OSI) reference model of the WSN protocol, together with that of the lower three layers, are the basis for a comprehensive approach to QoS optimization for multiple sensor types in a general WSN model. The cross-layer design accounts for the distributed power consumption along energy-constrained routes and their constituent nodes. Following the author's previous work, the cross-layer interactions in the WSN protocol are represented by a set of concatenated protocol parameters and enabling resource levels. The "best" cross-layer designs to achieve optimal QoS are established by applying the general theory of martingale representations to the parameterized multivariate point processes (MVPPs) for discrete random events occurring in the WSN. Adaptive control of network behavior through the cross-layer design is realized through the parametric factorization of the stochastic conditional rates of the MVPPs. The cross-layer protocol parameters for optimal QoS are determined in terms of solutions to stochastic dynamic programming conditions derived from models of transient flows for heterogeneous sensor data and aggregate information over a finite time horizon. 
Markov state processes, embedded within the complex combinatorial history of WSN events, are more computationally tractable and lead to simplifications for any simulated or analytical performance evaluations of the cross-layer designs.

  8. Predicting streamflow regime metrics for ungauged streams in Colorado, Washington, and Oregon

    NASA Astrophysics Data System (ADS)

    Sanborn, Stephen C.; Bledsoe, Brian P.

    2006-06-01

    Streamflow prediction in ungauged basins provides essential information for water resources planning and management and ecohydrological studies yet remains a fundamental challenge to the hydrological sciences. A methodology is presented for stratifying streamflow regimes of gauged locations, classifying the regimes of ungauged streams, and developing models for predicting a suite of ecologically pertinent streamflow metrics for these streams. Eighty-four streamflow metrics characterizing various flow regime attributes were computed along with physical and climatic drainage basin characteristics for 150 streams with little or no streamflow modification in Colorado, Washington, and Oregon. The diverse hydroclimatology of the study area necessitates flow regime stratification and geographically independent clusters were identified and used to develop separate predictive models for each flow regime type. Multiple regression models for flow magnitude, timing, and rate of change metrics were quite accurate with many adjusted R2 values exceeding 0.80, while models describing streamflow variability did not perform as well. Separate stratification schemes for high, low, and average flows did not considerably improve models for metrics describing those particular aspects of the regime over a scheme based on the entire flow regime. Models for streams identified as 'snowmelt' type were improved if sites in Colorado and the Pacific Northwest were separated to better stratify the processes driving streamflow in these regions thus revealing limitations of geographically independent streamflow clusters. This study demonstrates that a broad suite of ecologically relevant streamflow characteristics can be accurately modeled across large heterogeneous regions using this framework. Applications of the resulting models include stratifying biomonitoring sites and quantifying linkages between specific aspects of flow regimes and aquatic community structure. 
In particular, the results bode well for modeling ecological processes related to high-flow magnitude, timing, and rate of change such as the recruitment of fish and riparian vegetation across large regions.

  9. Comparison between stochastic and machine learning methods for hydrological multi-step ahead forecasting: All forecasts are wrong!

    NASA Astrophysics Data System (ADS)

    Papacharalampous, Georgia; Tyralis, Hristos; Koutsoyiannis, Demetris

    2017-04-01

    Machine learning (ML) is considered to be a promising approach to hydrological processes forecasting. We conduct a comparison between several stochastic and ML point estimation methods by performing large-scale computational experiments based on simulations. The purpose is to provide generalized results, while the respective comparisons in the literature are usually based on case studies. The stochastic methods used include simple methods, models from the frequently used families of Autoregressive Moving Average (ARMA), Autoregressive Fractionally Integrated Moving Average (ARFIMA) and Exponential Smoothing models. The ML methods used are Random Forests (RF), Support Vector Machines (SVM) and Neural Networks (NN). The comparison refers to the multi-step ahead forecasting properties of the methods. A total of 20 methods are used, among which 9 are the ML methods. 12 simulation experiments are performed, while each of them uses 2 000 simulated time series of 310 observations. The time series are simulated using stochastic processes from the families of ARMA and ARFIMA models. Each time series is split into a fitting (first 300 observations) and a testing set (last 10 observations). The comparative assessment of the methods is based on 18 metrics, that quantify the methods' performance according to several criteria related to the accurate forecasting of the testing set, the capturing of its variation and the correlation between the testing and forecasted values. The most important outcome of this study is that there is not a uniformly better or worse method. However, there are methods that are regularly better or worse than others with respect to specific metrics. It appears that, although a general ranking of the methods is not possible, their classification based on their similar or contrasting performance in the various metrics is possible to some extent. 
Another important conclusion is that more sophisticated methods do not necessarily provide better forecasts compared to simpler methods. It is pointed out that the ML methods do not differ dramatically from the stochastic methods, while it is interesting that the NN, RF and SVM algorithms used in this study offer potentially very good performance in terms of accuracy. It should be noted that, although this study focuses on hydrological processes, the results are of general scientific interest. Another important point in this study is the use of several methods and metrics. Using fewer methods and fewer metrics would have led to a very different overall picture, particularly if those fewer metrics corresponded to fewer criteria. For this reason, we consider that the proposed methodology is appropriate for the evaluation of forecasting methods.
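
    The fitting/testing split described above (first 300 observations for fitting, last 10 for testing) can be sketched as follows. This is only an illustrative sketch, not the authors' code: the AR(1) process, the least-squares fit, the naive benchmark, the RMSE metric, and all function names are assumptions chosen for brevity, whereas the study itself uses 20 methods and 18 metrics.

```python
import math
import random


def simulate_ar1(n, phi=0.7, sigma=1.0, seed=0):
    """Simulate n observations of a zero-mean AR(1) process (assumed example process)."""
    rng = random.Random(seed)
    series = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, sigma))
    return series


def fit_ar1(train):
    """Least-squares estimate of the AR(1) coefficient, without intercept."""
    num = sum(a * b for a, b in zip(train[1:], train[:-1]))
    den = sum(a * a for a in train[:-1])
    return num / den


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def evaluate(series, n_test=10):
    """Fit on all but the last n_test points, then score multi-step forecasts."""
    train, test = series[:-n_test], series[-n_test:]
    phi_hat = fit_ar1(train)
    # h-step-ahead AR(1) forecast: phi_hat**h times the last fitted observation.
    ar_forecast = [phi_hat ** h * train[-1] for h in range(1, n_test + 1)]
    # Naive benchmark: repeat the last fitted observation.
    naive_forecast = [train[-1]] * n_test
    return {"ar1": rmse(test, ar_forecast), "naive": rmse(test, naive_forecast)}


scores = evaluate(simulate_ar1(310))
```

    Comparing the two entries of `scores` over many simulated series, rather than one, is what would allow the kind of generalized comparison the abstract describes.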

  10. Modeling and Performance Evaluation of Backoff Misbehaving Nodes in CSMA/CA Networks

    DTIC Science & Technology

    2012-08-01

    Modeling and Performance Evaluation of Backoff Misbehaving Nodes in CSMA/CA Networks Zhuo Lu, Student Member, IEEE, Wenye Wang, Senior Member, IEEE... misbehaving nodes can obtain, we define and study two general classes of backoff misbehavior: continuous misbehavior, which keeps manipulating the backoff...misbehavior sporadically. Our approach is to introduce a new performance metric, namely order gain, to characterize the performance benefits of misbehaving

  11. Determining GPS average performance metrics

    NASA Technical Reports Server (NTRS)

    Moore, G. V.

    1995-01-01

    Analytic and semi-analytic methods are used to show that users of the GPS constellation can expect performance variations based on their location. Specifically, performance is shown to be a function of both altitude and latitude. These results stem from the fact that the GPS constellation is itself non-uniform. For example, GPS satellites are over four times as likely to be directly over Tierra del Fuego as over Hawaii or Singapore. Inevitable performance variations due to user location occur for ground, sea, air and space GPS users. These performance variations can be studied in an average relative sense. A semi-analytic tool which symmetrically allocates GPS satellite latitude belt dwell times among longitude points is used to compute average performance metrics. These metrics include average number of GPS vehicles visible, relative average accuracies in the radial, intrack and crosstrack (or radial, north/south, east/west) directions, and relative average PDOP or GDOP. The tool can be quickly changed to incorporate various user antenna obscuration models and various GPS constellation designs. Among other applications, tool results can be used in studies to: predict locations and geometries of best/worst case performance, design GPS constellations, determine optimal user antenna location and understand performance trends among various users.

  12. A Validation of Object-Oriented Design Metrics as Quality Indicators

    NASA Technical Reports Server (NTRS)

    Basili, Victor R.; Briand, Lionel C.; Melo, Walcelio

    1997-01-01

    This paper presents the results of a study in which we empirically investigated the suite of object-oriented (OO) design metrics introduced in another work. More specifically, our goal is to assess these metrics as predictors of fault-prone classes and, therefore, determine whether they can be used as early quality indicators. This study is complementary to the work described where the same suite of metrics had been used to assess frequencies of maintenance changes to classes. To perform our validation accurately, we collected data on the development of eight medium-sized information management systems based on identical requirements. All eight projects were developed using a sequential life cycle model, a well-known OO analysis/design method and the C++ programming language. Based on empirical and quantitative analysis, the advantages and drawbacks of these OO metrics are discussed. Several of Chidamber and Kemerer's OO metrics appear to be useful to predict class fault-proneness during the early phases of the life-cycle. Also, on our data set, they are better predictors than 'traditional' code metrics, which can only be collected at a later phase of the software development processes.

  13. A Validation of Object-Oriented Design Metrics

    NASA Technical Reports Server (NTRS)

    Basili, Victor R.; Briand, Lionel; Melo, Walcelio L.

    1995-01-01

    This paper presents the results of a study conducted at the University of Maryland in which we experimentally investigated the suite of Object-Oriented (OO) design metrics introduced by [Chidamber and Kemerer, 1994]. In order to do this, we assessed these metrics as predictors of fault-prone classes. This study is complementary to [Li and Henry, 1993] where the same suite of metrics had been used to assess frequencies of maintenance changes to classes. To perform our validation accurately, we collected data on the development of eight medium-sized information management systems based on identical requirements. All eight projects were developed using a sequential life cycle model, a well-known OO analysis/design method and the C++ programming language. Based on experimental results, the advantages and drawbacks of these OO metrics are discussed and suggestions for improvement are provided. Several of Chidamber and Kemerer's OO metrics appear to be adequate to predict class fault-proneness during the early phases of the life-cycle. We also showed that they are, on our data set, better predictors than "traditional" code metrics, which can only be collected at a later phase of the software development processes.

  14. Numerical model validation using experimental data: Application of the area metric on a Francis runner

    NASA Astrophysics Data System (ADS)

    Chatenet, Q.; Tahan, A.; Gagnon, M.; Chamberland-Lauzon, J.

    2016-11-01

    Nowadays, engineers are able to solve complex equations thanks to the increase in computing capacity. Thus, finite element software is widely used, especially in the field of mechanics, to predict part behavior such as strain, stress and natural frequency. However, it can be difficult to determine how a model might be right or wrong, or whether a model is better than another one. Nevertheless, during the design phase, it is very important to estimate how the hydroelectric turbine blades will behave according to the stress to which they are subjected. Indeed, the static and dynamic stress levels will influence the blade's fatigue resistance and thus its lifetime, which is a significant feature. In the industry, engineers generally use either graphic representation, hypothesis tests such as the Student test, or linear regressions in order to compare experimental to estimated data from the numerical model. Due to the variability in personal interpretation (reproducibility), graphical validation is not considered objective. For an objective assessment, it is essential to use a robust validation metric to measure the conformity of predictions against data. We propose to use the area metric in the case of a turbine blade that meets the key points of the ASME Standards and produces a quantitative measure of agreement between simulations and empirical data. This validation metric excludes any belief and criterion of accepting a model which increases robustness. The present work is aimed at applying a validation method, according to ASME V&V 10 recommendations. Firstly, the area metric is applied to the case of a real Francis runner whose geometry and boundary conditions are complex. Secondly, the area metric will be compared to classical regression methods to evaluate the performance of the method. Finally, we will discuss the use of the area metric as a tool to correct simulations.

  15. Reliability and Productivity Modeling for the Optimization of Separated Spacecraft Interferometers

    NASA Technical Reports Server (NTRS)

    Kenny, Sean (Technical Monitor); Wertz, Julie

    2002-01-01

    As technological systems grow in capability, they also grow in complexity. Due to this complexity, it is no longer possible for a designer to use engineering judgement to identify the components that have the largest impact on system life cycle metrics, such as reliability, productivity, cost, and cost effectiveness. One way of identifying these key components is to build quantitative models and analysis tools that can be used to aid the designer in making high level architecture decisions. Once these key components have been identified, two main approaches to improving a system using these components exist: add redundancy or improve the reliability of the component. In reality, the most effective approach to almost any system will be some combination of these two approaches, in varying orders of magnitude for each component. Therefore, this research tries to answer the question of how to divide funds, between adding redundancy and improving the reliability of components, to most cost effectively improve the life cycle metrics of a system. While this question is relevant to any complex system, this research focuses on one type of system in particular: Separated Spacecraft Interferometers (SSI). Quantitative models are developed to analyze the key life cycle metrics of different SSI system architectures. Next, tools are developed to compare a given set of architectures in terms of total performance, by coupling different life cycle metrics together into one performance metric. Optimization tools, such as simulated annealing and genetic algorithms, are then used to search the entire design space to find the "optimal" architecture design. Sensitivity analysis tools have been developed to determine how sensitive the results of these analyses are to uncertain user defined parameters. Finally, several possibilities for the future work that could be done in this area of research are presented.

  16. Multistressor predictive models of invertebrate condition in the Corn Belt, USA

    USGS Publications Warehouse

    Waite, Ian R.; Van Metre, Peter C.

    2017-01-01

    Understanding the complex relations between multiple environmental stressors and ecological conditions in streams can help guide resource-management decisions. During 14 weeks in spring/summer 2013, personnel from the US Geological Survey and the US Environmental Protection Agency sampled 98 wadeable streams across the Midwest Corn Belt region of the USA for water and sediment quality, physical and habitat characteristics, and ecological communities. We used these data to develop independent predictive disturbance models for 3 macroinvertebrate metrics and a multimetric index. We developed the models based on boosted regression trees (BRT) for 3 stressor categories, land use/land cover (geographic information system [GIS]), all in-stream stressors combined (nutrients, habitat, and contaminants), and for GIS plus in-stream stressors. The GIS plus in-stream stressor models had the best overall performance with an average cross-validation R2 across all models of 0.41. The models were generally consistent in the explanatory variables selected within each stressor group across the 4 invertebrate metrics modeled. Variables related to riparian condition, substrate size or embeddedness, velocity and channel shape, nutrients (primarily NH3), and contaminants (pyrethroid degradates) were important descriptors of the invertebrate metrics. Models based on all measured in-stream stressors performed comparably to models based on GIS landscape variables, suggesting that the in-stream stressor characterization reasonably represents the dominant factors affecting invertebrate communities and that GIS variables are acting as surrogates for in-stream stressors that directly affect in-stream biota.

  17. Vehicle Integrated Prognostic Reasoner (VIPR) Metric Report

    NASA Technical Reports Server (NTRS)

    Cornhill, Dennis; Bharadwaj, Raj; Mylaraswamy, Dinkar

    2013-01-01

    This document outlines a set of metrics for evaluating the diagnostic and prognostic schemes developed for the Vehicle Integrated Prognostic Reasoner (VIPR), a system-level reasoner that encompasses the multiple levels of large, complex systems such as those for aircraft and spacecraft. VIPR health managers are organized hierarchically and operate together to derive diagnostic and prognostic inferences from symptoms and conditions reported by a set of diagnostic and prognostic monitors. For layered reasoners such as VIPR, the overall performance cannot be evaluated by metrics solely directed toward timely detection and accuracy of estimation of the faults in individual components. Among other factors, overall vehicle reasoner performance is governed by the effectiveness of the communication schemes between monitors and reasoners in the architecture, and the ability to propagate and fuse relevant information to make accurate, consistent, and timely predictions at different levels of the reasoner hierarchy. We outline an extended set of diagnostic and prognostics metrics that can be broadly categorized as evaluation measures for diagnostic coverage, prognostic coverage, accuracy of inferences, latency in making inferences, computational cost, and sensitivity to different fault and degradation conditions. We report metrics from Monte Carlo experiments using two variations of an aircraft reference model that supported both flat and hierarchical reasoning.

  18. On use of image quality metrics for perceptual blur modeling: image/video compression case

    NASA Astrophysics Data System (ADS)

    Cha, Jae H.; Olson, Jeffrey T.; Preece, Bradley L.; Espinola, Richard L.; Abbott, A. Lynn

    2018-02-01

    Linear system theory is employed to make target acquisition performance predictions for electro-optical/infrared imaging systems where the modulation transfer function (MTF) may be imposed from a nonlinear degradation process. Previous research relying on image quality metrics (IQM) methods, which heuristically estimate perceived MTF has supported that an average perceived MTF can be used to model some types of degradation such as image compression. Here, we discuss the validity of the IQM approach by mathematically analyzing the associated heuristics from the perspective of reliability, robustness, and tractability. Experiments with standard images compressed by x.264 encoding suggest that the compression degradation can be estimated by a perceived MTF within boundaries defined by well-behaved curves with marginal error. Our results confirm that the IQM linearizer methodology provides a credible tool for sensor performance modeling.

  19. Texture metric that predicts target detection performance

    NASA Astrophysics Data System (ADS)

    Culpepper, Joanne B.

    2015-12-01

    Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict probability of detection and mean search time. The two texture metrics are local clutter metrics and are based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance of the nine military vehicles in complex natural scenes found in the Search_2 dataset are presented. Comparison is also made between four other common clutter metrics found in the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.

  20. Substantial Progress Yet Significant Opportunity for Improvement in Stroke Care in China.

    PubMed

    Li, Zixiao; Wang, Chunjuan; Zhao, Xingquan; Liu, Liping; Wang, Chunxue; Li, Hao; Shen, Haipeng; Liang, Li; Bettger, Janet; Yang, Qing; Wang, David; Wang, Anxin; Pan, Yuesong; Jiang, Yong; Yang, Xiaomeng; Zhang, Changqing; Fonarow, Gregg C; Schwamm, Lee H; Hu, Bo; Peterson, Eric D; Xian, Ying; Wang, Yilong; Wang, Yongjun

    2016-11-01

    Stroke is a leading cause of death in China. Yet the adherence to guideline-recommended ischemic stroke performance metrics in the past decade has been previously shown to be suboptimal. Since then, several nationwide stroke quality management initiatives have been conducted in China. We sought to determine whether adherence had improved since then. Data were obtained from the 2 phases of China National Stroke Registries, which included 131 hospitals (12 173 patients with acute ischemic stroke) in China National Stroke Registries phase 1 from 2007 to 2008 versus 219 hospitals (19 604 patients) in China National Stroke Registries phase 2 from 2012 to 2013. Multiple regression models were developed to evaluate the difference in adherence to performance measure between the 2 study periods. The overall quality of care has improved over time, as reflected by the higher composite score of 0.76 in 2012 to 2013 versus 0.63 in 2007 to 2008. Nine of 13 individual performance metrics improved. However, there were no significant improvements in the rates of intravenous thrombolytic therapy and anticoagulation for atrial fibrillation. After multivariate analysis, there remained a significant 1.17-fold (95% confidence interval, 1.14-1.21) increase in the odds of delivering evidence-based performance metrics in the more recent time periods versus older data. The performance metrics with the most significantly increased odds included stroke education, dysphagia screening, smoking cessation, and antithrombotics at discharge. Adherence to stroke performance metrics has increased over time, but significant opportunities remain for further improvement. Continuous stroke quality improvement program should be developed as a national priority in China. © 2016 American Heart Association, Inc.

  1. Comparative Simulation Study of Glucose Control Methods Designed for Use in the Intensive Care Unit Setting via a Novel Controller Scoring Metric.

    PubMed

    DeJournett, Jeremy; DeJournett, Leon

    2017-11-01

    Effective glucose control in the intensive care unit (ICU) setting has the potential to decrease morbidity and mortality rates and thereby decrease health care expenditures. To evaluate what constitutes effective glucose control, typically several metrics are reported, including time in range, time in mild and severe hypoglycemia, coefficient of variation, and others. To date, there is no one metric that combines all of these individual metrics to give a number indicative of overall performance. We proposed a composite metric that combines 5 commonly reported metrics, and we used this composite metric to compare 6 glucose controllers. We evaluated the following controllers: Ideal Medical Technologies (IMT) artificial-intelligence-based controller, Yale protocol, Glucommander, Wintergerst et al PID controller, GRIP, and NICE-SUGAR. We evaluated each controller across 80 simulated patients, 4 clinically relevant exogenous dextrose infusions, and one nonclinical infusion as a test of the controller's ability to handle difficult situations. This gave a total of 2400 5-day simulations, and 585 604 individual glucose values for analysis. We used a random walk sensor error model that gave a 10% MARD. For each controller, we calculated severe hypoglycemia (<40 mg/dL), mild hypoglycemia (40-69 mg/dL), normoglycemia (70-140 mg/dL), hyperglycemia (>140 mg/dL), and coefficient of variation (CV), as well as our novel controller metric. For the controllers tested, we achieved the following median values for our novel controller scoring metric: IMT: 88.1, YALE: 46.7, GLUC: 47.2, PID: 50, GRIP: 48.2, NICE: 46.4. The novel scoring metric employed in this study shows promise as a means for evaluating new and existing ICU-based glucose controllers, and it could be used in the future to compare results of glucose control studies in critical care. The IMT AI-based glucose controller demonstrated the most consistent performance results based on this new metric.
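
    The five individual metrics named above can be computed from a list of glucose readings as in the following minimal sketch. The band limits come from the abstract; the function name, the handling of non-integer readings between 69 and 70 mg/dL (counted here as mild hypoglycemia), and the use of the population standard deviation for the CV are assumptions. The abstract does not say how the composite score weights these metrics, so no composite is attempted.

```python
import statistics


def glucose_summary(values_mg_dl):
    """Percent of readings in each glycemic band, plus coefficient of variation."""
    counts = {"severe_hypo": 0, "mild_hypo": 0, "normo": 0, "hyper": 0}
    for g in values_mg_dl:
        if g < 40:            # severe hypoglycemia: <40 mg/dL
            counts["severe_hypo"] += 1
        elif g < 70:          # mild hypoglycemia: 40-69 mg/dL
            counts["mild_hypo"] += 1
        elif g <= 140:        # normoglycemia: 70-140 mg/dL
            counts["normo"] += 1
        else:                 # hyperglycemia: >140 mg/dL
            counts["hyper"] += 1
    n = len(values_mg_dl)
    summary = {name: 100.0 * count / n for name, count in counts.items()}
    # CV as a percentage: standard deviation divided by mean (population SD assumed).
    summary["cv_percent"] = (
        100.0 * statistics.pstdev(values_mg_dl) / statistics.mean(values_mg_dl)
    )
    return summary


example = glucose_summary([35, 50, 100, 150])
```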

  2. Psychomotor skills assessment by motion analysis in minimally invasive surgery on an animal organ.

    PubMed

    Hofstad, Erlend Fagertun; Våpenstad, Cecilie; Bø, Lars Eirik; Langø, Thomas; Kuhry, Esther; Mårvik, Ronald

    2017-08-01

    A high level of psychomotor skills is required to perform minimally invasive surgery (MIS) safely. To be able to measure these skills is important in the assessment of surgeons, as it enables constructive feedback during training. The aim of this study was to test the validity of an objective and automatic assessment method using motion analysis during a laparoscopic procedure on an animal organ. Experienced surgeons in laparoscopy (experts) and medical students (novices) performed a cholecystectomy on a porcine liver box model. The motions of the surgical tools were acquired and analyzed by 11 different motion-related metrics, i.e., a total of 19 metrics as eight of them were measured separately for each hand. We identified for which of the metrics the experts outperformed the novices. In total, two experts and 28 novices were included. The experts achieved significantly better results for 13 of the 19 instrument motion metrics. Expert performance is characterized by a low time to complete the cholecystectomy, high bimanual dexterity (instrument coordination), a limited amount of movement and low measurement of motion smoothness of the dissection instrument, and relatively high usage of the grasper to optimize tissue positioning for dissection.

  3. Validating models of target acquisition performance in the dismounted soldier context

    NASA Astrophysics Data System (ADS)

    Glaholt, Mackenzie G.; Wong, Rachel K.; Hollands, Justin G.

    2018-04-01

    The problem of predicting real-world operator performance with digital imaging devices is of great interest within the military and commercial domains. There are several approaches to this problem, including: field trials with imaging devices, laboratory experiments using imagery captured from these devices, and models that predict human performance based on imaging device parameters. The modeling approach is desirable, as both field trials and laboratory experiments are costly and time-consuming. However, the data from these experiments is required for model validation. Here we considered this problem in the context of dismounted soldiering, for which detection and identification of human targets are essential tasks. Human performance data were obtained for two-alternative detection and identification decisions in a laboratory experiment in which photographs of human targets were presented on a computer monitor and the images were digitally magnified to simulate range-to-target. We then compared the predictions of different performance models within the NV-IPM software package: Targeting Task Performance (TTP) metric model and the Johnson model. We also introduced a modification to the TTP metric computation that incorporates an additional correction for target angular size. We examined model predictions using NV-IPM default values for a critical model constant, V50, and we also considered predictions when this value was optimized to fit the behavioral data. When using default values, certain model versions produced a reasonably close fit to the human performance data in the detection task, while for the identification task all models substantially overestimated performance. When using fitted V50 values the models produced improved predictions, though the slopes of the performance functions were still shallow compared to the behavioral data. 
These findings are discussed in relation to the models' designs and parameters, and the characteristics of the behavioral paradigm.

  4. Damage modeling and statistical analysis of optics damage performance in MJ-class laser systems.

    PubMed

    Liao, Zhi M; Raymond, B; Gaylord, J; Fallejo, R; Bude, J; Wegner, P

    2014-11-17

    Modeling the lifetime of a fused silica optic is described for a multiple beam, MJ-class laser system. This entails combining optic processing data along with laser shot data to account for complete history of optic processing and shot exposure. Integrating with online inspection data allows for the construction of a performance metric to describe how an optic performs with respect to the model. This methodology helps to validate the damage model as well as allows strategic planning and identifying potential hidden parameters that are affecting the optic's performance.

  5. Funding Ohio Community Colleges: An Analysis of the Performance Funding Model

    ERIC Educational Resources Information Center

    Krueger, Cynthia A.

    2013-01-01

    This study examined Ohio's community college performance funding model that is based on seven student success metrics. A percentage of the regular state subsidy is withheld from institutions; funding is earned back based on the three-year average of success points achieved in comparison to other community colleges in the state. Analysis of…

  6. Detecting understory plant invasion in urban forests using LiDAR

    NASA Astrophysics Data System (ADS)

    Singh, Kunwar K.; Davis, Amy J.; Meentemeyer, Ross K.

    2015-06-01

    Light detection and ranging (LiDAR) data are increasingly used to measure structural characteristics of urban forests but are rarely used to detect the growing problem of exotic understory plant invaders. We explored the merits of using LiDAR-derived metrics alone and through integration with spectral data to detect the spatial distribution of the exotic understory plant Ligustrum sinense, a rapidly spreading invader in the urbanizing region of Charlotte, North Carolina, USA. We analyzed regional-scale L. sinense occurrence data collected over the course of three years with LiDAR-derived metrics of forest structure that were categorized into the following groups: overstory, understory, topography, and overall vegetation characteristics, and IKONOS spectral features - optical. Using random forest (RF) and logistic regression (LR) classifiers, we assessed the relative contributions of LiDAR and IKONOS derived variables to the detection of L. sinense. We compared the top performing models developed for a smaller, nested experimental extent using RF and LR classifiers, and used the best overall model to produce a predictive map of the spatial distribution of L. sinense across our county-wide study extent. RF classification of LiDAR-derived topography metrics produced the highest mapping accuracy estimates, outperforming IKONOS data by 17.5% and the integration of LiDAR and IKONOS data by 5.3%. The top performing model from the RF classifier produced the highest kappa of 64.8%, improving on the parsimonious LR model kappa by 31.1% with a moderate gain of 6.2% over the county extent model. Our results demonstrate the superiority of LiDAR-derived metrics over spectral data and fusion of LiDAR and spectral data for accurately mapping the spatial distribution of the forest understory invader L. sinense.
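
    The kappa values reported above measure agreement beyond chance between predicted and observed classes. A minimal sketch of Cohen's kappa computed from a confusion matrix is given below; the function name and the example counts are hypothetical and are not taken from the study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: observed, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical presence/absence counts, for illustration only.
kappa = cohens_kappa([[40, 10], [10, 40]])
```

    With these made-up counts the observed agreement is 0.8 and the chance agreement is 0.5, so kappa is about 0.6.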

  7. Surveillance metrics sensitivity study.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamada, Michael S.; Bierbaum, Rene Lynn; Robertson, Alix A.

    2011-09-01

    In September of 2009, a Tri-Lab team was formed to develop a set of metrics relating to the NNSA nuclear weapon surveillance program. The purpose of the metrics was to develop a more quantitative and/or qualitative metric(s) describing the results of realized or non-realized surveillance activities on our confidence in reporting reliability and assessing the stockpile. As a part of this effort, a statistical sub-team investigated various techniques and developed a complementary set of statistical metrics that could serve as a foundation for characterizing aspects of meeting the surveillance program objectives. The metrics are a combination of tolerance limit calculations and power calculations, intending to answer level-of-confidence type questions with respect to the ability to detect certain undesirable behaviors (catastrophic defects, margin insufficiency defects, and deviations from a model). Note that the metrics are not intended to gauge product performance but instead the adequacy of surveillance. This report gives a short description of four metrics types that were explored and the results of a sensitivity study conducted to investigate their behavior for various inputs. The results of the sensitivity study can be used to set the risk parameters that specify the level of stockpile problem that the surveillance program should be addressing.

  8. Surveillance Metrics Sensitivity Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bierbaum, R; Hamada, M; Robertson, A

    2011-11-01

    In September of 2009, a Tri-Lab team was formed to develop a set of metrics relating to the NNSA nuclear weapon surveillance program. The purpose of the metrics was to develop a more quantitative and/or qualitative metric(s) describing the results of realized or non-realized surveillance activities on our confidence in reporting reliability and assessing the stockpile. As a part of this effort, a statistical sub-team investigated various techniques and developed a complementary set of statistical metrics that could serve as a foundation for characterizing aspects of meeting the surveillance program objectives. The metrics are a combination of tolerance limit calculations and power calculations, intending to answer level-of-confidence type questions with respect to the ability to detect certain undesirable behaviors (catastrophic defects, margin insufficiency defects, and deviations from a model). Note that the metrics are not intended to gauge product performance but instead the adequacy of surveillance. This report gives a short description of four metrics types that were explored and the results of a sensitivity study conducted to investigate their behavior for various inputs. The results of the sensitivity study can be used to set the risk parameters that specify the level of stockpile problem that the surveillance program should be addressing.

  9. Evaluation of mean climate in a chemistry-climate model simulation

    NASA Astrophysics Data System (ADS)

    Hong, S.; Park, H.; Wie, J.; Park, R.; Lee, S.; Moon, B. K.

    2017-12-01

    Incorporation of interactive chemistry is essential for understanding chemistry-climate interactions and feedback processes in climate models. Here we assess a newly developed chemistry-climate model (GRIMs-Chem), which is based on the Global/Regional Integrated Model system (GRIMs) including the aerosol direct effect as well as stratospheric linearized ozone chemistry (LINOZ). We ran GRIMs-Chem with observed sea surface temperature for the period 1979-2010, and compared the simulation results with observations and also with CMIP models. To measure the relative performance of our model, we define a quantitative performance metric using the Taylor diagram. This metric allows us to assess overall features in simulating multiple variables. Overall, our model better reproduces the zonal mean spatial pattern of temperature, horizontal wind, vertical motion, and relative humidity relative to other models. However, the model did not produce good simulations in the upper troposphere (200 hPa). It is currently unclear which model processes are responsible for this. Acknowledgements: This research was supported by the Korea Ministry of Environment (MOE) as "Climate Change Correspondence Program."

  10. Value-based metrics and Internet-based enterprises

    NASA Astrophysics Data System (ADS)

    Gupta, Krishan M.

    2001-10-01

    Within the last few years, a host of value-based metrics like EVA, MVA, TBR, CFORI, and TSR have evolved. This paper attempts to analyze the validity and applicability of EVA and Balanced Scorecard for Internet-based organizations. Despite the collapse of the dot-com model, firms engaged in e-commerce continue to struggle to find new ways to account for customer base, technology, employees, knowledge, etc., as part of the value of the firm. While some metrics, like the Balanced Scorecard, are geared towards internal use, others like EVA are for external use. Value-based metrics are used for performing internal audits as well as comparing firms against one another, and can also be effectively utilized by individuals outside the firm looking to determine if the firm is creating value for its stakeholders.

  11. Real-time performance monitoring and management system

    DOEpatents

    Budhraja, Vikram S [Los Angeles, CA]; Dyer, James D [La Mirada, CA]; Martinez Morales, Carlos A [Upland, CA]

    2007-06-19

    A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a data base, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.

  12. Validation of the 5th and 95th Percentile Hybrid III Anthropomorphic Test Device Finite Element Model

    NASA Technical Reports Server (NTRS)

    Lawrence, C.; Somers, J. T.; Baldwin, M. A.; Wells, J. A.; Newby, N.; Currie, N. J.

    2014-01-01

    NASA spacecraft design requirements for occupant protection are a combination of the Brinkley criteria and injury metrics extracted from anthropomorphic test devices (ATD's). For the ATD injury metrics, the requirements specify the use of the 5th percentile female Hybrid III and the 95th percentile male Hybrid III. Furthermore, each of these ATD's is required to be fitted with an articulating pelvis and a straight spine. The articulating pelvis is necessary for the ATD to fit into spacecraft seats, while the straight spine is required as injury metrics for vertical accelerations are better defined for this configuration. The requirements require that physical testing be performed with both ATD's to demonstrate compliance. Before compliance testing can be conducted, extensive modeling and simulation are required to determine appropriate test conditions, simulate conditions not feasible for testing, and assess design features to better ensure compliance testing is successful. While finite element (FE) models are currently available for many of the physical ATD's, currently there are no complete models for either the 5th percentile female or the 95th percentile male Hybrid III with a straight spine and articulating pelvis. The purpose of this work is to assess the accuracy of the existing Livermore Software Technology Corporation's FE models of the 5th and 95th percentile ATD's. To perform this assessment, a series of tests will be performed at Wright Patterson Air Force Research Lab using their horizontal impact accelerator sled test facility. The ATD's will be placed in the Orion seat with a modified-advanced-crew-escape-system (MACES) pressure suit and helmet, and driven with loadings similar to what is expected for the actual Orion vehicle during landing, launch abort, and chute deployment. Test data will be compared to analytical predictions and modelling uncertainty factors will be determined for each injury metric. 
Additionally, the test data will be used to further improve the FE model, particularly in the areas of the ATD neck components, harness, and suit and helmet effects.

  13. Proceedings of the 66th National Conference on Weights and Measures, 1981

    NASA Astrophysics Data System (ADS)

    Wollin, H. F.; Barbrow, L. E.; Heffernan, A. P.

    1981-12-01

    Major issues discussed included measurement science education, enforcement uniformity, national type approval, inch-pound and metric labeling provisions, new design and performance requirements for weighing and measuring technology, metric conversion of retail gasoline dispensers, weights and measures program evaluation studies of model State laws and regulations and their adoption by citation or other means by State and local jurisdictions, and a report of States conducting grain moisture meter testing programs.

  14. Assessment of six dissimilarity metrics for climate analogues

    NASA Astrophysics Data System (ADS)

    Grenier, Patrick; Parent, Annie-Claude; Huard, David; Anctil, François; Chaumont, Diane

    2013-04-01

    Spatial analogue techniques consist of identifying locations whose recent-past climate is similar in some aspects to the future climate anticipated at a reference location. When identifying analogues, one key step is the quantification of the dissimilarity between two climates separated in time and space, which involves the choice of a metric. In this communication, spatial analogues and their usefulness are briefly discussed. Next, six metrics are presented (the standardized Euclidean distance, the Kolmogorov-Smirnov statistic, the nearest-neighbor distance, the Zech-Aslan energy statistic, the Friedman-Rafsky runs statistic and the Kullback-Leibler divergence), along with a set of criteria used for their assessment. The related case study involves the use of numerical simulations performed with the Canadian Regional Climate Model (CRCM-v4.2.3), from which three annual indicators (total precipitation, heating degree-days and cooling degree-days) are calculated over 30-year periods (1971-2000 and 2041-2070). Results indicate that the six metrics identify comparable analogue regions at a relatively large scale, but best analogues may differ substantially. For best analogues, it is also shown that the uncertainty stemming from the metric choice generally does not exceed that stemming from the simulation or model choice. A synthesis of the advantages and drawbacks of each metric is finally presented, in which the Zech-Aslan energy statistic stands out as the most recommended metric for analogue studies, whereas the Friedman-Rafsky runs statistic is the least recommended, based on this case study.
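
    Two of the six metrics named above have short textbook definitions, sketched below in Python. This is an illustration only, not the authors' implementation: the function names, the choice to scale by the reference climate's standard deviation, and the input layout (per-indicator means for the distance, raw annual samples for the KS statistic) are assumptions.

```python
import math
from bisect import bisect_right


def standardized_euclidean(ref_means, cand_means, ref_stds):
    """Distance between two climates given per-indicator means.

    Each difference is divided by the reference standard deviation
    (an assumed scaling) so indicators with different units are comparable.
    """
    return math.sqrt(sum(((c - r) / s) ** 2
                         for r, c, s in zip(ref_means, cand_means, ref_stds)))


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic for one indicator:
    the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the gap can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

    A smaller value of either function means the candidate location is a closer analogue of the reference climate.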

  15. Voice based gender classification using machine learning

    NASA Astrophysics Data System (ADS)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems across research domains, and several performance metrics are available to evaluate the algorithms in a given area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics for gender classification from acoustic data. The aim is to identify gender with five algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM). The main criterion in evaluating any algorithm is its performance: in classification problems the misclassification rate must be low, which means the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  16. Implementation of a channelized Hotelling observer model to assess image quality of x-ray angiography systems

    PubMed Central

    Favazza, Christopher P.; Fetterly, Kenneth A.; Hangiandreou, Nicholas J.; Leng, Shuai; Schueler, Beth A.

    2015-01-01

    Evaluation of flat-panel angiography equipment through conventional image quality metrics is limited by the scope of standard spatial-domain image quality metric(s), such as contrast-to-noise ratio and spatial resolution, or by restricted access to appropriate data to calculate Fourier domain measurements, such as modulation transfer function, noise power spectrum, and detective quantum efficiency. Observer models have been shown capable of overcoming these limitations and are able to comprehensively evaluate medical-imaging systems. We present a spatial domain-based channelized Hotelling observer model to calculate the detectability index (DI) of our different sized disks and compare the performance of different imaging conditions and angiography systems. When appropriate, changes in DIs were compared to expectations based on the classical Rose model of signal detection to assess linearity of the model with quantum signal-to-noise ratio (SNR) theory. For these experiments, the estimated uncertainty of the DIs was less than 3%, allowing for precise comparison of imaging systems or conditions. For most experimental variables, DI changes were linear with expectations based on quantum SNR theory. DIs calculated for the smallest objects demonstrated nonlinearity with quantum SNR theory due to system blur. Two angiography systems with different detector element sizes were shown to perform similarly across the majority of the detection tasks. PMID:26158086

  17. Performance assessment in brain-computer interface-based augmentative and alternative communication

    PubMed Central

    2013-01-01

    A large number of incommensurable metrics are currently used to report the performance of brain-computer interfaces (BCI) used for augmentative and alternative communication (AAC). The lack of standard metrics precludes the comparison of different BCI-based AAC systems, hindering rapid growth and development of this technology. This paper presents a review of the metrics that have been used to report performance of BCIs used for AAC from January 2005 to January 2012. We distinguish between Level 1 metrics used to report performance at the output of the BCI Control Module, which translates brain signals into logical control output, and Level 2 metrics at the Selection Enhancement Module, which translates logical control to semantic control. We recommend that: (1) the commensurate metrics Mutual Information or Information Transfer Rate (ITR) be used to report Level 1 BCI performance, as these metrics represent information throughput, which is of interest in BCIs for AAC; (2) the BCI-Utility metric be used to report Level 2 BCI performance, as it is capable of handling all current methods of improving BCI performance; (3) these metrics should be supplemented by information specific to each unique BCI configuration; and (4) studies involving Selection Enhancement Modules should report performance at both Level 1 and Level 2 in the BCI system. Following these recommendations will enable efficient comparison between both BCI Control and Selection Enhancement Modules, accelerating research and development of BCI-based AAC systems. PMID:23680020
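
    The abstract does not spell out how ITR is computed. A commonly cited definition (usually attributed to Wolpaw and colleagues) assumes N equally likely targets and errors spread evenly over the wrong targets; the Python sketch below uses that definition as an assumption, and the function and parameter names are hypothetical.

```python
import math


def bits_per_selection(n_choices, accuracy):
    """Information carried by one selection, in bits, under the assumed
    model of equiprobable targets and uniformly distributed errors."""
    if n_choices < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_choices >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_choices)
    if accuracy > 0.0:  # the term p*log2(p) tends to 0 as p -> 0
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:  # likewise for the error term as p -> 1
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_choices - 1))
    return bits


def itr_bits_per_minute(n_choices, accuracy, selections_per_minute):
    """Information transfer rate: bits per selection times selection rate."""
    return bits_per_selection(n_choices, accuracy) * selections_per_minute
```

    Under this definition a system at chance accuracy (1/N) transfers zero bits, which is one reason throughput is easier to compare across systems than raw accuracy.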

  18. Closed-loop, pilot/vehicle analysis of the approach and landing task

    NASA Technical Reports Server (NTRS)

    Anderson, M. R.; Schmidt, D. K.

    1986-01-01

    In the case of approach and landing, it is universally accepted that the pilot uses more than one vehicle response, or output, to close his control loops. Therefore, to model this task, a multi-loop analysis technique is required. The difficulty in the analysis has been obtaining reasonable analytic estimates of the describing functions representing the pilot's loop compensation. Once these pilot describing functions are obtained, appropriate performance and workload metrics must then be developed for the landing task. The optimal control approach provides a powerful technique for obtaining the necessary describing functions, once the appropriate task objective is defined in terms of a quadratic objective function. An approach is presented through the use of a simple, reasonable objective function and model-based metrics to evaluate loop performance and pilot workload. The results of an analysis of the LAHOS (Landing and Approach of Higher Order Systems) study performed by R.E. Smith are also presented.

  19. Construct validity of individual and summary performance metrics associated with a computer-based laparoscopic simulator.

    PubMed

    Rivard, Justin D; Vergis, Ashley S; Unger, Bertram J; Hardy, Krista M; Andrew, Chris G; Gillman, Lawrence M; Park, Jason

    2014-06-01

    Computer-based surgical simulators capture a multitude of metrics based on different aspects of performance, such as speed, accuracy, and movement efficiency. However, without rigorous assessment, it may be unclear whether all, some, or none of these metrics actually reflect technical skill, which can compromise educational efforts on these simulators. We assessed the construct validity of individual performance metrics on the LapVR simulator (Immersion Medical, San Jose, CA, USA) and used these data to create task-specific summary metrics. Medical students with no prior laparoscopic experience (novices, N = 12), junior surgical residents with some laparoscopic experience (intermediates, N = 12), and experienced surgeons (experts, N = 11) all completed three repetitions of four LapVR simulator tasks. The tasks included three basic skills (peg transfer, cutting, clipping) and one procedural skill (adhesiolysis). We selected 36 individual metrics on the four tasks that assessed six different aspects of performance, including speed, motion path length, respect for tissue, accuracy, task-specific errors, and successful task completion. Four of seven individual metrics assessed for peg transfer, six of ten metrics for cutting, four of nine metrics for clipping, and three of ten metrics for adhesiolysis discriminated between experience levels. Time and motion path length were significant on all four tasks. We used the validated individual metrics to create summary equations for each task, which successfully distinguished between the different experience levels. Educators should maintain some skepticism when reviewing the plethora of metrics captured by computer-based simulators, as some but not all are valid. We showed the construct validity of a limited number of individual metrics and developed summary metrics for the LapVR. The summary metrics provide a succinct way of assessing skill with a single metric for each task, but require further validation.

  20. Performance Evaluation of EnKF-based Hydrogeological Site Characterization using Color Coherent Vectors

    NASA Astrophysics Data System (ADS)

    Moslehi, M.; de Barros, F.

    2017-12-01

    Complexity of hydrogeological systems arises from the multi-scale heterogeneity and insufficient measurements of their underlying parameters such as hydraulic conductivity and porosity. An inadequate characterization of hydrogeological properties can significantly decrease the trustworthiness of numerical models that predict groundwater flow and solute transport. Therefore, a variety of data assimilation methods have been proposed in order to estimate hydrogeological parameters from spatially scarce data by incorporating the governing physical models. In this work, we propose a novel framework for evaluating the performance of these estimation methods. We focus on the Ensemble Kalman Filter (EnKF) approach, which is a widely used data assimilation technique. It reconciles multiple sources of measurements to sequentially estimate model parameters such as the hydraulic conductivity. Several methods have been used in the literature to quantify the accuracy of the estimations obtained by EnKF, including Rank Histograms, RMSE and Ensemble Spread. However, these commonly used methods do not regard the spatial information and variability of geological formations. This can cause hydraulic conductivity fields with very different spatial structures to have similar histograms or RMSE. We propose a vision-based approach that can quantify the accuracy of estimations by considering the spatial structure embedded in the estimated fields. Our new approach consists of adapting a new metric, Color Coherent Vectors (CCV), to evaluate the accuracy of estimated fields achieved by EnKF. CCV is a histogram-based technique for comparing images that incorporates spatial information. We represent estimated fields as digital three-channel images and use CCV to compare and quantify the accuracy of estimations. The sensitivity of CCV to spatial information makes it a suitable metric for assessing the performance of spatial data assimilation techniques. 
Under various factors of data assimilation methods such as number, layout, and type of measurements, we compare the performance of CCV with other metrics such as RMSE. By simulating hydrogeological processes using estimated and true fields, we observe that CCV outperforms other existing evaluation metrics.
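
    The idea of a histogram that also records spatial coherence can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the field has already been quantized to integer bucket indices (rather than a three-channel image), uses 4-connectivity, and takes the coherence threshold `tau` as a free parameter; all names are hypothetical.

```python
from collections import deque


def coherence_vector(grid, n_buckets, tau):
    """For each bucket, count pixels in connected regions of size >= tau
    (coherent) and pixels in smaller regions (incoherent)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    ccv = [[0, 0] for _ in range(n_buckets)]  # [coherent, incoherent]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            bucket = grid[r][c]
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == bucket):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ccv[bucket][0 if size >= tau else 1] += size
    return ccv


def ccv_distance(ccv_a, ccv_b):
    """Sum of absolute differences of coherent and incoherent counts."""
    return sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
               for a, b in zip(ccv_a, ccv_b))
```

    Two fields with identical bucket histograms, such as a field split into two solid halves and a checkerboard, get a distance of zero from a plain histogram comparison but a large distance here, which is the property the abstract relies on.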

  1. Performance metrics for the evaluation of hyperspectral chemical identification systems

    NASA Astrophysics Data System (ADS)

    Truslow, Eric; Golowich, Steven; Manolakis, Dimitris; Ingle, Vinay

    2016-02-01

    Remote sensing of chemical vapor plumes is a difficult but important task for many military and civilian applications. Hyperspectral sensors operating in the long-wave infrared regime have well-demonstrated detection capabilities. However, the identification of a plume's chemical constituents, based on a chemical library, is a multiple hypothesis testing problem which standard detection metrics do not fully describe. We propose using an additional performance metric for identification based on the so-called Dice index. Our approach partitions and weights a confusion matrix to develop both the standard detection metrics and identification metric. Using the proposed metrics, we demonstrate that the intuitive system design of a detector bank followed by an identifier is indeed justified when incorporating performance information beyond the standard detection metrics.
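
    For orientation, the basic Dice index between the set of chemicals an identifier reports and the set actually present can be written in a few lines of Python. The paper's partitioned and weighted confusion-matrix version is not reproduced here; the function name and the convention for two empty sets are assumptions.

```python
def dice_index(identified, present):
    """Dice index 2|A ∩ B| / (|A| + |B|) between reported and true
    chemical sets: 1.0 for a perfect match, 0.0 for no overlap."""
    identified, present = set(identified), set(present)
    if not identified and not present:
        return 1.0  # assumed convention: nothing present, nothing reported
    return 2 * len(identified & present) / (len(identified) + len(present))
```

    Unlike a detection rate, this score is penalized both for missed constituents and for spurious ones, which suits a multiple-hypothesis identification task.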

  2. Best Practices Handbook: Traffic Engineering in Range Networks

    DTIC Science & Technology

    2016-03-01

    units of measurement. Measurement Methodology - A repeatable measurement technique used to derive one or more metrics of interest. Network...Performance measures - Metrics that provide quantitative or qualitative measures of the performance of systems or subsystems of interest. Performance Metric

  3. Physician-Pharmacist collaboration in a pay for performance healthcare environment.

    PubMed

    Farley, T M; Izakovic, M

    2015-01-01

    Healthcare is becoming more complex and costly in both European (Slovak) and American models. Healthcare in the United States (U.S.) is undergoing a particularly dramatic change. Physician and hospital reimbursement are becoming less procedure focused and increasingly outcome focused. Efforts at Mercy Hospital have shown promise in terms of collaborative team-based care improving performance on glucose control outcome metrics linked to reimbursement. Our performance on the Centers for Medicare and Medicaid Services (CMS) post-operative glucose control metric for cardiac surgery patients increased from a 63.6% pass rate to a 95.1% pass rate after implementing interventions involving physician-pharmacist team-based care. Having a multidisciplinary team that is able to adapt quickly to changing expectations in the healthcare environment has aided our institution. As healthcare becomes increasingly saturated with technology, data and quality metrics, collaborative efforts resulting in increased quality and physician efficiency are desirable. Multidisciplinary collaboration (including physician-pharmacist collaboration) appears to be a viable route to improved performance in an outcome-based healthcare system (Fig. 2, Ref. 12).

  4. Effects of the bipartite structure of a network on performance of recommenders

    NASA Astrophysics Data System (ADS)

    Wang, Qing-Xian; Li, Jian; Luo, Xin; Xu, Jian-Jun; Shang, Ming-Sheng

    2018-02-01

    Recommender systems aim to predict people's preferences for online items by analyzing their historical behaviors. A recommender can be modeled as a high-dimensional and sparse bipartite network, where the key issue is to understand the relation between the network structure and a recommender's performance. To address this issue, we choose three network characteristics, clustering coefficient, network density and user-item ratio, as the targets of analysis. For the clustering coefficient, we adopt the degree-preserving rewiring algorithm to obtain a series of bipartite networks with varying clustering coefficients, while the degrees of users and items remain unchanged. Furthermore, five state-of-the-art recommenders are applied to two real datasets. The performance of the recommenders is measured by both numerical and physical metrics. The results show that a recommender's performance is positively related to the clustering coefficient of a bipartite network. Meanwhile, a higher density of a bipartite network can provide more accurate but less diverse or novel recommendations. Furthermore, the user-item ratio is positively correlated with the accuracy metrics but negatively correlated with the diversity and novelty metrics.

  5. Cognitive skills assessment during robot-assisted surgery: separating the wheat from the chaff.

    PubMed

    Guru, Khurshid A; Esfahani, Ehsan T; Raza, Syed J; Bhat, Rohit; Wang, Katy; Hammond, Yana; Wilding, Gregory; Peabody, James O; Chowriappa, Ashirwad J

    2015-01-01

    To investigate the utility of cognitive assessment during robot-assisted surgery (RAS) to define skills in terms of cognitive engagement, mental workload, and mental state; while objectively differentiating between novice and expert surgeons. In all, 10 surgeons with varying operative experience were assigned to beginner (BG), combined competent and proficient (CPG), and expert (EG) groups based on the Dreyfus model. The participants performed tasks for basic, intermediate and advanced skills on the da Vinci Surgical System. Participant performance was assessed using both tool-based and cognitive metrics. Tool-based metrics showed significant differences between the BG vs CPG and the BG vs EG, in basic skills. While performing intermediate skills, there were significant differences only on the instrument-to-instrument collisions between the BG vs CPG (2.0 vs 0.2, P = 0.028), and the BG vs EG (2.0 vs 0.1, P = 0.018). There were no significant differences between the CPG and EG for both basic and intermediate skills. However, using cognitive metrics, there were significant differences between all groups for the basic and intermediate skills. In advanced skills, there were no significant differences between the CPG and the EG except time (1116 vs 599.6 s), using tool-based metrics. However, cognitive metrics revealed significant differences between both groups. Cognitive assessment of surgeons may aid in defining levels of expertise performing complex surgical tasks once competence is achieved. Cognitive assessment may be used as an adjunct to the traditional methods for skill assessment during RAS. © 2014 The Authors. BJU International © 2014 BJU International.

  6. Identification of robust statistical downscaling methods based on a comprehensive suite of performance metrics for South Korea

    NASA Astrophysics Data System (ADS)

    Eum, H. I.; Cannon, A. J.

    2015-12-01

    Climate models are a key tool for investigating the impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch of spatial resolution between GCMs and regional applications, in particular for a region characterized by complex terrain such as the Korean peninsula. Therefore, a downscaling procedure is essential to assess regional impacts of climate change. Numerous statistical downscaling methods have been used, mainly due to their computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. Using a split-sampling scheme, all methods are calibrated with observational station data for 19 years from 1973 to 1991 and are tested for the recent 19 years from 1992 to 2010. To assess the skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure the ability to reproduce temporal correlation, distribution, spatial correlation, and extreme events. In addition, we employ the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR and all methods lead to large improvements in representing all performance metrics. According to the seasonal performance metrics evaluated, when TOPSIS is applied, MACA is identified as the most reliable and robust method for all variables and seasons. Note that this result is derived from CFSR output, which is recognized as near-perfect climate data in climate studies. 
Therefore, the ranking of this study may be changed when various GCMs are downscaled and evaluated. Nevertheless, it may be informative for end-users (i.e. modelers or water resources managers) to understand and select more suitable downscaling methods corresponding to priorities on regional applications.
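
    The TOPSIS ranking step can be sketched in Python as below. This follows the standard textbook procedure (vector normalization, weighting, distance to ideal and anti-ideal points); the weights, the benefit/cost flags, and the function name are assumptions for illustration, and the study's actual metric suite and weighting are not given in the abstract.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each alternative (row of `matrix`).

    matrix[i][j]: score of alternative i on criterion j.
    weights[j]:   importance of criterion j.
    benefit[j]:   True if larger is better, False if smaller is better.
    Assumes no criterion column is all zeros and rows are not all identical.
    """
    n_crit = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
              for row in matrix]
    columns = list(zip(*scaled))
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in scaled:
        d_best = math.dist(row, best)    # distance to the ideal solution
        d_worst = math.dist(row, worst)  # distance to the anti-ideal solution
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

    With rows standing for downscaling methods and columns for metrics such as a correlation (benefit) and an error (cost), the method with the highest score is the one closest to the ideal on all criteria at once.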

  7. Simulation of gaseous pollutant dispersion around an isolated building using the k-ω SST (shear stress transport) turbulence model.

    PubMed

    Yu, Hesheng; Thé, Jesse

    2017-05-01

    The dispersion of gaseous pollutant around buildings is complex due to complex turbulence features such as flow detachment and zones of high shear. Computational fluid dynamics (CFD) models are one of the most promising tools to describe the pollutant distribution in the near field of buildings. Reynolds-averaged Navier-Stokes (RANS) models are the most commonly used CFD techniques to address turbulence transport of the pollutant. This research work studies the use of [Formula: see text] closure model for the gas dispersion around a building by fully resolving the viscous sublayer for the first time. The performance of standard [Formula: see text] model is also included for comparison, along with results of an extensively validated Gaussian dispersion model, the U.S. Environmental Protection Agency (EPA) AERMOD (American Meteorological Society/U.S. Environmental Protection Agency Regulatory Model). This study's CFD models apply the standard [Formula: see text] and the [Formula: see text] turbulence models to obtain wind flow field. A passive concentration transport equation is then calculated based on the resolved flow field to simulate the distribution of pollutant concentrations. The resultant simulation of both wind flow and concentration fields are validated rigorously by extensive data using multiple validation metrics. The wind flow field can be acceptably modeled by the [Formula: see text] model. However, the [Formula: see text] model fails to simulate the gas dispersion. The [Formula: see text] model outperforms [Formula: see text] in both flow and dispersion simulations, with higher hit rates for dimensionless velocity components and higher "factor of 2" of observations (FAC2) for normalized concentration. All these validation metrics of [Formula: see text] model pass the quality assurance criteria recommended by The Association of German Engineers (Verein Deutscher Ingenieure, VDI) guideline. 
Furthermore, these metrics are better than or the same as those in the literature. Comparison between the performances of [Formula: see text] and AERMOD shows that the CFD simulation is superior to Gaussian-type model for pollutant dispersion in the near wake of obstacles. AERMOD can perform as a screening tool for near-field gas dispersion due to its expeditious calculation and the ability to handle complicated cases. The utilization of [Formula: see text] to simulate gaseous pollutant dispersion around an isolated building is appropriate and is expected to be suitable for complex urban environment. Multiple validation metrics of [Formula: see text] turbulence model in CFD quantitatively indicated that this turbulence model was appropriate for the simulation of gas dispersion around buildings. CFD is, therefore, an attractive alternative to wind tunnel for modeling gas dispersion in urban environment due to its excellent performance, and lower cost.
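
    The two validation metrics named in the abstract, hit rate and FAC2, can be sketched in Python as follows. The definitions are the commonly used ones; the tolerance values are left as parameters because the abstract does not state the thresholds applied, and the function names are hypothetical.

```python
def fac2(observed, predicted):
    """Fraction of predictions within a factor of two of the observation,
    i.e. 0.5 <= predicted/observed <= 2. Non-positive observations are
    skipped (an assumption; conventions differ)."""
    pairs = [(o, p) for o, p in zip(observed, predicted) if o > 0]
    hits = sum(1 for o, p in pairs if 0.5 <= p / o <= 2.0)
    return hits / len(pairs)


def hit_rate(observed, predicted, rel_tol, abs_tol):
    """Fraction of points where the prediction is within either an
    absolute tolerance or a relative tolerance of the observation."""
    hits = 0
    for o, p in zip(observed, predicted):
        within_abs = abs(p - o) <= abs_tol
        within_rel = o != 0 and abs((p - o) / o) <= rel_tol
        if within_abs or within_rel:
            hits += 1
    return hits / len(observed)
```

    Both functions return a fraction between 0 and 1, so a model can be compared against a pass threshold from a guideline or against another model on the same measurement points.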

  8. Guiding Principles and Checklist for Population-Based Quality Metrics

    PubMed Central

    Brunelli, Steven M.; Maddux, Franklin W.; Parker, Thomas F.; Johnson, Douglas; Nissenson, Allen R.; Collins, Allan; Lacson, Eduardo

    2014-01-01

    The Centers for Medicare and Medicaid Services oversees the ESRD Quality Incentive Program to ensure that the highest quality of health care is provided by outpatient dialysis facilities that treat patients with ESRD. To that end, Centers for Medicare and Medicaid Services uses clinical performance measures to evaluate quality of care under a pay-for-performance or value-based purchasing model. Now more than ever, the ESRD therapeutic area serves as the vanguard of health care delivery. By translating medical evidence into clinical performance measures, the ESRD Prospective Payment System became the first disease-specific sector using the pay-for-performance model. A major challenge for the creation and implementation of clinical performance measures is the adjustments that are necessary to transition from taking care of individual patients to managing the care of patient populations. The National Quality Forum and others have developed effective and appropriate population-based clinical performance measures quality metrics that can be aggregated at the physician, hospital, dialysis facility, nursing home, or surgery center level. Clinical performance measures considered for endorsement by the National Quality Forum are evaluated using five key criteria: evidence, performance gap, and priority (impact); reliability; validity; feasibility; and usability and use. We have developed a checklist of special considerations for clinical performance measure development according to these National Quality Forum criteria. Although the checklist is focused on ESRD, it could also have broad application to chronic disease states, where health care delivery organizations seek to enhance quality, safety, and efficiency of their services. Clinical performance measures are likely to become the norm for tracking performance for health care insurers. 
Thus, it is critical that the methodologies used to develop such metrics serve the payer and the provider and most importantly, reflect what represents the best care to improve patient outcomes. PMID:24558050

  9. Phase Two Feasibility Study for Software Safety Requirements Analysis Using Model Checking

    NASA Technical Reports Server (NTRS)

    Turgeon, Gregory; Price, Petra

    2010-01-01

    A feasibility study was performed on a representative aerospace system to determine the following: (1) the benefits and limitations of using SCADE, a commercially available tool for model checking, in comparison to using a proprietary tool that was studied previously [1] and (2) metrics for performing the model checking and for assessing the findings. This study was performed independently of the development task by a group unfamiliar with the system, providing a fresh, external perspective free from development bias.

  10. Target detection cycle criteria when using the targeting task performance metric

    NASA Astrophysics Data System (ADS)

    Hixson, Jonathan G.; Jacobs, Eddie L.; Vollmerhausen, Richard H.

    2004-12-01

    The US Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate (NVESD) has developed a new target acquisition metric, the targeting task performance (TTP) metric, to better predict the performance of modern electro-optical imagers. The TTP metric replaces the Johnson criteria. One problem with transitioning to the new model is that the difficulty of searching in a terrain has traditionally been quantified by an "N50." The N50 is the number of Johnson criteria cycles needed for the observer to detect the target half the time, assuming that the observer is not time limited. In order to make use of this empirical database, a conversion must be found relating Johnson cycles for detection to TTP cycles for detection. This paper describes how that relationship is established. We have found that the relationship between Johnson and TTP is 1:2.7 for the recognition and identification tasks.

  11. Task-oriented lossy compression of magnetic resonance images

    NASA Astrophysics Data System (ADS)

    Anderson, Mark C.; Atkins, M. Stella; Vaisey, Jacques

    1996-04-01

    A new task-oriented image quality metric is used to quantify the effects of distortion introduced into magnetic resonance images by lossy compression. This metric measures the similarity between a radiologist's manual segmentation of pathological features in the original images and the automated segmentations performed on the original and compressed images. The images are compressed using a general wavelet-based lossy image compression technique, embedded zerotree coding, and segmented using a three-dimensional stochastic model-based tissue segmentation algorithm. The performance of the compression system is then enhanced by compressing different regions of the image volume at different bit rates, guided by prior knowledge about the location of important anatomical regions in the image. Application of the new system to magnetic resonance images is shown to produce compression results superior to the conventional methods, both subjectively and with respect to the segmentation similarity metric.

  12. Conceptual model of comprehensive research metrics for improved human health and environment.

    PubMed

    Engel-Cox, Jill A; Van Houten, Bennett; Phelps, Jerry; Rose, Shyanika W

    2008-05-01

    Federal, state, and private research agencies and organizations have faced increasing administrative and public demand for performance measurement. Historically, performance measurement predominantly consisted of near-term outputs measured through bibliometrics. The recent focus is on accountability for investment based on long-term outcomes. Developing measurable outcome-based metrics for research programs has been particularly challenging, because of difficulty linking research results to spatially and temporally distant outcomes. Our objective in this review is to build a logic model and associated metrics through which to measure the contribution of environmental health research programs to improvements in human health, the environment, and the economy. We used expert input and literature research on research impact assessment. With these sources, we developed a logic model that defines the components and linkages between extramural environmental health research grant programs and the outputs and outcomes related to health and social welfare, environmental quality and sustainability, economics, and quality of life. The logic model focuses on the environmental health research portfolio of the National Institute of Environmental Health Sciences (NIEHS) Division of Extramural Research and Training. The model delineates pathways for contributions by five types of institutional partners in the research process: NIEHS, other government (federal, state, and local) agencies, grantee institutions, business and industry, and community partners. The model is being applied to specific NIEHS research applications and the broader research community. We briefly discuss two examples and discuss the strengths and limits of outcome-based evaluation of research programs.

  13. The proposed 'concordance-statistic for benefit' provided a useful metric when modeling heterogeneous treatment effects.

    PubMed

    van Klaveren, David; Steyerberg, Ewout W; Serruys, Patrick W; Kent, David M

    2018-02-01

    Clinical prediction models that support treatment decisions are usually evaluated for their ability to predict the risk of an outcome rather than treatment benefit, i.e., the difference between outcome risk with vs. without therapy. We aimed to define performance metrics for a model's ability to predict treatment benefit. We analyzed data of the Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) trial and of three recombinant tissue plasminogen activator trials. We assessed alternative prediction models with a conventional risk concordance-statistic (c-statistic) and a novel c-statistic for benefit. We defined observed treatment benefit by the outcomes in pairs of patients matched on predicted benefit but discordant for treatment assignment. The 'c-for-benefit' represents the probability that from two randomly chosen matched patient pairs with unequal observed benefit, the pair with greater observed benefit also has a higher predicted benefit. Compared to a model without treatment interactions, the SYNTAX score II had improved ability to discriminate treatment benefit (c-for-benefit 0.590 vs. 0.552), despite having similar risk discrimination (c-statistic 0.725 vs. 0.719). However, for the simplified stroke-thrombolytic predictive instrument (TPI) vs. the original stroke-TPI, the c-for-benefit (0.584 vs. 0.578) was similar. The proposed methodology has the potential to measure a model's ability to predict treatment benefit not captured with conventional performance metrics. Copyright © 2017 Elsevier Inc. All rights reserved.
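
    Illustrative sketch (not taken from the article): the c-for-benefit definition in the abstract above could be computed roughly as follows. Several details are assumptions made here for illustration only: treated and control patients are matched by rank of predicted benefit, a pair's predicted benefit is taken as the mean of its two members, the outcome is coded 1 for an adverse event, and ties in predicted benefit count as one half. The function name `c_for_benefit` is hypothetical.

```python
from itertools import combinations


def c_for_benefit(pred_benefit, treated, outcome):
    """Concordance between predicted and observed benefit over matched pairs.

    pred_benefit: predicted benefit per patient (higher = more benefit)
    treated:      1 if the patient received the therapy, else 0
    outcome:      1 if the adverse outcome occurred, else 0
    """
    rows = list(zip(pred_benefit, treated, outcome))
    # Assumption: match treated and control patients by rank of predicted benefit.
    t = sorted((r for r in rows if r[1] == 1), key=lambda r: r[0])
    c = sorted((r for r in rows if r[1] == 0), key=lambda r: r[0])

    pairs = []
    for (pt, _, yt), (pc, _, yc) in zip(t, c):  # zip stops at the shorter arm
        predicted = (pt + pc) / 2   # assumption: pair-level predicted benefit
        observed = yc - yt          # +1 benefit, 0 no effect, -1 harm
        pairs.append((predicted, observed))

    concordant, comparable = 0.0, 0
    for (p1, b1), (p2, b2) in combinations(pairs, 2):
        if b1 == b2:
            continue                # only pairs with unequal observed benefit count
        comparable += 1
        if p1 == p2:
            concordant += 0.5       # assumption: ties in prediction score one half
        elif (p1 > p2) == (b1 > b2):
            concordant += 1.0
    return concordant / comparable if comparable else float("nan")
```

    A value of 0.5 would correspond to no discrimination of benefit and 1.0 to perfect discrimination, which is how the reported values (e.g. 0.590 vs. 0.552) can be read.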

  14. The Value of Metrics for Science Data Center Management

    NASA Astrophysics Data System (ADS)

    Moses, J.; Behnke, J.; Watts, T. H.; Lu, Y.

    2005-12-01

    The Earth Observing System Data and Information System (EOSDIS) has been collecting and analyzing records of science data archive, processing and product distribution for more than 10 years. The types of information collected and the analysis performed have matured and progressed to become an integral and necessary part of the system management and planning functions. Science data center managers are realizing the importance that metrics can play in influencing and validating their business model. New efforts focus on better understanding of users and their methods. Examples include tracking user web site interactions and conducting user surveys such as the government authorized American Customer Satisfaction Index survey. This paper discusses the metrics methodology, processes and applications that are growing in EOSDIS, the driving requirements and compelling events, and the future envisioned for metrics as an integral part of earth science data systems.

  15. There is No Free Lunch: Tradeoffs in the Utility of Learned Knowledge

    NASA Technical Reports Server (NTRS)

    Kedar, Smadar T.; McKusick, Kathleen B.

    1992-01-01

    With the recent introduction of learning in integrated systems, there is a need to measure the utility of learned knowledge for these more complex systems. A difficulty arises when there are multiple, possibly conflicting, utility metrics to be measured. In this paper, we present schemes which trade off conflicting utility metrics in order to achieve some global performance objectives. In particular, we present a case study of a multi-strategy machine learning system, mutual theory refinement, which refines world models for an integrated reactive system, the Entropy Reduction Engine. We provide experimental results on the utility of learned knowledge in two conflicting metrics - improved accuracy and degraded efficiency. We then demonstrate two ways to trade off these metrics. In each, some learned knowledge is either approximated or dynamically 'forgotten' so as to improve efficiency while degrading accuracy only slightly.

  16. Characterizing uncertainty when evaluating risk management metrics: risk assessment modeling of Listeria monocytogenes contamination in ready-to-eat deli meats.

    PubMed

    Gallagher, Daniel; Ebel, Eric D; Gallagher, Owen; Labarre, David; Williams, Michael S; Golden, Neal J; Pouillot, Régis; Dearfield, Kerry L; Kause, Janell

    2013-04-01

    This report illustrates how the uncertainty about food safety metrics may influence the selection of a performance objective (PO). To accomplish this goal, we developed a model concerning Listeria monocytogenes in ready-to-eat (RTE) deli meats. This application used a second order Monte Carlo model that simulates L. monocytogenes concentrations through a series of steps: the food-processing establishment, transport, retail, the consumer's home and consumption. The model accounted for growth inhibitor use, retail cross contamination, and applied an FAO/WHO dose response model for evaluating the probability of illness. An appropriate level of protection (ALOP) risk metric was selected as the average risk of illness per serving across all consumed servings-per-annum and the model was used to solve for the corresponding performance objective (PO) risk metric as the maximum allowable L. monocytogenes concentration (cfu/g) at the processing establishment where regulatory monitoring would occur. Given uncertainty about model inputs, an uncertainty distribution of the PO was estimated. Additionally, we considered how RTE deli meats contaminated at levels above the PO would be handled by the industry using three alternative approaches. Points on the PO distribution represent the probability that - if the industry complies with a particular PO - the resulting risk-per-serving is less than or equal to the target ALOP. For example, assuming (1) a target ALOP of -6.41 log10 risk of illness per serving, (2) industry concentrations above the PO that are re-distributed throughout the remaining concentration distribution and (3) no dose response uncertainty, establishment PO's of -4.98 and -4.39 log10 cfu/g would be required for 90% and 75% confidence that the target ALOP is met, respectively. 
The PO concentrations from this example scenario are more stringent than the current typical monitoring level of an absence in 25 g (i.e., -1.40 log10 cfu/g) or a stricter criterion of absence in 125 g (i.e., -2.1 log10 cfu/g). This example, and others, demonstrates that a PO for L. monocytogenes would be far below any current monitoring capabilities. Furthermore, this work highlights the demands placed on risk managers and risk assessors when applying uncertain risk models to the current risk metric framework. Copyright © 2013 Elsevier B.V. All rights reserved.
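
    Worked arithmetic (illustrative, not from the article): the monitoring levels quoted above are consistent with reading "absence in m grams" as a concentration below 1 cfu per m grams; that reading is an assumption here. Under it, the quoted log10 values and the gap to the 90%-confidence PO follow directly:

```latex
\begin{align*}
\log_{10}\!\left(\frac{1\ \text{cfu}}{25\ \text{g}}\right)  &= -\log_{10} 25  \approx -1.40 \\
\log_{10}\!\left(\frac{1\ \text{cfu}}{125\ \text{g}}\right) &= -\log_{10} 125 \approx -2.10 \\
-1.40 - (-4.98) &= 3.58\ \log_{10}\ \text{units}, \qquad 10^{3.58} \approx 3.8 \times 10^{3}
\end{align*}
```

    In other words, under this reading a PO of -4.98 log10 cfu/g sits roughly 3800-fold below the 25 g absence criterion.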

  17. Performance Metrics for Liquid Chromatography-Tandem Mass Spectrometry Systems in Proteomics Analyses*

    PubMed Central

    Rudnick, Paul A.; Clauser, Karl R.; Kilpatrick, Lisa E.; Tchekhovskoi, Dmitrii V.; Neta, Pedatsur; Blonder, Nikša; Billheimer, Dean D.; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Ham, Amy-Joan L.; Jaffe, Jacob D.; Kinsinger, Christopher R.; Mesri, Mehdi; Neubert, Thomas A.; Schilling, Birgit; Tabb, David L.; Tegeler, Tony J.; Vega-Montoto, Lorenzo; Variyath, Asokan Mulayath; Wang, Mu; Wang, Pei; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Paulovich, Amanda G.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Tempst, Paul; Liebler, Daniel C.; Stein, Stephen E.

    2010-01-01

    A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlaboratory studies conducted under a common standard operating procedure identified outlier data and provided clues to specific causes. Moreover, interlaboratory variation reflected by the metrics indicates which system components vary the most between laboratories. Application of these metrics enables rational, quantitative quality assessment for proteomics and other LC-MS/MS analytical applications. PMID:19837981

  18. Which metric of ambient ozone to predict daily mortality?

    NASA Astrophysics Data System (ADS)

    Moshammer, Hanns; Hutter, Hans-Peter; Kundi, Michael

    2013-02-01

    It is well known that ozone concentration is associated with daily cause-specific mortality. But which ozone metric is the best predictor of the daily variability in mortality? We performed a time series analysis on daily deaths (all causes, respiratory and cardiovascular causes as well as death in elderly 65+) in Vienna for the years 1991-2009. We controlled for seasonal and long term trend, day of the week, temperature and humidity using the same basic model for all pollutant metrics. We found model fit was best for same day variability of ozone concentration (calculated as the difference between daily hourly maximum and minimum) and hourly maximum. Of these the variability displayed a more linear dose-response function. Maximum 8 h moving average and daily mean value performed less well. Nitrogen dioxide (daily mean) in comparison performed better when previous day values were assessed. Same day ozone and previous day nitrogen dioxide effect estimates did not confound each other. Variability in daily ozone levels or peak ozone levels seems to be a better proxy of a complex reactive secondary pollutant mixture than daily average ozone levels in the Middle European setting. If this finding is confirmed this would have implications for the setting of legally binding limit values.
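
    Illustrative sketch (not from the article): the four daily ozone metrics compared above can be derived from one day of hourly concentrations as shown below. The function name `daily_ozone_metrics` is hypothetical, and restricting the 8 h windows to a single calendar day is a simplifying assumption; regulatory definitions may let windows span midnight.

```python
def daily_ozone_metrics(hourly):
    """Compute candidate daily ozone metrics from 24 hourly concentrations."""
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    # All 8-hour moving averages that fit inside the day (assumption).
    windows = [sum(hourly[i:i + 8]) / 8 for i in range(24 - 8 + 1)]
    return {
        "hourly_max": max(hourly),
        # Same-day variability: daily hourly maximum minus minimum.
        "variability": max(hourly) - min(hourly),
        "daily_mean": sum(hourly) / 24,
        "max_8h_mean": max(windows),
    }
```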

  19. Object detection in natural backgrounds predicted by discrimination performance and models

    NASA Technical Reports Server (NTRS)

    Rohaly, A. M.; Ahumada, A. J. Jr; Watson, A. B.

    1997-01-01

    Many models of visual performance predict image discriminability, the visibility of the difference between a pair of images. We compared the ability of three image discrimination models to predict the detectability of objects embedded in natural backgrounds. The three models were: a multiple channel Cortex transform model with within-channel masking; a single channel contrast sensitivity filter model; and a digital image difference metric. Each model used a Minkowski distance metric (generalized vector magnitude) to summate absolute differences between the background and object plus background images. For each model, this summation was implemented with three different exponents: 2, 4 and infinity. In addition, each combination of model and summation exponent was implemented with and without a simple contrast gain factor. The model outputs were compared to measures of object detectability obtained from 19 observers. Among the models without the contrast gain factor, the multiple channel model with a summation exponent of 4 performed best, predicting the pattern of observer d's with an RMS error of 2.3 dB. The contrast gain factor improved the predictions of all three models for all three exponents. With the factor, the best exponent was 4 for all three models, and their prediction errors were near 1 dB. These results demonstrate that image discrimination models can predict the relative detectability of objects in natural scenes.
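
    Illustrative sketch (not from the article): Minkowski summation of absolute differences, with the three exponents named in the abstract above (2, 4 and infinity), can be written as follows. The function name `minkowski_pool` and the flat list of differences are assumptions for illustration; the models in the study apply such pooling to filtered image representations that are not reproduced here.

```python
import math


def minkowski_pool(diffs, beta):
    """Pool differences as (sum |d|^beta)^(1/beta); beta = inf gives the maximum."""
    magnitudes = [abs(d) for d in diffs]
    if math.isinf(beta):
        return max(magnitudes)
    return sum(m ** beta for m in magnitudes) ** (1.0 / beta)
```

    Larger exponents weight the largest local differences more heavily; with an infinite exponent only the single largest difference matters.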

  20. Performance regression manager for large scale systems

    DOEpatents

    Faraj, Daniel A.

    2017-10-17

    System and computer program product to perform an operation comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.
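
    Illustrative sketch (not the patented implementation): the comparison step described above, in which a metric value from one run's formatted output is compared with the value from a second run, might look like this. The "predefined format" is assumed here to be a JSON object mapping metric names to numbers, and `compare_runs` is a hypothetical name.

```python
import json


def compare_runs(first_json, second_json):
    """Compare metrics shared by two runs; inputs are JSON text (assumed format)."""
    first = json.loads(first_json)
    second = json.loads(second_json)
    report = {}
    for name in sorted(first.keys() & second.keys()):
        report[name] = {
            "first": first[name],
            "second": second[name],
            "delta": first[name] - second[name],  # positive: first run is larger
        }
    return report
```

    For display, the returned report could be printed one metric per line; whether a positive delta counts as a regression depends on the metric and is left to the caller.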

  1. EOID System Model Validation, Metrics, and Synthetic Clutter Generation

    DTIC Science & Technology

    2003-09-30

    Our long-term goal is to accurately predict the capability of the current generation of laser-based underwater imaging sensors to perform Electro ... Optic Identification (EOID) against relevant targets in a variety of realistic environmental conditions. The models will predict the impact of

  2. Zone calculation as a tool for assessing performance outcome in laparoscopic suturing.

    PubMed

    Buckley, Christina E; Kavanagh, Dara O; Nugent, Emmeline; Ryan, Donncha; Traynor, Oscar J; Neary, Paul C

    2015-06-01

    Simulator performance is measured by metrics, which are valued as an objective way of assessing trainees. Certain procedures such as laparoscopic suturing, however, may not be suitable for assessment under traditionally formulated metrics. Our aim was to assess if our new metric is a valid method of assessing laparoscopic suturing. A software program was developed in order to create a new metric, which would calculate the percentage of time spent operating within pre-defined areas called "zones." Twenty-five candidates (medical students N = 10, surgical residents N = 10, and laparoscopic experts N = 5) performed the laparoscopic suturing task on the ProMIS III® simulator. New metrics of "in-zone" and "out-zone" scores as well as traditional metrics of time, path length, and smoothness were generated. Performance was also assessed by two blinded observers using the OSATS and FLS rating scales. This novel metric was evaluated by comparing it to both traditional metrics and subjective scores. There was a significant difference in the average in-zone and out-zone scores between all three experience groups (p < 0.05). The new zone metrics scores correlated significantly with the subjective-blinded observer scores of OSATS and FLS (p = 0.0001). The new zone metric scores also correlated significantly with the traditional metrics of path length, time, and smoothness (p < 0.05). The new metric is a valid tool for assessing laparoscopic suturing objectively. This could be incorporated into a competency-based curriculum to monitor resident progression in the simulated setting.
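
    Illustrative sketch (not the authors' software): a "percentage of time in zone" score of the kind described above could be computed as below. The assumptions are ours: instrument-tip positions are sampled at a uniform rate, so the fraction of samples equals the fraction of time, and each zone is an axis-aligned box given by its lower and upper corners. The name `zone_scores` is hypothetical.

```python
def zone_scores(samples, zones):
    """Return (in_zone_pct, out_zone_pct) for uniformly sampled positions.

    samples: list of (x, y, z) instrument-tip positions
    zones:   list of (lower_corner, upper_corner) boxes (assumed zone shape)
    """
    if not samples:
        raise ValueError("no samples")

    def inside(point, box):
        lower, upper = box
        return all(lo <= c <= hi for c, lo, hi in zip(point, lower, upper))

    n_in = sum(1 for p in samples if any(inside(p, z) for z in zones))
    in_pct = 100.0 * n_in / len(samples)
    return in_pct, 100.0 - in_pct
```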

  3. Can spatial statistical river temperature models be transferred between catchments?

    NASA Astrophysics Data System (ADS)

    Jackson, Faye L.; Fryer, Robert J.; Hannah, David M.; Malcolm, Iain A.

    2017-09-01

    There has been increasing use of spatial statistical models to understand and predict river temperature (Tw) from landscape covariates. However, it is not financially or logistically feasible to monitor all rivers and the transferability of such models has not been explored. This paper uses Tw data from four river catchments collected in August 2015 to assess how well spatial regression models predict the maximum 7-day rolling mean of daily maximum Tw (Twmax) within and between catchments. Models were fitted for each catchment separately using (1) landscape covariates only (LS models) and (2) landscape covariates and an air temperature (Ta) metric (LS_Ta models). All the LS models included upstream catchment area and three included a river network smoother (RNS) that accounted for unexplained spatial structure. The LS models transferred reasonably to other catchments, at least when predicting relative levels of Twmax. However, the predictions were biased when mean Twmax differed between catchments. The RNS was needed to characterise and predict finer-scale spatially correlated variation. Because the RNS was unique to each catchment and thus non-transferable, predictions were better within catchments than between catchments. A single model fitted to all catchments found no interactions between the landscape covariates and catchment, suggesting that the landscape relationships were transferable. The LS_Ta models transferred less well, with particularly poor performance when the relationship with the Ta metric was physically implausible or required extrapolation outside the range of the data. A single model fitted to all catchments found catchment-specific relationships between Twmax and the Ta metric, indicating that the Ta metric was not transferable. 
These findings improve our understanding of the transferability of spatial statistical river temperature models and provide a foundation for developing new approaches for predicting Tw at unmonitored locations across multiple catchments and larger spatial scales.
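
    Illustrative sketch (not from the article): the response variable Twmax, the maximum 7-day rolling mean of daily maximum river temperature, can be computed from sub-daily readings as follows. The function name `twmax` and the input layout (one list of readings per consecutive day) are assumptions for illustration.

```python
def twmax(daily_readings, window=7):
    """Maximum rolling mean (over `window` days) of daily maximum temperature.

    daily_readings: list of per-day lists of temperature readings,
                    ordered by consecutive day (assumed layout).
    """
    daily_max = [max(day) for day in daily_readings]
    if len(daily_max) < window:
        raise ValueError("need at least `window` days of data")
    rolling = [
        sum(daily_max[i:i + window]) / window
        for i in range(len(daily_max) - window + 1)
    ]
    return max(rolling)
```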

  4. Integrating fisheries approaches and household utility models for improved resource management.

    PubMed

    Milner-Gulland, E J

    2011-01-25

    Natural resource management is littered with cases of overexploitation and ineffectual management, leading to loss of both biodiversity and human welfare. Disciplinary boundaries stifle the search for solutions to these issues. Here, I combine the approach of management strategy evaluation, widely applied in fisheries, with household utility models from the conservation and development literature, to produce an integrated framework for evaluating the effectiveness of competing management strategies for harvested resources against a range of performance metrics. I demonstrate the strengths of this approach with a simple model, and use it to examine the effect of manager ignorance of household decisions on resource management effectiveness, and an allocation tradeoff between monitoring resource stocks to reduce observation uncertainty and monitoring users to improve compliance. I show that this integrated framework enables management assessments to consider household utility as a direct metric for system performance, and that although utility and resource stock conservation metrics are well aligned, harvest yield is a poor proxy for both, because it is a product of household allocation decisions between alternate livelihood options, rather than an end in itself. This approach has potential far beyond single-species harvesting in situations where managers are in full control; I show that the integrated approach enables a range of management intervention options to be evaluated within the same framework.

  5. Definition and classification of evaluation units for tertiary structure prediction in CASP12 facilitated through semi-automated metrics.

    PubMed

    Abriata, Luciano A; Kinch, Lisa N; Tamò, Giorgio E; Monastyrskyy, Bohdan; Kryshtafovych, Andriy; Dal Peraro, Matteo

    2018-03-01

    For assessment purposes, CASP targets are split into evaluation units. We herein present the official definition of CASP12 evaluation units (EUs) and their classification into difficulty categories. Each target can be evaluated as one EU (the whole target) or/and several EUs (separate structural domains or groups of structural domains). The specific scenario for a target split is determined by the domain organization of available templates, the difference in server performance on separate domains versus combination of the domains, and visual inspection. In the end, 71 targets were split into 96 EUs. Classification of the EUs into difficulty categories was done semi-automatically with the assistance of metrics provided by the Prediction Center. These metrics account for sequence and structural similarities of the EUs to potential structural templates from the Protein Data Bank, and for the baseline performance of automated server predictions. The metrics readily separate the 96 EUs into 38 EUs that should be straightforward for template-based modeling (TBM) and 39 that are expected to be hard for homology modeling and are thus left for free modeling (FM). The remaining 19 borderline evaluation units were dubbed FM/TBM, and were inspected case by case. The article also overviews structural and evolutionary features of selected targets relevant to our accompanying article presenting the assessment of FM and FM/TBM predictions, and overviews structural features of the hardest evaluation units from the FM category. We finally suggest improvements for the EU definition and classification procedures. © 2017 Wiley Periodicals, Inc.

  6. Normalized distance aggregation of discriminative features for person reidentification

    NASA Astrophysics Data System (ADS)

    Hou, Li; Han, Kang; Wan, Wanggen; Hwang, Jenq-Neng; Yao, Haiyan

    2018-03-01

    We propose an effective person reidentification method based on normalized distance aggregation of discriminative features. Our framework is built on the integration of three high-performance discriminative feature extraction models, including local maximal occurrence (LOMO), feature fusion net (FFN), and a concatenation of LOMO and FFN called LOMO-FFN, through two fast and discriminant metric learning models, i.e., cross-view quadratic discriminant analysis (XQDA) and large-scale similarity learning (LSSL). More specifically, we first represent all the cross-view person images using LOMO, FFN, and LOMO-FFN, respectively, and then apply each extracted feature representation to train XQDA and LSSL, respectively, to obtain the optimized individual cross-view distance metric. Finally, the cross-view person matching is computed as the sum of the optimized individual cross-view distance metric through the min-max normalization. Experimental results have shown the effectiveness of the proposed algorithm on three challenging datasets (VIPeR, PRID450s, and CUHK01).
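
    Illustrative sketch (not the authors' code): the final aggregation step described above, summing several cross-view distance metrics after min-max normalization, might be written as below. The feature extraction (LOMO, FFN) and metric learning (XQDA, LSSL) stages are not reproduced; the inputs are assumed to be precomputed distances from one probe image to each gallery image, one list per feature/metric combination. The function names are hypothetical.

```python
def min_max(values):
    """Rescale values linearly to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def aggregate_distances(distance_lists):
    """Sum min-max normalized distances across metrics, per gallery image."""
    normalized = [min_max(d) for d in distance_lists]
    return [sum(column) for column in zip(*normalized)]
```

    The gallery image with the smallest aggregated distance would be taken as the best match. Normalizing first keeps a metric with a large numeric range from dominating the sum.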

  7. Modelling cephalopod-inspired pulsed-jet locomotion for underwater soft robots.

    PubMed

    Renda, F; Giorgio-Serchi, F; Boyer, F; Laschi, C

    2015-09-28

    Cephalopods (i.e., octopuses and squids) are being looked upon as a source of inspiration for the development of unmanned underwater vehicles. One kind of cephalopod-inspired soft-bodied vehicle developed by the authors entails a hollow, elastic shell capable of performing a routine of recursive ingestion and expulsion of discrete slugs of fluid which enables the vehicle to propel itself in water. The vehicle performance was found to depend largely on the elastic response of the shell to the actuation cycle, thus motivating the development of a coupled propulsion-elastodynamics model of such vehicles. The model is developed and validated against a set of experimental results obtained with the existing cephalopod-inspired prototypes. A metric of the efficiency of the propulsion routine which accounts for the elastic energy contribution during the ingestion/expulsion phases of the actuation is formulated. Demonstrations of the use of this model to estimate the efficiency of the propulsion routine for various pulsation frequencies and for different morphologies of the vehicles are provided. This metric of efficiency, employed in association with the present elastodynamics model, provides a useful tool for performing a priori energetic analysis which encompasses both the design specifications and the actuation pattern of this new kind of underwater vehicle.

  8. The Mediating Relation between Symbolic and Nonsymbolic Foundations of Math Competence

    PubMed Central

    Price, Gavin R.; Fuchs, Lynn S.

    2016-01-01

    This study investigated the relation between symbolic and nonsymbolic magnitude processing abilities with 2 standardized measures of math competence (WRAT Arithmetic and KeyMath Numeration) in 150 3rd-grade children (mean age 9.01 years). Participants compared sets of dots and pairs of Arabic digits with numerosities 1–9 for relative numerical magnitude. In line with previous studies, performance on both symbolic and nonsymbolic magnitude processing was related to math ability. Performance metrics combining reaction and accuracy, as well as Weber fractions, were entered into mediation models with standardized math test scores. Results showed that symbolic magnitude processing ability fully mediates the relation between nonsymbolic magnitude processing and math ability, regardless of the performance metric or standardized test. PMID:26859564

  9. The Mediating Relation between Symbolic and Nonsymbolic Foundations of Math Competence.

    PubMed

    Price, Gavin R; Fuchs, Lynn S

    2016-01-01

    This study investigated the relation between symbolic and nonsymbolic magnitude processing abilities with 2 standardized measures of math competence (WRAT Arithmetic and KeyMath Numeration) in 150 3rd-grade children (mean age 9.01 years). Participants compared sets of dots and pairs of Arabic digits with numerosities 1-9 for relative numerical magnitude. In line with previous studies, performance on both symbolic and nonsymbolic magnitude processing was related to math ability. Performance metrics combining reaction and accuracy, as well as Weber fractions, were entered into mediation models with standardized math test scores. Results showed that symbolic magnitude processing ability fully mediates the relation between nonsymbolic magnitude processing and math ability, regardless of the performance metric or standardized test.

  10. Towards Principled Experimental Study of Autonomous Mobile Robots

    NASA Technical Reports Server (NTRS)

    Gat, Erann

    1995-01-01

    We review the current state of research in autonomous mobile robots and conclude that there is an inadequate basis for predicting the reliability and behavior of robots operating in unengineered environments. We present a new approach to the study of autonomous mobile robot performance based on formal statistical analysis of independently reproducible experiments conducted on real robots. Simulators serve as models rather than experimental surrogates. We demonstrate three new results: 1) two commonly used performance metrics (time and distance) are not as well correlated as is often tacitly assumed; 2) the probability distributions of these performance metrics are exponential rather than normal; and 3) a modular, object-oriented simulation accurately predicts the behavior of the real robot in a statistically significant manner.

  11. Mining the Dynamics of Student Utility and Strategy Use during Vocabulary Learning

    ERIC Educational Resources Information Center

    Pavlik, Philip I., Jr.

    2013-01-01

    This paper describes the development of a dynamical systems model of motivation and metacognition during learning, which explains some of the practically and theoretically important relationships among three student engagement constructs and performance metrics during learning. In order to better calibrate and understand the model, the model was…

  12. On Applying the Prognostic Performance Metrics

    NASA Technical Reports Server (NTRS)

    Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai

    2009-01-01

    Prognostics performance evaluation has gained significant attention in the past few years. As prognostics technology matures and more sophisticated methods for prognostic uncertainty management are developed, a standardized methodology for performance evaluation becomes extremely important to guide improvement efforts in a constructive manner. This paper is in continuation of previous efforts where several new evaluation metrics tailored for prognostics were introduced and were shown to effectively evaluate various algorithms as compared to other conventional metrics. Specifically, this paper presents a detailed discussion on how these metrics should be interpreted and used. Several shortcomings identified, while applying these metrics to a variety of real applications, are also summarized along with discussions that attempt to alleviate these problems. Further, these metrics have been enhanced to include the capability of incorporating probability distribution information from prognostic algorithms as opposed to evaluation based on point estimates only. Several methods have been suggested and guidelines have been provided to help choose one method over another based on probability distribution characteristics. These approaches also offer a convenient and intuitive visualization of algorithm performance with respect to some of these new metrics like prognostic horizon and alpha-lambda performance, and also quantify the corresponding performance while incorporating the uncertainty information.

  13. Systematic comparison of the behaviors produced by computational models of epileptic neocortex.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Warlaumont, A. S.; Lee, H. C.; Benayoun, M.

    2010-12-01

    Two existing models of brain dynamics in epilepsy, one detailed (i.e., realistic) and one abstract (i.e., simplified), are compared in terms of behavioral range and match to in vitro mouse recordings. A new method is introduced for comparing across computational models that may have very different forms. First, high-level metrics were extracted from model and in vitro output time series. A principal components analysis was then performed over these metrics to obtain a reduced set of derived features. These features define a low-dimensional behavior space in which quantitative measures of behavioral range and degree of match to real data can be obtained. The detailed and abstract models and the mouse recordings overlapped considerably in behavior space. Both the range of behaviors and similarity to mouse data were similar between the detailed and abstract models. When no high-level metrics were used and principal components analysis was computed over raw time series, the models overlapped minimally with the mouse recordings. The method introduced here is suitable for comparing across different kinds of model data and across real brain recordings. It appears that, despite differences in form and computational expense, detailed and abstract models do not necessarily differ in their behaviors.
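
    The projection step described in this record can be sketched as follows. This is a minimal illustration, not the authors' code: the metric arrays are random placeholders, the function name behavior_space is hypothetical, and standardizing each metric before the PCA is an assumption that the abstract does not state.

```python
import numpy as np

def behavior_space(metrics, n_components=2):
    """Project rows of a (samples x metrics) array onto the leading principal components."""
    X = np.asarray(metrics, dtype=float)
    # Standardize each metric so that differing units do not dominate (an assumption).
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T

rng = np.random.default_rng(0)
model_metrics = rng.normal(size=(20, 5))      # placeholder metrics from model runs
recording_metrics = rng.normal(size=(15, 5))  # placeholder metrics from recordings

# Fit one shared space over both sources, then split the scores back out.
scores = behavior_space(np.vstack([model_metrics, recording_metrics]))
model_scores, recording_scores = scores[:20], scores[20:]
```

    Fitting a single PCA over the pooled data keeps model output and recordings in the same coordinates, so their overlap can be compared directly.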

  14. The effect of topography on pyroclastic flow mobility

    NASA Astrophysics Data System (ADS)

    Ogburn, S. E.; Calder, E. S.

    2010-12-01

    Pyroclastic flows are among the most destructive volcanic phenomena. Hazard mitigation depends upon accurate forecasting of possible flow paths, often using computational models. Two main metrics have been proposed to describe the mobility of pyroclastic flows. The Heim coefficient, height-dropped/run-out (H/L), exhibits an inverse relationship with flow volume. This coefficient corresponds to the coefficient of friction and informs computational models that use Coulomb friction laws. Another mobility measure states that with constant shear stress, planimetric area is proportional to the flow volume raised to the 2/3 power (A∝V^(2/3)). This relationship is incorporated in models using constant shear stress instead of constant friction, and used directly by some empirical models. Pyroclastic flows from Soufriere Hills Volcano, Montserrat; Unzen, Japan; Colima, Mexico; and Augustine, Alaska are well described by these metrics. However, flows in specific valleys exhibit differences in mobility. This study investigates the effect of topography on pyroclastic flow mobility, as measured by the above mentioned mobility metrics. Valley width, depth, and cross-sectional area all influence flow mobility. Investigating the appropriateness of these mobility measures, as well as the computational models they inform, indicates certain circumstances under which each model performs optimally. Knowing which conditions call for which models allows for better model selection or model weighting, and therefore, more realistic hazard predictions.
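
    The two mobility metrics named in this record reduce to simple ratios. The sketch below assumes SI units and uses hypothetical function names; the example values are invented for illustration and are not taken from the study.

```python
def heim_coefficient(height_dropped_m, runout_m):
    """Heim coefficient H/L: height dropped divided by run-out distance."""
    return height_dropped_m / runout_m

def area_volume_constant(area_m2, volume_m3):
    """Constant c in the relation A = c * V**(2/3)."""
    return area_m2 / volume_m3 ** (2.0 / 3.0)

# Invented example flow: 1000 m drop, 4000 m run-out, 400 m^2 area, 1000 m^3 volume.
h_over_l = heim_coefficient(1000.0, 4000.0)
c = area_volume_constant(400.0, 1000.0)
```

    A lower H/L indicates a more mobile flow, which is consistent with the inverse relationship to flow volume that the abstract reports.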

  15. Metrics, Dollars, and Systems Change: Learning from Washington State's Student Achievement Initiative to Design Effective Postsecondary Performance Funding Policies. A State Policy Brief

    ERIC Educational Resources Information Center

    Jenkins, Davis; Shulock, Nancy

    2013-01-01

    The Student Achievement Initiative (SAI), adopted by the Washington State Board for Community and Technical Colleges in 2007, is one of a growing number of performance funding programs that have been dubbed "performance funding 2.0." Unlike previous performance funding models, the SAI rewards colleges for students' intermediate…

  16. 75 FR 7581 - RTO/ISO Performance Metrics; Notice Requesting Comments on RTO/ISO Performance Metrics

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-22

    ... performance communicate about the benefits of RTOs and, where appropriate, (2) changes that need to be made to... of staff from all the jurisdictional ISOs/RTOs to develop a set of performance metrics that the ISOs/RTOs will use to report annually to the Commission. Commission staff and representatives from the ISOs...

  17. Performance regression manager for large scale systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faraj, Daniel A.

    Methods comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.

  18. Performance regression manager for large scale systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faraj, Daniel A.

    System and computer program product to perform an operation comprising generating, based on a first output generated by a first execution instance of a command, a first output file specifying a value of at least one performance metric, wherein the first output file is formatted according to a predefined format, comparing the value of the at least one performance metric in the first output file to a value of the performance metric in a second output file, the second output file having been generated based on a second output generated by a second execution instance of the command, and outputting for display an indication of a result of the comparison of the value of the at least one performance metric of the first output file to the value of the at least one performance metric of the second output file.
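
    The comparison described in the two records above can be illustrated with a short sketch. The records do not say what the predefined format is, so JSON is assumed here; the function name compare_runs, the metric names, and the 5% tolerance are all hypothetical.

```python
import json

def compare_runs(first_text, second_text, rel_tol=0.05):
    """Compare metrics shared by two runs; flag those whose relative change exceeds rel_tol."""
    first = json.loads(first_text)
    second = json.loads(second_text)
    report = {}
    for name in sorted(first.keys() & second.keys()):
        old, new = first[name], second[name]
        # Relative change is undefined when the baseline value is zero.
        change = (new - old) / abs(old) if old != 0 else None
        if change is None:
            status = "undefined"
        elif abs(change) > rel_tol:
            status = "changed"
        else:
            status = "unchanged"
        report[name] = (old, new, status)
    return report

# Hypothetical output files from two execution instances of the same command.
run_a = '{"latency_ms": 100.0, "bandwidth_gbps": 10.0}'
run_b = '{"latency_ms": 120.0, "bandwidth_gbps": 10.1}'
report = compare_runs(run_a, run_b)
for name, (old, new, status) in report.items():
    print(f"{name}: {old} -> {new} ({status})")
```

    The status is deliberately direction-neutral, because whether an increase is a regression depends on the metric.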

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sivak, David; Crooks, Gavin

    A fundamental problem in modern thermodynamics is how a molecular-scale machine performs useful work, while operating away from thermal equilibrium without excessive dissipation. To this end, we derive a friction tensor that induces a Riemannian manifold on the space of thermodynamic states. Within the linear-response regime, this metric structure controls the dissipation of finite-time transformations, and bestows optimal protocols with many useful properties. We discuss the connection to the existing thermodynamic length formalism, and demonstrate the utility of this metric by solving for optimal control parameter protocols in a simple nonequilibrium model.

  20. Operational modes, health, and status monitoring

    NASA Astrophysics Data System (ADS)

    Taljaard, Corrie

    2016-08-01

    System Engineers must fully understand the system, its support system and operational environment to optimise the design. Operations and Support Managers must also identify the correct metrics to measure the performance and to manage the operations and support organisation. Reliability Engineering and Support Analysis provide methods to design a Support System and to optimise the Availability of a complex system. Availability modelling and Failure Analysis during the design is intended to influence the design and to develop an optimum maintenance plan for a system. The remote site locations of the SKA Telescopes place emphasis on availability, failure identification and fault isolation. This paper discusses the use of Failure Analysis and a Support Database to design a Support and Maintenance plan for the SKA Telescopes. It also describes the use of modelling to develop an availability dashboard and performance metrics.

  1. Moments and Root-Mean-Square Error of the Bayesian MMSE Estimator of Classification Error in the Gaussian Model.

    PubMed

    Zollanvari, Amin; Dougherty, Edward R

    2014-06-01

    The most important aspect of any classifier is its error rate, because this quantifies its predictive capacity. Thus, the accuracy of error estimation is critical. Error estimation is problematic in small-sample classifier design because the error must be estimated using the same data from which the classifier has been designed. Use of prior knowledge, in the form of a prior distribution on an uncertainty class of feature-label distributions to which the true, but unknown, feature-distribution belongs, can facilitate accurate error estimation (in the mean-square sense) in circumstances where accurate completely model-free error estimation is impossible. This paper provides analytic asymptotically exact finite-sample approximations for various performance metrics of the resulting Bayesian Minimum Mean-Square-Error (MMSE) error estimator in the case of linear discriminant analysis (LDA) in the multivariate Gaussian model. These performance metrics include the first, second, and cross moments of the Bayesian MMSE error estimator with the true error of LDA, and therefore, the Root-Mean-Square (RMS) error of the estimator. We lay down the theoretical groundwork for Kolmogorov double-asymptotics in a Bayesian setting, which enables us to derive asymptotic expressions of the desired performance metrics. From these we produce analytic finite-sample approximations and demonstrate their accuracy via numerical examples. Various examples illustrate the behavior of these approximations and their use in determining the necessary sample size to achieve a desired RMS. The Supplementary Material contains derivations for some equations and added figures.

  2. Development of the McGill simulator for endoscopic sinus surgery: a new high-fidelity virtual reality simulator for endoscopic sinus surgery.

    PubMed

    Varshney, Rickul; Frenkiel, Saul; Nguyen, Lily H P; Young, Meredith; Del Maestro, Rolando; Zeitouni, Anthony; Tewfik, Marc A

    2014-01-01

    The technical challenges of endoscopic sinus surgery (ESS) and the high risk of complications support the development of alternative modalities to train residents in these procedures. Virtual reality simulation is becoming a useful tool for training the skills necessary for minimally invasive surgery; however, there are currently no ESS virtual reality simulators available with valid evidence supporting their use in resident education. Our aim was to develop a new rhinology simulator, as well as to define potential performance metrics for trainee assessment. The McGill simulator for endoscopic sinus surgery (MSESS), a new sinus surgery virtual reality simulator with haptic feedback, was developed (a collaboration between the McGill University Department of Otolaryngology-Head and Neck Surgery, the Montreal Neurologic Institute Simulation Lab, and the National Research Council of Canada). A panel of experts in education, performance assessment, rhinology, and skull base surgery convened to identify core technical abilities that would need to be taught by the simulator, as well as performance metrics to be developed and captured. The MSESS allows the user to perform basic sinus surgery skills, such as an ethmoidectomy and sphenoidotomy, through the use of endoscopic tools in a virtual nasal model. The performance metrics were developed by an expert panel and include measurements of safety, quality, and efficiency of the procedure. The MSESS incorporates novel technological advancements to create a realistic platform for trainees. To our knowledge, this is the first simulator to combine novel tools such as the endonasal wash and elaborate anatomic deformity with advanced performance metrics for ESS.

  3. Discrete tyre model application for evaluation of vehicle limit handling performance

    NASA Astrophysics Data System (ADS)

    Siramdasu, Y.; Taheri, S.

    2016-11-01

    The goal of this study is twofold, first, to understand the transient and nonlinear effects of anti-lock braking systems (ABS), road undulations and driving dynamics on lateral performance of tyre and second, to develop objective handling manoeuvres and respective metrics to characterise these effects on vehicle behaviour. For studying the transient and nonlinear handling performance of the vehicle, the variations of relaxation length of tyre and tyre inertial properties play significant roles [Pacejka HB. Tire and vehicle dynamics. 3rd ed. Butterworth-Heinemann; 2012]. To accurately simulate these nonlinear effects during high-frequency vehicle dynamic manoeuvres, requires a high-frequency dynamic tyre model (? Hz). A 6 DOF dynamic tyre model integrated with enveloping model is developed and validated using fixed axle high-speed oblique cleat experimental data. Commercially available vehicle dynamics software CarSim® is used for vehicle simulation. The vehicle model was validated by comparing simulation results with experimental sinusoidal steering tests. The validated tyre model is then integrated with vehicle model and a commercial grade rule-based ABS model to perform various objective simulations. Two test scenarios of ABS braking in turn on a smooth road and accelerating in a turn on uneven and smooth roads are considered. Both test cases reiterated that while the tyre is operating in the nonlinear region of slip or slip angle, any road disturbance or high-frequency brake torque input variations can excite the inertial belt vibrations of the tyre. It is shown that these inertial vibrations can directly affect the developed performance metrics and potentially degrade the handling performance of the vehicle.

  4. Evaluating CMIP5 Simulations of Historical Continental Climate with Koeppen Bioclimatic Metrics

    NASA Astrophysics Data System (ADS)

    Phillips, T. J.; Bonfils, C.

    2013-12-01

    The classic Koeppen bioclimatic classification scheme associates generic vegetation types (e.g. grassland, tundra, broadleaf or evergreen forests, etc.) with regional climate zones defined by their annual cycles of continental temperature (T) and precipitation (P), considered together. The locations or areas of Koeppen vegetation types derived from observational data thus can provide concise metrical standards for simultaneously evaluating climate simulations of T and P in naturally defined regions. The CMIP5 models' collective ability to correctly represent two variables that are critically important for living organisms at regional scales is therefore central to this evaluation. For this study, 14 Koeppen vegetation types are derived from annual-cycle climatologies of T and P in some 3 dozen CMIP5 simulations of the 1980-1999 period. Metrics for evaluating the ability of the CMIP5 models to simulate the correct locations and areas of each vegetation type, as well as measures of overall model performance, also are developed. It is found that the CMIP5 models are generally most deficient in simulating: 1) climates of drier Koeppen zones (e.g. desert, savanna, grassland, steppe vegetation types) located in the southwestern U.S. and Mexico, eastern Europe, southern Africa, and central Australia; 2) climates of regions such as central Asia and western South America where topography plays a key role. Details of regional T or P biases in selected simulations that exemplify general model performance problems also will be presented. Acknowledgments: This work was funded by the U.S. Department of Energy Office of Science and was performed at the Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Map of Koeppen vegetation types derived from observed T and P.

  5. Display/control requirements for VTOL aircraft

    NASA Technical Reports Server (NTRS)

    Hoffman, W. C.; Curry, R. E.; Kleinman, D. L.; Hollister, W. M.; Young, L. R.

    1975-01-01

    Quantitative metrics were determined for system control performance, workload for control, monitoring performance, and workload for monitoring. Pilot tasks were allocated for navigation and guidance of automated commercial V/STOL aircraft in all weather conditions using an optimal control model of the human operator to determine display elements and design.

  6. Sensor Selection for Aircraft Engine Performance Estimation and Gas Path Fault Diagnostics

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Rinehart, Aidan W.

    2015-01-01

    This paper presents analytical techniques for aiding system designers in making aircraft engine health management sensor selection decisions. The presented techniques, which are based on linear estimation and probability theory, are tailored for gas turbine engine performance estimation and gas path fault diagnostics applications. They enable quantification of the performance estimation and diagnostic accuracy offered by different candidate sensor suites. For performance estimation, sensor selection metrics are presented for two types of estimators including a Kalman filter and a maximum a posteriori estimator. For each type of performance estimator, sensor selection is based on minimizing the theoretical sum of squared estimation errors in health parameters representing performance deterioration in the major rotating modules of the engine. For gas path fault diagnostics, the sensor selection metric is set up to maximize correct classification rate for a diagnostic strategy that performs fault classification by identifying the fault type that most closely matches the observed measurement signature in a weighted least squares sense. Results from the application of the sensor selection metrics to a linear engine model are presented and discussed. Given a baseline sensor suite and a candidate list of optional sensors, an exhaustive search is performed to determine the optimal sensor suites for performance estimation and fault diagnostics. For any given sensor suite, Monte Carlo simulation results are found to exhibit good agreement with theoretical predictions of estimation and diagnostic accuracies.

  7. Sensor Selection for Aircraft Engine Performance Estimation and Gas Path Fault Diagnostics

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Rinehart, Aidan W.

    2016-01-01

    This paper presents analytical techniques for aiding system designers in making aircraft engine health management sensor selection decisions. The presented techniques, which are based on linear estimation and probability theory, are tailored for gas turbine engine performance estimation and gas path fault diagnostics applications. They enable quantification of the performance estimation and diagnostic accuracy offered by different candidate sensor suites. For performance estimation, sensor selection metrics are presented for two types of estimators including a Kalman filter and a maximum a posteriori estimator. For each type of performance estimator, sensor selection is based on minimizing the theoretical sum of squared estimation errors in health parameters representing performance deterioration in the major rotating modules of the engine. For gas path fault diagnostics, the sensor selection metric is set up to maximize correct classification rate for a diagnostic strategy that performs fault classification by identifying the fault type that most closely matches the observed measurement signature in a weighted least squares sense. Results from the application of the sensor selection metrics to a linear engine model are presented and discussed. Given a baseline sensor suite and a candidate list of optional sensors, an exhaustive search is performed to determine the optimal sensor suites for performance estimation and fault diagnostics. For any given sensor suite, Monte Carlo simulation results are found to exhibit good agreement with theoretical predictions of estimation and diagnostic accuracies.

  8. Measuring emergency physicians' work: factoring in clinical hours, patients seen, and relative value units into 1 metric.

    PubMed

    Silich, Bert A; Yang, James J

    2012-05-01

    Measuring workplace performance is important to emergency department management. If an unreliable model is used, the results will be inaccurate. Use of inaccurate results to make decisions, such as how to distribute the incentive pay, will lead to rewarding the wrong people and will potentially demoralize top performers. This article demonstrates a statistical model to reliably measure the work accomplished, which can then be used as a performance measurement.

  9. Multivariate decoding of brain images using ordinal regression.

    PubMed

    Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

    2013-11-01

    Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. Copyright © 2013. Published by Elsevier Inc.

  10. Uncooperative target-in-the-loop performance with backscattered speckle-field effects

    NASA Astrophysics Data System (ADS)

    Kansky, Jan E.; Murphy, Daniel V.

    2007-09-01

    Systems utilizing target-in-the-loop (TIL) techniques for adaptive optics phase compensation rely on a metric sensor to perform a hill climbing algorithm that maximizes the far-field Strehl ratio. In uncooperative TIL, the metric signal is derived from the light backscattered from a target. In cases where the target is illuminated with a laser with sufficiently long coherence length, the potential exists for the validity of the metric sensor to be compromised by speckle-field effects. We report experimental results from a scaled laboratory designed to evaluate TIL performance in atmospheric turbulence and thermal blooming conditions where the metric sensors are influenced by varying degrees of backscatter speckle. We compare performance of several TIL configurations and metrics for cases with static speckle, and for cases with speckle fluctuations within the frequency range that the TIL system operates. The roles of metric sensor filtering and system bandwidth are discussed.

  11. Impact of Different Economic Performance Metrics on the Perceived Value of Solar Photovoltaics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Drury, E.; Denholm, P.; Margolis, R.

    2011-10-01

    Photovoltaic (PV) systems are installed by several types of market participants, ranging from residential customers to large-scale project developers and utilities. Each type of market participant frequently uses a different economic performance metric to characterize PV value because they are looking for different types of returns from a PV investment. This report finds that different economic performance metrics frequently show different price thresholds for when a PV investment becomes profitable or attractive. Several project parameters, such as financing terms, can have a significant impact on some metrics [e.g., internal rate of return (IRR), net present value (NPV), and benefit-to-cost (B/C) ratio] while having a minimal impact on other metrics (e.g., simple payback time). As such, the choice of economic performance metric by different customer types can significantly shape each customer's perception of PV investment value and ultimately their adoption decision.
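
    The contrast between metrics in this record can be made concrete with a small sketch. The cash flows are invented, and the discount rate is used here as a stand-in for financing terms, which is an assumption; the report's actual parameters are not given in the abstract.

```python
def net_present_value(rate, cash_flows):
    """NPV of cash flows indexed by year, with year 0 undiscounted."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))

def simple_payback_years(upfront_cost, annual_saving):
    """Years needed for undiscounted savings to repay the upfront cost."""
    return upfront_cost / annual_saving

# Invented project: 10,000 upfront, then 1,000 saved per year for 20 years.
upfront, saving, years = 10_000.0, 1_000.0, 20
flows = [-upfront] + [saving] * years

npv_low_rate = net_present_value(0.03, flows)   # positive: looks attractive
npv_high_rate = net_present_value(0.08, flows)  # negative: looks unattractive
payback = simple_payback_years(upfront, saving) # 10 years at either rate
```

    In this example the NPV changes sign between the two rates while the simple payback time does not move, which mirrors the abstract's point that the choice of metric shapes the perceived value.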

  12. Unsupervised quality estimation model for English to German translation and its application in extensive supervised evaluation.

    PubMed

    Han, Aaron L-F; Wong, Derek F; Chao, Lidia S; He, Liangye; Lu, Yi

    2014-01-01

    With the rapid development of machine translation (MT), the MT evaluation becomes very important to timely tell us whether the MT system makes any progress. The conventional MT evaluation methods tend to calculate the similarity between hypothesis translations offered by automatic translation systems and reference translations offered by professional translators. There are several weaknesses in existing evaluation metrics. Firstly, the designed incomprehensive factors result in language-bias problem, which means they perform well on some special language pairs but weak on other language pairs. Secondly, they tend to use no linguistic features or too many linguistic features, of which no usage of linguistic feature draws a lot of criticism from the linguists and too many linguistic features make the model weak in repeatability. Thirdly, the employed reference translations are very expensive and sometimes not available in the practice. In this paper, the authors propose an unsupervised MT evaluation metric using universal part-of-speech tagset without relying on reference translations. The authors also explore the performances of the designed metric on traditional supervised evaluation tasks. Both the supervised and unsupervised experiments show that the designed methods yield higher correlation scores with human judgments.

  13. Physiologically grounded metrics of model skill: a case study estimating heat stress in intertidal populations

    PubMed Central

    Kish, Nicole E.; Helmuth, Brian; Wethey, David S.

    2016-01-01

    Models of ecological responses to climate change fundamentally assume that predictor variables, which are often measured at large scales, are to some degree diagnostic of the smaller-scale biological processes that ultimately drive patterns of abundance and distribution. Given that organisms respond physiologically to stressors, such as temperature, in highly non-linear ways, small modelling errors in predictor variables can potentially result in failures to predict mortality or severe stress, especially if an organism exists near its physiological limits. As a result, a central challenge facing ecologists, particularly those attempting to forecast future responses to environmental change, is how to develop metrics of forecast model skill (the ability of a model to predict defined events) that are biologically meaningful and reflective of underlying processes. We quantified the skill of four simple models of body temperature (a primary determinant of physiological stress) of an intertidal mussel, Mytilus californianus, using common metrics of model performance, such as root mean square error, as well as forecast verification skill scores developed by the meteorological community. We used a physiologically grounded framework to assess each model's ability to predict optimal, sub-optimal, sub-lethal and lethal physiological responses. Models diverged in their ability to predict different levels of physiological stress when evaluated using skill scores, even though common metrics, such as root mean square error, indicated similar accuracy overall. Results from this study emphasize the importance of grounding assessments of model skill in the context of an organism's physiology and, especially, of considering the implications of false-positive and false-negative errors when forecasting the ecological effects of environmental change. PMID:27729979
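
    The divergence between root mean square error and event-based skill scores noted in this record can be shown with a toy case. The temperatures, the 34-degree threshold, and the two "models" below are invented; hit rate (probability of detection) is used as one example of a forecast verification score, not necessarily the one the authors chose.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between paired predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

def hit_rate(predicted, observed, threshold):
    """Fraction of observed threshold exceedances that the model also predicted."""
    events = sum(1 for o in observed if o > threshold)
    hits = sum(1 for p, o in zip(predicted, observed) if o > threshold and p > threshold)
    return hits / events if events else float("nan")

observed = [20.0, 25.0, 30.0, 35.0]  # invented body temperatures
warm_model = [22.0, 27.0, 32.0, 37.0]  # always 2 degrees too warm
cool_model = [18.0, 23.0, 28.0, 33.0]  # always 2 degrees too cool
stress_threshold = 34.0                # invented stress threshold

rmse_warm, rmse_cool = rmse(warm_model, observed), rmse(cool_model, observed)
hit_warm = hit_rate(warm_model, observed, stress_threshold)
hit_cool = hit_rate(cool_model, observed, stress_threshold)
```

    Both models have the same RMSE, yet only the warm-biased one predicts the single exceedance, so a false-negative error is hidden by the aggregate metric.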

  14. An exploratory survey of methods used to develop measures of performance

    NASA Astrophysics Data System (ADS)

    Hamner, Kenneth L.; Lafleur, Charles A.

    1993-09-01

    Nonmanufacturing organizations are being challenged to provide high-quality products and services to their customers, with an emphasis on continuous process improvement. Measures of performance, referred to as metrics, can be used to foster process improvement. The application of performance measurement to nonmanufacturing processes can be very difficult. This research explored methods used to develop metrics in nonmanufacturing organizations. Several methods were formally defined in the literature, and the researchers used a two-step screening process to determine the OMB Generic Method was most likely to produce high-quality metrics. The OMB Generic Method was then used to develop metrics. A few other metric development methods were found in use at nonmanufacturing organizations. The researchers interviewed participants in metric development efforts to determine their satisfaction and to have them identify the strengths and weaknesses of, and recommended improvements to, the metric development methods used. Analysis of participants' responses allowed the researchers to identify the key components of a sound metrics development method. Those components were incorporated into a proposed metric development method that was based on the OMB Generic Method, and should be more likely to produce high-quality metrics that will result in continuous process improvement.

  15. Improving Learner Handovers in Medical Education.

    PubMed

    Warm, Eric J; Englander, Robert; Pereira, Anne; Barach, Paul

    2017-07-01

    Multiple studies have demonstrated that the information included in the Medical Student Performance Evaluation fails to reliably predict medical students' future performance. This faulty transfer of information can lead to harm when poorly prepared students fail out of residency or, worse, are shuttled through the medical education system without an honest accounting of their performance. Such poor learner handovers likely arise from two root causes: (1) the absence of agreed-on outcomes of training and/or accepted assessments of those outcomes, and (2) the lack of standardized ways to communicate the results of those assessments. To improve the current learner handover situation, an authentic, shared mental model of competency is needed; high-quality tools to assess that competency must be developed and tested; and transparent, reliable, and safe ways to communicate this information must be created. To achieve these goals, the authors propose using a learner handover process modeled after a patient handover process. The CLASS model includes a description of the learner's Competency attainment, a summary of the Learner's performance, an Action list and statement of Situational awareness, and Synthesis by the receiving program. This model also includes coaching oriented towards improvement along the continuum of education and care. Just as studies have evaluated patient handover models using metrics that matter most to patients, studies must evaluate this learner handover model using metrics that matter most to providers, patients, and learners.

  16. An Examination of Advisor Concerns in the Era of Academic Analytics

    ERIC Educational Resources Information Center

    Daughtry, Jeremy J.

    2017-01-01

    Performance-based funding models are increasingly becoming the norm for many institutions of higher learning. Such models place greater emphasis on student retention and success metrics, for example, as requirements for receiving state appropriations. To stay competitive, universities have adopted academic analytics technologies capable of…

  17. Comparison of Home Retrofit Programs in Wisconsin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cunningham, Kerrie; Hannigan, Eileen

    2013-03-01

    To explore ways to reduce customer barriers and increase home retrofit completions, several different existing home retrofit models have been implemented in the state of Wisconsin. This study compared these programs' performance in terms of savings per home and program cost per home to assess the relative cost-effectiveness of each program design. However, given the many variations in these different programs, it is difficult to establish a fair comparison based on only a small number of metrics. Therefore, the overall purpose of the study is to document these programs' performance in a case study approach to look at general patterns of these metrics and other variables within the context of each program. This information can be used by energy efficiency program administrators and implementers to inform home retrofit program design. Six different program designs offered in Wisconsin for single-family energy efficiency improvements were included in the study. For each program, the research team provided information about the programs' approach and goals, characteristics, achievements and performance. The program models were then compared with performance results -- program cost and energy savings -- to help understand the overall strengths and weaknesses or challenges of each model.

  18. Comparison of Home Retrofit Programs in Wisconsin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cunningham, K.; Hannigan, E.

    2013-03-01

    To explore ways to reduce customer barriers and increase home retrofit completions, several different existing home retrofit models have been implemented in the state of Wisconsin. This study compared these programs' performance in terms of savings per home and program cost per home to assess the relative cost-effectiveness of each program design. However, given the many variations in these different programs, it is difficult to establish a fair comparison based on only a small number of metrics. Therefore, the overall purpose of the study is to document these programs' performance in a case study approach to look at general patterns of these metrics and other variables within the context of each program. This information can be used by energy efficiency program administrators and implementers to inform home retrofit program design. Six different program designs offered in Wisconsin for single-family energy efficiency improvements were included in the study. For each program, the research team provided information about the programs' approach and goals, characteristics, achievements and performance. The program models were then compared with performance results -- program cost and energy savings -- to help understand the overall strengths and weaknesses or challenges of each model.

  19. Biomechanical metrics of aesthetic perception in dance.

    PubMed

    Bronner, Shaw; Shippen, James

    2015-12-01

    The brain may be tuned to evaluate aesthetic perception through perceptual chunking when we observe the grace of the dancer. We modelled biomechanical metrics to explain biological determinants of aesthetic perception in dance. Eighteen expert (EXP) and intermediate (INT) dancers performed développé arabesque in three conditions: (1) slow tempo, (2) slow tempo with relevé, and (3) fast tempo. To compare biomechanical metrics of kinematic data, we calculated intra-excursion variability, principal component analysis (PCA), and dimensionless jerk for the gesture limb. Observers, all trained dancers, viewed motion capture stick figures of the trials and ranked each for aesthetic (1) proficiency and (2) movement smoothness. Statistical analyses included group by condition repeated-measures ANOVA for metric data; Mann-Whitney U rank and Friedman's rank tests for nonparametric rank data; Spearman's rho correlations to compare aesthetic rankings and metrics; and linear regression to examine which metric best quantified observers' aesthetic rankings, p < 0.05. The goodness of fit of the proposed models was determined using Akaike information criteria. Aesthetic proficiency and smoothness rankings of the dance movements revealed differences between groups and condition, p < 0.0001. EXP dancers were rated more aesthetically proficient than INT dancers. The slow and fast conditions were judged more aesthetically proficient than slow with relevé (p < 0.0001). Of the metrics, PCA best captured the differences due to group and condition. PCA also provided the most parsimonious model to explain aesthetic proficiency and smoothness rankings. By permitting organization of large data sets into simpler groupings, PCA may mirror the phenomenon of chunking in which the brain combines sensory motor elements into integrated units of behaviour. 
    In this representation, the chunk of information which is remembered, and to which the observer reacts, is the elemental mode shape of the motion rather than physical displacements. This suggests that reduction in redundant information to a simplistic dimensionality is related to the experienced observer's aesthetic perception.

  20. SU-E-I-71: Quality Assessment of Surrogate Metrics in Multi-Atlas-Based Image Segmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, T; Ruan, D

    Purpose: With the ever-growing data of heterogeneous quality, relevance assessment of atlases becomes increasingly critical for multi-atlas-based image segmentation. However, there is no universally recognized best relevance metric and even a standard to compare amongst candidates remains elusive. This study, for the first time, designs a quantification to assess relevance metrics’ quality, based on a novel perspective of the metric as surrogate for inferring the inaccessible oracle geometric agreement. Methods: We first develop an inference model to relate surrogate metrics in image space to the underlying oracle relevance metric in segmentation label space, with a monotonically non-decreasing function subject to random perturbations. Subsequently, we investigate model parameters to reveal key contributing factors to surrogates’ ability in prognosticating the oracle relevance value, for the specific task of atlas selection. Finally, we design an effective contrast-to-noise ratio (eCNR) to quantify surrogates’ quality based on insights from these analyses and empirical observations. Results: The inference model was specialized to a linear function with normally distributed perturbations, with surrogate metric exemplified by several widely-used image similarity metrics, i.e., MSD/NCC/(N)MI. Surrogates’ behaviors in selecting the most relevant atlases were assessed under varying eCNR, showing that surrogates with high eCNR dominated those with low eCNR in retaining the most relevant atlases. In an end-to-end validation, NCC/(N)MI with eCNR of 0.12 compared to MSD with eCNR of 0.10 resulted in statistically better segmentation with mean DSC of about 0.85 and the first and third quartiles of (0.83, 0.89), compared to MSD with mean DSC of 0.84 and the first and third quartiles of (0.81, 0.89). Conclusion: The designed eCNR is capable of characterizing surrogate metrics’ quality in prognosticating the oracle relevance value. It has been demonstrated to be correlated with the performance of relevant atlas selection and ultimate label fusion.

  1. Metrics for covariate balance in cohort studies of causal effects.

    PubMed

    Franklin, Jessica M; Rassen, Jeremy A; Ackermann, Diana; Bartels, Dorothee B; Schneeweiss, Sebastian

    2014-05-10

    Inferring causation from non-randomized studies of exposure requires that exposure groups can be balanced with respect to prognostic factors for the outcome. Although there is broad agreement in the literature that balance should be checked, there is confusion regarding the appropriate metric. We present a simulation study that compares several balance metrics with respect to the strength of their association with bias in estimation of the effect of a binary exposure on a binary, count, or continuous outcome. The simulations utilize matching on the propensity score with successively decreasing calipers to produce datasets with varying covariate balance. We propose the post-matching C-statistic as a balance metric and found that it had consistently strong associations with estimation bias, even when the propensity score model was misspecified, as long as the propensity score was estimated with sufficient study size. This metric, along with the average standardized difference and the general weighted difference, outperformed all other metrics considered in association with bias, including the unstandardized absolute difference, Kolmogorov-Smirnov and Lévy distances, overlapping coefficient, Mahalanobis balance, and L1 metrics. Of the best-performing metrics, the C-statistic and general weighted difference also have the advantage that they automatically evaluate balance on all covariates simultaneously and can easily incorporate balance on interactions among covariates. Therefore, when combined with the usual practice of comparing individual covariate means and standard deviations across exposure groups, these metrics may provide useful summaries of the observed covariate imbalance. Copyright © 2013 John Wiley & Sons, Ltd.
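
    One of the balance metrics named above, the average standardized difference, can be sketched as follows. This is a minimal illustration under a common textbook definition (difference in group means divided by the pooled standard deviation, averaged in absolute value over covariates); it is an assumption that this matches the exact definition used in the study, and the covariate values are hypothetical.

```python
import math
from statistics import mean, variance

def standardized_difference(exposed, unexposed):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(exposed) + variance(unexposed)) / 2)
    return (mean(exposed) - mean(unexposed)) / pooled_sd

def average_standardized_difference(covariates):
    """Mean absolute standardized difference over all covariates.

    `covariates` maps a covariate name to a pair
    (values in the exposed group, values in the unexposed group).
    """
    diffs = [abs(standardized_difference(e, u)) for e, u in covariates.values()]
    return sum(diffs) / len(diffs)

# Hypothetical matched cohort with two covariates.
cohort = {
    "age": ([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]),
    "bmi": ([22, 24, 26, 28], [22, 24, 26, 28]),
}

print(standardized_difference(*cohort["age"]))
print(average_standardized_difference(cohort))
```

    Values near zero indicate that the exposure groups are similar on the covariates considered; larger values flag imbalance that may bias the effect estimate.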

  2. Compression performance comparison in low delay real-time video for mobile applications

    NASA Astrophysics Data System (ADS)

    Bivolarski, Lazar

    2012-10-01

    This article compares the performance of several current video coding standards under low-delay, real-time conditions in a resource-constrained environment. The comparison is performed using the same content and a mix of objective and perceptual quality metrics. The metric results for the different coding schemes are analyzed from the point of view of user perception and quality of service. Multiple standards are compared: MPEG-2, MPEG4 and MPEG-AVC, as well as H.263. The metrics used in the comparison include SSIM, VQM and DVQ. Subjective evaluation and quality of service are discussed from the point of view of perceptual metrics and their incorporation in the coding scheme development process. The performance and the correlation of results are presented as a predictor of the performance of video compression schemes.

  3. TOD to TTP calibration

    NASA Astrophysics Data System (ADS)

    Bijl, Piet; Reynolds, Joseph P.; Vos, Wouter K.; Hogervorst, Maarten A.; Fanning, Jonathan D.

    2011-05-01

    The TTP (Targeting Task Performance) metric, developed at NVESD, is the current standard US Army model to predict EO/IR Target Acquisition performance. This model however does not have a corresponding lab or field test to empirically assess the performance of a camera system. The TOD (Triangle Orientation Discrimination) method, developed at TNO in The Netherlands, provides such a measurement. In this study, we make a direct comparison between TOD performance for a range of sensors and the extensive historical US observer performance database built to develop and calibrate the TTP metric. The US perception data were collected doing an identification task by military personnel on a standard 12 target, 12 aspect tactical vehicle image set that was processed through simulated sensors for which the most fundamental sensor parameters such as blur, sampling, spatial and temporal noise were varied. In the present study, we measured TOD sensor performance using exactly the same sensors processing a set of TOD triangle test patterns. The study shows that good overall agreement is obtained when the ratio between target characteristic size and TOD test pattern size at threshold equals 6.3. Note that this number is purely based on empirical data without any intermediate modeling. The calibration of the TOD to the TTP is highly beneficial to the sensor modeling and testing community for a variety of reasons. These include: i) a connection between requirement specification and acceptance testing, and ii) a very efficient method to quickly validate or extend the TTP range prediction model to new systems and tasks.

  4. Reference Manual for the System Advisor Model's Wind Power Performance Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Freeman, J.; Jorgenson, J.; Gilman, P.

    2014-08-01

    This manual describes the National Renewable Energy Laboratory's System Advisor Model (SAM) wind power performance model. The model calculates the hourly electrical output of a single wind turbine or of a wind farm. The wind power performance model requires information about the wind resource, wind turbine specifications, wind farm layout (if applicable), and costs. In SAM, the performance model can be coupled to one of the financial models to calculate economic metrics for residential, commercial, or utility-scale wind projects. This manual describes the algorithms used by the wind power performance model, which is available in the SAM user interface and as part of the SAM Simulation Core (SSC) library, and is intended to supplement the user documentation that comes with the software.

  5. Applying graphs and complex networks to football metric interpretation.

    PubMed

    Arriaza-Ardiles, E; Martín-González, J M; Zuniga, M D; Sánchez-Flores, J; de Saa, Y; García-Manso, J M

    2018-02-01

    This work presents a methodology for analysing the interactions between players in a football team, from the point of view of graph theory and complex networks. We model the complex network of passing interactions between players of the same team in 32 official matches of the Liga de Fútbol Profesional (Spain), using a passing/reception graph. This methodology allows us to understand the play structure of the team, by analysing the offensive phases of game-play. We utilise two different strategies for characterising the contribution of the players to the team: the clustering coefficient, and centrality metrics (closeness and betweenness). We show the application of this methodology by analyzing the performance of a professional Spanish team according to these metrics and the distribution of passing/reception in the field. Keeping in mind the dynamic nature of collective sports, in the future we will incorporate metrics which allow us to analyse the performance of the team also according to the circumstances of game-play and to different contextual variables such as the utilisation of the field space, the time, and the ball, according to specific tactical situations. Copyright © 2017 Elsevier B.V. All rights reserved.
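
    As a rough illustration of one of the metrics mentioned, the sketch below computes a local clustering coefficient on a toy passing network. It assumes a simplified undirected, unweighted graph (two players are linked if they exchanged at least one pass); the study's passing/reception graph may well be directed or weighted, and the player labels are hypothetical.

```python
from itertools import combinations

def local_clustering(adjacency, node):
    """Share of a player's passing partners who also pass to each other."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two partners: no pair of partners can be linked
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

# Hypothetical passing links between four players.
passes = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

for player in sorted(passes):
    print(player, local_clustering(passes, player))
```

    A player whose partners all pass among themselves scores 1, while a player who connects otherwise unlinked teammates scores closer to 0.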

  6. Biophysics of Euglena phototaxis

    NASA Astrophysics Data System (ADS)

    Tsang, Alan Cheng Hou; Riedel-Kruse, Ingmar H.

    Phototactic microorganisms usually respond to light stimuli via phototaxis to optimize photosynthesis and avoid photodamage from excessive light. Unicellular phototactic microorganisms such as Euglena gracilis possess only a single photoreceptor, which greatly limits their access to light in a three-dimensional world. However, experiments have demonstrated that Euglena responds to light stimuli sensitively and exhibits phototaxis quickly, and it is not well understood how it performs so efficiently. We propose a mathematical model of Euglena's phototaxis that couples the dynamics of Euglena and its phototactic response. This model shows that Euglena exhibits a wobbling path under weak ambient light, which is consistent with experimental observation. We show that this wobbling motion can enhance the sensitivity of the photoreceptor to signals of small light intensity and provide an efficient mechanism for Euglena to sample light in different directions. We further investigate the optimization of Euglena's phototaxis using different performance metrics, including reorientation time, energy consumption, and swimming efficiency. We characterize the tradeoff among these performance metrics and the best strategy for phototaxis.

  7. Wide-area, real-time monitoring and visualization system

    DOEpatents

    Budhraja, Vikram S.; Dyer, James D.; Martinez Morales, Carlos A.

    2013-03-19

    A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a data base, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.

  8. Wide-area, real-time monitoring and visualization system

    DOEpatents

    Budhraja, Vikram S [Los Angeles, CA; Dyer, James D [La Mirada, CA; Martinez Morales, Carlos A [Upland, CA

    2011-11-15

    A real-time performance monitoring system for monitoring an electric power grid. The electric power grid has a plurality of grid portions, each grid portion corresponding to one of a plurality of control areas. The real-time performance monitoring system includes a monitor computer for monitoring at least one of reliability metrics, generation metrics, transmission metrics, suppliers metrics, grid infrastructure security metrics, and markets metrics for the electric power grid. The data for metrics being monitored by the monitor computer are stored in a data base, and a visualization of the metrics is displayed on at least one display computer having a monitor. The at least one display computer in one said control area enables an operator to monitor the grid portion corresponding to a different said control area.

  9. Evaluation of Two Crew Module Boilerplate Tests Using Newly Developed Calibration Metrics

    NASA Technical Reports Server (NTRS)

    Horta, Lucas G.; Reaves, Mercedes C.

    2012-01-01

    The paper discusses an application of multi-dimensional calibration metrics to evaluate pressure data from water drop tests of the Max Launch Abort System (MLAS) crew module boilerplate. Specifically, three metrics are discussed: 1) a metric to assess the probability of enveloping the measured data with the model, 2) a multi-dimensional orthogonality metric to assess model adequacy between test and analysis, and 3) a prediction error metric to conduct sensor placement to minimize pressure prediction errors. Data from similar (nearly repeated) capsule drop tests show significant variability in the measured pressure responses. When compared to expected variability using model predictions, it is demonstrated that the measured variability cannot be explained by the model under the current uncertainty assumptions.

  10. Global Rating Scales and Motion Analysis Are Valid Proficiency Metrics in Virtual and Benchtop Knee Arthroscopy Simulators.

    PubMed

    Chang, Justues; Banaszek, Daniel C; Gambrel, Jason; Bardana, Davide

    2016-04-01

    Work-hour restrictions and fatigue management strategies in surgical training programs continue to evolve in an effort to improve the learning environment and promote safer patient care. In response, training programs must reevaluate how various teaching modalities such as simulation can augment the development of surgical competence in trainees. For surgical simulators to be most useful, it is important to determine whether surgical proficiency can be reliably differentiated using them. To our knowledge, performance on both virtual and benchtop arthroscopy simulators has not been concurrently assessed in the same subjects. (1) Do global rating scales and procedure time differentiate arthroscopic expertise in virtual and benchtop knee models? (2) Can commercially available built-in motion analysis metrics differentiate arthroscopic expertise? (3) How well are performance measures on virtual and benchtop simulators correlated? (4) Are these metrics sensitive enough to differentiate by year of training? A cross-sectional study of 19 subjects (four medical students, 12 residents, and three staff) were recruited and divided into 11 novice arthroscopists (student to Postgraduate Year [PGY] 3) and eight proficient arthroscopists (PGY 4 to staff) who completed a diagnostic arthroscopy and loose-body retrieval in both virtual and benchtop knee models. Global rating scales (GRS), procedure times, and motion analysis metrics were used to evaluate performance. The proficient group scored higher on virtual (14 ± 6 [95% confidence interval {CI}, 10-18] versus 36 ± 5 [95% CI, 32-40], p < 0.001) and benchtop (16 ± 8 [95% CI, 11-21] versus 36 ± 5 [95% CI, 31-40], p < 0.001) GRS scales. 
    The proficient subjects completed nearly all tasks faster than novice subjects, including the virtual scope (579 ± 169 [95% CI, 466-692] versus 358 ± 178 [95% CI, 210-507] seconds, p = 0.02) and benchtop knee scope + probe (480 ± 160 [95% CI, 373-588] versus 277 ± 64 [95% CI, 224-330] seconds, p = 0.002). The built-in motion analysis metrics also distinguished novices from proficient arthroscopists using the self-generated virtual loose body retrieval task scores (4 ± 1 [95% CI, 3-5] versus 6 ± 1 [95% CI, 5-7], p = 0.001). GRS scores between virtual and benchtop models were very strongly correlated (ρ = 0.93, p < 0.001). There was strong correlation between year of training and virtual GRS (ρ = 0.8, p < 0.001) and benchtop GRS (ρ = 0.87, p < 0.001) scores. To our knowledge, this is the first study to evaluate performance on both virtual and benchtop knee simulators. We have shown that subjective GRS scores and objective motion analysis metrics and procedure time are valid measures to distinguish arthroscopic skill on both virtual and benchtop modalities. Performance on both modalities is well correlated. We believe that training on artificial models allows acquisition of skills in a safe environment. Future work should compare different modalities in the efficiency of skill acquisition, retention, and transferability to the operating room.

  11. Grading the Metrics: Performance-Based Funding in the Florida State University System

    ERIC Educational Resources Information Center

    Cornelius, Luke M.; Cavanaugh, Terence W.

    2016-01-01

    A policy analysis of Florida's 10-factor Performance-Based Funding system for state universities. The focus of the article is on the system of performance metrics developed by the state Board of Governors and their impact on institutions and their missions. The paper also discusses problems and issues with the metrics, their ongoing evolution, and…

  12. The roofline model: A pedagogical tool for program analysis and optimization

    DOE PAGES

    Williams, Samuel; Patterson, David; Oliker, Leonid; ...

    2008-08-01

    This article consists of a collection of slides from the authors' conference presentation. The Roofline model is a visually intuitive figure for kernel analysis and optimization. We believe undergraduates will find it useful in assessing performance and scalability limitations. It is easily extended to other architectural paradigms. It is also easily extended to other metrics: performance (sort, graphics, crypto, ...) and bandwidth (L2, PCIe, ...). Furthermore, performance counters could be used to generate a runtime-specific roofline that would greatly aid optimization.
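
    The bound that the Roofline figure plots can be written in a few lines. The sketch below uses the standard formulation (attainable performance is the lesser of peak compute throughput and peak memory bandwidth times operational intensity); the machine numbers and function names are hypothetical and are not taken from the slides.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, operational_intensity):
    """Roofline bound: min(compute roof, bandwidth roof * flops per byte)."""
    return min(peak_gflops, peak_bandwidth_gbs * operational_intensity)

def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Operational intensity (flops/byte) at which a kernel stops being memory-bound."""
    return peak_gflops / peak_bandwidth_gbs

# Hypothetical machine: 100 GFLOP/s peak compute, 20 GB/s peak memory bandwidth.
PEAK_GFLOPS = 100.0
PEAK_BANDWIDTH_GBS = 20.0

for intensity in (0.5, 2.0, 8.0):  # flops per byte moved, assumed kernel values
    bound = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, intensity)
    regime = "memory-bound" if intensity < ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS) else "compute-bound"
    print(intensity, bound, regime)
```

    Swapping the bandwidth figure for that of another level of the memory hierarchy gives the corresponding roofline for that level, which is one way to read the extension to other bandwidth metrics.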

  13. Control algorithms and applications of the wavefront sensorless adaptive optics

    NASA Astrophysics Data System (ADS)

    Ma, Liang; Wang, Bin; Zhou, Yuanshen; Yang, Huizhen

    2017-10-01

    Compared with the conventional adaptive optics (AO) system, the wavefront sensorless (WFSless) AO system does not need to measure the wavefront and reconstruct it. It is simpler than the conventional AO in system architecture and can be applied under complex conditions. Based on the analysis of principle and system model of the WFSless AO system, wavefront correction methods of the WFSless AO system were divided into two categories: model-free-based and model-based control algorithms. The WFSless AO system based on model-free-based control algorithms commonly considers the performance metric as a function of the control parameters and then uses a certain control algorithm to improve the performance metric. The model-based control algorithms include modal control algorithms, nonlinear control algorithms and control algorithms based on geometrical optics. Based on a brief description of the above typical control algorithms, hybrid methods combining the model-free-based control algorithm with the model-based control algorithm were generalized. Additionally, characteristics of various control algorithms were compared and analyzed. We also discussed the extensive applications of the WFSless AO system in free space optical communication (FSO), retinal imaging in the human eye, confocal microscopy, coherent beam combination (CBC) techniques and extended objects.

  14. Performance analysis of LAN bridges and routers

    NASA Technical Reports Server (NTRS)

    Hajare, Ankur R.

    1991-01-01

    Bridges and routers are used to interconnect Local Area Networks (LANs). The performance of these devices is important since they can become bottlenecks in large multi-segment networks. Performance metrics and test methodology for bridges and routers were not standardized. Performance data reported by vendors is not applicable to the actual scenarios encountered in an operational network. However, vendor-provided data can be used to calibrate models of bridges and routers that, along with other models, yield performance data for a network. Several tools are available for modeling bridges and routers - Network II.5 was used. The results of the analysis of some bridges and routers are presented.

  15. Evaluation Metrics for Simulations of Tropical South America

    NASA Astrophysics Data System (ADS)

    Gallup, S.; Baker, I. T.; Denning, A. S.; Cheeseman, M.; Haynes, K. D.; Phillips, M.

    2017-12-01

    The evergreen broadleaf forest of the Amazon Basin is the largest rainforest on earth, and has teleconnections to global climate and carbon cycle characteristics. This region defies simple characterization, spanning large gradients in total rainfall and seasonal variability. Broadly, the region can be thought of as trending from light-limited in its wettest areas to water-limited near the ecotone, with individual landscapes possibly exhibiting the characteristics of either (or both) limitations during an annual cycle. A basin-scale classification of mean behavior has been elusive, and ecosystem response to seasonal cycles and anomalous drought events has resulted in some disagreement in the literature, to say the least. However, new observational platforms and instruments make characterization of the heterogeneity and variability more feasible. To evaluate simulations of ecophysiological function, we develop metrics that correlate various observational products with meteorological variables such as precipitation and radiation. Observations include eddy covariance fluxes, Solar Induced Fluorescence (SIF, from GOME2 and OCO2), biomass and vegetation indices. We find that the modest correlation between SIF and precipitation decreases with increasing annual precipitation, although the relationship is not consistent between products. Biomass increases with increasing precipitation. Although vegetation indices are generally correlated with biomass and precipitation, they can saturate or experience retrieval issues during cloudy periods. Using these observational products and relationships, we develop a set of model evaluation metrics. These metrics are designed to call attention to models that get "the right answer only if it's for the right reason," and provide an opportunity for more critical evaluation of model physics. These metrics represent a testbed that can be applied to multiple models as a means to evaluate their performance in tropical South America.

  16. Virtual reality, ultrasound-guided liver biopsy simulator: development and performance discrimination.

    PubMed

    Johnson, S J; Hunt, C M; Woolnough, H M; Crawshaw, M; Kilkenny, C; Gould, D A; England, A; Sinha, A; Villard, P F

    2012-05-01

    The aim of this article was to identify and prospectively investigate simulated ultrasound-guided targeted liver biopsy performance metrics as differentiators between levels of expertise in interventional radiology. Task analysis produced detailed procedural step documentation allowing identification of critical procedure steps and performance metrics for use in a virtual reality ultrasound-guided targeted liver biopsy procedure. Consultant (n=14; male=11, female=3) and trainee (n=26; male=19, female=7) scores on the performance metrics were compared. Ethical approval was granted by the Liverpool Research Ethics Committee (UK). Independent t-tests and analysis of variance (ANOVA) investigated differences between groups. Independent t-tests revealed significant differences between trainees and consultants on three performance metrics: targeting, p=0.018, t=-2.487 (-2.040 to -0.207); probe usage time, p = 0.040, t=2.132 (11.064 to 427.983); mean needle length in beam, p=0.029, t=-2.272 (-0.028 to -0.002). ANOVA reported significant differences across years of experience (0-1, 1-2, 3+ years) on seven performance metrics: no-go area touched, p=0.012; targeting, p=0.025; length of session, p=0.024; probe usage time, p=0.025; total needle distance moved, p=0.038; number of skin contacts, p<0.001; total time in no-go area, p=0.008. More experienced participants consistently received better performance scores on all 19 performance metrics. It is possible to measure and monitor performance using simulation, with performance metrics providing feedback on skill level and differentiating levels of expertise. However, a transfer of training study is required.

  17. Electro-Optic Identification Research Program

    DTIC Science & Technology

    2002-04-01

    Electro-optic identification (EOID) sensors provide photographic quality images that can be used to identify mine-like contacts provided by long...tasks such as validating existing electro-optic models, development of performance metrics, and development of computer aided identification and

  18. Wafer hot spot identification through advanced photomask characterization techniques

    NASA Astrophysics Data System (ADS)

    Choi, Yohan; Green, Michael; McMurran, Jeff; Ham, Young; Lin, Howard; Lan, Andy; Yang, Richer; Lung, Mike

    2016-10-01

    As device manufacturers progress through advanced technology nodes, limitations in standard 1-dimensional (1D) mask Critical Dimension (CD) metrics are becoming apparent. Historically, 1D metrics such as Mean to Target (MTT) and CD Uniformity (CDU) have been adequate for end users to evaluate and predict the mask impact on the wafer process. However, the wafer lithographer's process margin is shrinking at advanced nodes to a point that the classical mask CD metrics are no longer adequate to gauge the mask contribution to wafer process error. For example, wafer CDU error at advanced nodes is impacted by mask factors such as 3-dimensional (3D) effects and mask pattern fidelity on subresolution assist features (SRAFs) used in Optical Proximity Correction (OPC) models of ever-increasing complexity. These items are not quantifiable with the 1D metrology techniques of today. Likewise, the mask maker needs advanced characterization methods in order to optimize the mask process to meet the wafer lithographer's needs. These advanced characterization metrics are what is needed to harmonize mask and wafer processes for enhanced wafer hot spot analysis. In this paper, we study advanced mask pattern characterization techniques and their correlation with modeled wafer performance.

  19. Hyperspectral face recognition using improved inter-channel alignment based on qualitative prediction models.

    PubMed

    Cho, Woon; Jang, Jinbeum; Koschan, Andreas; Abidi, Mongi A; Paik, Joonki

    2016-11-28

    A fundamental limitation of hyperspectral imaging is the inter-band misalignment correlated with subject motion during data acquisition. One way of resolving this problem is to assess the alignment quality of hyperspectral image cubes derived from the state-of-the-art alignment methods. In this paper, we present an automatic selection framework for the optimal alignment method to improve the performance of face recognition. Specifically, we develop two qualitative prediction models based on: 1) a principal curvature map for evaluating the similarity index between sequential target bands and a reference band in the hyperspectral image cube as a full-reference metric; and 2) the cumulative probability of target colors in the HSV color space for evaluating the alignment index of a single sRGB image rendered using all of the bands of the hyperspectral image cube as a no-reference metric. We verify the efficacy of the proposed metrics on a new large-scale database, demonstrating a higher prediction accuracy in determining improved alignment compared to two full-reference and five no-reference image quality metrics. We also validate the ability of the proposed framework to improve hyperspectral face recognition.

  20. Analysis and optimization of preliminary aircraft configurations in relationship to emerging agility metrics

    NASA Technical Reports Server (NTRS)

    Sandlin, Doral R.; Bauer, Brent Alan

    1993-01-01

    This paper discusses the development of a FORTRAN computer code to perform agility analysis on aircraft configurations. This code is to be part of the NASA-Ames ACSYNT (AirCraft SYNThesis) design code. This paper begins with a discussion of contemporary agility research in the aircraft industry and a survey of a few agility metrics. The methodology, techniques and models developed for the code are then presented. Finally, example trade studies using the agility module along with ACSYNT are illustrated. These trade studies were conducted using a Northrop F-20 Tigershark aircraft model. The studies show that the agility module is effective in analyzing the influence of common parameters such as thrust-to-weight ratio and wing loading on agility criteria. The module can compare the agility potential between different configurations. In addition, one study illustrates the module's ability to optimize a configuration's agility performance.

  1. Development of a Computer Program for Analyzing Preliminary Aircraft Configurations in Relationship to Emerging Agility Metrics

    NASA Technical Reports Server (NTRS)

    Bauer, Brent

    1993-01-01

    This paper discusses the development of a FORTRAN computer code to perform agility analysis on aircraft configurations. This code is to be part of the NASA-Ames ACSYNT (AirCraft SYNThesis) design code. This paper begins with a discussion of contemporary agility research in the aircraft industry and a survey of a few agility metrics. The methodology, techniques and models developed for the code are then presented. Finally, example trade studies using the agility module along with ACSYNT are illustrated. These trade studies were conducted using a Northrop F-20 Tigershark aircraft model. The studies show that the agility module is effective in analyzing the influence of common parameters such as thrust-to-weight ratio and wing loading on agility criteria. The module can compare the agility potential between different configurations. In addition, one study illustrates the module's ability to optimize a configuration's agility performance.

  2. What do we know and when do we know it?

    NASA Astrophysics Data System (ADS)

    Nicholls, Anthony

    2008-03-01

    Two essential aspects of virtual screening are considered: experimental design and performance metrics. In the design of any retrospective virtual screen, choices have to be made as to the purpose of the exercise. Is the goal to compare methods? Is the interest in a particular type of target or all targets? Are we simulating a `real-world' setting, or teasing out distinguishing features of a method? What are the confidence limits for the results? What should be reported in a publication? In particular, what criteria should be used to decide between different performance metrics? Comparing the field of molecular modeling to other endeavors, such as medical statistics, criminology, or computer hardware evaluation indicates some clear directions. Taken together these suggest the modeling field has a long way to go to provide effective assessment of its approaches, either to itself or to a broader audience, but that there are no technical reasons why progress cannot be made.

  3. A cross-validation package driving Netica with python

    USGS Publications Warehouse

    Fienen, Michael N.; Plant, Nathaniel G.

    2014-01-01

    Bayesian networks (BNs) are powerful tools for probabilistically simulating natural systems and emulating process models. Cross validation is a technique to avoid overfitting resulting from overly complex BNs. Overfitting reduces predictive skill. Cross-validation for BNs is known but rarely implemented due partly to a lack of software tools designed to work with available BN packages. CVNetica is open-source, written in Python, and extends the Netica software package to perform cross-validation and read, rebuild, and learn BNs from data. Insights gained from cross-validation and implications on prediction versus description are illustrated with: a data-driven oceanographic application; and a model-emulation application. These examples show that overfitting occurs when BNs become more complex than allowed by supporting data and overfitting incurs computational costs as well as causing a reduction in prediction skill. CVNetica evaluates overfitting using several complexity metrics (we used level of discretization) and its impact on performance metrics (we used skill).
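
    The cross-validation idea described above can be sketched generically as follows. This is not CVNetica's or Netica's API; the function names, the contiguous fold layout, the mean-squared-error score, and the trivial mean predictor are all assumptions made for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, predict):
    """Return the mean squared error on held-out folds, averaged over folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / len(fold_errors)


# Hypothetical stand-in model: always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


print(cross_validate([0, 1, 2, 3], [2.0, 2.0, 2.0, 2.0], 2, fit_mean, predict_mean))
```

    Comparing this held-out error with the error on the training data, as model complexity grows, is one common way to see the overfitting the abstract describes.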

  4. Rationalizing context-dependent performance of dynamic RNA regulatory devices.

    PubMed

    Kent, Ross; Halliwell, Samantha; Young, Kate; Swainston, Neil; Dixon, Neil

    2018-06-21

    The ability of RNA to sense, regulate and store information is an attractive attribute for a variety of functional applications including the development of regulatory control devices for synthetic biology. RNA folding and function are known to be highly context sensitive, which limits the modularity and reuse of RNA regulatory devices to control different heterologous sequences and genes. We explored the cause and effect of sequence context sensitivity for translational ON riboswitches located in the 5' UTR, by constructing and screening a library of N-terminal synonymous codon variants. By altering the N-terminal codon usage we were able to obtain RNA devices with a broad range of functional performance properties (ON, OFF, fold-change). Linear regression and calculated metrics were used to rationalize the major determining features leading to optimal riboswitch performance, and to identify multiple interactions between the explanatory metrics. Finally, partial least squares (PLS) analysis was employed in order to understand the metrics and their respective effect on performance. This PLS model was shown to provide good explanation of our library. This study provides a novel multi-variant analysis framework by which to rationalize the codon context performance of allosteric RNA-devices. The framework will also serve as a platform for future riboswitch context engineering endeavors.

  5. A Web Resource for Standardized Benchmark Datasets, Metrics, and Rosetta Protocols for Macromolecular Modeling and Design.

    PubMed

    Ó Conchúir, Shane; Barlow, Kyle A; Pache, Roland A; Ollikainen, Noah; Kundert, Kale; O'Meara, Matthew J; Smith, Colin A; Kortemme, Tanja

    2015-01-01

    The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a "best practice" set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.

  6. Advanced Life Support System Value Metric

    NASA Technical Reports Server (NTRS)

    Jones, Harry W.; Rasky, Daniel J. (Technical Monitor)

    1999-01-01

    The NASA Advanced Life Support (ALS) Program is required to provide a performance metric to measure its progress in system development. Extensive discussions within the ALS program have led to the following approach. The Equivalent System Mass (ESM) metric has been traditionally used and provides a good summary of the weight, size, and power cost factors of space life support equipment. But ESM assumes that all the systems being traded off exactly meet a fixed performance requirement, so that the value and benefit (readiness, performance, safety, etc.) of all the different systems designs are considered to be exactly equal. This is too simplistic. Actual system design concepts are selected using many cost and benefit factors and the system specification is defined after many trade-offs. The ALS program needs a multi-parameter metric including both the ESM and a System Value Metric (SVM). The SVM would include safety, maintainability, reliability, performance, use of cross cutting technology, and commercialization potential. Another major factor in system selection is technology readiness level (TRL), a familiar metric in ALS. The overall ALS system metric that is suggested is a benefit/cost ratio, SVM/[ESM + function (TRL)], with appropriate weighting and scaling. The total value is given by SVM. Cost is represented by higher ESM and lower TRL. The paper provides a detailed description and example application of a suggested System Value Metric and an overall ALS system metric.
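
    The suggested benefit/cost ratio, SVM/[ESM + function (TRL)], can be sketched as below. The abstract does not define the TRL function or the weighting and scaling, so the linear penalty, its `mass_per_level` parameter, and all numeric values here are hypothetical.

```python
def trl_penalty(trl, mass_per_level=100.0):
    """Hypothetical cost term: each TRL level below 9 adds equivalent mass."""
    if not 1 <= trl <= 9:
        raise ValueError("TRL is defined on the scale 1 to 9")
    return mass_per_level * (9 - trl)


def als_system_metric(svm, esm, trl, penalty=trl_penalty):
    """Benefit/cost ratio SVM / [ESM + function(TRL)]; higher is better."""
    return svm / (esm + penalty(trl))


# Two hypothetical designs with the same value and ESM but different readiness.
mature = als_system_metric(svm=50.0, esm=400.0, trl=9)
immature = als_system_metric(svm=50.0, esm=400.0, trl=8)
print(mature, immature)
```

    Under this assumed penalty, the less mature design scores lower, which matches the abstract's statement that cost is represented by higher ESM and lower TRL.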

  7. Public Health Analysis Transport Optimization Model v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beyeler, Walt; Finley, Patrick; Walser, Alex

    PHANTOM models logistic functions of national public health systems. The system enables public health officials to visualize and coordinate options for public health surveillance, diagnosis, response and administration in an integrated analytical environment. Users may simulate and analyze system performance by applying scenarios that represent current conditions or future contingencies, supporting what-if analyses of potential systemic improvements. Public health networks are visualized as interactive maps, with graphical displays of relevant system performance metrics as calculated by the simulation modeling components.

  8. An Alternative Time Metric to Modified Tau for Unmanned Aircraft System Detect And Avoid

    NASA Technical Reports Server (NTRS)

    Wu, Minghong G.; Bageshwar, Vibhor L.; Euteneuer, Eric A.

    2017-01-01

    A new horizontal time metric, Time to Protected Zone, is proposed for use in the Detect and Avoid (DAA) Systems equipped by unmanned aircraft systems (UAS). This time metric has three advantages over the currently adopted time metric, modified tau: it corresponds to a physical event, it is linear with time, and it can be directly used to prioritize intruding aircraft. The protected zone defines an area around the UAS that can be a function of each intruding aircraft's surveillance measurement errors. Even with its advantages, the Time to Protected Zone depends explicitly on encounter geometry and may be more sensitive to surveillance sensor errors than modified tau. To quantify its sensitivity, simulation of 972 encounters using realistic sensor models and a proprietary fusion tracker is performed. Two sensitivity metrics, the probability of time reversal and the average absolute time error, are computed for both the Time to Protected Zone and modified tau. Results show that the sensitivity of the Time to Protected Zone is comparable to that of modified tau if the dimensions of the protected zone are adequately defined.

  9. Analysis of PV Advanced Inverter Functions and Setpoints under Time Series Simulation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seuss, John; Reno, Matthew J.; Broderick, Robert Joseph

    Utilities are increasingly concerned about the potential negative impacts distributed PV may have on the operational integrity of their distribution feeders. Some have proposed novel methods for controlling a PV system's grid-tie inverter to mitigate potential PV-induced problems. This report investigates the effectiveness of several of these PV advanced inverter controls on improving distribution feeder operational metrics. The controls are simulated on a large PV system interconnected at several locations within two realistic distribution feeder models. Due to the time-domain nature of the advanced inverter controls, quasi-static time series simulations are performed under one week of representative variable irradiance and load data for each feeder. A parametric study is performed on each control type to determine how well certain measurable network metrics improve as a function of the control parameters. This methodology is used to determine appropriate advanced inverter settings for each location on the feeder and overall for any interconnection location on the feeder.

  10. Citizen science: A new perspective to advance spatial pattern evaluation in hydrology.

    PubMed

    Koch, Julian; Stisen, Simon

    2017-01-01

    Citizen science opens new pathways that can complement traditional scientific practice. Intuition and reasoning often make humans more effective than computer algorithms in various realms of problem solving. In particular, a simple visual comparison of spatial patterns is a task where humans are often considered to be more reliable than computer algorithms. In practice, however, science still largely depends on computer-based solutions, which offer benefits such as speed and the possibility of automating processes. Human vision can nevertheless be harnessed to evaluate the reliability of algorithms that are tailored to quantify similarity in spatial patterns. We established a citizen science project that employs human perception to rate similarity and dissimilarity between simulated spatial patterns of several scenarios of a hydrological catchment model. In total, more than 2500 volunteers provided over 43000 classifications of 1095 individual subjects. We investigate the capability of a set of advanced statistical performance metrics to mimic human perception in distinguishing between similarity and dissimilarity. Results suggest that more complex metrics are not necessarily better at emulating human perception, but clearly provide auxiliary information that is valuable for model diagnostics. The metrics clearly differ in their ability to unambiguously distinguish between similar and dissimilar patterns, which is regarded as a key feature of a reliable metric. The obtained dataset can provide an insightful benchmark for the community to test novel spatial metrics.

  11. Evaluation of SEBS, SEBAL, and METRIC models in estimation of the evaporation from the freshwater lakes (Case study: Amirkabir dam, Iran)

    NASA Astrophysics Data System (ADS)

    Zamani Losgedaragh, Saeideh; Rahimzadegan, Majid

    2018-06-01

    Evapotranspiration (ET) estimation is of great importance due to its key role in water resource management. Surface energy modeling tools such as Surface Energy Balance Algorithm for Land (SEBAL), Mapping Evapotranspiration with Internalized Calibration (METRIC), and the Surface Energy Balance System (SEBS) can estimate the amount of evapotranspiration for every pixel of the satellite images. The main objective of this research is to investigate evaporation from freshwater bodies using SEBAL, METRIC, and SEBS. For this purpose, the Amirkabir dam reservoir and its nearby agricultural lands in a semi-arid climate were selected as the study area and studied from 2011 to 2017. SEBAL, METRIC, and SEBS were implemented on 16 selected Landsat TM5 and OLI satellite images. The corresponding pan evaporation measurements on the reservoir bank were taken as the ground truth data. According to the results, SEBAL is not a reliable method for evaluating freshwater evaporation, with a coefficient of determination (R2) of 0.36 and a Root Mean Square Error (RMSE) of 5.1 mm. On the other hand, METRIC, with R2 and RMSE of 0.57 and 2.02 mm, and SEBS, with R2 and RMSE of 0.93 and 0.62, demonstrated relatively good performance.
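
    The two scores quoted above, RMSE and R2, can be computed from paired estimates and ground-truth measurements as in the following sketch. It assumes R2 is the squared Pearson correlation, which the abstract does not specify, and the evaporation values are hypothetical placeholders rather than data from the study.

```python
import math


def rmse(observed, estimated):
    """Root Mean Square Error between paired observations and estimates."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, estimated):
    """Coefficient of determination, taken here as squared Pearson correlation."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_o * var_e)


# Hypothetical pan measurements and model estimates in mm.
pan_mm = [1.0, 2.0, 3.0, 4.0]
model_mm = [2.0, 3.0, 4.0, 5.0]
print(rmse(pan_mm, model_mm), r_squared(pan_mm, model_mm))
```

    The example is chosen so that the estimates are offset by a constant 1 mm: the correlation-based R2 is perfect while the RMSE is not zero, which is why the two scores are usually reported together.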

  12. Disturbance metrics predict a wetland Vegetation Index of Biotic Integrity

    USGS Publications Warehouse

    Stapanian, Martin A.; Mack, John; Adams, Jean V.; Gara, Brian; Micacchion, Mick

    2013-01-01

    Indices of biological integrity of wetlands based on vascular plants (VIBIs) have been developed in many areas in the USA. Knowledge of the best predictors of VIBIs would enable management agencies to make better decisions regarding mitigation site selection and performance monitoring criteria. We use a novel statistical technique to develop predictive models for an established index of wetland vegetation integrity (Ohio VIBI), using as independent variables 20 indices and metrics of habitat quality, wetland disturbance, and buffer area land use from 149 wetlands in Ohio, USA. For emergent and forest wetlands, predictive models explained 61% and 54% of the variability, respectively, in Ohio VIBI scores. In both cases the most important predictor of Ohio VIBI score was a metric that assessed habitat alteration and development in the wetland. Of secondary importance as a predictor was a metric that assessed microtopography, interspersion, and quality of vegetation communities in the wetland. Metrics and indices assessing disturbance and land use of the buffer area were generally poor predictors of Ohio VIBI scores. Our results suggest that vegetation integrity of emergent and forest wetlands could be most directly enhanced by minimizing substrate and habitat disturbance within the wetland. Such efforts could include reducing or eliminating any practices that disturb the soil profile, such as nutrient enrichment from adjacent farm land, mowing, grazing, or cutting or removing woody plants.

  13. R&D100: Lightweight Distributed Metric Service

    ScienceCinema

    Gentile, Ann; Brandt, Jim; Tucker, Tom; Showerman, Mike

    2018-06-12

    On today's High Performance Computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS provides collection, transport, and storage of data from extreme-scale systems at fidelities and timescales to provide understanding of application and system performance with no statistically significant impact on application performance.

  14. R&D100: Lightweight Distributed Metric Service

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gentile, Ann; Brandt, Jim; Tucker, Tom

    2015-11-19

    On today's High Performance Computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS provides collection, transport, and storage of data from extreme-scale systems at fidelities and timescales to provide understanding of application and system performance with no statistically significant impact on application performance.

  15. Competitive exclusion: an ecological model demonstrates how research metrics can drive women out of science

    NASA Astrophysics Data System (ADS)

    O'Brien, K.; Hapgood, K.

    2012-12-01

    While universities are often perceived within the wider population as a flexible family-friendly work environment, continuous full-time employment remains the norm in tenure track roles. This traditional career path is strongly re-inforced by research metrics, which typically measure accumulated historical performance. There is a strong feedback between historical and future research output, and there is a minimum threshold of research output below which it becomes very difficult to attract funding, high quality students and collaborators. The competing timescales of female fertility and establishment of a research career mean that many women do not exceed this threshold before having children. Using a mathematical model taken from an ecological analogy, we demonstrate how these mechanisms create substantial barriers to pursuing a research career while working part-time or returning from extended parental leave. The model highlights a conundrum for research managers: metrics can promote research productivity and excellence within an organisation, but can classify highly capable scientists as poor performers simply because they have not followed the traditional career path of continuous full-time employment. Based on this analysis, we make concrete recommendations for researchers and managers seeking to retain the skills and training invested in female scientists. We also provide survival tactics for women and men who wish to pursue a career in science while also spending substantial time and energy raising their family.

  16. Quantitative comparison using Generalized Relative Object Detectability (G-ROD) metrics of an amorphous selenium detector with high resolution Microangiographic Fluoroscopes (MAF) and standard flat panel detectors (FPD).

    PubMed

    Russ, M; Shankar, A; Jain, A; Setlur Nagesh, S V; Ionita, C N; Scott, C; Karim, K S; Bednarek, D R; Rudin, S

    2016-02-27

    A novel amorphous selenium (a-Se) direct detector with CMOS readout has been designed, and relative detector performance investigated. The detector features include a 25 μm pixel pitch, and 1000 μm thick a-Se layer operating at 10 V/μm bias field. A simulated detector DQE was determined, and used in comparative calculations of the Relative Object Detectability (ROD) family of prewhitening matched-filter (PWMF) observer and non-prewhitening matched filter (NPWMF) observer model metrics to gauge a-Se detector performance against existing high resolution micro-angiographic fluoroscopic (MAF) detectors and a standard flat panel detector (FPD). The PWMF-ROD or ROD metric compares two x-ray imaging detectors in their relative abilities in imaging a given object by taking the integral over spatial frequencies of the Fourier transform of the detector DQE weighted by an object function, divided by the comparable integral for a different detector. The generalized-ROD (G-ROD) metric incorporates clinically relevant parameters (focal-spot size, magnification, and scatter) to show the degradation in imaging performance for detectors that are part of an imaging chain. Preliminary ROD calculations using simulated spheres as the object predicted superior imaging performance by the a-Se detector as compared to existing detectors. New PWMF-G-ROD and NPWMF-G-ROD results still indicate better performance by the a-Se detector in an imaging chain over all sphere sizes for various focal spot sizes and magnifications, although a-Se performance advantages were degraded by focal spot blurring. Nevertheless, the a-Se technology has great potential to provide breakthrough abilities such as visualization of fine details including of neuro-vascular perforator vessels and of small vascular devices.
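
    One way to write down the ROD ratio described above is given below. This is a reading of the abstract's verbal description, not the authors' stated equation: the symbols are introduced here for illustration, and the weighting W(u,v) is assumed to be a non-negative function derived from the Fourier transform of the object.

```latex
% Assumed notation: (u,v) are spatial frequencies, DQE_A and DQE_B are the
% detective quantum efficiencies of the two detectors being compared, and
% W(u,v) is an object-dependent weighting (an assumption of this sketch).
\[
  \mathrm{ROD}_{A/B}
  = \frac{\displaystyle\iint W(u,v)\,\mathrm{DQE}_A(u,v)\,du\,dv}
         {\displaystyle\iint W(u,v)\,\mathrm{DQE}_B(u,v)\,du\,dv}
\]
```

    Under this reading, a value above 1 favors detector A for the given object, and the generalized form would replace each DQE with one that also accounts for focal-spot size, magnification, and scatter.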

  17. Quantitative comparison using generalized relative object detectability (G-ROD) metrics of an amorphous selenium detector with high resolution microangiographic fluoroscopes (MAF) and standard flat panel detectors (FPD)

    NASA Astrophysics Data System (ADS)

    Russ, M.; Shankar, A.; Jain, A.; Setlur Nagesh, S. V.; Ionita, C. N.; Scott, C.; Karim, K. S.; Bednarek, D. R.; Rudin, S.

    2016-03-01

    A novel amorphous selenium (a-Se) direct detector with CMOS readout has been designed, and relative detector performance investigated. The detector features include a 25μm pixel pitch, and 1000μm thick a-Se layer operating at 10V/μm bias field. A simulated detector DQE was determined, and used in comparative calculations of the Relative Object Detectability (ROD) family of prewhitening matched-filter (PWMF) observer and non-pre-whitening matched filter (NPWMF) observer model metrics to gauge a-Se detector performance against existing high resolution micro-angiographic fluoroscopic (MAF) detectors and a standard flat panel detector (FPD). The PWMF-ROD or ROD metric compares two x-ray imaging detectors in their relative abilities in imaging a given object by taking the integral over spatial frequencies of the Fourier transform of the detector DQE weighted by an object function, divided by the comparable integral for a different detector. The generalized-ROD (G-ROD) metric incorporates clinically relevant parameters (focal-spot size, magnification, and scatter) to show the degradation in imaging performance for detectors that are part of an imaging chain. Preliminary ROD calculations using simulated spheres as the object predicted superior imaging performance by the a-Se detector as compared to existing detectors. New PWMF-G-ROD and NPWMF-G-ROD results still indicate better performance by the a-Se detector in an imaging chain over all sphere sizes for various focal spot sizes and magnifications, although a-Se performance advantages were degraded by focal spot blurring. Nevertheless, the a-Se technology has great potential to provide breakthrough abilities such as visualization of fine details including of neuro-vascular perforator vessels and of small vascular devices.

  18. Quantitative comparison using Generalized Relative Object Detectability (G-ROD) metrics of an amorphous selenium detector with high resolution Microangiographic Fluoroscopes (MAF) and standard flat panel detectors (FPD)

    PubMed Central

    Russ, M.; Shankar, A.; Jain, A.; Setlur Nagesh, S. V.; Ionita, C. N.; Scott, C.; Karim, K. S.; Bednarek, D. R.; Rudin, S.

    2017-01-01

    A novel amorphous selenium (a-Se) direct detector with CMOS readout has been designed, and relative detector performance investigated. The detector features include a 25μm pixel pitch, and 1000μm thick a-Se layer operating at 10V/μm bias field. A simulated detector DQE was determined, and used in comparative calculations of the Relative Object Detectability (ROD) family of prewhitening matched-filter (PWMF) observer and non-prewhitening matched filter (NPWMF) observer model metrics to gauge a-Se detector performance against existing high resolution micro-angiographic fluoroscopic (MAF) detectors and a standard flat panel detector (FPD). The PWMF-ROD or ROD metric compares two x-ray imaging detectors in their relative abilities in imaging a given object by taking the integral over spatial frequencies of the Fourier transform of the detector DQE weighted by an object function, divided by the comparable integral for a different detector. The generalized-ROD (G-ROD) metric incorporates clinically relevant parameters (focal-spot size, magnification, and scatter) to show the degradation in imaging performance for detectors that are part of an imaging chain. Preliminary ROD calculations using simulated spheres as the object predicted superior imaging performance by the a-Se detector as compared to existing detectors. New PWMF-G-ROD and NPWMF-G-ROD results still indicate better performance by the a-Se detector in an imaging chain over all sphere sizes for various focal spot sizes and magnifications, although a-Se performance advantages were degraded by focal spot blurring. Nevertheless, the a-Se technology has great potential to provide breakthrough abilities such as visualization of fine details including of neuro-vascular perforator vessels and of small vascular devices. PMID:28615795

  19. Task-Driven Comparison of Topic Models.

    PubMed

    Alexander, Eric; Gleicher, Michael

    2016-01-01

    Topic modeling, a method of statistically extracting thematic content from a large collection of texts, is used for a wide variety of tasks within text analysis. Though there are a growing number of tools and techniques for exploring single models, comparisons between models are generally reduced to a small set of numerical metrics. These metrics may or may not reflect a model's performance on the analyst's intended task, and can therefore be insufficient to diagnose what causes differences between models. In this paper, we explore task-centric topic model comparison, considering how we can both provide detail for a more nuanced understanding of differences and address the wealth of tasks for which topic models are used. We derive comparison tasks from single-model uses of topic models, which predominantly fall into the categories of understanding topics, understanding similarity, and understanding change. Finally, we provide several visualization techniques that facilitate these tasks, including buddy plots, which combine color and position encodings to allow analysts to readily view changes in document similarity.

  20. Land use regression models for the oxidative potential of fine particles (PM2.5) in five European areas.

    PubMed

    Gulliver, John; Morley, David; Dunster, Chrissi; McCrea, Adrienne; van Nunen, Erik; Tsai, Ming-Yi; Probst-Hensch, Nicoltae; Eeftens, Marloes; Imboden, Medea; Ducret-Stich, Regina; Naccarati, Alessio; Galassi, Claudia; Ranzi, Andrea; Nieuwenhuijsen, Mark; Curto, Ariadna; Donaire-Gonzalez, David; Cirach, Marta; Vermeulen, Roel; Vineis, Paolo; Hoek, Gerard; Kelly, Frank J

    2018-01-01

    Oxidative potential (OP) of particulate matter (PM) is proposed as a biologically-relevant exposure metric for studies of air pollution and health. We aimed to evaluate the spatial variability of the OP of measured PM2.5 using ascorbate (AA) and (reduced) glutathione (GSH), and develop land use regression (LUR) models to explain this spatial variability. We estimated annual average values (m⁻³) of OP AA and OP GSH for five areas (Basel, CH; Catalonia, ES; London-Oxford, UK (no OP GSH); the Netherlands; and Turin, IT) using PM2.5 filters. OP AA and OP GSH LUR models were developed using all monitoring sites, separately for each area and combined-areas. The same variables were then used in repeated sub-sampling of monitoring sites to test sensitivity of variable selection; new variables were offered where variables were excluded (p > .1). On average, measurements of OP AA and OP GSH were moderately correlated (maximum Pearson's R = .7) with PM2.5 and other metrics (PM2.5 absorbance, NO2, Cu, Fe). HOV (hold-out validation) R² for OP AA models was .21, .58, .45, .53, and .13 for Basel, Catalonia, London-Oxford, the Netherlands and Turin respectively. For OP GSH, the only model achieving at least moderate performance was for the Netherlands (R² = .31). Combined models for OP AA and OP GSH were largely explained by study area with weak local predictors of intra-area contrasts; we therefore do not endorse them for use in epidemiologic studies. Given the moderate correlation of OP AA with other pollutants, the three reasonably performing LUR models for OP AA could be used independently of other pollutant metrics in epidemiological studies. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Advanced Life Support System Value Metric

    NASA Technical Reports Server (NTRS)

    Jones, Harry W.; Arnold, James O. (Technical Monitor)

    1999-01-01

    The NASA Advanced Life Support (ALS) Program is required to provide a performance metric to measure its progress in system development. Extensive discussions within the ALS program have reached a consensus. The Equivalent System Mass (ESM) metric has been traditionally used and provides a good summary of the weight, size, and power cost factors of space life support equipment. But ESM assumes that all the systems being traded off exactly meet a fixed performance requirement, so that the value and benefit (readiness, performance, safety, etc.) of all the different systems designs are exactly equal. This is too simplistic. Actual system design concepts are selected using many cost and benefit factors and the system specification is then set accordingly. The ALS program needs a multi-parameter metric including both the ESM and a System Value Metric (SVM). The SVM would include safety, maintainability, reliability, performance, use of cross cutting technology, and commercialization potential. Another major factor in system selection is technology readiness level (TRL), a familiar metric in ALS. The overall ALS system metric that is suggested is a benefit/cost ratio, [SVM + TRL]/ESM, with appropriate weighting and scaling. The total value is the sum of SVM and TRL. Cost is represented by ESM. The paper provides a detailed description and example application of the suggested System Value Metric.
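
    The benefit/cost ratio [SVM + TRL]/ESM described in this record can be illustrated with a minimal Python sketch. The abstract does not give the "appropriate weighting and scaling", so the weights `w_svm` and `w_trl`, the function name, and every number below are hypothetical placeholders rather than values from the paper.

```python
def als_system_metric(svm, trl, esm, w_svm=1.0, w_trl=1.0):
    """Benefit/cost ratio: (w_svm * SVM + w_trl * TRL) / ESM.

    svm: System Value Metric score (benefit)
    trl: technology readiness level (benefit)
    esm: Equivalent System Mass (cost)
    The weights are assumed; the abstract does not specify them.
    """
    if esm <= 0:
        raise ValueError("ESM must be positive")
    return (w_svm * svm + w_trl * trl) / esm


# Two made-up design options; the numbers are illustrative only.
option_a = als_system_metric(svm=6.0, trl=4.0, esm=500.0)
option_b = als_system_metric(svm=8.0, trl=7.0, esm=1000.0)

# Option B has the higher total value, but option A has the better ratio
# because its ESM is half as large.
print(option_a, option_b)
```

    With equal weights, a design with a lower total value can still rank higher when its ESM is sufficiently smaller, which is the trade-off a pure ESM comparison would hide.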

  2. Water Resource Planning Under Future Climate and Socioeconomic Uncertainty in the Cauvery River Basin in Karnataka, India

    PubMed Central

    Conway, Declan; Dessai, Suraje; Stainforth, David A.

    2018-01-01

    Decision‐Making Under Uncertainty (DMUU) approaches have been less utilized in developing countries than developed countries for water resources contexts. High climate vulnerability and rapid socioeconomic change often characterize developing country contexts, making DMUU approaches relevant. We develop an iterative multi‐method DMUU approach, including scenario generation, coproduction with stakeholders and water resources modeling. We apply this approach to explore the robustness of adaptation options and pathways against future climate and socioeconomic uncertainties in the Cauvery River Basin in Karnataka, India. A water resources model is calibrated and validated satisfactorily using observed streamflow. Plausible future changes in Indian Summer Monsoon (ISM) precipitation and water demand are used to drive simulations of water resources from 2021 to 2055. Two stakeholder‐identified decision‐critical metrics are examined: a basin‐wide metric comprising legal instream flow requirements for the downstream state of Tamil Nadu, and a local metric comprising water supply reliability to Bangalore city. In model simulations, the ability to satisfy these performance metrics without adaptation is reduced under almost all scenarios. Implementing adaptation options can partially offset the negative impacts of change. Sequencing of options according to stakeholder priorities into Adaptation Pathways affects metric satisfaction. Early focus on agricultural demand management improves the robustness of pathways but trade‐offs emerge between intrabasin and basin‐wide water availability. We demonstrate that the fine balance between water availability and demand is vulnerable to future changes and uncertainty. Despite current and long‐term planning challenges, stakeholders in developing countries may engage meaningfully in coproduction approaches for adaptation decision‐making under deep uncertainty. PMID:29706676

  3. Water Resource Planning Under Future Climate and Socioeconomic Uncertainty in the Cauvery River Basin in Karnataka, India.

    PubMed

    Bhave, Ajay Gajanan; Conway, Declan; Dessai, Suraje; Stainforth, David A

    2018-02-01

    Decision-Making Under Uncertainty (DMUU) approaches have been less utilized in developing countries than developed countries for water resources contexts. High climate vulnerability and rapid socioeconomic change often characterize developing country contexts, making DMUU approaches relevant. We develop an iterative multi-method DMUU approach, including scenario generation, coproduction with stakeholders and water resources modeling. We apply this approach to explore the robustness of adaptation options and pathways against future climate and socioeconomic uncertainties in the Cauvery River Basin in Karnataka, India. A water resources model is calibrated and validated satisfactorily using observed streamflow. Plausible future changes in Indian Summer Monsoon (ISM) precipitation and water demand are used to drive simulations of water resources from 2021 to 2055. Two stakeholder-identified decision-critical metrics are examined: a basin-wide metric comprising legal instream flow requirements for the downstream state of Tamil Nadu, and a local metric comprising water supply reliability to Bangalore city. In model simulations, the ability to satisfy these performance metrics without adaptation is reduced under almost all scenarios. Implementing adaptation options can partially offset the negative impacts of change. Sequencing of options according to stakeholder priorities into Adaptation Pathways affects metric satisfaction. Early focus on agricultural demand management improves the robustness of pathways but trade-offs emerge between intrabasin and basin-wide water availability. We demonstrate that the fine balance between water availability and demand is vulnerable to future changes and uncertainty. Despite current and long-term planning challenges, stakeholders in developing countries may engage meaningfully in coproduction approaches for adaptation decision-making under deep uncertainty.

  4. Water Resource Planning Under Future Climate and Socioeconomic Uncertainty in the Cauvery River Basin in Karnataka, India

    NASA Astrophysics Data System (ADS)

    Bhave, Ajay Gajanan; Conway, Declan; Dessai, Suraje; Stainforth, David A.

    2018-02-01

    Decision-Making Under Uncertainty (DMUU) approaches have been less utilized in developing countries than developed countries for water resources contexts. High climate vulnerability and rapid socioeconomic change often characterize developing country contexts, making DMUU approaches relevant. We develop an iterative multi-method DMUU approach, including scenario generation, coproduction with stakeholders and water resources modeling. We apply this approach to explore the robustness of adaptation options and pathways against future climate and socioeconomic uncertainties in the Cauvery River Basin in Karnataka, India. A water resources model is calibrated and validated satisfactorily using observed streamflow. Plausible future changes in Indian Summer Monsoon (ISM) precipitation and water demand are used to drive simulations of water resources from 2021 to 2055. Two stakeholder-identified decision-critical metrics are examined: a basin-wide metric comprising legal instream flow requirements for the downstream state of Tamil Nadu, and a local metric comprising water supply reliability to Bangalore city. In model simulations, the ability to satisfy these performance metrics without adaptation is reduced under almost all scenarios. Implementing adaptation options can partially offset the negative impacts of change. Sequencing of options according to stakeholder priorities into Adaptation Pathways affects metric satisfaction. Early focus on agricultural demand management improves the robustness of pathways but trade-offs emerge between intrabasin and basin-wide water availability. We demonstrate that the fine balance between water availability and demand is vulnerable to future changes and uncertainty. Despite current and long-term planning challenges, stakeholders in developing countries may engage meaningfully in coproduction approaches for adaptation decision-making under deep uncertainty.

  5. Guiding principles and checklist for population-based quality metrics.

    PubMed

    Krishnan, Mahesh; Brunelli, Steven M; Maddux, Franklin W; Parker, Thomas F; Johnson, Douglas; Nissenson, Allen R; Collins, Allan; Lacson, Eduardo

    2014-06-06

    The Centers for Medicare and Medicaid Services oversees the ESRD Quality Incentive Program to ensure that the highest quality of health care is provided by outpatient dialysis facilities that treat patients with ESRD. To that end, Centers for Medicare and Medicaid Services uses clinical performance measures to evaluate quality of care under a pay-for-performance or value-based purchasing model. Now more than ever, the ESRD therapeutic area serves as the vanguard of health care delivery. By translating medical evidence into clinical performance measures, the ESRD Prospective Payment System became the first disease-specific sector using the pay-for-performance model. A major challenge for the creation and implementation of clinical performance measures is the adjustments that are necessary to transition from taking care of individual patients to managing the care of patient populations. The National Quality Forum and others have developed effective and appropriate population-based clinical performance measures quality metrics that can be aggregated at the physician, hospital, dialysis facility, nursing home, or surgery center level. Clinical performance measures considered for endorsement by the National Quality Forum are evaluated using five key criteria: evidence, performance gap, and priority (impact); reliability; validity; feasibility; and usability and use. We have developed a checklist of special considerations for clinical performance measure development according to these National Quality Forum criteria. Although the checklist is focused on ESRD, it could also have broad application to chronic disease states, where health care delivery organizations seek to enhance quality, safety, and efficiency of their services. Clinical performance measures are likely to become the norm for tracking performance for health care insurers. Thus, it is critical that the methodologies used to develop such metrics serve the payer and the provider and most importantly, reflect what represents the best care to improve patient outcomes. Copyright © 2014 by the American Society of Nephrology.

  6. A Deep Similarity Metric Learning Model for Matching Text Chunks to Spatial Entities

    NASA Astrophysics Data System (ADS)

    Ma, K.; Wu, L.; Tao, L.; Li, W.; Xie, Z.

    2017-12-01

    The matching of spatial entities with related text is a long-standing research topic that has received considerable attention over the years. This task aims to enrich the contents of a spatial entity and to attach spatial location information to the text chunk. In the data fusion field, matching spatial entities with the corresponding descriptive text chunks is of broad significance. However, most traditional matching methods rely fully on manually designed, task-specific linguistic features. This work proposes a Deep Similarity Metric Learning Model (DSMLM) based on a Siamese neural network to learn a similarity metric directly from the textual attributes of the spatial entity and the text chunk. Low-dimensional feature representations of the spatial entity and the text chunk can be learned separately. By employing the cosine distance to measure the matching degree between the vectors, the model can make the vectors of matching pairs as close as possible. Meanwhile, it pushes mismatched pairs as far apart as possible through supervised learning. In addition, extensive experiments and analysis on geological survey data sets show that DSMLM can effectively capture the matching characteristics between the text chunk and the spatial entity, and achieve state-of-the-art performance.
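
    The cosine-based matching described in this record can be sketched in a few lines of Python. This is not the DSMLM network itself: the two input vectors stand in for the learned embeddings of a text chunk and a spatial entity, and the contrastive-style loss with its `margin` value is an assumed, commonly used form that the abstract does not spell out.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def contrastive_loss(u, v, is_match, margin=0.5):
    """Assumed pairwise loss: pull matches together, push mismatches apart.

    Matching pairs are penalized by their cosine distance (1 - similarity);
    mismatched pairs are penalized only while their similarity exceeds
    the (hypothetical) margin.
    """
    sim = cosine_similarity(u, v)
    if is_match:
        return 1.0 - sim
    return max(0.0, sim - margin)


# Hypothetical 2-d embeddings of a text chunk and two spatial entities.
text_vec = [1.0, 0.0]
same_entity = [1.0, 0.0]
other_entity = [0.0, 1.0]
print(contrastive_loss(text_vec, same_entity, is_match=True))
print(contrastive_loss(text_vec, other_entity, is_match=False))
```

    In a Siamese setup, both inputs would pass through encoders with shared weights, and a loss of this kind would be minimized over labeled matching and mismatching pairs.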

  7. Comparison of power curve monitoring methods

    NASA Astrophysics Data System (ADS)

    Cambron, Philippe; Masson, Christian; Tahan, Antoine; Torres, David; Pelletier, Francis

    2017-11-01

    Performance monitoring is an important aspect of operating wind farms. This can be done through the power curve monitoring (PCM) of wind turbines (WT). In recent years, important work has been conducted on PCM. Various methodologies have been proposed, each one with interesting results. However, it is difficult to compare these methods because they have been developed using their respective data sets. The objective of the present work is to compare some of the proposed PCM methods using common data sets. The metric used to compare the PCM methods is the time needed to detect a change in the power curve. Two power curve models will be covered to establish the effect the model type has on the monitoring outcomes. Each model was tested with two control charts. Other methodologies and metrics proposed in the literature for power curve monitoring, such as areas under the power curve and the use of statistical copulas, have also been covered. Results demonstrate that model-based PCM methods are more reliable at detecting a performance change than other methodologies and that the effectiveness of the control chart depends on the types of shift observed.
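
    The comparison metric in this record, the time needed to detect a change in the power curve, can be illustrated with a small Python sketch. The abstract does not name the two control charts, so a one-sided CUSUM chart is assumed here purely for illustration; the reference value `k`, the decision limit `h`, and the synthetic residuals are likewise hypothetical.

```python
def cusum_detection_time(residuals, k=0.5, h=5.0):
    """Index of the first sample at which a lower one-sided CUSUM signals.

    residuals: standardized (measured - modeled) power, one value per sample.
    k: reference value (allowance), in standard deviations.
    h: decision limit, in standard deviations.
    Returns None if the chart never signals.
    """
    s = 0.0
    for t, r in enumerate(residuals):
        # Accumulate under-performance; reset to zero when there is none.
        s = max(0.0, s - r - k)
        if s > h:
            return t
    return None


# Synthetic example: in-control residuals, then a sustained drop of
# 1.5 standard deviations starting at sample 20.
change_at = 20
residuals = [0.0] * change_at + [-1.5] * 15
detected_at = cusum_detection_time(residuals)
delay = detected_at - change_at  # samples elapsed after the first shifted one
print(detected_at, delay)
```

    Running the same residual series through different charts, or residuals from different power curve models through the same chart, and comparing the resulting delays is one way to set up the kind of comparison the abstract describes.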

  8. Integrated Resilient Aircraft Control Project Full Scale Flight Validation

    NASA Technical Reports Server (NTRS)

    Bosworth, John T.

    2009-01-01

    Objective: Provide validation of adaptive control law concepts through full scale flight evaluation. Technical Approach: a) Engage failure mode - destabilizing or frozen surface. b) Perform formation flight and air-to-air tracking tasks. Evaluate adaptive algorithm: a) Stability metrics. b) Model following metrics. Full scale flight testing provides an ability to validate different adaptive flight control approaches. Full scale flight testing adds credence to NASA's research efforts. A sustained research effort is required to remove the road blocks and provide adaptive control as a viable design solution for increased aircraft resilience.

  9. Performance Optimizing Adaptive Control with Time-Varying Reference Model Modification

    NASA Technical Reports Server (NTRS)

    Nguyen, Nhan T.; Hashemi, Kelley E.

    2017-01-01

    This paper presents a new adaptive control approach that involves a performance optimization objective. The control synthesis involves the design of a performance optimizing adaptive controller from a subset of control inputs. The resulting effect of the performance optimizing adaptive controller is to modify the initial reference model into a time-varying reference model which satisfies the performance optimization requirement obtained from an optimal control problem. The time-varying reference model modification is accomplished by the real-time solutions of the time-varying Riccati and Sylvester equations coupled with the least-squares parameter estimation of the sensitivities of the performance metric. The effectiveness of the proposed method is demonstrated by an application of maneuver load alleviation control for a flexible aircraft.

  10. Development and validation of a septoplasty training model using 3-dimensional printing technology.

    PubMed

    AlReefi, Mahmoud A; Nguyen, Lily H P; Mongeau, Luc G; Haq, Bassam Ul; Boyanapalli, Siddharth; Hafeez, Nauman; Cegarra-Escolano, Francois; Tewfik, Marc A

    2017-04-01

    Providing alternative training modalities may improve trainees' ability to perform septoplasty. Three-dimensional printing has been shown to be a powerful tool in surgical training. The objectives of this study were to explain the development of our 3-dimensional (3D) printed septoplasty training model, to assess its face and content validity, and to present evidence supporting its ability to distinguish between levels of surgical proficiency. Imaging data of a patient with a nasal septal deviation was selected for printing. Printing materials reproducing the mechanical properties of human tissues were selected based on literature review and prototype testing. Eight expert rhinologists, 6 senior residents, and 6 junior residents performed endoscopic septoplasties on the model and completed a postsimulation survey. Performance metrics in quality (final product analysis), efficiency (time), and safety (eg, perforation length, nares damage) were recorded and analyzed in a study-blind manner. The model was judged to be anatomically correct and the steps performed realistic, with scores of 4.05 ± 0.82 and 4.2 ± 1, respectively, on a 5-point Likert scale. Ninety-two percent of residents desired the simulator to be integrated into their teaching curriculum. There was a significant difference (p < 0.05) between the expert, intermediate, and novice groups in time taken and nares cuts, whereas other performance metrics showed no significant difference. To our knowledge, there are no other simulator training models for septoplasty. Our model incorporates 2 different materials mixed into the 3 relevant consistencies necessary to simulate septoplasty. Our findings provide evidence supporting the validity of the model. © 2016 ARS-AAOA, LLC.

  11. A software quality model and metrics for risk assessment

    NASA Technical Reports Server (NTRS)

    Hyatt, L.; Rosenberg, L.

    1996-01-01

    A software quality model and its associated attributes are defined and used as the model for the basis for a discussion on risk. Specific quality goals and attributes are selected based on their importance to a software development project and their ability to be quantified. Risks that can be determined by the model's metrics are identified. A core set of metrics relating to the software development process and its products is defined. Measurements for each metric and their usability and applicability are discussed.

  12. Evaluating synoptic systems in the CMIP5 climate models over the Australian region

    NASA Astrophysics Data System (ADS)

    Gibson, Peter B.; Uotila, Petteri; Perkins-Kirkpatrick, Sarah E.; Alexander, Lisa V.; Pitman, Andrew J.

    2016-10-01

    Climate models are our principal tool for generating the projections used to inform climate change policy. Our confidence in projections depends, in part, on how realistically they simulate present day climate and associated variability over a range of time scales. Traditionally, climate models are less commonly assessed at time scales relevant to daily weather systems. Here we explore the utility of a self-organizing maps (SOMs) procedure for evaluating the frequency, persistence and transitions of daily synoptic systems in the Australian region simulated by state-of-the-art global climate models. In terms of skill in simulating the climatological frequency of synoptic systems, large spread was observed between models. A positive association between all metrics was found, implying that relative skill in simulating the persistence and transitions of systems is related to skill in simulating the climatological frequency. Considering all models and metrics collectively, model performance was found to be related to model horizontal resolution but unrelated to vertical resolution or representation of the stratosphere. In terms of the SOM procedure, the timespan over which evaluation was performed had some influence on model performance skill measures, as did the number of circulation types examined. These findings have implications for selecting models most useful for future projections over the Australian region, particularly for projections related to synoptic scale processes and phenomena. More broadly, this study has demonstrated the utility of the SOMs procedure in providing a process-based evaluation of climate models.

  13. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    PubMed Central

    Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters. PMID:27391786
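
    The three stand-alone quality metrics named in this record (modularity, conductance, and coverage) can be computed for a toy graph with the Python sketch below. It uses the standard textbook definitions for an undirected, unweighted graph without self-loops; the paper's exact variants may differ, and the function name and example graph are hypothetical.

```python
def cluster_quality(edges, clusters):
    """Return (modularity, coverage, per-cluster conductance).

    edges: list of (u, v) pairs of an undirected simple graph.
    clusters: list of disjoint node sets covering every node in edges.
    Assumes each cluster and its complement touch at least one edge.
    """
    m = len(edges)
    label = {node: i for i, c in enumerate(clusters) for node in c}
    n_clusters = len(clusters)
    intra = [0] * n_clusters   # edges with both endpoints in the cluster
    cut = [0] * n_clusters     # edges leaving the cluster
    volume = [0] * n_clusters  # sum of node degrees in the cluster
    for u, v in edges:
        cu, cv = label[u], label[v]
        volume[cu] += 1
        volume[cv] += 1
        if cu == cv:
            intra[cu] += 1
        else:
            cut[cu] += 1
            cut[cv] += 1
    modularity = sum(
        intra[i] / m - (volume[i] / (2 * m)) ** 2 for i in range(n_clusters)
    )
    coverage = sum(intra) / m
    conductance = [
        cut[i] / min(volume[i], 2 * m - volume[i]) for i in range(n_clusters)
    ]
    return modularity, coverage, conductance


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = [{0, 1, 2}, {3, 4, 5}]
modularity, coverage, conductance = cluster_quality(edges, clusters)
print(modularity, coverage, conductance)
```

    Higher modularity and coverage indicate better clusterings under these definitions, whereas lower conductance is better, so the metrics are not directly interchangeable even on a graph this small.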

  14. Quality assessment of color images based on the measure of just noticeable color difference

    NASA Astrophysics Data System (ADS)

    Chou, Chun-Hsien; Hsu, Yun-Hsiang

    2014-01-01

    Accurate assessment of the quality of color images is an important step in many image processing systems that convey visual information of the reproduced images. An accurate objective image quality assessment (IQA) method is expected to give an assessment result that agrees closely with the subjective assessment. To assess the quality of color images, many approaches simply apply the metric for assessing the quality of gray scale images to each of three color channels of the color image, neglecting the correlation among three color channels. In this paper, a metric for assessing color images' quality is proposed, in which the model of variable just-noticeable color difference (VJNCD) is employed to estimate the visibility thresholds of distortion inherent in each color pixel. With the estimated visibility thresholds of distortion, the proposed metric measures the average perceptible distortion in terms of the quantized distortion according to the perceptual error map similar to that defined by National Bureau of Standards (NBS) for converting the color difference enumerated by CIEDE2000 to the objective score of perceptual quality assessment. The perceptual error map in this case is designed for each pixel according to the visibility threshold estimated by the VJNCD model. The performance of the proposed metric is verified by assessing the test images in the LIVE database, and is compared with those of many well-known IQA metrics. Experimental results indicate that the proposed metric is an effective IQA method that can accurately predict the image quality of color images in terms of the correlation between objective scores and subjective evaluation.

  15. Performance metrics for the assessment of satellite data products: an ocean color case study

    EPA Science Inventory

    Performance assessment of ocean color satellite data has generally relied on statistical metrics chosen for their common usage and the rationale for selecting certain metrics is infrequently explained. Commonly reported statistics based on mean squared errors, such as the coeffic...

  16. Performance Metrics for Soil Moisture Retrievals and Applications Requirements

    USDA-ARS?s Scientific Manuscript database

    Quadratic performance metrics such as root-mean-square error (RMSE) and time series correlation are often used to assess the accuracy of geophysical retrievals and true fields. These metrics are generally related; nevertheless each has advantages and disadvantages. In this study we explore the relat...
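
    One well-known way in which RMSE and correlation are related is the identity RMSE² = bias² + σ_est² + σ_truth² − 2·r·σ_est·σ_truth, which holds when population (divide-by-n) standard deviations are used. The Python sketch below checks it numerically; the soil moisture values are made up, and the sketch is not taken from the truncated abstract above.

```python
import math


def error_metrics(estimate, truth):
    """Return (rmse, bias, correlation, sd_estimate, sd_truth).

    Standard deviations use the population form (divide by n) so that
    the decomposition of the mean squared error holds exactly.
    """
    n = len(truth)
    mean_e = sum(estimate) / n
    mean_t = sum(truth) / n
    bias = mean_e - mean_t
    var_e = sum((x - mean_e) ** 2 for x in estimate) / n
    var_t = sum((y - mean_t) ** 2 for y in truth) / n
    cov = sum((x - mean_e) * (y - mean_t) for x, y in zip(estimate, truth)) / n
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(estimate, truth)) / n)
    corr = cov / math.sqrt(var_e * var_t)
    return rmse, bias, corr, math.sqrt(var_e), math.sqrt(var_t)


# Made-up volumetric soil moisture values (m3/m3), for illustration only.
truth = [0.10, 0.20, 0.30, 0.25]
estimate = [0.12, 0.25, 0.28, 0.31]
rmse, bias, corr, sd_est, sd_truth = error_metrics(estimate, truth)

# Right-hand side of the decomposition; it should equal rmse ** 2.
decomposed = bias ** 2 + sd_est ** 2 + sd_truth ** 2 - 2 * corr * sd_est * sd_truth
print(rmse ** 2, decomposed)
```

    The identity shows why neither metric alone is sufficient: a retrieval can have high correlation yet a large RMSE if its bias or its variability differs from that of the reference.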

  17. Assessing the Effects of Data Compression in Simulations Using Physically Motivated Metrics

    DOE PAGES

    Laney, Daniel; Langer, Steven; Weber, Christopher; ...

    2014-01-01

    This paper examines whether lossy compression can be used effectively in physics simulations as a possible strategy to combat the expected data-movement bottleneck in future high performance computing architectures. We show that, for the codes and simulations we tested, compression levels of 3–5X can be applied without causing significant changes to important physical quantities. Rather than applying signal processing error metrics, we utilize physics-based metrics appropriate for each code to assess the impact of compression. We evaluate three different simulation codes: a Lagrangian shock-hydrodynamics code, an Eulerian higher-order hydrodynamics turbulence modeling code, and an Eulerian coupled laser-plasma interaction code. We compress relevant quantities after each time-step to approximate the effects of tightly coupled compression and study the compression rates to estimate memory and disk-bandwidth reduction. We find that the error characteristics of compression algorithms must be carefully considered in the context of the underlying physics being modeled.

  18. A Topology Control Strategy with Reliability Assurance for Satellite Cluster Networks in Earth Observation

    PubMed Central

    Chen, Qing; Zhang, Jinxiu; Hu, Ze

    2017-01-01

    This article investigates the dynamic topology control problem of satellite cluster networks (SCNs) in Earth observation (EO) missions by applying a novel metric of stability for inter-satellite links (ISLs). The properties of the periodicity and predictability of satellites’ relative position are involved in the link cost metric which is to give a selection criterion for choosing the most reliable data routing paths. Also, a cooperative work model with reliability is proposed for the situation of emergency EO missions. Based on the link cost metric and the proposed reliability model, a reliability assurance topology control algorithm and its corresponding dynamic topology control (RAT) strategy are established to maximize the stability of data transmission in the SCNs. The SCNs scenario is tested through some numeric simulations of the topology stability of average topology lifetime and average packet loss rate. Simulation results show that the proposed reliable strategy applied in SCNs significantly improves the data transmission performance and prolongs the average topology lifetime. PMID:28241474

  19. A Topology Control Strategy with Reliability Assurance for Satellite Cluster Networks in Earth Observation.

    PubMed

    Chen, Qing; Zhang, Jinxiu; Hu, Ze

    2017-02-23

    This article investigates the dynamic topology control problem of satellite cluster networks (SCNs) in Earth observation (EO) missions by applying a novel metric of stability for inter-satellite links (ISLs). The properties of the periodicity and predictability of satellites' relative position are involved in the link cost metric which is to give a selection criterion for choosing the most reliable data routing paths. Also, a cooperative work model with reliability is proposed for the situation of emergency EO missions. Based on the link cost metric and the proposed reliability model, a reliability assurance topology control algorithm and its corresponding dynamic topology control (RAT) strategy are established to maximize the stability of data transmission in the SCNs. The SCNs scenario is tested through some numeric simulations of the topology stability of average topology lifetime and average packet loss rate. Simulation results show that the proposed reliable strategy applied in SCNs significantly improves the data transmission performance and prolongs the average topology lifetime.

  20. Test of the FLRW Metric and Curvature with Strong Lens Time Delays

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liao, Kai; Li, Zhengxiang; Wang, Guo-Jian

    We present a new model-independent strategy for testing the Friedmann–Lemaître–Robertson–Walker (FLRW) metric and constraining cosmic curvature, based on future time-delay measurements of strongly lensed quasar-elliptical galaxy systems from the Large Synoptic Survey Telescope and supernova observations from the Dark Energy Survey. The test only relies on geometric optics. It is independent of the energy contents of the universe and the validity of the Einstein equation on cosmological scales. The study comprises two levels: testing the FLRW metric through the distance sum rule (DSR) and determining/constraining cosmic curvature. We propose an effective and efficient (redshift) evolution model for performing the former test, which allows us to concretely specify the violation criterion for the FLRW DSR. If the FLRW metric is consistent with the observations, then on the second level the cosmic curvature parameter will be constrained to ∼0.057 or ∼0.041 (1σ), depending on the availability of high-redshift supernovae, which is much more stringent than current model-independent techniques. We also show that the bias in the time-delay method might be well controlled, leading to robust results. The proposed method is a new independent tool for both testing the fundamental assumptions of homogeneity and isotropy in cosmology and for determining cosmic curvature. It is complementary to cosmic microwave background plus baryon acoustic oscillation analyses, which normally assume a cosmological model with dark energy domination in the late-time universe.

  1. Self-supervised online metric learning with low rank constraint for scene categorization.

    PubMed

    Cong, Yang; Liu, Ji; Yuan, Junsong; Luo, Jiebo

    2013-08-01

    Conventional visual recognition systems usually train an image classifier in a batch mode with all training data provided in advance. However, in many practical applications, only a small number of training samples are available in the beginning and many more would come sequentially during online recognition. Because the image data characteristics could change over time, it is important for the classifier to adapt to the new data incrementally. In this paper, we present an online metric learning method to address the online scene recognition problem via adaptive similarity measurement. Given a number of labeled data followed by a sequential input of unseen testing samples, the similarity metric is learned to maximize the margin of the distance among different classes of samples. By considering the low rank constraint, our online metric learning model not only can provide competitive performance compared with the state-of-the-art methods, but also guarantees convergence. A bi-linear graph is also defined to model the pair-wise similarity, and an unseen sample is labeled depending on the graph-based label propagation, while the model can also self-update using the more confident new samples. With the ability of online learning, our methodology can well handle the large-scale streaming video data with the ability of incremental self-updating. We apply our model to online scene categorization; experiments on various benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness and efficiency of our algorithm.

  2. Quantitative criteria for assessment of gamma-ray imager performance

    NASA Astrophysics Data System (ADS)

    Gottesman, Steve; Keller, Kristi; Malik, Hans

    2015-08-01

    In recent years gamma ray imagers such as the GammaCam™ and Polaris have demonstrated good imaging performance in the field. Imager performance is often summarized as "resolution", either angular, or spatial at some distance from the imager; however, the definition of resolution is not always related to the ability to image an object. It is difficult to quantitatively compare imagers without a common definition of image quality. This paper examines three categories of definition: point source; line source; and area source. It discusses the details of those definitions and which ones are more relevant for different situations. Metrics such as Full Width Half Maximum (FWHM), variations on the Rayleigh criterion, and some analogous to National Imagery Interpretability Rating Scale (NIIRS) are discussed. The performance against these metrics is evaluated for a high resolution coded aperture imager modeled using Monte Carlo N-Particle (MCNP), and for a medium resolution imager measured in the lab.

  3. A priori discretization quality metrics for distributed hydrologic modeling applications

    NASA Astrophysics Data System (ADS)

    Liu, Hongli; Tolson, Bryan; Craig, James; Shafii, Mahyar; Basu, Nandita

    2016-04-01

    In distributed hydrologic modelling, a watershed is treated as a set of small homogeneous units that address the spatial heterogeneity of the watershed being simulated. The ability of models to reproduce observed spatial patterns firstly depends on the spatial discretization, which is the process of defining homogeneous units in the form of grid cells, subwatersheds, or hydrologic response units etc. It is common for hydrologic modelling studies to simply adopt a nominal or default discretization strategy without formally assessing alternative discretization levels. This approach lacks formal justifications and is thus problematic. More formalized discretization strategies are either a priori or a posteriori with respect to building and running a hydrologic simulation model. A posteriori approaches tend to be ad-hoc and compare model calibration and/or validation performance under various watershed discretizations. The construction and calibration of multiple versions of a distributed model can become a seriously limiting computational burden. Current a priori approaches are more formalized and compare overall heterogeneity statistics of dominant variables between candidate discretization schemes and input data or reference zones. While a priori approaches are efficient and do not require running a hydrologic model, they do not fully investigate the internal spatial pattern changes of variables of interest. Furthermore, the existing a priori approaches focus on landscape and soil data and do not assess impacts of discretization on stream channel definition even though its significance has been noted by numerous studies. 
    The primary goals of this study are to (1) introduce new a priori discretization quality metrics considering the spatial pattern changes of model input data; and (2) introduce a two-step discretization decision-making approach to compress extreme errors and meet user-specified discretization expectations through non-uniform discretization threshold modification. The metrics provide, for the first time, quantification of the routing-relevant information loss due to discretization according to the relationship between in-channel routing length and flow velocity. Moreover, they identify and count the spatial pattern changes of dominant hydrological variables by overlaying candidate discretization schemes upon input data and accumulating variable changes in an area-weighted way. The metrics are straightforward and applicable to any semi-distributed or fully distributed hydrological model with grid scales greater than input data resolutions. The discretization metrics and decision-making approach are applied to the Grand River watershed located in southwestern Ontario, Canada, where discretization decisions are required for a semi-distributed modelling application. Results show that discretization-induced information loss monotonically increases as discretization gets coarser. With regard to routing information loss in subbasin discretization, multiple points of interest rather than just the watershed outlet should be considered. Moreover, subbasin and HRU discretization decisions should not be considered independently, since subbasin input significantly influences the complexity of the HRU discretization result. Finally, results show that the common and convenient approach of making uniform discretization decisions across the watershed domain performs worse than a metric-informed non-uniform discretization approach, since the latter is able to conserve more watershed heterogeneity under the same model complexity (number of computational units).

  4. Virtual reality, ultrasound-guided liver biopsy simulator: development and performance discrimination

    PubMed Central

    Johnson, S J; Hunt, C M; Woolnough, H M; Crawshaw, M; Kilkenny, C; Gould, D A; England, A; Sinha, A; Villard, P F

    2012-01-01

    Objectives The aim of this article was to identify and prospectively investigate simulated ultrasound-guided targeted liver biopsy performance metrics as differentiators between levels of expertise in interventional radiology. Methods Task analysis produced detailed procedural step documentation allowing identification of critical procedure steps and performance metrics for use in a virtual reality ultrasound-guided targeted liver biopsy procedure. Consultant (n=14; male=11, female=3) and trainee (n=26; male=19, female=7) scores on the performance metrics were compared. Ethical approval was granted by the Liverpool Research Ethics Committee (UK). Independent t-tests and analysis of variance (ANOVA) investigated differences between groups. Results Independent t-tests revealed significant differences between trainees and consultants on three performance metrics: targeting, p=0.018, t=−2.487 (−2.040 to −0.207); probe usage time, p = 0.040, t=2.132 (11.064 to 427.983); mean needle length in beam, p=0.029, t=−2.272 (−0.028 to −0.002). ANOVA reported significant differences across years of experience (0–1, 1–2, 3+ years) on seven performance metrics: no-go area touched, p=0.012; targeting, p=0.025; length of session, p=0.024; probe usage time, p=0.025; total needle distance moved, p=0.038; number of skin contacts, p<0.001; total time in no-go area, p=0.008. More experienced participants consistently received better performance scores on all 19 performance metrics. Conclusion It is possible to measure and monitor performance using simulation, with performance metrics providing feedback on skill level and differentiating levels of expertise. However, a transfer of training study is required. PMID:21304005

  5. NEW CATEGORICAL METRICS FOR AIR QUALITY MODEL EVALUATION

    EPA Science Inventory

    Traditional categorical metrics used in model evaluations are "clear-cut" measures in that the model's ability to predict an exceedance is defined by a fixed threshold concentration and the metrics are defined by observation-forecast sets that are paired both in space and time. T...

  6. Multi-objective optimization for evaluation of simulation fidelity for precipitation, cloudiness and insolation in regional climate models

    NASA Astrophysics Data System (ADS)

    Lee, H.

    2016-12-01

    Precipitation is one of the most important climate variables that are taken into account in studying regional climate. Nevertheless, how precipitation will respond to a changing climate and even its mean state in the current climate are not well represented in regional climate models (RCMs). Hence, comprehensive and mathematically rigorous methodologies to evaluate precipitation and related variables in multiple RCMs are required. The main objective of the current study is to evaluate the joint variability of climate variables related to model performance in simulating precipitation and condense multiple evaluation metrics into a single summary score. We use multi-objective optimization, a mathematical process that provides a set of optimal tradeoff solutions based on a range of evaluation metrics, to characterize the joint representation of precipitation, cloudiness and insolation in RCMs participating in the North American Regional Climate Change Assessment Program (NARCCAP) and Coordinated Regional Climate Downscaling Experiment-North America (CORDEX-NA). We also leverage ground observations, NASA satellite data and the Regional Climate Model Evaluation System (RCMES). Overall, the quantitative comparison of joint probability density functions between the three variables indicates that performance of each model differs markedly between sub-regions and also shows strong seasonal dependence. Because of the large variability across the models, it is important to evaluate models systematically and make future projections using only models showing relatively good performance. Our results indicate that the optimized multi-model ensemble always shows better performance than the arithmetic ensemble mean and may guide reliable future projections.

  7. The Plumbing of Land Surface Models: Is Poor Performance a Result of Methodology or Data Quality?

    NASA Technical Reports Server (NTRS)

    Haughton, Ned; Abramowitz, Gab; Pitman, Andy J.; Or, Dani; Best, Martin J.; Johnson, Helen R.; Balsamo, Gianpaolo; Boone, Aaron; Cuntz, Matthais; Decharme, Bertrand

    2016-01-01

    The PALS Land sUrface Model Benchmarking Evaluation pRoject (PLUMBER) illustrated the value of prescribing a priori performance targets in model intercomparisons. It showed that the performance of turbulent energy flux predictions from different land surface models, at a broad range of flux tower sites using common evaluation metrics, was on average worse than relatively simple empirical models. For sensible heat fluxes, all land surface models were outperformed by a linear regression against downward shortwave radiation. For latent heat flux, all land surface models were outperformed by a regression against downward shortwave, surface air temperature and relative humidity. These results are explored here in greater detail and possible causes are investigated. We examine whether particular metrics or sites unduly influence the collated results, whether results change according to time-scale aggregation and whether a lack of energy conservation in flux tower data gives the empirical models an unfair advantage in the intercomparison. We demonstrate that energy conservation in the observational data is not responsible for these results. We also show that the partitioning between sensible and latent heat fluxes in LSMs, rather than the calculation of available energy, is the cause of the original findings. Finally, we present evidence suggesting that the nature of this partitioning problem is likely shared among all contributing LSMs. While we do not find a single candidate explanation for why land surface models perform poorly relative to empirical benchmarks in PLUMBER, we do exclude multiple possible explanations and provide guidance on where future research should focus.

  8. Automated Metrics in a Virtual-Reality Myringotomy Simulator: Development and Construct Validity.

    PubMed

    Huang, Caiwen; Cheng, Horace; Bureau, Yves; Ladak, Hanif M; Agrawal, Sumit K

    2018-06-15

    The objectives of this study were: 1) to develop and implement a set of automated performance metrics into the Western myringotomy simulator, and 2) to establish construct validity. Prospective simulator-based assessment study. The Auditory Biophysics Laboratory at Western University, London, Ontario, Canada. Eleven participants were recruited from the Department of Otolaryngology-Head & Neck Surgery at Western University: four senior otolaryngology consultants and seven junior otolaryngology residents. Educational simulation. Discrimination between expert and novice participants on five primary automated performance metrics: 1) time to completion, 2) surgical errors, 3) incision angle, 4) incision length, and 5) the magnification of the microscope. Automated performance metrics were developed, programmed, and implemented into the simulator. Participants were given a standardized simulator orientation and instructions on myringotomy and tube placement. Each participant then performed 10 procedures and automated metrics were collected. The metrics were analyzed using the Mann-Whitney U test with Bonferroni correction. All metrics discriminated senior otolaryngologists from junior residents with a significance of p < 0.002. Junior residents had 2.8 times more errors compared with the senior otolaryngologists. Senior otolaryngologists took significantly less time to completion compared with junior residents. The senior group also had significantly longer incision lengths, more accurate incision angles, and lower magnification keeping both the umbo and annulus in view. Automated quantitative performance metrics were successfully developed and implemented, and construct validity was established by discriminating between expert and novice participants.

  9. Person re-identification over camera networks using multi-task distance metric learning.

    PubMed

    Ma, Lianyang; Yang, Xiaokang; Tao, Dacheng

    2014-08-01

    Person reidentification in a camera network is a valuable yet challenging problem to solve. Existing methods learn a common Mahalanobis distance metric by using the data collected from different cameras and then exploit the learned metric for identifying people in the images. However, the cameras in a camera network have different settings and the recorded images are seriously affected by variability in illumination conditions, camera viewing angles, and background clutter. Using a common metric to conduct person reidentification tasks on different camera pairs overlooks the differences in camera settings; however, it is very time-consuming to label people manually in images from surveillance videos. For example, in most existing person reidentification data sets, only one image of a person is collected from each of only two cameras; therefore, directly learning a unique Mahalanobis distance metric for each camera pair is susceptible to over-fitting by using insufficiently labeled data. In this paper, we reformulate person reidentification in a camera network as a multitask distance metric learning problem. The proposed method designs multiple Mahalanobis distance metrics to cope with the complicated conditions that exist in typical camera networks. We address the fact that these Mahalanobis distance metrics are different but related, and learned by adding joint regularization to alleviate over-fitting. Furthermore, by extending, we present a novel multitask maximally collapsing metric learning (MtMCML) model for person reidentification in a camera network. Experimental results demonstrate that formulating person reidentification over camera networks as multitask distance metric learning problem can improve performance, and our proposed MtMCML works substantially better than other current state-of-the-art person reidentification methods.
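
    As a rough illustration of the distance family discussed in the record above, the sketch below computes a Mahalanobis distance d_M(x, y) = sqrt((x - y)^T M (x - y)) with a separate matrix M per camera pair. It is not the MtMCML method itself: the matrices, feature vectors, and the names `mahalanobis`, `M_pair_ab`, and `M_pair_ac` are hypothetical values chosen for illustration, and each M is assumed to be positive semidefinite.

```python
import math


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    d = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M d
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    # Quadratic form d^T (M d)
    return math.sqrt(sum(d[i] * Md[i] for i in range(len(d))))


# Hypothetical metrics for two camera pairs: related, but not identical.
M_pair_ab = [[1.0, 0.0], [0.0, 1.0]]  # identity, i.e. plain Euclidean distance
M_pair_ac = [[4.0, 0.0], [0.0, 1.0]]  # weights the first feature more heavily

probe = [3.0, 4.0]
gallery = [0.0, 0.0]
dist_ab = mahalanobis(probe, gallery, M_pair_ab)
dist_ac = mahalanobis(probe, gallery, M_pair_ac)
```

    With the identity matrix the distance reduces to the Euclidean distance; a learned M reweights and correlates feature dimensions, which is why different camera pairs can rank the same candidates differently.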

  10. Metrics to assess injury prevention programs for young workers in high-risk occupations: a scoping review of the literature.

    PubMed

    Jennifer, Smith; Purewal, Birinder Praneet; Macpherson, Alison; Pike, Ian

    2018-05-01

    Despite legal protections for young workers in Canada, youth aged 15-24 are at high risk of traumatic occupational injury. While many injury prevention initiatives targeting young workers exist, the challenge faced by youth advocates and employers is deciding what aspect(s) of prevention will be the most effective focus for their efforts. A review of the academic and grey literatures was undertaken to compile the metrics (both the indicators being evaluated and the methods of measurement) commonly used to assess injury prevention programs for young workers. Metrics are standards of measurement through which efficiency, performance, progress, or quality of a plan, process, or product can be assessed. A PICO framework was used to develop search terms. Medline, PubMed, OVID, EMBASE, CCOHS, PsychINFO, CINAHL, NIOSHTIC, Google Scholar and the grey literature were searched for articles in English, published between 1975-2015. Two independent reviewers screened the resulting list and categorized the metrics in three domains of injury prevention: Education, Environment and Enforcement. Of 174 acquired articles meeting the inclusion criteria, 21 both described and assessed an intervention. Half were educational in nature (N=11). Commonly assessed metrics included: knowledge, perceptions, self-reported behaviours or intentions, hazardous exposures, injury claims, and injury counts. One study outlined a method for developing metrics to predict injury rates. Metrics specific to the evaluation of young worker injury prevention programs are needed, as current metrics are insufficient to predict reduced injuries following program implementation. One study, which the review brought to light, could be an appropriate model for future research to develop valid leading metrics specific to young workers, and then apply these metrics to injury prevention programs for youth.

  11. Automated map sharpening by maximization of detail and connectivity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Terwilliger, Thomas C.; Sobolev, Oleg V.; Afonine, Pavel V.

    An algorithm for automatic map sharpening is presented that is based on optimization of the detail and connectivity of the sharpened map. The detail in the map is reflected in the surface area of an iso-contour surface that contains a fixed fraction of the volume of the map, where a map with high level of detail has a high surface area. The connectivity of the sharpened map is reflected in the number of connected regions defined by the same iso-contour surfaces, where a map with high connectivity has a small number of connected regions. By combining these two measures in a metric termed the `adjusted surface area', map quality can be evaluated in an automated fashion. This metric was used to choose optimal map-sharpening parameters without reference to a model or other interpretations of the map. Map sharpening by optimization of the adjusted surface area can be carried out for a map as a whole or it can be carried out locally, yielding a locally sharpened map. To evaluate the performance of various approaches, a simple metric based on map–model correlation that can reproduce visual choices of optimally sharpened maps was used. The map–model correlation is calculated using a model with B factors (atomic displacement factors; ADPs) set to zero. Finally, this model-based metric was used to evaluate map sharpening and to evaluate map-sharpening approaches, and it was found that optimization of the adjusted surface area can be an effective tool for map sharpening.

  12. Automated map sharpening by maximization of detail and connectivity

    DOE PAGES

    Terwilliger, Thomas C.; Sobolev, Oleg V.; Afonine, Pavel V.; ...

    2018-05-18

    An algorithm for automatic map sharpening is presented that is based on optimization of the detail and connectivity of the sharpened map. The detail in the map is reflected in the surface area of an iso-contour surface that contains a fixed fraction of the volume of the map, where a map with high level of detail has a high surface area. The connectivity of the sharpened map is reflected in the number of connected regions defined by the same iso-contour surfaces, where a map with high connectivity has a small number of connected regions. By combining these two measures in a metric termed the `adjusted surface area', map quality can be evaluated in an automated fashion. This metric was used to choose optimal map-sharpening parameters without reference to a model or other interpretations of the map. Map sharpening by optimization of the adjusted surface area can be carried out for a map as a whole or it can be carried out locally, yielding a locally sharpened map. To evaluate the performance of various approaches, a simple metric based on map–model correlation that can reproduce visual choices of optimally sharpened maps was used. The map–model correlation is calculated using a model with B factors (atomic displacement factors; ADPs) set to zero. Finally, this model-based metric was used to evaluate map sharpening and to evaluate map-sharpening approaches, and it was found that optimization of the adjusted surface area can be an effective tool for map sharpening.

  13. Differential correlation for sequencing data.

    PubMed

    Siska, Charlotte; Kechris, Katerina

    2017-01-19

    Several methods have been developed to identify differential correlation (DC) between pairs of molecular features from -omics studies. Most DC methods have only been tested with microarrays and other platforms producing continuous and Gaussian-like data. Sequencing data is in the form of counts, often modeled with a negative binomial distribution making it difficult to apply standard correlation metrics. We have developed an R package for identifying DC called Discordant which uses mixture models for correlations between features and the Expectation Maximization (EM) algorithm for fitting parameters of the mixture model. Several correlation metrics for sequencing data are provided and tested using simulations. Other extensions in the Discordant package include additional modeling for different types of differential correlation, and faster implementation, using a subsampling routine to reduce run-time and address the assumption of independence between molecular feature pairs. With simulations and breast cancer miRNA-Seq and RNA-Seq data, we find that Spearman's correlation has the best performance among the tested correlation methods for identifying differential correlation. Application of Spearman's correlation in the Discordant method demonstrated the most power in ROC curves and sensitivity/specificity plots, and improved ability to identify experimentally validated breast cancer miRNA. We also considered including additional types of differential correlation, which showed a slight reduction in power due to the additional parameters that need to be estimated, but more versatility in applications. Finally, subsampling within the EM algorithm considerably decreased run-time with negligible effect on performance. A new method and R package called Discordant is presented for identifying differential correlation with sequencing data. 
    Based on comparisons with different correlation metrics, this study suggests Spearman's correlation is appropriate for sequencing data, but other correlation metrics are available to the user depending on the application and data type. The Discordant method can also be extended to investigate additional DC types and subsampling with the EM algorithm is now available for reduced run-time. These extensions to the R package make Discordant more robust and versatile for multiple -omics studies.
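
    To make concrete why a rank-based correlation suits count data, the following minimal sketch computes Spearman's correlation as the Pearson correlation of average ranks. It is written in Python for illustration and does not reproduce the Discordant R package or its API; the function names and the count vectors are hypothetical, and the inputs are assumed to be non-constant.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical read counts for two features across four samples.
counts_x = [1, 10, 100, 1000]
counts_y = [2, 3, 5, 9]
rho = spearman(counts_x, counts_y)
```

    Because only the ordering of the counts enters the calculation, a skewed but monotone relationship like the one above still yields a correlation of 1, whereas Pearson's correlation on the raw counts would be pulled down by the largest values.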

  14. Ranking of stopping criteria for log domain diffeomorphic demons application in clinical radiation therapy.

    PubMed

    Peroni, M; Golland, P; Sharp, G C; Baroni, G

    2011-01-01

    Deformable Image Registration is a complex optimization algorithm with the goal of modeling a non-rigid transformation between two images. A crucial issue in this field is guaranteeing the user a robust but computationally reasonable algorithm. We rank the performances of four stopping criteria and six stopping value computation strategies for a log domain deformable registration. The stopping criteria we test are: (a) velocity field update magnitude, (b) vector field Jacobian, (c) mean squared error, and (d) harmonic energy. Experiments demonstrate that comparing the metric value over the last three iterations with the metric minimum of between four and six previous iterations is a robust and appropriate strategy. The harmonic energy and vector field update magnitude metrics give the best results in terms of robustness and speed of convergence.

  15. Wafer hot spot identification through advanced photomask characterization techniques: part 2

    NASA Astrophysics Data System (ADS)

    Choi, Yohan; Green, Michael; Cho, Young; Ham, Young; Lin, Howard; Lan, Andy; Yang, Richer; Lung, Mike

    2017-03-01

    Historically, 1D metrics such as Mean to Target (MTT) and CD Uniformity (CDU) have been adequate for mask end users to evaluate and predict the mask impact on the wafer process. However, the wafer lithographer's process margin is shrinking at advanced nodes to a point that classical mask CD metrics are no longer adequate to gauge the mask contribution to wafer process error. For example, wafer CDU error at advanced nodes is impacted by mask factors such as 3-dimensional (3D) effects and mask pattern fidelity on sub-resolution assist features (SRAFs) used in Optical Proximity Correction (OPC) models of ever-increasing complexity. To overcome the limitation of 1D metrics, there are numerous on-going industry efforts to better define wafer-predictive metrics through both standard mask metrology and aerial CD methods. Even with these improvements, the industry continues to struggle to define useful correlative metrics that link the mask to final device performance. In part 1 of this work, we utilized advanced mask pattern characterization techniques to extract potential hot spots on the mask and link them, theoretically, to issues with final wafer performance. In this paper, part 2, we complete the work by verifying these techniques at wafer level. The test vehicle (TV) that was used for hot spot detection on the mask in part 1 will be used to expose wafers. The results will be used to verify the mask-level predictions. Finally, wafer performance with predicted and verified mask/wafer condition will be shown as the result of advanced mask characterization. The goal is to maximize mask end user yield through mask-wafer technology harmonization. This harmonization will provide the necessary feedback to determine optimum design, mask specifications, and mask-making conditions for optimal wafer process margin.

  16. Persuasive communication: A theoretical model for changing the attitude of preservice elementary teachers toward metric conversion

    NASA Astrophysics Data System (ADS)

    Shrigley, Robert L.

    This study was based on Hovland's four-part statement, Who says what to whom with what effect, the rationale for persuasive communication, a theoretical model for modifying attitudes. Part I was a survey of 139 preservice elementary teachers from which were generated the more credible characteristics of metric instructors, a central element in the who component of Hovland's model. They were: (1) background in mathematics and science, (2) fluency in metrics, (3) capability of thinking metrically, (4) a record of excellent teaching, (5) previous teaching of metric measurement to children, (6) responsibility for teaching metric content in methods courses and (7) an open enthusiasm for metric conversion. Part II was a survey of 45 mathematics educators where belief statements were synthesized for the what component of Hovland's model. It found that math educators support metric measurement because: (1) it is consistent with our monetary system; (2) the conversion of units is easier into metric than English; (3) it is easier to teach and easier to learn than English measurement; there is less need for common fractions; (4) most nations use metric measurement; scientists have used it for decades; (5) American industry has begun to use it; (6) metric measurement will facilitate world trade and communication; and (7) American children will need it as adults; educational agencies are mandating it. With the who and what of Hovland's four-part statement defined, educational researchers now have baseline data to use in testing experimentally the effect of persuasive communication on the attitude of preservice teachers toward metrication.

  17. A Three-Dimensional Receiver Operator Characteristic Surface Diagnostic Metric

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.

    2011-01-01

    Receiver Operator Characteristic (ROC) curves are commonly applied as metrics for quantifying the performance of binary fault detection systems. An ROC curve provides a visual representation of a detection system's True Positive Rate versus False Positive Rate sensitivity as the detection threshold is varied. The area under the curve provides a measure of fault detection performance independent of the applied detection threshold. While the standard ROC curve is well suited for quantifying binary fault detection performance, it is not suitable for quantifying the classification performance of multi-fault classification problems. Furthermore, it does not provide a measure of diagnostic latency. To address these shortcomings, a novel three-dimensional receiver operator characteristic (3D ROC) surface metric has been developed. This is done by generating and applying two separate curves: the standard ROC curve reflecting fault detection performance, and a second curve reflecting fault classification performance. A third dimension, diagnostic latency, is added giving rise to 3D ROC surfaces. Applying numerical integration techniques, the volumes under and between the surfaces are calculated to produce metrics of the diagnostic system's detection and classification performance. This paper will describe the 3D ROC surface metric in detail, and present an example of its application for quantifying the performance of aircraft engine gas path diagnostic methods. Metric limitations and potential enhancements are also discussed.
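
    As an illustration of the standard two-dimensional case only (not the 3D ROC surface of the record above), the following minimal sketch sweeps a detection threshold over scored samples and integrates the resulting curve with the trapezoidal rule. The function names `roc_points` and `auc` and the sample data are hypothetical and chosen purely for demonstration.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) lists obtained by sweeping the detection threshold.

    scores: detector outputs, higher means "more likely a fault".
    labels: 1 for a true fault, 0 for a healthy sample.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score so that lowering the threshold admits samples in order.
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Hypothetical detector scores and ground-truth labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_points(scores, labels)
area = auc(fpr, tpr)
```

    For this toy data the area equals the fraction of (fault, healthy) pairs in which the fault receives the higher score, which is the usual threshold-independent reading of the area under the curve.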

  18. Oscillatory neural network for pattern recognition: trajectory based classification and supervised learning.

    PubMed

    Miller, Vonda H; Jansen, Ben H

    2008-12-01

    Computer algorithms that match human performance in recognizing written text or spoken conversation remain elusive. The reasons why the human brain far exceeds any existing recognition scheme to date in the ability to generalize and to extract invariant characteristics relevant to category matching are not clear. However, it has been postulated that the dynamic distribution of brain activity (spatiotemporal activation patterns) is the mechanism by which stimuli are encoded and matched to categories. This research focuses on supervised learning using a trajectory based distance metric for category discrimination in an oscillatory neural network model. Classification is accomplished using a trajectory based distance metric. Since the distance metric is differentiable, a supervised learning algorithm based on gradient descent is demonstrated. Classification of spatiotemporal frequency transitions and their relation to a priori assessed categories is shown along with the improved classification results after supervised training. The results indicate that this spatiotemporal representation of stimuli and the associated distance metric is useful for simple pattern recognition tasks and that supervised learning improves classification results.

  19. Development of a Multimetric Indicator of Pelagic Zooplankton ...

    EPA Pesticide Factsheets

    We used zooplankton data collected for the 2012 National Lakes Assessment (NLA) to develop multimetric indices (MMIs) for five aggregated ecoregions of the conterminous USA (Coastal Plains, Eastern Highlands, Plains, Upper Midwest, and Western Mountains and Xeric ["West"]). We classified candidate metrics into six categories. We evaluated the performance of candidate metrics, and used metrics that had passed these screens to calculate all possible candidate MMIs that included at least one metric from each category. We selected the candidate MMI that had high responsiveness, a reasonable value for repeatability, low mean pairwise correlation among component metrics, and, when possible, a maximum pairwise correlation among component metrics that was <0.7. We were able to develop MMIs that were sufficiently responsive and repeatable to assess ecological condition for the NLA without the need to reduce the effects of natural variation using models. We did not observe effects of lake size, lake origin, or site depth on the MMIs. The MMIs appear to respond more strongly to increased nutrient concentrations than to shoreline habitat conditions. Improving our understanding of how zooplankton assemblages respond to increased human disturbance, and obtaining more complete autecological information for zooplankton taxa would likely improve MMIs developed for future assessments.

  20. A bridge role metric model for nodes in software networks.

    PubMed

    Li, Bo; Feng, Yanli; Ge, Shiyu; Li, Dashe

    2014-01-01

    A bridge role metric model is put forward in this paper. Compared with previous metric models, our solution of a large-scale object-oriented software system as a complex network is inherently more realistic. To acquire nodes and links in an undirected network, a new model that presents the crucial connectivity of a module or the hub instead of only centrality as in previous metric models is presented. Two previous metric models are described for comparison. In addition, it is obvious that the fitting curve between the Bre results and degrees can well be fitted by a power law. The model represents many realistic characteristics of actual software structures, and a hydropower simulation system is taken as an example. This paper makes additional contributions to an accurate understanding of module design of software systems and is expected to be beneficial to software engineering practices.

  1. A Bridge Role Metric Model for Nodes in Software Networks

    PubMed Central

    Li, Bo; Feng, Yanli; Ge, Shiyu; Li, Dashe

    2014-01-01

    A bridge role metric model is put forward in this paper. Compared with previous metric models, our solution of a large-scale object-oriented software system as a complex network is inherently more realistic. To acquire nodes and links in an undirected network, a new model that presents the crucial connectivity of a module or the hub instead of only centrality as in previous metric models is presented. Two previous metric models are described for comparison. In addition, it is obvious that the fitting curve between the results and degrees can well be fitted by a power law. The model represents many realistic characteristics of actual software structures, and a hydropower simulation system is taken as an example. This paper makes additional contributions to an accurate understanding of module design of software systems and is expected to be beneficial to software engineering practices. PMID:25364938

  2. Generalized two-dimensional (2D) linear system analysis metrics (GMTF, GDQE) for digital radiography systems including the effect of focal spot, magnification, scatter, and detector characteristics.

    PubMed

    Jain, Amit; Kuhls-Gilcrist, Andrew T; Gupta, Sandesh K; Bednarek, Daniel R; Rudin, Stephen

    2010-03-01

    The MTF, NNPS, and DQE are standard linear system metrics used to characterize intrinsic detector performance. To evaluate total system performance for actual clinical conditions, generalized linear system metrics (GMTF, GNNPS and GDQE) that include the effect of the focal spot distribution, scattered radiation, and geometric unsharpness are more meaningful and appropriate. In this study, a two-dimensional (2D) generalized linear system analysis was carried out for a standard flat panel detector (FPD) (194-micron pixel pitch and 600-micron thick CsI) and a newly-developed, high-resolution, micro-angiographic fluoroscope (MAF) (35-micron pixel pitch and 300-micron thick CsI). Realistic clinical parameters and x-ray spectra were used. The 2D detector MTFs were calculated using the new Noise Response method and slanted edge method and 2D focal spot distribution measurements were done using a pin-hole assembly. The scatter fraction, generated for a uniform head equivalent phantom, was measured and the scatter MTF was simulated with a theoretical model. Different magnifications and scatter fractions were used to estimate the 2D GMTF, GNNPS and GDQE for both detectors. Results show spatial non-isotropy for the 2D generalized metrics which provide a quantitative description of the performance of the complete imaging system for both detectors. This generalized analysis demonstrated that the MAF and FPD have similar capabilities at lower spatial frequencies, but that the MAF has superior performance over the FPD at higher frequencies even when considering focal spot blurring and scatter. This 2D generalized performance analysis is a valuable tool to evaluate total system capabilities and to enable optimized design for specific imaging tasks.

  3. A Discrete Velocity Kinetic Model with Food Metric: Chemotaxis Traveling Waves.

    PubMed

    Choi, Sun-Ho; Kim, Yong-Jung

    2017-02-01

    We introduce a mesoscopic scale chemotaxis model for traveling wave phenomena which is induced by food metric. The organisms of this simplified kinetic model have two discrete velocity modes, [Formula: see text] and a constant tumbling rate. The main feature of the model is that the speed of organisms is constant [Formula: see text] with respect to the food metric, not the Euclidean metric. The uniqueness and the existence of the traveling wave solution of the model are obtained. Unlike the classical logarithmic model case there exist traveling waves under super-linear consumption rates and infinite population pulse-type traveling waves are obtained. Numerical simulations are also provided.

  4. DSN Array Simulator

    NASA Technical Reports Server (NTRS)

    Tikidjian, Raffi; Mackey, Ryan

    2008-01-01

    The DSN Array Simulator (wherein 'DSN' signifies NASA's Deep Space Network) is an updated version of software previously denoted the DSN Receive Array Technology Assessment Simulation. This software (see figure) is used for computational modeling of a proposed DSN facility comprising user-defined arrays of antennas and transmitting and receiving equipment for microwave communication with spacecraft on interplanetary missions. The simulation includes variations in spacecraft tracked and communication demand changes for up to several decades of future operation. Such modeling is performed to estimate facility performance, evaluate requirements that govern facility design, and evaluate proposed improvements in hardware and/or software. The updated version of this software affords enhanced capability for characterizing facility performance against user-defined mission sets. The software includes a Monte Carlo simulation component that enables rapid generation of key mission-set metrics (e.g., numbers of links, data rates, and data volumes), and statistical distributions thereof as functions of time. The updated version also offers expanded capability for mixed-asset network modeling--for example, for running scenarios that involve user-definable mixtures of antennas having different diameters (in contradistinction to a fixed number of antennas having the same fixed diameter). The improved version also affords greater simulation fidelity, sufficient for validation by comparison with actual DSN operations and analytically predictable performance metrics.

  5. Routing to preserve energy in wireless networks

    NASA Astrophysics Data System (ADS)

    Block, Frederick J., IV

    Many applications for wireless radio networks require that some or all radios in the network rely on batteries as energy sources. In many cases, battery replacement is infeasible, expensive, or impossible. Communication protocols for such networks should be designed to preserve limited energy supplies. Because the choice of a route to a traffic sink influences how often radios must transmit and receive, poor route selection can quickly deplete the batteries of certain nodes. Previous work has shown that a network's lifetime can be extended by assigning higher routing costs to nodes with little remaining energy and nodes that must use high transmitter power to reach neighbor radios. Although using remaining energy levels in routing metrics can increase network lifetime, in practice, there may be significant error in a node's estimate of its battery level. The effect of battery level uncertainty on routing is examined. Routing metrics are presented that are designed to explicitly account for uncertainty in remaining energy. Simulation results using several statistical models for this uncertainty show that the proposed metrics perform well. In addition to knowledge of current battery levels, estimates of how quickly radios are consuming energy may be helpful in extending network lifetime. We present a family of routing metrics that incorporate a radio's rate of energy consumption. Simulation results show that the proposed family of metrics performs well under a variety of traffic models and network topologies. Route selection can also be complicated by time-varying link conditions. Radios may be subject to interference from other nearby communication systems, hostile jammers, and other, non-communication sources of noise. A route that first appears to have only a small cost may later require much greater energy expenditure when transmitting packets. 
    Frequent route selection can help radios avoid using links with interference, but additional routing control messages increase energy consumption. We investigate the effects of time-varying interference on the lifetime of ad hoc networks. It is shown that there is a tradeoff between packet delay and node lifetime. We show that it is possible to design the system to perform well under a wide variety of channel conditions.
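
    The idea of assigning higher routing costs to nodes with little remaining energy and to nodes that need high transmitter power can be sketched with a shortest-path search. The cost form used here (transmit power divided by the sender's remaining energy) is an assumption made for illustration and is not the metric proposed in the record above; `link_cost`, `cheapest_route`, and the toy network are likewise hypothetical.

```python
import heapq


def link_cost(tx_power, remaining_energy):
    """Hypothetical cost: grows with transmit power, shrinks with sender energy."""
    return tx_power / remaining_energy


def cheapest_route(links, energy, source, sink):
    """Dijkstra search over energy-aware link costs.

    links:  {node: [(neighbor, tx_power), ...]}
    energy: {node: remaining battery energy (must be > 0)}
    Returns the node list from source to sink, or None if unreachable.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == sink:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, power in links.get(node, []):
            new_cost = cost + link_cost(power, energy[node])
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if sink not in best:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy network: relay B is nearly depleted, relay C is not.
links = {"A": [("B", 1.0), ("C", 1.0)], "B": [("D", 1.0)], "C": [("D", 1.0)]}
energy = {"A": 10.0, "B": 1.0, "C": 5.0, "D": 10.0}
route = cheapest_route(links, energy, "A", "D")
```

    With these numbers the search avoids the nearly depleted relay B and routes through C, which is the qualitative behaviour the abstract attributes to energy-aware routing metrics.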

  6. Using Patient Health Questionnaire-9 item parameters of a common metric resulted in similar depression scores compared to independent item response theory model reestimation.

    PubMed

    Liegl, Gregor; Wahl, Inka; Berghöfer, Anne; Nolte, Sandra; Pieh, Christoph; Rose, Matthias; Fischer, Felix

    2016-03-01

    To investigate the validity of a common depression metric in independent samples. We applied a common metrics approach based on item-response theory for measuring depression to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric to reestimated item parameters that derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models. We found different item parameters across samples and estimation methods. Although differences in latent depression scores between different estimation methods were statistically significant, these were clinically irrelevant. Our findings provide evidence that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of reestimating and linking a model. The use of common metric parameters is simple, for example, using a Web application (http://www.common-metrics.org) and offers a long-term perspective to improve the comparability of patient-reported outcome measures. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. To Control False Positives in Gene-Gene Interaction Analysis: Two Novel Conditional Entropy-Based Approaches

    PubMed Central

    Lin, Meihua; Li, Haoli; Zhao, Xiaolei; Qin, Jiheng

    2013-01-01

    Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue to identify the missing genetic components that cannot be detected by using current single-point association analysis. Recently, several model-free methods (e.g. the commonly used information based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they are potentially at risk of inflated false positive error, in particular when the main effects at one or both loci are salient. In this study, we proposed two conditional entropy-based metrics to challenge this limitation. Extensive simulations demonstrated that the two proposed metrics, provided the disease is rare, could maintain consistently correct false positive rate. In the scenarios for a common disease, our proposed metrics achieved better or comparable control of false positive error, compared to four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results or the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising as alternatives to current model-based approaches for detecting genuine epistatic effects. PMID:24339984

  8. A comparison of quantum limited dose and noise equivalent dose

    NASA Astrophysics Data System (ADS)

    Job, Isaias D.; Boyce, Sarah J.; Petrillo, Michael J.; Zhou, Kungang

    2016-03-01

    Quantum-limited-dose (QLD) and noise-equivalent-dose (NED) are performance metrics often used interchangeably. Although the metrics are related, they are not equivalent unless the treatment of electronic noise is carefully considered. These metrics are increasingly important to properly characterize the low-dose performance of flat panel detectors (FPDs). A system can be said to be quantum-limited when the Signal-to-noise-ratio (SNR) is proportional to the square-root of x-ray exposure. Recent experiments utilizing three methods to determine the quantum-limited dose range yielded inconsistent results. To investigate the deviation in results, generalized analytical equations are developed to model the image processing and analysis of each method. We test the generalized expression for both radiographic and fluoroscopic detectors. The resulting analysis shows that total noise content of the images processed by each method are inherently different based on their readout scheme. Finally, it will be shown that the NED is equivalent to the instrumentation-noise-equivalent-exposure (INEE) and furthermore that the NED is derived from the quantum-noise-only method of determining QLD. Future investigations will measure quantum-limited performance of radiographic panels with a modified readout scheme to allow for noise improvements similar to measurements performed with fluoroscopic detectors.

  9. Impact of hydrogeological data on measures of uncertainty, site characterization and environmental performance metrics

    NASA Astrophysics Data System (ADS)

    de Barros, Felipe P. J.; Ezzedine, Souheil; Rubin, Yoram

    2012-02-01

    The significance of conditioning predictions of environmental performance metrics (EPMs) on hydrogeological data in heterogeneous porous media is addressed. Conditioning EPMs on available data reduces uncertainty and increases the reliability of model predictions. We present a rational and concise approach to investigate the impact of conditioning EPMs on data as a function of the location of the environmentally sensitive target receptor, data types and spacing between measurements. We illustrate how the concept of comparative information yield curves introduced in de Barros et al. [de Barros FPJ, Rubin Y, Maxwell R. The concept of comparative information yield curves and its application to risk-based site characterization. Water Resour Res 2009;45:W06401. doi:10.1029/2008WR007324] could be used to assess site characterization needs as a function of flow and transport dimensionality and EPMs. For a given EPM, we show how alternative uncertainty reduction metrics yield distinct gains of information from a variety of sampling schemes. Our results show that uncertainty reduction is EPM dependent (e.g., travel times) and does not necessarily indicate uncertainty reduction in an alternative EPM (e.g., human health risk). The results show how the position of the environmental target, flow dimensionality and the choice of the uncertainty reduction metric can be used to assist in field sampling campaigns.

  10. Contrast model for three-dimensional vehicles in natural lighting and search performance analysis

    NASA Astrophysics Data System (ADS)

    Witus, Gary; Gerhart, Grant R.; Ellis, R. Darin

    2001-09-01

    Ground vehicles in natural lighting tend to have significant and systematic variation in luminance through the presented area. This arises, in large part, from the vehicle surfaces having different orientations and shadowing relative to the source of illumination and the position of the observer. These systematic differences create the appearance of a structured 3D object. The 3D appearance is an important factor in search, figure-ground segregation, and object recognition. We present a contrast metric to predict search and detection performance that accounts for the 3D structure. The approach first computes the contrast of the front (or rear), side, and top surfaces. The vehicle contrast metric is the area-weighted sum of the absolute values of the contrasts of the component surfaces. The 3D structure contrast metric, together with target height, account for more than 80% of the variance in probability of detection and 75% of the variance in search time. When false alarm effects are discounted, they account for 89% of the variance in probability of detection and 95% of the variance in search time. The predictive power of the signature metric, when calibrated to half the data and evaluated against the other half, is 90% of the explanatory power.
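
    The area-weighted sum described above can be written out in a few lines. This is a minimal sketch under two assumptions not stated in the abstract: the weights are each surface's share of the total presented area, and each surface contrast has already been computed as a signed number. The function name `vehicle_contrast` and the example values are hypothetical.

```python
def vehicle_contrast(surfaces):
    """Area-weighted sum of absolute surface contrasts.

    surfaces: list of (presented_area, signed_contrast) pairs, one per
    visible surface (for example front or rear, side, and top).
    Weights are normalized by total presented area (an assumption).
    """
    total_area = sum(area for area, _ in surfaces)
    return sum((area / total_area) * abs(contrast) for area, contrast in surfaces)


# Hypothetical presented areas (m^2) and signed contrasts for three surfaces.
surfaces = [(2.0, 0.3), (5.0, -0.2), (3.0, 0.1)]
metric = vehicle_contrast(surfaces)
```

    Taking absolute values before summing means that a bright surface and a dark surface do not cancel, so a vehicle whose mean luminance matches the background can still score a high contrast.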

  11. An automated, quantitative, and case-specific evaluation of deformable image registration in computed tomography images

    NASA Astrophysics Data System (ADS)

    Kierkels, R. G. J.; den Otter, L. A.; Korevaar, E. W.; Langendijk, J. A.; van der Schaaf, A.; Knopf, A. C.; Sijtsema, N. M.

    2018-02-01

    A prerequisite for adaptive dose-tracking in radiotherapy is the assessment of the deformable image registration (DIR) quality. In this work, various metrics that quantify DIR uncertainties are investigated using realistic deformation fields of 26 head and neck and 12 lung cancer patients. Metrics related to the physiological feasibility (the Jacobian determinant, harmonic energy (HE), and octahedral shear strain (OSS)) and numerical robustness of the deformation (the inverse consistency error (ICE), transitivity error (TE), and distance discordance metric (DDM)) were investigated. The deformable registrations were performed using a B-spline transformation model. The DIR error metrics were log-transformed and correlated (Pearson) against the log-transformed ground-truth error on a voxel level. Correlations of r ⩾ 0.5 were found for the DDM and HE. Given a DIR tolerance threshold of 2.0 mm and a negative predictive value of 0.90, the DDM and HE thresholds were 0.49 mm and 0.014, respectively. In conclusion, the log-transformed DDM and HE can be used to identify voxels at risk for large DIR errors with a large negative predictive value. The HE and/or DDM can therefore be used to perform automated quality assurance of each CT-based DIR for head and neck and lung cancer patients.
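
    The correlation step (Pearson correlation between log-transformed metric values and log-transformed ground-truth errors) can be illustrated as follows. This is a minimal sketch with made-up voxel values; the natural logarithm is an assumption, since the abstract does not state the base, and the helper names `pearson` and `log_pearson` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_pearson(metric_values, error_values):
    """Correlate log(metric) against log(error); all inputs must be positive."""
    return pearson(
        [math.log(v) for v in metric_values],
        [math.log(v) for v in error_values],
    )


# Made-up per-voxel values in which the error follows a power law of the metric,
# so the two are exactly linear after the log transform.
metric_values = [0.1, 0.2, 0.4, 0.8]
error_values = [3.0 * m ** 2 for m in metric_values]
r = log_pearson(metric_values, error_values)
```

    The log transform turns a power-law relation into a linear one, which is why the toy data give a correlation of essentially 1 even though the raw values are not linearly related.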

  12. Scientist-Practitioner Engagement to Inform Regional Hydroclimate Model Evaluation

    NASA Astrophysics Data System (ADS)

    Jones, A. D.; Jagannathan, K. A.; Ullrich, P. A.

    2017-12-01

    Water managers face significant challenges in planning for the coming decades as previously stationary aspects of the regional hydroclimate shift in response to global climate change. Providing scientific insights that enable appropriate use of regional hydroclimate projections for planning is a non-trivial problem. The system of data, models, and methods used to produce regional hydroclimate projections is subject to multiple interacting uncertainties and biases, including uncertainties that arise from general circulation models, re-analysis data products, regional climate models, hydrologic models, and statistical downscaling methods. Moreover, many components of this system were not designed with the information needs of water managers in mind. To address this problem and provide actionable insights into the sources of uncertainty present in regional hydroclimate data products, Project Hyperion has undertaken a stakeholder engagement process in four case study water basins across the US. Teams of water managers and scientists are interacting in a structured manner to identify decision-relevant metrics of model performance. These metrics are in turn being used to drive scientific investigations to uncover the sources of uncertainty in these quantities. Thus far, we have found that identification of climate phenomena of interest to stakeholders is relatively easy, but translating these into specific quantifiable metrics and prioritizing metrics is more challenging. Iterative feedback among scientists and stakeholders has proven critical in resolving these challenges, as have the roles played by boundary spanners who understand and can speak to the perspectives of multiple professional communities. Here we describe the structured format of our engagement process and the lessons learned so far, as we aim to improve the decision-relevance of hydroclimate projections through a collaborative process.

  13. Validation of sea ice models using an uncertainty-based distance metric for multiple model variables: NEW METRIC FOR SEA ICE MODEL VALIDATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Urrego-Blanco, Jorge R.; Hunke, Elizabeth C.; Urban, Nathan M.

    Here, we implement a variance-based distance metric (D_n) to objectively assess skill of sea ice models when multiple output variables or uncertainties in both model predictions and observations need to be considered. The metric compares observations and model data pairs on common spatial and temporal grids improving upon highly aggregated metrics (e.g., total sea ice extent or volume) by capturing the spatial character of model skill. The D_n metric is a gamma-distributed statistic that is more general than the χ² statistic commonly used to assess model fit, which requires the assumption that the model is unbiased and can only incorporate observational error in the analysis. The D_n statistic does not assume that the model is unbiased, and allows the incorporation of multiple observational data sets for the same variable and simultaneously for different variables, along with different types of variances that can characterize uncertainties in both observations and the model. This approach represents a step to establish a systematic framework for probabilistic validation of sea ice models. The methodology is also useful for model tuning by using the D_n metric as a cost function and incorporating model parametric uncertainty as part of a scheme to optimize model functionality. We apply this approach to evaluate different configurations of the standalone Los Alamos sea ice model (CICE) encompassing the parametric uncertainty in the model, and to find new sets of model configurations that produce better agreement than previous configurations between model and observational estimates of sea ice concentration and thickness.

  14. Validation of sea ice models using an uncertainty-based distance metric for multiple model variables: NEW METRIC FOR SEA ICE MODEL VALIDATION

    DOE PAGES

    Urrego-Blanco, Jorge R.; Hunke, Elizabeth C.; Urban, Nathan M.; ...

    2017-04-01

    Here, we implement a variance-based distance metric (D_n) to objectively assess skill of sea ice models when multiple output variables or uncertainties in both model predictions and observations need to be considered. The metric compares observations and model data pairs on common spatial and temporal grids improving upon highly aggregated metrics (e.g., total sea ice extent or volume) by capturing the spatial character of model skill. The D_n metric is a gamma-distributed statistic that is more general than the χ² statistic commonly used to assess model fit, which requires the assumption that the model is unbiased and can only incorporate observational error in the analysis. The D_n statistic does not assume that the model is unbiased, and allows the incorporation of multiple observational data sets for the same variable and simultaneously for different variables, along with different types of variances that can characterize uncertainties in both observations and the model. This approach represents a step to establish a systematic framework for probabilistic validation of sea ice models. The methodology is also useful for model tuning by using the D_n metric as a cost function and incorporating model parametric uncertainty as part of a scheme to optimize model functionality. We apply this approach to evaluate different configurations of the standalone Los Alamos sea ice model (CICE) encompassing the parametric uncertainty in the model, and to find new sets of model configurations that produce better agreement than previous configurations between model and observational estimates of sea ice concentration and thickness.

  15. Landsat phenological metrics and their relation to aboveground carbon in the Brazilian Savanna.

    PubMed

    Schwieder, M; Leitão, P J; Pinto, J R R; Teixeira, A M C; Pedroni, F; Sanchez, M; Bustamante, M M; Hostert, P

    2018-05-15

    The quantification and spatially explicit mapping of carbon stocks in terrestrial ecosystems is important to better understand the global carbon cycle and to monitor and report change processes, especially in the context of international policy mechanisms such as REDD+ or the implementation of Nationally Determined Contributions (NDCs) and the UN Sustainable Development Goals (SDGs). Especially in heterogeneous ecosystems, such as Savannas, where highly variable vegetation densities occur and a strong seasonality hinders consistent data acquisition, accurate carbon quantifications are still lacking. In order to account for these challenges we analyzed the potential of land surface phenological metrics derived from gap-filled 8-day Landsat time series for carbon mapping. We selected three areas located in different subregions in the central Brazil region, which is a prominent example of a Savanna with significant carbon stocks that has been undergoing extensive land cover conversions. Here phenological metrics from the season 2014/2015 were combined with aboveground carbon field samples of cerrado sensu stricto vegetation using Random Forest regression models to map the regional carbon distribution and to analyze the relation between phenological metrics and aboveground carbon. The gap-filling approach made it possible to accurately approximate the original Landsat ETM+ and OLI EVI values and to subsequently derive annual phenological metrics. Random Forest model performances varied between the three study areas with RMSE values of 1.64 t/ha (mean relative RMSE 30%), 2.35 t/ha (46%) and 2.18 t/ha (45%). Comparable relationships between remote sensing based land surface phenological metrics and aboveground carbon were observed in all study areas. Aboveground carbon distributions could be mapped and revealed comprehensible spatial patterns.
    Phenological metrics were derived from 8-day Landsat time series with a spatial resolution that is sufficient to capture gradual changes in carbon stocks of heterogeneous Savanna ecosystems. These metrics revealed the relationship between aboveground carbon and the phenology of the observed vegetation. Our results suggest that metrics relating to the seasonal minimum and maximum values were the most influential variables and bear potential to improve spatially explicit mapping approaches in heterogeneous ecosystems, where both spatial and temporal resolutions are critical.
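
    For readers unfamiliar with the error measures quoted above, the following minimal sketch computes RMSE and a relative RMSE. Dividing by the mean of the observed values is one common convention and is assumed here; the abstract does not say how its relative RMSE was normalized. The function names and the sample numbers are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_rmse(observed, predicted):
    """RMSE divided by the mean observed value (assumed normalization)."""
    return rmse(observed, predicted) / (sum(observed) / len(observed))


# Hypothetical aboveground carbon values in t/ha.
observed = [4.0, 6.0, 5.0, 5.0]
predicted = [5.0, 5.0, 5.0, 7.0]
err = rmse(observed, predicted)
rel = relative_rmse(observed, predicted)
```

    Reporting both values, as the record does, separates the absolute size of the error (in t/ha) from its size relative to typical carbon stocks at a site.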

  16. The Albuquerque Seismological Laboratory Data Quality Analyzer

    NASA Astrophysics Data System (ADS)

    Ringler, A. T.; Hagerty, M.; Holland, J.; Gee, L. S.; Wilson, D.

    2013-12-01

    The U.S. Geological Survey's Albuquerque Seismological Laboratory (ASL) has several efforts underway to improve data quality at its stations. The Data Quality Analyzer (DQA) is one such development. The DQA is designed to characterize station data quality in a quantitative and automated manner. Station quality is based on the evaluation of various metrics, such as timing quality, noise levels, sensor coherence, and so on. These metrics are aggregated into a measurable grade for each station. The DQA consists of a website, a metric calculator (Seedscan), and a PostgreSQL database. The website allows the user to make requests for various time periods, review specific networks and stations, adjust weighting of the station's grade, and plot metrics as a function of time. The website dynamically loads all station data from a PostgreSQL database. The database is central to the application; it acts as a hub where metric values and limited station descriptions are stored. Data is stored at the level of one sensor's channel per day. The database is populated by Seedscan. Seedscan reads and processes miniSEED data, to generate metric values. Seedscan, written in Java, compares hashes of metadata and data to detect changes and perform subsequent recalculations. This ensures that the metric values are up to date and accurate. Seedscan can be run in a scheduled task or on demand by way of a config file. It will compute metrics specified in its configuration file. While many metrics are currently in development, some are completed and being actively used. These include: availability, timing quality, gap count, deviation from the New Low Noise Model, deviation from a station's noise baseline, inter-sensor coherence, and data-synthetic fits. In all, 20 metrics are planned, but any number could be added. ASL is actively using the DQA on a daily basis for station diagnostics and evaluation. 
As Seedscan is scheduled to run every night, data quality analysts are able to then use the website to diagnose changes in noise levels or other anomalous data. This allows for errors to be corrected quickly and efficiently. The code is designed to be flexible for adding metrics and portable for use in other networks. We anticipate further development of the DQA by improving the existing web-interface, adding more metrics, adding an interface to facilitate the verification of historic station metadata and performance, and an interface to allow better monitoring of data quality goals.
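
    The hash comparison that the abstract describes for detecting changed data can be sketched as follows. This is only an illustration: Seedscan itself is written in Java, the abstract does not name the hash algorithm, and SHA-256, the function names and the tuple layout here are all assumptions.

```python
import hashlib

def digest(payload: bytes) -> str:
    """Hex digest of a byte string (SHA-256 is an assumed choice)."""
    return hashlib.sha256(payload).hexdigest()

def needs_recalculation(stored, metadata: bytes, data: bytes) -> bool:
    """Return True when a metric should be recomputed.

    `stored` is a hypothetical (metadata_hash, data_hash) tuple saved with the
    last computed metric value, or None if no value has been computed yet.
    """
    current = (digest(metadata), digest(data))
    return stored != current
```

    Storing one hash pair per channel and day would match the granularity mentioned in the abstract, so that only changed channel-days are recomputed.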

  17. An Evaluation of the IntelliMetric[SM] Essay Scoring System

    ERIC Educational Resources Information Center

    Rudner, Lawrence M.; Garcia, Veronica; Welch, Catherine

    2006-01-01

    This report provides a two-part evaluation of the IntelliMetric[SM] automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test[TM] (GMAT[TM]). The IntelliMetric system performance is first compared to that of individual human raters, a Bayesian system…

  18. A support vector machine for predicting defibrillation outcomes from waveform metrics.

    PubMed

    Howe, Andrew; Escalona, Omar J; Di Maio, Rebecca; Massot, Bertrand; Cromie, Nick A; Darragh, Karen M; Adgey, Jennifer; McEneaney, David J

    2014-03-01

    Algorithms to predict shock success based on VF waveform metrics could significantly enhance resuscitation by optimising the timing of defibrillation. The aim was to investigate robust methods of predicting defibrillation success in VF cardiac arrest patients by using a support vector machine (SVM) optimisation approach. Frequency-domain (AMSA, dominant frequency and median frequency) and time-domain (slope and RMS amplitude) VF waveform metrics were calculated in a 4.1Y window prior to defibrillation. Conventional prediction test validity of each waveform parameter was assessed, and AUC>0.6 was used as the criterion for inclusion as a corroborative attribute processed by the SVM classification model. The latter used a Gaussian radial-basis-function (RBF) kernel and the error penalty factor C was fixed to 1. A two-fold cross-validation resampling technique was employed. A total of 41 patients had 115 defibrillation instances. The AMSA, slope and RMS waveform metrics met the test validity criterion of AUC>0.6 for predicting termination of VF and return-to-organised rhythm. Predictive accuracy of the optimised SVM design for termination of VF was 81.9% (± 1.24 SD); positive and negative predictivity were respectively 84.3% (± 1.98 SD) and 77.4% (± 1.24 SD); sensitivity and specificity were 87.6% (± 2.69 SD) and 71.6% (± 9.38 SD) respectively. AMSA, slope and RMS were the best VF waveform frequency-time parameter predictors of termination of VF according to test validity assessment. This a priori knowledge can be used for a simplified SVM optimised design that combines the predictive attributes of these VF waveform metrics for improved prediction accuracy and generalisation performance without requiring the definition of any threshold value on waveform metrics. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
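
    The performance figures quoted in this abstract (accuracy, positive and negative predictivity, sensitivity, specificity) are standard confusion-matrix ratios. A minimal sketch of how they are derived from counts is given below. The function name is hypothetical and the study's own counts are not given in the abstract, so none are reproduced here.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification ratios from confusion-matrix counts.

    tp/fn: successful shocks predicted as success/failure;
    tn/fp: failed shocks predicted as failure/success (labels are illustrative).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictivity
        "npv": tn / (tn + fn),           # negative predictivity
    }
```

    In a cross-validated setting such as the two-fold scheme mentioned above, these ratios would typically be computed per fold or per resampling run and then summarised as a mean and standard deviation.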

  19. Development of a diatom-based multimetric index for acid mine drainage impacted depressional wetlands.

    PubMed

    Riato, Luisa; Leira, Manel; Della Bella, Valentina; Oberholster, Paul J

    2018-01-15

    Acid mine drainage (AMD) from coal mining in the Mpumalanga Highveld region of South Africa has caused severe chemical and biological degradation of aquatic habitats, specifically depressional wetlands, as mines use these wetlands for storage of AMD. Diatom-based multimetric indices (MMIs) to assess wetland condition have mostly been developed to assess agricultural and urban land use impacts. No diatom MMI of wetland condition has been developed to assess AMD impacts related to mining activities. Previous approaches to diatom-based MMI development in wetlands have not accounted for natural variability. Natural variability among depressional wetlands may influence the accuracy of MMIs. Epiphytic diatom MMIs sensitive to AMD were developed for a range of depressional wetland types to account for natural variation in biological metrics. For this, we classified wetland types based on diatom typologies. A range of 4-15 final metrics were selected from a pool of ~140 candidate metrics to develop the MMIs based on their: (1) broad range, (2) high separation power and (3) low correlation among metrics. Final metrics were selected from three categories: similarity to reference sites, functional groups, and taxonomic composition, which represent different aspects of diatom assemblage structure and function. MMI performances were evaluated according to their precision in distinguishing reference sites, responsiveness to discriminate reference and disturbed sites, sensitivity to human disturbances and relevancy to AMD-related stressors. Each MMI showed excellent discriminatory power, whether or not it accounted for natural variation. However, accounting for variation by grouping sites based on diatom typologies improved overall performance of MMIs. Our study highlights the usefulness of diatom-based metrics and provides a model for the biological assessment of depressional wetland condition in South Africa and elsewhere. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. An optimization based sampling approach for multiple metrics uncertainty analysis using generalized likelihood uncertainty estimation

    NASA Astrophysics Data System (ADS)

    Zhou, Rurui; Li, Yu; Lu, Di; Liu, Haixing; Zhou, Huicheng

    2016-09-01

    This paper investigates the use of an epsilon-dominance non-dominated sorted genetic algorithm II (ɛ-NSGAII) as a sampling approach with the aim of improving sampling efficiency for multiple metrics uncertainty analysis using Generalized Likelihood Uncertainty Estimation (GLUE). The effectiveness of ɛ-NSGAII based sampling is demonstrated by comparison with Latin hypercube sampling (LHS) in terms of sampling efficiency, multiple metrics performance, parameter uncertainty and flood forecasting uncertainty, with a case study of flood forecasting uncertainty evaluation based on the Xinanjiang model (XAJ) for Qing River reservoir, China. The results demonstrate the following advantages of the ɛ-NSGAII based sampling approach in comparison to LHS: (1) It is more effective and efficient than LHS; for example, the simulation time required to generate 1000 behavioral parameter sets is shorter by a factor of 9; (2) The Pareto tradeoffs between metrics are demonstrated clearly with the solutions from ɛ-NSGAII based sampling, and their Pareto optimal values are better than those of LHS, which means better forecasting accuracy of ɛ-NSGAII parameter sets; (3) The parameter posterior distributions from ɛ-NSGAII based sampling are concentrated in the appropriate ranges rather than uniform, which accords with their physical significance, and parameter uncertainties are reduced significantly; (4) The forecasted floods are close to the observations as evaluated by three measures: the normalized total flow outside the uncertainty intervals (FOUI), average relative band-width (RB) and average deviation amplitude (D). The flood forecasting uncertainty is also substantially reduced with ɛ-NSGAII based sampling. This study provides a new sampling approach to improve multiple metrics uncertainty analysis under the framework of GLUE, and could be used to reveal the underlying mechanisms of parameter sets under multiple conflicting metrics in the uncertainty analysis process.
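
    The Pareto tradeoff between conflicting metrics rests on a dominance test, which can be sketched as below. This is not ɛ-NSGAII itself (there is no epsilon grid, ranking or genetic search), only the non-dominated filter that such algorithms build on. It assumes every objective is to be minimised, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Assumes all objectives are minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Example with two error metrics per parameter set (illustrative values only).
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

    Here (3, 4) is dominated by (2, 3) and (5, 5) by (1, 5), so the front keeps the three remaining trade-off solutions.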

  1. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    PubMed

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
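
    As an illustration of a stand-alone cluster quality metric, the sketch below computes the conductance of a single cluster in an undirected, unweighted graph, using the common definition cut(S) / min(vol(S), vol(rest)). The abstract does not say which variant the authors used or how they aggregated over clusters, so this definition and the function name are assumptions.

```python
def conductance(edges, cluster):
    """Conductance of one cluster: boundary edges over the smaller side's volume.

    `edges` is an iterable of (u, v) pairs for an undirected graph;
    `cluster` is the set of nodes forming the cluster.
    """
    cluster = set(cluster)
    cut = 0          # edges with exactly one endpoint in the cluster
    vol_inside = 0   # sum of degrees of nodes inside the cluster
    vol_outside = 0  # sum of degrees of nodes outside the cluster
    for u, v in edges:
        for node in (u, v):
            if node in cluster:
                vol_inside += 1
            else:
                vol_outside += 1
        if (u in cluster) != (v in cluster):
            cut += 1
    smaller = min(vol_inside, vol_outside)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller

# Two triangles joined by a single bridge edge (illustrative graph).
bridge_graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
score = conductance(bridge_graph, {0, 1, 2})
```

    Lower values indicate a better-separated cluster under this definition; the left triangle has one boundary edge against a volume of seven.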

  2. Implementing the Data Center Energy Productivity Metric in a High Performance Computing Data Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sego, Landon H.; Marquez, Andres; Rawson, Andrew

    2013-06-30

    As data centers proliferate in size and number, the improvement of their energy efficiency and productivity has become an economic and environmental imperative. Making these improvements requires metrics that are robust, interpretable, and practical. We discuss the properties of a number of the proposed metrics of energy efficiency and productivity. In particular, we focus on the Data Center Energy Productivity (DCeP) metric, which is the ratio of useful work produced by the data center to the energy consumed performing that work. We describe our approach for using DCeP as the principal outcome of a designed experiment using a highly instrumented, high-performance computing data center. We found that DCeP was successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve energy productivity. We also discuss some of the challenges and benefits associated with implementing the DCeP metric, and we examine the efficacy of the metric in making comparisons within a data center and between data centers.

  3. Markov Modeling of Component Fault Growth over a Derived Domain of Feasible Output Control Effort Modifications

    NASA Technical Reports Server (NTRS)

    Bole, Brian; Goebel, Kai; Vachtsevanos, George

    2012-01-01

    This paper introduces a novel Markov process formulation of stochastic fault growth modeling, in order to facilitate the development and analysis of prognostics-based control adaptation. A metric representing the relative deviation between the nominal output of a system and the net output that is actually enacted by an implemented prognostics-based control routine will be used to define the action space of the formulated Markov process. The state space of the Markov process will be defined in terms of an abstracted metric representing the relative health remaining in each of the system's components. The proposed formulation of component fault dynamics will conveniently relate feasible system output performance modifications to predictions of future component health deterioration.

  4. Understanding software faults and their role in software reliability modeling

    NASA Technical Reports Server (NTRS)

    Munson, John C.

    1994-01-01

    This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability becomes better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. 
The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. 
It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.
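
    The LOC and Stmts example in this abstract can be made concrete with a two-variable principal components sketch. For two metrics, the principal component variances are the eigenvalues of the 2x2 covariance matrix, which have a closed form. The function names are hypothetical, and the sketch works on raw covariances for brevity; an analysis of real software metrics would more likely standardise the variables first.

```python
import math

def covariance_matrix(x, y):
    """Sample variances and covariance of two equally long metric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return sxx, sxy, syy

def principal_variances(x, y):
    """Eigenvalues of the 2x2 covariance matrix, largest first."""
    sxx, sxy, syy = covariance_matrix(x, y)
    centre = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return centre + spread, centre - spread

def share_of_first_component(x, y):
    """Fraction of total variance captured by the first principal component."""
    first, second = principal_variances(x, y)
    return first / (first + second)
```

    When one metric is an exact linear function of the other, as in the one-to-one assembly language case described above, the first component carries essentially all of the variance and the second carries none.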

  5. Metric half-span model support system

    NASA Technical Reports Server (NTRS)

    Jackson, C. M., Jr.; Dollyhigh, S. M.; Shaw, D. S. (Inventor)

    1982-01-01

    A model support system used to support a model in a wind tunnel test section is described. The model comprises a metric, or measured, half-span supported by a nonmetric, or nonmeasured half-span which is connected to a sting support. Moments and forces acting on the metric half-span are measured without interference from the support system during a wind tunnel test.

  6. Super Ensemble-based Aviation Turbulence Guidance (SEATG) for Air Traffic Management (ATM)

    NASA Astrophysics Data System (ADS)

    Kim, Jung-Hoon; Chan, William; Sridhar, Banavar; Sharman, Robert

    2014-05-01

    Super Ensemble (ensemble of ten turbulence metrics from time-lagged ensemble members of weather forecast data)-based Aviation Turbulence Guidance (SEATG) is developed using the Weather Research and Forecasting (WRF) model and in-situ eddy dissipation rate (EDR) observations from equipment on commercial aircraft over the contiguous United States. SEATG is a sequence of five procedures: weather modeling, calculating turbulence metrics, mapping to the EDR scale, evaluating metrics, and producing the final SEATG forecast. It uses a methodology similar to the operational Graphic Turbulence Guidance (GTG), with three major improvements. First, SEATG uses a higher resolution (3-km) WRF model to capture cloud-resolving scale phenomena. Second, SEATG computes turbulence metrics for multiple forecasts that are combined at the same valid time, resulting in a time-lagged ensemble of multiple turbulence metrics. Third, SEATG provides both deterministic and probabilistic turbulence forecasts to take into account weather uncertainties and user demands. It is found that the SEATG forecasts match well with observed radar reflectivity along a surface front as well as convectively induced turbulence outside the clouds on 7-8 Sep 2012. The overall performance skill of deterministic SEATG against the observed EDR data during this period is superior to that of any single turbulence metric. Finally, probabilistic SEATG is used as an example application of turbulence forecasting for air-traffic management. In this study, a simple Wind-Optimal Route (WOR) passing through the potential areas of probabilistic SEATG and a Lateral Turbulence Avoidance Route (LTAR) taking into account the SEATG are calculated at z = 35000 ft (z = 12 km) from Los Angeles to John F. Kennedy international airports. 
As a result, the WOR takes a total of 239 minutes, including 16 minutes in SEATG areas with a 40% potential for moderate turbulence, while the LTAR takes a total of 252 minutes of travel time, consuming an additional 5% of fuel to entirely avoid the moderate SEATG regions.

  7. Citizen science: A new perspective to advance spatial pattern evaluation in hydrology

    PubMed Central

    Stisen, Simon

    2017-01-01

    Citizen science opens new pathways that can complement traditional scientific practice. Intuition and reasoning often make humans more effective than computer algorithms in various realms of problem solving. In particular, a simple visual comparison of spatial patterns is a task where humans are often considered to be more reliable than computer algorithms. In practice, however, science still largely depends on computer based solutions, which inevitably bring benefits such as speed and the possibility to automatize processes. Human vision can nevertheless be harnessed to evaluate the reliability of algorithms which are tailored to quantify similarity in spatial patterns. We established a citizen science project to employ human perception to rate similarity and dissimilarity between simulated spatial patterns of several scenarios of a hydrological catchment model. In total, more than 2500 volunteers took part and provided over 43000 classifications of 1095 individual subjects. We investigate the capability of a set of advanced statistical performance metrics to mimic human perception in distinguishing between similarity and dissimilarity. Results suggest that more complex metrics are not necessarily better at emulating human perception, but clearly provide auxiliary information that is valuable for model diagnostics. The metrics clearly differ in their ability to unambiguously distinguish between similar and dissimilar patterns, which is regarded as a key feature of a reliable metric. The obtained dataset can provide an insightful benchmark to the community to test novel spatial metrics. PMID:28558050

  8. Diagnosing Undersampling in Monte Carlo Eigenvalue and Flux Tally Estimates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perfetti, Christopher M; Rearden, Bradley T

    2015-01-01

    This study explored the impact of undersampling on the accuracy of tally estimates in Monte Carlo (MC) calculations. Steady-state MC simulations were performed for models of several critical systems with varying degrees of spatial and isotopic complexity, and the impact of undersampling on eigenvalue and fuel pin flux/fission estimates was examined. This study observed biases in MC eigenvalue estimates as large as several percent and biases in fuel pin flux/fission tally estimates that exceeded tens, and in some cases hundreds, of percent. This study also investigated five statistical metrics for predicting the occurrence of undersampling biases in MC simulations. Three of the metrics (the Heidelberger-Welch RHW, the Geweke Z-Score, and the Gelman-Rubin diagnostics) are commonly used for diagnosing the convergence of Markov chains, and two of the methods (the Contributing Particles per Generation and Tally Entropy) are new convergence metrics developed in the course of this study. These metrics were implemented in the KENO MC code within the SCALE code system and were evaluated for their reliability at predicting the onset and magnitude of undersampling biases in MC eigenvalue and flux tally estimates in two of the critical models. Of the five methods investigated, the Heidelberger-Welch RHW, the Gelman-Rubin diagnostics, and Tally Entropy produced test metrics that correlated strongly to the size of the observed undersampling biases, indicating their potential to effectively predict the size and prevalence of undersampling biases in MC simulations.
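
    Of the diagnostics named in this abstract, the Gelman-Rubin statistic has a compact textbook form that compares between-chain and within-chain variance. The sketch below follows that classic formulation for a scalar quantity. How the authors adapted it to tallies inside KENO is not described in the abstract, so this is an assumption-laden illustration with a hypothetical function name.

```python
import math

def gelman_rubin(chains):
    """Potential scale reduction factor for m chains of equal length n."""
    m = len(chains)
    n = len(chains[0]) if chains else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance B and mean within-chain variance W.
    between = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    within = sum(
        sum((x - cm) ** 2 for x in c) / (n - 1)
        for c, cm in zip(chains, chain_means)
    ) / m
    if within == 0:
        raise ValueError("within-chain variance is zero")
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)
```

    Values well above 1 indicate that the chains disagree more with each other than their internal spread would suggest, which is the usual sign of non-convergence.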

  9. Testing Strategies for Model-Based Development

    NASA Technical Reports Server (NTRS)

    Heimdahl, Mats P. E.; Whalen, Mike; Rajan, Ajitha; Miller, Steven P.

    2006-01-01

    This report presents an approach for testing artifacts generated in a model-based development process. This approach divides the traditional testing process into two parts: requirements-based testing (validation testing) which determines whether the model implements the high-level requirements and model-based testing (conformance testing) which determines whether the code generated from a model is behaviorally equivalent to the model. The goals of the two processes differ significantly and this report explores suitable testing metrics and automation strategies for each. To support requirements-based testing, we define novel objective requirements coverage metrics similar to existing specification and code coverage metrics. For model-based testing, we briefly describe automation strategies and examine the fault-finding capability of different structural coverage metrics using tests automatically generated from the model.

  10. Comparison of optimization strategy and similarity metric in atlas-to-subject registration using statistical deformation model

    NASA Astrophysics Data System (ADS)

    Otake, Y.; Murphy, R. J.; Grupp, R. B.; Sato, Y.; Taylor, R. H.; Armand, M.

    2015-03-01

    A robust atlas-to-subject registration using a statistical deformation model (SDM) is presented. The SDM uses statistics of voxel-wise displacement learned from pre-computed deformation vectors of a training dataset. This allows an atlas instance to be directly translated into an intensity volume and compared with a patient's intensity volume. Rigid and nonrigid transformation parameters were simultaneously optimized via the Covariance Matrix Adaptation - Evolutionary Strategy (CMA-ES), with image similarity used as the objective function. The algorithm was tested on CT volumes of the pelvis from 55 female subjects. A performance comparison of the CMA-ES and Nelder-Mead downhill simplex optimization algorithms with the mutual information and normalized cross correlation similarity metrics was conducted. Simulation studies using synthetic subjects were performed, as well as leave-one-out cross validation studies. Both studies suggested that mutual information and CMA-ES achieved the best performance. The leave-one-out test demonstrated 4.13 mm error with respect to the true displacement field, and 26,102 function evaluations in 180 seconds, on average.
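
    One of the two similarity metrics compared in this abstract, normalized cross correlation, can be sketched for flattened intensity arrays as below. This is the zero-mean form; the abstract does not specify the exact variant or any masking used in the study, and the function name is hypothetical.

```python
import math

def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross correlation of two equally long sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if denominator == 0:
        raise ValueError("one input is constant; correlation is undefined")
    return numerator / denominator
```

    In a registration loop, an optimizer such as CMA-ES would evaluate this score (or mutual information) between the deformed atlas volume and the patient volume at each candidate parameter set.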

  11. Utilization of an agility assessment module in analysis and optimization of preliminary fighter configuration

    NASA Technical Reports Server (NTRS)

    Ngan, Angelen; Biezad, Daniel

    1996-01-01

    A study has been conducted to develop and to analyze a FORTRAN computer code for performing agility analysis on fighter aircraft configurations. This program is one of the modules of the NASA Ames ACSYNT (AirCraft SYNThesis) design code. The background of the agility research in the aircraft industry and a survey of a few agility metrics are discussed. The methodology, techniques, and models developed for the code are presented. The validity of the existing code was evaluated by comparing with existing flight test data. A FORTRAN program was developed for a specific metric, PM (Pointing Margin), as part of the agility module. Example trade studies using the agility module along with ACSYNT were conducted using a McDonnell Douglas F/A-18 Hornet aircraft model. The sensitivity of thrust loading, wing loading, and thrust vectoring on agility criteria were investigated. The module can compare the agility potential between different configurations and has capability to optimize agility performance in the preliminary design process. This research provides a new and useful design tool for analyzing fighter performance during air combat engagements in the preliminary design.

  12. Development of an agility assessment module for preliminary fighter design

    NASA Technical Reports Server (NTRS)

    Ngan, Angelen; Bauer, Brent; Biezad, Daniel; Hahn, Andrew

    1996-01-01

    A FORTRAN computer program is presented to perform agility analysis on fighter aircraft configurations. This code is one of the modules of the NASA Ames ACSYNT (AirCraft SYNThesis) design code. The background of the agility research in the aircraft industry and a survey of a few agility metrics are discussed. The methodology, techniques, and models developed for the code are presented. FORTRAN programs were developed for two specific metrics, CCT (Combat Cycle Time) and PM (Pointing Margin), as part of the agility module. The validity of the code was evaluated by comparing with existing flight test data. Example trade studies using the agility module along with ACSYNT were conducted using Northrop F-20 Tigershark and McDonnell Douglas F/A-18 Hornet aircraft models. The sensitivity of thrust loading and wing loading on agility criteria were investigated. The module can compare the agility potential between different configurations and has the capability to optimize agility performance in the preliminary design process. This research provides a new and useful design tool for analyzing fighter performance during air combat engagements.

  13. Feasibility of and Rationale for the Collection of Orthopaedic Trauma Surgery Quality of Care Metrics.

    PubMed

    Miller, Anna N; Kozar, Rosemary; Wolinsky, Philip

    2017-06-01

    Reproducible metrics are needed to evaluate the delivery of orthopaedic trauma care, national care, norms, and outliers. The American College of Surgeons (ACS) is uniquely positioned to collect and evaluate the data needed to evaluate orthopaedic trauma care via the Committee on Trauma and the Trauma Quality Improvement Project. We evaluated the first quality metrics the ACS has collected for orthopaedic trauma surgery to determine whether these metrics can be appropriately collected with accuracy and completeness. The metrics include the time to administration of the first dose of antibiotics for open fractures, the time to surgical irrigation and débridement of open tibial fractures, and the percentage of patients who undergo stabilization of femoral fractures at trauma centers nationwide. These metrics were analyzed to evaluate for variances in the delivery of orthopaedic care across the country. The data showed wide variances for all metrics, and many centers had incomplete ability to collect the orthopaedic trauma care metrics. There was a large variability in the results of the metrics collected among different trauma center levels, as well as among centers of a particular level. The ACS has successfully begun tracking orthopaedic trauma care performance measures, which will help inform reevaluation of the goals and continued work on data collection and improvement of patient care. Future areas of research may link these performance measures with patient outcomes, such as long-term tracking, to assess nonunion and function. This information can provide insight into center performance and its effect on patient outcomes. The ACS was able to successfully collect and evaluate the data for three metrics used to assess the quality of orthopaedic trauma care. However, additional research is needed to determine whether these metrics are suitable for evaluating orthopaedic trauma care and cutoff values for each metric.

  14. Comparative assessment of GIS-based methods and metrics for estimating long-term exposures to air pollution

    NASA Astrophysics Data System (ADS)

    Gulliver, John; de Hoogh, Kees; Fecht, Daniela; Vienneau, Danielle; Briggs, David

    2011-12-01

    The development of geographical information system techniques has opened up a wide array of methods for air pollution exposure assessment. The extent to which these provide reliable estimates of air pollution concentrations is nevertheless not clearly established. Nor is it clear which methods or metrics should be preferred in epidemiological studies. This paper compares the performance of ten different methods and metrics in terms of their ability to predict mean annual PM10 concentrations across 52 monitoring sites in London, UK. Metrics analysed include indicators (distance to nearest road, traffic volume on nearest road, heavy duty vehicle (HDV) volume on nearest road, road density within 150 m, traffic volume within 150 m and HDV volume within 150 m) and four modelling approaches: based on the nearest monitoring site, kriging, dispersion modelling and land use regression (LUR). Measures were computed in a GIS, and resulting metrics calibrated and validated against monitoring data using a form of grouped jack-knife analysis. The results show that PM10 concentrations across London show little spatial variation. As a consequence, most methods can predict the average without serious bias. Few of the approaches, however, show good correlations with monitored PM10 concentrations, and most predict no better than a simple classification based on site type. Only land use regression reaches acceptable levels of correlation (R² = 0.47), though this can be improved by also including information on site type. This might therefore be taken as a recommended approach in many studies, though care is needed in developing meaningful land use regression models, and like any method they need to be validated against local data before their application as part of epidemiological studies.

  15. Impacts of Land Use/Cover Uncertainty on Predictions of Ecologically Relevant Flow Metrics

    NASA Astrophysics Data System (ADS)

    Kalin, L.; Dosdogru, F.

    2016-12-01

    Streamflow regimes are crucial parts of the ecological integrity in river systems. Although species are adapted to natural flow variability, permanent changes in flow regimes as a result of alterations in land use/cover of the watersheds can adversely impact ecosystem health. This study assessed the impacts of land use/cover (LULC) changes on ecologically relevant flow (ERF) metrics in the rapidly urbanizing upper Cahaba River basin in north-central Alabama. The Cahaba River is the longest free-flowing river in the state of Alabama and is identified by the Nature Conservancy as one of only eight "Hotspots of Biodiversity" in the contiguous United States. The Cahaba River and its major tributaries support 69 rare and imperiled species, making it one of the most diverse aquatic ecosystems in the United States. The SWAT model was used to generate daily streamflows, which were then fed into the Indicators of Hydrological Alterations (IHA) software to generate 38 key ERF metrics that capture high, low, and median flow, as well as flashiness, which are known to have significant impacts on flora and fauna. SWAT was calibrated and validated twice with two different sources of LULC. Model performances during calibration and validation were very good and were very similar with both LULC sources. The flow duration curves generated based on each LULC source also look very similar. However, when we compared the ERF metrics, significant differences were observed, signifying the importance of LULC sources. The biggest differences were in Oct-Dec low flows, rise and fall rates of daily flows, annual maximum flow and average during the month of October. This study shows that although model calibration can compensate for the differences in LULC sources, when it comes to key ERF metrics the need to use the most reliable LULC source is evident.

  16. Energy Information Systems

    Science.gov Websites

    Energy Analytics Campaign > 2014-2018 Assessment of Automated M&V Methods > 2012-2018 Better Assessment of automated measurement and verification methods. Granderson, J. et al. Lawrence Berkeley . PDF, 726 KB Performance Metrics and Objective Testing Methods for Energy Baseline Modeling Software

  17. Modelling the heart as a communication system.

    PubMed

    Ashikaga, Hiroshi; Aguilar-Rodríguez, José; Gorsky, Shai; Lusczek, Elizabeth; Marquitti, Flávia Maria Darcie; Thompson, Brian; Wu, Degang; Garland, Joshua

    2015-04-06

    Electrical communication between cardiomyocytes can be perturbed during arrhythmia, but these perturbations are not captured by conventional electrocardiographic metrics. We developed a theoretical framework to quantify electrical communication using information theory metrics in two-dimensional cell lattice models of cardiac excitation propagation. The time series generated by each cell was coarse-grained to 1 when excited or 0 when resting. The Shannon entropy for each cell was calculated from the time series during four clinically important heart rhythms: normal heartbeat, anatomical reentry, spiral reentry and multiple reentry. We also used mutual information to perform spatial profiling of communication during these cardiac arrhythmias. We found that information sharing between cells was spatially heterogeneous. In addition, cardiac arrhythmia significantly impacted information sharing within the heart. Entropy localized the path of the drifting core of spiral reentry, which could be an optimal target of therapeutic ablation. We conclude that information theory metrics can quantitatively assess electrical communication among cardiomyocytes. The traditional concept of the heart as a functional syncytium sharing electrical information cannot predict altered entropy and information sharing during complex arrhythmia. Information theory metrics may find clinical application in the identification of rhythm-specific treatments which are currently unmet by traditional electrocardiographic techniques. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
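
    The information theory metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes each cell's time series has already been coarse-grained to 1 (excited) or 0 (resting), as the abstract describes. The function names and the short example series are hypothetical and are not taken from the article; the formulas are the standard definitions of Shannon entropy and mutual information, which may differ in detail from the authors' implementation.

```python
import math
from collections import Counter


def shannon_entropy(series):
    """Shannon entropy in bits of a discrete series (e.g. a binary cell state)."""
    n = len(series)
    counts = Counter(series)
    # Sum p * log2(p) over observed symbols, then flip the sign.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """Mutual information in bits: I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    joint = list(zip(x, y))  # paired states form the joint distribution
    return shannon_entropy(x) + shannon_entropy(y) - shannon_entropy(joint)


# Hypothetical coarse-grained series for three cells (illustrative values only).
cell_a = [1, 0, 0, 1, 0, 0, 1, 0]
cell_b = [1, 0, 0, 1, 0, 0, 1, 0]  # fires together with cell_a
cell_c = [0, 0, 0, 0, 0, 0, 0, 0]  # never excited

print(shannon_entropy(cell_a))
print(mutual_information(cell_a, cell_b))
print(mutual_information(cell_a, cell_c))
```

    In this toy case two cells with identical activity share as much information as the entropy of either one, while a cell that never fires shares none. Computing the pairwise quantity over a lattice would give the kind of spatial profile the abstract refers to.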

  18. Selecting global climate models for regional climate change studies

    PubMed Central

    Pierce, David W.; Barnett, Tim P.; Santer, Benjamin D.; Gleckler, Peter J.

    2009-01-01

    Regional or local climate change modeling studies currently require starting with a global climate model, then downscaling to the region of interest. How should global models be chosen for such studies, and what effect do such choices have? This question is addressed in the context of a regional climate detection and attribution (D&A) study of January-February-March (JFM) temperature over the western U.S. Models are often selected for a regional D&A analysis based on the quality of the simulated regional climate. Accordingly, 42 performance metrics based on seasonal temperature and precipitation, the El Nino/Southern Oscillation (ENSO), and the Pacific Decadal Oscillation are constructed and applied to 21 global models. However, no strong relationship is found between the score of the models on the metrics and results of the D&A analysis. Instead, the importance of having ensembles of runs with enough realizations to reduce the effects of natural internal climate variability is emphasized. Also, the superiority of the multimodel ensemble average (MM) to any 1 individual model, already found in global studies examining the mean climate, is true in this regional study that includes measures of variability as well. Evidence is shown that this superiority is largely caused by the cancellation of offsetting errors in the individual global models. Results with both the MM and models picked randomly confirm the original D&A results of anthropogenically forced JFM temperature changes in the western U.S. Future projections of temperature do not depend on model performance until the 2080s, after which the better performing models show warmer temperatures. PMID:19439652

  19. Use of Normalized Difference Vegetation Index (NDVI) habitat models to predict breeding birds on the San Pedro River, Arizona

    USGS Publications Warehouse

    McFarland, Tiffany Marie; van Riper, Charles

    2013-01-01

    Successful management practices of avian populations depend on understanding relationships between birds and their habitat, especially in rare habitats, such as riparian areas of the desert Southwest. Remote-sensing technology has become popular in habitat modeling, but most of these models focus on single species, leaving their applicability to understanding broader community structure and function largely untested. We investigated the usefulness of two Normalized Difference Vegetation Index (NDVI) habitat models to model avian abundance and species richness on the upper San Pedro River in southeastern Arizona. Although NDVI was positively correlated with our bird metrics, the amount of explained variation was low. We then investigated the addition of vegetation metrics and other remote-sensing metrics to improve our models. Although both vegetation metrics and remotely sensed metrics increased the power of our models, the overall explained variation was still low, suggesting that general avian community structure may be too complex for NDVI models.

  20. Modeling the October 2005 lahars at Panabaj (Guatemala)

    NASA Astrophysics Data System (ADS)

    Charbonnier, S. J.; Connor, C. B.; Connor, L. J.; Sheridan, M. F.; Oliva Hernández, J. P.; Richardson, J. A.

    2018-01-01

    An extreme rainfall event in October of 2005 triggered two deadly lahars on the flanks of Tolimán volcano (Guatemala) that caused many fatalities in the village of Panabaj. We mapped the deposits of these lahars, then developed computer simulations of the lahars using the geologic data and compared simulated area inundated by the flows to mapped area inundated. Computer simulation of the two lahars was dramatically improved after calibration with geological data. Specifically, detailed field measurements of flow inundation area, flow thickness, flow direction, and velocity estimates, collected after lahar emplacement, were used to calibrate the rheological input parameters for the models, including deposit volume, yield strength, sediment and water concentrations, and Manning roughness coefficients. Simulations of the two lahars, with volumes of 240,200 ± 55,400 and 126,000 ± 29,000 m³, using the FLO-2D computer program produced models of lahar runout within 3% of measured runouts and produced reasonable estimates of flow thickness and velocity along the lengths of the simulated flows. We compare areas inundated using the Jaccard fit, model sensitivity, and model precision metrics, all related to Bayes' theorem. These metrics show that false negatives (areas inundated by the observed lahar where not simulated) and false positives (areas not inundated by the observed lahar where inundation was simulated) are reduced using a model calibrated by rheology. The metrics offer a procedure for tuning model performance that will enhance model accuracy and make numerical models a more robust tool for natural hazard reduction.
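
    The three fit metrics named in this abstract can be sketched on gridded inundation maps. The Python example below assumes each map is flattened to a list of cells marked 1 (inundated) or 0 (dry), and it uses the common textbook definitions: Jaccard fit as intersection over union, sensitivity as the fraction of observed inundation that was simulated, and precision as the fraction of simulated inundation that was observed. The function name and the example grids are hypothetical, and the exact formulas used in the article may differ.

```python
def inundation_scores(observed, simulated):
    """Compare two binary inundation maps cell by cell."""
    if len(observed) != len(simulated):
        raise ValueError("maps must cover the same cells")
    tp = fp = fn = 0
    for obs, sim in zip(observed, simulated):
        if obs and sim:
            tp += 1  # inundated in both maps
        elif sim:
            fp += 1  # simulated but not observed (false positive)
        elif obs:
            fn += 1  # observed but not simulated (false negative)
    union = tp + fp + fn
    return {
        "jaccard": tp / union if union else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical 8-cell maps (illustrative values only).
observed = [1, 1, 1, 0, 0, 0, 1, 0]
simulated = [1, 1, 0, 0, 1, 0, 1, 0]

print(inundation_scores(observed, simulated))
```

    Fewer false negatives raise sensitivity, fewer false positives raise precision, and the Jaccard fit penalizes both, which is why the three are useful to report together when tuning a simulation.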

  1. Exploring s-CIELAB as a scanner metric for print uniformity

    NASA Astrophysics Data System (ADS)

    Hertel, Dirk W.

    2005-01-01

    The s-CIELAB color difference metric combines the standard CIELAB metric for perceived color difference with spatial contrast sensitivity filtering. When studying the performance of digital image processing algorithms, maps of spatial color difference between 'before' and 'after' images are a measure of perceived image difference. A general image quality metric can be obtained by modeling the perceived difference from an ideal image. This paper explores the s-CIELAB concept for evaluating the quality of digital prints. Prints present the challenge that the 'ideal print', which should serve as the reference when calculating the delta E* error map, is unknown and must thus be estimated from the scanned print. A reasonable estimate of what the ideal print 'should have been' is possible at least for images of known content such as flat fields or continuous wedges, where the error map can be calculated against a global or local mean. While such maps showing the perceived error at each pixel are extremely useful when analyzing print defects, it is desirable to statistically reduce them to a more manageable dataset. Examples of digital print uniformity are given, and the effect of specific print defects on the s-CIELAB delta E* metric is discussed.
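
    The idea of computing an error map against a global mean for a flat field can be sketched as follows. This Python example is a simplification under stated assumptions: it takes pixels that are already in CIELAB coordinates, uses the plain CIE76 colour difference (Euclidean distance in L*, a*, b*), and omits the spatial contrast sensitivity filtering that distinguishes s-CIELAB. The function name and pixel values are hypothetical and are not taken from the paper.

```python
import math


def delta_e_map(lab_pixels):
    """CIE76 colour difference of each pixel from the global mean colour."""
    n = len(lab_pixels)
    # The global mean stands in for the unknown 'ideal' flat-field colour.
    mean_lab = tuple(sum(p[i] for p in lab_pixels) / n for i in range(3))
    return [math.dist(p, mean_lab) for p in lab_pixels]


# Hypothetical scanned flat field: three uniform pixels and one darker defect.
pixels = [(50.0, 0.0, 0.0), (50.0, 0.0, 0.0), (50.0, 0.0, 0.0), (46.0, 0.0, 0.0)]

errors = delta_e_map(pixels)
# Reduce the per-pixel map to summary statistics.
print(sum(errors) / len(errors), max(errors))
```

    The final line reduces the map to a mean and a maximum, one simple way to obtain the more manageable dataset the abstract calls for; a local mean could replace the global one for slowly varying content such as wedges.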

  2. Task-based detectability comparison of exponential transformation of free-response operating characteristic (EFROC) curve and channelized Hotelling observer (CHO)

    NASA Astrophysics Data System (ADS)

    Khobragade, P.; Fan, Jiahua; Rupcich, Franco; Crotty, Dominic J.; Gilat Schmidt, Taly

    2016-03-01

    This study quantitatively evaluated the performance of the exponential transformation of the free-response operating characteristic curve (EFROC) metric, with the Channelized Hotelling Observer (CHO) as a reference. The CHO has been used for image quality assessment of reconstruction algorithms and imaging systems and often it is applied to study the signal-location-known cases. The CHO also requires a large set of images to estimate the covariance matrix. In terms of clinical applications, this assumption and requirement may be unrealistic. The newly developed location-unknown EFROC detectability metric is estimated from the confidence scores reported by a model observer. Unlike the CHO, EFROC does not require a channelization step and is a non-parametric detectability metric. There are few quantitative studies available on application of the EFROC metric, most of which are based on simulation data. This study investigated the EFROC metric using experimental CT data. A phantom with four low contrast objects: 3 mm (14 HU), 5 mm (7 HU), 7 mm (5 HU) and 10 mm (3 HU) was scanned at dose levels ranging from 25 mAs to 270 mAs and reconstructed using filtered backprojection. The area under the curve values for CHO (AUC) and EFROC (AFE) were plotted with respect to different dose levels. The number of images required to estimate the non-parametric AFE metric was calculated for varying tasks and found to be less than the number of images required for parametric CHO estimation. The AFE metric was found to be more sensitive to changes in dose than the CHO metric. This increased sensitivity and the assumption of unknown signal location may be useful for investigating and optimizing CT imaging methods. Future work is required to validate the AFE metric against human observers.

  3. Structural texture similarity metrics for image analysis and retrieval.

    PubMed

    Zujovic, Jana; Pappas, Thrasyvoulos N; Neuhoff, David L

    2013-07-01

    We develop new metrics for texture similarity that account for human visual perception and the stochastic nature of textures. The metrics rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are essentially identical. The proposed metrics extend the ideas of structural similarity and are guided by research in texture analysis-synthesis. They are implemented using a steerable filter decomposition and incorporate a concise set of subband statistics, computed globally or in sliding windows. We conduct systematic tests to investigate metric performance in the context of "known-item search," the retrieval of textures that are "identical" to the query texture. This eliminates the need for cumbersome subjective tests, thus enabling comparisons with human performance on a large database. Our experimental results indicate that the proposed metrics outperform peak signal-to-noise ratio (PSNR), structural similarity metric (SSIM) and its variations, as well as state-of-the-art texture classification metrics, using standard statistical measures.

  4. ACCESS - A Science and Engineering Assessment of Space Coronagraph Concepts for the Direct Imaging and Spectroscopy of Exoplanetary Systems

    NASA Technical Reports Server (NTRS)

    Trauger, John

    2008-01-01

    Topics include an overview, science objectives, study objectives, coronagraph types, metrics, ACCESS observatory, laboratory validations, and summary. Individual slides examine ACCESS engineering approach, ACCESS gamut of coronagraph types, coronagraph metrics, ACCESS Discovery Space, coronagraph optical layout, wavefront control on the "level playing field", deformable mirror development for HCIT, laboratory testbed demonstrations, high contrast imaging with the HCIT, laboratory coronagraph contrast and stability, model validation and performance predictions, HCIT coronagraph optical layout, Lyot coronagraph on the HCIT, pupil mapping (PIAA), shaped pupils, and vortex phase mask experiments on the HCIT.

  5. Estimated work ability in warm outdoor environments depends on the chosen heat stress assessment metric.

    PubMed

    Bröde, Peter; Fiala, Dusan; Lemke, Bruno; Kjellstrom, Tord

    2018-03-01

    With a view to occupational effects of climate change, we performed a simulation study on the influence of different heat stress assessment metrics on estimated workability (WA) of labour in warm outdoor environments. Whole-day shifts with varying workloads were simulated using as input meteorological records for the hottest month from four cities with prevailing hot (Dallas, New Delhi) or warm-humid conditions (Managua, Osaka), respectively. In addition, we considered the effects of adaptive strategies like shielding against solar radiation and different work-rest schedules assuming an acclimated person wearing light work clothes (0.6 clo). We assessed WA according to Wet Bulb Globe Temperature (WBGT) by means of an empirical relation of worker performance from field studies (Hothaps), and as allowed work hours using safety threshold limits proposed by the corresponding standards. Using the physiological models Predicted Heat Strain (PHS) and Universal Thermal Climate Index (UTCI)-Fiala, we calculated WA as the percentage of working hours with body core temperature and cumulated sweat loss below standard limits (38 °C and 7.5% of body weight, respectively) recommended by ISO 7933 and below conservative (38 °C; 3%) and liberal (38.2 °C; 7.5%) limits in comparison. ANOVA results showed that the different metrics, workload, time of day and climate type determined the largest part of WA variance. WBGT-based metrics were highly correlated and indicated slightly more constrained WA for moderate workload, but were less restrictive with high workload and for afternoon work hours compared to PHS and UTCI-Fiala. Though PHS showed unrealistic dynamic responses to rest from work compared to UTCI-Fiala, differences in WA assessed by the physiological models largely depended on the applied limit criteria. In conclusion, our study showed that the choice of the heat stress assessment metric impacts notably on the estimated WA. 
    Whereas PHS and UTCI-Fiala can account for cumulative physiological strain imposed by extended work hours when working heavily under high heat stress, the current WBGT standards do not include this. Advanced thermophysiological models might help developing alternatives, where not only modelling details but also the choice of physiological limit criteria will require attention. There is also an urgent need for suitable empirical data relating workplace heat exposure to workability.

  6. Estimated work ability in warm outdoor environments depends on the chosen heat stress assessment metric

    NASA Astrophysics Data System (ADS)

    Bröde, Peter; Fiala, Dusan; Lemke, Bruno; Kjellstrom, Tord

    2018-03-01

    With a view to occupational effects of climate change, we performed a simulation study on the influence of different heat stress assessment metrics on estimated workability (WA) of labour in warm outdoor environments. Whole-day shifts with varying workloads were simulated using as input meteorological records for the hottest month from four cities with prevailing hot (Dallas, New Delhi) or warm-humid conditions (Managua, Osaka), respectively. In addition, we considered the effects of adaptive strategies like shielding against solar radiation and different work-rest schedules assuming an acclimated person wearing light work clothes (0.6 clo). We assessed WA according to Wet Bulb Globe Temperature (WBGT) by means of an empirical relation of worker performance from field studies (Hothaps), and as allowed work hours using safety threshold limits proposed by the corresponding standards. Using the physiological models Predicted Heat Strain (PHS) and Universal Thermal Climate Index (UTCI)-Fiala, we calculated WA as the percentage of working hours with body core temperature and cumulated sweat loss below standard limits (38 °C and 7.5% of body weight, respectively) recommended by ISO 7933 and below conservative (38 °C; 3%) and liberal (38.2 °C; 7.5%) limits in comparison. ANOVA results showed that the different metrics, workload, time of day and climate type determined the largest part of WA variance. WBGT-based metrics were highly correlated and indicated slightly more constrained WA for moderate workload, but were less restrictive with high workload and for afternoon work hours compared to PHS and UTCI-Fiala. Though PHS showed unrealistic dynamic responses to rest from work compared to UTCI-Fiala, differences in WA assessed by the physiological models largely depended on the applied limit criteria. In conclusion, our study showed that the choice of the heat stress assessment metric impacts notably on the estimated WA. 
    Whereas PHS and UTCI-Fiala can account for cumulative physiological strain imposed by extended work hours when working heavily under high heat stress, the current WBGT standards do not include this. Advanced thermophysiological models might help developing alternatives, where not only modelling details but also the choice of physiological limit criteria will require attention. There is also an urgent need for suitable empirical data relating workplace heat exposure to workability.

  7. Identifying dominant controls on hydrologic parameter transfer from gauged to ungauged catchments: a comparative hydrology approach

    USGS Publications Warehouse

    Singh, R.; Archfield, S.A.; Wagener, T.

    2014-01-01

    Daily streamflow information is critical for solving various hydrologic problems, though observations of continuous streamflow for model calibration are available at only a small fraction of the world’s rivers. One approach to estimate daily streamflow at an ungauged location is to transfer rainfall–runoff model parameters calibrated at a gauged (donor) catchment to an ungauged (receiver) catchment of interest. Central to this approach is the selection of a hydrologically similar donor. No single metric or set of metrics of hydrologic similarity has been demonstrated to consistently select a suitable donor catchment. We design an experiment to diagnose the dominant controls on successful hydrologic model parameter transfer. We calibrate a lumped rainfall–runoff model to 83 stream gauges across the United States. All locations are USGS reference gauges with minimal human influence. Parameter sets from the calibrated models are then transferred to each of the other catchments and the performance of the transferred parameters is assessed. This transfer experiment is carried out both at the scale of the entire US and then for six geographic regions. We use classification and regression tree (CART) analysis to determine the relationship between catchment similarity and performance of transferred parameters. Similarity is defined using physical/climatic catchment characteristics, as well as streamflow response characteristics (signatures such as baseflow index and runoff ratio). Across the entire US, successful parameter transfer is governed by similarity in elevation and climate, and high similarity in streamflow signatures. Controls vary for different geographic regions though. Geology followed by drainage, topography and climate constitute the dominant similarity metrics in forested eastern mountains and plateaus, whereas agricultural land use relates most strongly with successful parameter transfer in the humid plains.

  8. Engineering pollinator phenotypes: consequences of induced size variation on adult morphology and flight performance metrics in the solitary bee, Osmia lignaria

    USDA-ARS's Scientific Manuscript database

    Body size is an important trait because it strongly correlates with morphology, performance, and fitness. In insects, the body size model argues that adult size is determined during the larval stage by the mechanisms regulating growth rate and the duration of growth. Though explicit links have been ...

  9. Citizen science: A new perspective to evaluate spatial patterns in hydrology.

    NASA Astrophysics Data System (ADS)

    Koch, J.; Stisen, S.

    2016-12-01

    Citizen science opens new pathways that can complement traditional scientific practice. Intuition and reasoning often make humans more effective than computer algorithms in various realms of problem solving. In particular, a simple visual comparison of spatial patterns is a task where humans are often considered to be more reliable than computer algorithms. However, in practice, science still largely depends on computer-based solutions, which is inevitable given benefits such as speed and the possibility to automate processes. This study highlights the integration of the generally underused human resource into hydrology. We established a citizen science project on the Zooniverse platform entitled Pattern Perception. The aim is to employ human perception to rate similarity and dissimilarity between simulated spatial patterns of a hydrological catchment model. In total, more than 2,800 users provided over 46,000 classifications of 1,095 individual subjects within 64 days after the launch. Each subject displays simulated spatial patterns of land-surface variables of a baseline model and six modelling scenarios. The citizen science data yield a numeric pattern similarity score for each of the scenarios with respect to the reference. We investigate the capability of a set of innovative statistical performance metrics to mimic human perception in distinguishing between similarity and dissimilarity. Results suggest that more complex metrics are not necessarily better at emulating human perception, but clearly provide flexibility and auxiliary information that is valuable for model diagnostics. The metrics clearly differ in their ability to unambiguously distinguish between similar and dissimilar patterns, which is regarded as a key feature of a reliable metric.

  10. Information risk and security modeling

    NASA Astrophysics Data System (ADS)

    Zivic, Predrag

    2005-03-01

    This research paper presentation will feature current frameworks for addressing risk and security modeling and metrics. The paper will analyze technical-level risk and security metrics of Common Criteria/ISO15408, Centre for Internet Security guidelines, NSA configuration guidelines and metrics used at this level. The view of IT operational standards such as GMITS/ISO13335 and ITIL/ITMS on security metrics, and architectural guidelines such as ISO7498-2, will be explained. Business process level standards such as ISO17799, COSO and CobiT will be presented with their control approach to security metrics. At the top level, maturity standards such as SSE-CMM/ISO21827, NSA Infosec Assessment and CobiT will be explored and reviewed. For each defined level of security metrics the research presentation will explore the appropriate usage of these standards. The paper will discuss the standards' approaches to conducting risk and security metrics. The research findings will demonstrate the need for a common baseline for both risk and security metrics. This paper will show the relation between the attribute-based common baseline and corporate assets and controls for risk and security metrics. It will be shown that such an approach spans all mentioned standards. The 3D visual presentation of the proposed approach and the development of the Information Security Model will be analyzed and postulated. The presentation will clearly demonstrate the benefits of the proposed attribute-based approach and the defined risk and security space for modeling and measuring.

  11. Gamut Volume Index: a color preference metric based on meta-analysis and optimized colour samples.

    PubMed

    Liu, Qiang; Huang, Zheng; Xiao, Kaida; Pointer, Michael R; Westland, Stephen; Luo, M Ronnier

    2017-07-10

    A novel metric named Gamut Volume Index (GVI) is proposed for evaluating the colour preference of lighting. This metric is based on the absolute gamut volume of optimized colour samples. The optimal colour set of the proposed metric was obtained by optimizing the weighted average correlation between the metric predictions and the subjective ratings for 8 psychophysical studies. The performance of 20 typical colour metrics was also investigated, which included colour difference based metrics, gamut based metrics, memory based metrics as well as combined metrics. It was found that the proposed GVI outperformed the existing counterparts, especially for the conditions where correlated colour temperatures differed.

  12. How robust is a robust policy? A comparative analysis of alternative robustness metrics for supporting robust decision analysis.

    NASA Astrophysics Data System (ADS)

    Kwakkel, Jan; Haasnoot, Marjolijn

    2015-04-01

    In response to climate and socio-economic change, in various policy domains there is increasingly a call for robust plans or policies. That is, plans or policies that perform well in a very large range of plausible futures. In the literature, a wide range of alternative robustness metrics can be found. The relative merit of these alternative conceptualizations of robustness has, however, received less attention. Evidently, different robustness metrics can result in different plans or policies being adopted. This paper investigates the consequences of several robustness metrics on decision making, illustrated here by the design of a flood risk management plan. A fictitious case, inspired by a river reach in the Netherlands is used. The performance of this system in terms of casualties, damages, and costs for flood and damage mitigation actions is explored using a time horizon of 100 years, and accounting for uncertainties pertaining to climate change and land use change. A set of candidate policy options is specified up front. This set of options includes dike raising, dike strengthening, creating more space for the river, and flood proof building and evacuation options. The overarching aim is to design an effective flood risk mitigation strategy that is designed from the outset to be adapted over time in response to how the future actually unfolds. To this end, the plan will be based on the dynamic adaptive policy pathway approach (Haasnoot, Kwakkel et al. 2013) being used in the Dutch Delta Program. The policy problem is formulated as a multi-objective robust optimization problem (Kwakkel, Haasnoot et al. 2014). We solve the multi-objective robust optimization problem using several alternative robustness metrics, including both satisficing robustness metrics and regret based robustness metrics. Satisficing robustness metrics focus on the performance of candidate plans across a large ensemble of plausible futures. 
    Regret based robustness metrics compare the performance of a candidate plan with the performance of other candidate plans across a large ensemble of plausible futures. Initial results suggest that the simplest satisficing metric, inspired by the signal to noise ratio, results in very risk averse solutions. Other satisficing metrics, which handle the average performance and the dispersion around the average separately, provide substantial additional insights into the trade off between the average performance, and the dispersion around this average. In contrast, the regret-based metrics enhance insight into the relative merits of candidate plans, while being less clear on the average performance or the dispersion around this performance. These results suggest that it is beneficial to use multiple robustness metrics when doing a robust decision analysis study. Haasnoot, M., J. H. Kwakkel, W. E. Walker and J. Ter Maat (2013). "Dynamic Adaptive Policy Pathways: A New Method for Crafting Robust Decisions for a Deeply Uncertain World." Global Environmental Change 23(2): 485-498. Kwakkel, J. H., M. Haasnoot and W. E. Walker (2014). "Developing Dynamic Adaptive Policy Pathways: A computer-assisted approach for developing adaptive strategies for a deeply uncertain world." Climatic Change.
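
    The contrast between a satisficing metric and a regret based metric can be made concrete with a minimal Python sketch. It assumes that performance is a score to be maximised and that every plan is evaluated in the same set of scenarios. The signal to noise form used here (mean divided by standard deviation) and the maximum-regret form are common textbook variants and may not match the definitions used in the study. The plan names and scores are hypothetical and are not results from the abstract.

```python
from statistics import mean, pstdev


def signal_to_noise(outcomes):
    """Satisficing-style score: high mean and low spread are both rewarded."""
    spread = pstdev(outcomes)
    return mean(outcomes) / spread if spread else float("inf")


def max_regret(performance):
    """Worst-case regret per plan, relative to the best plan in each scenario."""
    n_scenarios = len(next(iter(performance.values())))
    best = [max(outs[s] for outs in performance.values()) for s in range(n_scenarios)]
    return {
        plan: max(b - o for b, o in zip(best, outs))
        for plan, outs in performance.items()
    }


# Hypothetical performance scores of three plans in three plausible futures.
performance = {
    "dike_raising": [8.0, 8.0, 2.0],
    "room_for_river": [6.0, 6.0, 6.0],
    "evacuation": [5.0, 7.0, 4.0],
}

for plan, outcomes in performance.items():
    print(plan, signal_to_noise(outcomes))
print(max_regret(performance))
```

    The signal to noise score looks at each plan in isolation, whereas regret is defined only relative to the other candidate plans, so the two families can rank the same set of plans differently. That difference is the reason the abstract gives for using several robustness metrics together.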

  13. Metric for evaluation of filter efficiency in spectral cameras.

    PubMed

    Nahavandi, Alireza Mahmoudi; Tehran, Mohammad Amani

    2016-11-10

    Although metric functions that show the performance of a colorimetric imaging device have been investigated, a metric for performance analysis of a set of filters in wideband filter-based spectral cameras has rarely been studied. Based on a generalization of Vora's Measure of Goodness (MOG) and the spanning theorem, a single function metric that estimates the effectiveness of a filter set is introduced. The improved metric, named MMOG, varies between one, for a perfect, and zero, for the worst possible set of filters. Results showed that MMOG exhibits a trend that is more similar to the mean square of spectral reflectance reconstruction errors than does Vora's MOG index, and it is robust to noise in the imaging system. MMOG as a single metric could be exploited for further analysis of manufacturing errors.

  14. Modeled hydrologic metrics show links between hydrology and the functional composition of stream assemblages.

    PubMed

    Patrick, Christopher J; Yuan, Lester L

    2017-07-01

    Flow alteration is widespread in streams, but current understanding of the effects of differences in flow characteristics on stream biological communities is incomplete. We tested hypotheses about the effect of variation in hydrology on stream communities by using generalized additive models to relate watershed information to the values of different flow metrics at gauged sites. Flow models accounted for 54-80% of the spatial variation in flow metric values among gauged sites. We then used these models to predict flow metrics in 842 ungauged stream sites in the mid-Atlantic United States that were sampled for fish, macroinvertebrates, and environmental covariates. Fish and macroinvertebrate assemblages were characterized in terms of a suite of metrics that quantified aspects of community composition, diversity, and functional traits that were expected to be associated with differences in flow characteristics. We related modeled flow metrics to biological metrics in a series of stressor-response models. Our analyses identified both drying and base flow instability as explaining 30-50% of the observed variability in fish and invertebrate community composition. Variations in community composition were related to variations in the prevalence of dispersal traits in invertebrates and trophic guilds in fish. The results demonstrate that we can use statistical models to predict hydrologic conditions at bioassessment sites, which, in turn, we can use to estimate relationships between flow conditions and biological characteristics. This analysis provides an approach to quantify the effects of spatial variation in flow metrics using readily available biomonitoring data. © 2017 by the Ecological Society of America.

  15. MJO simulation in CMIP5 climate models: MJO skill metrics and process-oriented diagnosis

    NASA Astrophysics Data System (ADS)

    Ahn, Min-Seop; Kim, Daehyun; Sperber, Kenneth R.; Kang, In-Sik; Maloney, Eric; Waliser, Duane; Hendon, Harry

    2017-12-01

    The Madden-Julian Oscillation (MJO) simulation diagnostics developed by MJO Working Group and the process-oriented MJO simulation diagnostics developed by MJO Task Force are applied to 37 Coupled Model Intercomparison Project phase 5 (CMIP5) models in order to assess model skill in representing amplitude, period, and coherent eastward propagation of the MJO, and to establish a link between MJO simulation skill and parameterized physical processes. Process-oriented diagnostics include the Relative Humidity Composite based on Precipitation (RHCP), Normalized Gross Moist Stability (NGMS), and the Greenhouse Enhancement Factor (GEF). Numerous scalar metrics are developed to quantify the results. Most CMIP5 models underestimate MJO amplitude, especially when outgoing longwave radiation (OLR) is used in the evaluation, and exhibit too fast phase speed while lacking coherence between eastward propagation of precipitation/convection and the wind field. The RHCP-metric, indicative of the sensitivity of simulated convection to low-level environmental moisture, and the NGMS-metric, indicative of the efficiency of a convective atmosphere for exporting moist static energy out of the column, show robust correlations with a large number of MJO skill metrics. The GEF-metric, indicative of the strength of the column-integrated longwave radiative heating due to cloud-radiation interaction, is also correlated with the MJO skill metrics, but shows relatively lower correlations compared to the RHCP- and NGMS-metrics. Our results suggest that modifications to processes associated with moisture-convection coupling and the gross moist stability might be the most fruitful for improving simulations of the MJO. 
Though the GEF-metric exhibits lower correlations with the MJO skill metrics, the longwave radiation feedback is highly relevant for simulating the weak precipitation anomaly regime that may be important for the establishment of shallow convection and the transition to deep convection.

  16. MJO simulation in CMIP5 climate models: MJO skill metrics and process-oriented diagnosis

    DOE PAGES

    Ahn, Min-Seop; Kim, Daehyun; Sperber, Kenneth R.; ...

    2017-03-23

    The Madden-Julian Oscillation (MJO) simulation diagnostics developed by MJO Working Group and the process-oriented MJO simulation diagnostics developed by MJO Task Force are applied to 37 Coupled Model Intercomparison Project phase 5 (CMIP5) models in order to assess model skill in representing amplitude, period, and coherent eastward propagation of the MJO, and to establish a link between MJO simulation skill and parameterized physical processes. Process-oriented diagnostics include the Relative Humidity Composite based on Precipitation (RHCP), Normalized Gross Moist Stability (NGMS), and the Greenhouse Enhancement Factor (GEF). Numerous scalar metrics are developed to quantify the results. Most CMIP5 models underestimate MJO amplitude, especially when outgoing longwave radiation (OLR) is used in the evaluation, and exhibit too fast phase speed while lacking coherence between eastward propagation of precipitation/convection and the wind field. The RHCP-metric, indicative of the sensitivity of simulated convection to low-level environmental moisture, and the NGMS-metric, indicative of the efficiency of a convective atmosphere for exporting moist static energy out of the column, show robust correlations with a large number of MJO skill metrics. The GEF-metric, indicative of the strength of the column-integrated longwave radiative heating due to cloud-radiation interaction, is also correlated with the MJO skill metrics, but shows relatively lower correlations compared to the RHCP- and NGMS-metrics. Our results suggest that modifications to processes associated with moisture-convection coupling and the gross moist stability might be the most fruitful for improving simulations of the MJO. 
Though the GEF-metric exhibits lower correlations with the MJO skill metrics, the longwave radiation feedback is highly relevant for simulating the weak precipitation anomaly regime that may be important for the establishment of shallow convection and the transition to deep convection.

  17. MJO simulation in CMIP5 climate models: MJO skill metrics and process-oriented diagnosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Min-Seop; Kim, Daehyun; Sperber, Kenneth R.

    The Madden-Julian Oscillation (MJO) simulation diagnostics developed by MJO Working Group and the process-oriented MJO simulation diagnostics developed by MJO Task Force are applied to 37 Coupled Model Intercomparison Project phase 5 (CMIP5) models in order to assess model skill in representing amplitude, period, and coherent eastward propagation of the MJO, and to establish a link between MJO simulation skill and parameterized physical processes. Process-oriented diagnostics include the Relative Humidity Composite based on Precipitation (RHCP), Normalized Gross Moist Stability (NGMS), and the Greenhouse Enhancement Factor (GEF). Numerous scalar metrics are developed to quantify the results. Most CMIP5 models underestimate MJO amplitude, especially when outgoing longwave radiation (OLR) is used in the evaluation, and exhibit too fast phase speed while lacking coherence between eastward propagation of precipitation/convection and the wind field. The RHCP-metric, indicative of the sensitivity of simulated convection to low-level environmental moisture, and the NGMS-metric, indicative of the efficiency of a convective atmosphere for exporting moist static energy out of the column, show robust correlations with a large number of MJO skill metrics. The GEF-metric, indicative of the strength of the column-integrated longwave radiative heating due to cloud-radiation interaction, is also correlated with the MJO skill metrics, but shows relatively lower correlations compared to the RHCP- and NGMS-metrics. Our results suggest that modifications to processes associated with moisture-convection coupling and the gross moist stability might be the most fruitful for improving simulations of the MJO. 
Though the GEF-metric exhibits lower correlations with the MJO skill metrics, the longwave radiation feedback is highly relevant for simulating the weak precipitation anomaly regime that may be important for the establishment of shallow convection and the transition to deep convection.

  18. Asset sustainability index : quick guide : proposed metrics for the long-term financial sustainability of highway networks.

    DOT National Transportation Integrated Search

    2013-04-01

    "This report provides a Quick Guide to the concept of asset sustainability metrics. Such metrics address the long-term performance of highway assets based upon expected expenditure levels. It examines how such metrics are used in Australia, Britain...

  19. A novel approach for evaluating the performance of real time quantitative loop-mediated isothermal amplification-based methods.

    PubMed

    Nixon, Gavin J; Svenstrup, Helle F; Donald, Carol E; Carder, Caroline; Stephenson, Judith M; Morris-Jones, Stephen; Huggett, Jim F; Foy, Carole A

    2014-12-01

    Molecular diagnostic measurements are currently underpinned by the polymerase chain reaction (PCR). There are also a number of alternative nucleic acid amplification technologies, which, unlike PCR, work at a single temperature. These 'isothermal' methods reportedly offer potential advantages over PCR, such as simplicity, speed and resistance to inhibitors, and could also be used for quantitative molecular analysis. However, there are currently limited mechanisms to evaluate their quantitative performance, which would assist assay development and study comparisons. This study uses a sexually transmitted infection diagnostic model in combination with an adapted metric termed isothermal doubling time (IDT), akin to PCR efficiency, to compare quantitative PCR and quantitative loop-mediated isothermal amplification (qLAMP) assays, and to quantify the impact of matrix interference. The performance metric described here facilitates the comparison of qLAMP assays that could assist assay development and validation activities.

  20. The influence of vegetation height heterogeneity on forest and woodland bird species richness across the United States.

    PubMed

    Huang, Qiongyu; Swatantran, Anu; Dubayah, Ralph; Goetz, Scott J

    2014-01-01

    Avian diversity is under increasing pressures. It is thus critical to understand the ecological variables that contribute to large scale spatial distribution of avian species diversity. Traditionally, studies have relied primarily on two-dimensional habitat structure to model broad scale species richness. Vegetation vertical structure is increasingly used at local scales. However, the spatial arrangement of vegetation height has never been taken into consideration. Our goal was to examine the efficacies of three-dimensional forest structure, particularly the spatial heterogeneity of vegetation height in improving avian richness models across forested ecoregions in the U.S. We developed novel habitat metrics to characterize the spatial arrangement of vegetation height using the National Biomass and Carbon Dataset for the year 2000 (NBCD). The height-structured metrics were compared with other habitat metrics for statistical association with richness of three forest breeding bird guilds across Breeding Bird Survey (BBS) routes: a broadly grouped woodland guild, and two forest breeding guilds with preferences for forest edge and for interior forest. Parametric and non-parametric models were built to examine the improvement of predictability. Height-structured metrics had the strongest associations with species richness, yielding improved predictive ability for the woodland guild richness models (r² ≈ 0.53 for the parametric models, 0.63 for the non-parametric models) and the forest edge guild models (r² ≈ 0.34 for the parametric models, 0.47 for the non-parametric models). All but one of the linear models incorporating height-structured metrics showed significantly higher adjusted r² values than their counterparts without additional metrics. The interior forest guild richness showed a consistent low association with height-structured metrics. 
Our results suggest that height heterogeneity, beyond canopy height alone, supplements habitat characterization and richness models of forest bird species. The metrics and models derived in this study demonstrate practical examples of utilizing three-dimensional vegetation data for improved characterization of spatial patterns in species richness.

  1. The Influence of Vegetation Height Heterogeneity on Forest and Woodland Bird Species Richness across the United States

    PubMed Central

    Huang, Qiongyu; Swatantran, Anu; Dubayah, Ralph; Goetz, Scott J.

    2014-01-01

    Avian diversity is under increasing pressures. It is thus critical to understand the ecological variables that contribute to large scale spatial distribution of avian species diversity. Traditionally, studies have relied primarily on two-dimensional habitat structure to model broad scale species richness. Vegetation vertical structure is increasingly used at local scales. However, the spatial arrangement of vegetation height has never been taken into consideration. Our goal was to examine the efficacies of three-dimensional forest structure, particularly the spatial heterogeneity of vegetation height in improving avian richness models across forested ecoregions in the U.S. We developed novel habitat metrics to characterize the spatial arrangement of vegetation height using the National Biomass and Carbon Dataset for the year 2000 (NBCD). The height-structured metrics were compared with other habitat metrics for statistical association with richness of three forest breeding bird guilds across Breeding Bird Survey (BBS) routes: a broadly grouped woodland guild, and two forest breeding guilds with preferences for forest edge and for interior forest. Parametric and non-parametric models were built to examine the improvement of predictability. Height-structured metrics had the strongest associations with species richness, yielding improved predictive ability for the woodland guild richness models (r² ≈ 0.53 for the parametric models, 0.63 for the non-parametric models) and the forest edge guild models (r² ≈ 0.34 for the parametric models, 0.47 for the non-parametric models). All but one of the linear models incorporating height-structured metrics showed significantly higher adjusted r² values than their counterparts without additional metrics. The interior forest guild richness showed a consistent low association with height-structured metrics. 
Our results suggest that height heterogeneity, beyond canopy height alone, supplements habitat characterization and richness models of forest bird species. The metrics and models derived in this study demonstrate practical examples of utilizing three-dimensional vegetation data for improved characterization of spatial patterns in species richness. PMID:25101782

  2. Real-Time Performance Feedback for the Manual Control of Spacecraft

    NASA Astrophysics Data System (ADS)

    Karasinski, John Austin

    Real-time performance metrics were developed to quantify workload, situational awareness, and manual task performance for use as visual feedback to pilots of aerospace vehicles. Results from prior lunar lander experiments with variable levels of automation were replicated and extended to provide insights for the development of real-time metrics. Increased levels of automation resulted in increased flight performance, lower workload, and increased situational awareness. Automated Speech Recognition (ASR) was employed to detect verbal callouts as a limited measure of subjects' situational awareness. A one-dimensional manual tracking task and simple instructor-model visual feedback scheme was developed. This feedback was indicated to the operator by changing the color of a guidance element on the primary flight display, similar to how a flight instructor points out elements of a display to a student pilot. Experiments showed that for this low-complexity task, visual feedback did not change subject performance, but did increase the subjects' measured workload. Insights gained from these experiments were applied to a Simplified Aid for EVA Rescue (SAFER) inspection task. The effects of variations of an instructor-model performance-feedback strategy on human performance in a novel SAFER inspection task were investigated. Real-time feedback was found to have a statistically significant effect of improving subject performance and decreasing workload in this complicated four degree of freedom manual control task with two secondary tasks.

  3. Development of an Objective Space Suit Mobility Performance Metric Using Metabolic Cost and Functional Tasks

    NASA Technical Reports Server (NTRS)

    McFarland, Shane M.; Norcross, Jason

    2016-01-01

    Existing methods for evaluating EVA suit performance and mobility have historically concentrated on isolated joint range of motion and torque. However, these techniques do little to evaluate how well a suited crewmember can actually perform during an EVA. An alternative method of characterizing suited mobility through measurement of metabolic cost to the wearer has been evaluated at Johnson Space Center over the past several years. The most recent study involved six test subjects completing multiple trials of various functional tasks in each of three different space suits; the results indicated it was often possible to discern between different suit designs on the basis of metabolic cost alone. However, other variables may have an effect on real-world suited performance; namely, completion time of the task, the gravity field in which the task is completed, etc. While previous results have analyzed completion time, metabolic cost, and metabolic cost normalized to system mass individually, it is desirable to develop a single metric comprising these (and potentially other) performance metrics. This paper outlines the background upon which this single-score metric is determined to be feasible, and initial efforts to develop such a metric. Forward work includes variable coefficient determination and verification of the metric through repeated testing.

  4. Video-Based Method of Quantifying Performance and Instrument Motion During Simulated Phonosurgery

    PubMed Central

    Conroy, Ellen; Surender, Ketan; Geng, Zhixian; Chen, Ting; Dailey, Seth; Jiang, Jack

    2015-01-01

    Objectives/Hypothesis: To investigate the use of the Video-Based Phonomicrosurgery Instrument Tracking System to collect instrument position data during simulated phonomicrosurgery and calculate motion metrics using these data. We used this system to determine if novice subject motion metrics improved over 1 week of training. Study Design: Prospective cohort study. Methods: Ten subjects performed simulated surgical tasks once per day for 5 days. Instrument position data were collected and used to compute motion metrics (path length, depth perception, and motion smoothness). Data were analyzed to determine if motion metrics improved with practice time. Task outcome was also determined each day, and relationships between task outcome and motion metrics were used to evaluate the validity of motion metrics as indicators of surgical performance. Results: Significant decreases over time were observed for path length (P < .001), depth perception (P < .001), and task outcome (P < .001). No significant change was observed for motion smoothness. Significant relationships were observed between task outcome and path length (P < .001), depth perception (P < .001), and motion smoothness (P < .001). Conclusions: Our system can estimate instrument trajectory and provide quantitative descriptions of surgical performance. It may be useful for evaluating phonomicrosurgery performance. Path length and depth perception may be particularly useful indicators. PMID:24737286
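
    As a rough illustration of how a motion metric can be computed from tracked instrument positions, the sketch below sums the straight-line distances between consecutive samples to obtain a path length. This is the common textbook definition and is an assumption here; the abstract does not state the exact formulas used by the tracking system, and the function name and sample coordinates are hypothetical.

```python
import math

def path_length(positions):
    """Total distance travelled along a sequence of (x, y, z) positions.

    Assumed definition: sum of Euclidean distances between consecutive samples.
    """
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))

# Hypothetical instrument-tip positions in arbitrary units.
track = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(path_length(track))
```

    Under this definition a shorter path for the same task indicates more economical instrument motion, which is consistent with the decrease over training days reported in the abstract.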

  5. A Quantitative Human Spacecraft Design Evaluation Model for Assessing Crew Accommodation and Utilization

    NASA Astrophysics Data System (ADS)

    Fanchiang, Christine

    Crew performance, including both accommodation and utilization factors, is an integral part of every human spaceflight mission from commercial space tourism, to the demanding journey to Mars and beyond. Spacecraft were historically built by engineers and technologists trying to adapt the vehicle into cutting edge rocketry with the assumption that the astronauts could be trained and will adapt to the design. By and large, that is still the current state of the art. It is recognized, however, that poor human-machine design integration can lead to catastrophic and deadly mishaps. The premise of this work relies on the idea that if an accurate predictive model exists to forecast crew performance issues as a result of spacecraft design and operations, it can help designers and managers make better decisions throughout the design process, and ensure that the crewmembers are well-integrated with the system from the very start. The result should be a high-quality, user-friendly spacecraft that optimizes the utilization of the crew while keeping them alive, healthy, and happy during the course of the mission. Therefore, the goal of this work was to develop an integrative framework to quantitatively evaluate a spacecraft design from the crew performance perspective. The approach presented here is done at a very fundamental level starting with identifying and defining basic terminology, and then builds up important axioms of human spaceflight that lay the foundation for how such a framework can be developed. With the framework established, a methodology for characterizing the outcome using a mathematical model was developed by pulling from existing metrics and data collected on human performance in space. Representative test scenarios were run to show what information could be garnered and how it could be applied as a useful, understandable metric for future spacecraft design. 
While the model is the primary tangible product from this research, the more interesting outcome of this work is the structure of the framework and what it tells future researchers in terms of where the gaps and limitations exist for developing a better framework. It also identifies metrics that can now be collected as part of future validation efforts for the model.

  6. Development and comparison of metrics for evaluating climate models and estimation of projection uncertainty

    NASA Astrophysics Data System (ADS)

    Ring, Christoph; Pollinger, Felix; Kaspar-Ott, Irena; Hertig, Elke; Jacobeit, Jucundus; Paeth, Heiko

    2017-04-01

    The COMEPRO project (Comparison of Metrics for Probabilistic Climate Change Projections of Mediterranean Precipitation), funded by the Deutsche Forschungsgemeinschaft (DFG), is dedicated to the development of new evaluation metrics for state-of-the-art climate models. Further, we analyze implications for probabilistic projections of climate change. This study focuses on the results of 4-field matrix metrics. Here, six different approaches are compared. We evaluate 24 models of the Coupled Model Intercomparison Project Phase 3 (CMIP3), 40 of CMIP5 and 18 of the Coordinated Regional Downscaling Experiment (CORDEX). In addition to the annual and seasonal precipitation, the mean temperature is analysed. We consider both the 50-year trend and the climatological mean for the second half of the 20th century. For the probabilistic projections of climate change, the A1b and A2 (CMIP3) and RCP4.5 and RCP8.5 (CMIP5, CORDEX) scenarios are used. The eight main study areas are located in the Mediterranean. However, we apply our metrics to globally distributed regions as well. The metrics show high simulation quality of temperature trend and both precipitation and temperature mean for most climate models and study areas. In addition, we find high potential for model weighting in order to reduce uncertainty. These results are in line with other accepted evaluation metrics and studies. The comparison of the different 4-field approaches reveals high correlations for most metrics. The results of the metric-weighted probabilistic density functions of climate change are heterogeneous. We find for different regions and seasons both increases and decreases of uncertainty. The analysis of global study areas is consistent with the regional study areas of the Mediterranean.

  7. Spatial scale, means and gradients of hydrographic variables define pelagic seascapes of bluefin and bullet tuna spawning distribution.

    PubMed

    Alvarez-Berastegui, Diego; Ciannelli, Lorenzo; Aparicio-Gonzalez, Alberto; Reglero, Patricia; Hidalgo, Manuel; López-Jurado, Jose Luis; Tintoré, Joaquín; Alemany, Francisco

    2014-01-01

    Seascape ecology is an emerging discipline focused on understanding how features of the marine habitat influence the spatial distribution of marine species. However, there is still a gap in the development of concepts and techniques for its application in the marine pelagic realm, where there are no clear boundaries delimiting habitats. Here we demonstrate that pelagic seascape metrics, defined as a combination of hydrographic variables and their spatial gradients calculated at an appropriate spatial scale, improve our ability to model pelagic fish distribution. We apply the analysis to study the spawning locations of two tuna species: Atlantic bluefin and bullet tuna. These two species represent a gradient in life history strategies. Bluefin tuna has a large body size and is a long-distance migrant, while bullet tuna has a small body size and lives year-round in coastal waters within the Mediterranean Sea. The results show that the performance of models incorporating the proposed seascape metrics increases significantly when compared with models that do not consider these metrics. This improvement is more important for Atlantic bluefin, whose spawning ecology is dependent on the local oceanographic scenario, than it is for bullet tuna, which is less influenced by the hydrographic conditions. Our study advances our understanding of how species perceive their habitat and confirms that the spatial scale at which the seascape metrics provide information is related to the spawning ecology and life history strategy of each species.

  8. What do we know and when do we know it?

    PubMed Central

    2008-01-01

    Two essential aspects of virtual screening are considered: experimental design and performance metrics. In the design of any retrospective virtual screen, choices have to be made as to the purpose of the exercise. Is the goal to compare methods? Is the interest in a particular type of target or all targets? Are we simulating a ‘real-world’ setting, or teasing out distinguishing features of a method? What are the confidence limits for the results? What should be reported in a publication? In particular, what criteria should be used to decide between different performance metrics? Comparing the field of molecular modeling to other endeavors, such as medical statistics, criminology, or computer hardware evaluation indicates some clear directions. Taken together these suggest the modeling field has a long way to go to provide effective assessment of its approaches, either to itself or to a broader audience, but that there are no technical reasons why progress cannot be made. PMID:18253702

  9. A guide to calculating habitat-quality metrics to inform conservation of highly mobile species

    USGS Publications Warehouse

    Bieri, Joanna A.; Sample, Christine; Thogmartin, Wayne E.; Diffendorfer, James E.; Earl, Julia E.; Erickson, Richard A.; Federico, Paula; Flockhart, D. T. Tyler; Nicol, Sam; Semmens, Darius J.; Skraber, T.; Wiederholt, Ruscena; Mattsson, Brady J.

    2018-01-01

    Many metrics exist for quantifying the relative value of habitats and pathways used by highly mobile species. Properly selecting and applying such metrics requires substantial background in mathematics and understanding the relevant management arena. To address this multidimensional challenge, we demonstrate and compare three measurements of habitat quality: graph-, occupancy-, and demographic-based metrics. Each metric provides insights into system dynamics, at the expense of increasing amounts and complexity of data and models. Our descriptions and comparisons of diverse habitat-quality metrics provide means for practitioners to overcome the modeling challenges associated with management or conservation of such highly mobile species. Whereas previous guidance for applying habitat-quality metrics has been scattered in diversified tracks of literature, we have brought this information together into an approachable format including accessible descriptions and a modeling case study for a typical example that conservation professionals can adapt for their own decision contexts and focal populations. Considerations for Resource Managers: Management objectives, proposed actions, data availability and quality, and model assumptions are all relevant considerations when applying and interpreting habitat-quality metrics. Graph-based metrics answer questions related to habitat centrality and connectivity, are suitable for populations with any movement pattern, quantify basic spatial and temporal patterns of occupancy and movement, and require the least data. Occupancy-based metrics answer questions about likelihood of persistence or colonization, are suitable for populations that undergo localized extinctions, quantify spatial and temporal patterns of occupancy and movement, and require a moderate amount of data. Demographic-based metrics answer questions about relative or absolute population size, are suitable for populations with any movement pattern, quantify demographic processes and population dynamics, and require the most data. More real-world examples applying occupancy-based, agent-based, and continuous-based metrics to seasonally migratory species are needed to better understand challenges and opportunities for applying these metrics more broadly.

  10. Predicting the difficulty of pure, strict, epistatic models: metrics for simulated model selection.

    PubMed

    Urbanowicz, Ryan J; Kiralis, Jeff; Fisher, Jonathan M; Moore, Jason H

    2012-09-26

    Algorithms designed to detect complex genetic disease associations are initially evaluated using simulated datasets. Typical evaluations vary constraints that influence the correct detection of underlying models (i.e. number of loci, heritability, and minor allele frequency). Such studies neglect to account for model architecture (i.e. the unique specification and arrangement of penetrance values comprising the genetic model), which alone can influence the detectability of a model. In order to design a simulation study which efficiently takes architecture into account, a reliable metric is needed for model selection. We evaluate three metrics as predictors of relative model detection difficulty derived from previous works: (1) Penetrance table variance (PTV), (2) customized odds ratio (COR), and (3) our own Ease of Detection Measure (EDM), calculated from the penetrance values and respective genotype frequencies of each simulated genetic model. We evaluate the reliability of these metrics across three very different data search algorithms, each with the capacity to detect epistatic interactions. We find that a model's EDM and COR are each stronger predictors of model detection success than heritability. This study formally identifies and evaluates metrics which quantify model detection difficulty. We utilize these metrics to intelligently select models from a population of potential architectures. This allows for an improved simulation study design which accounts for differences in detection difficulty attributed to model architecture. We implement the calculation and utilization of EDM and COR into GAMETES, an algorithm which rapidly and precisely generates pure, strict, n-locus epistatic models.

  11. Measuring Distribution Performance? Benchmarking Warrants Your Attention

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ericson, Sean J; Alvarez, Paul

    Identifying, designing, and measuring performance metrics is critical to securing customer value, but can be a difficult task. This article examines the use of benchmarks based on publicly available performance data to set challenging, yet fair, metrics and targets.

  12. METRIC model for the estimation and mapping of evapotranspiration in a super intensive olive orchard in Southern Portugal

    NASA Astrophysics Data System (ADS)

    Pôças, Isabel; Nogueira, António; Paço, Teresa A.; Sousa, Adélia; Valente, Fernanda; Silvestre, José; Andrade, José A.; Santos, Francisco L.; Pereira, Luís S.; Allen, Richard G.

    2013-04-01

    Satellite-based surface energy balance models have been successfully applied to estimate and map evapotranspiration (ET). The METRIC™ model (Mapping EvapoTranspiration at high Resolution using Internalized Calibration) is one such model. METRIC has been widely used over an extensive range of vegetation types and applications, mostly focusing on annual crops. In the current study, the single-layer-blended METRIC model was applied to Landsat5 TM and Landsat7 ETM+ images to produce estimates of evapotranspiration (ET) in a super intensive olive orchard in Southern Portugal. In sparse woody canopies such as olive orchards, some adjustments in METRIC application related to the estimation of vegetation temperature and of momentum roughness length and sensible heat flux (H) for tall vegetation must be considered. To minimize biases in H estimates due to uncertainties in the definition of momentum roughness length, the Perrier function based on leaf area index and tree canopy architecture, associated with an adjusted estimation of crop height, was used to obtain momentum roughness length estimates. Additionally, to minimize the biases in surface temperature simulations due to soil and shadow effects, the computation of radiometric temperature considered a three-source condition, where T_s = f_c T_c + f_shadow T_shadow + f_sunlit T_sunlit. As such, the surface temperature (T_s), derived from the thermal band of the Landsat images, integrates the temperature of the canopy (T_c), the temperature of the shaded ground surface (T_shadow), and the temperature of the sunlit ground surface (T_sunlit), according to the relative fraction of vegetation (f_c), shadow (f_shadow) and sunlit (f_sunlit) ground surface, respectively. As the sunlit canopies are the primary source of energy exchange, the effective temperature for the canopy was estimated by solving the three-source condition equation for T_c. 
To evaluate METRIC performance to estimate ET over the olive grove, several parameters derived from the algorithm were tested against data collected in the field, including eddy covariance ET, surface temperature over the canopy and soil temperature in shaded and sunlit conditions. Additionally, the results were also compared with results published in the literature. The information obtained so far revealed very interesting perspectives for the use of METRIC in the estimation and mapping of ET in super intensive olive orchards. Thereby, this approach might constitute a useful tool towards the improvement of the efficiency of irrigation water management in this crop. The study described is still under way, and thus further applications of METRIC algorithm to a larger number of images and to olive groves with different tree density are planned.
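
The three-source condition described in this abstract can be rearranged to isolate the canopy temperature: T_c = (T_s - f_shadow T_shadow - f_sunlit T_sunlit) / f_c. The following is a minimal sketch of that rearrangement only, not the METRIC implementation. The function name, the check that the three fractions sum to 1, and the sample temperatures (in kelvin) are assumptions made for illustration.

```python
def canopy_temperature(t_s, f_c, f_shadow, t_shadow, f_sunlit, t_sunlit):
    """Solve T_s = f_c*T_c + f_shadow*T_shadow + f_sunlit*T_sunlit for T_c.

    All temperatures must share one unit; fractions are dimensionless.
    """
    if f_c <= 0:
        raise ValueError("vegetation fraction f_c must be positive")
    # Assumption for this sketch: the three cover fractions partition the pixel.
    if abs(f_c + f_shadow + f_sunlit - 1.0) > 1e-6:
        raise ValueError("cover fractions are assumed to sum to 1")
    return (t_s - f_shadow * t_shadow - f_sunlit * t_sunlit) / f_c


# Hypothetical pixel: 40% canopy, 20% shaded soil, 40% sunlit soil.
t_c = canopy_temperature(305.0, 0.4, 0.2, 300.0, 0.4, 315.0)
print(round(t_c, 2))
```

Because the equation is divided by f_c, pixels with very low vegetation cover amplify any error in the soil temperatures, which is one reason a guard on f_c is included here.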

  13. Questionable validity of the catheter-associated urinary tract infection metric used for value-based purchasing.

    PubMed

    Calderon, Lindsay E; Kavanagh, Kevin T; Rice, Mara K

    2015-10-01

    Catheter-associated urinary tract infections (CAUTIs) occur in 290,000 US hospital patients annually, with an estimated cost of $290 million. Two different measurement systems are being used to track the US health care system's performance in lowering the rate of CAUTIs. Since 2010, the Agency for Healthcare Research and Quality (AHRQ) metric has shown a 28.2% decrease in CAUTI, whereas the Centers for Disease Control and Prevention metric has shown a 3%-6% increase in CAUTI since 2009. Differences in data acquisition and the definition of the denominator may explain this discrepancy. The AHRQ metric analyzes chart-audited data and reflects both catheter use and care. The Centers for Disease Control and Prevention metric analyzes self-reported data and primarily reflects catheter care. Because analysis of the AHRQ metric showed a progressive change in performance over time and the scientific literature supports the importance of catheter use in the prevention of CAUTI, it is suggested that risk-adjusted catheter-use data be incorporated into metrics that are used for determining facility performance and for value-based purchasing initiatives. Copyright © 2015 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.

  14. Applying Sigma Metrics to Reduce Outliers.

    PubMed

    Litten, Joseph

    2017-03-01

    Sigma metrics can be used to predict assay quality, allowing easy comparison of instrument quality and predicting which tests will require minimal quality control (QC) rules to monitor the performance of the method. A Six Sigma QC program can result in fewer controls and fewer QC failures for methods with a sigma metric of 5 or better. The higher the number of methods with a sigma metric of 5 or better, the lower the costs for reagents, supplies, and control material required to monitor the performance of the methods. Copyright © 2016 Elsevier Inc. All rights reserved.
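
The abstract does not state how a sigma metric is computed. A form commonly used in laboratory quality control is sigma = (TEa - |bias|) / CV, with allowable total error, bias and imprecision all expressed in percent. The sketch below assumes that form; the function names and the example assay values are hypothetical, and only the threshold of 5 comes from the abstract.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric from allowable total error, bias and CV (all in percent)."""
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


def meets_sigma_threshold(sigma, threshold=5.0):
    """True when a method reaches the 'sigma of 5 or better' level."""
    return sigma >= threshold


# Hypothetical assays: (TEa %, bias %, CV %)
assays = {"assay_a": (10.0, 1.0, 1.5), "assay_b": (10.0, -2.0, 4.0)}
for name, (tea, bias, cv) in assays.items():
    s = sigma_metric(tea, bias, cv)
    print(name, round(s, 2), meets_sigma_threshold(s))
```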

  15. Quality Measures in Stroke

    PubMed Central

    Poisson, Sharon N.; Josephson, S. Andrew

    2011-01-01

    Stroke is a major public health burden, and accounts for many hospitalizations each year. Due to gaps in practice and recommended guidelines, there has been a recent push toward implementing quality measures to be used for improving patient care, comparing institutions, as well as for rewarding or penalizing physicians through pay-for-performance. This article reviews the major organizations involved in implementing quality metrics for stroke, and the 10 major metrics currently being tracked. We also discuss possible future metrics and the implications of public reporting and using metrics for pay-for-performance. PMID:23983840

  16. A public hedonic analysis of environmental attributes in an open space preservation program

    NASA Astrophysics Data System (ADS)

    Nordman, Erik E.

    The Town of Brookhaven, on Long Island, NY, has implemented an open space preservation program to protect natural areas, and the ecosystem services they provide, from suburban growth. I used a public hedonic model of Brookhaven's open space purchases to estimate implicit prices for various environmental attributes, locational variables and spatial metrics. I also measured the correlation between cost per acre and non-monetary environmental benefit scores and tested whether including cost data, as opposed to non-monetary environmental benefit score alone, would change the prioritization ranks of acquired properties. The mean acquisition cost per acre was 82,501. I identified the key on-site environmental and locational variables using stepwise regression for four functional forms. The log-log specification performed best (adjusted R² = 0.727). I performed a second stepwise regression (log-log form) which included spatial metrics, calculated from a high-resolution land cover classification, in addition to the environmental and locational variables. This markedly improved the model's performance (adjusted R² = 0.866). Statistically significant variables included the property size, location in the Pine Barrens Compatible Growth Area, location in a FEMA flood zone, adjacency to public land, and several other environmental dummy variables. The single significant spatial metric, the fractal dimension of the tree cover class, had the largest elasticity of any variable. Of the dummy variables, location within the Compatible Growth Area had the largest implicit price (298,792 per acre). The priority ranks for the two methods, non-monetary environmental benefit score alone and the ratio of non-monetary environmental benefit score to acquisition cost, were significantly positively correlated. This suggests that, despite the lack of cost data in their ranking method, Brookhaven does not suffer from efficiency losses. 
The economics literature encourages using both environmental benefits and acquisition costs to ensure cost-effective conservation programs. I recommend that Brookhaven consider acquisition costs in addition to environmental benefits to avert potential efficiency losses in future open space purchases. This dissertation shows that the addition of spatial metrics can enhance the performance of hedonic models. It also provides a baseline valuation for the environmental attributes of Brookhaven's open spaces and shows that location is critical when dealing with open space preservation programs.

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Samuel; Patterson, David; Oliker, Leonid

    This article consists of a collection of slides from the authors' conference presentation. The Roofline model is a visually intuitive figure for kernel analysis and optimization. We believe undergraduates will find it useful in assessing performance and scalability limitations. It is easily extended to other architectural paradigms and to other metrics: performance (sort, graphics, crypto, ...) and bandwidth (L2, PCIe, ...). Furthermore, performance counters could be used to generate a runtime-specific roofline that would greatly aid optimization.
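
The core of the Roofline model is a single bound: attainable performance is the lesser of the machine's peak compute rate and its memory bandwidth multiplied by the kernel's arithmetic intensity (operations per byte moved). A minimal sketch follows; the function names and the machine figures (100 GFLOP/s peak, 20 GB/s bandwidth) are hypothetical and not taken from the slides.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, arithmetic_intensity):
    """Roofline bound: min(peak compute, bandwidth * FLOPs-per-byte)."""
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


PEAK, BANDWIDTH = 100.0, 20.0  # hypothetical machine
for intensity in (0.5, 5.0, 10.0):
    bound = attainable_gflops(PEAK, BANDWIDTH, intensity)
    regime = "memory-bound" if intensity < ridge_point(PEAK, BANDWIDTH) else "compute-bound"
    print(intensity, bound, regime)
```

Swapping the bandwidth argument for an L2 or PCIe figure gives the alternative ceilings that the abstract mentions.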

  18. Building America Energy Renovations. A Business Case for Home Performance Contracting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baechler, Michael C.; Antonopoulos, C. A.; Sevigny, M.

    2012-10-01

    This research report gives an overview of the needs and opportunities that exist in the U.S. home performance contracting industry. The report discusses industry trends, market drivers, different business models, and points of entry for existing and new businesses hoping to enter the home performance contracting industry. Case studies of eight companies who successfully entered the industry are provided, including business metrics, start-up costs, and marketing approaches.

  19. Empirical Evaluation of Hunk Metrics as Bug Predictors

    NASA Astrophysics Data System (ADS)

    Ferzund, Javed; Ahsan, Syed Nadeem; Wotawa, Franz

    Reducing the number of bugs is a crucial issue during software development and maintenance. Software process and product metrics are good indicators of software complexity. These metrics have been used to build bug predictor models to help developers maintain the quality of software. In this paper we empirically evaluate the use of hunk metrics as predictors of bugs. We present a technique for bug prediction that works at the smallest units of code change, called hunks. We build bug prediction models using random forests, which is an efficient machine learning classifier. Hunk metrics are used to train the classifier and each hunk metric is evaluated for its bug prediction capabilities. Our classifier can classify individual hunks as buggy or bug-free with 86% accuracy, 83% buggy hunk precision and 77% buggy hunk recall. We find that history-based and change-level hunk metrics are better predictors of bugs than code-level hunk metrics.
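
The three figures reported in this abstract (accuracy, buggy-hunk precision, buggy-hunk recall) are all derived from the same confusion counts. The sketch below shows how they relate. It does not reproduce the authors' random-forest pipeline; the label strings and the five-hunk example are invented for illustration.

```python
def classification_metrics(y_true, y_pred, positive="buggy"):
    """Accuracy, precision and recall with `positive` as the class of interest."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Precision: of the hunks flagged buggy, how many really were.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: of the truly buggy hunks, how many were flagged.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels for five hunks.
actual = ["buggy", "buggy", "clean", "clean", "buggy"]
predicted = ["buggy", "clean", "clean", "buggy", "buggy"]
print(classification_metrics(actual, predicted))
```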

  20. Improving Metallic Thermal Protection System Hypervelocity Impact Resistance Through Design of Experiments Approach

    NASA Technical Reports Server (NTRS)

    Poteet, Carl C.; Blosser, Max L.

    2001-01-01

    A design of experiments approach has been implemented using computational hypervelocity impact simulations to determine the most effective place to add mass to an existing metallic Thermal Protection System (TPS) to improve hypervelocity impact protection. Simulations were performed using axisymmetric models in CTH, a shock-physics code developed by Sandia National Laboratories, and validated by comparison with existing test data. The axisymmetric models were then used in a statistical sensitivity analysis to determine the influence of five design parameters on degree of hypervelocity particle dispersion. Several damage metrics were identified and evaluated. Damage metrics related to the extent of substructure damage were seen to produce misleading results; however, damage metrics related to the degree of dispersion of the hypervelocity particle produced results that corresponded to physical intuition. Based on analysis of variance results it was concluded that the most effective way to increase hypervelocity impact resistance is to increase the thickness of the outer foil layer. Increasing the spacing between the outer surface and the substructure is also very effective at increasing dispersion.

  1. Measuring β-diversity with species abundance data.

    PubMed

    Barwell, Louise J; Isaac, Nick J B; Kunin, William E

    2015-07-01

    In 2003, 24 presence-absence β-diversity metrics were reviewed and a number of trade-offs and redundancies identified. We present a parallel investigation into the performance of abundance-based metrics of β-diversity. β-diversity is a multi-faceted concept, central to spatial ecology. There are multiple metrics available to quantify it: the choice of metric is an important decision. We test 16 conceptual properties and two sampling properties of a β-diversity metric: metrics should be 1) independent of α-diversity and 2) cumulative along a gradient of species turnover. Similarity should be 3) probabilistic when assemblages are independently and identically distributed. Metrics should have 4) a minimum of zero and increase monotonically with the degree of 5) species turnover, 6) decoupling of species ranks and 7) evenness differences. However, complete species turnover should always generate greater values of β than extreme 8) rank shifts or 9) evenness differences. Metrics should 10) have a fixed upper limit, 11) symmetry (β_{A,B} = β_{B,A}), 12) double-zero asymmetry for double absences and double presences and 13) not decrease in a series of nested assemblages. Additionally, metrics should be independent of 14) species replication, 15) the units of abundance and 16) differences in total abundance between sampling units. When samples are used to infer β-diversity, metrics should be 1) independent of sample sizes and 2) independent of unequal sample sizes. We test 29 metrics for these properties and five 'personality' properties. Thirteen metrics were outperformed or equalled across all conceptual and sampling properties. Differences in sensitivity to species' abundance lead to a performance trade-off between sample size bias and the ability to detect turnover among rare species. In general, abundance-based metrics are substantially less biased in the face of undersampling, although the presence-absence metric, β_sim, performed well overall. 
Only β_{Baselga R turn}, β_{Baselga B-C turn} and β_sim measured purely species turnover and were independent of nestedness. Among the other metrics, sensitivity to nestedness varied >4-fold. Our results indicate large amounts of redundancy among existing β-diversity metrics, whilst the estimation of unseen shared and unshared species is lacking and should be addressed in the design of new abundance-based metrics. © 2015 The Authors. Journal of Animal Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society.
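
To make the β_sim result concrete: the abstract does not give its formula, but the presence-absence form usually cited is β_sim = min(b, c) / (a + min(b, c)), where a is the number of shared species and b and c are the numbers unique to each assemblage. The sketch below assumes that form and checks two of the properties listed above, symmetry and insensitivity to nestedness. The function name and the species sets are hypothetical.

```python
def beta_sim(site_a, site_b):
    """Simpson-type turnover between two species sets (presence-absence)."""
    shared = len(site_a & site_b)      # a: species found at both sites
    only_a = len(site_a - site_b)      # b: species unique to the first site
    only_b = len(site_b - site_a)      # c: species unique to the second site
    denom = shared + min(only_a, only_b)
    if denom == 0:
        raise ValueError("at least one assemblage is empty")
    return min(only_a, only_b) / denom


rich = {"sp1", "sp2", "sp3", "sp4"}
nested_subset = {"sp1", "sp2"}          # pure nestedness, no turnover
disjoint = {"sp5", "sp6"}               # complete turnover

print(beta_sim(rich, nested_subset))    # nested pair
print(beta_sim(rich, disjoint))         # fully replaced pair
print(beta_sim(rich, disjoint) == beta_sim(disjoint, rich))  # symmetry
```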

  2. An Exploratory Study of OEE Implementation in Indian Manufacturing Companies

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Soni, V. K.

    2015-04-01

    Globally, the implementation of overall equipment effectiveness (OEE) has proven to be highly effective in improving availability, performance rate and quality rate while reducing unscheduled breakdowns and wastage that stem from the equipment. This paper investigates the present status and future scope of OEE metrics in Indian manufacturing companies through an extensive survey. In this survey, opinions of Production and Maintenance Managers have been analyzed statistically to explore the relationship between factors, perspective of OEE and potential use of OEE metrics. Although the sample is diverse in terms of product, process type, size, and geographic location of the companies, all are compelled to implement improvement techniques such as OEE metrics to improve performance. The findings reveal that OEE metrics have huge potential and scope to improve performance. Responses indicate that Indian companies are aware of OEE but are not utilizing the full potential of OEE metrics.
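
OEE is conventionally the product of the three rates named in this abstract: availability, performance rate and quality rate, each expressed as a fraction between 0 and 1. A minimal sketch of that calculation follows; the function name and the shift figures are hypothetical and are not survey data.

```python
def oee(availability, performance_rate, quality_rate):
    """Overall equipment effectiveness as the product of its three factors."""
    for name, value in (("availability", availability),
                        ("performance_rate", performance_rate),
                        ("quality_rate", quality_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    return availability * performance_rate * quality_rate


# Hypothetical shift: 90% uptime, 95% of ideal speed, 99% good parts.
print(round(oee(0.90, 0.95, 0.99), 4))
```

Because the factors multiply, a machine that looks acceptable on each individual rate can still have a noticeably lower overall score.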

  3. Association of airborne moisture-indicating microorganisms with building-related symptoms and water damage in 100 U.S. office buildings: Analyses of the U.S. EPA BASE data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mendell, Mark J.; Lei, Quanhong; Cozen, Myrna O.

    2003-10-01

    Metrics of culturable airborne microorganisms for either total organisms or suspected harmful subgroups have generally not been associated with symptoms among building occupants. However, the visible presence of moisture damage or mold in residences and other buildings has consistently been associated with respiratory symptoms and other health effects. This relationship is presumably caused by adverse but uncharacterized exposures to moisture-related microbiological growth. In order to assess this hypothesis, we studied relationships in U.S. office buildings between the prevalence of respiratory and irritant symptoms, the concentrations of airborne microorganisms that require moist surfaces on which to grow, and the presence of visible water damage. For these analyses we used data on buildings, indoor environments, and occupants collected from a representative sample of 100 U.S. office buildings in the U.S. Environmental Protection Agency's Building Assessment Survey and Evaluation (EPA BASE) study. We created 19 alternate metrics, using scales ranging from 3-10 units, that summarized the concentrations of airborne moisture-indicating microorganisms (AMIMOs) as indicators of moisture in buildings. Two were constructed to resemble a metric previously reported to be associated with lung function changes in building occupants; the others were based on another metric from the same group of Finnish researchers, concentration cutpoints from other studies, and professional judgment. We assessed three types of associations: between AMIMO metrics and symptoms in office workers, between evidence of water damage and symptoms, and between water damage and AMIMO metrics. We estimated (as odds ratios (ORs) with 95% confidence intervals) the unadjusted and adjusted associations between the 19 metrics and two types of weekly, work-related symptoms--lower respiratory and mucous membrane--using logistic regression models. 
Analyses used the original AMIMO metrics and were repeated with simplified dichotomized metrics. The multivariate models adjusted for other potential confounding variables associated with respondents, occupied spaces, buildings, or ventilation systems. Models excluded covariates for moisture-related risks hypothesized to increase AMIMO levels. We also estimated the association of water damage (using variables for specific locations in the study space or building, or summary variables) with the two symptom outcomes. Finally, using selected AMIMO metrics as outcomes, we constructed logistic regression models with observations at the building level to estimate unadjusted and adjusted associations of evident water damage with AMIMO metrics. All original AMIMO metrics showed little overall pattern of unadjusted or adjusted association with either symptom outcome. The 3-category metric resembling that previously used by others, which of all constructed metrics had the largest number of buildings in its top category, was not associated with symptoms in these buildings. However, most metrics with few buildings in their highest category showed increased risk for both symptoms in that category, especially metrics using cutpoints of >100 but <500 colony-forming units (CFU)/m³ for concentration of total culturable fungi. With AMIMO metrics dichotomized to compare the highest category with all lower categories combined, four metrics had unadjusted ORs between 1.4 and 1.6 for both symptom outcomes. The same four metrics had adjusted ORs of 1.7-2.1 for both symptom outcomes. In models of water damage and symptoms, several specific locations of past water damage had significant associations with outcomes, with ORs ranging from 1.4-1.6. In bivariate models of water damage and selected AMIMO metrics, a number of specific types of water damage and several summary variables for water damage were very strongly associated with AMIMO metrics (significant ORs ranging above 15). 
Multivariate modeling with the dichotomous AMIMO metrics was not possible due to limited numbers of observations.
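
For readers less familiar with the odds ratios quoted in this abstract, an unadjusted OR for a dichotomized metric can be computed from a 2x2 table, with a Wald 95% confidence interval on the log scale. The study itself used logistic regression models; the sketch below is only the simpler 2x2 equivalent for the unadjusted case. The function name and the counts are invented for illustration.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    cells = (exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


# Hypothetical counts: symptomatic / asymptomatic occupants in the highest
# metric category versus all lower categories combined.
estimate, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An interval that spans 1 indicates that the association is not statistically significant at the chosen level, even when the point estimate is well above 1.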

  4. Common world model for unmanned systems: Phase 2

    NASA Astrophysics Data System (ADS)

    Dean, Robert M. S.; Oh, Jean; Vinokurov, Jerry

    2014-06-01

    The Robotics Collaborative Technology Alliance (RCTA) seeks to provide adaptive robot capabilities which move beyond traditional metric algorithms to include cognitive capabilities. Key to this effort is the Common World Model, which moves beyond the state-of-the-art by representing the world using semantic and symbolic as well as metric information. It joins these layers of information to define objects in the world. These objects may be reasoned upon jointly using traditional geometric, symbolic cognitive algorithms and new computational nodes formed by the combination of these disciplines to address Symbol Grounding and Uncertainty. The Common World Model must understand how these objects relate to each other. It includes the concept of Self-Information about the robot. By encoding current capability, component status, task execution state, and their histories we track information which enables the robot to reason and adapt its performance using Meta-Cognition and Machine Learning principles. The world model also includes models of how entities in the environment behave which enable prediction of future world states. To manage complexity, we have adopted a phased implementation approach. Phase 1, published in these proceedings in 2013 [1], presented the approach for linking metric with symbolic information and interfaces for traditional planners and cognitive reasoning. Here we discuss the design of "Phase 2" of this world model, which extends the Phase 1 design API and data structures, and review the use of the Common World Model as part of a semantic navigation use case.

  5. Toward an optimisation technique for dynamically monitored environment

    NASA Astrophysics Data System (ADS)

    Shurrab, Orabi M.

    2016-10-01

    The data fusion community has introduced multiple procedures for situational assessment in order to facilitate timely responses to emerging situations. More directly, the process refinement of the Joint Directors of Laboratories (JDL) is a meta-process to assess and improve the data fusion task during real-time operation. In other words, it is an optimisation technique to verify the overall data fusion performance and enhance it toward the top goals of the decision-making resources. This paper discusses the theoretical concept of prioritisation, where the analyst team is required to keep up to date with a dynamically changing environment spanning different domains such as air, sea, land, space and cyberspace. Furthermore, it gives an illustrative example of how various tracking activities are ranked simultaneously into a predetermined order. Specifically, it presents a modelling scheme for a case-study-based scenario in which the real-time system reports different classes of prioritised events, followed by performance metrics for evaluating the prioritisation process of the situational awareness (SWA) domain. The proposed performance metrics have been designed and evaluated using an analytical approach. The modelling scheme represents the situational awareness system outputs mathematically, in the form of a list of activities. Such methods allowed the evaluation process to conduct a rigorous analysis of the prioritisation process, despite any constraints related to a domain-specific configuration. 
After three levels of assessment over three separate scenarios, the Prioritisation Capability Score (PCS) provided an appropriate scoring scheme for different ranking instances. Indeed, from the data fusion perspective, the proposed metric assessed real-time system performance adequately, and it is capable of conducting a verification process to direct the operator's attention to any issue concerning the prioritisation capability of the situational awareness domain.

  6. Metrication report to the Congress

    NASA Technical Reports Server (NTRS)

    1991-01-01

    NASA's principal metrication accomplishments for FY 1990 were establishment of metrication policy for major programs, development of an implementing instruction for overall metric policy and initiation of metrication planning for the major program offices. In FY 1991, development of an overall NASA plan and individual program office plans will be completed, requirement assessments will be performed for all support areas, and detailed assessment and transition planning will be undertaken at the institutional level. Metric feasibility decisions on a number of major programs are expected over the next 18 months.

  7. Going Metric: Is It for You? A Planning Model for Small Manufacturing Companies.

    ERIC Educational Resources Information Center

    Beek, C.; And Others

    This booklet is designed to aid small manufacturing companies in ascertaining the meaning of going metric for their unique circumstances and to guide them in making a smooth conversion to the metric system. First is a brief discussion of what the law says about metrics and what the metric system is. Then what is involved in going metric is…

  8. Challenge toward the prediction of typhoon behaviour and down pour

    NASA Astrophysics Data System (ADS)

    Takahashi, K.; Onishi, R.; Baba, Y.; Kida, S.; Matsuda, K.; Goto, K.; Fuchigami, H.

    2013-08-01

    Mechanisms of interactions among different scale phenomena play important roles for forecasting of weather and climate. Multi-scale Simulator for the Geoenvironment (MSSG), which deals with multi-scale multi-physics phenomena, is a coupled non-hydrostatic atmosphere-ocean model designed to be run efficiently on the Earth Simulator. We present simulation results with the world's highest 1.9 km horizontal resolution for the entire globe and regional heavy rain with 1 km horizontal resolution and 5 m horizontal/vertical resolution for urban area simulation. To gain high performance by exploiting the system capabilities, we propose novel performance evaluation metrics introduced in previous studies that incorporate the effects of the data caching mechanism between CPU and memory. With a useful code optimization guideline based on such metrics, we demonstrate that MSSG can achieve an excellent peak performance ratio of 32.2% on the Earth Simulator with the single-core performance found to be a key to a reduced time-to-solution.

  9. Manifold Preserving: An Intrinsic Approach for Semisupervised Distance Metric Learning.

    PubMed

    Ying, Shihui; Wen, Zhijie; Shi, Jun; Peng, Yaxin; Peng, Jigen; Qiao, Hong

    2017-05-18

    In this paper, we address the semisupervised distance metric learning problem and its applications in classification and image retrieval. First, we formulate a semisupervised distance metric learning model by considering the metric information of inner classes and interclasses. In this model, an adaptive parameter is designed to balance the inner metrics and intermetrics by using data structure. Second, we convert the model to a minimization problem whose variable is a symmetric positive-definite matrix. Third, in implementation, we deduce an intrinsic steepest descent method, which assures that the metric matrix is strictly symmetric positive-definite at each iteration, with the manifold structure of the symmetric positive-definite matrix manifold. Finally, we test the proposed algorithm on conventional data sets, and compare it with four other representative methods. The numerical results validate that the proposed method significantly improves the classification with the same computational efficiency.

  10. Model assessment using a multi-metric ranking technique

    NASA Astrophysics Data System (ADS)

    Fitzpatrick, P. J.; Lau, Y.; Alaka, G.; Marks, F.

    2017-12-01

    Validation comparisons of multiple models present challenges when skill levels are similar, especially in regimes dominated by the climatological mean. Assessing skill separation will require advanced validation metrics and the identification of adeptness in extreme events, while maintaining simplicity for management decisions. Flexibility for operations is also an asset. This work postulates a weighted tally and consolidation technique which ranks results by multiple types of metrics. Variables include absolute error, bias, acceptable absolute error percentages, outlier metrics, model efficiency, Pearson correlation, Kendall's Tau, reliability index, multiplicative gross error, and root mean squared differences. Other metrics, such as root mean square difference and rank correlation, were also explored but removed when the information was found to be generally duplicative of other metrics. While equal weights are applied, weights could be altered depending on the preferred metrics. Two examples are shown comparing ocean models' currents and tropical cyclone products, including experimental products. The importance of using magnitude and direction for tropical cyclone track forecasts instead of distance, along-track, and cross-track is discussed. Tropical cyclone intensity and structure prediction are also assessed. Vector correlations are not included in the ranking process but were found useful in an independent context, and will be briefly reported.
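
The abstract does not spell out the weighted tally. One plausible reading is to rank the models separately under each metric and then sum the ranks using per-metric weights, with the lowest total ranking first. The sketch below implements that reading only; it is not the authors' algorithm. The function name, metric names, scores and weights are all hypothetical, and ties are not given special treatment.

```python
def rank_models(scores, higher_is_better, weights=None):
    """Weighted sum of per-metric ranks; the lowest total ranks first.

    scores: {metric: {model: value}}
    higher_is_better: {metric: bool} giving each metric's orientation
    weights: optional {metric: weight}; equal weights when omitted
    """
    weights = weights or {metric: 1.0 for metric in scores}
    tally = {}
    for metric, per_model in scores.items():
        ordered = sorted(per_model, key=per_model.get,
                         reverse=higher_is_better[metric])
        for position, model in enumerate(ordered, start=1):
            tally[model] = tally.get(model, 0.0) + weights[metric] * position
    return sorted(tally.items(), key=lambda item: item[1])


scores = {
    "abs_error": {"A": 1.2, "B": 0.9, "C": 1.5},      # lower is better
    "pearson_r": {"A": 0.80, "B": 0.85, "C": 0.90},   # higher is better
}
orientation = {"abs_error": False, "pearson_r": True}

print(rank_models(scores, orientation))
print(rank_models(scores, orientation, {"abs_error": 1.0, "pearson_r": 3.0}))
```

Changing the weights changes which model comes out on top in this toy data, which illustrates the abstract's remark that weights could be altered to favour preferred metrics.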

  11. Mask Design for the Space Interferometry Mission Internal Metrology

    NASA Technical Reports Server (NTRS)

    Marx, David; Zhao, Feng; Korechoff, Robert

    2005-01-01

    This slide presentation reviews the mask design used for the internal metrology of the Space Interferometry Mission (SIM). Included is information about the project, the method of measurements with SIM, the internal metrology, numerical model of internal metrology, wavefront examples, performance metrics, and mask design.

  12. Computer-Aided Design and Optimization of High-Performance Vacuum Electronic Devices

    DTIC Science & Technology

    2006-08-15

    approximations to the metric, and space mapping wherein low-accuracy (coarse mesh) solutions can potentially be used more effectively in an...interface and algorithm development. • Work on space-mapping or related methods for utilizing models of varying levels of approximation within an

  13. On the use of hidden Markov models for gaze pattern modeling

    NASA Astrophysics Data System (ADS)

    Mannaru, Pujitha; Balasingam, Balakumar; Pattipati, Krishna; Sibley, Ciara; Coyne, Joseph

    2016-05-01

    Some of the conventional metrics derived from gaze patterns (on computer screens) to study visual attention, engagement and fatigue are saccade counts, nearest neighbor index (NNI) and duration of dwells/fixations. Each of these metrics has drawbacks in modeling the behavior of gaze patterns; one such drawback comes from the fact that some portions on the screen are not as important as some other portions on the screen. This is addressed by computing the eye gaze metrics corresponding to important areas of interest (AOI) on the screen. There are some challenges in developing accurate AOI based metrics: firstly, the definition of AOI is always fuzzy; secondly, it is possible that the AOI may change adaptively over time. Hence, there is a need to introduce eye-gaze metrics that are aware of the AOI in the field of view; at the same time, the new metrics should be able to automatically select the AOI based on the nature of the gazes. In this paper, we propose a novel way of computing NNI based on continuous hidden Markov models (HMM) that model the gazes as 2D Gaussian observations (x-y coordinates of the gaze) with the mean at the center of the AOI and covariance that is related to the concentration of gazes. The proposed modeling allows us to accurately compute the NNI metric in the presence of multiple, undefined AOI on the screen in the presence of intermittent casual gazing that is modeled as random gazes on the screen.

  14. Looking beyond general metrics for model evaluation - lessons from an international model intercomparison study

    NASA Astrophysics Data System (ADS)

    Bouaziz, Laurène; de Boer-Euser, Tanja; Brauer, Claudia; Drogue, Gilles; Fenicia, Fabrizio; Grelier, Benjamin; de Niel, Jan; Nossent, Jiri; Pereira, Fernando; Savenije, Hubert; Thirel, Guillaume; Willems, Patrick

    2016-04-01

    International collaboration between institutes and universities is a promising way to reach consensus on hydrological model development. Education, experience and expert knowledge of the hydrological community have resulted in the development of a great variety of model concepts, calibration methods and analysis techniques. Although comparison studies are very valuable for international cooperation, they often do not lead to very clear new insights regarding the relevance of the modelled processes. We hypothesise that this is partly caused by model complexity and the comparison methods used, which focus on a good overall performance instead of focusing on specific events. We propose an approach that focuses on the evaluation of specific events. Eight international research groups calibrated their model for the Ourthe catchment in Belgium (1607 km²) and carried out a validation in time for the Ourthe (i.e. on two different periods, one of them in blind mode for the modellers) and a validation in space for nested and neighbouring catchments of the Meuse in completely blind mode. For each model, the same protocol was followed and an ensemble of best performing parameter sets was selected. Signatures were first used to assess model performances in the different catchments during validation. Comparison of the models was then followed by evaluation of selected events, which include: low flows, high flows and the transition from low to high flows. While the models show rather similar performances based on general metrics (e.g. Nash-Sutcliffe Efficiency), clear differences can be observed for specific events. While most models are able to simulate high flows well, large differences are observed during low flows and in the ability to capture the first peaks after drier months. The transferability of model parameters to neighbouring and nested catchments is assessed as an additional measure in the model evaluation. This suggested approach helps to select, among competing model alternatives, the most suitable model for a specific purpose.
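
    The contrast between a general metric and an event-specific evaluation can be illustrated with a minimal sketch. The Nash-Sutcliffe efficiency below follows its standard definition. The low-flow helper, its threshold argument, and the discharge values in the usage note are hypothetical and are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    if variance_term == 0:
        raise ValueError("observed series has no variance")
    return 1.0 - squared_error / variance_term


def nse_below_threshold(observed, simulated, threshold):
    """NSE restricted to time steps whose observed flow is at or below
    `threshold` (a hypothetical way to isolate low-flow periods)."""
    pairs = [(o, s) for o, s in zip(observed, simulated) if o <= threshold]
    return nash_sutcliffe_efficiency(
        [o for o, _ in pairs], [s for _, s in pairs]
    )
```

    With made-up discharges `observed = [1, 2, 3, 10, 20]` and `simulated = [2, 3, 4, 10, 20]`, the overall NSE is about 0.988, while the NSE for observed flows at or below 3 is -0.5. A model can therefore score well on the general metric and still perform poorly during low flows.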

  15. A Comparison of Exposure Metrics for Traffic-Related Air Pollutants: Application to Epidemiology Studies in Detroit, Michigan

    PubMed Central

    Batterman, Stuart; Burke, Janet; Isakov, Vlad; Lewis, Toby; Mukherjee, Bhramar; Robins, Thomas

    2014-01-01

    Vehicles are major sources of air pollutant emissions, and individuals living near large roads endure high exposures and health risks associated with traffic-related air pollutants. Air pollution epidemiology, health risk, environmental justice, and transportation planning studies would all benefit from an improved understanding of the key information and metrics needed to assess exposures, as well as the strengths and limitations of alternate exposure metrics. This study develops and evaluates several metrics for characterizing exposure to traffic-related air pollutants for the 218 residential locations of participants in the NEXUS epidemiology study conducted in Detroit (MI, USA). Exposure metrics included proximity to major roads, traffic volume, vehicle mix, traffic density, vehicle exhaust emissions density, and pollutant concentrations predicted by dispersion models. Results presented for each metric include comparisons of exposure distributions, spatial variability, intraclass correlation, concordance and discordance rates, and overall strengths and limitations. While showing some agreement, the simple categorical and proximity classifications (e.g., high diesel/low diesel traffic roads and distance from major roads) do not reflect the range and overlap of exposures seen in the other metrics. Information provided by the traffic density metric, defined as the number of vehicle kilometers traveled (VKT) per day within a 300 m buffer around each home, was reasonably consistent with the more sophisticated metrics. Dispersion modeling provided spatially- and temporally-resolved concentrations, along with apportionments that separated concentrations due to traffic emissions and other sources. While several of the exposure metrics showed broad agreement, including traffic density, emissions density and modeled concentrations, these alternatives still produced exposure classifications that differed for a substantial fraction of study participants, e.g., from 20% to 50% of homes, depending on the metric, would be incorrectly classified into “low”, “medium” or “high” traffic exposure classes. These and other results suggest the potential for exposure misclassification and the need for refined and validated exposure metrics. While data and computational demands for dispersion modeling of traffic emissions are non-trivial concerns, once established, dispersion modeling systems can provide exposure information for both on- and near-road environments that would benefit future traffic-related assessments. PMID:25226412

  16. Restaurant Energy Use Benchmarking Guideline

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hedrick, R.; Smith, V.; Field, K.

    2011-07-01

    A significant operational challenge for food service operators is defining energy use benchmark metrics to compare against the performance of individual stores. Without metrics, multiunit operators and managers have difficulty identifying which stores in their portfolios require extra attention to bring their energy performance in line with expectations. This report presents a method whereby multiunit operators may use their own utility data to create suitable metrics for evaluating their operations.

  17. Improving Department of Defense Global Distribution Performance Through Network Analysis

    DTIC Science & Technology

    2016-06-01

    network performance increase. 14. SUBJECT TERMS supply chain metrics, distribution networks, requisition shipping time, strategic distribution database...peace and war” (p. 4). USTRANSCOM Metrics and Analysis Branch defines, develops, tracks, and maintains outcomes-based supply chain metrics to...2014a, p. 8). The Joint Staff defines a TDD standard as the maximum number of days the supply chain can take to deliver requisitioned materiel

  18. State updating and calibration period selection to improve dynamic monthly streamflow forecasts for an environmental flow management application

    NASA Astrophysics Data System (ADS)

    Gibbs, Matthew S.; McInerney, David; Humphrey, Greer; Thyer, Mark A.; Maier, Holger R.; Dandy, Graeme C.; Kavetski, Dmitri

    2018-02-01

    Monthly to seasonal streamflow forecasts provide useful information for a range of water resource management and planning applications. This work focuses on improving such forecasts by considering the following two aspects: (1) state updating to force the models to match observations from the start of the forecast period, and (2) selection of a shorter calibration period that is more representative of the forecast period, compared to a longer calibration period traditionally used. The analysis is undertaken in the context of using streamflow forecasts for environmental flow water management of an open channel drainage network in southern Australia. Forecasts of monthly streamflow are obtained using a conceptual rainfall-runoff model combined with a post-processor error model for uncertainty analysis. This model set-up is applied to two catchments, one with stronger evidence of non-stationarity than the other. A range of metrics are used to assess different aspects of predictive performance, including reliability, sharpness, bias and accuracy. The results indicate that, for most scenarios and metrics, state updating improves predictive performance for both observed rainfall and forecast rainfall sources. Using the shorter calibration period also improves predictive performance, particularly for the catchment with stronger evidence of non-stationarity. The results highlight that a traditional approach of using a long calibration period can degrade predictive performance when there is evidence of non-stationarity. The techniques presented can form the basis for operational monthly streamflow forecasting systems and provide support for environmental decision-making.

  19. Tide or Tsunami? The Impact of Metrics on Scholarly Research

    ERIC Educational Resources Information Center

    Bonnell, Andrew G.

    2016-01-01

    Australian universities are increasingly resorting to the use of journal metrics such as impact factors and ranking lists in appraisal and promotion processes, and are starting to set quantitative "performance expectations" which make use of such journal-based metrics. The widespread use and misuse of research metrics is leading to…

  20. Looking beyond general metrics for model comparison - lessons from an international model intercomparison study

    NASA Astrophysics Data System (ADS)

    de Boer-Euser, Tanja; Bouaziz, Laurène; De Niel, Jan; Brauer, Claudia; Dewals, Benjamin; Drogue, Gilles; Fenicia, Fabrizio; Grelier, Benjamin; Nossent, Jiri; Pereira, Fernando; Savenije, Hubert; Thirel, Guillaume; Willems, Patrick

    2017-01-01

    International collaboration between research institutes and universities is a promising way to reach consensus on hydrological model development. Although model comparison studies are very valuable for international cooperation, they often do not lead to very clear new insights regarding the relevance of the modelled processes. We hypothesise that this is partly caused by model complexity and the comparison methods used, which focus too much on a good overall performance instead of focusing on a variety of specific events. In this study, we use an approach that focuses on the evaluation of specific events and characteristics. Eight international research groups calibrated their hourly model on the Ourthe catchment in Belgium and carried out a validation in time for the Ourthe catchment and a validation in space for nested and neighbouring catchments. The same protocol was followed for each model and an ensemble of best-performing parameter sets was selected. Although the models showed similar performances based on general metrics (e.g. the Nash-Sutcliffe efficiency), clear differences could be observed for specific events. We analysed the hydrographs of these specific events and conducted three types of statistical analyses on the entire time series: cumulative discharges, empirical extreme value distribution of the peak flows and flow duration curves for low flows. The results illustrate the relevance of including a very quick flow reservoir preceding the root zone storage to model peaks during low flows and including a slow reservoir in parallel with the fast reservoir to model the recession for the studied catchments. This intercomparison enhanced the understanding of the hydrological functioning of the catchment, in particular for low flows, and made it possible to identify present knowledge gaps for other parts of the hydrograph. Above all, it helped to evaluate each model against a set of alternative models.

  1. Prediction Models for 30-Day Mortality and Complications After Total Knee and Hip Arthroplasties for Veteran Health Administration Patients With Osteoarthritis.

    PubMed

    Harris, Alex Hs; Kuo, Alfred C; Bowe, Thomas; Gupta, Shalini; Nordin, David; Giori, Nicholas J

    2018-05-01

    Statistical models to preoperatively predict patients' risk of death and major complications after total joint arthroplasty (TJA) could improve the quality of preoperative management and informed consent. Although risk models for TJA exist, they have limitations including poor transparency and/or unknown or poor performance. Thus, it is currently impossible to know how well available models predict short-term complications after TJA, or if newly developed models are more accurate. We sought to develop and conduct cross-validation of predictive risk models, and report details and performance metrics as benchmarks. Over 90 preoperative variables were used as candidate predictors of death and major complications within 30 days for Veterans Health Administration patients with osteoarthritis who underwent TJA. Data were split into 3 samples, for selection of model tuning parameters, model development, and cross-validation. C-indexes (discrimination) and calibration plots were produced. A total of 70,569 patients diagnosed with osteoarthritis who received primary TJA were included. C-statistics and bootstrapped confidence intervals for the cross-validation of the boosted regression models were highest for cardiac complications (0.75; 0.71-0.79) and 30-day mortality (0.73; 0.66-0.79) and lowest for deep vein thrombosis (0.59; 0.55-0.64) and return to the operating room (0.60; 0.57-0.63). Moderately accurate predictive models of 30-day mortality and cardiac complications after TJA in Veterans Health Administration patients were developed and internally cross-validated. By reporting model coefficients and performance metrics, other model developers can test these models on new samples and have a procedure- and indication-specific benchmark to surpass. Published by Elsevier Inc.
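
    The C-statistic reported in this abstract measures discrimination: the probability that a randomly chosen patient who had the event was assigned a higher predicted risk than a randomly chosen patient who did not. A minimal pair-counting sketch for a binary outcome follows. The function name and the input format are assumptions, and the code is not the authors' implementation.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance (C) statistic for a binary outcome.

    predicted_risks: predicted probabilities or risk scores
    outcomes:        1 if the event occurred, 0 otherwise
    Ties in predicted risk count as half-concordant.
    """
    events = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_risk in events:
        for non_event_risk in non_events:
            if event_risk > non_event_risk:
                concordant += 1.0
            elif event_risk == non_event_risk:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. With made-up risks `[0.1, 0.4, 0.35, 0.8]` and outcomes `[0, 0, 1, 1]`, three of the four event/non-event pairs are concordant, giving 0.75.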

  2. A new definition of pharmaceutical quality: assembly of a risk simulation platform to investigate the impact of manufacturing/product variability on clinical performance.

    PubMed

    Short, Steven M; Cogdill, Robert P; D'Amico, Frank; Drennen, James K; Anderson, Carl A

    2010-12-01

    The absence of a unanimous, industry-specific definition of quality is, to a certain degree, impeding the progress of ongoing efforts to "modernize" the pharmaceutical industry. This work was predicated on requests by Dr. Woodcock (FDA) to re-define pharmaceutical quality in terms of risk by linking production characteristics to clinical attributes. A risk simulation platform that integrates population statistics, drug delivery system characteristics, dosing guidelines, patient compliance estimates, production metrics, and pharmacokinetic, pharmacodynamic, and in vitro-in vivo correlation models to investigate the impact of manufacturing variability on clinical performance of a model extended-release theophylline solid oral dosage system was developed. Manufacturing was characterized by inter- and intra-batch content uniformity and dissolution variability metrics, while clinical performance was described by a probabilistic pharmacodynamic model that expressed the probability of inefficacy and toxicity as a function of plasma concentrations. Least-squares regression revealed that both patient compliance variables, percent of doses taken and dosing time variability, significantly impacted efficacy and toxicity. Additionally, intra-batch content uniformity variability elicited a significant change in risk scores for the two adverse events and, therefore, was identified as a critical quality attribute. The proposed methodology demonstrates that pharmaceutical quality can be recast to explicitly reflect clinical performance. © 2010 Wiley-Liss, Inc. and the American Pharmacists Association

  3. Liver sharing and organ procurement organization performance.

    PubMed

    Gentry, Sommer E; Chow, Eric K H; Massie, Allan; Luo, Xun; Zaun, David; Snyder, Jon J; Israni, Ajay K; Kasiske, Bert; Segev, Dorry L

    2015-03-01

    Whether the liver allocation system shifts organs from better performing organ procurement organizations (OPOs) to poorer performing OPOs has been debated for many years. Models of OPO performance from the Scientific Registry of Transplant Recipients make it possible to study this question in a data-driven manner. We investigated whether each OPO's net liver import was correlated with 2 performance metrics [observed to expected (O:E) liver yield and liver donor conversion ratio] as well as 2 alternative explanations [eligible deaths and incident listings above a Model for End-Stage Liver Disease (MELD) score of 15]. We found no evidence to support the hypothesis that the allocation system transfers livers from better performing OPOs to centers with poorer performing OPOs. Also, having fewer eligible deaths was not associated with a net import. However, having more incident listings was strongly correlated with the net import, both before and after Share 35. Most importantly, the magnitude of the variation in OPO performance was much lower than the variation in demand: although the poorest performing OPOs differed from the best ones by less than 2-fold in the O:E liver yield, incident listings above a MELD score of 15 varied nearly 14-fold. Although it is imperative that all OPOs achieve the best possible results, the flow of livers is not explained by OPO performance metrics, and instead, it appears to be strongly related to differences in demand. © 2015 American Association for the Study of Liver Diseases.

  4. On Railroad Tank Car Puncture Performance: Part I - Considering Metrics

    DOT National Transportation Integrated Search

    2016-04-12

    This paper is the first in a two-part series on the puncture performance of railroad tank cars carrying hazardous materials in the event of an accident. Various metrics are often mentioned in the open literature to characterize the structural perform...

  5. Tracking occupational hearing loss across global industries: A comparative analysis of metrics

    PubMed Central

    Rabinowitz, Peter M.; Galusha, Deron; McTague, Michael F.; Slade, Martin D.; Wesdock, James C.; Dixon-Ernst, Christine

    2013-01-01

    Occupational hearing loss is one of the most prevalent occupational conditions; yet, there is no acknowledged international metric to allow comparisons of risk between different industries and regions. In order to make recommendations for an international standard of occupational hearing loss, members of an international industry group (the International Aluminium Association) submitted details of different hearing loss metrics currently in use by members. We compared the performance of these metrics using an audiometric data set for over 6000 individuals working in 10 locations of one member company. We calculated rates for each metric at each location from 2002 to 2006. For comparison, we calculated the difference of observed–expected (for age) binaural high frequency hearing loss (in dB/year) for each location over the same time period. We performed linear regression to determine the correlation between each metric and the observed–expected rate of hearing loss. The different metrics produced discrepant results, with annual rates ranging from 0.0% for a less-sensitive metric to more than 10% for a highly sensitive metric. At least two metrics, a 10 dB age-corrected threshold shift from baseline and a 15 dB nonage-corrected shift metric, correlated well with the difference of observed–expected high-frequency hearing loss. This study suggests that it is feasible to develop an international standard for tracking occupational hearing loss in industrial working populations. PMID:22387709

  6. Do Your Students Measure Up Metrically?

    ERIC Educational Resources Information Center

    Taylor, P. Mark; Simms, Ken; Kim, Ok-Kyeong; Reys, Robert E.

    2001-01-01

    Examines released metric items from the Third International Mathematics and Science Study (TIMSS) and the 3rd and 4th grade results. Recommends refocusing instruction on the metric system to improve student performance in measurement. (KHR)

  7. Three validation metrics for automated probabilistic image segmentation of brain tumours

    PubMed Central

    Zou, Kelly H.; Wells, William M.; Kikinis, Ron; Warfield, Simon K.

    2005-01-01

    SUMMARY The validity of brain tumour segmentation is an important issue in image processing because it has a direct impact on surgical planning. We examined the segmentation accuracy based on three two-sample validation metrics against the estimated composite latent gold standard, which was derived from several experts’ manual segmentations by an EM algorithm. The distribution functions of the tumour and control pixel data were parametrically assumed to be a mixture of two beta distributions with different shape parameters. We estimated the corresponding receiver operating characteristic curve, Dice similarity coefficient, and mutual information, over all possible decision thresholds. Based on each validation metric, an optimal threshold was then computed via maximization. We illustrated these methods on MR imaging data from nine brain tumour cases of three different tumour types, each consisting of a large number of pixels. The automated segmentation yielded satisfactory accuracy with varied optimal thresholds. The performances of these validation metrics were also investigated via Monte Carlo simulation. Extensions of incorporating spatial correlation structures using a Markov random field model were considered. PMID:15083482
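
    One of the three validation metrics named here, the Dice similarity coefficient, together with the idea of choosing a decision threshold that maximizes it, can be sketched as follows. This is an empirical illustration on flat pixel lists. It does not reproduce the parametric beta-mixture estimation or the EM-derived gold standard described in the abstract, and all names are assumptions.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for binary masks
    given as equal-length sequences of truthy/falsy pixel labels."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * overlap / (size_a + size_b)


def best_dice_threshold(probabilities, reference_mask, thresholds):
    """Return the candidate threshold whose binarized segmentation has the
    highest Dice coefficient against the reference mask."""
    return max(
        thresholds,
        key=lambda t: dice_coefficient(
            [p >= t for p in probabilities], reference_mask
        ),
    )
```

    For made-up tumour probabilities `[0.9, 0.8, 0.4, 0.2, 0.1]` and reference labels `[1, 1, 1, 0, 0]`, the candidate thresholds 0.3, 0.5 and 0.85 give Dice values of 1.0, 0.8 and 0.5, so 0.3 is selected.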

  8. Why choice of metric matters in public health analyses: a case study of the attribution of credit for the decline in coronary heart disease mortality in the US and other populations.

    PubMed

    Gouda, Hebe N; Critchley, Julia; Powles, John; Capewell, Simon

    2012-01-28

    Reasons for the widespread declines in coronary heart disease (CHD) mortality in high income countries are controversial. Here we explore how the type of metric chosen for the analyses of these declines affects the answer obtained. The analyses we reviewed were performed using IMPACT, a large Excel-based model of the determinants of temporal change in mortality from CHD. Assessments of the decline in CHD mortality in the USA between 1980 and 2000 served as the central case study. Analyses based on the metric of number of deaths prevented attributed about half the decline to treatments (including preventive medications) and half to favourable shifts in risk factors. However, when mortality change was expressed in the metric of life-years-gained, the share attributed to risk factor change rose to 65%. This happened because risk factor changes were modelled as slowing disease progression, such that the hypothetical deaths averted resulted in longer average remaining lifetimes gained than the deaths averted by better treatments. This result was robust to a range of plausible assumptions on the relative effect sizes of changes in treatments and risk factors. Time-based metrics (such as life years) are generally preferable because they direct attention to the changes in the natural history of disease that are produced by changes in key health determinants. The life-years attached to each death averted will also weight deaths in a way that better reflects social preferences.

  9. Evaluation of image deblurring methods via a classification metric

    NASA Astrophysics Data System (ADS)

    Perrone, Daniele; Humphreys, David; Lamb, Robert A.; Favaro, Paolo

    2012-09-01

    The performance of single image deblurring algorithms is typically evaluated via a certain discrepancy measure between the reconstructed image and the ideal sharp image. The choice of metric, however, has been a source of debate and has also led to alternative metrics based on human visual perception. While fixed metrics may fail to capture some small but visible artifacts, perception-based metrics may favor reconstructions with artifacts that are visually pleasant. To overcome these limitations, we propose to assess the quality of reconstructed images via a task-driven metric. In this paper we consider object classification as the task and therefore use the rate of classification as the metric to measure deblurring performance. In our evaluation we use data with different types of blur in two cases: Optical Character Recognition (OCR), where the goal is to recognise characters in a black and white image, and object classification with no restrictions on pose, illumination and orientation. Finally, we show how off-the-shelf classification algorithms benefit from working with deblurred images.

  10. Speckle pattern sequential extraction metric for estimating the focus spot size on a remote diffuse target.

    PubMed

    Yu, Zhan; Li, Yuanyang; Liu, Lisheng; Guo, Jin; Wang, Tingfeng; Yang, Guoqing

    2017-11-10

    The speckle pattern (line by line) sequential extraction (SPSE) metric is proposed based on the one-dimensional speckle intensity level crossing theory. Through the sequential extraction of received speckle information, speckle metrics for estimating the variation of the focusing spot size on a remote diffuse target are obtained. Based on simulation, we discuss the range of application of the SPSE metric under theoretical conditions and how the aperture size of the observation system affects the metric's performance. The results of the analyses are verified by experiment. The method applies to the detection of relatively static targets (speckle jitter frequency lower than the CCD sampling frequency). The SPSE metric can determine the variation of the focusing spot size over a long distance; moreover, under some conditions the metric can estimate the spot size itself. Therefore, monitoring and feedback of the far-field spot can be implemented in laser focusing system applications to help the system optimize its focusing performance.

  11. Magnetic Resonance Imaging of Intracranial Hypotension: Diagnostic Value of Combined Qualitative Signs and Quantitative Metrics.

    PubMed

    Aslan, Kerim; Gunbey, Hediye Pinar; Tomak, Leman; Ozmen, Zafer; Incesu, Lutfi

    The aim of this study was to investigate whether combining quantitative metrics (mamillopontine distance [MPD], pontomesencephalic angle, and mesencephalon anterior-posterior/medial-lateral diameter ratio) with qualitative signs (dural enhancement, subdural collections/hematoma, venous engorgement, pituitary gland enlargement, and tonsillar herniation) provides a more accurate diagnosis of intracranial hypotension (IH). The quantitative metrics and qualitative signs of 34 patients and 34 control subjects were assessed by 2 independent observers. Receiver operating characteristic (ROC) curves were used to evaluate the diagnostic performance of the quantitative metrics and qualitative signs, and optimum cutoff values of the quantitative metrics for the diagnosis of IH were found with ROC analysis. Combined ROC curves were computed for combinations of quantitative metrics and qualitative signs to determine diagnostic accuracy; sensitivity, specificity, and positive and negative predictive values were calculated; and the best model combination was formed. Whereas MPD and pontomesencephalic angle were significantly lower in patients with IH than in the control group (P < 0.001), the mesencephalon anterior-posterior/medial-lateral diameter ratio was significantly higher (P < 0.001). For qualitative signs, the highest individual discriminative power was that of dural enhancement, with an area under the ROC curve (AUC) of 0.838. For quantitative metrics, the highest individual discriminative power was that of MPD, with an AUC of 0.947. The best accuracy in the diagnosis of IH was obtained by the combination of dural enhancement, venous engorgement, and MPD, with an AUC of 1.00. This study showed that the combined use of dural enhancement, venous engorgement, and MPD had a diagnostic accuracy of 100% for the diagnosis of IH. Therefore, a more accurate IH diagnosis can be provided by combining quantitative metrics with qualitative signs.
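
    The diagnostic quantities named in this abstract (sensitivity, specificity, positive and negative predictive values, accuracy) all derive from a 2x2 table of test result against true status. A minimal sketch under that standard definition follows. The function name is an assumption, and the counts in the usage note are invented for illustration, not the study's data.

```python
def diagnostic_summary(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: patients with the condition who test positive/negative
    tn/fp: controls who test negative/positive
    Assumes every denominator is non-zero (no empty row or column).
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

    With invented counts `tp=8, fp=1, tn=9, fn=2`, sensitivity is 0.8, specificity is 0.9 and accuracy is 0.85. Sweeping the cutoff of a quantitative metric and recomputing sensitivity and specificity at each cutoff traces out the ROC curve from which an optimum cutoff is chosen.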

  12. Simulated Models Suggest That Price per Calorie Is the Dominant Price Metric That Low-Income Individuals Use for Food Decision Making

    PubMed Central

    2016-01-01

    Background: The price of food has long been considered one of the major factors that affects food choices. However, the price metric (e.g., the price of food per calorie or the price of food per gram) that individuals predominantly use when making food choices is unclear. Understanding which price metric is used is especially important for studying individuals with severe budget constraints because food price then becomes even more important in food choice. Objective: We assessed which price metric is used by low-income individuals in deciding what to eat. Methods: With the use of data from NHANES and the USDA Food and Nutrient Database for Dietary Studies, we created an agent-based model that simulated an environment representing the US population, wherein individuals were modeled as agents with a specific weight, age, and income. In our model, agents made dietary food choices while meeting their budget limits with the use of 1 of 3 different metrics for decision making: energy cost (price per calorie), unit price (price per gram), and serving price (price per serving). The food consumption patterns generated by our model were compared to 3 independent data sets. Results: The food choice behaviors observed in 2 of the data sets were found to be closest to the simulated dietary patterns generated by the price per calorie metric. The behaviors observed in the third data set were equidistant from the patterns generated by price per calorie and price per serving metrics, whereas results generated by the price per gram metric were further away. Conclusions: Our simulations suggest that dietary food choice based on price per calorie best matches actual consumption patterns and may therefore be the most salient price metric for low-income populations. PMID:27655757

  13. Simulated Models Suggest That Price per Calorie Is the Dominant Price Metric That Low-Income Individuals Use for Food Decision Making.

    PubMed

    Beheshti, Rahmatollah; Igusa, Takeru; Jones-Smith, Jessica

    2016-11-01

    The price of food has long been considered one of the major factors that affects food choices. However, the price metric (e.g., the price of food per calorie or the price of food per gram) that individuals predominantly use when making food choices is unclear. Understanding which price metric is used is especially important for studying individuals with severe budget constraints because food price then becomes even more important in food choice. We assessed which price metric is used by low-income individuals in deciding what to eat. With the use of data from NHANES and the USDA Food and Nutrient Database for Dietary Studies, we created an agent-based model that simulated an environment representing the US population, wherein individuals were modeled as agents with a specific weight, age, and income. In our model, agents made dietary food choices while meeting their budget limits with the use of 1 of 3 different metrics for decision making: energy cost (price per calorie), unit price (price per gram), and serving price (price per serving). The food consumption patterns generated by our model were compared to 3 independent data sets. The food choice behaviors observed in 2 of the data sets were found to be closest to the simulated dietary patterns generated by the price per calorie metric. The behaviors observed in the third data set were equidistant from the patterns generated by price per calorie and price per serving metrics, whereas results generated by the price per gram metric were further away. Our simulations suggest that dietary food choice based on price per calorie best matches actual consumption patterns and may therefore be the most salient price metric for low-income populations. © 2016 American Society for Nutrition.
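
    The three price metrics compared in this abstract can rank the same foods differently, which is why the choice of metric matters in the simulation. The sketch below computes each metric for a food item and picks the cheapest item under a given metric. The function names, the dictionary layout, and the food values in the usage note are hypothetical and are not drawn from NHANES or the USDA database.

```python
def price_metrics(price, grams, kcal, servings):
    """Energy cost, unit price and serving price for one food item."""
    return {
        "per_calorie": price / kcal,      # energy cost
        "per_gram": price / grams,        # unit price
        "per_serving": price / servings,  # serving price
    }


def cheapest_food(foods, metric):
    """Name of the food with the lowest value of `metric`.

    foods: mapping of name -> dict with keys price, grams, kcal, servings
    """
    return min(foods, key=lambda name: price_metrics(**foods[name])[metric])
```

    With two invented items, `rice` (price 2.0, 1000 g, 3600 kcal, 20 servings) and `apples` (price 3.0, 2000 g, 1040 kcal, 8 servings), rice is cheapest per calorie and per serving while apples are cheapest per gram. An agent minimizing one metric would therefore buy a different basket than an agent minimizing another.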

  14. Greenroads : a sustainability performance metric for roadway design and construction.

    DOT National Transportation Integrated Search

    2009-11-01

    Greenroads is a performance metric for quantifying sustainable practices associated with roadway design and construction. Sustainability is defined as having seven key components: ecology, equity, economy, extent, expectations, experience and exposur...

  15. Performance metrics used by freight transport providers.

    DOT National Transportation Integrated Search

    2008-09-30

    The newly-established National Cooperative Freight Research Program (NCFRP) has allocated $300,000 in funding to a project entitled Performance Metrics for Freight Transportation (NCFRP 03). The project is scheduled for completion in September ...

  16. Irregular large-scale computed tomography on multiple graphics processors improves energy-efficiency metrics for industrial applications

    NASA Astrophysics Data System (ADS)

    Jimenez, Edward S.; Goodman, Eric L.; Park, Ryeojin; Orr, Laurel J.; Thompson, Kyle R.

    2014-09-01

    This paper will investigate energy efficiency for various real-world industrial computed-tomography reconstruction algorithms, both CPU- and GPU-based implementations. This work shows that the energy required for a given reconstruction is based on performance and problem size. There are many ways to describe performance and energy efficiency; thus, this work will investigate multiple metrics including performance-per-watt, energy-delay product, and energy consumption. This work found that irregular GPU-based approaches realized tremendous savings in energy consumption when compared to CPU implementations while also significantly improving the performance-per-watt and energy-delay product metrics. Additional energy savings and other metric improvements were realized on the GPU-based reconstructions by improving storage I/O through a parallel MIMD-like modularization of the compute and I/O tasks.
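
    The three metrics named in this record can be sketched from their usual textbook definitions: energy is average power times runtime, performance-per-watt is throughput divided by average power, and energy-delay product is energy times runtime. The function name, the notion of "work units", and the CPU/GPU numbers below are assumptions for illustration only and are not measurements from the paper.

```python
def energy_metrics(work_units, runtime_s, avg_power_w):
    """Compute energy consumption, performance-per-watt, and energy-delay product.

    work_units  : amount of work done (e.g. voxels reconstructed); hypothetical unit
    runtime_s   : wall-clock time in seconds
    avg_power_w : average power draw in watts
    """
    energy_j = avg_power_w * runtime_s     # joules
    throughput = work_units / runtime_s    # work units per second
    return {
        "energy_j": energy_j,
        "perf_per_watt": throughput / avg_power_w,
        "energy_delay_product": energy_j * runtime_s,
    }


# Invented numbers: the GPU draws more power but finishes ten times sooner.
cpu = energy_metrics(work_units=1000.0, runtime_s=100.0, avg_power_w=200.0)
gpu = energy_metrics(work_units=1000.0, runtime_s=10.0, avg_power_w=400.0)

print(cpu)
print(gpu)
```

    In this made-up case the GPU uses less total energy despite its higher power draw, because the shorter runtime dominates; the energy-delay product rewards that speedup a second time.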

  17. Evaluation of a Tactical Surface Metering Tool for Charlotte Douglas International Airport via Human-in-the-Loop Simulation

    NASA Technical Reports Server (NTRS)

    Verma, Savita; Lee, Hanbong; Dulchinos, Victoria L.; Martin, Lynne; Stevens, Lindsay; Jung, Yoon; Chevalley, Eric; Jobe, Kim; Parke, Bonny

    2017-01-01

    NASA has been working with the FAA and aviation industry partners to develop and demonstrate new concepts and technologies that integrate arrival, departure, and surface traffic management capabilities. In March 2017, NASA conducted a human-in-the-loop (HITL) simulation for integrated surface and airspace operations, modeling Charlotte Douglas International Airport, to evaluate the operational procedures and information requirements for the tactical surface metering tool, and data exchange elements between the airline controlled ramp and ATC Tower. In this paper, we focus on the calibration of the tactical surface metering tool using various metrics measured from the HITL simulation results. Key performance metrics include gate hold times from pushback advisories, taxi-in/out times, runway throughput, and departure queue size. Subjective metrics presented in this paper include workload, situational awareness, and acceptability of the metering tool and its calibration.

  18. Evaluation of a Tactical Surface Metering Tool for Charlotte Douglas International Airport Via Human-in-the-Loop Simulation

    NASA Technical Reports Server (NTRS)

    Verma, Savita; Lee, Hanbong; Martin, Lynne; Stevens, Lindsay; Jung, Yoon; Dulchinos, Victoria; Chevalley, Eric; Jobe, Kim; Parke, Bonny

    2017-01-01

    NASA has been working with the FAA and aviation industry partners to develop and demonstrate new concepts and technologies that integrate arrival, departure, and surface traffic management capabilities. In March 2017, NASA conducted a human-in-the-loop (HITL) simulation for integrated surface and airspace operations, modeling Charlotte Douglas International Airport, to evaluate the operational procedures and information requirements for the tactical surface metering tool, and data exchange elements between the airline controlled ramp and ATC Tower. In this paper, we focus on the calibration of the tactical surface metering tool using various metrics measured from the HITL simulation results. Key performance metrics include gate hold times from pushback advisories, taxi-in/out times, runway throughput, and departure queue size. Subjective metrics presented in this paper include workload, situational awareness, and acceptability of the metering tool and its calibration.

  19. Computer-Aided TRIZ Ideality and Level of Invention Estimation Using Natural Language Processing and Machine Learning

    NASA Astrophysics Data System (ADS)

    Adams, Christopher; Tate, Derrick

    Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design’s value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.

  20. Throughput analysis for the National Airspace System

    NASA Astrophysics Data System (ADS)

    Sureshkumar, Chandrasekar

    The United States National Airspace System (NAS) network performance is currently measured using a variety of metrics based on delay. Developments in the fields of wireless communication, manufacturing and other modes of transportation like road, freight, etc. have explored various metrics that complement the delay metric. In this work, we develop a throughput concept for both the terminal and en-route phases of flight inspired by studies in the above areas and explore the applications of throughput metrics for the en-route airspace of the NAS. These metrics can be applied to the NAS performance at each hierarchical level—the sector, center, regional and national and will consist of multiple layers of networks with the bottom level comprising the traffic pattern modelled as a network of individual sectors acting as nodes. This hierarchical approach is especially suited for executive level decision making as it gives an overall picture of not just the inefficiencies but also the aspects where the NAS has performed well in a given situation from which specific information about the effects of a policy change on the NAS performance at each level can be determined. These metrics are further validated with real traffic data using the Future Air Traffic Management Concepts Evaluation Tool (FACET) for three en-route sectors and an Air Route Traffic Control Center (ARTCC). Further, this work proposes a framework to compute the minimum makespan and the capacity of a runway system in any configuration. Towards this, an algorithm for optimal arrival and departure flight sequencing is proposed. The proposed algorithm is based on a branch-and-bound technique and allows for the efficient computation of the best runway assignment and sequencing of arrival and departure operations that minimize the makespan at a given airport. 
The lower and upper bounds of the cost of each branch for the best-first search in the branch-and-bound algorithm are computed based on the minimum separation standards between arrival and departure operations set by the Federal Aviation Administration. The optimal objective value is mathematically proved to lie between these bounds, and the algorithm uses these bounds to efficiently find promising branches, discard all others, and terminate with at least one sequence with the minimal makespan. The proposed algorithm is analyzed and validated through real traffic operations data at the Hartsfield-Jackson Atlanta International Airport.

  1. Photogrammetry using Apollo 16 orbital photography, part B

    NASA Technical Reports Server (NTRS)

    Wu, S. S. C.; Schafer, F. J.; Jordan, R.; Nakata, G. M.

    1972-01-01

    Discussion is made of the Apollo 15 and 16 metric and panoramic cameras which provided photographs for accurate topographic portrayal of the lunar surface using photogrammetric methods. Nine stereoscopic models of Apollo 16 metric photographs and three models of panoramic photographs were evaluated photogrammetrically in support of the Apollo 16 geologic investigations. Four of the models were used to collect profile data for crater morphology studies; three models were used to collect evaluation data for the frequency distributions of lunar slopes; one model was used to prepare a map of the Apollo 16 traverse area; and one model was used to determine elevations of the Cayley Formation. The remaining three models were used to test photogrammetric techniques using oblique metric and panoramic camera photographs. Two preliminary contour maps were compiled and a high-oblique metric photograph was rectified.

  2. Quality assessment for color reproduction using a blind metric

    NASA Astrophysics Data System (ADS)

    Bringier, B.; Quintard, L.; Larabi, M.-C.

    2007-01-01

    This paper deals with image quality assessment, a field that now plays an important role in various image processing applications. A number of objective image quality metrics, which may or may not correlate with subjective quality, have been developed during the last decade. Two categories of metrics can be distinguished: full-reference and no-reference. A full-reference metric tries to evaluate the distortion introduced to an image with respect to the reference. A no-reference approach attempts to model the judgment of image quality in a blind way. Unfortunately, a universal image quality model is not on the horizon, and empirical models established through psychophysical experimentation are generally used. In this paper, we focus only on the second category to evaluate the quality of color reproduction, and a blind metric based on human visual system modeling is introduced. The objective results are validated by single-media and cross-media subjective tests.

  3. Comparing NEO Search Telescopes

    NASA Astrophysics Data System (ADS)

    Myhrvold, Nathan

    2016-04-01

    Multiple terrestrial and space-based telescopes have been proposed for detecting and tracking near-Earth objects (NEOs). Detailed simulations of the search performance of these systems have used complex computer codes that are not widely available, which hinders accurate cross-comparison of the proposals and obscures whether they have consistent assumptions. Moreover, some proposed instruments would survey infrared (IR) bands, whereas others would operate in the visible band, and differences among asteroid thermal and visible-light models used in the simulations further complicate like-to-like comparisons. I use simple physical principles to estimate basic performance metrics for the ground-based Large Synoptic Survey Telescope and three space-based instruments: Sentinel, NEOCam, and a Cubesat constellation. The performance is measured against two different NEO distributions, the Bottke et al. distribution of general NEOs, and the Veres et al. distribution of Earth-impacting NEOs. The results of the comparison show simplified relative performance metrics, including the expected number of NEOs visible in the search volumes and the initial detection rates expected for each system. Although these simplified comparisons do not capture all of the details, they give considerable insight into the physical factors limiting performance. Multiple asteroid thermal models are considered, including FRM, NEATM, and a new generalized form of FRM. I describe issues with how IR albedo and emissivity have been estimated in previous studies, which may render them inaccurate. A thermal model for tumbling asteroids is also developed and suggests that tumbling asteroids may be surprisingly difficult for IR telescopes to observe.

  4. Iterative methods for mixed finite element equations

    NASA Technical Reports Server (NTRS)

    Nakazawa, S.; Nagtegaal, J. C.; Zienkiewicz, O. C.

    1985-01-01

    Iterative strategies for the solution of indefinite systems of equations arising from the mixed finite element method are investigated in this paper, with application to linear and nonlinear problems in solid and structural mechanics. The augmented Hu-Washizu form is derived and then used to construct a family of iterative algorithms using the displacement method as the preconditioner. Two types of iterative algorithms are implemented: constant metric iterations, which do not involve updating the preconditioner, and variable metric iterations, in which the inverse of the preconditioning matrix is updated. A series of numerical experiments is conducted to evaluate the numerical performance with application to linear and nonlinear model problems.

  5. Implementation of the Automated Numerical Model Performance Metrics System

    DTIC Science & Technology

    2011-09-26

    question. As of this writing, the DSRC IBM AIX machines DaVinci and Pascal, and the Cray XT Einstein all use the PBS batch queuing system for...3.3). 12 Appendix A – General Automation System This system provides general purpose tools and a general way to automatically run

  6. Validating the use of MODIS time series for salinity assessment over agricultural soils in California, USA

    USDA-ARS's Scientific Manuscript database

    Testing soil salinity assessment methodologies over different regions is important for future continental and global scale applications. A novel regional-scale soil salinity modeling approach using plant-performance metrics was proposed by Zhang et al. (2015) for farmland in the Yellow River Delta, ...

  7. Measuring strategic success.

    PubMed

    Gish, Ryan

    2002-08-01

    Strategic triggers and metrics help healthcare providers achieve financial success. Metrics help assess progress toward long-term goals. Triggers signal market changes requiring a change in strategy. All metrics may not move in concert. Organizations need to identify indicators, monitor performance.

  8. Cognitive context detection in UAS operators using eye-gaze patterns on computer screens

    NASA Astrophysics Data System (ADS)

    Mannaru, Pujitha; Balasingam, Balakumar; Pattipati, Krishna; Sibley, Ciara; Coyne, Joseph

    2016-05-01

    In this paper, we demonstrate the use of eye-gaze metrics of unmanned aerial systems (UAS) operators as effective indices of their cognitive workload. Our analyses are based on an experiment where twenty participants performed pre-scripted UAS missions of three different difficulty levels by interacting with two custom designed graphical user interfaces (GUIs) that are displayed side by side. First, we compute several eye-gaze metrics, traditional eye movement metrics as well as newly proposed ones, and analyze their effectiveness as cognitive classifiers. Most of the eye-gaze metrics are computed by dividing the computer screen into "cells". Then, we perform several analyses in order to select metrics for effective cognitive context classification related to our specific application; the objectives of these analyses are to (i) identify appropriate ways to divide the screen into cells; (ii) select appropriate metrics for training and classification of cognitive features; and (iii) identify a suitable classification method.

  9. Text Authorship Identified Using the Dynamics of Word Co-Occurrence Networks

    PubMed Central

    Akimushkin, Camilo; Amancio, Diego Raphael; Oliveira, Osvaldo Novais

    2017-01-01

    Automatic identification of authorship in disputed documents has benefited from complex network theory as this approach does not require human expertise or detailed semantic knowledge. Networks modeling entire books can be used to discriminate texts from different sources and understand network growth mechanisms, but only a few studies have probed the suitability of networks in modeling small chunks of text to grasp stylistic features. In this study, we introduce a methodology based on the dynamics of word co-occurrence networks representing written texts to classify a corpus of 80 texts by 8 authors. The texts were divided into sections with equal number of linguistic tokens, from which time series were created for 12 topological metrics. Since 73% of all series were stationary (ARIMA(p, 0, q)) and the remaining were integrable of first order (ARIMA(p, 1, q)), probability distributions could be obtained for the global network metrics. The metrics exhibit bell-shaped non-Gaussian distributions, and therefore distribution moments were used as learning attributes. With an optimized supervised learning procedure based on a nonlinear transformation performed by Isomap, 71 out of 80 texts were correctly classified using the K-nearest neighbors algorithm, i.e. a remarkable 88.75% author matching success rate was achieved. Hence, purely dynamic fluctuations in network metrics can characterize authorship, thus paving the way for a robust description of large texts in terms of small evolving networks. PMID:28125703

  10. Measure of robustness for complex networks

    NASA Astrophysics Data System (ADS)

    Youssef, Mina Nabil

    Critical infrastructures are repeatedly attacked by external triggers, causing tremendous damage. Any infrastructure can be studied using the powerful theory of complex networks. A complex network is composed of an extremely large number of different elements that exchange commodities, providing significant services. The main functions of complex networks can be damaged by different types of attacks and failures that degrade the network performance. These attacks and failures are considered as disturbing dynamics, such as the spread of viruses in computer networks, the spread of epidemics in social networks, and the cascading failures in power grids. Depending on the network structure and the attack strength, every network suffers damage and performance degradation differently. Hence, quantifying the robustness of complex networks becomes an essential task. In this dissertation, new metrics are introduced to measure the robustness of technological and social networks with respect to the spread of epidemics, and the robustness of power grids with respect to cascading failures. First, we introduce a new metric called the Viral Conductance (VCSIS) to assess the robustness of networks with respect to the spread of epidemics that are modeled through the susceptible/infected/susceptible (SIS) epidemic approach. In contrast to assessing the robustness of networks based on a classical metric, the epidemic threshold, the new metric integrates the fraction of infected nodes at steady state for all possible effective infection strengths. Through examples, VCSIS provides more insights about the robustness of networks than the epidemic threshold. In addition, both the paradoxical robustness of Barabasi-Albert preferential attachment networks and the effect of the topology on the steady state infection are studied, to show the importance of quantifying the robustness of networks. 
Second, a new metric VCSIR is introduced to assess the robustness of networks with respect to the spread of susceptible/infected/recovered (SIR) epidemics. To compute VCSIR, we propose a novel individual-based approach to model the spread of SIR epidemics in networks, which captures the infection size for a given effective infection rate. Thus, VCSIR quantitatively integrates the infection strength with the corresponding infection size. To optimize the VCSIR metric, a new mitigation strategy is proposed, based on a temporary reduction of contacts in social networks. The social contact network is modeled as a weighted graph that describes the frequency of contacts among the individuals. Thus, we consider the spread of an epidemic as a dynamical system, and the total number of infection cases as the state of the system, while the weight reduction in the social network is the control variable that slows or reduces the spread of epidemics. Using optimal control theory, the obtained solution represents an optimal adaptive weighted network defined over a finite time interval. Moreover, given the high complexity of the optimization problem, we propose two heuristics to find near-optimal solutions by reducing the contacts among the individuals in a decentralized way. Finally, the cascading failures that can take place in power grids and have recently caused several blackouts are studied. We propose a new metric to assess the robustness of the power grid with respect to the cascading failures. The power grid topology is modeled as a network, which consists of nodes and links representing power substations and transmission lines, respectively. We also propose an optimal islanding strategy to protect the power grid when a cascading failure event takes place in the grid. The robustness metrics are numerically evaluated using real and synthetic networks to quantify their robustness with respect to disturbing dynamics. 
We show that the proposed metrics outperform the classical metrics in quantifying the robustness of networks and the efficiency of the mitigation strategies. In summary, our work advances the network science field in assessing the robustness of complex networks with respect to various disturbing dynamics.

  11. Development and Validity of a Silicone Renal Tumor Model for Robotic Partial Nephrectomy Training.

    PubMed

    Monda, Steven M; Weese, Jonathan R; Anderson, Barrett G; Vetter, Joel M; Venkatesh, Ramakrishna; Du, Kefu; Andriole, Gerald L; Figenshau, Robert S

    2018-04-01

    To provide a training tool to address the technical challenges of robot-assisted laparoscopic partial nephrectomy, we created silicone renal tumor models using 3-dimensional printed molds of a patient's kidney with a mass. In this study, we assessed the face, content, and construct validity of these models. Surgeons of different training levels completed 4 simulations on silicone renal tumor models. Participants were surveyed on the usefulness and realism of the model as a training tool. Performance was measured using operation-specific metrics, self-reported operative demands (NASA Task Load Index [NASA TLX]), and blinded expert assessment (Global Evaluative Assessment of Robotic Surgeons [GEARS]). Twenty-four participants included attending urologists, endourology fellows, urology residents, and medical students. Post-training surveys of expert participants yielded mean results of 79.2 on the realism of the model's overall feel and 90.2 on the model's overall usefulness for training. Renal artery clamp times and GEARS scores were significantly better in surgeons further in training (P ≤.005 and P ≤.025). Renal artery clamp times, preserved renal parenchyma, positive margins, NASA TLX, and GEARS scores were all found to improve across trials (P <.001, P = .025, P = .024, P ≤.020, and P ≤.006, respectively). Face, content, and construct validity were demonstrated in the use of a silicone renal tumor model in a cohort of surgeons of different training levels. Expert participants deemed the model useful and realistic. Surgeons of higher training levels performed better than less experienced surgeons in various study metrics, and improvements within individuals were observed over sequential trials. Future studies should aim to assess model predictive validity, namely, the association between model performance improvements and improvements in live surgery. Copyright © 2018 Elsevier Inc. All rights reserved.

  12. The discrepancy between social isolation and loneliness as a clinically meaningful metric: findings from the Irish and English longitudinal studies of ageing (TILDA and ELSA).

    PubMed

    McHugh, J E; Kenny, R A; Lawlor, B A; Steptoe, A; Kee, F

    2017-06-01

    Scant evidence is available on the discordance between loneliness and social isolation among older adults. We aimed to investigate this discordance and any health implications that it may have. Using nationally representative datasets from ageing cohorts in Ireland (TILDA) and England (ELSA), we created a metric of discordance between loneliness and social isolation, to which we refer as Social Asymmetry. This metric was the categorised difference between standardised scores on a scale of loneliness and a scale of social isolation, giving categories of: Concordantly Lonely and Isolated, Discordant: Robust to Loneliness, or Discordant: Susceptible to Loneliness. We used regression and multilevel modelling to identify potential relationships between Social Asymmetry and cognitive outcomes. Social Asymmetry predicted cognitive outcomes cross-sectionally and at a two-year follow-up, such that Discordant: Robust to Loneliness individuals were superior performers, but we failed to find evidence for Social Asymmetry as a predictor of cognitive trajectory over time. We present a new metric and preliminary evidence of a relationship with clinical outcomes. Further research validating this metric in different populations, and evaluating its relationship with other outcomes, is warranted. Copyright © 2016 John Wiley & Sons, Ltd.

  13. Designing Industrial Networks Using Ecological Food Web Metrics.

    PubMed

    Layton, Astrid; Bras, Bert; Weissburg, Marc

    2016-10-18

    Biologically Inspired Design (biomimicry) and Industrial Ecology both look to natural systems to enhance the sustainability and performance of engineered products, systems and industries. Bioinspired design (BID) traditionally has focused on a unit operation and single product level. In contrast, this paper describes how principles of network organization derived from analysis of ecosystem properties can be applied to industrial system networks. Specifically, this paper examines the applicability of particular food web matrix properties as design rules for economically and biologically sustainable industrial networks, using an optimization model developed for a carpet recycling network. Carpet recycling network designs based on traditional cost and emissions based optimization are compared to designs obtained using optimizations based solely on ecological food web metrics. The analysis suggests that networks optimized using food web metrics also were superior from a traditional cost and emissions perspective; correlations between optimization using ecological metrics and traditional optimization ranged generally from 0.70 to 0.96, with flow-based metrics being superior to structural parameters. Four structural food parameters provided correlations nearly the same as that obtained using all structural parameters, but individual structural parameters provided much less satisfactory correlations. The analysis indicates that bioinspired design principles from ecosystems can lead to both environmentally and economically sustainable industrial resource networks, and represent guidelines for designing sustainable industry networks.

  14. Metric Optimization for Surface Analysis in the Laplace-Beltrami Embedding Space

    PubMed Central

    Lai, Rongjie; Wang, Danny J.J.; Pelletier, Daniel; Mohr, David; Sicotte, Nancy; Toga, Arthur W.

    2014-01-01

    In this paper we present a novel approach for the intrinsic mapping of anatomical surfaces and its application in brain mapping research. Using the Laplace-Beltrami eigen-system, we represent each surface with an isometry invariant embedding in a high dimensional space. The key idea in our system is that we realize surface deformation in the embedding space via the iterative optimization of a conformal metric without explicitly perturbing the surface or its embedding. By minimizing a distance measure in the embedding space with metric optimization, our method generates a conformal map directly between surfaces with highly uniform metric distortion and the ability of aligning salient geometric features. Besides pairwise surface maps, we also extend the metric optimization approach for group-wise atlas construction and multi-atlas cortical label fusion. In experimental results, we demonstrate the robustness and generality of our method by applying it to map both cortical and hippocampal surfaces in population studies. For cortical labeling, our method achieves excellent performance in a cross-validation experiment with 40 manually labeled surfaces, and successfully models localized brain development in a pediatric study of 80 subjects. For hippocampal mapping, our method produces much more significant results than two popular tools on a multiple sclerosis study of 109 subjects. PMID:24686245

  15. Foul tip impact attenuation of baseball catcher masks using head impact metrics

    PubMed Central

    White, Terrance R.; Cutcliffe, Hattie C.; Shridharani, Jay K.; Wood, Garrett W.; Bass, Cameron R.

    2018-01-01

    Currently, no scientific consensus exists on the relative safety of catcher mask styles and materials. Due to differences in mass and material properties, the style and material of a catcher mask influence the impact metrics observed during simulated foul ball impacts. The catcher surrogate was a Hybrid III head and neck equipped with a six degree of freedom sensor package to obtain linear accelerations and angular rates. Four mask styles were impacted using an air cannon for six 30 m/s and six 35 m/s impacts to the nasion. To quantify impact severity, the metrics peak linear acceleration, peak angular acceleration, Head Injury Criterion, Head Impact Power, and Gadd Severity Index were used. An Analysis of Covariance and a Tukey's HSD test were conducted to compare the least squares means between masks for each head injury metric. For each injury metric, a P value less than 0.05 was found, indicating a significant difference in mask performance. Tukey's HSD test found that, for each metric, the traditional style titanium mask fell in the lowest performance category while the hockey style mask was in the highest performance category. Limitations of this study prevented a direct correlation from mask testing performance to mild traumatic brain injury. PMID:29856814
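
    Two of the metrics listed in this record have compact standard definitions that can be sketched numerically. The Gadd Severity Index is the time integral of a(t)^2.5, and the Head Injury Criterion is the maximum over time windows of (t2 - t1) times the window-averaged acceleration raised to the power 2.5, with acceleration in g and time in seconds. The sketch below assumes a uniformly sampled resultant (non-negative) acceleration trace and a 15 ms window cap; the function names, the window limit, and the test pulse are assumptions and are not the authors' processing pipeline.

```python
def trapz(values, dt):
    """Trapezoidal integral of uniformly sampled values."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def gadd_severity_index(accel_g, dt):
    """GSI: integral of a(t)**2.5 over the pulse (a in g, t in seconds)."""
    return trapz([a ** 2.5 for a in accel_g], dt)


def head_injury_criterion(accel_g, dt, max_window_s=0.015):
    """HIC: max over windows of width * (mean acceleration in window)**2.5.

    accel_g must be a resultant magnitude (non-negative); the 15 ms
    window cap is an assumed convention, not taken from the record.
    """
    # Cumulative trapezoidal integral, so any window integral is a difference.
    cum = [0.0]
    for i in range(1, len(accel_g)):
        cum.append(cum[-1] + 0.5 * (accel_g[i - 1] + accel_g[i]) * dt)

    max_steps = int(round(max_window_s / dt))
    n = len(accel_g)
    best = 0.0
    for i in range(n - 1):
        for j in range(i + 1, min(n, i + max_steps + 1)):
            width = (j - i) * dt
            mean_a = (cum[j] - cum[i]) / width
            best = max(best, width * mean_a ** 2.5)
    return best


# Hypothetical pulse: constant 100 g for 10 ms, sampled at 1 kHz.
pulse = [100.0] * 11
gsi = gadd_severity_index(pulse, dt=0.001)
hic = head_injury_criterion(pulse, dt=0.001)
print(gsi, hic)
```

    For a rectangular pulse the two metrics coincide, since the best HIC window is the whole pulse; they diverge for peaked pulses, where HIC selects the most severe sub-window.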

  16. Proposed Performance-Based Metrics for the Future Funding of Graduate Medical Education: Starting the Conversation.

    PubMed

    Caverzagie, Kelly J; Lane, Susan W; Sharma, Niraj; Donnelly, John; Jaeger, Jeffrey R; Laird-Fick, Heather; Moriarty, John P; Moyer, Darilyn V; Wallach, Sara L; Wardrop, Richard M; Steinmann, Alwin F

    2017-12-12

    Graduate medical education (GME) in the United States is financed by contributions from both federal and state entities that total over $15 billion annually. Within institutions, these funds are distributed with limited transparency to achieve ill-defined outcomes. To address this, the Institute of Medicine convened a committee on the governance and financing of GME to recommend finance reform that would promote a physician training system that meets society's current and future needs. The resulting report provided several recommendations regarding the oversight and mechanisms of GME funding, including implementation of performance-based GME payments, but did not provide specific details about the content and development of metrics for these payments. To initiate a national conversation about performance-based GME funding, the authors asked: What should GME be held accountable for in exchange for public funding? In answer to this question, the authors propose 17 potential performance-based metrics for GME funding that could inform future funding decisions. Eight of the metrics are described as exemplars to add context and to help readers obtain a deeper understanding of the inherent complexities of performance-based GME funding. The authors also describe considerations and precautions for metric implementation.

  17. The importance of metrics for evaluating scientific performance

    NASA Astrophysics Data System (ADS)

    Miyakawa, Tsuyoshi

    Evaluation of scientific performance is a major factor that determines the behavior of both individual researchers and the academic institutes to which they belong. Because the number of researchers heavily outweighs the number of available research posts, and competitive funding accounts for an ever-increasing proportion of research budgets, some objective indicators of research performance have gained recognition for increasing transparency and openness. It is common practice to use metrics and indices to evaluate a researcher's performance or the quality of their grant applications. Such measures include the number of publications, the number of times these papers are cited and, more recently, the h-index, which measures the number of highly cited papers the researcher has written. However, academic institutions and funding agencies in Japan have been rather slow to adopt such metrics. In this article, I will outline some of the currently available metrics, and discuss why we need to use such objective indicators of research performance more often in Japan. I will also discuss how to promote the use of metrics and what we should keep in mind when using them, as well as their potential impact on the research community in Japan.
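
    The h-index mentioned in this abstract has a simple standard definition: the largest number h such that the researcher has at least h papers with at least h citations each. A minimal sketch follows; the function name and the sample citation counts are hypothetical and are not drawn from the article.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for five papers.
example = h_index([10, 8, 5, 4, 3])
```

    With these sample counts, four papers have at least four citations each but there are not five papers with at least five, so the result is 4.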

  18. Gravitational waves during inflation from a 5D large-scale repulsive gravity model

    NASA Astrophysics Data System (ADS)

    Reyes, Luz M.; Moreno, Claudia; Madriz Aguilar, José Edgar; Bellini, Mauricio

    2012-10-01

    We investigate, in the transverse traceless (TT) gauge, the relic background of gravitational waves generated during the early inflationary stage, within the framework of a large-scale repulsive gravity model. We calculate the spectrum of the tensor metric fluctuations of an effective 4D Schwarzschild-de Sitter metric on cosmological scales. This metric is obtained after implementing a planar coordinate transformation on a 5D Ricci-flat metric solution, in the context of a non-compact Kaluza-Klein theory of gravity. We found that the spectrum is nearly scale invariant under certain conditions. One interesting aspect of this model is that it is possible to derive the dynamical field equations for the tensor metric fluctuations, valid not just at cosmological scales but also at astrophysical scales, from the same theoretical model. The astrophysical and cosmological scales are determined by the gravity-antigravity radius, a natural length scale of the model that indicates when gravity becomes repulsive in nature.

  19. Skill Assessment for Coupled Biological/Physical Models of Marine Systems.

    PubMed

    Stow, Craig A; Jolliff, Jason; McGillicuddy, Dennis J; Doney, Scott C; Allen, J Icarus; Friedrichs, Marjorie A M; Rose, Kenneth A; Wallhead, Philip

    2009-02-20

    Coupled biological/physical models of marine systems serve many purposes, including the synthesis of information, hypothesis generation, and numerical experimentation. However, marine system models are increasingly used for prediction to support high-stakes decision-making. In such applications it is imperative that a rigorous model skill assessment is conducted so that the model's capabilities are tested and understood. Herein, we review several metrics and approaches useful to evaluate model skill. The definition of skill and the determination of the skill level necessary for a given application are context-specific, and no single metric is likely to reveal all aspects of model skill. Thus, we recommend the use of several metrics, in concert, to provide a more thorough appraisal. The routine application and presentation of rigorous skill assessment metrics will also serve the broader interests of the modeling community, ultimately resulting in improved forecasting abilities as well as helping us recognize our limitations.
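
    The recommendation to use several metrics in concert can be illustrated with a small sketch that computes four common skill measures side by side: mean bias, root-mean-square error, Pearson correlation, and modeling efficiency. The abstract does not list which metrics the review covers, so this selection, the function name, and the sample values are assumptions made for illustration.

```python
import math


def skill_summary(observed, predicted):
    """Compute several common skill metrics for paired observations/predictions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sq_err = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)  # sum of squares, not / n
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "bias": mean_p - mean_o,                       # mean prediction error
        "rmse": math.sqrt(sq_err / n),                 # root-mean-square error
        "correlation": cov / math.sqrt(var_o * var_p), # Pearson r
        "efficiency": 1.0 - sq_err / var_o,            # 1 is perfect; <= 0 is no better than the observed mean
    }


# Hypothetical data: predictions track the observations but sit one unit high.
summary = skill_summary([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

    In this made-up case the correlation is perfect even though every prediction is offset by one unit, which the bias, RMSE, and efficiency values expose. That is one way to see why a single metric can give a misleading picture of model skill.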

  20. A no-reference video quality assessment metric based on ROI

    NASA Astrophysics Data System (ADS)

    Jia, Lixiu; Zhong, Xuefei; Tu, Yan; Niu, Wenjuan

    2015-01-01

    A no-reference video quality assessment metric based on the region of interest (ROI) is proposed in this paper. In the metric, objective video quality is evaluated by integrating the quality of two compression artifacts, i.e. blurring distortion and blocking distortion. A Gaussian kernel function was used to extract the human density maps of the H.264-coded videos from the subjective eye tracking data. An objective bottom-up ROI extraction model was built from the magnitude discrepancy of the discrete wavelet transform between two consecutive frames, a center-weighted color opponent model, a luminance contrast model, and a frequency saliency model based on the spectral residual. Then only the objective saliency maps were used to compute the objective blurring and blocking quality. The results indicate that the objective ROI extraction metric has a higher area under the curve (AUC) value. Compared with conventional video quality assessment metrics, which measure all the video frames, the metric proposed in this paper not only decreased the computational complexity but also improved the correlation between the subjective mean opinion score (MOS) and the objective scores.
