Sample records for metrics evaluation based

  1. Metrics for Performance Evaluation of Patient Exercises during Physical Therapy.

    PubMed

    Vakanski, Aleksandar; Ferguson, Jake M; Lee, Stephen

    2017-06-01

    The article proposes a set of metrics for evaluating patient performance in physical therapy exercises. A taxonomy is employed that classifies the metrics into quantitative and qualitative categories, based on the level of abstraction of the captured motion sequences. The quantitative metrics are further classified into model-less and model-based metrics, according to whether the evaluation employs the raw measurements of patient-performed motions or a mathematical model of the motions. The reviewed metrics include root-mean-square distance, Kullback-Leibler divergence, log-likelihood, heuristic consistency, the Fugl-Meyer Assessment, and similar measures. The metrics are evaluated for a set of five human motions captured with a Kinect sensor. The metrics can potentially be integrated into a system that employs machine learning for modelling and assessment of the consistency of patient performance in a home-based therapy setting. Automated performance evaluation can overcome the inherent subjectivity of human-performed therapy assessment, increase adherence to prescribed therapy plans, and reduce healthcare costs.
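
    Two of the listed metrics can be sketched concretely. The following is a minimal illustration, assuming motions are stored as [frames, coordinates] arrays and, for the model-based case, summarised by fitted univariate Gaussians; it is not the article's exact formulation.

```python
import numpy as np

def rms_distance(reference, performed):
    """Root-mean-square distance between a reference motion and a
    patient-performed motion (arrays of shape [frames, coordinates])."""
    reference = np.asarray(reference, dtype=float)
    performed = np.asarray(performed, dtype=float)
    return float(np.sqrt(np.mean((reference - performed) ** 2)))

def gaussian_kl(mu0, var0, mu1, var1):
    """Kullback-Leibler divergence KL(N0 || N1) between two univariate
    Gaussians -- a simple model-based score when motions are summarised
    by fitted distributions."""
    return float(np.log(np.sqrt(var1 / var0))
                 + (var0 + (mu0 - mu1) ** 2) / (2.0 * var1) - 0.5)
```

An identical performance yields a distance of 0 and a divergence of 0; larger values indicate greater deviation from the prescribed motion.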

  2. Gamut Volume Index: a color preference metric based on meta-analysis and optimized colour samples.

    PubMed

    Liu, Qiang; Huang, Zheng; Xiao, Kaida; Pointer, Michael R; Westland, Stephen; Luo, M Ronnier

    2017-07-10

    A novel metric named Gamut Volume Index (GVI) is proposed for evaluating the colour preference of lighting. This metric is based on the absolute gamut volume of optimized colour samples. The optimal colour set of the proposed metric was obtained by optimizing the weighted average correlation between the metric predictions and the subjective ratings for 8 psychophysical studies. The performance of 20 typical colour metrics was also investigated, which included colour difference based metrics, gamut based metrics, memory based metrics as well as combined metrics. It was found that the proposed GVI outperformed the existing counterparts, especially for the conditions where correlated colour temperatures differed.
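
    The GVI is built on the absolute gamut volume of a colour-sample set. As a hedged sketch of that idea in two dimensions (the published metric uses a 3-D gamut volume, and the optimized sample set is not reproduced here), the polygon area spanned by chromaticity coordinates can be computed with the shoelace formula:

```python
def gamut_area(chroma_points):
    """Area of the polygon spanned by colour-sample chromaticity points
    (shoelace formula). Assumes the points are ordered around the hull.
    A 2-D stand-in for the 3-D gamut volume the GVI actually uses."""
    n = len(chroma_points)
    area = 0.0
    for i in range(n):
        x0, y0 = chroma_points[i]
        x1, y1 = chroma_points[(i + 1) % n]
        area += x0 * y1 - x1 * y0  # signed cross-product term per edge
    return abs(area) / 2.0
```

A light source that renders the sample colours with larger chroma spread produces a larger gamut, which the GVI associates with higher colour preference.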

  3. Evaluative Usage-Based Metrics for the Selection of E-Journals.

    ERIC Educational Resources Information Center

    Hahn, Karla L.; Faulkner, Lila A.

    2002-01-01

    Explores electronic journal usage statistics and develops three metrics and three benchmarks based on those metrics. Topics include earlier work that assessed the value of print journals and was modified for the electronic format; the evaluation of potential purchases; and implications for standards development, including the need for content…
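
    The canonical usage-based e-journal metric is cost per use, which the benchmarks in such work are built around. A minimal sketch (the specific metrics and benchmarks of this article are not reproduced):

```python
def cost_per_use(annual_cost, annual_downloads):
    """Cost-per-use for an e-journal subscription. A title whose cost per
    download exceeds the per-article document-delivery cost is a typical
    cancellation candidate."""
    if annual_downloads == 0:
        return float("inf")  # an unused subscription has unbounded unit cost
    return annual_cost / annual_downloads
```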

  4. A condition metric for Eucalyptus woodland derived from expert evaluations.

    PubMed

    Sinclair, Steve J; Bruce, Matthew J; Griffioen, Peter; Dodd, Amanda; White, Matthew D

    2018-02-01

    The evaluation of ecosystem quality is important for land-management and land-use planning. Evaluation is unavoidably subjective, and robust metrics must be based on consensus and the structured use of observations. We devised a transparent and repeatable process for building and testing ecosystem metrics based on expert data. We gathered quantitative evaluation data on the quality of hypothetical grassy woodland sites from experts. We used these data to train a model (an ensemble of 30 bagged regression trees) capable of predicting the perceived quality of similar hypothetical woodlands based on a set of 13 site variables as inputs (e.g., cover of shrubs, richness of native forbs). These variables can be measured at any site and the model implemented in a spreadsheet as a metric of woodland quality. We also investigated the number of experts required to produce an opinion data set sufficient for the construction of a metric. The model produced evaluations similar to those provided by experts, as shown by assessing the model's quality scores of expert-evaluated test sites not used to train the model. We applied the metric to 13 woodland conservation reserves and asked managers of these sites to independently evaluate their quality. To assess metric performance, we compared the model's evaluation of site quality with the managers' evaluations through multidimensional scaling. The metric performed relatively well, plotting close to the center of the space defined by the evaluators. Given the method provides data-driven consensus and repeatability, which no single human evaluator can provide, we suggest it is a valuable tool for evaluating ecosystem quality in real-world contexts. We believe our approach is applicable to any ecosystem. © 2017 State of Victoria.
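
    The core idea of predicting perceived quality from site variables with a bagged ensemble can be sketched as follows. This is a toy analogue only: each "model" here is a 1-nearest-neighbour regressor fitted to a bootstrap resample of the expert-scored sites, standing in for the paper's ensemble of 30 bagged regression trees.

```python
import random
import statistics

def bagged_quality_score(training_sites, query_site, n_models=30, seed=1):
    """Predict a site's quality score by averaging n_models bootstrap
    'models'. training_sites is a list of (site_variables, expert_score)
    pairs; query_site is a tuple of the same site variables."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # draw a bootstrap resample of the expert-scored sites
        bootstrap = [rng.choice(training_sites) for _ in training_sites]
        # each 'model' predicts the score of the nearest resampled site
        nearest = min(bootstrap,
                      key=lambda site: sum((a - b) ** 2
                                           for a, b in zip(site[0], query_site)))
        predictions.append(nearest[1])
    return statistics.mean(predictions)
```

Averaging over resamples is what gives the metric its consensus character: no single expert opinion (or single model) determines the score.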

  5. A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions

    NASA Astrophysics Data System (ADS)

    Gide, Milind S.; Karam, Lina J.

    2016-08-01

    With the increased focus on visual attention (VA) in the last decade, a large number of computational visual saliency methods have been developed. These models are traditionally evaluated using performance metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Though a considerable number of such metrics have been proposed in the literature, they have notable shortcomings. In this work, we discuss these shortcomings through illustrative examples and propose a new metric that uses local weights based on fixation density, which overcomes these flaws. To compare the performance of our proposed metric at assessing the quality of saliency prediction with other existing metrics, we construct a ground-truth subjective database in which saliency maps obtained from 17 different VA models are evaluated by 16 human observers on a 5-point categorical scale in terms of their visual resemblance with corresponding ground-truth fixation density maps obtained from eye-tracking data. The metrics are evaluated by correlating metric scores with the human subjective ratings. The correlation results show that the proposed evaluation metric outperforms all other popular existing metrics. Additionally, the constructed database and corresponding subjective ratings provide insight into which existing and future metrics are better at estimating the quality of saliency prediction, and can be used as a benchmark.
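
    The local-weighting idea can be sketched as follows. This is an assumed simplification, not the paper's published formula: per-pixel similarity between the predicted and ground-truth maps is weighted by the fixation density, so errors in heavily fixated regions cost more.

```python
import numpy as np

def density_weighted_score(saliency_map, fixation_density):
    """Similarity between a predicted saliency map and a ground-truth
    fixation-density map, with per-pixel weights proportional to the
    density. Both maps are assumed normalised to [0, 1]."""
    s = np.asarray(saliency_map, dtype=float)
    d = np.asarray(fixation_density, dtype=float)
    weights = d / d.sum()            # high-density regions dominate the score
    return float(np.sum(weights * (1.0 - np.abs(s - d))))
```

A perfect prediction scores 1; a prediction that is wrong only in rarely fixated regions is penalised far less than one that misses the dominant fixation cluster.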

  6. Quality evaluation of extracted ion chromatograms and chromatographic peaks in liquid chromatography/mass spectrometry-based metabolomics data

    PubMed Central

    2014-01-01

    Background: Extracted ion chromatogram (EIC) extraction and chromatographic peak detection are two important processing procedures in liquid chromatography/mass spectrometry (LC/MS)-based metabolomics data analysis. Most commonly, the LC/MS technique employs electrospray ionization as the ionization method. The EICs from LC/MS data are often noisy and contain high background signals. Furthermore, chromatographic peak quality varies with respect to its location in the chromatogram, and most peaks have zigzag shapes. Therefore, there is a critical need to develop effective metrics for quality evaluation of EICs and chromatographic peaks in LC/MS-based metabolomics data analysis. Results: We investigated a comprehensive set of potential quality evaluation metrics for extracted EICs and detected chromatographic peaks. Specifically, for EIC quality evaluation, we analyzed the mass chromatographic quality index (MCQ index) and propose a novel quality evaluation metric, the EIC-related global zigzag index, which is based on an EIC's first-order derivatives. For chromatographic peak quality evaluation, we analyzed and compared six metrics: sharpness, Gaussian similarity, signal-to-noise ratio, peak significance level, triangle peak area similarity ratio, and the peak-related local zigzag index. Conclusions: Although the MCQ index is suited for selecting and aligning analyte components, it cannot fairly evaluate EICs with high background signals or those containing only a single peak. Our proposed EIC-related global zigzag index is robust enough to evaluate EIC quality in both scenarios. Of the six peak quality evaluation metrics, the sharpness, peak significance level, and zigzag index outperform the others due to the zigzag nature of LC/MS chromatographic peaks. Furthermore, using several peak quality metrics in combination is more effective than individual metrics in peak quality evaluation. PMID:25350128
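
    A zigzag index built on first-order derivatives can be sketched as the fraction of sign flips in the trace's successive differences (the published index differs in detail; this is only the idea):

```python
def global_zigzag_index(eic):
    """Fraction of first-order-derivative sign flips along an EIC trace:
    0 for a monotone trace, 1 for a trace that reverses direction at
    every point. A rough analogue of the EIC-related global zigzag index."""
    diffs = [b - a for a, b in zip(eic, eic[1:])]
    flips = sum(1 for d0, d1 in zip(diffs, diffs[1:]) if d0 * d1 < 0)
    return flips / max(len(diffs) - 1, 1)
```

A smooth chromatographic peak produces few flips, while a noisy, zigzag-shaped EIC produces many, which is why a derivative-based index separates the two regardless of background level.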

  7. Quality evaluation of extracted ion chromatograms and chromatographic peaks in liquid chromatography/mass spectrometry-based metabolomics data.

    PubMed

    Zhang, Wenchao; Zhao, Patrick X

    2014-01-01

    Extracted ion chromatogram (EIC) extraction and chromatographic peak detection are two important processing procedures in liquid chromatography/mass spectrometry (LC/MS)-based metabolomics data analysis. Most commonly, the LC/MS technique employs electrospray ionization as the ionization method. The EICs from LC/MS data are often noisy and contain high background signals. Furthermore, chromatographic peak quality varies with respect to its location in the chromatogram, and most peaks have zigzag shapes. Therefore, there is a critical need to develop effective metrics for quality evaluation of EICs and chromatographic peaks in LC/MS-based metabolomics data analysis. We investigated a comprehensive set of potential quality evaluation metrics for extracted EICs and detected chromatographic peaks. Specifically, for EIC quality evaluation, we analyzed the mass chromatographic quality index (MCQ index) and propose a novel quality evaluation metric, the EIC-related global zigzag index, which is based on an EIC's first-order derivatives. For chromatographic peak quality evaluation, we analyzed and compared six metrics: sharpness, Gaussian similarity, signal-to-noise ratio, peak significance level, triangle peak area similarity ratio, and the peak-related local zigzag index. Although the MCQ index is suited for selecting and aligning analyte components, it cannot fairly evaluate EICs with high background signals or those containing only a single peak. Our proposed EIC-related global zigzag index is robust enough to evaluate EIC quality in both scenarios. Of the six peak quality evaluation metrics, the sharpness, peak significance level, and zigzag index outperform the others due to the zigzag nature of LC/MS chromatographic peaks. Furthermore, using several peak quality metrics in combination is more effective than individual metrics in peak quality evaluation.

  8. Evaluating true BCI communication rate through mutual information and language models.

    PubMed

    Speier, William; Arnold, Corey; Pouratian, Nader

    2013-01-01

    Brain-computer interface (BCI) systems are a promising means of restoring communication to patients suffering from "locked-in" syndrome. Research to improve system performance primarily focuses on overcoming the low signal-to-noise ratio of electroencephalographic (EEG) recordings. However, studies in the literature are difficult to compare due to the array of evaluation metrics and the assumptions underlying them, including that: 1) all characters are equally probable, 2) character selection is memoryless, and 3) errors occur completely at random. Standardizing evaluation metrics that more accurately reflect the amount of information contained in BCI language output is critical to making progress. We present a mutual information-based metric that incorporates prior information and a model of systematic errors. The parameters of a system used in one study were re-optimized, showing that the metric used in optimization significantly affects the parameter values chosen and the resulting system performance. The results of 11 BCI communication studies were then evaluated using different metrics, including those previously used in the BCI literature and the newly advocated metric. Six studies' results varied based on the metric used for evaluation, and the proposed metric produced results that differed from those originally published in two of the studies. Standardizing metrics to accurately reflect the rate of information transmission is critical to properly evaluate and compare BCI communication systems and advance the field in an unbiased manner.
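
    The baseline that such language-model-aware metrics refine is the classic per-selection information measure for a uniform symbol set under a symmetric error model (the Wolpaw-style bit rate), which can be written directly:

```python
import math

def bits_per_selection(p_correct, n_symbols):
    """Mutual information (bits) carried by one selection from a uniform
    n-symbol set, assuming errors are spread evenly over the remaining
    symbols: log2(N) + p*log2(p) + (1-p)*log2((1-p)/(N-1))."""
    def plog2(p, q):
        # convention: a zero-probability term contributes 0 bits
        return p * math.log2(q) if p > 0.0 else 0.0
    return (math.log2(n_symbols)
            + plog2(p_correct, p_correct)
            + plog2(1.0 - p_correct, (1.0 - p_correct) / (n_symbols - 1)))
```

At chance accuracy (p = 1/N) the rate is 0 bits; at perfect accuracy it is log2(N). The paper's metric extends this by dropping the equal-probability and memoryless assumptions via a language model.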

  9. Improvement of impact noise in a passenger car utilizing sound metric based on wavelet transform

    NASA Astrophysics Data System (ADS)

    Lee, Sang-Kwon; Kim, Ho-Wuk; Na, Eun-Woo

    2010-08-01

    A new sound metric for impact sound is developed based on the continuous wavelet transform (CWT), a useful tool for the analysis of non-stationary signals such as impact noise. Together with the new metric, two other conventional sound metrics related to sound modulation and fluctuation are also considered. In all, three sound metrics are employed to develop impact sound quality indexes for several specific impact courses on the road. Impact sounds are evaluated subjectively by 25 jurors. The indexes are verified by comparing the correlation between the index output and the results of a subjective evaluation based on a jury test. These indexes are successfully applied to an objective evaluation for improvement of the impact sound quality for cases where some parts of the suspension system of the test car are modified.
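
    How a wavelet-based metric isolates a transient's energy at one scale can be sketched with a real Morlet-like kernel. This is a minimal illustration of the mechanism, not the paper's metric; the kernel shape and normalisation are assumptions.

```python
import numpy as np

def wavelet_band_energy(signal, scale, fs):
    """Energy of a signal at one CWT scale: convolve with a windowed
    cosine (real Morlet-like) kernel and sum the squared coefficients.
    scale is in seconds, fs is the sampling rate in Hz."""
    dt = 1.0 / fs
    t = np.arange(-3.0 * scale, 3.0 * scale + dt, dt)
    kernel = np.exp(-0.5 * (t / scale) ** 2) * np.cos(5.0 * t / scale)
    coeffs = np.convolve(signal, kernel, mode="same") * (dt / np.sqrt(scale))
    return float(np.sum(coeffs ** 2))
```

Because the kernel is localised in time, a short impact contributes energy at small scales where a stationary hum does not, which is what makes the CWT suited to non-stationary impact noise.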

  10. Development and evaluation of aperture-based complexity metrics using film and EPID measurements of static MLC openings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Götstedt, Julia; Karlsson Hauer, Anna; Bäck, Anna, E-mail: anna.back@vgregion.se

    Purpose: Complexity metrics have been suggested as a complement to measurement-based quality assurance for intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT). However, these metrics have not yet been sufficiently validated. This study develops and evaluates new aperture-based complexity metrics in the context of static multileaf collimator (MLC) openings and compares them to previously published metrics. Methods: This study develops the converted aperture metric and the edge area metric. The converted aperture metric is based on small and irregular parts within the MLC opening that are quantified as measured distances between MLC leaves. The edge area metric is based on the relative size of the region around the edges defined by the MLC. Another metric suggested in this study is the circumference/area ratio. Earlier defined aperture-based complexity metrics—the modulation complexity score, the edge metric, the ratio monitor units (MU)/Gy, the aperture area, and the aperture irregularity—are compared to the newly proposed metrics. A set of small and irregular static MLC openings are created which simulate individual IMRT/VMAT control points of various complexities. These are measured with both an amorphous silicon electronic portal imaging device and EBT3 film. The differences between calculated and measured dose distributions are evaluated using a pixel-by-pixel comparison with two global dose difference criteria of 3% and 5%. The extent of the dose differences, expressed in terms of pass rate, is used as a measure of the complexity of the MLC openings and for the evaluation of the metrics compared in this study. The different complexity scores are calculated for each created static MLC opening. The correlation between the calculated complexity scores and the extent of the dose differences (pass rate) is analyzed in scatter plots and using Pearson’s r-values. Results: The complexity scores calculated by the edge area metric, converted aperture metric, circumference/area ratio, edge metric, and MU/Gy ratio show good linear correlation to the complexity of the MLC openings, expressed as the 5% dose difference pass rate, with Pearson’s r-values of −0.94, −0.88, −0.84, −0.89, and −0.82, respectively. The overall trends for the 3% and 5% dose difference evaluations are similar. Conclusions: New complexity metrics are developed. The calculated scores correlate to the complexity of the created static MLC openings. The complexity of the MLC opening is dependent on the penumbra region relative to the area of the opening. The aperture-based complexity metrics that combine either the distances between the MLC leaves or the MLC opening circumference with the aperture area show the best correlation with the complexity of the static MLC openings.
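
    The circumference/area ratio mentioned above is the simplest of these metrics to state. For an idealised rectangular aperture (real MLC openings are stepped polygons, so this is a simplification):

```python
def circumference_area_ratio(width_mm, height_mm):
    """Circumference/area complexity score for an idealised rectangular
    MLC opening. Small or elongated apertures have more penumbra per
    unit area and therefore score as more complex."""
    area = width_mm * height_mm
    circumference = 2.0 * (width_mm + height_mm)
    return circumference / area
```

A 1 x 10 mm slit scores far higher than a 10 x 10 mm square of the same order of size, matching the study's finding that complexity tracks the penumbra region relative to the opening's area.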

  11. Evaluating hydrological model performance using information theory-based metrics

    USDA-ARS?s Scientific Manuscript database

    Accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to use information theory-based metrics to see whether they can be used as a complementary tool for hydrologic m...
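
    One common information-theoretic complement to accuracy scores such as RMSE or Nash-Sutcliffe efficiency is the mutual information between binned observed and simulated series; whether this matches the truncated abstract's specific metrics is an assumption.

```python
from collections import Counter
import math

def binned_mutual_information(observed_bins, simulated_bins):
    """Discrete mutual information (bits) between two equally long series
    of bin labels: sum over joint bins of p(o,s) * log2(p(o,s)/(p(o)p(s)))."""
    n = len(observed_bins)
    p_obs = Counter(observed_bins)
    p_sim = Counter(simulated_bins)
    p_joint = Counter(zip(observed_bins, simulated_bins))
    return sum((c / n) * math.log2((c / n) / ((p_obs[o] / n) * (p_sim[s] / n)))
               for (o, s), c in p_joint.items())
```

Unlike squared-error metrics, mutual information rewards any consistent correspondence between the series, even one that a biased but well-timed simulation would show.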

  12. On Applying the Prognostic Performance Metrics

    NASA Technical Reports Server (NTRS)

    Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai

    2009-01-01

    Prognostics performance evaluation has gained significant attention in the past few years. As prognostics technology matures and more sophisticated methods for prognostic uncertainty management are developed, a standardized methodology for performance evaluation becomes extremely important to guide improvement efforts in a constructive manner. This paper continues previous efforts in which several new evaluation metrics tailored for prognostics were introduced and shown to evaluate various algorithms more effectively than conventional metrics. Specifically, this paper presents a detailed discussion of how these metrics should be interpreted and used. Several shortcomings identified while applying these metrics to a variety of real applications are also summarized, along with discussions that attempt to alleviate these problems. Further, these metrics have been enhanced to incorporate probability distribution information from prognostic algorithms, as opposed to evaluation based on point estimates only. Several methods have been suggested, and guidelines have been provided to help choose one method over another based on probability distribution characteristics. These approaches also offer a convenient and intuitive visualization of algorithm performance with respect to some of these new metrics, such as the prognostic horizon and alpha-lambda performance, and quantify the corresponding performance while incorporating the uncertainty information.
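
    The alpha-lambda check mentioned above, in its point-estimate form, is a simple bounds test (the paper's enhanced form replaces the point check with the probability mass of the prediction distribution inside the same bounds):

```python
def alpha_lambda_pass(true_rul, predicted_rul, alpha=0.2):
    """Point-estimate alpha-lambda test: does the predicted remaining
    useful life fall within +/- alpha * true RUL at this time instant?"""
    return abs(predicted_rul - true_rul) <= alpha * true_rul
```

Because the allowed band shrinks as the true RUL shrinks, the test demands increasingly precise predictions as end of life approaches.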

  13. Visual salience metrics for image inpainting

    NASA Astrophysics Data System (ADS)

    Ardis, Paul A.; Singhal, Amit

    2009-01-01

    Quantitative metrics for successful image inpainting currently do not exist, with researchers instead relying upon qualitative human comparisons to evaluate their methodologies and techniques. In an attempt to rectify this situation, we propose two new metrics to capture the notions of noticeability and visual intent in order to evaluate inpainting results. The proposed metrics use a quantitative measure of visual salience based upon a computational model of human visual attention. We demonstrate how these two metrics repeatably correlate with qualitative opinion in a human observer study, correctly identify the optimum uses for exemplar-based inpainting (as specified in the original publication), and match qualitative opinion in published examples.

  14. Evaluation of image deblurring methods via a classification metric

    NASA Astrophysics Data System (ADS)

    Perrone, Daniele; Humphreys, David; Lamb, Robert A.; Favaro, Paolo

    2012-09-01

    The performance of single image deblurring algorithms is typically evaluated via a certain discrepancy measure between the reconstructed image and the ideal sharp image. The choice of metric, however, has been a source of debate and has also led to alternative metrics based on human visual perception. While fixed metrics may fail to capture some small but visible artifacts, perception-based metrics may favor reconstructions with artifacts that are visually pleasant. To overcome these limitations, we propose to assess the quality of reconstructed images via a task-driven metric. In this paper we consider object classification as the task and therefore use the rate of classification as the metric to measure deblurring performance. In our evaluation we use data with different types of blur in two cases: Optical Character Recognition (OCR), where the goal is to recognise characters in a black and white image, and object classification with no restrictions on pose, illumination and orientation. Finally, we show how off-the-shelf classification algorithms benefit from working with deblurred images.
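
    The task-driven metric itself reduces to a classification rate over deblurred images; comparing this rate across deblurring methods ranks them by task utility rather than pixel error. A minimal sketch:

```python
def classification_rate(predicted_labels, true_labels):
    """Fraction of deblurred images that a downstream classifier labels
    correctly -- the task-driven deblurring score."""
    assert len(predicted_labels) == len(true_labels)
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)
```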

  15. Energy-Based Metrics for Arthroscopic Skills Assessment.

    PubMed

    Poursartip, Behnaz; LeBel, Marie-Eve; McCracken, Laura C; Escoto, Abelardo; Patel, Rajni V; Naish, Michael D; Trejos, Ana Luisa

    2017-08-05

    Minimally invasive skills assessment methods are essential in developing efficient surgical simulators and implementing consistent skills evaluation. Although numerous methods have been investigated in the literature, there is still a need to further improve the accuracy of surgical skills assessment. Energy expenditure can be an indication of motor skills proficiency. The goals of this study are to develop objective metrics based on energy expenditure, normalize these metrics, and investigate classifying trainees using these metrics. To this end, different forms of energy consisting of mechanical energy and work were considered and their values were divided by the related value of an ideal performance to develop normalized metrics. These metrics were used as inputs for various machine learning algorithms including support vector machines (SVM) and neural networks (NNs) for classification. The accuracy of the combination of the normalized energy-based metrics with these classifiers was evaluated through a leave-one-subject-out cross-validation. The proposed method was validated using 26 subjects at two experience levels (novices and experts) in three arthroscopic tasks. The results showed that there are statistically significant differences between novices and experts for almost all of the normalized energy-based metrics. The accuracy of classification using SVM and NN methods was between 70% and 95% for the various tasks. The results show that the normalized energy-based metrics and their combination with SVM and NN classifiers are capable of providing accurate classification of trainees. The assessment method proposed in this study can enhance surgical training by providing appropriate feedback to trainees about their level of expertise and can be used in the evaluation of proficiency.
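
    The normalisation idea, dividing a trainee's energy expenditure by that of an idealised performance, can be sketched for mechanical work (this illustrates only the normalisation, not the paper's full metric set or its classifiers):

```python
def normalized_work(forces_n, displacements_m, ideal_work_j):
    """Normalised energy-based metric: mechanical work expended by a
    trainee divided by the work of an idealised performance of the same
    task. A score near 1 indicates expert-like economy of motion."""
    work = sum(f * d for f, d in zip(forces_n, displacements_m))
    return work / ideal_work_j
```

Features like this, computed per task, are what the study feeds into SVM and neural-network classifiers to separate novices from experts.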

  16. Comparative Simulation Study of Glucose Control Methods Designed for Use in the Intensive Care Unit Setting via a Novel Controller Scoring Metric.

    PubMed

    DeJournett, Jeremy; DeJournett, Leon

    2017-11-01

    Effective glucose control in the intensive care unit (ICU) setting has the potential to decrease morbidity and mortality rates and thereby decrease health care expenditures. To evaluate what constitutes effective glucose control, several metrics are typically reported, including time in range, time in mild and severe hypoglycemia, coefficient of variation, and others. To date, there is no one metric that combines all of these individual metrics to give a number indicative of overall performance. We proposed a composite metric that combines 5 commonly reported metrics, and we used this composite metric to compare 6 glucose controllers. We evaluated the following controllers: Ideal Medical Technologies (IMT) artificial-intelligence-based controller, Yale protocol, Glucommander, Wintergerst et al PID controller, GRIP, and NICE-SUGAR. We evaluated each controller across 80 simulated patients, 4 clinically relevant exogenous dextrose infusions, and one nonclinical infusion as a test of the controller's ability to handle difficult situations. This gave a total of 2400 5-day simulations and 585,604 individual glucose values for analysis. We used a random walk sensor error model that gave a 10% MARD. For each controller, we calculated severe hypoglycemia (<40 mg/dL), mild hypoglycemia (40-69 mg/dL), normoglycemia (70-140 mg/dL), hyperglycemia (>140 mg/dL), and coefficient of variation (CV), as well as our novel controller metric. For the controllers tested, we achieved the following median values for our novel controller scoring metric: IMT: 88.1, YALE: 46.7, GLUC: 47.2, PID: 50, GRIP: 48.2, NICE: 46.4. The novel scoring metric employed in this study shows promise as a means of evaluating new and existing ICU-based glucose controllers, and it could be used in the future to compare results of glucose control studies in critical care. The IMT AI-based glucose controller demonstrated the most consistent performance results based on this new metric.
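
    A composite of the five reported metrics could take a penalty-weighted form like the sketch below. The weights are purely illustrative assumptions; the study's published scoring formula is not reproduced here.

```python
def composite_controller_score(pct_in_range, pct_severe_hypo, pct_mild_hypo,
                               pct_hyper, coeff_variation):
    """Hypothetical composite glucose-controller score: start from time in
    range (percent) and subtract weighted penalties for hypoglycemia,
    hyperglycemia, and variability. Weights are illustrative only."""
    penalty = (4.0 * pct_severe_hypo      # severe hypoglycemia weighted hardest
               + 2.0 * pct_mild_hypo
               + 0.5 * pct_hyper
               + 0.25 * coeff_variation)
    return max(0.0, pct_in_range - penalty)
```

The design point such a composite captures is that a single scalar lets dissimilar controllers be ranked even when they trade time-in-range against hypoglycemia risk differently.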

  17. Multi-objective optimization for generating a weighted multi-model ensemble

    NASA Astrophysics Data System (ADS)

    Lee, H.

    2017-12-01

    Many studies have demonstrated that multi-model ensembles generally show better skill than each ensemble member. When generating weighted multi-model ensembles, the first step is measuring the performance of individual model simulations using observations. There is a consensus on the assignment of weighting factors based on a single evaluation metric: the weighting factor for each model is proportional to a performance score or inversely proportional to an error for the model. While this conventional approach can provide appropriate combinations of multiple models, it faces a major challenge when multiple metrics are under consideration, since a simple averaging of multiple performance scores or model ranks does not address the trade-off between conflicting metrics. So far, there seems to be no best method for generating weighted multi-model ensembles based on multiple performance metrics. The current study applies multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combine multiple performance metrics for global climate models and their dynamically downscaled regional climate simulations over North America and to generate a weighted multi-model ensemble. NASA satellite data and the Regional Climate Model Evaluation System (RCMES) software toolkit are used for assessment of the climate simulations. Overall, the performance of each model differs markedly, with strong seasonal dependence. Because of the considerable variability across the climate simulations, it is important to evaluate models systematically and make future projections by assigning optimized weighting factors to the models with relatively good performance. Our results indicate that the optimally weighted multi-model ensemble always shows better performance than an arithmetic ensemble mean and may provide reliable future projections.
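
    The building block of any multi-objective weighting scheme is the Pareto-dominance relation over per-model error vectors; the optimal trade-off solutions mentioned above are the non-dominated set. A minimal check:

```python
def pareto_dominates(errors_a, errors_b):
    """True if model A's error vector Pareto-dominates model B's: no worse
    on every metric and strictly better on at least one (lower error is
    better). Models dominated by no other model form the trade-off front."""
    return (all(a <= b for a, b in zip(errors_a, errors_b))
            and any(a < b for a, b in zip(errors_a, errors_b)))
```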

  18. Metrics for Offline Evaluation of Prognostic Performance

    NASA Technical Reports Server (NTRS)

    Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai

    2010-01-01

    Prognostic performance evaluation has gained significant attention in the past few years. Currently, prognostics concepts lack standard definitions and suffer from ambiguous and inconsistent interpretations. This lack of standards is in part due to the varied end-user requirements for different applications, time scales, available information, and domain dynamics, among other factors. The research community has used a variety of metrics largely based on convenience and their respective requirements. Very little attention has been focused on establishing a standardized approach to compare different efforts. This paper presents several new evaluation metrics tailored for prognostics that were recently introduced and were shown to effectively evaluate various algorithms as compared to other conventional metrics. Specifically, this paper presents a detailed discussion on how these metrics should be interpreted and used. These metrics have the capability of incorporating probabilistic uncertainty estimates from prognostic algorithms. In addition to quantitative assessment, they also offer a comprehensive visual perspective that can be used in designing the prognostic system. Several methods are suggested to customize these metrics for different applications. Guidelines are provided to help choose one method over another based on distribution characteristics. Various issues faced by prognostics and its performance evaluation are discussed, followed by a formal notational framework to help standardize subsequent developments.
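
    One of the metrics this line of work introduced, the prognostic horizon, can be sketched in its point-estimate form (the distributional form replaces the fixed tolerance with a probability criterion; the tolerance below is an illustrative assumption):

```python
def prognostic_horizon(times, eol_predictions, true_eol, alpha=5.0):
    """Prognostic horizon: the earliest time from which every subsequent
    end-of-life prediction stays within +/- alpha of the true end of
    life; returns None if the algorithm never converges."""
    for i, t in enumerate(times):
        if all(abs(pred - true_eol) <= alpha
               for pred in eol_predictions[i:]):
            return t
    return None
```

A longer horizon means the algorithm produces trustworthy end-of-life estimates earlier, leaving more time to act on them.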

  19. Decision-relevant evaluation of climate models: A case study of chill hours in California

    NASA Astrophysics Data System (ADS)

    Jagannathan, K. A.; Jones, A. D.; Kerr, A. C.

    2017-12-01

    The past decade has seen a proliferation of climate datasets, with over 60 climate models currently in use. Comparative evaluation and validation of models can help practitioners choose the most appropriate models for adaptation planning. However, such assessments are usually conducted for 'climate metrics' such as seasonal temperature, while sectoral decisions are often based on 'decision-relevant outcome metrics' such as growing degree days or chill hours. Since climate models predict different metrics with varying skill, the goal of this research is to conduct a bottom-up evaluation of model skill for 'outcome-based' metrics. Using chill hours (the number of hours in winter months with temperature below 45 °F) in Fresno, CA as a case, we assess how well different GCMs predict the historical mean and slope of chill hours, and whether and to what extent projections differ based on model selection. We then compare our results with other climate-based evaluations of the region to identify similarities and differences. For the model skill evaluation, historically observed chill hours were compared with simulations from 27 GCMs (and multiple ensembles). Model skill scores were generated based on a statistical hypothesis test of the comparative assessment. Future projections from RCP 8.5 runs were evaluated, and a simple bias correction was also conducted. Our analysis indicates that model skill in predicting the chill hour slope depends on skill in predicting mean chill hours, a consequence of the non-linear nature of the chill metric. However, there was no clear relationship between the models that performed well for the chill hour metric and those that performed well in other temperature-based evaluations (such as winter minimum temperature or diurnal temperature range). Further, contrary to conclusions from other studies, we also found that the multi-model mean or large ensemble mean may not always be most appropriate for this outcome metric. Our assessment sheds light on key differences between global versus local skill, and broad versus specific skill, of climate models, highlighting that decision-relevant model evaluation may be crucial for providing practitioners with the best available climate information for their specific needs.
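
    The outcome metric itself is simple to compute from hourly temperatures, and its thresholded, non-linear form is exactly why skill on the mean and skill on the trend are linked:

```python
def chill_hours(hourly_temps_f, threshold_f=45.0):
    """Chill hours as defined in the study: the count of winter hours
    with temperature below 45 degrees F. A small bias in simulated
    temperature near the threshold shifts the count disproportionately."""
    return sum(1 for t in hourly_temps_f if t < threshold_f)
```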

  20. An Evaluation of the IntelliMetric[SM] Essay Scoring System

    ERIC Educational Resources Information Center

    Rudner, Lawrence M.; Garcia, Veronica; Welch, Catherine

    2006-01-01

    This report provides a two-part evaluation of the IntelliMetric[SM] automated essay scoring system based on its performance scoring essays from the Analytic Writing Assessment of the Graduate Management Admission Test[TM] (GMAT[TM]). The IntelliMetric system performance is first compared to that of individual human raters, a Bayesian system…

  1. Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?

    NASA Astrophysics Data System (ADS)

    Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof

    2016-10-01

    It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempt to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability of segmentation evaluation metrics as indicators of land cover classification accuracy was defined in terms of the linear correlation between the two. All unsupervised metrics that are not based on the number of segments have a very strong correlation with all classification measures and are therefore reliable indicators of land cover classification accuracy. On the other hand, the correlation for supervised metrics depends on so many factors that they cannot be trusted as reliable classification quality indicators. The land cover classification algorithms studied in this paper are widely used; therefore, the presented results are applicable to a wider area.

  2. Indicators and Metrics for Evaluating the Sustainability of Chemical Processes

    EPA Science Inventory

    A metric-based method, called GREENSCOPE, has been developed for evaluating process sustainability. Using lab-scale information and engineering assumptions, the method evaluates full-scale representations of processes in environmental, efficiency, energy and economic areas. The m...

  3. Evaluation techniques and metrics for assessment of pan+MSI fusion (pansharpening)

    NASA Astrophysics Data System (ADS)

    Mercovich, Ryan A.

    2015-05-01

    Fusion of broadband panchromatic data with narrow-band multispectral data - pansharpening - is a common and often studied problem in remote sensing. Many methods exist to produce data fusion results with the best possible spatial and spectral characteristics, and a number have been commercially implemented. This study examines the output products of 4 commercial implementations with regard to their relative strengths and weaknesses for a set of defined image characteristics and analyst use-cases. The image characteristics used are spatial detail, spatial quality, spectral integrity, and composite color quality (hue and saturation); the analyst use-cases included a variety of object detection and identification tasks. The imagery comes courtesy of the RIT SHARE 2012 collect. Two approaches are used to evaluate the pansharpening methods: analyst evaluation (a qualitative measure) and image quality metrics (quantitative measures). Visual analyst evaluation results are compared with metric results to determine which metrics best measure the defined image characteristics and product use-cases, and to support future rigorous characterization of the metrics' correlation with the analyst results. Because pansharpening represents a trade-off between adding spatial information from the panchromatic image and retaining spectral information from the MSI channels, the metrics examined are grouped into spatial improvement metrics and spectral preservation metrics. A single metric to quantify the quality of a pansharpening method would necessarily be a combination of weighted spatial and spectral metrics, based on the importance of various spatial and spectral characteristics for the primary task of interest. Appropriate metrics and weights for such a combined metric are proposed here, based on the conducted analyst evaluation.
Additionally, during this work, a metric was developed specifically for assessment of spatial structure improvement relative to a reference image, independent of scene content. Using analysis of Fourier transform images, a measure of high-frequency content is computed in small sub-segments of the image. The average increase in high-frequency content across the image is used as the metric; averaging across sub-segments combats the scene-dependent nature of typical image sharpness techniques. This metric had an improved range of scores, better representing differences in the test set than other common spatial structure metrics.
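A rough sketch of such a sub-segment Fourier metric is given below. It follows the description only loosely: the tile size, the radial cutoff, and the use of a power-spectrum energy fraction are assumptions, not the authors' exact formulation:

```python
import numpy as np

def high_freq_content(tile, cutoff_frac=0.25):
    """Fraction of spectral energy outside a low-frequency radius (assumed cutoff)."""
    f = np.fft.fftshift(np.fft.fft2(tile))
    power = np.abs(f) ** 2
    h, w = tile.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)  # distance from the DC component
    return power[r > cutoff_frac * min(h, w)].sum() / power.sum()

def spatial_improvement(sharpened, reference, tile=32):
    """Average per-tile gain in high-frequency content, sharpened vs. reference."""
    gains = []
    h, w = reference.shape
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            gains.append(high_freq_content(sharpened[y:y+tile, x:x+tile])
                         - high_freq_content(reference[y:y+tile, x:x+tile]))
    return float(np.mean(gains))
```

Averaging per-tile gains, rather than taking one whole-image FFT, is what limits the scene dependence noted above.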

  4. Empirical Evaluation of Hunk Metrics as Bug Predictors

    NASA Astrophysics Data System (ADS)

    Ferzund, Javed; Ahsan, Syed Nadeem; Wotawa, Franz

    Reducing the number of bugs is a crucial issue during software development and maintenance. Software process and product metrics are good indicators of software complexity. These metrics have been used to build bug predictor models to help developers maintain the quality of software. In this paper we empirically evaluate the use of hunk metrics as predictors of bugs. We present a technique for bug prediction that works at the smallest units of code change, called hunks. We build bug prediction models using random forests, an efficient machine learning classifier. Hunk metrics are used to train the classifier, and each hunk metric is evaluated for its bug prediction capabilities. Our classifier can classify individual hunks as buggy or bug-free with 86% accuracy, 83% buggy hunk precision and 77% buggy hunk recall. We find that history-based and change-level hunk metrics are better predictors of bugs than code-level hunk metrics.

  5. Algal bioassessment metrics for wadeable streams and rivers of Maine, USA

    USGS Publications Warehouse

    Danielson, Thomas J.; Loftin, Cynthia S.; Tsomides, Leonidas; DiFranco, Jeanne L.; Connors, Beth

    2011-01-01

    Many state water-quality agencies use biological assessment methods based on lotic fish and macroinvertebrate communities, but relatively few states have incorporated algal multimetric indices into monitoring programs. Algae are good indicators for monitoring water quality because they are sensitive to many environmental stressors. We evaluated benthic algal community attributes along a land-use gradient affecting wadeable streams and rivers in Maine, USA, to identify potential bioassessment metrics. We collected epilithic algal samples from 193 locations across the state. We computed weighted-average optima for common taxa for total P, total N, specific conductance, % impervious cover, and % developed watershed, which included all land use that is no longer forest or wetland. We assigned Maine stream tolerance values and categories (sensitive, intermediate, tolerant) to taxa based on their optima and responses to watershed disturbance. We evaluated the performance of algal community metrics used in multimetric indices from other regions and of novel metrics based on Maine data. Metrics specific to Maine data, such as the relative richness of species characterized as sensitive in Maine, were more strongly correlated with % developed watershed than most metrics used in other regions. Few community-structure attributes (e.g., species richness) were useful metrics in Maine. Performance of algal bioassessment models would be improved if metrics were evaluated with attributes of local data before inclusion in multimetric indices or statistical models. © 2011 by The North American Benthological Society.

  6. Evaluation of an Integrated Framework for Biodiversity with a New Metric for Functional Dispersion

    PubMed Central

    Presley, Steven J.; Scheiner, Samuel M.; Willig, Michael R.

    2014-01-01

    Growing interest in understanding ecological patterns from phylogenetic and functional perspectives has driven the development of metrics that capture variation in evolutionary histories or ecological functions of species. Recently, an integrated framework based on Hill numbers was developed that measures three dimensions of biodiversity based on abundance, phylogeny and function of species. This framework is highly flexible, allowing comparison of those diversity dimensions, including different aspects of a single dimension and their integration into a single measure. The behavior of those metrics with regard to variation in data structure has not been explored in detail, yet is critical for ensuring an appropriate match between the concept and its measurement. We evaluated how each metric responds to particular data structures and developed a new metric for functional biodiversity. The phylogenetic metric is sensitive to variation in the topology of phylogenetic trees, including variation in the relative lengths of basal, internal and terminal branches. In contrast, the functional metric exhibited multiple shortcomings: (1) species that are functionally redundant contribute nothing to functional diversity and (2) a single highly distinct species causes functional diversity to approach the minimum possible value. We introduced an alternative, improved metric based on functional dispersion that solves both of these problems. In addition, the new metric exhibited more desirable behavior when based on multiple traits. PMID:25148103

  7. Launch Vehicle Production and Operations Cost Metrics

    NASA Technical Reports Server (NTRS)

    Watson, Michael D.; Neeley, James R.; Blackburn, Ruby F.

    2014-01-01

    Traditionally, launch vehicle cost has been evaluated based on $/kg to orbit. This metric is calculated based on assumptions not typically met by a specific mission. These assumptions include the specified orbit, whether Low Earth Orbit (LEO), Geostationary Earth Orbit (GEO), or both. The metric also assumes the payload utilizes the full lift mass of the launch vehicle, which is rarely true even with secondary payloads.1,2,3 Other approaches to cost metrics have been evaluated, including the unit cost of the launch vehicle and an approach that considers the full program production and operations costs.4 Unit cost considers the variable cost of the vehicle, and the definition of variable costs is discussed. The full program production and operations costs include both the variable costs and the manufacturing base. This metric also distinguishes operations costs from production costs, including pre-flight operational testing. Operations costs also consider the costs of flight operations, including control center operation and maintenance. Each of these three cost metrics shows different sensitivities to various aspects of launch vehicle cost drivers. The comparison of these metrics reveals the strengths and weaknesses of each, yielding an assessment useful for cost metric selection for launch vehicle programs.

  8. Impact of region contouring variability on image-based focal therapy evaluation

    NASA Astrophysics Data System (ADS)

    Gibson, Eli; Donaldson, Ian A.; Shah, Taimur T.; Hu, Yipeng; Ahmed, Hashim U.; Barratt, Dean C.

    2016-03-01

    Motivation: Focal therapy is an emerging low-morbidity treatment option for low-intermediate risk prostate cancer; however, challenges remain in accurately delivering treatment to specified targets and determining treatment success. Registered multi-parametric magnetic resonance imaging (MPMRI) acquired before and after treatment can support focal therapy evaluation and optimization; however, contouring variability, when defining the prostate, the clinical target volume (CTV) and the ablation region in images, reduces the precision of quantitative image-based focal therapy evaluation metrics. To inform the interpretation and clarify the limitations of such metrics, we investigated inter-observer contouring variability and its impact on four metrics. Methods: Pre-therapy and 2-week-post-therapy standard-of-care MPMRI were acquired from 5 focal cryotherapy patients. Two clinicians independently contoured, on each slice, the prostate (pre- and post-treatment) and the dominant index lesion CTV (pre-treatment) in the T2-weighted MRI, and the ablated region (post-treatment) in the dynamic-contrast-enhanced MRI. For each combination of clinician contours, post-treatment images were registered to pre-treatment images using a 3D biomechanical-model-based registration of prostate surfaces, and four metrics were computed: the proportion of the target tissue region that was ablated and the target:ablated region volume ratio for each of two targets (the CTV and an expanded planning target volume). Variance components analysis was used to measure the contribution of each type of contour to the variance in the therapy evaluation metrics. Conclusions: 14-23% of evaluation metric variance was attributable to contouring variability (including 6-12% from ablation region contouring); reducing this variability could improve the precision of focal therapy evaluation metrics.
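    The two evaluation metrics named above (proportion of the target ablated, and target:ablated volume ratio) reduce to simple voxel counting once the contours are co-registered. A minimal sketch, assuming boolean voxel masks on a common grid; the function name and toy volumes are illustrative, not the authors' code:

```python
import numpy as np

def therapy_metrics(target_mask, ablated_mask):
    """Proportion of the target region ablated, and target:ablated volume ratio,
    computed from co-registered boolean voxel masks."""
    target = np.asarray(target_mask, dtype=bool)
    ablated = np.asarray(ablated_mask, dtype=bool)
    ablated_fraction = (target & ablated).sum() / target.sum()
    volume_ratio = target.sum() / ablated.sum()
    return float(ablated_fraction), float(volume_ratio)

# Toy 4x4x4 volume: target fills half the grid, ablation covers half the target.
target = np.zeros((4, 4, 4), dtype=bool); target[:2] = True
ablated = np.zeros((4, 4, 4), dtype=bool); ablated[:1] = True
print(therapy_metrics(target, ablated))  # (0.5, 2.0)
```

    Contouring variability perturbs both input masks, which is why the abstract attributes 14-23% of metric variance to it.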

  9. An objective method for a video quality evaluation in a 3DTV service

    NASA Astrophysics Data System (ADS)

    Wilczewski, Grzegorz

    2015-09-01

    The following article describes a proposed objective method for 3DTV video quality evaluation, the Compressed Average Image Intensity (CAII) method. Identification of the 3DTV service's content chain nodes enables the design of a versatile, objective video quality metric based on an advanced approach to stereoscopic videostream analysis. Insights into the designed metric's mechanisms, as well as an evaluation of the performance of the designed video quality metric under simulated environmental conditions, are discussed herein. As a result, the created CAII metric might be effectively used in a variety of service quality assessment applications.

  10. Comparing masked target transform volume (MTTV) clutter metric to human observer evaluation of visual clutter

    NASA Astrophysics Data System (ADS)

    Camp, H. A.; Moyer, Steven; Moore, Richard K.

    2010-04-01

    The Night Vision and Electronic Sensors Directorate's current time-limited search (TLS) model, which makes use of the targeting task performance (TTP) metric to describe image quality, does not explicitly account for the effects of visual clutter on observer performance. The TLS model is currently based on empirical fits to describe human performance for a time of day, spectrum and environment. Incorporating a clutter metric into the TLS model may reduce the number of these empirical fits needed. The masked target transform volume (MTTV) clutter metric has been previously presented and compared to other clutter metrics. Using real infrared imagery of rural scenes with varying levels of clutter, NVESD is currently evaluating the appropriateness of the MTTV metric. NVESD had twenty subject matter experts (SMEs) rank the amount of clutter in each scene in a series of pair-wise comparisons. MTTV metric values were calculated and then compared to the SMEs' rankings. The MTTV metric ranked the clutter in a similar manner to the SME evaluation, suggesting that the MTTV metric may emulate SME response. This paper is a first step in quantifying clutter and measuring agreement with subjective human evaluation.

  11. The data quality analyzer: a quality control program for seismic data

    USGS Publications Warehouse

    Ringler, Adam; Hagerty, M.T.; Holland, James F.; Gonzales, A.; Gee, Lind S.; Edwards, J.D.; Wilson, David; Baker, Adam

    2015-01-01

    The quantification of data quality is based on the evaluation of various metrics (e.g., timing quality, daily noise levels relative to long-term noise models, and comparisons between broadband data and event synthetics). Users may select which metrics contribute to the assessment and those metrics are aggregated into a “grade” for each station. The DQA is being actively used for station diagnostics and evaluation based on the completed metrics (availability, gap count, timing quality, deviation from a global noise model, deviation from a station noise model, coherence between co-located sensors, and comparison between broadband data and synthetics for earthquakes) on stations in the Global Seismographic Network and Advanced National Seismic System.

  12. Evaluating Algorithm Performance Metrics Tailored for Prognostics

    NASA Technical Reports Server (NTRS)

    Saxena, Abhinav; Celaya, Jose; Saha, Bhaskar; Saha, Sankalita; Goebel, Kai

    2009-01-01

    Prognostics has taken center stage in Condition Based Maintenance (CBM), where it is desired to estimate the Remaining Useful Life (RUL) of the system so that remedial measures may be taken in advance to avoid catastrophic events or unwanted downtimes. Validation of such predictions is an important but difficult proposition, and a lack of appropriate evaluation methods renders prognostics meaningless. Evaluation methods currently used in the research community are not standardized and in many cases do not sufficiently assess key performance aspects expected of a prognostics algorithm. In this paper we introduce several new evaluation metrics tailored for prognostics and show that they can effectively evaluate various algorithms as compared to other conventional metrics. Specifically, four algorithms, namely Relevance Vector Machine (RVM), Gaussian Process Regression (GPR), Artificial Neural Network (ANN), and Polynomial Regression (PR), are compared. These algorithms vary in complexity and in their ability to manage uncertainty around predicted estimates. Results show that the new metrics rank these algorithms differently, and that depending on the requirements and constraints, suitable metrics may be chosen. Beyond these results, these metrics offer ideas about how metrics suitable to prognostics may be designed so that the evaluation procedure can be standardized.

  13. Toward a perceptual video-quality metric

    NASA Astrophysics Data System (ADS)

    Watson, Andrew B.

    1998-07-01

    The advent of widespread distribution of digital video creates a need for automated methods for evaluating the visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics, and the economic need to reduce bit-rate to the lowest level that yields acceptable quality. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. Here I describe a new video quality metric that is an extension of these still image metrics into the time domain. Like the still image metrics, it is based on the Discrete Cosine Transform. An effort has been made to minimize the amount of memory and computation required by the metric, in order that it might be applied in the widest range of applications. To calibrate the basic sensitivity of this metric to spatial and temporal signals we have made measurements of visual thresholds for temporally varying samples of DCT quantization noise.

  14. Using Geometry-Based Metrics as Part of Fitness-for-Purpose Evaluations of 3D City Models

    NASA Astrophysics Data System (ADS)

    Wong, K.; Ellul, C.

    2016-10-01

    Three-dimensional geospatial information is being increasingly used in a range of tasks beyond visualisation. 3D datasets, however, are often being produced without exact specifications and at mixed levels of geometric complexity. This leads to variations within the models' geometric and semantic complexity as well as the degree of deviation from the corresponding real world objects. Existing descriptors and measures of 3D data such as CityGML's level of detail are perhaps only partially sufficient in communicating data quality and fitness-for-purpose. This study investigates whether alternative, automated, geometry-based metrics describing the variation of complexity within 3D datasets could provide additional relevant information as part of a process of fitness-for-purpose evaluation. The metrics include: mean vertex/edge/face counts per building; vertex/face ratio; minimum 2D footprint area; and minimum feature length. Each metric was tested on six 3D city models from international locations. The results show that geometry-based metrics can provide additional information on 3D city models as part of fitness-for-purpose evaluations. The metrics, while they cannot be used in isolation, may provide a complement to enhance existing data descriptors if backed up with local knowledge, where possible.

  15. On the new metrics for IMRT QA verification.

    PubMed

    Garcia-Romero, Alejandro; Hernandez-Vitoria, Araceli; Millan-Cebrian, Esther; Alba-Escorihuela, Veronica; Serrano-Zabaleta, Sonia; Ortega-Pardina, Pablo

    2016-11-01

    The aim of this work is to search for new metrics that could give more reliable acceptance/rejection criteria in the IMRT verification process and to offer solutions to the discrepancies found among different conventional metrics. Therefore, besides conventional metrics, new ones are proposed and evaluated with new tools to find correlations among them. These new metrics are based on the processing of dose-volume histogram information, evaluating absorbed dose differences, dose constraint fulfillment, or modified biomathematical treatment outcome models such as tumor control probability (TCP) and normal tissue complication probability (NTCP). An additional purpose is to establish whether the new metrics yield the same acceptance/rejection plan distribution as the conventional ones. Fifty-eight treatment plans concerning several patient locations were analyzed. All of them were verified prior to treatment using conventional metrics, and retrospectively after treatment with the new metrics. These new metrics include the definition of three continuous functions, based on dose-volume histograms resulting from measurements evaluated with a reconstructed dose system and also with a redundant Monte Carlo calculation. The 3D gamma function for every volume of interest is also calculated. The information is also processed to obtain ΔTCP or ΔNTCP for the considered volumes of interest. These biomathematical treatment outcome models have been modified to increase their sensitivity to dose changes. A robustness index is defined from a radiobiological point of view to classify plans by robustness against dose changes. Dose difference metrics can be condensed into a single parameter: the dose difference global function, with an optimal cutoff that can be determined from a receiver operating characteristic (ROC) analysis of the metric.
It is not always possible to correlate differences in biomathematical treatment outcome models with dose difference metrics. This is because the dose constraint is often far from the dose that has an actual impact on the radiobiological model, so biomathematical treatment outcome models are insensitive to large dose differences between the verification system and the treatment planning system. As an alternative, the use of modified radiobiological models, which provide a better correlation, is proposed. In any case, it is better to choose plans that are robust from a radiobiological point of view. The robustness index defined in this work is a good predictor of the plan rejection probability according to metrics derived from modified radiobiological models. The global 3D gamma-based metric calculated for each plan volume shows a good correlation with the dose difference metrics and performs well in the acceptance/rejection process. Some discrepancies were found in dose reconstruction depending on the algorithm employed. Significant and unavoidable discrepancies were found between the conventional metrics and the new ones. The dose difference global function and the 3D gamma for each plan volume are good classifiers with regard to dose difference metrics. ROC analysis is useful to evaluate the predictive power of the new metrics. The correlation between biomathematical treatment outcome models and the dose difference-based metrics is enhanced by using modified TCP and NTCP functions that take into account the dose constraints for each plan. The robustness index is useful to evaluate whether a plan is likely to be rejected. Conventional verification should be replaced by the new metrics, which are clinically more relevant.
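One standard way to derive an optimal cutoff for a scalar score such as the dose difference global function from an ROC analysis is to maximize Youden's J statistic over candidate thresholds. The sketch below is a generic illustration of that idea, not the authors' exact procedure:

```python
def best_cutoff(scores, labels):
    """Return the threshold maximizing Youden's J = sensitivity + specificity - 1.
    `labels` are 1 for plans that should be rejected, 0 otherwise; assumes both
    classes are present in the data."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        j = tp / pos - fp / neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

print(best_cutoff([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))  # (0.8, 1.0)
```

Any such cutoff should be validated on plans not used to choose it, since the maximizing threshold is fit to the sample at hand.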

  16. Evaluating Process Sustainability Using Flowsheet Monitoring

    EPA Science Inventory

    Environmental metric software can be used to evaluate the sustainability of a chemical based on data from the chemical process that is used to manufacture it. One problem in developing environmental metric software is that chemical process simulation packages typically do not rea...

  17. Performance assessment of geospatial simulation models of land-use change--a landscape metric-based approach.

    PubMed

    Sakieh, Yousef; Salmanmahiny, Abdolrassoul

    2016-03-01

    Performance evaluation is a critical step when developing land-use and cover change (LUCC) models. The present study proposes a spatially explicit model performance evaluation method, adopting a landscape metric-based approach. To quantify GEOMOD model performance, a set of composition- and configuration-based landscape metrics was employed, including number of patches, edge density, mean Euclidean nearest neighbor distance, largest patch index, class area, landscape shape index, and splitting index. The model takes advantage of three decision rules: neighborhood effect, persistence of change direction, and urbanization suitability values. According to the results, while class area, largest patch index, and splitting index showed no significant differences between the spatial patterns of the ground-truth and simulated layers, there was considerable inconsistency between the simulation results and the real dataset in terms of the remaining metrics. Specifically, simulation outputs were simplistic, and the model tended to underestimate the number of developed patches by producing a more compact landscape. Landscape-metric-based performance evaluation produces more detailed information (compared to conventional indices such as the Kappa index and overall accuracy) on the model's behavior in replicating spatial heterogeneity features of a landscape such as frequency, fragmentation, isolation, and density. Finally, as the main characteristic of the proposed method, landscape metrics employ the maximum potential of the observed and simulated layers for performance evaluation, provide a basis for more robust interpretation of the calibration process, and deepen the modeler's insight into the main strengths and pitfalls of a specific land-use change model when simulating a spatiotemporal phenomenon.
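    Two of the composition metrics used above, number of patches and largest patch index, can be computed for a single class on a binary raster by connected-component labelling. A minimal sketch under the common 4-neighbour convention (an assumption; tools such as FRAGSTATS also offer an 8-neighbour rule):

```python
def patch_metrics(grid):
    """Number of patches and largest patch index (% of landscape area) for one
    class on a binary raster, using 4-neighbour connectivity."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for i in range(h):
        for j in range(w):
            if grid[i][j] and not seen[i][j]:
                stack, size = [(i, j)], 0  # flood-fill one patch
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return len(sizes), 100.0 * max(sizes) / (h * w) if sizes else 0.0

grid = [[1, 1, 0],
        [0, 0, 0],
        [0, 0, 1]]
print(patch_metrics(grid))  # two patches; the largest covers 2 of 9 cells
```

    A model that merges such patches into a more compact landscape, as reported above, lowers the patch count while raising the largest patch index.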

  18. Performance metrics for the evaluation of hyperspectral chemical identification systems

    NASA Astrophysics Data System (ADS)

    Truslow, Eric; Golowich, Steven; Manolakis, Dimitris; Ingle, Vinay

    2016-02-01

    Remote sensing of chemical vapor plumes is a difficult but important task for many military and civilian applications. Hyperspectral sensors operating in the long-wave infrared regime have well-demonstrated detection capabilities. However, the identification of a plume's chemical constituents, based on a chemical library, is a multiple hypothesis testing problem which standard detection metrics do not fully describe. We propose using an additional performance metric for identification based on the so-called Dice index. Our approach partitions and weights a confusion matrix to develop both the standard detection metrics and identification metric. Using the proposed metrics, we demonstrate that the intuitive system design of a detector bank followed by an identifier is indeed justified when incorporating performance information beyond the standard detection metrics.
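    The Dice index proposed above compares the set of identified chemicals against the plume's true constituents. A minimal set-based sketch (the chemical names are invented for illustration):

```python
def dice_index(identified, truth):
    """Dice index between the identified chemical set and the true constituents:
    2|A & B| / (|A| + |B|)."""
    identified, truth = set(identified), set(truth)
    if not identified and not truth:
        return 1.0  # convention: two empty sets agree perfectly
    return 2 * len(identified & truth) / (len(identified) + len(truth))

# One correct identification (NH3), one miss (CH4), one false alarm (SF6).
print(dice_index({"SF6", "NH3"}, {"NH3", "CH4"}))  # 0.5
```

    Unlike per-chemical detection rates, this single score penalizes both missed constituents and false alarms, which is what makes it useful alongside standard detection metrics.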

  19. Evaluating Process Sustainability Using Flowsheet Monitoring (Abstract)

    EPA Science Inventory

    Environmental metric software can be used to evaluate the sustainability of a chemical based upon data from the chemical process that is used to manufacture it. One problem in developing environmental metric software is that chemical process simulation packages typically do not p...

  20. Towards a Visual Quality Metric for Digital Video

    NASA Technical Reports Server (NTRS)

    Watson, Andrew B.

    1998-01-01

    The advent of widespread distribution of digital video creates a need for automated methods for evaluating visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. The challenge of video quality metrics is to extend these simplified models to temporal signals as well. In this presentation I will discuss a number of the issues that must be resolved in the design of effective video quality metrics. Among these are spatial, temporal, and chromatic sensitivity and their interactions, visual masking, and implementation complexity. I will also touch on the question of how to evaluate the performance of these metrics.

  1. Piloted Simulation Study of Rudder Pedal Force/Feel Characteristics

    NASA Technical Reports Server (NTRS)

    Hess, Ronald A.

    2007-01-01

    A piloted, fixed-base simulation was conducted in 2006 to determine optimum rudder pedal force/feel characteristics for transport aircraft. As part of this research, an evaluation of four metrics for assessing rudder pedal characteristics previously presented in the literature was conducted. This evaluation was based upon the numerical handling qualities ratings assigned to a variety of pedal force/feel systems used in the simulation study. It is shown that, with the inclusion of a fifth metric, most of the rudder pedal force/feel system designs that were rated poorly by the evaluation pilots could be identified. It is suggested that these metrics form the basis of a certification requirement for transport aircraft.

  2. Evaluating Modeled Impact Metrics for Human Health, Agriculture Growth, and Near-Term Climate

    NASA Astrophysics Data System (ADS)

    Seltzer, K. M.; Shindell, D. T.; Faluvegi, G.; Murray, L. T.

    2017-12-01

    Simulated metrics that assess impacts on human health, agriculture growth, and near-term climate were evaluated using ground-based and satellite observations. The NASA GISS ModelE2 and GEOS-Chem models were used to simulate the near-present chemistry of the atmosphere. A suite of simulations that varied by model, meteorology, horizontal resolution, emissions inventory, and emissions year was performed, enabling an analysis of metric sensitivities to various model components. All simulations utilized consistent anthropogenic global emissions inventories (ECLIPSE V5a or CEDS), and an evaluation of simulated results was carried out for 2004-2006 and 2009-2011 over the United States and 2014-2015 over China. Results for O3- and PM2.5-based metrics featured minor differences due to the model resolutions considered here (2.0° × 2.5° and 0.5° × 0.666°); model, meteorology, and emissions inventory each played a larger role in the variance. Surface metrics related to O3 were consistently high biased, though to varying degrees, demonstrating the need to evaluate a particular modeling framework before O3 impacts are quantified. Surface metrics related to PM2.5 were diverse, indicating that a multi-model mean with robust results is a valuable tool for predicting PM2.5-related impacts. Oftentimes, the configuration that best captured the change of a metric over time differed from the configuration that best captured the magnitude of the same metric, demonstrating the challenge in skillfully simulating impacts. These results highlight the strengths and weaknesses of these models in simulating impact metrics related to air quality and near-term climate. With such information, the reliability of historical and future simulations can be better understood.

  3. Detecting trends in landscape pattern metrics over a 20-year period using a sampling-based monitoring programme

    USGS Publications Warehouse

    Griffith, J.A.; Stehman, S.V.; Sohl, Terry L.; Loveland, Thomas R.

    2003-01-01

    Temporal trends in landscape pattern metrics describing texture, patch shape and patch size were evaluated in the US Middle Atlantic Coastal Plain Ecoregion. The landscape pattern metrics were calculated for a sample of land use/cover data obtained for four points in time from 1973-1992. The multiple sampling dates permit evaluation of trend, whereas availability of only two sampling dates allows only evaluation of change. Observed statistically significant trends in the landscape pattern metrics demonstrated that the sampling-based monitoring protocol was able to detect a trend toward a more fine-grained landscape in this ecoregion. This sampling and analysis protocol is being extended spatially to the remaining 83 ecoregions in the US and temporally to the year 2000 to provide a national and regional synthesis of the temporal and spatial dynamics of landscape pattern covering the period 1973-2000.

  4. Comparison of continuous versus categorical tumor measurement-based metrics to predict overall survival in cancer treatment trials.

    PubMed

    An, Ming-Wen; Mandrekar, Sumithra J; Branda, Megan E; Hillman, Shauna L; Adjei, Alex A; Pitot, Henry C; Goldberg, Richard M; Sargent, Daniel J

    2011-10-15

    The categorical definition of response assessed via the Response Evaluation Criteria in Solid Tumors has documented limitations. We sought to identify alternative metrics for tumor response that improve prediction of overall survival. Individual patient data from three North Central Cancer Treatment Group trials (N0026, n = 117; N9741, n = 1,109; and N9841, n = 332) were used. Continuous metrics of tumor size based on longitudinal tumor measurements were considered in addition to a trichotomized response (TriTR: response [complete or partial] vs. stable disease vs. progression). Cox proportional hazards models, adjusted for treatment arm and baseline tumor burden, were used to assess the impact of the metrics on subsequent overall survival, using a landmark analysis approach at 12, 16, and 24 weeks postbaseline. Model discrimination was evaluated by the concordance (c) index. The overall best response rates for the three trials were 26%, 45%, and 25%, respectively. Although nearly all metrics were statistically significantly associated with overall survival at the different landmark time points, the concordance indices (c-index) for the traditional response metrics ranged from 0.59 to 0.65; for the continuous metrics from 0.60 to 0.66; and for the TriTR metrics from 0.64 to 0.69. The c-indices for TriTR at 12 weeks were comparable with those at 16 and 24 weeks. Continuous tumor measurement-based metrics provided no predictive improvement over traditional response-based metrics or TriTR; TriTR had better predictive ability than best TriTR or confirmed response. If confirmed, TriTR represents a promising endpoint for future phase II trials. ©2011 AACR.
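
    The concordance (c) index used above to evaluate model discrimination can be sketched as follows. This is a minimal illustration of Harrell's c-index on right-censored survival data, not the trials' exact implementation; the function and variable names are assumptions.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's concordance (c) index.

    A pair (i, j) is usable when subject i experienced the event
    strictly before subject j's follow-up ended; the pair is
    concordant when subject i also has the higher predicted risk.
    Ties in risk count as half-concordant. Assumes at least one
    usable pair exists.
    """
    time, event, risk = map(np.asarray, (time, event, risk))
    concordant, usable = 0.0, 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] == 1 and time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable
```

    A c-index of 0.5 corresponds to random predictions; the 0.59-0.69 values reported above therefore indicate modest but real discrimination.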

  6. Investigation of 3D histograms of oriented gradients for image-based registration of CT with interventional CBCT

    NASA Astrophysics Data System (ADS)

    Trimborn, Barbara; Wolf, Ivo; Abu-Sammour, Denis; Henzler, Thomas; Schad, Lothar R.; Zöllner, Frank G.

    2017-03-01

    Image registration of preprocedural contrast-enhanced CT to intraprocedural cone-beam computed tomography (CBCT) can provide additional information for interventional liver oncology procedures such as transcatheter arterial chemoembolisation (TACE). In this paper, a novel similarity metric for gradient-based image registration is proposed. The metric relies on the patch-based computation of histograms of oriented gradients (HOG), which forms the basis of a feature descriptor. The metric was implemented in a framework for rigid 3D-3D registration of pre-interventional CT with intra-interventional CBCT data obtained during the workflow of a TACE. To evaluate the performance of the new metric, the capture range was estimated from the mean target registration error and compared to results obtained with a normalized cross-correlation metric. The results show that 3D HOG feature descriptors are suitable as an image-similarity metric and that the novel metric can compete with established methods in terms of registration accuracy.
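
    The idea of comparing patches via histograms of oriented gradients can be sketched in 2D (the paper works with 3D volumes; the 2D form, bin count, and cosine comparison here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def patch_hog(patch, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient
    orientations over one patch (2D stand-in for 3D HOG)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # fold to [0, pi)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, np.pi), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def hog_similarity(a, b, n_bins=8):
    """Cosine similarity between two patch descriptors; a
    patch-wise aggregate of this plays the role of the
    image-similarity metric."""
    ha, hb = patch_hog(a, n_bins), patch_hog(b, n_bins)
    denom = np.linalg.norm(ha) * np.linalg.norm(hb) + 1e-12
    return float(np.dot(ha, hb) / denom)
```

    Two identical patches score near 1, while patches whose dominant edges are rotated 90° score near 0, which is the behavior a registration optimizer exploits.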

  7. On the Tradeoff Between Altruism and Selfishness in MANET Trust Management

    DTIC Science & Technology

    2016-04-07

    to discourage selfish behaviors, using a hidden Markov model (HMM) to quantitatively measure the trustworthiness of nodes. Adams et al. [18...based reliability metric to predict trust-based system survivability. Section 4 analyzes numerical results obtained through the evaluation of our SPN...concepts in MANETs, trust management for MANETs should consider the following design features: trust metrics must be customizable, evaluation of

  8. Quality evaluation of motion-compensated edge artifacts in compressed video.

    PubMed

    Leontaris, Athanasios; Cosman, Pamela C; Reibman, Amy R

    2007-04-01

    Little attention has been paid to an impairment common in motion-compensated video compression: the addition of high-frequency (HF) energy as motion compensation displaces blocking artifacts off block boundaries. In this paper, we employ an energy-based approach to measure this motion-compensated edge artifact, using both compressed bitstream information and decoded pixels. We evaluate the performance of our proposed metric, along with several blocking and blurring metrics, on compressed video in two ways. First, ordinal scales are evaluated through a series of expectations that a good quality metric should satisfy: the objective evaluation. Then, the best performing metrics are subjectively evaluated. The same subjective data set is finally used to obtain interval scales to gain more insight. Experimental results show that we accurately estimate the percentage of the added HF energy in compressed video.
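
    The core observation above is that blocking edges, once displaced by motion compensation, add high-frequency energy away from the 8×8 block grid. A toy version of that measurement (pixel-domain only; the paper also uses bitstream information, and the name and exact split below are assumptions) looks like:

```python
import numpy as np

def off_boundary_hf_ratio(frame, block=8):
    """Fraction of horizontal pixel-difference energy that lies
    inside blocks rather than on 8x8 block boundaries. Motion
    compensation drags blocking edges off the grid, raising this
    share. Illustrative stand-in for the paper's metric."""
    f = np.asarray(frame, float)
    d = np.diff(f, axis=1) ** 2            # energy of column differences
    cols = np.arange(d.shape[1])
    on = (cols % block) == block - 1       # differences crossing a boundary
    total = d.sum()
    return float(d[:, ~on].sum() / total) if total > 0 else 0.0
```

    A step edge sitting exactly on a block boundary yields a ratio of 0 (pure blocking), while the same edge displaced into the block interior yields 1 (a motion-compensated edge artifact).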

  9. Value-Based Assessment of Radiology Reporting Using Radiologist-Referring Physician Two-Way Feedback System-a Design Thinking-Based Approach.

    PubMed

    Shaikh, Faiq; Hendrata, Kenneth; Kolowitz, Brian; Awan, Omer; Shrestha, Rasu; Deible, Christopher

    2017-06-01

    In the era of value-based healthcare, many aspects of medical care are being measured and assessed to improve quality and reduce costs. Radiology adds enormously to health care costs and is under pressure to adopt a more efficient system that incorporates essential metrics to assess its value and impact on outcomes. Most current systems tie radiologists' incentives and evaluations to RVU-based productivity metrics and peer-review-based quality metrics. In a new potential model, a radiologist's performance will have to increasingly depend on a number of parameters that define "value," beginning with peer review metrics that include referrer satisfaction and feedback from radiologists to the referring physician that evaluates the potency and validity of clinical information provided for a given study. These new dimensions of value measurement will directly impact the cascade of further medical management. We share our continued experience with this project that had two components: RESP (Referrer Evaluation System Pilot) and FRACI (Feedback from Radiologist Addressing Confounding Issues), which were introduced to the clinical radiology workflow in order to capture referrer-based and radiologist-based feedback on radiology reporting. We also share our insight into the principles of design thinking as applied in its planning and execution.

  10. EVALUATION OF METRIC PRECISION FOR A RIPARIAN FOREST SURVEY

    EPA Science Inventory

    This paper evaluates the performance of a protocol to monitor riparian forests in western Oregon based on the quality of the data obtained from a recent field survey. Precision and accuracy are the criteria used to determine the quality of 19 field metrics. The field survey con...

  11. New exposure-based metric approach for evaluating O3 risk to North American aspen forests

    Treesearch

    K.E. Percy; M. Nosal; W. Heilman; T. Dann; J. Sober; A.H. Legge; D.F. Karnosky

    2007-01-01

    The United States and Canada currently use exposure-based metrics to protect vegetation from O3. Using 5 years (1999-2003) of co-measured O3, meteorology and growth response, we have developed exposure-based regression models that predict Populus tremuloides growth change within the North American ambient...

  12. Automated Assessment of Visual Quality of Digital Video

    NASA Technical Reports Server (NTRS)

    Watson, Andrew B.; Ellis, Stephen R. (Technical Monitor)

    1997-01-01

    The advent of widespread distribution of digital video creates a need for automated methods for evaluating visual quality of digital video. This is particularly so since most digital video is compressed using lossy methods, which involve the controlled introduction of potentially visible artifacts. Compounding the problem is the bursty nature of digital video, which requires adaptive bit allocation based on visual quality metrics. In previous work, we have developed visual quality metrics for evaluating, controlling, and optimizing the quality of compressed still images[1-4]. These metrics incorporate simplified models of human visual sensitivity to spatial and chromatic visual signals. The challenge of video quality metrics is to extend these simplified models to temporal signals as well. In this presentation I will discuss a number of the issues that must be resolved in the design of effective video quality metrics. Among these are spatial, temporal, and chromatic sensitivity and their interactions, visual masking, and implementation complexity. I will also touch on the question of how to evaluate the performance of these metrics.

  13. Distance Metric Learning via Iterated Support Vector Machines.

    PubMed

    Zuo, Wangmeng; Wang, Faqiang; Zhang, David; Lin, Liang; Huang, Yuchi; Meng, Deyu; Zhang, Lei

    2017-07-11

    Distance metric learning aims to learn from the given training data a valid distance metric with which the similarity between data samples can be more effectively evaluated for classification. Metric learning is often formulated as a convex or nonconvex optimization problem, but most existing methods are based on customized optimizers and become inefficient for large-scale problems. In this paper, we formulate metric learning as a kernel classification problem with a positive semi-definite constraint, and solve it by iterated training of support vector machines (SVMs). The new formulation is easy to implement and efficient to train with off-the-shelf SVM solvers. Two novel metric learning models, namely Positive-semidefinite Constrained Metric Learning (PCML) and Nonnegative-coefficient Constrained Metric Learning (NCML), are developed. Both PCML and NCML can guarantee the global optimality of their solutions. Experiments are conducted on general classification, face verification, and person re-identification to evaluate our methods. Compared with state-of-the-art approaches, our methods achieve comparable classification accuracy and are efficient in training.
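
    The two ingredients named above, a Mahalanobis-type distance and the positive semi-definite constraint on its matrix, can be sketched as follows. This is not the PCML/NCML algorithm itself (which enforces the constraint through the SVM formulation); it only illustrates, under assumed names, what the constraint means and a generic way to enforce it:

```python
import numpy as np

def project_psd(M):
    """Project a symmetric matrix onto the PSD cone by clipping
    negative eigenvalues, so the learned distance is valid."""
    M = (M + M.T) / 2
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

def mahalanobis(x, y, M):
    """Distance d_M(x, y) = sqrt((x - y)^T M (x - y))."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(np.sqrt(d @ M @ d))
```

    Without the projection, a matrix with a negative eigenvalue would assign imaginary "distances" along some directions, which is why every metric learning method must keep M positive semi-definite.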

  14. Modulated evaluation metrics for drug-based ontologies.

    PubMed

    Amith, Muhammad; Tao, Cui

    2017-04-24

    Research on ontology evaluation is scarce. If biomedical ontological datasets and knowledge bases are to be widely used, there must be quality control and evaluation of the content and structure of the ontology. This paper introduces how to effectively utilize a semiotic-inspired approach to ontology evaluation, specifically for drug-related ontologies hosted on the National Center for Biomedical Ontology BioPortal. Using the semiotic-based evaluation framework for drug-based ontologies, we adjusted the quality metrics based on the semiotic features of drug ontologies, and then compared the quality scores before and after tailoring. The tailored scores revealed a more precise measurement and a tighter distribution than the untailored ones. The results of this study show that a tailored semiotic evaluation produces a more meaningful and accurate assessment of drug-based ontologies, pointing to the possible usefulness of semiotics in ontology evaluation.

  15. Development and Implementation of a Design Metric for Systems Containing Long-Term Fluid Loops

    NASA Technical Reports Server (NTRS)

    Steele, John W.

    2016-01-01

    John Steele, a chemist and technical fellow from United Technologies Corporation, provided a water quality module to assist engineers and scientists with a metric tool to evaluate risks associated with the design of space systems with fluid loops. This design metric is a methodical, quantitative, lessons-learned based means to evaluate the robustness of a long-term fluid loop system design. The tool was developed by a cross-section of engineering disciplines who had decades of experience and problem resolution.

  16. Ranking streamflow model performance based on Information theory metrics

    NASA Astrophysics Data System (ADS)

    Martinez, Gonzalo; Pachepsky, Yakov; Pan, Feng; Wagener, Thorsten; Nicholson, Thomas

    2016-04-01

    Accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to determine whether information theory-based metrics can serve as a complementary tool for hydrologic model evaluation and selection. We simulated 10-year streamflow time series in five watersheds located in Texas, North Carolina, Mississippi, and West Virginia. Eight models of different complexity were applied. The information theory-based metrics were obtained after representing the time series as strings of symbols, where different symbols corresponded to different quantiles of the probability distribution of streamflow. Three metrics were computed for those strings: mean information gain, which measures the randomness of the signal; effective measure complexity, which characterizes predictability; and fluctuation complexity, which characterizes the presence of a pattern in the signal. The observed streamflow time series had smaller information content and larger complexity metrics than the precipitation time series: streamflow was less random and more complex than precipitation, reflecting the fact that the watershed acts as an information filter in the hydrologic conversion from precipitation to streamflow. The Nash-Sutcliffe efficiency increased with model complexity, but in many cases several models had efficiency values that were not statistically significantly different from each other. In such cases, ranking models by the closeness of the information theory-based parameters of simulated and measured streamflow time series can provide an additional criterion for evaluating hydrologic model performance.
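
    The symbolization and one of the three metrics, mean information gain, can be sketched as below. The alphabet size, block length, and quantile symbolization are assumptions for illustration; the abstract does not state the exact parameters used.

```python
import numpy as np

def block_entropy(symbols, L):
    """Shannon entropy (bits) of overlapping length-L symbol blocks."""
    blocks = [str(tuple(symbols[i:i + L])) for i in range(len(symbols) - L + 1)]
    _, counts = np.unique(blocks, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mean_information_gain(series, n_symbols=4, L=1):
    """Symbolize a series by its own quantiles, then compute
    MIG = H(L+1) - H(L): the average information one more
    observation adds, given the previous L symbols."""
    qs = np.quantile(series, np.linspace(0, 1, n_symbols + 1)[1:-1])
    symbols = np.searchsorted(qs, series).tolist()
    return block_entropy(symbols, L + 1) - block_entropy(symbols, L)
```

    A perfectly predictable signal (e.g., strict alternation) yields a mean information gain near zero, whereas an i.i.d. random signal yields a gain close to the full single-symbol entropy, matching the paper's use of MIG as a randomness measure.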

  17. Coverage Metrics for Requirements-Based Testing: Evaluation of Effectiveness

    NASA Technical Reports Server (NTRS)

    Staats, Matt; Whalen, Michael W.; Heimdahl, Mats P. E.; Rajan, Ajitha

    2010-01-01

    In black-box testing, the tester creates a set of tests to exercise a system under test without regard to the internal structure of the system. Generally, no objective metric is used to measure the adequacy of black-box tests. In recent work, we have proposed three requirements coverage metrics, allowing testers to objectively measure the adequacy of a black-box test suite with respect to a set of requirements formalized as Linear Temporal Logic (LTL) properties. In this report, we evaluate the effectiveness of these coverage metrics with respect to fault finding. Specifically, we conduct an empirical study to investigate two questions: (1) do test suites satisfying a requirements coverage metric provide better fault finding than randomly generated test suites of approximately the same size?, and (2) do test suites satisfying a more rigorous requirements coverage metric provide better fault finding than test suites satisfying a less rigorous requirements coverage metric? Our results indicate (1) only one coverage metric proposed -- Unique First Cause (UFC) coverage -- is sufficiently rigorous to ensure test suites satisfying the metric outperform randomly generated test suites of similar size and (2) that test suites satisfying more rigorous coverage metrics provide better fault finding than test suites satisfying less rigorous coverage metrics.

  18. Metric for evaluation of filter efficiency in spectral cameras.

    PubMed

    Nahavandi, Alireza Mahmoudi; Tehran, Mohammad Amani

    2016-11-10

    Although metric functions that evaluate the performance of a colorimetric imaging device have been investigated, a metric for performance analysis of a set of filters in wideband filter-based spectral cameras has rarely been studied. Based on a generalization of Vora's Measure of Goodness (MOG) and the spanning theorem, a single-function metric that estimates the effectiveness of a filter set is introduced. The improved metric, named MMOG, varies between one for a perfect set of filters and zero for the worst possible set. Results showed that MMOG follows a trend more similar to the mean square of the spectral reflectance reconstruction errors than does Vora's MOG index, and that it is robust to noise in the imaging system. MMOG as a single metric could be exploited for further analysis of manufacturing errors.
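
    The subspace-comparison idea behind Vora-style goodness measures can be sketched as below. This is a generic MOG-like quantity, not the paper's tailored MMOG; the normalization and function name are assumptions.

```python
import numpy as np

def measure_of_goodness(A, B):
    """Vora-style similarity between the subspace spanned by a
    filter set A (columns = filters) and a target sensitivity set B.
    Returns 1 when span(A) contains span(B) and 0 when the two
    spans are orthogonal."""
    Qa, _ = np.linalg.qr(A)   # orthonormal basis of the filter span
    Qb, _ = np.linalg.qr(B)   # orthonormal basis of the target span
    return float(np.linalg.norm(Qa.T @ Qb, 'fro') ** 2 / B.shape[1])
```

    Because the score depends only on the spanned subspaces, it rewards filter sets that can linearly reconstruct the target sensitivities, which is exactly the property that drives spectral reflectance reconstruction error.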

  19. An Underwater Color Image Quality Evaluation Metric.

    PubMed

    Yang, Miao; Sowmya, Arcot

    2015-12-01

    Quality evaluation of underwater images is a key goal of underwater video image retrieval and intelligent processing. To date, no metric has been proposed specifically for underwater color image quality evaluation (UCIQE). The special absorption and scattering characteristics of the water medium do not allow direct application of natural color image quality metrics, especially across different underwater environments. In this paper, subjective testing of underwater image quality was organized. Relating the statistical distribution of underwater image pixels in the CIELab color space to the subjective evaluations indicates that sharpness and colorfulness factors correlate well with perceived image quality. Based on these findings, a new UCIQE metric, a linear combination of chroma, saturation, and contrast, is proposed to quantify the non-uniform color cast, blurring, and low contrast that characterize underwater engineering and monitoring images. Experiments are conducted to illustrate the performance of the proposed UCIQE metric and its capability to measure underwater image enhancement results. They show that the proposed metric has performance comparable to the leading natural color image quality metrics and the underwater grayscale image quality metrics available in the literature, and can predict with higher accuracy the relative amount of degradation for similar image content in underwater environments. Importantly, UCIQE is a simple and fast solution for real-time underwater video processing. The effectiveness of the presented measure is also demonstrated by subjective evaluation, with good correlation between UCIQE and the subjective mean opinion scores.
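
    The "linear combination of chroma, saturation, and contrast" can be sketched directly on CIELab channels. The weights below are the ones commonly quoted for the published UCIQE; treat them, and the 1%-quantile definition of luminance contrast, as assumptions when reproducing the paper exactly.

```python
import numpy as np

def uciqe(L, a, b, c1=0.4680, c2=0.2745, c3=0.2576):
    """UCIQE-style score from CIELab channel arrays:
    c1 * std(chroma) + c2 * luminance contrast + c3 * mean saturation."""
    L, a, b = (np.asarray(x, float) for x in (L, a, b))
    chroma = np.hypot(a, b)
    sigma_c = chroma.std()
    # luminance contrast: spread between top and bottom 1% of L
    con_l = np.quantile(L, 0.99) - np.quantile(L, 0.01)
    sat = chroma / (np.sqrt(chroma ** 2 + L ** 2) + 1e-12)
    return float(c1 * sigma_c + c2 * con_l + c3 * sat.mean())
```

    A flat gray image scores 0 on all three terms, while a haze-removed, color-corrected underwater frame raises all three, which is why the metric tracks enhancement results.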

  20. Comparison of task-based exposure metrics for an epidemiologic study of isocyanate inhalation exposures among autobody shop workers.

    PubMed

    Woskie, Susan R; Bello, Dhimiter; Gore, Rebecca J; Stowe, Meredith H; Eisen, Ellen A; Liu, Youcheng; Sparer, Judy A; Redlich, Carrie A; Cullen, Mark R

    2008-09-01

    Because many occupational epidemiologic studies use exposure surrogates rather than quantitative exposure metrics, the UMass Lowell and Yale study of autobody shop workers provided an opportunity to evaluate the relative utility of surrogates and quantitative exposure metrics in an exposure response analysis of cross-week change in respiratory function. A task-based exposure assessment was used to develop several metrics of inhalation exposure to isocyanates. The metrics included the surrogates, job title, counts of spray painting events during the day, counts of spray and bystander exposure events, and a quantitative exposure metric that incorporated exposure determinant models based on task sampling and a personal workplace protection factor for respirator use, combined with a daily task checklist. The result of the quantitative exposure algorithm was an estimate of the daily time-weighted average respirator-corrected total NCO exposure (microg/m(3)). In general, these four metrics were found to be variable in agreement using measures such as weighted kappa and Spearman correlation. A logistic model for 10% drop in FEV(1) from Monday morning to Thursday morning was used to evaluate the utility of each exposure metric. The quantitative exposure metric was the most favorable, producing the best model fit, as well as the greatest strength and magnitude of association. This finding supports the reports of others that reducing exposure misclassification can improve risk estimates that otherwise would be biased toward the null. Although detailed and quantitative exposure assessment can be more time consuming and costly, it can improve exposure-disease evaluations and is more useful for risk assessment purposes. The task-based exposure modeling method successfully produced estimates of daily time-weighted average exposures in the complex and changing autobody shop work environment. The ambient TWA exposures of all of the office workers and technicians and 57% of the painters were found to be below the current U.K. Health and Safety Executive occupational exposure limit (OEL) for total NCO of 20 microg/m(3). When respirator use was incorporated, all personal daily exposures were below the U.K. OEL.
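
    The arithmetic behind a respirator-corrected time-weighted average is simple enough to sketch. The function name, the task-tuple layout, and the 8-hour (480-minute) shift are illustrative assumptions, not the study's exact algorithm:

```python
def twa_exposure(tasks, shift_minutes=480):
    """Respirator-corrected time-weighted average exposure (ug/m3).

    tasks: iterable of (concentration_ugm3, minutes, protection_factor)
    where protection_factor is 1 for no respirator. Each task's
    concentration is divided by the workplace protection factor
    before time-weighting over the shift.
    """
    dose = sum(c / pf * t for c, t, pf in tasks)
    return dose / shift_minutes
```

    For example, 48 minutes of spraying at 200 microg/m(3) behind a respirator with a protection factor of 10, plus an otherwise clean shift, gives a corrected TWA of 2 microg/m(3), well under the 20 microg/m(3) U.K. OEL cited above.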

  1. A management-oriented framework for selecting metrics used to assess habitat- and path-specific quality in spatially structured populations

    USGS Publications Warehouse

    Nicol, Sam; Wiederholt, Ruscena; Diffendorfer, James E.; Mattsson, Brady; Thogmartin, Wayne E.; Semmens, Darius J.; Laura Lopez-Hoffman,; Norris, Ryan

    2016-01-01

    Mobile species with complex spatial dynamics can be difficult to manage because their population distributions vary across space and time, and because the consequences of managing particular habitats are uncertain when evaluated at the level of the entire population. Metrics to assess the importance of habitats and of the pathways connecting habitats in a network are necessary to guide a variety of management decisions. Given the many metrics developed for spatially structured models, it can be challenging to select the most appropriate one for a particular decision. To guide the management of spatially structured populations, we define three classes of metrics describing habitat and pathway quality based on their data requirements (graph-based, occupancy-based, and demographic-based metrics) and synopsize the ecological literature relating to these classes. Applying the first steps of a formal decision-making approach (problem framing, objectives, and management actions), we assess the utility of metrics for particular types of management decisions. Our framework can help managers frame the problem, choose metrics of habitat and pathway quality, and elucidate the data needs of a particular metric. Our goal is to help managers narrow the range of suitable metrics for a management project and to aid decision-making so that limited resources are put to best use.

  2. Perceived assessment metrics for visible and infrared color fused image quality without reference image

    NASA Astrophysics Data System (ADS)

    Yu, Xuelian; Chen, Qian; Gu, Guohua; Ren, Jianle; Sui, Xiubao

    2015-02-01

    Designing an objective quality assessment for color-fused images is a demanding and challenging task. We propose four no-reference metrics, based on characteristics of the human visual system, for objectively evaluating the quality of false color fusion images. The perceived edge metric (PEM) is defined from a visual perception model and the color image gradient similarity between the fused image and the source images. The perceptual contrast metric (PCM) associates multi-scale contrast and a varying contrast sensitivity filter (CSF) with the color components. A linear combination of the standard deviation and the mean value over the fused image constructs the image colorfulness metric (ICM). The color comfort metric (CCM) is designed from the average saturation and the ratio of pixels with high and low saturation. Qualitative and quantitative experimental results demonstrate that the proposed metrics agree well with subjective perception.
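
    The "standard deviation plus mean" form of the ICM can be illustrated with a related, widely used colorfulness statistic. The abstract does not give ICM's exact channels or coefficients, so the opponent-axis form and the 0.3 weight below are the Hasler-Süsstrunk measure, offered only as an analogue:

```python
import numpy as np

def colorfulness(rgb):
    """Std-plus-mean colorfulness statistic over two opponent
    color axes (rg = R - G, yb = (R + G)/2 - B), in the spirit
    of the ICM described above."""
    r, g, b = (np.asarray(rgb, float)[..., i] for i in range(3))
    rg = r - g
    yb = 0.5 * (r + g) - b
    sigma = np.hypot(rg.std(), yb.std())  # spread of opponent values
    mu = np.hypot(rg.mean(), yb.mean())   # strength of the mean cast
    return float(sigma + 0.3 * mu)
```

    A neutral gray image scores 0, while any saturated content raises the score through the mean term, the variety of colors through the deviation term.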

  3. Averaged ratio between complementary profiles for evaluating shape distortions of map projections and spherical hierarchical tessellations

    NASA Astrophysics Data System (ADS)

    Yan, Jin; Song, Xiao; Gong, Guanghong

    2016-02-01

    We describe a metric, the averaged ratio between complementary profiles, to represent the distortion of map projections and the shape regularity of spherical cells derived from map projections or non-map-projection methods. The properties and statistical characteristics of the metric are investigated. Our metric (1) is numerically equivalent to both the scale and the angular-deformation components of the Tissot indicatrix, yet remains valid for evaluating non-map-projection-based tessellations (e.g., direct spherical subdivisions), where no mathematical formulae exist and the Tissot indicatrix and its differential calculus break down, (2) is simple (requiring neither differential nor integral calculus) and uniform in the form of its calculations, (3) has low computational cost while maintaining high correlation with the results of differential calculus, (4) is a quasi-invariant under rotations, and (5) reflects the distortions of map projections, of spherical cells, and of the associated texels. As a quantitative indicator, we evaluated typical spherical tessellation methods, some of their variants, and map projections. The tessellation methods evaluated are based on map projections or direct spherical subdivisions, and the evaluation involves commonly used Platonic polyhedrons, Catalan polyhedrons, etc. Quantitative analyses based on our metric of shape regularity and an essential metric of area uniformity implied that (1) Uniform Spherical Grids and its variant show good quality in both area uniformity and shape regularity, and (2) Crusta, the Unicube map, and a variant of the Unicube map exhibit fairly acceptable degrees of area uniformity and shape regularity.

  4. An Abstract Process and Metrics Model for Evaluating Unified Command and Control: A Scenario and Technology Agnostic Approach

    DTIC Science & Technology

    2004-06-01

    EBO Cognitive or Memetic input type ... Unanticipated EBO generated ... Memetic Effects Based COA ... Policy ... Belief systems or Memetic Content Metrics

  5. Can state-of-the-art HVS-based objective image quality criteria be used for image reconstruction techniques based on ROI analysis?

    NASA Astrophysics Data System (ADS)

    Dostal, P.; Krasula, L.; Klima, M.

    2012-06-01

    Various image processing techniques in multimedia technology are optimized using the visual attention feature of the human visual system. Because of spatial non-uniformity, different locations in an image have different importance for perception; in other words, perceived image quality depends mainly on the quality of the important locations known as regions of interest (ROI). The performance of such techniques is measured by subjective evaluation or by objective image quality criteria. Many state-of-the-art objective metrics are based on HVS properties: SSIM and MS-SSIM build on image structural information, VIF on the information the human brain can ideally gain from the reference image, and FSIM utilizes low-level features to assign different importance to each location in the image. Still, none of these objective metrics utilizes an analysis of regions of interest. We address the question of whether these objective metrics can be used for effective evaluation of images reconstructed by processing techniques based on ROI analysis utilizing high-level features. In this paper the authors show that the state-of-the-art objective metrics do not correlate well with subjective evaluation when demosaicing based on ROI analysis is used for reconstruction. The ROI were computed from "ground truth" visual attention data. An algorithm combining two known demosaicing techniques on the basis of ROI location is proposed, reconstructing the ROI in fine quality while the rest of the image is reconstructed at low quality. The color images reconstructed by this ROI approach were compared with selected demosaicing techniques by objective criteria and subjective testing. The qualitative comparison of the objective and subjective results indicates that the state-of-the-art objective metrics are still not suitable for evaluating image processing techniques based on ROI analysis, and new criteria are needed.

  6. Cross-evaluation of metrics to estimate the significance of creative works

    PubMed Central

    Wasserman, Max; Zeng, Xiao Han T.; Amaral, Luís A. Nunes

    2015-01-01

    In a world overflowing with creative works, it is useful to be able to filter out the unimportant works so that the significant ones can be identified and thereby absorbed. An automated method could provide an objective approach for evaluating the significance of works on a universal scale. However, there have been few attempts at creating such a measure, and there are few “ground truths” for validating the effectiveness of potential metrics for significance. For movies, the US Library of Congress’s National Film Registry (NFR) contains American films that are “culturally, historically, or aesthetically significant” as chosen through a careful evaluation and deliberation process. By analyzing a network of citations between 15,425 United States-produced films procured from the Internet Movie Database (IMDb), we obtain several automated metrics for significance. The best of these metrics is able to indicate a film’s presence in the NFR at least as well as, or better than, metrics based on aggregated expert opinions or large population surveys. Importantly, automated metrics can easily be applied to older films for which no other rating may be available. Our results may have implications for the evaluation of other creative works such as scientific research. PMID:25605881

  7. SU-F-J-94: Development of a Plug-in Based Image Analysis Tool for Integration Into Treatment Planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Owen, D; Anderson, C; Mayo, C

    Purpose: To extend the functionality of a commercial treatment planning system (TPS) to support (i) direct use of quantitative image-based metrics within treatment plan optimization and (ii) evaluation of dose-functional volume relationships to assist in functional image adaptive radiotherapy. Methods: A script was written that interfaces with a commercial TPS via an Application Programming Interface (API). The script executes a program that performs dose-functional volume analyses. Written in C#, the script reads the dose grid and correlates it with image data on a voxel-by-voxel basis through API extensions that can access registration transforms. A user interface was designed through WinForms to input parameters and display results. To test the performance of this program, image- and dose-based metrics computed from perfusion SPECT images aligned to the treatment planning CT were generated, validated, and compared. Results: The integration of image analysis information was successfully implemented as a plug-in to a commercial TPS. Perfusion SPECT images were used to validate the calculation and display of image-based metrics as well as dose-intensity metrics and histograms for defined structures on the treatment planning CT. Various biological dose correction models, custom image-based metrics, dose-intensity computations, and dose-intensity histograms were applied to analyze the image-dose profile. Conclusion: It is possible to add image analysis features to commercial TPSs through custom scripting applications. A tool was developed to enable the evaluation of image-intensity-based metrics in the context of functional targeting and avoidance. In addition to providing dose-intensity metrics and histograms that can be easily extracted from a plan database and correlated with outcomes, the system can also be extended to a plug-in optimization system, which can directly use the computed metrics for optimization of post-treatment tumor or normal tissue response models. Supported by NIH - P01 - CA059827.

  8. The data quality analyzer: A quality control program for seismic data

    NASA Astrophysics Data System (ADS)

    Ringler, A. T.; Hagerty, M. T.; Holland, J.; Gonzales, A.; Gee, L. S.; Edwards, J. D.; Wilson, D.; Baker, A. M.

    2015-03-01

    The U.S. Geological Survey's Albuquerque Seismological Laboratory (ASL) has several initiatives underway to enhance and track the quality of data produced from ASL seismic stations and to improve communication about data problems to the user community. The Data Quality Analyzer (DQA) is one such development and is designed to characterize seismic station data quality in a quantitative and automated manner. The DQA consists of a metric calculator, a PostgreSQL database, and a Web interface: The metric calculator, SEEDscan, is a Java application that reads and processes miniSEED data and generates metrics based on a configuration file. SEEDscan compares hashes of metadata and data to detect changes in either and performs subsequent recalculations as needed. This ensures that the metric values are up to date and accurate. SEEDscan can be run as a scheduled task or on demand. The PostgreSQL database acts as a central hub where metric values and limited station descriptions are stored at the channel level with one-day granularity. The Web interface dynamically loads station data from the database and allows the user to make requests for time periods of interest, review specific networks and stations, plot metrics as a function of time, and adjust the contribution of various metrics to the overall quality grade of the station. The quantification of data quality is based on the evaluation of various metrics (e.g., timing quality, daily noise levels relative to long-term noise models, and comparisons between broadband data and event synthetics). Users may select which metrics contribute to the assessment and those metrics are aggregated into a "grade" for each station. 
The DQA is being actively used for station diagnostics and evaluation based on the completed metrics (availability, gap count, timing quality, deviation from a global noise model, deviation from a station noise model, coherence between co-located sensors, and comparison between broadband data and synthetics for earthquakes) on stations in the Global Seismographic Network and Advanced National Seismic System.
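
The DQA's user-adjustable aggregation of metric values into an overall station "grade" can be sketched as a simple weighted average. This is an illustrative sketch only; the metric names, the 0-100 scale, and the averaging rule are assumptions, not SEEDscan's actual implementation.

```python
def station_grade(metric_scores, weights):
    """Combine per-metric scores (assumed 0-100) into one station
    grade as a weighted average. Metrics absent from `weights`
    (or weighted zero) do not contribute, mirroring the DQA idea
    of letting the user adjust each metric's contribution."""
    total_w = sum(weights.get(name, 0.0) for name in metric_scores)
    if total_w == 0:
        raise ValueError("no metric has a nonzero weight")
    return sum(score * weights.get(name, 0.0)
               for name, score in metric_scores.items()) / total_w
```

For example, equal weights on an availability score of 100 and a timing-quality score of 80 yield a grade of 90.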

  9. A no-reference video quality assessment metric based on ROI

    NASA Astrophysics Data System (ADS)

    Jia, Lixiu; Zhong, Xuefei; Tu, Yan; Niu, Wenjuan

    2015-01-01

    A no-reference video quality assessment metric based on the region of interest (ROI) was proposed in this paper. In the metric, objective video quality was evaluated by integrating the quality of two compression artifacts, i.e. blurring distortion and blocking distortion. The Gaussian kernel function was used to extract the human density maps of the H.264 coded videos from the subjective eye tracking data. An objective bottom-up ROI extraction model was built, based on the magnitude discrepancy of the discrete wavelet transform between two consecutive frames, a center weighted color opponent model, a luminance contrast model and a frequency saliency model based on spectral residual. Then only the objective saliency maps were used to compute the objective blurring and blocking quality. The results indicate that the objective ROI extraction metric has a higher area under the curve (AUC) value. Compared with conventional video quality assessment metrics, which measure all the video frames, the metric proposed in this paper not only decreased the computational complexity but also improved the correlation between subjective mean opinion scores (MOS) and objective scores.
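
The AUC figure used above to judge how well a saliency model predicts human fixations can be computed directly from per-pixel saliency scores and fixation labels via the rank-sum (Mann-Whitney) formulation. A minimal sketch follows; note it ignores tied scores, so it is not a drop-in replacement for a full ROC implementation.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via the rank-sum identity:
    AUC = (sum of positive ranks - n_pos*(n_pos+1)/2) / (n_pos*n_neg).
    labels: 1 where a human fixation occurred, 0 elsewhere;
    scores: the model's saliency values at the same locations."""
    labels = np.asarray(labels, bool)
    scores = np.asarray(scores, float)
    ranks = scores.argsort().argsort() + 1  # 1-based ranks (ties not handled)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A perfect saliency model scores 1.0; chance level is 0.5.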

  10. [Perceptual sharpness metric for visible and infrared color fusion images].

    PubMed

    Gao, Shao-Shu; Jin, Wei-Qi; Wang, Xia; Wang, Ling-Xue; Luo, Yuan

    2012-12-01

    For visible and infrared color fusion images, an objective sharpness assessment model is proposed to measure the clarity of detail and edge definition of the fusion image. Firstly, the contrast sensitivity function (CSF) of the human visual system is used to remove insensitive frequency components under given viewing conditions. Secondly, a perceptual contrast model, which takes the human luminance masking effect into account, is proposed based on the local band-limited contrast model. Finally, the perceptual contrast is calculated in the regions of interest (containing image details and edges) in the fusion image to evaluate image perceptual sharpness. Experimental results show that the proposed perceptual sharpness metric provides predictions that more closely match human perceptual evaluations than five existing sharpness (blur) metrics for color images. The proposed metric can thus effectively evaluate the perceptual sharpness of color fusion images.

  11. An optimization based sampling approach for multiple metrics uncertainty analysis using generalized likelihood uncertainty estimation

    NASA Astrophysics Data System (ADS)

    Zhou, Rurui; Li, Yu; Lu, Di; Liu, Haixing; Zhou, Huicheng

    2016-09-01

    This paper investigates the use of an epsilon-dominance non-dominated sorted genetic algorithm II (ɛ-NSGAII) as a sampling approach, with the aim of improving sampling efficiency for multiple metrics uncertainty analysis using Generalized Likelihood Uncertainty Estimation (GLUE). The effectiveness of ɛ-NSGAII based sampling is demonstrated in comparison with Latin hypercube sampling (LHS) by analyzing sampling efficiency, multiple metrics performance, parameter uncertainty and flood forecasting uncertainty, with a case study of flood forecasting uncertainty evaluation based on the Xinanjiang model (XAJ) for the Qing River reservoir, China. The results demonstrate the following advantages of the ɛ-NSGAII based sampling approach over LHS: (1) it is more effective and efficient; for example, the simulation time required to generate 1000 behavioral parameter sets is nine times shorter; (2) the Pareto tradeoffs between metrics are demonstrated clearly by the solutions from ɛ-NSGAII based sampling, and their Pareto optimal values are better than those of LHS, indicating better forecasting accuracy of the ɛ-NSGAII parameter sets; (3) the parameter posterior distributions from ɛ-NSGAII based sampling are concentrated in the appropriate ranges rather than uniform, which accords with their physical significance, and parameter uncertainties are reduced significantly; (4) the forecasted floods are close to the observations as evaluated by three measures: the normalized total flow outside the uncertainty intervals (FOUI), average relative band-width (RB) and average deviation amplitude (D). Flood forecasting uncertainty is also reduced considerably with ɛ-NSGAII based sampling. This study provides a new sampling approach to improve multiple metrics uncertainty analysis under the framework of GLUE, and could be used to reveal the underlying mechanisms of parameter sets under multiple conflicting metrics in the uncertainty analysis process.
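
Independent of the sampler used to generate parameter sets, the GLUE step itself, keeping "behavioral" sets above a likelihood threshold and deriving likelihood-weighted uncertainty bounds, can be sketched as below. This is an illustrative sketch: the likelihood measure, threshold, and quantile pair are placeholders, not values from the study.

```python
import numpy as np

def glue_bounds(likelihoods, predictions, threshold, q=(0.05, 0.95)):
    """GLUE-style uncertainty bounds: retain behavioral parameter
    sets (likelihood > threshold), normalize their likelihoods into
    weights, and return weighted quantiles of the ensemble
    predictions at each time step.
    likelihoods: shape (n_sets,); predictions: shape (n_sets, n_steps).
    Returns an array of shape (n_steps, 2) with lower/upper bounds."""
    keep = likelihoods > threshold
    w = likelihoods[keep] / likelihoods[keep].sum()
    preds = predictions[keep]
    bounds = []
    for t in range(preds.shape[1]):
        order = np.argsort(preds[:, t])          # sort ensemble at step t
        cdf = np.cumsum(w[order])                # weighted empirical CDF
        bounds.append(np.interp(q, cdf, preds[order, t]))
    return np.array(bounds)
```

The width of these bounds is one of the quantities (cf. the RB measure) by which the two sampling strategies are compared.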

  12. Objective evaluation of interior noise booming in a passenger car based on sound metrics and artificial neural networks.

    PubMed

    Lee, Hyun-Ho; Lee, Sang-Kwon

    2009-09-01

    Booming sound is one of the important sounds in a passenger car. The aim of the paper is to develop an objective evaluation method for interior booming sound. The method is based on sound metrics and an ANN (artificial neural network) and is called the booming index. Previous work maintained that booming sound quality is related to loudness and sharpness (the sound metrics used in psychoacoustics) and developed the booming index using the loudness and sharpness of a signal over the whole frequency range between 20 Hz and 20 kHz. In the present paper, the booming sound quality was found to be effectively related to the loudness at frequencies below 200 Hz; thus the booming index is updated using the loudness of the signal low-pass filtered at 200 Hz. The relationship between the booming index and the sound metrics is identified by an ANN. The updated booming index has been successfully applied to the objective evaluation of the booming sound quality of mass-produced passenger cars.
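
The paper's key change, restricting the loudness input to frequencies below 200 Hz, can be approximated with a simple FFT band-energy computation. This is a crude stand-in: the published index uses a psychoacoustic loudness model (and an ANN on top), not raw band energy.

```python
import numpy as np

def low_band_level(signal, fs, cutoff=200.0):
    """Mean energy of the signal in the band below `cutoff` Hz,
    computed from the magnitude spectrum. A rough proxy for the
    low-frequency content that drives the booming sensation."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = np.abs(spectrum[freqs < cutoff]) ** 2
    return band.sum() / len(signal)
```

A 100 Hz tone yields a large value here while a 1 kHz tone contributes essentially nothing, mirroring the paper's finding that only the sub-200 Hz region matters for booming.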

  13. Information fusion performance evaluation for motion imagery data using mutual information: initial study

    NASA Astrophysics Data System (ADS)

    Grieggs, Samuel M.; McLaughlin, Michael J.; Ezekiel, Soundararajan; Blasch, Erik

    2015-06-01

    As technology and internet use grow at an exponential rate, video and imagery data are becoming increasingly important. Various techniques such as Wide Area Motion Imagery (WAMI), Full Motion Video (FMV), and Hyperspectral Imaging (HSI) are used to collect motion data and extract relevant information. Detecting and identifying a particular object in imagery data is an important step in understanding visual imagery, such as content-based image retrieval (CBIR). Imagery data are segmented, automatically analyzed, and stored in a dynamic and robust database. In our system, we seek to utilize image fusion methods, which require quality metrics. Many Image Fusion (IF) algorithms have been proposed, but only a few metrics are used to evaluate their performance. In this paper, we seek a robust, objective metric to evaluate the performance of IF algorithms which compares the outcome of a given algorithm to ground truth and reports several types of errors. Given the ground truth of motion imagery data, it computes detection failure, false alarm, precision and recall metrics, background and foreground region statistics, as well as splits and merges of foreground regions. Using the Structural Similarity Index (SSIM), Mutual Information (MI), and entropy metrics, experimental results demonstrate the effectiveness of the proposed methodology for object detection, activity exploitation, and CBIR.
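
Of the metrics named above, mutual information is straightforward to sketch from a joint grey-level histogram: it measures how much information about one image is retained in another (e.g., a source image versus a fused result). A minimal illustration follows; the bin count is an arbitrary choice, not a value from the paper.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information (in bits) between two images, estimated
    from their joint grey-level histogram:
    MI = sum p(a,b) * log2( p(a,b) / (p(a) p(b)) )."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of img_a
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of img_b
    nz = p_ab > 0                           # avoid log(0)
    return float((p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])).sum())
```

An image shares maximal mutual information with itself and nearly none with a shuffled copy, which is why MI serves as a fusion-quality indicator.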

  14. Metrication report to the Congress

    NASA Technical Reports Server (NTRS)

    1989-01-01

    The major NASA metrication activity of 1988 concerned the Space Station. Although the metric system was the baseline measurement system for preliminary design studies, solicitations for final design and development of the Space Station Freedom requested use of the inch-pound system because of concerns with cost impact and potential safety hazards. Under that policy, however, use of the metric system would be permitted through waivers where its use was appropriate. Late in 1987, several Department of Defense decisions were made to increase commitment to the metric system, thereby broadening the potential base of metric involvement in U.S. industry. A re-evaluation of Space Station Freedom units of measure policy was, therefore, initiated in January 1988.

  15. Initial Ada components evaluation

    NASA Technical Reports Server (NTRS)

    Moebes, Travis

    1989-01-01

    The SAIC has the responsibility for independent test and validation of the SSE. They have been using a mathematical functions library package implemented in Ada to test the SSE IV and V process. The library package consists of elementary mathematical functions and is both machine and accuracy independent. The SSE Ada components evaluation includes code complexity metrics based on Halstead's software science metrics and McCabe's measure of cyclomatic complexity. Halstead's metrics are based on the number of operators and operands on a logical unit of code and are compiled from the number of distinct operators, distinct operands, and total number of occurrences of operators and operands. These metrics give an indication of the physical size of a program in terms of operators and operands and are used diagnostically to point to potential problems. McCabe's Cyclomatic Complexity Metrics (CCM) are compiled from flow charts transformed to equivalent directed graphs. The CCM is a measure of the total number of linearly independent paths through the code's control structure. These metrics were computed for the Ada mathematical functions library using Software Automated Verification and Validation (SAVVAS), the SSE IV and V tool. A table with selected results was shown, indicating that most of these routines are of good quality. Thresholds for the Halstead measures indicate poor quality if the length metric exceeds 260 or difficulty is greater than 190. The McCabe CCM indicated a high quality of software products.
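
The two metric families described can be computed directly from their defining counts: Halstead's measures from operator/operand tallies, and McCabe's cyclomatic complexity as M = E - N + 2P for a control-flow graph with E edges, N nodes, and P connected components. The sketch below shows the standard formulas only; it is not the SAVVAS tool.

```python
import math

def halstead(n1, n2, N1, N2):
    """Halstead length, volume, and difficulty from distinct
    operators (n1), distinct operands (n2), and their total
    occurrences (N1, N2)."""
    length = N1 + N2
    volume = length * math.log2(n1 + n2)
    difficulty = (n1 / 2) * (N2 / n2)
    return length, volume, difficulty

def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's cyclomatic complexity: M = E - N + 2P, the number
    of linearly independent paths through the control-flow graph."""
    return edges - nodes + 2 * components
```

Against the thresholds quoted above, a routine would be flagged if its Halstead length exceeded 260 or its difficulty exceeded 190.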

  16. San Luis Basin Sustainability Metrics Project: A Methodology for Evaluating Regional Sustainability

    EPA Science Inventory

    Although there are several scientifically-based sustainability metrics, many are data intensive, difficult to calculate, and fail to capture all aspects of a system. To address these issues, we produced a scientifically-defensible, but straightforward and inexpensive, methodolog...

  17. The compressed average image intensity metric for stereoscopic video quality assessment

    NASA Astrophysics Data System (ADS)

    Wilczewski, Grzegorz

    2016-09-01

    The following article presents the design, creation and testing of a novel metric for 3DTV video quality evaluation. The Compressed Average Image Intensity (CAII) mechanism is based upon stereoscopic video content analysis, with its core feature and functionality designed to serve as a versatile tool for effective 3DTV service quality assessment. Being an objective quality metric, it may be utilized as a reliable source of information about the actual performance of a given 3DTV system under strict provider evaluation. Concerning testing and the overall performance analysis of the CAII metric, the paper presents a comprehensive study of results gathered across several testing routines over a selected set of stereoscopic video samples. The designed method for stereoscopic video quality evaluation is investigated across a range of synthetic visual impairments injected into the original video stream.

  18. Testing, Requirements, and Metrics

    NASA Technical Reports Server (NTRS)

    Rosenberg, Linda; Hyatt, Larry; Hammer, Theodore F.; Huffman, Lenore; Wilson, William

    1998-01-01

    The criticality of correct, complete, testable requirements is a fundamental tenet of software engineering. Also critical is complete requirements-based testing of the final product. Modern tools for managing requirements allow new metrics to be used in support of both of these critical processes. Using these tools, potential problems with the quality of the requirements and the test plan can be identified early in the life cycle. Some of these quality factors include: ambiguous or incomplete requirements, poorly designed requirements databases, excessive or insufficient test cases, and incomplete linkage of tests to requirements. This paper discusses how metrics can be used to evaluate the quality of the requirements and tests to avoid problems later. Requirements management and requirements-based testing have always been critical in the implementation of high quality software systems. Recently, automated tools have become available to support requirements management. At NASA's Goddard Space Flight Center (GSFC), automated requirements management tools are being used on several large projects. The use of these tools opens the door to innovative uses of metrics in characterizing test plan quality and assessing overall testing risks. In support of these projects, the Software Assurance Technology Center (SATC) is working to develop and apply a metrics program that utilizes the information now available through the application of requirements management tools. Metrics based on this information provide real-time insight into the testing of requirements, and these metrics assist the Project Quality Office in its testing oversight role. This paper discusses three facets of the SATC's efforts to evaluate the quality of the requirements and test plan early in the life cycle, thus preventing costly errors and time delays later.

  19. Examination of the properties of IMRT and VMAT beams and evaluation against pre-treatment quality assurance results

    NASA Astrophysics Data System (ADS)

    Crowe, S. B.; Kairn, T.; Middlebrook, N.; Sutherland, B.; Hill, B.; Kenny, J.; Langton, C. M.; Trapp, J. V.

    2015-03-01

    This study aimed to provide a detailed evaluation and comparison of a range of modulated beam evaluation metrics, in terms of their correlation with QA testing results and their variation between treatment sites, for a large number of treatments. Ten metrics including the modulation index (MI), fluence map complexity, modulation complexity score (MCS), mean aperture displacement (MAD) and small aperture score (SAS) were evaluated for 546 beams from 122 intensity modulated radiotherapy (IMRT) and volumetric modulated arc therapy (VMAT) treatment plans targeting the anus, rectum, endometrium, brain, head and neck and prostate. The calculated sets of metrics were evaluated in terms of their relationships to each other and their correlation with the results of electronic portal imaging based quality assurance (QA) evaluations of the treatment beams. Evaluation of the MI, MAD and SAS suggested that beams used in treatments of the anus, rectum, head and neck were more complex than the prostate and brain treatment beams. Seven of the ten beam complexity metrics were found to be strongly correlated with the results from QA testing of the IMRT beams (p < 0.00008). For example, values of SAS (with multileaf collimator apertures narrower than 10 mm defined as ‘small’) less than 0.2 also identified QA-passing IMRT beams with 100% specificity. However, few of the metrics were correlated with the results from QA testing of the VMAT beams, whether they were evaluated as whole 360° arcs or as 60° sub-arcs. Selective evaluation of beam complexity metrics (at least MI, MCS and SAS) is therefore recommended as an intermediate step in the IMRT QA chain. Such evaluation may also be useful as a means of periodically reviewing VMAT planning or optimiser performance.
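
The small aperture score described above reduces to a simple fraction over multileaf collimator (MLC) leaf gaps. The sketch below is illustrative; in particular, treating zero-width gaps as closed leaf pairs and excluding them is an assumption about the definition, not a detail taken from the paper.

```python
def small_aperture_score(leaf_gaps_mm, limit_mm=10.0):
    """Small aperture score (SAS): the fraction of open MLC leaf
    gaps narrower than `limit_mm`. Gaps of zero width are treated
    as closed leaf pairs and ignored (assumption)."""
    open_gaps = [g for g in leaf_gaps_mm if g > 0]
    if not open_gaps:
        return 0.0
    return sum(g < limit_mm for g in open_gaps) / len(open_gaps)
```

Under the study's finding, a beam whose SAS computed this way stays below 0.2 would be expected to pass portal-imaging QA.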

  20. Quality assessment for color reproduction using a blind metric

    NASA Astrophysics Data System (ADS)

    Bringier, B.; Quintard, L.; Larabi, M.-C.

    2007-01-01

    This paper deals with image quality assessment. This field nowadays plays an important role in various image processing applications. A number of objective image quality metrics, which may or may not correlate with subjective quality, have been developed during the last decade. Two categories of metrics can be distinguished: full-reference and no-reference. A full-reference metric tries to evaluate the distortion introduced to an image with regard to the reference. A no-reference approach attempts to model the judgment of image quality in a blind way. Unfortunately, a universal image quality model is not on the horizon, and empirical models established on psychophysical experimentation are generally used. In this paper, we focus only on the second category to evaluate the quality of color reproduction, where a blind metric based on human visual system modeling is introduced. The objective results are validated by single-media and cross-media subjective tests.

  1. Assessing the Greenness of Chemical Reactions in the Laboratory Using Updated Holistic Graphic Metrics Based on the Globally Harmonized System of Classification and Labeling of Chemicals

    ERIC Educational Resources Information Center

    Ribeiro, M. Gabriela T. C.; Yunes, Santiago F.; Machado, Adelio A. S. C.

    2014-01-01

    Two graphic holistic metrics for assessing the greenness of synthesis, the "green star" and the "green circle", have been presented previously. These metrics assess the greenness by the degree of accomplishment of each of the 12 principles of green chemistry that apply to the case under evaluation. The criteria for assessment…

  2. Video-Based Method of Quantifying Performance and Instrument Motion During Simulated Phonosurgery

    PubMed Central

    Conroy, Ellen; Surender, Ketan; Geng, Zhixian; Chen, Ting; Dailey, Seth; Jiang, Jack

    2015-01-01

    Objectives/Hypothesis To investigate the use of the Video-Based Phonomicrosurgery Instrument Tracking System to collect instrument position data during simulated phonomicrosurgery and calculate motion metrics using these data. We used this system to determine if novice subject motion metrics improved over 1 week of training. Study Design Prospective cohort study. Methods Ten subjects performed simulated surgical tasks once per day for 5 days. Instrument position data were collected and used to compute motion metrics (path length, depth perception, and motion smoothness). Data were analyzed to determine if motion metrics improved with practice time. Task outcome was also determined each day, and relationships between task outcome and motion metrics were used to evaluate the validity of motion metrics as indicators of surgical performance. Results Significant decreases over time were observed for path length (P <.001), depth perception (P <.001), and task outcome (P <.001). No significant change was observed for motion smoothness. Significant relationships were observed between task outcome and path length (P <.001), depth perception (P <.001), and motion smoothness (P <.001). Conclusions Our system can estimate instrument trajectory and provide quantitative descriptions of surgical performance. It may be useful for evaluating phonomicrosurgery performance. Path length and depth perception may be particularly useful indicators. PMID:24737286
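
Two of the motion metrics above, path length and depth perception, can be computed directly from the sampled instrument tip positions. The sketch below is illustrative; taking the z component as the instrument's depth axis is an assumption, not necessarily the tracking system's convention.

```python
import numpy as np

def motion_metrics(positions):
    """Path length (total 3-D distance travelled by the instrument
    tip) and depth perception (total distance travelled along the
    assumed depth axis z) from a sequence of (x, y, z) samples."""
    steps = np.diff(np.asarray(positions, float), axis=0)
    path_length = np.linalg.norm(steps, axis=1).sum()
    depth_perception = np.abs(steps[:, 2]).sum()
    return path_length, depth_perception
```

Lower values of both metrics over training sessions would correspond to the improvements the study reports.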

  3. A comparison of metrics to evaluate the effects of hydro-facility passage stressors on fish

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Colotelo, Alison H.; Goldman, Amy E.; Wagner, Katie A.

    Hydropower is the most common form of renewable energy, and countries worldwide are considering expanding hydropower to new areas. One of the challenges of hydropower deployment is mitigation of the environmental impacts, including water quality, habitat alterations, and ecosystem connectivity. For fish species that inhabit river systems with hydropower facilities, passage through the facility to access spawning and rearing habitats can be particularly challenging. Fish moving downstream through a hydro-facility can be exposed to a number of stressors (e.g., rapid decompression, shear forces, blade strike and collision, and turbulence), which can all affect fish survival in direct and indirect ways. Many studies have investigated the effects of hydro-turbine passage on fish; however, the comparability among studies is limited by variation in the metrics and biological endpoints used. Future studies investigating the effects of hydro-turbine passage should focus on using metrics and endpoints that are easily comparable. This review summarizes four categories of metrics that are used in fisheries research and have application to hydro-turbine passage (i.e., mortality, injury, molecular metrics, behavior) and evaluates them based on several criteria (i.e., resources needed, invasiveness, comparability among stressors and species, and diagnostic properties). Additionally, these comparisons are put into the context of study setting (i.e., laboratory vs. field). Overall, injury and molecular metrics are ideal for studies in which there is a need to understand the mechanisms of effect, whereas behavior and mortality metrics provide information on the whole-body response of the fish. The study setting strongly influences the comparability among studies. In laboratory-based studies, stressors can be controlled by both type and magnitude, allowing for easy comparisons among studies. In contrast, field studies expose fish to realistic passage environments, but the comparability is limited. Based on these results, future studies, whether lab or field-based, should focus on metrics that relate to mortality for ease of comparison.

  4. Large radius of curvature measurement based on the evaluation of interferogram-quality metric in non-null interferometry

    NASA Astrophysics Data System (ADS)

    Yang, Zhongming; Dou, Jiantai; Du, Jinyu; Gao, Zhishan

    2018-03-01

    Non-null interferometry can be used to measure the radius of curvature (ROC); we previously presented a virtual quadratic Newton rings phase-shifting moiré-fringes method for large ROC measurement (Yang et al., 2016). In this paper, we propose a large ROC measurement method based on the evaluation of an interferogram-quality metric in a non-null interferometer. With the multi-configuration model of the non-null interferometric system in ZEMAX, the retrace errors and the phase introduced by the test surface are reconstructed. The interferogram-quality metric is obtained from the normalized phase-shifted testing Newton rings with the spherical surface model in the non-null interferometric system. The radius of curvature of the test spherical surface is obtained when the minimum of the interferogram-quality metric is found. Simulations and experimental results verify the feasibility of the proposed method. For a spherical mirror with a ROC of 41,400 mm, the measurement accuracy is better than 0.13%.

  5. Quality Measures for Dialysis: Time for a Balanced Scorecard

    PubMed Central

    2016-01-01

    Recent federal legislation establishes a merit-based incentive payment system for physicians, with a scorecard for each professional. The Centers for Medicare and Medicaid Services evaluate quality of care with clinical performance measures and have used these metrics for public reporting and payment to dialysis facilities. Similar metrics may be used for the future merit-based incentive payment system. In nephrology, most clinical performance measures measure processes and intermediate outcomes of care. These metrics were developed from population studies of best practice and do not identify opportunities for individualizing care on the basis of patient characteristics and individual goals of treatment. The In-Center Hemodialysis (ICH) Consumer Assessment of Healthcare Providers and Systems (CAHPS) survey examines patients' perception of care and has entered the arena to evaluate quality of care. A balanced scorecard of quality performance should include three elements: population-based best clinical practice, patient perceptions, and individually crafted patient goals of care. PMID:26316622

  6. Evaluation schemes for video and image anomaly detection algorithms

    NASA Astrophysics Data System (ADS)

    Parameswaran, Shibin; Harguess, Josh; Barngrover, Christopher; Shafer, Scott; Reese, Michael

    2016-05-01

    Video anomaly detection is a critical research area in computer vision and a natural first step before applying object recognition algorithms. Many algorithms that detect anomalies (outliers) in videos and images have been introduced in recent years. However, these algorithms behave and perform differently based on differences in the domains and tasks to which they are applied. In order to better understand the strengths and weaknesses of outlier algorithms and their applicability in a particular domain or task of interest, it is important to measure and quantify their performance using appropriate evaluation metrics. Many evaluation metrics have been used in the literature, such as precision curves, precision-recall curves, and receiver operating characteristic (ROC) curves. In order to construct these different metrics, it is also important to choose an appropriate evaluation scheme that decides when a proposed detection is considered a true or a false detection. Choosing the right evaluation metric and the right scheme is critical, since the choice can introduce positive or negative bias in the measuring criterion and may favor (or work against) a particular algorithm or task. In this paper, we review evaluation metrics and popular evaluation schemes that are used to measure the performance of anomaly detection algorithms on videos and imagery with one or more anomalies. We analyze the biases introduced by these choices by measuring the performance of an existing anomaly detection algorithm.
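
A concrete example of an "evaluation scheme" in the sense used here is the intersection-over-union rule for deciding whether a proposed detection counts as true. This is a standard sketch, not a scheme from this paper; the 0.5 threshold is a common convention, and changing it shifts the resulting precision/recall numbers, which is exactly the bias the abstract discusses.

```python
def is_true_detection(pred_box, gt_box, iou_threshold=0.5):
    """Decide whether a predicted box (x1, y1, x2, y2) matches a
    ground-truth box: true if their intersection-over-union meets
    the threshold."""
    ax1, ay1, ax2, ay2 = pred_box
    bx1, by1, bx2, by2 = gt_box
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # overlap height
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union >= iou_threshold
```

Counting true/false detections under this rule is what feeds the precision-recall and ROC curves mentioned above.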

  7. Meaningful Assessment of Robotic Surgical Style using the Wisdom of Crowds.

    PubMed

    Ershad, M; Rege, R; Fey, A Majewicz

    2018-07-01

    Quantitative assessment of surgical skills is an important aspect of surgical training; however, the proposed metrics are sometimes difficult to interpret and may not capture the stylistic characteristics that define expertise. This study proposes a methodology for evaluating surgical skill based on metrics associated with stylistic adjectives, and evaluates the ability of this method to differentiate expertise levels. We recruited subjects from different expertise levels to perform training tasks on a surgical simulator. A lexicon of contrasting adjective pairs, based on important skills for robotic surgery and inspired by the global evaluative assessment of robotic skills tool, was developed. To validate the use of stylistic adjectives for surgical skill assessment, posture videos of the subjects performing the task, as well as videos of the task itself, were rated by crowd-workers. Metrics associated with each adjective were found from kinematic and physiological measurements through correlation with the crowd-sourced adjective ratings. To evaluate the chosen metrics' ability to distinguish expertise levels, two classifiers were trained and tested using these metrics. Crowd-sourced ratings for all adjectives were significantly correlated with expertise levels. The results indicate that the naive Bayes classifier performs best, with an accuracy of [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text] when classifying into four, three, and two levels of expertise, respectively. The proposed method is effective at mapping understandable adjectives of expertise to the stylistic movements and physiological response of trainees.
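For illustration only, here is a minimal Gaussian naive Bayes sketch in the spirit of the classifier the abstract reports on. The feature semantics (per-trial metric values for two stylistic adjectives) and the training data are hypothetical, not the authors':

```python
import math

def fit(X, y):
    """Per-class feature means/variances and priors: {label: (means, vars, prior)}."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))
        means = [sum(c) / len(rows) for c in cols]
        varis = [max(sum((v - m) ** 2 for v in c) / len(rows), 1e-9)
                 for c, m in zip(cols, means)]
        model[label] = (means, varis, len(rows) / len(X))
    return model

def predict(model, x):
    """Pick the class with the highest Gaussian log-posterior."""
    def log_post(means, varis, prior):
        lp = math.log(prior)
        for v, m, s2 in zip(x, means, varis):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return lp
    return max(model, key=lambda label: log_post(*model[label]))

# Hypothetical per-trial metric values for two stylistic adjectives.
X = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.9]]
y = ["expert", "expert", "novice", "novice"]
model = fit(X, y)
```

A real study would cross-validate on many trials per subject; this sketch only shows the shape of the classification step.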

  8. Measurement of Chronic Pain and Opioid Use Evaluation in Community-Based Persons with Serious Illnesses

    PubMed Central

    Naidu, Ramana K.

    2018-01-01

    Abstract Background: Chronic pain associated with serious illnesses is having a major impact on population health in the United States. Accountability for high quality care for community-dwelling patients with serious illnesses requires selection of metrics that capture the burden of chronic pain whose treatment may be enhanced or complicated by opioid use. Objective: Our aim was to evaluate options for assessing pain in seriously ill community dwelling adults, to discuss the use/abuse of opioids in individuals with chronic pain, and to suggest pain and opioid use metrics that can be considered for screening and evaluation of patient responses and quality care. Design: Structured literature review. Measurements: Evaluation of pain and opioid use assessment metrics and measures for their potential usefulness in the community. Results: Several pain and opioid assessment instruments are available for consideration. Yet, no one pain instrument has been identified as “the best” to assess pain in seriously ill community-dwelling patients. Screening tools exist that are specific to the assessment of risk in opioid management. Opioid screening can assess risk based on substance use history, general risk taking, and reward-seeking behavior. Conclusions: Accountability for high quality care for community-dwelling patients requires selection of metrics that will capture the burden of chronic pain and beneficial use or misuse of opioids. Future research is warranted to identify, modify, or develop instruments that contain important metrics, demonstrate a balance between sensitivity and specificity, and address patient preferences and quality outcomes. PMID:29091525

  9. Index of cyber integrity

    NASA Astrophysics Data System (ADS)

    Anderson, Gustave

    2014-05-01

    Unfortunately, there is no metric, nor set of metrics, that is both general enough to encompass all possible types of applications and specific enough to capture the application- and attack-specific details. As a result we are left with ad-hoc methods for generating evaluations of the security of our systems. Current state-of-the-art methods for evaluating the security of systems include penetration testing and cyber evaluation tests. For these evaluations, security professionals simulate an attack from malicious outsiders and malicious insiders. These evaluations are very productive and are able to discover potential vulnerabilities resulting from improper system configuration, hardware and software flaws, or operational weaknesses. We therefore propose the index of cyber integrity (ICI), which is modeled after the index of biological integrity (IBI) to provide a holistic measure of the health of a system under test in a cyber-environment. The ICI provides a broad-based measure through a collection of application- and system-specific metrics. In this paper, following the example of the IBI, we demonstrate how a multi-metric index may be used as a holistic measure of the health of a system under test in a cyber-environment.

  10. Creating "Intelligent" Ensemble Averages Using a Process-Based Framework

    NASA Astrophysics Data System (ADS)

    Baker, Noel; Taylor, Patrick

    2014-05-01

    The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is used to add value to individual model projections and construct a consensus projection. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, individual models reproduce certain climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting of multi-model ensembles. The intention is to produce improved ("intelligent") unequal-weight ensemble averages. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Several climate process metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument in combination with surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing the equal-weighted ensemble average and an ensemble weighted using the process-based metric. 
Additionally, this study investigates the dependence of the metric weighting scheme on the climate state using a combination of model simulations including a non-forced preindustrial control experiment, historical simulations, and several radiative forcing Representative Concentration Pathway (RCP) scenarios. Ultimately, the goal of the framework is to advise better methods for ensemble averaging models and create better climate predictions.
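The contrast between an equal-weighted and an unequal-weighted ensemble mean can be sketched as follows. The projections, the inverse-error weighting rule, and the metric error values are all illustrative assumptions, not the study's actual weighting scheme:

```python
def weighted_ensemble_mean(projections, metric_errors):
    """Combine model projections with weights proportional to 1 / metric error."""
    raw = [1.0 / e for e in metric_errors]
    weights = [w / sum(raw) for w in raw]
    return sum(w * p for w, p in zip(weights, projections)), weights

# Three hypothetical model projections of a regional change (K) and each
# model's error on a process-based metric (smaller error = better model).
change, weights = weighted_ensemble_mean([2.0, 3.0, 4.0], [0.5, 1.0, 2.0])
equal_mean = sum([2.0, 3.0, 4.0]) / 3  # the default equal-weighted estimate
```

With these invented numbers the weighted mean is pulled toward the low-error model's projection, which is exactly the kind of regional shift relative to the equal-weighted average that the abstract describes.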

  11. Metrics for Evaluating the Accuracy of Solar Power Forecasting: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, J.; Hodge, B. M.; Florita, A.

    2013-10-01

    Forecasting solar energy generation is a challenging task due to the variety of solar power systems and weather regimes encountered. Forecast inaccuracies can result in substantial economic losses and power system reliability issues. This paper presents a suite of generally applicable and value-based metrics for solar forecasting for a comprehensive set of scenarios (i.e., different time horizons, geographic locations, applications, etc.). In addition, a comprehensive framework is developed to analyze the sensitivity of the proposed metrics to three types of solar forecasting improvements using a design of experiments methodology, in conjunction with response surface and sensitivity analysis methods. The results show that the developed metrics can efficiently evaluate the quality of solar forecasts, and assess the economic and reliability impact of improved solar forecasting.
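A few of the standard deterministic error measures such a metric suite typically builds on can be sketched as below. The paper's full suite is broader (including value-based measures not shown here), and the hourly power values are invented:

```python
import math

def forecast_metrics(forecast, actual):
    """Mean bias error, mean absolute error, and RMSE of a point forecast."""
    errors = [f - a for f, a in zip(forecast, actual)]
    n = len(errors)
    return {
        "MBE": sum(errors) / n,                           # systematic bias
        "MAE": sum(abs(e) for e in errors) / n,           # average magnitude
        "RMSE": math.sqrt(sum(e * e for e in errors) / n) # penalizes large misses
    }

# Invented hourly solar power values (MW): forecast vs. measured.
m = forecast_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

Note how offsetting errors cancel in the MBE but not in the MAE or RMSE, which is why a suite of metrics rather than any single one is needed.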

  12. The influence of soil properties and nutrients on conifer forest growth in Sweden, and the first steps in developing a nutrient availability metric

    NASA Astrophysics Data System (ADS)

    Van Sundert, Kevin; Horemans, Joanna A.; Stendahl, Johan; Vicca, Sara

    2018-06-01

    The availability of nutrients is one of the factors that regulate terrestrial carbon cycling and modify ecosystem responses to environmental changes. Nonetheless, nutrient availability is often overlooked in climate-carbon cycle studies because it depends on the interplay of various soil factors that would ideally be comprised into metrics applicable at large spatial scales. Such metrics do not currently exist. Here, we use a Swedish forest inventory database that contains soil data and tree growth data for > 2500 forests across Sweden to (i) test which combination of soil factors best explains variation in tree growth, (ii) evaluate an existing metric of constraints on nutrient availability, and (iii) adjust this metric for boreal forest data. With (iii), we thus aimed to provide an adjustable nutrient metric, applicable for Sweden and with potential for elaboration to other regions. While taking into account confounding factors such as climate, N deposition, and soil oxygen availability, our analyses revealed that the soil organic carbon concentration (SOC) and the ratio of soil carbon to nitrogen (C : N) were the most important factors explaining variation in normalized (climate-independent) productivity (mean annual volume increment - m3 ha-1 yr-1) across Sweden. Normalized forest productivity was significantly negatively related to the soil C : N ratio (R2 = 0.02-0.13), while SOC exhibited an empirical optimum (R2 = 0.05-0.15). For the metric, we started from a (yet unvalidated) metric for constraints on nutrient availability that was previously developed by the International Institute for Applied Systems Analysis (IIASA - Laxenburg, Austria) for evaluating potential productivity of arable land. 
This IIASA metric requires information on soil properties that are indicative of nutrient availability (SOC, soil texture, total exchangeable bases - TEB, and pH) and is based on theoretical considerations that are also generally valid for nonagricultural ecosystems. However, the IIASA metric was unrelated to normalized forest productivity across Sweden (R2 = 0.00-0.01) because the soil factors under consideration were not optimally implemented according to the Swedish data, and because the soil C : N ratio was not included. Using two methods (each one based on a different way of normalizing productivity for climate), we adjusted this metric by incorporating soil C : N and modifying the relationship between SOC and nutrient availability in view of the observed relationships across our database. In contrast to the IIASA metric, the adjusted metrics explained some variation in normalized productivity in the database (R2 = 0.03-0.21; depending on the applied method). A test for five manually selected local fertility gradients in our database revealed a significant and stronger relationship between the adjusted metrics and productivity for each of the gradients (R2 = 0.09-0.38). This study thus shows for the first time how nutrient availability metrics can be evaluated and adjusted for a particular ecosystem type, using a large-scale database.

  13. A Review of Indicators of Estuarine Tidal Wetland Condition

    EPA Science Inventory

    This review critically evaluates indicators of tidal wetland condition based on 36 indicator development studies and indicators developed as part of U.S. state tidal wetland monitoring programs. Individual metrics were evaluated based on relative scores on two sets of evaluation ...

  14. Evaluation metrics for bone segmentation in ultrasound

    NASA Astrophysics Data System (ADS)

    Lougheed, Matthew; Fichtinger, Gabor; Ungi, Tamas

    2015-03-01

    Tracked ultrasound is a safe alternative to X-ray for imaging bones. The interpretation of bony structures is challenging as ultrasound has no specific intensity characteristic of bones. Several image segmentation algorithms have been devised to identify bony structures. We propose an open-source framework that aids in the development and comparison of such algorithms by quantitatively measuring segmentation performance in ultrasound images. True-positive and false-negative metrics used in the framework quantify algorithm performance based on correctly segmented bone and correctly segmented boneless regions. Ground truth for these metrics is defined manually and, along with the corresponding automatically segmented image, is used for the performance analysis. Manually created ground-truth tests were generated to verify the accuracy of the analysis. Further evaluation metrics for determining average performance per slice and standard deviation are considered. The metrics provide a means of evaluating the accuracy of frames along the length of a volume. This aids in assessing the accuracy of the volume itself and the approach to image acquisition (positioning and frequency of frames). The framework was implemented as an open-source module of the 3D Slicer platform. The ground-truth tests verified that the framework correctly calculates the implemented metrics. The developed framework provides a convenient way to evaluate bone segmentation algorithms. The implementation fits into a widely used application for segmentation algorithm prototyping. Future algorithm development will benefit from monitoring the effects of adjustments to an algorithm in a standard evaluation framework.
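The per-frame true-positive/false-negative idea can be sketched on flat binary masks. The masks below are invented, and a real implementation would operate on 2D image frames rather than flat lists:

```python
def frame_metrics(pred, truth):
    """Sensitivity (correctly segmented bone) and specificity (correctly
    segmented boneless regions) for one binary frame."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    return tp / (tp + fn), tn / (tn + fp)

# Invented flat masks for one frame: 1 = bone pixel, 0 = boneless.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
sensitivity, specificity = frame_metrics(pred, truth)
```

Averaging these per-frame values along the volume, and computing their standard deviation, gives the slice-by-slice accuracy profile the abstract describes.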

  15. Revision and extension of Eco-LCA metrics for sustainability assessment of the energy and chemical processes.

    PubMed

    Yang, Shiying; Yang, Siyu; Kraslawski, Andrzej; Qian, Yu

    2013-12-17

    Ecologically based life cycle assessment (Eco-LCA) is an appealing approach for the evaluation of resources utilization and environmental impacts of the process industries from an ecological scale. However, the aggregated metrics of Eco-LCA suffer from some drawbacks: the environmental impact metric has limited applicability; the resource utilization metric ignores indirect consumption; the renewability metric fails to address the quantitative distinction of resources availability; the productivity metric seems self-contradictory. In this paper, the existing Eco-LCA metrics are revised and extended for sustainability assessment of the energy and chemical processes. A new Eco-LCA metrics system is proposed, including four independent dimensions: environmental impact, resource utilization, resource availability, and economic effectiveness. An illustrative example of comparing assessment between a gas boiler and a solar boiler process provides insight into the features of the proposed approach.

  16. A scoring mechanism for the rank aggregation of network robustness

    NASA Astrophysics Data System (ADS)

    Yazdani, Alireza; Dueñas-Osorio, Leonardo; Li, Qilin

    2013-10-01

    To date, a number of metrics have been proposed to quantify inherent robustness of network topology against failures. However, each single metric usually only offers a limited view of network vulnerability to different types of random failures and targeted attacks. When applied to certain network configurations, different metrics rank network topology robustness in different orders which is rather inconsistent, and no single metric fully characterizes network robustness against different modes of failure. To overcome such inconsistency, this work proposes a multi-metric approach as the basis of evaluating aggregate ranking of network topology robustness. This is based on simultaneous utilization of a minimal set of distinct robustness metrics that are standardized so to give way to a direct comparison of vulnerability across networks with different sizes and configurations, hence leading to an initial scoring of inherent topology robustness. Subsequently, based on the inputs of initial scoring a rank aggregation method is employed to allocate an overall ranking of robustness to each network topology. A discussion is presented in support of the presented multi-metric approach and its applications to more realistically assess and rank network topology robustness.
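A minimal sketch of the two-step idea: standardize metric scores so networks of different sizes and configurations become comparable, then aggregate per-metric ranks. A simple Borda count stands in here for the paper's rank-aggregation method, and the metric names and scores are hypothetical:

```python
import statistics

def zscores(values):
    """Standardize so metrics on different scales become comparable."""
    mu, sd = statistics.mean(values), statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def borda_ranking(metric_table):
    """metric_table: {metric_name: [score per network]}, higher = more robust.
    Returns network indices, most robust first."""
    n = len(next(iter(metric_table.values())))
    points = [0] * n
    for scores in metric_table.values():
        zs = zscores(scores)
        order = sorted(range(n), key=lambda i: zs[i])
        for pts, i in enumerate(order):  # worst network gets 0 points, best n-1
            points[i] += pts
    return sorted(range(n), key=lambda i: -points[i])

# Three hypothetical networks scored by two robustness metrics.
ranking = borda_ranking({
    "algebraic_connectivity": [0.1, 0.5, 0.3],
    "average_efficiency": [0.2, 0.6, 0.4],
})
```

With more metrics the per-metric orderings typically disagree, which is precisely the inconsistency the aggregate ranking is meant to resolve.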

  17. MESUR: USAGE-BASED METRICS OF SCHOLARLY IMPACT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    BOLLEN, JOHAN; RODRIGUEZ, MARKO A.; VAN DE SOMPEL, HERBERT

    2007-01-30

    The evaluation of scholarly communication items is now largely a matter of expert opinion or metrics derived from citation data. Both approaches can fail to take into account the myriad of factors that shape scholarly impact. Usage data has emerged as a promising complement to existing methods of assessment, but the formal groundwork to reliably and validly apply usage-based metrics of scholarly impact is lacking. The Andrew W. Mellon Foundation funded MESUR project constitutes a systematic effort to define, validate and cross-validate a range of usage-based metrics of scholarly impact by creating a semantic model of the scholarly communication process. The constructed model will serve as the basis for creating a large-scale semantic network that seamlessly relates citation, bibliographic and usage data from a variety of sources. A subsequent program that uses the established semantic network as a reference data set will determine the characteristics and semantics of a variety of usage-based metrics of scholarly impact. This paper outlines the architecture and methodology adopted by the MESUR project and its future direction.

  18. Developing image processing meta-algorithms with data mining of multiple metrics.

    PubMed

    Leung, Kelvin; Cunha, Alexandre; Toga, A W; Parker, D Stott

    2014-01-01

    People often use multiple metrics in image processing, but here we take a novel approach of mining the values of batteries of metrics on image processing results. We present a case for extending image processing methods to incorporate automated mining of multiple image metric values. Here by a metric we mean any image similarity or distance measure, and in this paper we consider intensity-based and statistical image measures and focus on registration as an image processing problem. We show how it is possible to develop meta-algorithms that evaluate different image processing results with a number of different metrics and mine the results in an automated fashion so as to select the best results. We show that the mining of multiple metrics offers a variety of potential benefits for many image processing problems, including improved robustness and validation.

  19. State of the States 2009. Renewable Energy Development and the Role of Policy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Doris, Elizabeth; McLaren, Joyce; Healey, Victoria

    2009-10-01

    This report tracks the progress of U.S. renewable energy development at the state level, with metrics on development status and reviews of relevant policies. The analysis offers state-by-state policy suggestions and develops performance-based evaluation metrics to accelerate and improve renewable energy development.

  20. Can metric-based approaches really improve multi-model climate projections? A perfect model framework applied to summer temperature change in France.

    NASA Astrophysics Data System (ADS)

    Boé, Julien; Terray, Laurent

    2014-05-01

    Ensemble approaches for climate change projections have become ubiquitous. Because of large model-to-model variations and, generally, lack of rationale for the choice of a particular climate model against others, it is widely accepted that future climate change and its impacts should not be estimated based on a single climate model. Generally, as a default approach, the multi-model ensemble mean (MMEM) is considered to provide the best estimate of climate change signals. The MMEM approach is based on the implicit hypothesis that all the models provide equally credible projections of future climate change. This hypothesis is unlikely to be true and ideally one would want to give more weight to more realistic models. A major issue with this alternative approach lies in the assessment of the relative credibility of future climate projections from different climate models, as they can only be evaluated against present-day observations: which present-day metric(s) should be used to decide which models are "good" and which models are "bad" in the future climate? Once a supposedly informative metric has been found, other issues arise. What is the best statistical method to combine multiple models results taking into account their relative credibility measured by a given metric? How to be sure in the end that the metric-based estimate of future climate change is not in fact less realistic than the MMEM? It is impossible to provide strict answers to those questions in the climate change context. Yet, in this presentation, we propose a methodological approach based on a perfect model framework that could bring some useful elements of answer to the questions previously mentioned. The basic idea is to take a random climate model in the ensemble and treat it as if it were the truth (results of this model, in both past and future climate, are called "synthetic observations"). 
Then, all the other members of the multi-model ensemble are used to derive, via a metric-based approach, a posterior estimate of climate change based on the synthetic observation of the metric. Finally, it is possible to compare the posterior estimate to the synthetic observation of future climate change to evaluate the skill of the method. The main objective of this presentation is to describe and apply this perfect model framework to test different methodological issues associated with non-uniform model weighting and similar metric-based approaches. The methodology presented is general, but will be applied to the specific case of summer temperature change in France, for which previous works have suggested potentially useful metrics associated with soil-atmosphere and cloud-temperature interactions. The relative performances of different simple statistical approaches to combining multiple model results based on metrics will be tested. The impact of ensemble size, observational errors, internal variability, and model similarity will be characterized. The potential improvement of metric-based approaches over the MMEM in terms of errors and uncertainties will be quantified.

  1. Creating "Intelligent" Climate Model Ensemble Averages Using a Process-Based Framework

    NASA Astrophysics Data System (ADS)

    Baker, N. C.; Taylor, P. C.

    2014-12-01

    The CMIP5 archive contains future climate projections from over 50 models provided by dozens of modeling centers from around the world. Individual model projections, however, are subject to biases created by structural model uncertainties. As a result, ensemble averaging of multiple models is often used to add value to model projections: consensus projections have been shown to consistently outperform individual models. Previous reports for the IPCC establish climate change projections based on an equal-weighted average of all model projections. However, certain models reproduce climate processes better than other models. Should models be weighted based on performance? Unequal ensemble averages have previously been constructed using a variety of mean state metrics. What metrics are most relevant for constraining future climate projections? This project develops a framework for systematically testing metrics in models to identify optimal metrics for unequal weighting of multi-model ensembles. A unique aspect of this project is the construction and testing of climate process-based model evaluation metrics. A climate process-based metric is defined as a metric based on the relationship between two physically related climate variables—e.g., outgoing longwave radiation and surface temperature. Metrics are constructed using high-quality Earth radiation budget data from NASA's Clouds and Earth's Radiant Energy System (CERES) instrument and surface temperature data sets. It is found that regional values of tested quantities can vary significantly when comparing weighted and unweighted model ensembles. For example, one tested metric weights the ensemble by how well models reproduce the time-series probability distribution of the cloud forcing component of reflected shortwave radiation. 
The weighted ensemble for this metric indicates lower simulated precipitation (up to 0.7 mm/day) in tropical regions than the unweighted ensemble: since CMIP5 models have been shown to overproduce precipitation, this result could indicate that the metric is effective in identifying models which simulate more realistic precipitation. Ultimately, the goal of the framework is to identify performance metrics that advise better methods for ensemble averaging models and create better climate predictions.

  2. Evaluation Metrics for Biostatistical and Epidemiological Collaborations

    PubMed Central

    Rubio, Doris McGartland; del Junco, Deborah J.; Bhore, Rafia; Lindsell, Christopher J.; Oster, Robert A.; Wittkowski, Knut M.; Welty, Leah J.; Li, Yi-Ju; DeMets, Dave

    2011-01-01

    Increasing demands for evidence-based medicine and for the translation of biomedical research into individual and public health benefit have been accompanied by the proliferation of special units that offer expertise in biostatistics, epidemiology, and research design (BERD) within academic health centers. Objective metrics that can be used to evaluate, track, and improve the performance of these BERD units are critical to their successful establishment and sustainable future. To develop a set of reliable but versatile metrics that can be adapted easily to different environments and evolving needs, we consulted with members of BERD units from the consortium of academic health centers funded by the Clinical and Translational Science Award Program of the National Institutes of Health. Through a systematic process of consensus building and document drafting, we formulated metrics that covered the three identified domains of BERD practices: the development and maintenance of collaborations with clinical and translational science investigators, the application of BERD-related methods to clinical and translational research, and the discovery of novel BERD-related methodologies. In this article, we describe the set of metrics and advocate their use for evaluating BERD practices. The routine application, comparison of findings across diverse BERD units, and ongoing refinement of the metrics will identify trends, facilitate meaningful changes, and ultimately enhance the contribution of BERD activities to biomedical research. PMID:21284015

  3. Incorporating big data into treatment plan evaluation: Development of statistical DVH metrics and visualization dashboards.

    PubMed

    Mayo, Charles S; Yao, John; Eisbruch, Avraham; Balter, James M; Litzenberg, Dale W; Matuszak, Martha M; Kessler, Marc L; Weyburn, Grant; Anderson, Carlos J; Owen, Dawn; Jackson, William C; Ten Haken, Randall

    2017-01-01

    To develop statistical dose-volume histogram (DVH)-based metrics and a visualization method to quantify the comparison of treatment plans with historical experience and among different institutions. The descriptive statistical summary (ie, median, first and third quartiles, and 95% confidence intervals) of volume-normalized DVH curve sets of past experiences was visualized through the creation of statistical DVH plots. Detailed distribution parameters were calculated and stored in JavaScript Object Notation files to facilitate management, including transfer and potential multi-institutional comparisons. In the treatment plan evaluation, structure DVH curves were scored against computed statistical DVHs and weighted experience scores (WESs). Individual, clinically used, DVH-based metrics were integrated into a generalized evaluation metric (GEM) as a priority-weighted sum of normalized incomplete gamma functions. Historical treatment plans for 351 patients with head and neck cancer, 104 with prostate cancer who were treated with conventional fractionation, and 94 with liver cancer who were treated with stereotactic body radiation therapy were analyzed to demonstrate the usage of statistical DVH, WES, and GEM in a plan evaluation. A shareable dashboard plugin was created to display statistical DVHs and integrate GEM and WES scores into a clinical plan evaluation within the treatment planning system. Benchmarking with normal tissue complication probability scores was carried out to compare the behavior of GEM and WES scores. DVH curves from historical treatment plans were characterized and presented, with difficult-to-spare structures (ie, frequently compromised organs at risk) identified. Quantitative evaluations by GEM and/or WES compared favorably with the normal tissue complication probability Lyman-Kutcher-Burman model, transforming a set of discrete threshold-priority limits into a continuous model reflecting physician objectives and historical experience. 
Statistical DVH offers an easy-to-read, detailed, and comprehensive way to visualize the quantitative comparison with historical experiences and among institutions. WES and GEM metrics offer a flexible means of incorporating discrete threshold-prioritizations and historic context into a set of standardized scoring metrics. Together, they provide a practical approach for incorporating big data into clinical practice for treatment plan evaluations.
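The pointwise summary behind a statistical DVH can be sketched as below. The curves are invented, and a real implementation would first volume-normalize the curves and bin the doses as described in the abstract:

```python
import statistics

def statistical_dvh(curves):
    """Pointwise quartile summary of equal-length DVH curves, where each
    curve gives the volume fraction receiving at least each dose bin."""
    summary = []
    for bin_values in zip(*curves):
        q1, median, q3 = statistics.quantiles(bin_values, n=4)
        summary.append({"q1": q1, "median": median, "q3": q3})
    return summary

# Three invented historical DVH curves: volume fraction at four dose bins.
curves = [
    [1.0, 0.8, 0.4, 0.1],
    [1.0, 0.7, 0.5, 0.2],
    [1.0, 0.9, 0.3, 0.0],
]
stats = statistical_dvh(curves)
```

Scoring a new plan's curve against these per-bin distributions is the starting point for the WES and GEM scores described above; their exact weighting functions are not reproduced here.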

  4. Reference-free ground truth metric for metal artifact evaluation in CT images.

    PubMed

    Kratz, Bärbel; Ens, Svitlana; Müller, Jan; Buzug, Thorsten M

    2011-07-01

    In computed tomography (CT), metal objects in the region of interest introduce data inconsistencies during acquisition. Reconstructing these data results in an image with star-shaped artifacts induced by the metal inconsistencies. To enhance image quality, the influence of the metal objects can be reduced by different metal artifact reduction (MAR) strategies. For an adequate evaluation of new MAR approaches, a ground-truth reference data set is needed. In technical evaluations, where phantoms can be measured with and without metal inserts, ground-truth data can easily be obtained by a second reference acquisition. Obviously, this is not possible for clinical data. Here, an alternative evaluation method is presented without the need for an additionally acquired reference data set. The proposed metric provides an inherent ground truth for evaluating metal artifacts and comparing MAR methods, with no reference information in the form of a second acquisition needed. The method is based on the forward projection of a reconstructed image, which is compared to the actually measured projection data. The new evaluation technique is performed on phantom and on clinical CT data with and without MAR. The metric results are then compared with methods using a reference data set as well as an expert-based classification. It is shown that the new approach is an adequate quantification technique for artifact strength in reconstructed metal or MAR CT images. The presented method works solely on the original projection data itself, which yields some advantages compared to distance measures in the image domain using two data sets. Besides this, no parameters have to be chosen manually. The new metric is a useful evaluation alternative when no reference data are available.
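The reference-free idea can be sketched in miniature: forward-project the reconstructed image and measure its deviation from the measured projection data. A real CT forward projector models the scanner geometry over many view angles; the single parallel-beam view (column sums) and all numbers below are purely illustrative:

```python
def forward_project(image):
    """One parallel-beam view of a 2D image: sum along each column."""
    return [sum(row[j] for row in image) for j in range(len(image[0]))]

def projection_error(image, measured):
    """Mean absolute deviation between the re-projection and the measured data."""
    reproj = forward_project(image)
    return sum(abs(r - m) for r, m in zip(reproj, measured)) / len(measured)

measured = [2.0, 4.0, 2.0]  # hypothetical measured projection values
reconstruction = [
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 1.0],  # metal-induced inconsistency in the middle column
]
err = projection_error(reconstruction, measured)
```

A lower re-projection error after applying a MAR method would then indicate that the corrected image is more consistent with the originally measured data, with no second acquisition required.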

  5. Pharmacy Dashboard: An Innovative Process for Pharmacy Workload and Productivity.

    PubMed

    Kinney, Ashley; Bui, Quyen; Hodding, Jane; Le, Jennifer

    2017-03-01

    Background: Innovative approaches, including LEAN systems and dashboards, to enhance pharmacy production continue to evolve in a cost- and safety-conscious health care environment. Furthermore, implementing and evaluating the effectiveness of these novel methods continues to be challenging for pharmacies. Objective: To describe a comprehensive, real-time pharmacy dashboard that incorporated LEAN methodologies and evaluate its utilization in an inpatient Central Intravenous Additives Services (CIVAS) pharmacy. Methods: Long Beach Memorial Hospital (462 adult beds) and Miller Children's and Women's Hospital of Long Beach (combined 324 beds) are tertiary not-for-profit, community-based hospitals that are served by one CIVAS pharmacy. Metrics to evaluate the effectiveness of CIVAS were developed and implemented on a dashboard in real time from March 2013 to March 2014. Results: The metrics that were designed and implemented to evaluate the effectiveness of CIVAS were quality and value, financial resilience, and the department's people and culture. Using a dashboard that integrated these metrics, the accuracy of manufacturing defect-free products was ≥99.9%, indicating excellent quality and value of CIVAS. The metric for financial resilience demonstrated a cost savings of $78,000 annually within pharmacy by eliminating the outsourcing of products. People and culture metrics on the dashboard focused on standard work, with an overall 94.6% compliance with the workflow. Conclusion: A unique dashboard that incorporated metrics to monitor 3 important areas was successfully implemented to improve the effectiveness of CIVAS pharmacy. These metrics helped pharmacy to monitor progress in real time, allowing attainment of production goals and fostering continuous quality improvement through LEAN work.

  6. Pharmacy Dashboard: An Innovative Process for Pharmacy Workload and Productivity

    PubMed Central

    Bui, Quyen; Hodding, Jane; Le, Jennifer

    2017-01-01

    Background: Innovative approaches, including LEAN systems and dashboards, to enhance pharmacy production continue to evolve in a cost- and safety-conscious health care environment. Furthermore, implementing and evaluating the effectiveness of these novel methods continues to be challenging for pharmacies. Objective: To describe a comprehensive, real-time pharmacy dashboard that incorporated LEAN methodologies and evaluate its utilization in an inpatient Central Intravenous Additives Services (CIVAS) pharmacy. Methods: Long Beach Memorial Hospital (462 adult beds) and Miller Children's and Women's Hospital of Long Beach (combined 324 beds) are tertiary not-for-profit, community-based hospitals that are served by one CIVAS pharmacy. Metrics to evaluate the effectiveness of CIVAS were developed and implemented on a dashboard in real time from March 2013 to March 2014. Results: The metrics that were designed and implemented to evaluate the effectiveness of CIVAS were quality and value, financial resilience, and the department's people and culture. Using a dashboard that integrated these metrics, the accuracy of manufacturing defect-free products was ≥99.9%, indicating excellent quality and value of CIVAS. The metric for financial resilience demonstrated a cost savings of $78,000 annually within pharmacy by eliminating the outsourcing of products. People and culture metrics on the dashboard focused on standard work, with an overall 94.6% compliance with the workflow. Conclusion: A unique dashboard that incorporated metrics to monitor 3 important areas was successfully implemented to improve the effectiveness of CIVAS pharmacy. These metrics helped pharmacy to monitor progress in real time, allowing attainment of production goals and fostering continuous quality improvement through LEAN work. PMID:28439134

  7. Synthesized view comparison method for no-reference 3D image quality assessment

    NASA Astrophysics Data System (ADS)

    Luo, Fangzhou; Lin, Chaoyi; Gu, Xiaodong; Ma, Xiaojun

    2018-04-01

    We develop a no-reference image quality assessment metric to evaluate the quality of synthesized views rendered from the Multi-view Video plus Depth (MVD) format. Our metric is named Synthesized View Comparison (SVC) and is designed for real-time quality monitoring at the receiver side in a 3D-TV system. The metric utilizes virtual middle views that are warped from the left and right views by a depth-image-based rendering (DIBR) algorithm, and compares the difference between the virtual views rendered from the different cameras using Structural SIMilarity (SSIM), a popular 2D full-reference image quality assessment metric. The experimental results indicate that our no-reference quality assessment metric for synthesized images has competitive prediction performance compared with some classic full-reference image quality assessment metrics.
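The SVC idea, comparing two DIBR-warped virtual middle views with SSIM, can be illustrated with a simplified single-window SSIM. The authors use the standard locally windowed SSIM; this global variant and the function names are assumptions for illustration:

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM computed from global image statistics,
    a simplification of the usual locally windowed SSIM."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def svc_score(view_from_left, view_from_right):
    """No-reference score: compare the two virtual middle views warped
    from the left and right cameras; high agreement suggests that the
    synthesis (depth maps plus DIBR) is of good quality."""
    return global_ssim(view_from_left, view_from_right)
```

The key design point is that no pristine middle view is required: the two independently warped views act as references for each other.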

  8. Honorary authorship epidemic in scholarly publications? How the current use of citation-based evaluative metrics make (pseudo)honorary authors from honest contributors of every multi-author article.

    PubMed

    Kovacs, Jozsef

    2013-08-01

    The current use of citation-based metrics to evaluate the research output of individual researchers is highly discriminatory because they are uniformly applied to authors of single-author articles as well as contributors of multi-author papers. In the latter case, these quantitative measures are counted, as if each contributor were the single author of the full article. In this way, each and every contributor is assigned the full impact-factor score and all the citations that the article has received. This has a multiplication effect on each contributor's citation-based evaluative metrics of multi-author articles, because the more contributors an article has, the more undeserved credit is assigned to each of them. In this paper, I argue that this unfair system could be made fairer by requesting the contributors of multi-author articles to describe the nature of their contribution, and to assign a numerical value to their degree of relative contribution. In this way, we could create a contribution-specific index of each contributor for each citation metric. This would be a strong disincentive against honorary authorship and publication cartels, because it would transform the current win-win strategy of accepting honorary authors in the byline into a zero-sum game for each contributor.
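The contribution-specific index the paper argues for is easy to state concretely: each contributor's citation credit is scaled by a declared contribution fraction, making author credit zero-sum per article. A minimal sketch (the function name and defensive normalization are illustrative, not the author's):

```python
def contribution_index(citations, fractions):
    """Split a paper's citation count among contributors according to
    their self-declared contribution fractions, so that total credit per
    article is fixed (a zero-sum allocation). Fractions are normalized
    defensively in case they do not sum exactly to 1."""
    total = sum(fractions.values())
    return {name: citations * f / total for name, f in fractions.items()}
```

Under this scheme, adding an honorary author dilutes every honest contributor's share instead of granting free full credit.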

  9. Return on investment in healthcare leadership development programs.

    PubMed

    Jeyaraman, Maya M; Qadar, Sheikh Muhammad Zeeshan; Wierzbowski, Aleksandra; Farshidfar, Farnaz; Lys, Justin; Dickson, Graham; Grimes, Kelly; Phillips, Leah A; Mitchell, Jonathan I; Van Aerde, John; Johnson, Dave; Krupka, Frank; Zarychanski, Ryan; Abou-Setta, Ahmed M

    2018-02-05

    Purpose Strong leadership has been shown to foster change, including loyalty, improved performance and decreased error rates, but there is a dearth of evidence on the effectiveness of leadership development programs. To ensure a return on the huge investments made, evidence-based approaches are needed to assess the impact of leadership on health-care establishments. As part of a pan-Canadian initiative to design an effective evaluative instrument, the purpose of this paper was to identify and summarize evidence on health-care outcomes/return on investment (ROI) indicators and metrics associated with leadership quality, leadership development programs and existing evaluative instruments. Design/methodology/approach The authors performed a scoping review using the Arksey and O'Malley framework, searching eight databases from 2006 through June 2016. Findings Of 11,868 citations screened, the authors included 223 studies reporting on health-care outcomes/ROI indicators and metrics associated with leadership quality (73 studies), leadership development programs (138 studies) and existing evaluative instruments (12 studies). The extracted ROI indicators and metrics have been summarized in detail. Originality/value This review provides a snapshot in time of the current evidence on ROI indicators and metrics associated with leadership. The summarized ROI indicators and metrics can be used to design an effective evaluative instrument to assess the impact of leadership on health-care organizations.

  10. Developing Image Processing Meta-Algorithms with Data Mining of Multiple Metrics

    PubMed Central

    Cunha, Alexandre; Toga, A. W.; Parker, D. Stott

    2014-01-01

    People often use multiple metrics in image processing, but here we take a novel approach of mining the values of batteries of metrics on image processing results. We present a case for extending image processing methods to incorporate automated mining of multiple image metric values. Here by a metric we mean any image similarity or distance measure, and in this paper we consider intensity-based and statistical image measures and focus on registration as an image processing problem. We show how it is possible to develop meta-algorithms that evaluate different image processing results with a number of different metrics and mine the results in an automated fashion so as to select the best results. We show that the mining of multiple metrics offers a variety of potential benefits for many image processing problems, including improved robustness and validation. PMID:24653748
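The selection step of such a meta-algorithm, scoring each candidate image processing result with a battery of metrics and mining the values to pick the best, might look like this in outline. The averaging rule and names are assumptions; the paper mines richer statistics than a simple mean:

```python
import numpy as np

def rank_by_metric_battery(candidates, reference, metrics):
    """Score each candidate result against the reference with every
    metric in the battery (each metric returns higher-is-better),
    average the scores, and return candidates sorted best-first."""
    def score(candidate):
        return float(np.mean([m(candidate, reference) for m in metrics]))
    return sorted(candidates, key=score, reverse=True)
```

Because several metrics vote, a result that games one similarity measure but fails others is less likely to be selected, which is the robustness benefit the abstract describes.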

  11. Experimental evaluation of ontology-based HIV/AIDS frequently asked question retrieval system.

    PubMed

    Ayalew, Yirsaw; Moeng, Barbara; Mosweunyane, Gontlafetse

    2018-05-01

    This study presents the results of experimental evaluations of an ontology-based frequently asked question retrieval system in the domain of HIV and AIDS. The main purpose of the system is to provide answers to questions on HIV/AIDS using ontology. To evaluate the effectiveness of the frequently asked question retrieval system, we conducted two experiments. The first experiment focused on the evaluation of the quality of the ontology we developed using the OQuaRE evaluation framework which is based on software quality metrics and metrics designed for ontology quality evaluation. The second experiment focused on evaluating the effectiveness of the ontology in retrieving relevant answers. For this we used an open-source information retrieval platform, Terrier, with retrieval models BM25 and PL2. For the measurement of performance, we used the measures mean average precision, mean reciprocal rank, and precision at 5. The results suggest that frequently asked question retrieval with ontology is more effective than frequently asked question retrieval without ontology in the domain of HIV/AIDS.
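The reported effectiveness measures are standard and easy to reproduce. A small sketch of mean reciprocal rank and precision at 5 as conventionally defined (illustrative code, not code from the study):

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1/rank of the first relevant answer
    (0 for queries where no relevant answer is retrieved)."""
    reciprocal_ranks = []
    for qid, ranking in ranked_lists.items():
        rel = relevant[qid]
        rr = next((1.0 / (i + 1) for i, d in enumerate(ranking) if d in rel), 0.0)
        reciprocal_ranks.append(rr)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

def precision_at_5(ranking, relevant):
    """P@5: fraction of the top five retrieved answers that are relevant."""
    return sum(d in relevant for d in ranking[:5]) / 5.0
```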

  12. An Instrumented Glove to Assess Manual Dexterity in Simulation-Based Neurosurgical Education

    PubMed Central

    Lemos, Juan Diego; Hernandez, Alher Mauricio; Soto-Romero, Georges

    2017-01-01

    The traditional neurosurgical apprenticeship scheme includes the assessment of trainees' manual skills carried out by experienced surgeons. However, the introduction of surgical simulation technology presents a new paradigm where residents can refine surgical techniques on a simulator before putting them into practice in real patients. Unfortunately, in this new scheme, an experienced surgeon will not always be available to evaluate a trainee's performance. For this reason, it is necessary to develop automatic mechanisms to estimate metrics for assessing manual dexterity in a quantitative way. Authors have proposed some hardware-software approaches to evaluate manual dexterity on surgical simulators. This paper presents IGlove, a wearable device that uses inertial sensors embedded in an elastic glove to capture hand movements. Metrics to assess manual dexterity are estimated from the sensor signals using data processing and information analysis algorithms. It has been designed to be used with a neurosurgical simulator called Daubara NS Trainer, but can be easily adapted to other benchtop- and manikin-based medical simulators. The system was tested with a sample of 14 volunteers who performed a test that was designed to simultaneously evaluate their fine motor skills and the IGlove's functionalities. Metrics obtained by each of the participants are presented as results in this work; it is also shown how these metrics are used to automatically evaluate the level of manual dexterity of each volunteer. PMID:28468268

  13. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity.

    PubMed

    Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2015-09-01

    The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding on an appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. © 2015 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
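The single-edge-threshold clustering underlying these subnetworks can be sketched with a plain union-find over similarity edges. This illustrates the general technique only, not the BLAST/TM-Align/DASP pipelines that produce the edge weights:

```python
def cluster_at_threshold(edges, threshold):
    """Group proteins into connected components of the similarity
    network, keeping only edges whose similarity meets the threshold.
    `edges` is a list of (node_a, node_b, similarity) tuples; a pure
    Python union-find with path halving tracks the components."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, sim in edges:
        find(a), find(b)          # register both endpoints as nodes
        if sim >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=lambda s: (-len(s), sorted(s)))
```

Sweeping the threshold and comparing the resulting components against a curated functional hierarchy is the kind of analysis the abstract describes.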

  14. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity

    PubMed Central

    Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2015-01-01

    The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding on an appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648

  15. Quality Measures for Dialysis: Time for a Balanced Scorecard.

    PubMed

    Kliger, Alan S

    2016-02-05

    Recent federal legislation establishes a merit-based incentive payment system for physicians, with a scorecard for each professional. The Centers for Medicare and Medicaid Services evaluate quality of care with clinical performance measures and have used these metrics for public reporting and payment to dialysis facilities. Similar metrics may be used for the future merit-based incentive payment system. In nephrology, most clinical performance measures measure processes and intermediate outcomes of care. These metrics were developed from population studies of best practice and do not identify opportunities for individualizing care on the basis of patient characteristics and individual goals of treatment. The In-Center Hemodialysis (ICH) Consumer Assessment of Healthcare Providers and Systems (CAHPS) survey examines patients' perception of care and has entered the arena to evaluate quality of care. A balanced scorecard of quality performance should include three elements: population-based best clinical practice, patient perceptions, and individually crafted patient goals of care. Copyright © 2016 by the American Society of Nephrology.

  16. First International Diagnosis Competition - DXC'09

    NASA Technical Reports Server (NTRS)

    Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Kuhn, Lukas; de Kleer, Johan; van Gemund, Arjan; Feldman, Alexander

    2009-01-01

    A framework to compare and evaluate diagnosis algorithms (DAs) has been created jointly by NASA Ames Research Center and PARC. In this paper, we present the first concrete implementation of this framework as a competition called DXC'09. The goal of this competition was to evaluate and compare DAs on a common platform and to determine a winner based on diagnosis results. Twelve DAs (model-based and otherwise) competed in this first year of the competition in 3 tracks that included industrial and synthetic systems. Specifically, the participants provided algorithms that communicated with the run-time architecture to receive scenario data and return diagnostic results. These algorithms were run on extended scenario data sets (different from the sample set) to compute a set of pre-defined metrics. A ranking scheme based on weighted metrics was used to declare winners. This paper presents the systems used in DXC'09, a description of the faults and data sets, a listing of the participating DAs, the metrics and results computed from running the DAs, and a preliminary analysis of the results.
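The winner-determination scheme described, a ranking over weighted metric scores, reduces to a few lines. The metric names and weights below are illustrative, not the competition's actual ones:

```python
def rank_diagnosis_algorithms(metric_scores, weights):
    """Combine per-metric scores into one weighted score per diagnosis
    algorithm (DA) and return the DA names ranked best-first.
    `metric_scores` maps DA name -> {metric name -> score}; `weights`
    maps metric name -> weight. All scores are higher-is-better here."""
    def weighted(da):
        return sum(weights[m] * s for m, s in metric_scores[da].items())
    return sorted(metric_scores, key=weighted, reverse=True)
```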

  17. From the eyes and the heart: a novel eye-gaze metric that predicts video preferences of a large audience

    PubMed Central

    Christoforou, Christoforos; Christou-Champi, Spyros; Constantinidou, Fofi; Theodorou, Maria

    2015-01-01

    Eye-tracking has been extensively used to quantify audience preferences in the context of marketing and advertising research, primarily in methodologies involving static stimuli (e.g., advertising, shelf testing, and website usability). However, these methodologies do not generalize to narrative-based video stimuli, where a specific storyline is meant to be communicated to the audience. In this paper, a novel metric based on eye-gaze dispersion (both within and across viewings) that quantifies the impact of narrative-based video stimuli on the preferences of large audiences is presented. The metric is validated in predicting the performance of video advertisements aired during the 2014 Super Bowl final. In particular, the metric is shown to explain 70% of the variance in likeability scores of the 2014 Super Bowl ads as measured by the USA TODAY Ad-Meter. In addition, by comparing the proposed metric with Heart Rate Variability (HRV) indices, we have associated the metric with biological processes relating to attention allocation. The underlying idea behind the proposed metric suggests a shift in perspective when it comes to evaluating narrative-based video stimuli. In particular, it suggests that audience preferences for video are modulated by lapses in the viewers' attention allocation. The proposed metric can be calculated on any narrative-based video stimulus (e.g., movies, narrative content, emotional content), and thus has the potential to facilitate the use of such stimuli in several contexts: prediction of audience preferences of movies, quantitative assessment of entertainment pieces, prediction of the impact of movie trailers, identification of group and individual differences in the study of attention-deficit disorders, and the study of desensitization to media violence. PMID:26029135
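As an illustration of the kind of across-viewer dispersion such a metric builds on, here is a minimal centroid-distance sketch. The authors' exact dispersion formula (computed both within and across viewings) differs, so treat this as an assumption-laden outline:

```python
import numpy as np

def gaze_dispersion(gaze):
    """Across-viewer gaze dispersion: for each frame, the mean distance
    of each viewer's gaze point from that frame's centroid, averaged
    over frames. `gaze` has shape (viewers, frames, 2) in screen
    coordinates. Low dispersion suggests synchronized attention."""
    centroid = gaze.mean(axis=0, keepdims=True)       # (1, frames, 2)
    dists = np.linalg.norm(gaze - centroid, axis=-1)  # (viewers, frames)
    return float(dists.mean())
```

The hypothesis in the abstract is that ads holding attention keep this dispersion low, and that low dispersion predicts higher likeability.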

  18. From the eyes and the heart: a novel eye-gaze metric that predicts video preferences of a large audience.

    PubMed

    Christoforou, Christoforos; Christou-Champi, Spyros; Constantinidou, Fofi; Theodorou, Maria

    2015-01-01

    Eye-tracking has been extensively used to quantify audience preferences in the context of marketing and advertising research, primarily in methodologies involving static stimuli (e.g., advertising, shelf testing, and website usability). However, these methodologies do not generalize to narrative-based video stimuli, where a specific storyline is meant to be communicated to the audience. In this paper, a novel metric based on eye-gaze dispersion (both within and across viewings) that quantifies the impact of narrative-based video stimuli on the preferences of large audiences is presented. The metric is validated in predicting the performance of video advertisements aired during the 2014 Super Bowl final. In particular, the metric is shown to explain 70% of the variance in likeability scores of the 2014 Super Bowl ads as measured by the USA TODAY Ad-Meter. In addition, by comparing the proposed metric with Heart Rate Variability (HRV) indices, we have associated the metric with biological processes relating to attention allocation. The underlying idea behind the proposed metric suggests a shift in perspective when it comes to evaluating narrative-based video stimuli. In particular, it suggests that audience preferences for video are modulated by lapses in the viewers' attention allocation. The proposed metric can be calculated on any narrative-based video stimulus (e.g., movies, narrative content, emotional content), and thus has the potential to facilitate the use of such stimuli in several contexts: prediction of audience preferences of movies, quantitative assessment of entertainment pieces, prediction of the impact of movie trailers, identification of group and individual differences in the study of attention-deficit disorders, and the study of desensitization to media violence.

  19. Automated map sharpening by maximization of detail and connectivity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Terwilliger, Thomas C.; Sobolev, Oleg V.; Afonine, Pavel V.

    An algorithm for automatic map sharpening is presented that is based on optimization of the detail and connectivity of the sharpened map. The detail in the map is reflected in the surface area of an iso-contour surface that contains a fixed fraction of the volume of the map, where a map with a high level of detail has a high surface area. The connectivity of the sharpened map is reflected in the number of connected regions defined by the same iso-contour surfaces, where a map with high connectivity has a small number of connected regions. By combining these two measures in a metric termed the 'adjusted surface area', map quality can be evaluated in an automated fashion. This metric was used to choose optimal map-sharpening parameters without reference to a model or other interpretations of the map. Map sharpening by optimization of the adjusted surface area can be carried out for a map as a whole or it can be carried out locally, yielding a locally sharpened map. To evaluate the performance of various approaches, a simple metric based on map–model correlation that can reproduce visual choices of optimally sharpened maps was used. The map–model correlation is calculated using a model with B factors (atomic displacement parameters; ADPs) set to zero. Finally, this model-based metric was used to evaluate map sharpening and map-sharpening approaches, and it was found that optimization of the adjusted surface area can be an effective tool for map sharpening.
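The two ingredients of the adjusted surface area, the iso-contour surface area at a fixed volume fraction and the number of connected regions, can be prototyped directly on a voxel grid. The thresholding rule, the face-counting surface estimate, and the final area-over-regions combination below are assumptions for illustration, not the published formula:

```python
import numpy as np

def adjusted_surface_area(volume, volume_fraction=0.2):
    """Sketch of the 'adjusted surface area' idea on a 3D voxel map:
    binarize at the iso-contour enclosing the top `volume_fraction` of
    voxels, estimate surface area as the number of exposed voxel faces,
    count connected regions (6-connectivity), and reward high area with
    few regions by returning area / regions."""
    thresh = np.quantile(volume, 1.0 - volume_fraction)
    on = set(map(tuple, np.argwhere(volume >= thresh)))
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    # Surface area: faces of "on" voxels not shared with another "on" voxel.
    area = sum(1 for v in on for d in offsets
               if (v[0] + d[0], v[1] + d[1], v[2] + d[2]) not in on)
    # Connected regions: depth-first search over the 6-neighbourhood.
    seen, regions = set(), 0
    for v in on:
        if v in seen:
            continue
        regions += 1
        stack, _ = [v], seen.add(v)
        while stack:
            c = stack.pop()
            for d in offsets:
                n = (c[0] + d[0], c[1] + d[1], c[2] + d[2])
                if n in on and n not in seen:
                    seen.add(n)
                    stack.append(n)
    return area / max(regions, 1)
```

Sweeping a sharpening parameter and keeping the value that maximizes this score is the model-free optimization loop the abstract describes.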

  20. Automated map sharpening by maximization of detail and connectivity

    DOE PAGES

    Terwilliger, Thomas C.; Sobolev, Oleg V.; Afonine, Pavel V.; ...

    2018-05-18

    An algorithm for automatic map sharpening is presented that is based on optimization of the detail and connectivity of the sharpened map. The detail in the map is reflected in the surface area of an iso-contour surface that contains a fixed fraction of the volume of the map, where a map with a high level of detail has a high surface area. The connectivity of the sharpened map is reflected in the number of connected regions defined by the same iso-contour surfaces, where a map with high connectivity has a small number of connected regions. By combining these two measures in a metric termed the 'adjusted surface area', map quality can be evaluated in an automated fashion. This metric was used to choose optimal map-sharpening parameters without reference to a model or other interpretations of the map. Map sharpening by optimization of the adjusted surface area can be carried out for a map as a whole or it can be carried out locally, yielding a locally sharpened map. To evaluate the performance of various approaches, a simple metric based on map–model correlation that can reproduce visual choices of optimally sharpened maps was used. The map–model correlation is calculated using a model with B factors (atomic displacement parameters; ADPs) set to zero. Finally, this model-based metric was used to evaluate map sharpening and map-sharpening approaches, and it was found that optimization of the adjusted surface area can be an effective tool for map sharpening.

  1. Human Performance Optimization Metrics: Consensus Findings, Gaps, and Recommendations for Future Research.

    PubMed

    Nindl, Bradley C; Jaffin, Dianna P; Dretsch, Michael N; Cheuvront, Samuel N; Wesensten, Nancy J; Kent, Michael L; Grunberg, Neil E; Pierce, Joseph R; Barry, Erin S; Scott, Jonathan M; Young, Andrew J; OʼConnor, Francis G; Deuster, Patricia A

    2015-11-01

    Human performance optimization (HPO) is defined as "the process of applying knowledge, skills and emerging technologies to improve and preserve the capabilities of military members, and organizations to execute essential tasks." The lack of consensus for operationally relevant and standardized metrics that meet joint military requirements has been identified as the single most important gap for research and application of HPO. In 2013, the Consortium for Health and Military Performance hosted a meeting to develop a toolkit of standardized HPO metrics for use in military and civilian research, and potentially for field applications by commanders, units, and organizations. Performance was considered from a holistic perspective as being influenced by various behaviors and barriers. To accomplish the goal of developing a standardized toolkit, key metrics were identified and evaluated across a spectrum of domains that contribute to HPO: physical performance, nutritional status, psychological status, cognitive performance, environmental challenges, sleep, and pain. These domains were chosen based on relevant data with regard to performance enhancers and degraders. The specific objectives at this meeting were to (a) identify and evaluate current metrics for assessing human performance within selected domains; (b) prioritize metrics within each domain to establish a human performance assessment toolkit; and (c) identify scientific gaps and the needed research to more effectively assess human performance across domains. This article provides a summary of 150 total HPO metrics across multiple domains that can be used as a starting point: the beginning of an HPO toolkit: physical fitness (29 metrics), nutrition (24 metrics), psychological status (36 metrics), cognitive performance (35 metrics), environment (12 metrics), sleep (9 metrics), and pain (5 metrics). These metrics can be particularly valuable as the military emphasizes a renewed interest in Human Dimension efforts, and leverages science, resources, programs, and policies to optimize the performance capacities of all Service members.

  2. Evaluation metrics for biostatistical and epidemiological collaborations.

    PubMed

    Rubio, Doris McGartland; Del Junco, Deborah J; Bhore, Rafia; Lindsell, Christopher J; Oster, Robert A; Wittkowski, Knut M; Welty, Leah J; Li, Yi-Ju; Demets, Dave

    2011-10-15

    Increasing demands for evidence-based medicine and for the translation of biomedical research into individual and public health benefit have been accompanied by the proliferation of special units that offer expertise in biostatistics, epidemiology, and research design (BERD) within academic health centers. Objective metrics that can be used to evaluate, track, and improve the performance of these BERD units are critical to their successful establishment and sustainable future. To develop a set of reliable but versatile metrics that can be adapted easily to different environments and evolving needs, we consulted with members of BERD units from the consortium of academic health centers funded by the Clinical and Translational Science Award Program of the National Institutes of Health. Through a systematic process of consensus building and document drafting, we formulated metrics that covered the three identified domains of BERD practices: the development and maintenance of collaborations with clinical and translational science investigators, the application of BERD-related methods to clinical and translational research, and the discovery of novel BERD-related methodologies. In this article, we describe the set of metrics and advocate their use for evaluating BERD practices. The routine application, comparison of findings across diverse BERD units, and ongoing refinement of the metrics will identify trends, facilitate meaningful changes, and ultimately enhance the contribution of BERD activities to biomedical research. Copyright © 2011 John Wiley & Sons, Ltd.

  3. Modelling the B2C Marketplace: Evaluation of a Reputation Metric for e-Commerce

    NASA Astrophysics Data System (ADS)

    Gutowska, Anna; Sloane, Andrew

    This paper evaluates a recently developed, comprehensive reputation metric designed for a distributed multi-agent reputation system for Business-to-Consumer (B2C) e-commerce applications. To do so, an agent-based simulation framework was implemented that models different types of behaviour in the marketplace. The trustworthiness of different types of providers is investigated to establish whether the simulation models the behaviour of B2C e-commerce systems as they are expected to behave in real life.
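    The abstract does not give the metric's formula; as a generic sketch of how an agent-based marketplace simulation might score provider trustworthiness (the names, parameters, and the recency-weighted scheme below are illustrative assumptions, not the paper's metric):

```python
import random

random.seed(42)

def simulate_provider(honesty, n_transactions):
    """Return a list of binary outcomes (1 = satisfactory) for one provider."""
    return [1 if random.random() < honesty else 0 for _ in range(n_transactions)]

def reputation(outcomes, decay=0.9):
    """Recency-weighted average of transaction outcomes (newest weighted most)."""
    weights = [decay ** age for age in range(len(outcomes) - 1, -1, -1)]
    return sum(w * o for w, o in zip(weights, outcomes)) / sum(weights)

honest = simulate_provider(honesty=0.9, n_transactions=200)
dishonest = simulate_provider(honesty=0.3, n_transactions=200)
print(round(reputation(honest), 2), round(reputation(dishonest), 2))
```

The decay parameter controls how quickly old behaviour is forgotten, which is what lets such a simulation distinguish consistently honest providers from ones that change strategy.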

  4. ARM Data-Oriented Metrics and Diagnostics Package for Climate Model Evaluation Value-Added Product

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Chengzhu; Xie, Shaocheng

    A Python-based metrics and diagnostics package is currently being developed by the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) Infrastructure Team at Lawrence Livermore National Laboratory (LLNL) to facilitate the use of long-term, high-frequency measurements from the ARM Facility in evaluating the regional climate simulation of clouds, radiation, and precipitation. This metrics and diagnostics package computes climatological means of targeted climate model simulations and generates tables and plots for comparing the model simulations with ARM observational data. The Coupled Model Intercomparison Project (CMIP) model data sets are also included in the package to enable model intercomparison, as demonstrated in Zhang et al. (2017). The mean of the CMIP models can serve as a reference for individual models. Basic performance metrics are computed to measure the accuracy of the mean state and variability of climate models. The evaluated physical quantities include cloud fraction, temperature, relative humidity, cloud liquid water path, total column water vapor, precipitation, sensible and latent heat fluxes, and radiative fluxes, with plans to extend to more fields, such as aerosol and microphysics properties. Process-oriented diagnostics focusing on individual cloud- and precipitation-related phenomena are also being developed for the evaluation and development of specific model physical parameterizations. The version 1.0 package is designed based on data collected at ARM's Southern Great Plains (SGP) Research Facility, with the plan to extend to other ARM sites. The metrics and diagnostics package is currently built upon standard Python libraries and additional Python packages developed by DOE (such as CDMS and CDAT). The ARM metrics and diagnostics package is available publicly with the hope that it can serve as an easy entry point for climate modelers to compare their models with ARM data.
    In this report, we first present the input data, which constitutes the core content of the metrics and diagnostics package (section 2), followed by a user's guide documenting the workflow and structure of the version 1.0 code, including step-by-step instructions for running the package (section 3).
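    The basic mean-state performance metrics such a package computes can be sketched in plain Python. The function name and the sample monthly climatology below are illustrative assumptions, not the ARM package's actual API (which builds on CDAT/CDMS):

```python
import math

def mean_state_metrics(model, obs):
    """Compare a model climatology with observations using three basic
    performance metrics: mean bias, RMSE, and Pearson correlation."""
    n = len(model)
    bias = sum(m - o for m, o in zip(model, obs)) / n
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    mm, mo = sum(model) / n, sum(obs) / n
    cov = sum((m - mm) * (o - mo) for m, o in zip(model, obs))
    sm = math.sqrt(sum((m - mm) ** 2 for m in model))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    return {"bias": bias, "rmse": rmse, "corr": cov / (sm * so)}

# Hypothetical monthly-mean precipitation climatology (mm/day) at one site:
obs   = [1.0, 1.2, 2.0, 2.8, 3.5, 4.0, 4.2, 3.9, 3.0, 2.1, 1.4, 1.1]
model = [1.3, 1.5, 2.2, 3.1, 3.9, 4.6, 4.8, 4.3, 3.3, 2.4, 1.6, 1.2]
print(mean_state_metrics(model, obs))
```

Here the model tracks the observed seasonal cycle closely (high correlation) but is wet-biased in every month, which is exactly the kind of distinction a mean-state table in such a package is meant to surface.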

  5. Performance metrics for the assessment of satellite data products: an ocean color case study

    PubMed Central

    Seegers, Bridget N.; Stumpf, Richard P.; Schaeffer, Blake A.; Loftin, Keith A.; Werdell, P. Jeremy

    2018-01-01

    Performance assessment of ocean color satellite data has generally relied on statistical metrics chosen for their common usage and the rationale for selecting certain metrics is infrequently explained. Commonly reported statistics based on mean squared errors, such as the coefficient of determination (r2), root mean square error, and regression slopes, are most appropriate for Gaussian distributions without outliers and, therefore, are often not ideal for ocean color algorithm performance assessment, which is often limited by sample availability. In contrast, metrics based on simple deviations, such as bias and mean absolute error, as well as pair-wise comparisons, often provide more robust and straightforward quantities for evaluating ocean color algorithms with non-Gaussian distributions and outliers. This study uses a SeaWiFS chlorophyll-a validation data set to demonstrate a framework for satellite data product assessment and recommends a multi-metric and user-dependent approach that can be applied within science, modeling, and resource management communities. PMID:29609296
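    For chlorophyll-a, the bias and mean absolute error the study recommends are usually computed multiplicatively in log10 space, since the data span orders of magnitude. A minimal sketch with hypothetical matchup values (not the SeaWiFS validation set):

```python
import math

def log_bias(model, obs):
    """Multiplicative bias: 10**(mean of log10 model/obs ratios).
    A value of 1.2 means the algorithm reads high by about 20% on average."""
    n = len(model)
    return 10 ** (sum(math.log10(m / o) for m, o in zip(model, obs)) / n)

def log_mae(model, obs):
    """Multiplicative mean absolute error in log10 space (always >= 1)."""
    n = len(model)
    return 10 ** (sum(abs(math.log10(m / o)) for m, o in zip(model, obs)) / n)

# Hypothetical satellite vs. in situ chlorophyll-a matchups (mg m^-3):
satellite = [0.12, 0.45, 1.8, 6.2, 0.9]
in_situ   = [0.10, 0.50, 2.0, 5.0, 1.0]
print(round(log_bias(satellite, in_situ), 3), round(log_mae(satellite, in_situ), 3))
```

Because both quantities are simple deviations of log-transformed pairs, a few outliers do not dominate them the way they dominate r2 or RMSE, which is the paper's central point.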

  6. Evaluating structural pattern recognition for handwritten math via primitive label graphs

    NASA Astrophysics Data System (ADS)

    Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

    2013-01-01

    Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall, and precision for object segmentation, classification, and relationships. In document recognition, these target objects (e.g. symbols) frequently comprise multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation, and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
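    The core idea can be sketched with a hypothetical dict-based encoding (not the published utilities' file format): a label graph maps each primitive and each primitive pair to a label, and the Hamming distance counts label disagreements, capturing classification and relationship errors uniformly:

```python
def hamming_distance(lg_a, lg_b):
    """Hamming distance between two label graphs over the same primitives:
    the number of primitives and primitive pairs whose labels disagree."""
    keys = set(lg_a) | set(lg_b)
    return sum(1 for k in keys if lg_a.get(k) != lg_b.get(k))

# Hypothetical example: three strokes s1..s3 of a handwritten "2 + 2".
truth = {"s1": "2", "s2": "+", "s3": "2",
         ("s1", "s2"): "right-of", ("s2", "s3"): "right-of"}
recog = {"s1": "2", "s2": "+", "s3": "7",               # s3 misclassified
         ("s1", "s2"): "right-of", ("s2", "s3"): None}  # relation missed
print(hamming_distance(recog, truth))  # → 2
```

One misclassified stroke and one missed spatial relation each contribute a single disagreement, so the measure degrades gracefully instead of scoring the whole expression as wrong.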

  7. Validation of the updated ArthroS simulator: face and construct validity of a passive haptic virtual reality simulator with novel performance metrics.

    PubMed

    Garfjeld Roberts, Patrick; Guyver, Paul; Baldwin, Mathew; Akhtar, Kash; Alvand, Abtin; Price, Andrew J; Rees, Jonathan L

    2017-02-01

    To assess the construct and face validity of ArthroS, a passive haptic VR simulator. A secondary aim was to evaluate the novel performance metrics produced by this simulator. Two groups of 30 participants, each divided into novice, intermediate or expert based on arthroscopic experience, completed three separate tasks on either the knee or shoulder module of the simulator. Performance was recorded using 12 automatically generated performance metrics and video footage of the arthroscopic procedures. The videos were blindly assessed using a validated global rating scale (GRS). Participants completed a survey about the simulator's realism and training utility. This new simulator demonstrated construct validity of its tasks when evaluated against a GRS (p ≤ 0.003 in all cases). Regarding its automatically generated performance metrics, established outputs such as time taken (p ≤ 0.001) and instrument path length (p ≤ 0.007) also demonstrated good construct validity. However, two-thirds of the proposed 'novel metrics' the simulator reports could not distinguish participants based on arthroscopic experience. Face validity assessment rated the simulator as a realistic and useful tool for trainees, but the passive haptic feedback (a key feature of this simulator) was rated as less realistic. The ArthroS simulator has good task construct validity based on established objective outputs, but some of the novel performance metrics could not distinguish between levels of surgical experience. The passive haptic feedback of the simulator also needs improvement. If simulators could offer automated and validated performance feedback, this would facilitate improvements in the delivery of training by allowing trainees to practise and self-assess.

  8. Metrics for Evaluation of Student Models

    ERIC Educational Resources Information Center

    Pelanek, Radek

    2015-01-01

    Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…

  9. A novel spatial performance metric for robust pattern optimization of distributed hydrological models

    NASA Astrophysics Data System (ADS)

    Stisen, S.; Demirel, C.; Koch, J.

    2017-12-01

    Evaluation of performance is an integral part of model development and calibration, and it is of paramount importance when communicating modelling results to stakeholders and the scientific community. The hydrological modelling community has a comprehensive, well-tested toolbox of metrics to assess temporal model performance. By contrast, experience in evaluating spatial performance has not kept pace with the wide availability of spatial observations or with the sophisticated model codes that simulate the spatial variability of complex hydrological processes. This study aims to make a contribution towards advancing spatial-pattern-oriented model evaluation for distributed hydrological models. This is achieved by introducing a novel spatial performance metric that provides robust pattern performance during model calibration. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation, and histogram overlap. This multi-component approach is necessary in order to adequately compare spatial patterns. SPAEF, its three components individually, and two alternative spatial performance metrics, i.e. connectivity analysis and the fractions skill score, are tested in a spatial-pattern-oriented calibration of a catchment model in Denmark. The calibration is constrained by a remote-sensing-based spatial pattern of evapotranspiration and discharge time series at two stations. Our results stress that stand-alone metrics tend to fail to provide holistic pattern information to the optimizer, which underlines the importance of multi-component metrics. The three SPAEF components are independent, which allows them to complement each other in a meaningful way.
    This study promotes the use of bias-insensitive metrics, which allow the comparison of variables that are related but may differ in unit, in order to optimally exploit the spatial observations made available by remote sensing platforms. We see great potential for SPAEF across environmental disciplines dealing with spatially distributed modelling.
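    A self-contained sketch of the SPAEF formulation (alpha = Pearson correlation, beta = ratio of coefficients of variation, gamma = histogram intersection of the z-scored fields, combined Euclidean-style as in the published metric); the binning choice and the sample evapotranspiration fields are illustrative assumptions:

```python
import math
from collections import Counter

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coeff_var(v):
    n = len(v)
    m = sum(v) / n
    return math.sqrt(sum((a - m) ** 2 for a in v) / n) / m

def hist_overlap(x, y, bins=10):
    """Histogram intersection of z-scored fields (fraction of shared counts)."""
    def zscore(v):
        n = len(v)
        m = sum(v) / n
        s = math.sqrt(sum((a - m) ** 2 for a in v) / n)
        return [(a - m) / s for a in v]
    def hist(v):  # assumed binning: `bins` equal-width bins over z in [-4, 4]
        return Counter(min(int((a + 4) / 8 * bins), bins - 1) for a in v)
    ha, hb = hist(zscore(x)), hist(zscore(y))
    return sum(min(ha[k], hb[k]) for k in set(ha) | set(hb)) / len(x)

def spaef(sim, obs):
    """SPAtial EFficiency: 1 indicates a perfect pattern match."""
    alpha = pearson(sim, obs)               # pattern correlation
    beta = coeff_var(sim) / coeff_var(obs)  # variability ratio
    gamma = hist_overlap(sim, obs)          # distribution similarity
    return 1 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

obs = [2.1, 2.5, 3.0, 3.6, 4.2, 4.9, 5.3, 5.8]  # hypothetical ET pattern (mm/day)
sim = [2.0, 2.6, 3.1, 3.5, 4.4, 4.8, 5.5, 5.7]
print(round(spaef(sim, obs), 3))
```

Because beta compares coefficients of variation rather than means, a uniform multiplicative bias in the simulated field leaves the score unchanged, which is the bias-insensitivity the abstract promotes.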

  10. Performance Metrics for Liquid Chromatography-Tandem Mass Spectrometry Systems in Proteomics Analyses*

    PubMed Central

    Rudnick, Paul A.; Clauser, Karl R.; Kilpatrick, Lisa E.; Tchekhovskoi, Dmitrii V.; Neta, Pedatsur; Blonder, Nikša; Billheimer, Dean D.; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Ham, Amy-Joan L.; Jaffe, Jacob D.; Kinsinger, Christopher R.; Mesri, Mehdi; Neubert, Thomas A.; Schilling, Birgit; Tabb, David L.; Tegeler, Tony J.; Vega-Montoto, Lorenzo; Variyath, Asokan Mulayath; Wang, Mu; Wang, Pei; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Paulovich, Amanda G.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Tempst, Paul; Liebler, Daniel C.; Stein, Stephen E.

    2010-01-01

    A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlaboratory studies conducted under a common standard operating procedure identified outlier data and provided clues to specific causes. Moreover, interlaboratory variation reflected by the metrics indicates which system components vary the most between laboratories. Application of these metrics enables rational, quantitative quality assessment for proteomics and other LC-MS/MS analytical applications. PMID:19837981

  11. Memory colours and colour quality evaluation of conventional and solid-state lamps.

    PubMed

    Smet, Kevin A G; Ryckaert, Wouter R; Pointer, Michael R; Deconinck, Geert; Hanselaer, Peter

    2010-12-06

    A colour quality metric based on memory colours is presented. The basic idea is simple. The colour quality of a test source is evaluated as the degree of similarity between the colour appearance of a set of familiar objects and their memory colours. The closer the match, the better the colour quality. This similarity was quantified using a set of similarity distributions obtained by Smet et al. in a previous study. The metric was validated by calculating the Pearson and Spearman correlation coefficients between the metric predictions and the visual appreciation results obtained in a validation experiment conducted by the authors, as well as those obtained in two independent studies. The metric was found to correlate well with the visual appreciation of the lighting quality of the sources used in the three experiments. Its performance was also compared with that of the CIE colour rendering index and the NIST colour quality scale. For all three experiments, the metric was found to be significantly better at predicting the correct visual rank order of the light sources (p < 0.1).
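    Validation of this kind reduces to computing Pearson and Spearman coefficients between metric predictions and mean observer ratings; a self-contained sketch with hypothetical scores (Spearman is simply the Pearson correlation of the ranks):

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

def ranks(v):
    """Average ranks (1-based), with ties sharing the mean rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

metric_prediction = [0.81, 0.64, 0.92, 0.55, 0.73]  # hypothetical scores per lamp
visual_rating     = [6.5, 5.0, 7.1, 4.2, 5.8]       # hypothetical mean ratings
print(round(pearson(metric_prediction, visual_rating), 3),
      round(spearman(metric_prediction, visual_rating), 3))
```

Spearman only rewards getting the rank order of the light sources right, which is why it is the natural statistic for the "correct visual rank order" claim in the abstract.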

  12. Linking hydrodynamic complexity to delta smelt (Hypomesus transpacificus) distribution in the San Francisco Estuary, USA

    USGS Publications Warehouse

    Bever, Aaron J.; MacWilliams, Michael L.; Herbold, Bruce; Brown, Larry R.; Feyrer, Frederick V.

    2016-01-01

    Long-term fish sampling data from the San Francisco Estuary were combined with detailed three dimensional hydrodynamic modeling to investigate the relationship between historical fish catch and hydrodynamic complexity. Delta Smelt catch data at 45 stations from the Fall Midwater Trawl (FMWT) survey in the vicinity of Suisun Bay were used to develop a quantitative catch-based station index. This index was used to rank stations based on historical Delta Smelt catch. The correlations between historical Delta Smelt catch and 35 quantitative metrics of environmental complexity were evaluated at each station. Eight metrics of environmental conditions were derived from FMWT data and 27 metrics were derived from model predictions at each FMWT station. To relate the station index to conceptual models of Delta Smelt habitat, the metrics were used to predict the station ranking based on the quantified environmental conditions. Salinity, current speed, and turbidity metrics were used to predict the relative ranking of each station for Delta Smelt catch. Including a measure of the current speed at each station improved predictions of the historical ranking for Delta Smelt catch relative to similar predictions made using only salinity and turbidity. Current speed was also found to be a better predictor of historical Delta Smelt catch than water depth. The quantitative approach developed using the FMWT data was validated using the Delta Smelt catch data from the San Francisco Bay Study. Complexity metrics in Suisun Bay were evaluated during 2010 and 2011. This analysis indicated that a key to historical Delta Smelt catch is the overlap of low salinity, low maximum velocity, and low Secchi depth regions. This overlap occurred in Suisun Bay during 2011, and may have contributed to higher Delta Smelt abundance in 2011 than in 2010, when the favorable ranges of the metrics did not overlap in Suisun Bay.

  13. Voice based gender classification using machine learning

    NASA Astrophysics Data System (ADS)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification, tracing gender from acoustic data such as pitch, median, and frequency, is one of the major problems in speech analysis today. Machine learning gives promising results for classification problems across research domains, and several performance metrics exist for evaluating algorithms in a given area. We present a comparative model for evaluating five different machine learning algorithms on gender classification from acoustic data, on the basis of eight different metrics. The five algorithms are Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM). The main criterion in evaluating any algorithm is its performance: in classification problems the misclassification rate must be low, which is to say the accuracy rate must be high. The location and gender of a person have also become very important in economic markets, for example in targeted advertising such as AdSense. With this comparative model we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
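    The abstract does not enumerate the eight metrics; as an illustration, a typical set of confusion-matrix-derived classification metrics for binary gender labels can be computed as follows (the data are hypothetical):

```python
import math

def binary_metrics(y_true, y_pred, positive="female"):
    """Common classification metrics derived from a 2x2 confusion matrix."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # a.k.a. sensitivity
    specificity = tn / (tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
        "false_positive_rate": fp / (fp + tn),
        "mcc": (tp * tn - fp * fn) / mcc_den,
    }

truth = ["female", "female", "male", "male", "female", "male", "male", "female"]
pred  = ["female", "male",   "male", "male", "female", "female", "male", "female"]
print(binary_metrics(truth, pred))
```

Comparing classifiers on several such metrics at once, rather than accuracy alone, is what guards against a model that looks good only because the classes are imbalanced.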

  14. Performance evaluation of no-reference image quality metrics for face biometric images

    NASA Astrophysics Data System (ADS)

    Liu, Xinwei; Pedersen, Marius; Charrier, Christophe; Bours, Patrick

    2018-03-01

    The accuracy of face recognition systems is significantly affected by the quality of face sample images. Recently established standardization efforts proposed several important aspects for the assessment of face sample quality. There are many existing no-reference image quality metrics (IQMs) that are able to assess natural image quality by taking into account similar image-based quality attributes as introduced in the standardization. However, whether such metrics can assess face sample quality is rarely considered. We evaluate the performance of 13 selected no-reference IQMs on face biometrics. The experimental results show that several of them can assess face sample quality according to the system performance. We also analyze the strengths and weaknesses of different IQMs, as well as why some of them failed to assess face sample quality. Retraining an original IQM on a face database can improve the performance of such a metric. In addition, the contribution of this paper can be used for the evaluation of IQMs on other biometric modalities; furthermore, it can be used for the development of multimodality biometric IQMs.

  15. SU-C-BRB-05: Determining the Adequacy of Auto-Contouring Via Probabilistic Assessment of Ensuing Treatment Plan Metrics in Comparison with Manual Contours

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nourzadeh, H; Watkins, W; Siebers, J

    Purpose: To determine if auto-contour-based and manual-contour-based plans differ when evaluated with respect to probabilistic coverage metrics and biological model endpoints for prostate IMRT. Methods: Manual and auto-contours were created for 149 CT image sets acquired from 16 unique prostate patients. A single physician manually contoured all images. Auto-contouring was completed utilizing Pinnacle's Smart Probabilistic Image Contouring Engine (SPICE). For each CT, three different 78 Gy/39 fraction 7-beam IMRT plans were created: PD with drawn ROIs, PAS with auto-contoured ROIs, and PM with auto-contoured OARs and the manually drawn target. For each plan, 1000 virtual treatment simulations, with a different sampled systematic error for each simulation and a different sampled random error for each fraction, were performed using our in-house GPU-accelerated robustness analyzer tool, which reports the statistical probability of achieving dose-volume metrics, NTCP, TCP, and the probability of achieving the optimization criteria for both auto-contoured (AS) and manually drawn (D) ROIs. Metrics are reported for all possible cross-evaluation pairs of ROI types (AS, D) and planning scenarios (PD, PAS, PM). The Bhattacharyya coefficient (BC) is calculated to measure the PDF similarities for the dose-volume metrics, NTCP, TCP, and objectives with respect to the manually drawn contour evaluated on the base plan (D-PD). Results: We observe high BC values (BC ≥ 0.94) for all OAR objectives. BC values of the max dose objective on the CTV also signify high resemblance (BC ≥ 0.93) between the distributions. On the other hand, BC values for the CTV's D95 and Dmin objectives are small for AS-PM and AS-PD. NTCP distributions are similar across all evaluation pairs, while TCP distributions of AS-PM and AS-PD sustain variations up to 6% compared to other evaluated pairs. Conclusion: No significant probabilistic differences are observed in the metrics when auto-contoured OARs are used.
    The prostate auto-contour needs improvement to achieve clinically equivalent plans.
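    The Bhattacharyya coefficient used here measures the overlap of two probability distributions: for discrete histograms, BC is the sum over bins of sqrt(p_i * q_i), giving 1.0 for identical distributions and 0.0 for disjoint ones. A minimal sketch with hypothetical dose-metric histograms (not the study's data):

```python
import math

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two discrete distributions,
    given as (unnormalized) histogram counts over the same bins."""
    sp, sq = sum(p), sum(q)
    return sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))

# Hypothetical histograms (counts per dose bin) of a DVH metric from
# 1000 simulated treatment courses under two contouring scenarios:
hist_manual = [10, 120, 430, 360, 80]
hist_auto   = [15, 140, 410, 350, 85]
print(round(bhattacharyya(hist_manual, hist_auto), 4))
```

A BC near 1, as reported for the OAR objectives above, means the two planning scenarios produce nearly interchangeable outcome distributions, not merely similar means.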

  16. Critical insights for a sustainability framework to address integrated community water services: Technical metrics and approaches.

    PubMed

    Xue, Xiaobo; Schoen, Mary E; Ma, Xin Cissy; Hawkins, Troy R; Ashbolt, Nicholas J; Cashdollar, Jennifer; Garland, Jay

    2015-06-15

    Planning for sustainable community water systems requires a comprehensive understanding and assessment of the integrated source-drinking-wastewater systems over their life-cycles. Although traditional life cycle assessment and similar tools (e.g. footprints and emergy) have been applied to elements of these water services (i.e. water resources, drinking water, stormwater or wastewater treatment alone), we argue for the importance of developing and combining the system-based tools and metrics in order to holistically evaluate the complete water service system based on the concept of integrated resource management. We analyzed the strengths and weaknesses of key system-based tools and metrics, and discuss future directions to identify more sustainable municipal water services. Such efforts may include the need for novel metrics that address system adaptability to future changes and infrastructure robustness. Caution is also necessary when coupling fundamentally different tools, so as to avoid misunderstanding and, consequently, misleading decision-making. Published by Elsevier Ltd.

  17. About neighborhood counting measure metric and minimum risk metric.

    PubMed

    Argentini, Andrea; Blanzieri, Enrico

    2010-04-01

    In a 2006 TPAMI paper, Wang proposed the Neighborhood Counting Measure (NCM), a similarity measure for the k-NN algorithm. In his paper, Wang mentioned the Minimum Risk Metric (MRM), an early distance measure based on the minimization of the risk of misclassification. Wang did not compare NCM to MRM because of its allegedly excessive computational load. In this comment paper, we complete the comparison that was missing in Wang's paper and, from our empirical evaluation, we show that MRM outperforms NCM and that its running time is not prohibitive, as Wang suggested.

  18. Evaluating the Good Ontology Design Guideline (GoodOD) with the Ontology Quality Requirements and Evaluation Method and Metrics (OQuaRE)

    PubMed Central

    Duque-Ramos, Astrid; Boeker, Martin; Jansen, Ludger; Schulz, Stefan; Iniesta, Miguela; Fernández-Breis, Jesualdo Tomás

    2014-01-01

    Objective To (1) evaluate the GoodOD guideline for ontology development by applying the OQuaRE evaluation method and metrics to the ontology artefacts that were produced by students in a randomized controlled trial, and (2) informally compare the OQuaRE evaluation method with gold standard and competency questions based evaluation methods, respectively. Background In the last decades many methods for ontology construction and ontology evaluation have been proposed. However, none of them has become a standard and there is no empirical evidence of comparative evaluation of such methods. This paper brings together GoodOD and OQuaRE. GoodOD is a guideline for developing robust ontologies. It was previously evaluated in a randomized controlled trial employing metrics based on gold standard ontologies and competency questions as outcome parameters. OQuaRE is a method for ontology quality evaluation which adapts the SQuaRE standard for software product quality to ontologies and has been successfully used for evaluating the quality of ontologies. Methods In this paper, we evaluate the effect of training in ontology construction based on the GoodOD guideline within the OQuaRE quality evaluation framework and compare the results with those obtained for the previous studies based on the same data. Results Our results show a significant effect of the GoodOD training over developed ontologies by topics: (a) a highly significant effect was detected in three topics from the analysis of the ontologies of untrained and trained students; (b) both positive and negative training effects with respect to the gold standard were found for five topics. Conclusion The GoodOD guideline had a significant effect over the quality of the ontologies developed. Our results show that GoodOD ontologies can be effectively evaluated using OQuaRE and that OQuaRE is able to provide additional useful information about the quality of the GoodOD ontologies. PMID:25148262

  19. Evaluating the Good Ontology Design Guideline (GoodOD) with the ontology quality requirements and evaluation method and metrics (OQuaRE).

    PubMed

    Duque-Ramos, Astrid; Boeker, Martin; Jansen, Ludger; Schulz, Stefan; Iniesta, Miguela; Fernández-Breis, Jesualdo Tomás

    2014-01-01

    To (1) evaluate the GoodOD guideline for ontology development by applying the OQuaRE evaluation method and metrics to the ontology artefacts that were produced by students in a randomized controlled trial, and (2) informally compare the OQuaRE evaluation method with gold standard and competency questions based evaluation methods, respectively. In the last decades many methods for ontology construction and ontology evaluation have been proposed. However, none of them has become a standard and there is no empirical evidence of comparative evaluation of such methods. This paper brings together GoodOD and OQuaRE. GoodOD is a guideline for developing robust ontologies. It was previously evaluated in a randomized controlled trial employing metrics based on gold standard ontologies and competency questions as outcome parameters. OQuaRE is a method for ontology quality evaluation which adapts the SQuaRE standard for software product quality to ontologies and has been successfully used for evaluating the quality of ontologies. In this paper, we evaluate the effect of training in ontology construction based on the GoodOD guideline within the OQuaRE quality evaluation framework and compare the results with those obtained for the previous studies based on the same data. Our results show a significant effect of the GoodOD training over developed ontologies by topics: (a) a highly significant effect was detected in three topics from the analysis of the ontologies of untrained and trained students; (b) both positive and negative training effects with respect to the gold standard were found for five topics. The GoodOD guideline had a significant effect over the quality of the ontologies developed. Our results show that GoodOD ontologies can be effectively evaluated using OQuaRE and that OQuaRE is able to provide additional useful information about the quality of the GoodOD ontologies.

  20. Feasibility of and Rationale for the Collection of Orthopaedic Trauma Surgery Quality of Care Metrics.

    PubMed

    Miller, Anna N; Kozar, Rosemary; Wolinsky, Philip

    2017-06-01

    Reproducible metrics are needed to evaluate the delivery of orthopaedic trauma care, national care norms, and outliers. The American College of Surgeons (ACS) is uniquely positioned to collect and evaluate the data needed to assess orthopaedic trauma care via the Committee on Trauma and the Trauma Quality Improvement Project. We evaluated the first quality metrics the ACS has collected for orthopaedic trauma surgery to determine whether these metrics can be appropriately collected with accuracy and completeness. The metrics include the time to administration of the first dose of antibiotics for open fractures, the time to surgical irrigation and débridement of open tibial fractures, and the percentage of patients who undergo stabilization of femoral fractures at trauma centers nationwide. These metrics were analyzed to evaluate for variances in the delivery of orthopaedic care across the country. The data showed wide variances for all metrics, and many centers had incomplete ability to collect the orthopaedic trauma care metrics. There was large variability in the results of the metrics collected among different trauma center levels, as well as among centers of a particular level. The ACS has successfully begun tracking orthopaedic trauma care performance measures, which will help inform reevaluation of the goals and continued work on data collection and improvement of patient care. Future areas of research may link these performance measures with patient outcomes, such as long-term tracking, to assess nonunion and function. This information can provide insight into center performance and its effect on patient outcomes. The ACS was able to successfully collect and evaluate the data for three metrics used to assess the quality of orthopaedic trauma care. However, additional research is needed to determine whether these metrics are suitable for evaluating orthopaedic trauma care and to establish cutoff values for each metric.

  1. Holistic Metrics for Assessment of the Greenness of Chemical Reactions in the Context of Chemical Education

    ERIC Educational Resources Information Center

    Ribeiro, M. Gabriela T. C.; Machado, Adelio A. S. C.

    2013-01-01

    Two new semiquantitative green chemistry metrics, the green circle and the green matrix, have been developed for quick assessment of the greenness of a chemical reaction or process, even without performing the experiment from a protocol if enough detail is provided in it. The evaluation is based on the 12 principles of green chemistry. The…

  2. Hyperspectral face recognition using improved inter-channel alignment based on qualitative prediction models.

    PubMed

    Cho, Woon; Jang, Jinbeum; Koschan, Andreas; Abidi, Mongi A; Paik, Joonki

    2016-11-28

    A fundamental limitation of hyperspectral imaging is the inter-band misalignment correlated with subject motion during data acquisition. One way of resolving this problem is to assess the alignment quality of hyperspectral image cubes derived from the state-of-the-art alignment methods. In this paper, we present an automatic selection framework for the optimal alignment method to improve the performance of face recognition. Specifically, we develop two qualitative prediction models based on: 1) a principal curvature map for evaluating the similarity index between sequential target bands and a reference band in the hyperspectral image cube as a full-reference metric; and 2) the cumulative probability of target colors in the HSV color space for evaluating the alignment index of a single sRGB image rendered using all of the bands of the hyperspectral image cube as a no-reference metric. We verify the efficacy of the proposed metrics on a new large-scale database, demonstrating a higher prediction accuracy in determining improved alignment compared to two full-reference and five no-reference image quality metrics. We also validate the ability of the proposed framework to improve hyperspectral face recognition.

  3. Conceptual Soundness, Metric Development, Benchmarking, and Targeting for PATH Subprogram Evaluation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mosey, G.; Doris, E.; Coggeshall, C.

    The objective of this study is to evaluate the conceptual soundness of the U.S. Department of Housing and Urban Development (HUD) Partnership for Advancing Technology in Housing (PATH) program's revised goals and establish and apply a framework to identify and recommend metrics that are the most useful for measuring PATH's progress. This report provides an evaluative review of PATH's revised goals, outlines a structured method for identifying and selecting metrics, proposes metrics and benchmarks for a sampling of individual PATH programs, and discusses other metrics that potentially could be developed that may add value to the evaluation process. The framework and individual program metrics can be used for ongoing management improvement efforts and to inform broader program-level metrics for government reporting requirements.

  4. Tools for monitoring system suitability in LC MS/MS centric proteomic experiments.

    PubMed

    Bereman, Michael S

    2015-03-01

    With advances in liquid chromatography coupled to tandem mass spectrometry technologies combined with the continued goals of biomarker discovery, clinical applications of established biomarkers, and integrating large multiomic datasets (i.e. "big data"), there remains an urgent need for robust tools to assess instrument performance (i.e. system suitability) in proteomic workflows. To this end, several freely available tools have been introduced that monitor a number of peptide identification (ID) and/or peptide ID free metrics. Peptide ID metrics include numbers of proteins, peptides, or peptide spectral matches identified from a complex mixture. Peptide ID free metrics include retention time reproducibility, full width half maximum, ion injection times, and integrated peptide intensities. The main driving force in the development of these tools is to monitor both intra- and interexperiment performance variability and to identify sources of variation. The purpose of this review is to summarize and evaluate these tools based on versatility, automation, vendor neutrality, metrics monitored, and visualization capabilities. In addition, the implementation of a robust system suitability workflow is discussed in terms of metrics, type of standard, and frequency of evaluation along with the obstacles to overcome prior to incorporating a more proactive approach to overall quality control in liquid chromatography coupled to tandem mass spectrometry based proteomic workflows. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
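One of the peptide-ID-free metrics named above, retention time reproducibility, reduces to a coefficient-of-variation calculation once a standard peptide's retention time has been extracted per run. A minimal sketch; the function name and example values are illustrative, not taken from any of the reviewed tools:

```python
from statistics import mean, stdev

def rt_cv_percent(retention_times):
    """Coefficient of variation (%) of a standard peptide's retention time
    across runs; a rising CV flags chromatographic drift."""
    return 100.0 * stdev(retention_times) / mean(retention_times)

# Hypothetical retention times (minutes) for one standard peptide over five runs
print(round(rt_cv_percent([22.1, 22.3, 21.9, 22.2, 22.0]), 2))
```

Tracking this value per run, alongside identification counts, is the kind of intra-experiment monitoring the reviewed tools automate.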

  5. Geospace Environment Modeling 2008-2009 Challenge: Ground Magnetic Field Perturbations

    NASA Technical Reports Server (NTRS)

    Pulkkinen, A.; Kuznetsova, M.; Ridley, A.; Raeder, J.; Vapirev, A.; Weimer, D.; Weigel, R. S.; Wiltberger, M.; Millward, G.; Rastatter, L.; et al.

    2011-01-01

    Acquiring quantitative metrics-based knowledge about the performance of various space physics modeling approaches is central for the space weather community. Quantification of the performance helps the users of the modeling products to better understand the capabilities of the models and to choose the approach that best suits their specific needs. Further, metrics-based analyses are important for addressing the differences between various modeling approaches and for measuring and guiding the progress in the field. In this paper, the metrics-based results of the ground magnetic field perturbation part of the Geospace Environment Modeling 2008-2009 Challenge are reported. Predictions made by 14 different models, including an ensemble model, are compared to geomagnetic observatory recordings from 12 different northern hemispheric locations. Five different metrics are used to quantify the model performances for four storm events. It is shown that the ranking of the models is strongly dependent on the type of metric used to evaluate the model performance. None of the models rank near or at the top systematically for all used metrics. Consequently, one cannot pick the absolute "winner": the choice for the best model depends on the characteristics of the signal one is interested in. Model performances vary also from event to event. This is particularly clear for root-mean-square difference and utility metric-based analyses. Further, analyses indicate that for some of the models, increasing the global magnetohydrodynamic model spatial resolution and the inclusion of the ring current dynamics improve the models' capability to generate more realistic ground magnetic field fluctuations.
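The root-mean-square difference metric used in the challenge compares a modeled perturbation series against an observatory recording point by point. A minimal sketch with invented values (real inputs would be time-aligned ground magnetic field series in nT):

```python
from math import sqrt

def rms_difference(observed, predicted):
    """Root-mean-square difference between observed and modeled
    ground magnetic perturbation series (same length, same units)."""
    assert len(observed) == len(predicted)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                / len(observed))

obs = [10.0, -5.0, 30.0, 0.0]   # hypothetical observatory samples
mod = [12.0, -4.0, 25.0, 2.0]   # hypothetical model output
print(rms_difference(obs, mod))
```

Ranking 14 models then amounts to computing this (and the other four metrics) per model, per station, per storm event, which is why the ranking can change with the metric chosen.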

  6. Performance Benchmarks for Scholarly Metrics Associated with Fisheries and Wildlife Faculty

    PubMed Central

    Swihart, Robert K.; Sundaram, Mekala; Höök, Tomas O.; DeWoody, J. Andrew; Kellner, Kenneth F.

    2016-01-01

    Research productivity and impact are often considered in professional evaluations of academics, and performance metrics based on publications and citations increasingly are used in such evaluations. To promote evidence-based and informed use of these metrics, we collected publication and citation data for 437 tenure-track faculty members at 33 research-extensive universities in the United States belonging to the National Association of University Fisheries and Wildlife Programs. For each faculty member, we computed 8 commonly used performance metrics based on numbers of publications and citations, and recorded covariates including academic age (time since Ph.D.), sex, percentage of appointment devoted to research, and the sub-disciplinary research focus. Standardized deviance residuals from regression models were used to compare faculty after accounting for variation in performance due to these covariates. We also aggregated residuals to enable comparison across universities. Finally, we tested for temporal trends in citation practices to assess whether the “law of constant ratios”, used to enable comparison of performance metrics between disciplines that differ in citation and publication practices, applied to fisheries and wildlife sub-disciplines when mapped to Web of Science Journal Citation Report categories. Our regression models reduced deviance by ¼ to ½. Standardized residuals for each faculty member, when combined across metrics as a simple average or weighted via factor analysis, produced similar results in terms of performance based on percentile rankings. Significant variation was observed in scholarly performance across universities, after accounting for the influence of covariates. In contrast to findings for other disciplines, normalized citation ratios for fisheries and wildlife sub-disciplines increased across years. Increases were comparable for all sub-disciplines except ecology. 
We discuss the advantages and limitations of our methods, illustrate their use when applied to new data, and suggest future improvements. Our benchmarking approach may provide a useful tool to augment detailed, qualitative assessment of performance. PMID:27152838
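The core of the benchmarking approach, regressing a performance metric on covariates and then comparing standardized residuals, can be sketched in simplified form. This stand-in uses ordinary least squares on log-scaled citation counts with academic age as the only covariate; the authors used deviance residuals from fuller regression models, and all values here are hypothetical:

```python
from math import log, sqrt
from statistics import mean

def standardized_residuals(ages, citations):
    """OLS residuals of log(citations + 1) on academic age, scaled to unit
    variance; positive values mean above-expectation output for career stage."""
    y = [log(c + 1) for c in citations]
    xb, yb = mean(ages), mean(y)
    slope = (sum((x - xb) * (v - yb) for x, v in zip(ages, y))
             / sum((x - xb) ** 2 for x in ages))
    resid = [v - (yb + slope * (x - xb)) for x, v in zip(ages, y)]
    s = sqrt(sum(r * r for r in resid) / (len(resid) - 2))
    return [r / s for r in resid]

# Hypothetical faculty: years since Ph.D. and total citations
print(standardized_residuals([5, 10, 15, 20], [40, 90, 400, 800]))
```

Averaging such residuals across several metrics, or weighting them via factor analysis, yields the combined per-faculty scores the study compares.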

  7. Performance Benchmarks for Scholarly Metrics Associated with Fisheries and Wildlife Faculty.

    PubMed

    Swihart, Robert K; Sundaram, Mekala; Höök, Tomas O; DeWoody, J Andrew; Kellner, Kenneth F

    2016-01-01

    Research productivity and impact are often considered in professional evaluations of academics, and performance metrics based on publications and citations increasingly are used in such evaluations. To promote evidence-based and informed use of these metrics, we collected publication and citation data for 437 tenure-track faculty members at 33 research-extensive universities in the United States belonging to the National Association of University Fisheries and Wildlife Programs. For each faculty member, we computed 8 commonly used performance metrics based on numbers of publications and citations, and recorded covariates including academic age (time since Ph.D.), sex, percentage of appointment devoted to research, and the sub-disciplinary research focus. Standardized deviance residuals from regression models were used to compare faculty after accounting for variation in performance due to these covariates. We also aggregated residuals to enable comparison across universities. Finally, we tested for temporal trends in citation practices to assess whether the "law of constant ratios", used to enable comparison of performance metrics between disciplines that differ in citation and publication practices, applied to fisheries and wildlife sub-disciplines when mapped to Web of Science Journal Citation Report categories. Our regression models reduced deviance by ¼ to ½. Standardized residuals for each faculty member, when combined across metrics as a simple average or weighted via factor analysis, produced similar results in terms of performance based on percentile rankings. Significant variation was observed in scholarly performance across universities, after accounting for the influence of covariates. In contrast to findings for other disciplines, normalized citation ratios for fisheries and wildlife sub-disciplines increased across years. Increases were comparable for all sub-disciplines except ecology. 
We discuss the advantages and limitations of our methods, illustrate their use when applied to new data, and suggest future improvements. Our benchmarking approach may provide a useful tool to augment detailed, qualitative assessment of performance.

  8. Health impact metrics for air pollution management strategies

    PubMed Central

    Martenies, Sheena E.; Wilkins, Donele; Batterman, Stuart A.

    2015-01-01

    Health impact assessments (HIAs) inform policy and decision making by providing information regarding future health concerns, and quantitative HIAs now are being used for local and urban-scale projects. HIA results can be expressed using a variety of metrics that differ in meaningful ways, and guidance is lacking with respect to best practices for the development and use of HIA metrics. This study reviews HIA metrics pertaining to air quality management and presents evaluative criteria for their selection and use. These are illustrated in a case study where PM2.5 concentrations are lowered from 10 to 8 µg/m3 in an urban area of 1.8 million people. Health impact functions are used to estimate the number of premature deaths, unscheduled hospitalizations and other morbidity outcomes. The most common metric in recent quantitative HIAs has been the number of cases of adverse outcomes avoided. Other metrics include time-based measures, e.g., disability-adjusted life years (DALYs), monetized impacts, functional-unit based measures, e.g., benefits per ton of emissions reduced, and other economic indicators, e.g., cost-benefit ratios. These metrics are evaluated by considering their comprehensiveness, the spatial and temporal resolution of the analysis, how equity considerations are facilitated, and the analysis and presentation of uncertainty. In the case study, the greatest number of avoided cases occurs for low severity morbidity outcomes, e.g., asthma exacerbations (n=28,000) and minor-restricted activity days (n=37,000); while DALYs and monetized impacts are driven by the severity, duration and value assigned to a relatively low number of premature deaths (n=190 to 230 per year). The selection of appropriate metrics depends on the problem context and boundaries, the severity of impacts, and community values regarding health. 
The number of avoided cases provides an estimate of the number of people affected, and monetized impacts facilitate additional economic analyses useful to policy analysis. DALYs are commonly used as an aggregate measure of health impacts and can be used to compare impacts across studies. Benefits per ton metrics may be appropriate when changes in emissions rates can be estimated. To address community concerns and HIA objectives, a combination of metrics is suggested. PMID:26372694
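The "cases avoided" metric rests on a standard log-linear health impact function. A sketch with hypothetical inputs; the baseline rate, population, and concentration-response coefficient below are illustrative, not the case study's actual values:

```python
from math import exp

def avoided_cases(baseline_rate, population, beta, delta_c):
    """Avoided cases for a pollutant reduction delta_c (ug/m3) using the
    common log-linear concentration-response form:
    baseline_rate * population * (1 - exp(-beta * delta_c))."""
    return baseline_rate * population * (1.0 - exp(-beta * delta_c))

# Hypothetical: annual baseline mortality rate, urban population,
# PM2.5 mortality coefficient, and a 2 ug/m3 concentration reduction
print(round(avoided_cases(0.008, 1_800_000, 0.0058, 2.0), 1))
```

Monetized impacts and DALYs then follow by multiplying counts like this by a value per case or a severity-weighted duration, which is why those metrics are dominated by the few high-severity outcomes.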

  9. NATO Code of Best Practice for Command and Control Assessment (Code OTAN des meilleures pratiques pour l’evaluation du commandement et du controle)

    DTIC Science & Technology

    2004-01-01

    Based Research Inc., 1595 Spring Hill Road, Suite 250, Vienna, VA 22182-2216, Wheatleyg@je.jfcom.mil; Mr. J. Wilder, UNITED STATES, U.S. Army Training...Sarin, 2000). Collaboration C2 Metrics: The following collaboration metrics have evolved out of work done by Evidence Based Research, Inc. for the...Enemy, Troops, Terrain, Troops, Time, and Civil considerations; OOTW: Operations Other Than War; PESTLE: Political, Economic, Social, Technological

  10. Relevance of motion-related assessment metrics in laparoscopic surgery.

    PubMed

    Oropesa, Ignacio; Chmarra, Magdalena K; Sánchez-González, Patricia; Lamata, Pablo; Rodrigues, Sharon P; Enciso, Silvia; Sánchez-Margallo, Francisco M; Jansen, Frank-Willem; Dankelman, Jenny; Gómez, Enrique J

    2013-06-01

    Motion metrics have become an important source of information when addressing the assessment of surgical expertise. However, their direct relationship with the different surgical skills has not been fully explored. The purpose of this study is to investigate the relevance of motion-related metrics in the evaluation processes of basic psychomotor laparoscopic skills and their correlation with the different abilities sought to measure. A framework for task definition and metric analysis is proposed. An explorative survey was first conducted with a board of experts to identify metrics to assess basic psychomotor skills. Based on the output of that survey, 3 novel tasks for surgical assessment were designed. Face and construct validation was performed, with focus on motion-related metrics. Tasks were performed by 42 participants (16 novices, 22 residents, and 4 experts). Movements of the laparoscopic instruments were registered with the TrEndo tracking system and analyzed. Time, path length, and depth showed construct validity for all 3 tasks. Motion smoothness and idle time also showed validity for tasks involving bimanual coordination and tasks requiring a more tactical approach, respectively. Additionally, motion smoothness and average speed showed a high internal consistency, proving them to be the most task-independent of all the metrics analyzed. Motion metrics are complementary and valid for assessing basic psychomotor skills, and their relevance depends on the skill being evaluated. A larger clinical implementation, combined with quality performance information, will give more insight on the relevance of the results shown in this study.
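Two of the metrics validated above, path length and average speed, follow directly from sampled instrument tip positions. A minimal sketch; the trajectory values are invented, where a real study would read them from a tracking system such as the TrEndo:

```python
from math import sqrt

def path_length(traj):
    """Total distance travelled by the instrument tip.
    traj is a list of (x, y, z) positions in consistent units."""
    return sum(sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))
               for p, q in zip(traj, traj[1:]))

def average_speed(traj, dt):
    """Mean tip speed given a fixed sampling interval dt (seconds)."""
    return path_length(traj) / (dt * (len(traj) - 1))

traj = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]  # hypothetical samples
print(path_length(traj), average_speed(traj, 0.5))
```

Smoothness and idle-time metrics are derivatives of the same position stream (jerk magnitude and low-speed dwell, respectively), which is why a single tracked trajectory can feed the whole metric suite.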

  11. Uncertainty quantification metrics for whole product life cycle cost estimates in aerospace innovation

    NASA Astrophysics Data System (ADS)

    Schwabe, O.; Shehab, E.; Erkoyuncu, J.

    2015-08-01

    The lack of defensible methods for quantifying cost estimate uncertainty over the whole product life cycle of aerospace innovations such as propulsion systems or airframes poses a significant challenge to the creation of accurate and defensible cost estimates. Based on the axiomatic definition of uncertainty as the actual prediction error of the cost estimate, this paper provides a comprehensive overview of metrics used for the uncertainty quantification of cost estimates based on a literature review, an evaluation of publicly funded projects, such as those in the CORDIS or Horizon 2020 programs, and an analysis of established approaches used by organizations such as NASA, the U.S. Department of Defense, the ESA, and various commercial companies. The metrics are categorized based on their foundational character (foundations), their use in practice (state-of-practice), their availability for practice (state-of-art) and those suggested for future exploration (state-of-future). Insights gained were that a variety of uncertainty quantification metrics exist whose suitability depends on the volatility of available relevant information, as defined by technical and cost readiness level, and the number of whole product life cycle phases the estimate is intended to be valid for. Information volatility and number of whole product life cycle phases can hereby be considered as defining multi-dimensional probability fields admitting various uncertainty quantification metric families with identifiable thresholds for transitioning between them. The key research gaps identified were the lacking guidance grounded in theory for the selection of uncertainty quantification metrics and lacking practical alternatives to metrics based on the Central Limit Theorem.
An innovative uncertainty quantification framework, consisting of a set-theory-based typology, a data library, a classification system, and a corresponding input-output model, is put forward to address this research gap as the basis for future work in this field.
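As one concrete example of the Central Limit Theorem style metrics the paper says dominate current practice, the coefficient of variation of Monte Carlo cost outcomes is a common dispersion measure for an estimate. A sketch with hypothetical simulated outcomes:

```python
from statistics import mean, stdev

def cost_estimate_cv(simulated_costs):
    """Coefficient of variation of simulated whole-life-cycle cost outcomes:
    dispersion of the distribution relative to the mean estimate."""
    return stdev(simulated_costs) / mean(simulated_costs)

# Hypothetical Monte Carlo outcomes for a cost estimate (in $M)
print(round(cost_estimate_cv([90.0, 100.0, 110.0, 105.0, 95.0]), 3))
```

The paper's point is that such distribution-based metrics presuppose stable information, which early life-cycle phases and low readiness levels rarely provide.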

  12. Hydrologic Model Development and Calibration: Contrasting a Single- and Multi-Objective Approach for Comparing Model Performance

    NASA Astrophysics Data System (ADS)

    Asadzadeh, M.; Maclean, A.; Tolson, B. A.; Burn, D. H.

    2009-05-01

    Hydrologic model calibration aims to find a set of parameters that adequately simulates observations of watershed behavior, such as streamflow, or a state variable, such as snow water equivalent (SWE). There are different metrics for evaluating calibration effectiveness that involve quantifying prediction errors, such as the Nash-Sutcliffe (NS) coefficient and bias evaluated for the entire calibration period, on a seasonal basis, for low flows, or for high flows. Many of these metrics are conflicting such that the set of parameters that maximizes the high flow NS differs from the set of parameters that maximizes the low flow NS. Conflicting objectives are very likely when different calibration objectives are based on different fluxes and/or state variables (e.g., NS based on streamflow versus SWE). One of the most popular ways to balance different metrics is to aggregate them based on their importance and find the set of parameters that optimizes a weighted sum of the efficiency metrics. Comparing alternative hydrologic models (e.g., assessing model improvement when a process or more detail is added to the model) based on the aggregated objective might be misleading since it represents one point on the tradeoff of desired error metrics. To derive a more comprehensive model comparison, we solved a bi-objective calibration problem to estimate the tradeoff between two error metrics for each model. Although this approach is computationally more expensive than the aggregation approach, it results in a better understanding of the effectiveness of selected models at each level of every error metric and therefore provides a better rationale for judging relative model quality. The two alternative models used in this study are two MESH hydrologic models (version 1.2) of the Wolf Creek Research basin that differ in their watershed spatial discretization (a single Grouped Response Unit, GRU, versus multiple GRUs). 
The MESH model, currently under development by Environment Canada, is a coupled land-surface and hydrologic model. Results will demonstrate the conclusions a modeller might make regarding the value of additional watershed spatial discretization under both an aggregated (single-objective) and multi-objective model comparison framework.
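The error metrics being traded off in the bi-objective calibration are standard. A minimal sketch of the Nash-Sutcliffe coefficient and percent bias for a streamflow series (example values are invented):

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; values <= 0 mean the
    model predicts no better than the mean of the observations."""
    ob = sum(obs) / len(obs)
    return 1.0 - (sum((o - s) ** 2 for o, s in zip(obs, sim))
                  / sum((o - ob) ** 2 for o in obs))

def percent_bias(obs, sim):
    """Percent bias: positive values indicate net underestimation of flow."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

obs = [12.0, 30.0, 55.0, 20.0]   # hypothetical observed flows
sim = [10.0, 33.0, 50.0, 22.0]   # hypothetical simulated flows
print(nash_sutcliffe(obs, sim), percent_bias(obs, sim))
```

Computing NS separately for high flows, low flows, or SWE gives exactly the conflicting objectives the bi-objective calibration trades off.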

  13. Evaluation of Science.

    PubMed

    Usmani, Adnan Mahmmood; Meo, Sultan Ayoub

    2011-01-01

    Scientific achievement by publishing a scientific manuscript in a peer-reviewed biomedical journal is an important ingredient of research, along with career-enhancing advantages and a significant amount of personal satisfaction. The road to evaluating science (research, scientific publications) among scientists often seems complicated. A scientist's career is generally summarized by the number of publications / citations, teaching undergraduate, graduate, and post-doctoral students, writing or reviewing grants and papers, preparing for and organizing meetings, participating in collaborations and conferences, advising colleagues, and serving on editorial boards of scientific journals. Scientists have been sizing up their colleagues since science began. Scientometricians have invented a wide variety of algorithms, called science metrics, to evaluate science. Many of the science metrics are unknown even to the everyday scientist. Unfortunately, there is no all-in-one metric. Each has its own strength, limitations, and scope. Some are mistakenly applied to evaluate individuals, and each is surrounded by a cloud of variants designed to help them apply across different scientific fields or different career stages [1]. A suitable indicator should be chosen by considering the purpose of the evaluation and how the results will be used. Scientific evaluation assists us in computing research performance, comparing with peers, forecasting growth, identifying excellence in research, citation ranking, finding the influence of research, measuring productivity, making policy decisions, securing funds for research, and spotting trends. Key concepts in science metrics are output and impact. Evaluation of science is traditionally expressed in terms of citation counts. Most science metrics are based on citation counts; the two most commonly used are the impact factor [2] and the h-index [3].
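Of the two metrics named, the h-index has a precise one-line definition: the largest h such that h of an author's papers each have at least h citations. A minimal sketch:

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i      # the i-th most-cited paper still has >= i citations
        else:
            break
    return h

# Hypothetical citation counts for one author's papers
print(h_index([10, 8, 5, 4, 3]))  # 4 papers each have at least 4 citations
```

The impact factor, by contrast, is a journal-level ratio (citations in a year to items published in the previous two years, divided by the count of those items), which is one reason applying it to individuals is considered a misuse.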

  14. Local coding based matching kernel method for image classification.

    PubMed

    Song, Yan; McLoughlin, Ian Vince; Dai, Li-Rong

    2014-01-01

    This paper mainly focuses on how to effectively and efficiently measure visual similarity for local feature based representation. Among existing methods, metrics based on Bag of Visual Words (BoV) techniques are efficient and conceptually simple, at the expense of effectiveness. By contrast, kernel based metrics are more effective, but at the cost of greater computational complexity and increased storage requirements. We show that a unified visual matching framework can be developed to encompass both BoV and kernel based metrics, in which local kernel plays an important role between feature pairs or between features and their reconstruction. Generally, local kernels are defined using Euclidean distance or its derivatives, based either explicitly or implicitly on an assumption of Gaussian noise. However, local features such as SIFT and HoG often follow a heavy-tailed distribution which tends to undermine the motivation behind Euclidean metrics. Motivated by recent advances in feature coding techniques, a novel efficient local coding based matching kernel (LCMK) method is proposed. This exploits the manifold structures in Hilbert space derived from local kernels. The proposed method combines advantages of both BoV and kernel based metrics, and achieves a linear computational complexity. This enables efficient and scalable visual matching to be performed on large scale image sets. To evaluate the effectiveness of the proposed LCMK method, we conduct extensive experiments with widely used benchmark datasets, including 15-Scenes, Caltech101/256, PASCAL VOC 2007 and 2011 datasets. Experimental results confirm the effectiveness of the relatively efficient LCMK method.
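The kernel-based matching the paper builds on can be illustrated by the basic sum-match form: average a local kernel over all cross-image feature pairs. A simplified sketch using a plain linear local kernel; the actual LCMK method replaces this with a coding-based local kernel and avoids the quadratic pair enumeration:

```python
def linear_kernel(a, b):
    """Dot-product local kernel between two feature vectors (illustrative)."""
    return sum(x * y for x, y in zip(a, b))

def match_kernel(features_a, features_b, local_kernel):
    """Sum-match kernel between two images' local feature sets: the mean
    local-kernel value over all cross-image feature pairs."""
    total = sum(local_kernel(a, b) for a in features_a for b in features_b)
    return total / (len(features_a) * len(features_b))

# Hypothetical 2-D "local features" for two images
print(match_kernel([(1.0, 0.0)], [(1.0, 0.0), (0.0, 1.0)], linear_kernel))
```

The quadratic cost of this enumeration is exactly the expense the paper's linear-complexity construction is designed to remove.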

  15. Evaluating which plan quality metrics are appropriate for use in lung SBRT.

    PubMed

    Yaparpalvi, Ravindra; Garg, Madhur K; Shen, Jin; Bodner, William R; Mynampati, Dinesh K; Gafar, Aleiya; Kuo, Hsiang-Chi; Basavatia, Amar K; Ohri, Nitin; Hong, Linda X; Kalnicki, Shalom; Tome, Wolfgang A

    2018-02-01

    Several dose metrics in the categories of homogeneity, coverage, conformity, and gradient have been proposed in the literature for evaluating treatment plan quality. In this study, we applied these metrics to characterize and identify the plan quality metrics that would merit plan quality assessment in lung stereotactic body radiation therapy (SBRT) dose distributions. Treatment plans of 90 lung SBRT patients, comprising 91 targets, treated in our institution were retrospectively reviewed. Dose calculations were performed using the anisotropic analytical algorithm (AAA) with heterogeneity correction. A literature review on published plan quality metrics in the categories of coverage, homogeneity, conformity, and gradient was performed. For each patient, using dose-volume histogram data, plan quality metric values were quantified and analysed. For the study, the Radiation Therapy Oncology Group (RTOG)-defined plan quality metric values were: coverage (0.90 ± 0.08); homogeneity (1.27 ± 0.07); conformity (1.03 ± 0.07); and gradient (4.40 ± 0.80). Geometric conformity strongly correlated with conformity index (p < 0.0001). Gradient measures strongly correlated with target volume (p < 0.0001). The conformity guidelines for prescribed dose advocated by the RTOG lung SBRT protocol were met in all categories in ≥94% of cases. The proportions of total lung volume receiving doses of 20 Gy and 5 Gy (V20 and V5) were a mean of 4.8% (±3.2) and 16.4% (±9.2), respectively. Based on our study analyses, we recommend the following metrics as appropriate surrogates for establishing SBRT lung plan quality guidelines: coverage % (ICRU 62), conformity (CN or the Paddick CI), and gradient (R50%). Furthermore, we strongly recommend that RTOG lung SBRT protocols adopt either CN or the Paddick CI in place of the prescription isodose to target volume ratio for conformity index evaluation. Advances in knowledge: Our study metrics are valuable tools for establishing lung SBRT plan quality guidelines.
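The recommended conformity and gradient surrogates are simple volume ratios. A sketch with hypothetical volumes (all in cc):

```python
def paddick_cn(tv, piv, tv_piv):
    """Paddick conformity number: TV_PIV^2 / (TV * PIV), where TV is the
    target volume, PIV the prescription isodose volume, and TV_PIV the
    target volume covered by the prescription isodose. 1.0 is ideal."""
    return tv_piv ** 2 / (tv * piv)

def gradient_r50(v50, piv):
    """R50%: volume enclosed by the 50% isodose divided by the
    prescription isodose volume; smaller means a steeper dose fall-off."""
    return v50 / piv

# Hypothetical plan: 20 cc target, 22 cc prescription isodose,
# 19 cc of target covered, 90 cc enclosed by the 50% isodose
print(round(paddick_cn(20.0, 22.0, 19.0), 3), round(gradient_r50(90.0, 22.0), 2))
```

Unlike the plain prescription isodose to target volume ratio, the Paddick form penalizes both under-coverage of the target and spill of prescription dose outside it, which is the basis of the recommendation.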

  16. A suite of standard post-tagging evaluation metrics can help assess tag retention for field-based fish telemetry research

    USGS Publications Warehouse

    Gerber, Kayla M.; Mather, Martha E.; Smith, Joseph M.

    2017-01-01

    Telemetry can inform many scientific and research questions if a context exists for integrating individual studies into the larger body of literature. Creating cumulative distributions of post-tagging evaluation metrics would allow individual researchers to relate their telemetry data to other studies. Widespread reporting of standard metrics is a precursor to the calculation of benchmarks for these distributions (e.g., mean, SD, 95% CI). Here we illustrate five types of standard post-tagging evaluation metrics using acoustically tagged Blue Catfish (Ictalurus furcatus) released into a Kansas reservoir. These metrics included: (1) percent of tagged fish detected overall, (2) percent of tagged fish detected daily using abacus plot data, (3) average number of (and percent of available) receiver sites visited, (4) date of last movement between receiver sites (and percent of tagged fish moving during that time period), and (5) number (and percent) of fish that egressed through exit gates. These metrics were calculated for one to three time periods: early (<10 d), during (weekly), and at the end of the study (5 months). Over three-quarters of our tagged fish were detected early (85%) and at the end (85%) of the study. Using abacus plot data, all tagged fish (100%) were detected at least one day and 96% were detected for > 5 days early in the study. On average, tagged Blue Catfish visited 9 (50%) and 13 (72%) of 18 within-reservoir receivers early and at the end of the study, respectively. At the end of the study, 73% of all tagged fish were detected moving between receivers. Creating statistical benchmarks for individual metrics can provide useful reference points. In addition, combining multiple metrics can inform ecology and research design. Consequently, individual researchers and the field of telemetry research can benefit from widespread, detailed, and standard reporting of post-tagging detection metrics.
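The first and third metrics above (percent of tagged fish detected, and the average number and percent of receiver sites visited) are straightforward to compute from detection records. A minimal sketch; the data structure and example values are hypothetical:

```python
def detection_summary(tagged_ids, detections):
    """Percent of tagged fish detected and mean number of receiver sites
    visited per detected fish. detections maps fish id -> set of receiver ids."""
    detected = [f for f in tagged_ids if detections.get(f)]
    pct_detected = 100.0 * len(detected) / len(tagged_ids)
    mean_sites = (sum(len(detections[f]) for f in detected) / len(detected)
                  if detected else 0.0)
    return pct_detected, mean_sites

# Hypothetical: four tagged fish, three with any detections recorded
dets = {"f1": {"r1", "r2"}, "f2": {"r3"}, "f3": set()}
print(detection_summary(["f1", "f2", "f3", "f4"], dets))
```

Reporting such values at standard time points (early, weekly, end of study) is what would let individual studies contribute to the cumulative distributions the authors propose.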

  17. Toward objective image quality metrics: the AIC Eval Program of the JPEG

    NASA Astrophysics Data System (ADS)

    Richter, Thomas; Larabi, Chaker

    2008-08-01

    Objective quality assessment of lossy image compression codecs is an important part of the recent call of the JPEG for Advanced Image Coding. The target of the AIC ad-hoc group is twofold: first, to receive state-of-the-art still image codecs and to propose suitable technology for standardization; and second, to study objective image quality metrics to evaluate the performance of such codecs. Even though the performance of an objective metric is defined by how well it predicts the outcome of a subjective assessment, one can also study the usefulness of a metric in a non-traditional way indirectly, namely by measuring the subjective quality improvement of a codec that has been optimized for a specific objective metric. This approach shall be demonstrated here on the recently proposed HDPhoto format [14] introduced by Microsoft and a SSIM-tuned [17] version of it by one of the authors. We compare these two implementations with JPEG [1] in two variations and a visually and PSNR optimal JPEG2000 [13] implementation. To this end, we use subjective and objective tests based on the multiscale SSIM and a new DCT based metric.
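PSNR, the baseline objective metric the tuned codecs are compared against, reduces to a mean-squared-error computation. A minimal sketch on flattened 8-bit pixel lists (real evaluations operate on full images, and SSIM adds structural terms this deliberately omits):

```python
from math import log10

def psnr(original, coded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    flattened images; higher is better, identical images give infinity."""
    mse = sum((o - c) ** 2 for o, c in zip(original, coded)) / len(original)
    return float("inf") if mse == 0 else 10.0 * log10(peak ** 2 / mse)

# Hypothetical 2x2 grayscale images, flattened
print(round(psnr([0, 128, 255, 64], [2, 125, 250, 66]), 2))
```

Optimizing a codec for PSNR versus for SSIM can produce visibly different artifacts at equal bitrate, which is exactly the effect the paper measures subjectively.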

  18. Performance comparison of optical interference cancellation system architectures.

    PubMed

    Lu, Maddie; Chang, Matt; Deng, Yanhua; Prucnal, Paul R

    2013-04-10

    The performance of three optics-based interference cancellation systems is compared and contrasted with each other, and with traditional electronic techniques for interference cancellation. The comparison is based on a set of common performance metrics that we have developed for this purpose. It is shown that thorough evaluation of our optical approaches takes into account the traditional notions of depth of cancellation and dynamic range, along with notions of link loss and uniformity of cancellation. Our evaluation shows that our use of optical components affords performance that surpasses traditional electronic approaches, and that the optimal choice for an optical interference canceller requires taking into account the performance metrics discussed in this paper.
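Depth of cancellation, the first of the traditional metrics named, is the dB ratio of interference power before and after the canceller. A sketch with invented power levels:

```python
from math import log10

def cancellation_depth_db(power_before, power_after):
    """Depth of cancellation in dB: how much interference power the
    canceller removes (powers in the same linear units, e.g. watts)."""
    return 10.0 * log10(power_before / power_after)

# Hypothetical: 1 W of interference reduced to 1 mW after cancellation
print(cancellation_depth_db(1.0, 0.001))
```

The paper's point is that this single number is insufficient on its own: link loss and how uniform the depth is across the band must enter the comparison as well.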

  19. A software technology evaluation program

    NASA Technical Reports Server (NTRS)

    Novaes-Card, David N.

    1985-01-01

    A set of quantitative approaches is presented for evaluating software development methods and tools. The basic idea is to generate a set of goals which are refined into quantifiable questions which specify metrics to be collected on the software development and maintenance process and product. These metrics can be used to characterize, evaluate, predict, and motivate. They can be used in an active as well as passive way by learning from analyzing the data and improving the methods and tools based upon what is learned from that analysis. Several examples were given representing each of the different approaches to evaluation. The cost of the approaches varied inversely with the level of confidence in the interpretation of the results.

  20. Objective measurement of complex multimodal and multidimensional display formats: a common metric for predicting format effectiveness

    NASA Astrophysics Data System (ADS)

    Marshak, William P.; Darkow, David J.; Wesler, Mary M.; Fix, Edward L.

    2000-08-01

    Computer-based display designers have more sensory modes, and more dimensions within each sensory modality, with which to encode information in a user interface than ever before. This elaboration of information presentation has made measurement of display/format effectiveness and prediction of display/format performance extremely difficult. A multivariate method has been devised which isolates critical information, physically measures its signal strength, and compares it with other elements of the display, which act like background noise. This Common Metric relates signal-to-noise ratios (SNRs) within each stimulus dimension, then combines SNRs among display modes, dimensions, and cognitive factors to predict display format effectiveness. Examples with their Common Metric assessment and validation against performance will be presented, along with the derivation of the metric. Implications of the Common Metric for display design and evaluation will be discussed.
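The signal-versus-background idea can be sketched abstractly. This is a simplified stand-in, not the paper's actual formulation: the per-dimension SNR is taken as critical-element strength over mean competing-element strength, and the combination rule across dimensions is assumed to be a plain mean:

```python
def display_snr(signal_strength, background_strengths):
    """SNR for one stimulus dimension: physically measured strength of the
    critical element over the mean strength of competing display elements."""
    return signal_strength / (sum(background_strengths) / len(background_strengths))

def combined_metric(snrs):
    """One plausible combination of per-dimension SNRs into a single format
    score (the paper's exact combination rule is not specified here)."""
    return sum(snrs) / len(snrs)

# Hypothetical luminance-contrast and auditory-level SNRs for one format
snrs = [display_snr(6.0, [2.0, 4.0]), display_snr(9.0, [3.0, 3.0])]
print(combined_metric(snrs))
```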

  1. Evaluation of image quality metrics for the prediction of subjective best focus.

    PubMed

    Kilintari, Marina; Pallikaris, Aristophanis; Tsiklis, Nikolaos; Ginis, Harilaos S

    2010-03-01

    Seven existing and three new image quality metrics were evaluated in terms of their effectiveness in predicting subjective cycloplegic refraction. Monochromatic wavefront aberrations (WA) were measured in 70 eyes using a Shack-Hartmann based device (Complete Ophthalmic Analysis System; Wavefront Sciences). Subjective cycloplegic spherocylindrical correction was obtained using a standard manifest refraction procedure. The dioptric amount required to optimize each metric was calculated and compared with the subjective refraction result. Metrics included monochromatic and polychromatic variants, as well as variants taking into consideration the Stiles and Crawford effect (SCE). WA measurements were performed using infrared light and converted to visible before all calculations. The mean difference between subjective cycloplegic and WA-derived spherical refraction ranged from 0.17 to 0.36 diopters (D), while paraxial curvature resulted in a difference of 0.68 D. Monochromatic metrics exhibited smaller mean differences between subjective cycloplegic and objective refraction. Consideration of the SCE reduced the standard deviation (SD) of the difference between subjective and objective refraction. All metrics exhibited similar performance in terms of accuracy and precision. We hypothesize that errors pertaining to the conversion between infrared and visible wavelengths rather than calculation method may be the limiting factor in determining objective best focus from near infrared WA measurements.

  2. Evaluation of diffusion kurtosis imaging in ex vivo hypomyelinated mouse brains.

    PubMed

    Kelm, Nathaniel D; West, Kathryn L; Carson, Robert P; Gochberg, Daniel F; Ess, Kevin C; Does, Mark D

    2016-01-01

    Diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), and DKI-derived white matter tract integrity metrics (WMTI) were experimentally evaluated ex vivo through comparisons to histological measurements and established magnetic resonance imaging (MRI) measures of myelin in two knockout mouse models with varying degrees of hypomyelination. DKI metrics of mean and radial kurtosis were found to be better indicators of myelin content than conventional DTI metrics. The biophysical WMTI model based on the DKI framework reported on axon water fraction with good accuracy in cases with near normal axon density, but did not provide additional specificity to myelination. Overall, DKI provided additional information regarding white matter microstructure compared with DTI, making it an attractive method for future assessments of white matter development and pathology. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Development of a diatom-based multimetric index for acid mine drainage impacted depressional wetlands.

    PubMed

    Riato, Luisa; Leira, Manel; Della Bella, Valentina; Oberholster, Paul J

    2018-01-15

    Acid mine drainage (AMD) from coal mining in the Mpumalanga Highveld region of South Africa has caused severe chemical and biological degradation of aquatic habitats, specifically depressional wetlands, as mines use these wetlands for storage of AMD. Diatom-based multimetric indices (MMIs) to assess wetland condition have mostly been developed to assess agricultural and urban land use impacts. No diatom MMI of wetland condition has been developed to assess AMD impacts related to mining activities. Previous approaches to diatom-based MMI development in wetlands have not accounted for natural variability. Natural variability among depressional wetlands may influence the accuracy of MMIs. Epiphytic diatom MMIs sensitive to AMD were developed for a range of depressional wetland types to account for natural variation in biological metrics. For this, we classified wetland types based on diatom typologies. A range of 4-15 final metrics were selected from a pool of ~140 candidate metrics to develop the MMIs based on their: (1) broad range, (2) high separation power and (3) low correlation among metrics. Final metrics were selected from three categories: similarity to reference sites, functional groups, and taxonomic composition, which represent different aspects of diatom assemblage structure and function. MMI performances were evaluated according to their precision in distinguishing reference sites, responsiveness to discriminate reference and disturbed sites, sensitivity to human disturbances and relevancy to AMD-related stressors. Each MMI showed excellent discriminatory power, whether or not it accounted for natural variation. However, accounting for variation by grouping sites based on diatom typologies improved overall performance of MMIs. Our study highlights the usefulness of diatom-based metrics and provides a model for the biological assessment of depressional wetland condition in South Africa and elsewhere. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. VHA mental health information system: applying health information technology to monitor and facilitate implementation of VHA Uniform Mental Health Services Handbook requirements.

    PubMed

    Trafton, Jodie A; Greenberg, Greg; Harris, Alex H S; Tavakoli, Sara; Kearney, Lisa; McCarthy, John; Blow, Fredric; Hoff, Rani; Schohn, Mary

    2013-03-01

    To describe the design and deployment of health information technology to support implementation of mental health services policy requirements in the Veterans Health Administration (VHA). Using administrative and self-report survey data, we developed and fielded metrics regarding implementation of the requirements delineated in the VHA Uniform Mental Health Services Handbook. Finalized metrics were incorporated into 2 external facilitation-based quality improvement programs led by the VHA Mental Health Operations. To support these programs, tailored site-specific reports were generated. Metric development required close collaboration between program evaluators, policy makers and clinical leadership, and consideration of policy language and intent. Electronic reports supporting different purposes required distinct formatting and presentation features, despite their having similar general goals and using the same metrics. Health information technology can facilitate mental health policy implementation but must be integrated into a process of consensus building and close collaboration with policy makers, evaluators, and practitioners.

  5. Investigation into Text Classification With Kernel Based Schemes

    DTIC Science & Technology

    2010-03-01

    … Document Matrix; TDMs, Term-Document Matrices; TMG, Text to Matrix Generator; TN, True Negative; TP, True Positive; VSM, Vector Space Model … are represented as a term-document matrix, common evaluation metrics, and the software package Text to Matrix Generator (TMG). The classifier … This chapter introduces the indexing capabilities of the Text to Matrix Generator (TMG) Toolbox. Specific attention is placed on the

  6. PET and MRI image fusion based on combination of 2-D Hilbert transform and IHS method.

    PubMed

    Haddadpour, Mozhdeh; Daneshvar, Sabalan; Seyedarabi, Hadi

    2017-08-01

    Medical image fusion combines two or more medical images, such as a Magnetic Resonance Image (MRI) and a Positron Emission Tomography (PET) image, and maps them to a single fused image. The purpose of our study is to assist physicians in diagnosing and treating disease in as little time as possible. We used MRI and PET images as inputs and fused them based on a combination of the two-dimensional Hilbert transform (2-D HT) and the Intensity Hue Saturation (IHS) method. Three common evaluation metrics were applied: Discrepancy (Dk), which assesses spectral features; Average Gradient (AGk), which assesses spatial features; and Overall Performance (O.P), which verifies the suitability of the proposed method. Since the main purpose of medical image fusion is to preserve both the spatial and spectral features of the input images, the numerical results of these evaluation metrics, together with the simulation results, indicate that the proposed method preserves both. Copyright © 2017 Chang Gung University. Published by Elsevier B.V. All rights reserved.
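A minimal sketch of the intensity-substitution idea behind IHS fusion, assuming the simple I = (R+G+B)/3 convention; the paper's full method also involves the 2-D Hilbert transform, which is not shown here, and scaling channel ratios is only an approximation to holding hue and saturation fixed:

```python
def ihs_fuse_pixel(pet_rgb, mri_intensity):
    """Substitute a PET pixel's intensity with the MRI intensity.

    Assumes the simple I = (R+G+B)/3 intensity convention and rescales the
    channels, which preserves the channel ratios (the colour) while
    injecting the MRI's structural detail.
    """
    r, g, b = pet_rgb
    i_old = (r + g + b) / 3.0
    if i_old == 0:
        return (mri_intensity,) * 3
    scale = mri_intensity / i_old
    return tuple(min(255.0, c * scale) for c in (r, g, b))

fused = ihs_fuse_pixel((120.0, 60.0, 30.0), 140.0)  # -> (240.0, 120.0, 60.0)
```

The fused pixel's mean intensity equals the MRI value while the 4:2:1 channel ratio of the PET colour is retained.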

  7. Stability metrics for multi-source biomedical data based on simplicial projections from probability distribution distances.

    PubMed

    Sáez, Carlos; Robles, Montserrat; García-Gómez, Juan M

    2017-02-01

    Biomedical data may be composed of individuals generated from distinct, meaningful sources. Due to possible contextual biases in the processes that generate data, there may exist an undesirable and unexpected variability among the probability distribution functions (PDFs) of the source subsamples, which, when uncontrolled, may lead to inaccurate or unreproducible research results. Classical statistical methods may have difficulty uncovering such variability when dealing with multi-modal, multi-type, multi-variate data. This work proposes two metrics for the analysis of stability among multiple data sources, robust to the aforementioned conditions and defined in the context of data quality assessment: a global probabilistic deviation metric and a source probabilistic outlyingness metric. The first provides a bounded degree of the global multi-source variability, designed as an estimator equivalent to the notion of normalized standard deviation of PDFs. The second provides a bounded degree of the dissimilarity of each source to a latent central distribution. The metrics are based on the projection of a simplex geometrical structure constructed from the Jensen-Shannon distances among the sources' PDFs. The metrics were evaluated and demonstrated correct behaviour on a simulated benchmark and with real multi-source biomedical data using the UCI Heart Disease data set. Biomedical data quality assessment based on the proposed stability metrics may improve the efficiency and effectiveness of biomedical data exploitation and research.
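The Jensen-Shannon distances underlying such metrics can be computed directly; the sketch below builds the pairwise distance matrix for toy per-source histograms (the simplex projection itself is not shown, and the site names and histograms are hypothetical):

```python
import math

def kl_bits(p, q):
    """Kullback-Leibler divergence in bits; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_distance(p, q):
    """Jensen-Shannon distance (square root of the divergence), bounded in [0, 1]."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return math.sqrt(0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m))

# Per-source histograms of the same variable (toy multi-site data).
sources = {
    "site_A": [0.5, 0.3, 0.2],
    "site_B": [0.5, 0.3, 0.2],
    "site_C": [0.1, 0.2, 0.7],
}
names = list(sources)
D = [[js_distance(sources[a], sources[b]) for b in names] for a in names]
```

With log base 2 the distance is bounded in [0, 1]: identical sources sit at distance 0, and fully disjoint distributions at distance 1, which is what makes the derived stability metrics bounded.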

  8. An investigation of the impact of variations of DVH calculation algorithms on DVH dependant radiation therapy plan evaluation metrics

    NASA Astrophysics Data System (ADS)

    Kennedy, A. M.; Lane, J.; Ebert, M. A.

    2014-03-01

    Plan review systems often allow dose volume histogram (DVH) recalculation as part of a quality assurance process for trials. A review of the algorithms provided by a number of systems indicated that they are often very similar. One notable point of variation between implementations is in the location and frequency of dose sampling. This study explored the impact such variations can have on DVH-based plan evaluation metrics (Normal Tissue Complication Probability (NTCP) and min, mean and max dose) for a plan with small structures placed over areas of high dose gradient. The dose grids considered were exported from the original planning system at a range of resolutions. We found that for the CT-based resolutions used in all but one of the plan review systems (CT, and CT with a guaranteed minimum number of sampling voxels in the x and y directions), results were very similar and changed in a similar manner with changes in the dose grid resolution, despite the extreme conditions. Differences became noticeable, however, when resolution was increased in the axial (z) direction. Evaluation metrics also varied differently with changing dose grid for CT-based resolutions compared to dose-grid-based resolutions. This suggests that if DVHs are being compared between systems that use a different basis for selecting sampling resolution, it may be important to confirm that a similar resolution was used during calculation.
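The effect of sampling location on DVH-derived metrics can be illustrated with a toy dose gradient: sampling the same structure densely versus at a few voxel centres shifts the min/max dose and the DVH tail. This is a schematic example, not any review system's actual algorithm:

```python
def cumulative_dvh(dose_samples, thresholds):
    """Fraction of sampled points receiving at least each threshold dose (Gy)."""
    n = len(dose_samples)
    return [sum(1 for d in dose_samples if d >= t) / n for t in thresholds]

# A linear 0-60 Gy gradient across a small structure, sampled two ways:
fine = [60.0 * i / 600 for i in range(601)]        # dense sampling grid
coarse = [60.0 * (i + 0.5) / 6 for i in range(6)]  # 6 voxel-centre samples
thresholds = [0.0, 20.0, 40.0, 60.0]
dvh_fine = cumulative_dvh(fine, thresholds)
dvh_coarse = cumulative_dvh(coarse, thresholds)
# Min/max dose metrics differ purely because of sampling location:
min_fine, min_coarse = min(fine), min(coarse)      # 0.0 vs 5.0 Gy
```

The coarse, centre-only sampling never sees the 0 Gy and 60 Gy extremes of the gradient, so min dose, max dose and the high-dose DVH bin all change, exactly the kind of sensitivity the study examines for small structures in steep gradients.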

  9. Evaluation of cassette-based digital radiography detectors using standardized image quality metrics: AAPM TG-150 Draft Image Detector Tests.

    PubMed

    Li, Guang; Greene, Travis C; Nishino, Thomas K; Willis, Charles E

    2016-09-08

    The purpose of this study was to evaluate several of the standardized image quality metrics proposed by the American Association of Physicists in Medicine (AAPM) Task Group 150. The task group suggested region-of-interest (ROI)-based techniques to measure nonuniformity, minimum signal-to-noise ratio (SNR), number of anomalous pixels, and modulation transfer function (MTF). This study evaluated the effects of ROI size and layout on the image metrics by using four different ROI sets, assessed result uncertainty by repeating measurements, and compared results with two commercially available quality control tools, namely the Carestream DIRECTVIEW Total Quality Tool (TQT) and the GE Healthcare Quality Assurance Process (QAP). Seven Carestream DRX-1C (CsI) detectors on mobile DR systems and four GE FlashPad detectors in radiographic rooms were tested. Images were analyzed using MATLAB software that had been previously validated and reported. Our values for signal and SNR nonuniformity and MTF agree with values published by other investigators. Our results show that ROI size affects nonuniformity and minimum SNR measurements, but not detection of anomalous pixels. Exposure geometry affects all tested image metrics except for the MTF. TG-150 metrics in general agree with the TQT, but agree with the QAP only for local and global signal nonuniformity. The difference in SNR nonuniformity and MTF values between the TG-150 and QAP may be explained by differences in the calculation of noise and acquisition beam quality, respectively. TG-150's SNR nonuniformity metrics are also more sensitive to detector nonuniformity compared to the QAP. Our results suggest that fixed ROI size should be used for consistency because nonuniformity metrics depend on ROI size. Ideally, detector tests should be performed at the exact calibration position. If not feasible, a baseline should be established from the mean of several repeated measurements. 
Our study indicates that the TG-150 tests can be used as an independent standardized procedure for detector performance assessment. © 2016 The Authors.
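The ROI-size dependence of nonuniformity metrics reported above can be reproduced on a toy flat-field image. The (max - min)/mean convention used here is one common choice for illustration, not the exact TG-150 formula:

```python
from statistics import mean

def roi_grid_means(image, roi):
    """Mean pixel value in each non-overlapping roi x roi block of a 2-D image."""
    h, w = len(image), len(image[0])
    return [mean(image[y + dy][x + dx] for dy in range(roi) for dx in range(roi))
            for y in range(0, h - roi + 1, roi)
            for x in range(0, w - roi + 1, roi)]

def signal_nonuniformity(image, roi):
    # One common convention: spread of ROI means relative to their average.
    m = roi_grid_means(image, roi)
    return (max(m) - min(m)) / mean(m)

# Flat-field image with a mild left-to-right shading gradient.
flat = [[100.0 + x for x in range(8)] for _ in range(8)]
nu_small_roi = signal_nonuniformity(flat, 2)
nu_large_roi = signal_nonuniformity(flat, 4)
```

Larger ROIs average over more of the gradient, so the measured nonuniformity shrinks, which is why a fixed ROI size matters for consistent trending.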


  11. Hospital-Based Clinical Pharmacy Services to Improve Ambulatory Management of Chronic Obstructive Pulmonary Disease

    PubMed Central

    Smith, Amber Lanae; Palmer, Valerie; Farhat, Nada; Kalus, James S.; Thavarajah, Krishna; DiGiovine, Bruno; MacDonald, Nancy C.

    2016-01-01

    Background: No systematic evaluation of comprehensive clinical pharmacy process measures currently exists to determine an optimal ambulatory care collaboration model for chronic obstructive pulmonary disease (COPD) patients. Objective: Describe the impact of a pharmacist-provided clinical COPD bundle on the management of COPD in a hospital-based ambulatory care clinic. Methods: This retrospective cohort analysis evaluated patients with COPD managed in an outpatient pulmonary clinic. The primary objective of this study was to assess the completion of 4 metrics known to improve the management of COPD: (1) medication therapy management, (2) quality measures including smoking cessation and vaccines, (3) patient adherence, and (4) patient education. The secondary objective was to evaluate the impact of the clinical COPD bundle on clinical and economic outcomes at 30 and 90 days post-initial visit. Results: A total of 138 patients were included in the study; 70 patients served as controls and 68 patients received the COPD bundle from the clinical pharmacist. No patients from the control group had all 4 metrics completed as documented, compared to 66 in the COPD bundle group (P < .0001). Additionally, a statistically significant difference was found in all 4 metrics when evaluated individually. Clinical pharmacy services reduced the number of phone call consults at 90 days (P = .04) but did not have a statistically significant impact on any additional pre-identified clinical outcomes. Conclusion: A pharmacist-driven clinical COPD bundle was associated with significant increases in the completion and documentation of 4 metrics known to improve the outpatient management of COPD.

  12. A framework for quantification of groundwater dynamics - redundancy and transferability of hydro(geo-)logical metrics

    NASA Astrophysics Data System (ADS)

    Heudorfer, Benedikt; Haaf, Ezra; Barthel, Roland; Stahl, Kerstin

    2017-04-01

    A new framework for quantification of groundwater dynamics has been proposed in a companion study (Haaf et al., 2017). In this framework, a number of conceptual aspects of dynamics, such as seasonality, regularity, flashiness or inter-annual forcing, are described and then linked to quantitative metrics. A large number of possible metrics are readily available in the literature, such as Pardé coefficients, Colwell's predictability indices or the Base Flow Index. In the present work, we focus on finding multicollinearity, and in consequence redundancy, among the metrics representing different patterns of dynamics found in groundwater hydrographs. This also serves to verify the categories of dynamics aspects suggested by Haaf et al. (2017). To determine the optimal set of metrics, we need to balance the desired minimum number of metrics against their desired maximum descriptive power. To do this, a substantial number of candidate metrics are applied to a diverse set of groundwater hydrographs from France, Germany and Austria within the northern alpine and peri-alpine region. By applying Principal Component Analysis (PCA) to the correlation matrix of the metrics, we determine a limited number of relevant metrics that describe the majority of variation in the dataset. The resulting reduced set of metrics comprises an optimized set that can be used to describe the aspects of dynamics identified within the groundwater dynamics framework. For some aspects of dynamics a single significant metric could be attributed; other aspects have a fuzzier quality that can only be described by an ensemble of metrics and are re-evaluated. The PCA is furthermore applied to groups of groundwater hydrographs containing regimes of similar behaviour, in order to explore transferability when applying the metric-based characterization framework to groups of hydrographs from diverse groundwater systems. 
In conclusion, we identify an optimal number of metrics, which are readily available for usage in studies on groundwater dynamics, intended to help overcome analytical limitations that exist due to the complexity of groundwater dynamics. Haaf, E., Heudorfer, B., Stahl, K., Barthel, R., 2017. A framework for quantification of groundwater dynamics - concepts and hydro(geo-)logical metrics. EGU General Assembly 2017, Vienna, Austria.
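The redundancy screening described above can be sketched with PCA on the correlation matrix of a toy metric table. The metric names and data below are hypothetical stand-ins, chosen so that two "metrics" deliberately measure the same underlying signal:

```python
import numpy as np

rng = np.random.default_rng(42)
n_wells = 200
seasonal = rng.normal(size=n_wells)  # shared underlying seasonality signal

# Toy metric table: two near-redundant seasonality metrics, one independent one.
metrics = np.column_stack([
    seasonal + 0.05 * rng.normal(size=n_wells),  # e.g. a Parde-type coefficient
    seasonal + 0.05 * rng.normal(size=n_wells),  # a second seasonality measure
    rng.normal(size=n_wells),                    # e.g. a flashiness measure
])

corr = np.corrcoef(metrics, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigenvalues, descending
explained = eigvals / eigvals.sum()       # variance explained per component
```

A near-zero trailing eigenvalue flags multicollinearity: one of the two seasonality metrics can be dropped with almost no loss of descriptive power, which is exactly the reduction the study performs on its candidate pool.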

  13. Introduction to the special collection of papers on the San Luis Basin Sustainability Metrics Project: a methodology for evaluating regional sustainability.

    PubMed

    Heberling, Matthew T; Hopton, Matthew E

    2012-11-30

    This paper introduces a collection of four articles describing the San Luis Basin Sustainability Metrics Project. The Project developed a methodology for evaluating regional sustainability. This introduction provides the necessary background information for the project, description of the region, overview of the methods, and summary of the results. Although there are a multitude of scientifically based sustainability metrics, many are data intensive, difficult to calculate, and fail to capture all aspects of a system. We wanted to see if we could develop an approach that decision-makers could use to understand if their system was moving toward or away from sustainability. The goal was to produce a scientifically defensible, but straightforward and inexpensive methodology to measure and monitor environmental quality within a regional system. We initiated an interdisciplinary pilot project in the San Luis Basin, south-central Colorado, to test the methodology. The objectives were: 1) determine the applicability of using existing datasets to estimate metrics of sustainability at a regional scale; 2) calculate metrics through time from 1980 to 2005; and 3) compare and contrast the results to determine if the system was moving toward or away from sustainability. The sustainability metrics, chosen to represent major components of the system, were: 1) Ecological Footprint to capture the impact and human burden on the system; 2) Green Net Regional Product to represent economic welfare; 3) Emergy to capture the quality-normalized flow of energy through the system; and 4) Fisher information to capture the overall dynamic order and to look for possible regime changes. The methodology, data, and results of each metric are presented in the remaining four papers of the special collection. Based on the results of each metric and our criteria for understanding the sustainability trends, we find that the San Luis Basin is moving away from sustainability. 
Although we understand there are strengths and limitations of the methodology, we argue that each metric identifies changes to major components of the system. Published by Elsevier Ltd.

  14. Measuring economic complexity of countries and products: which metric to use?

    NASA Astrophysics Data System (ADS)

    Mariani, Manuel Sebastian; Vidmer, Alexandre; Medo, Matúš; Zhang, Yi-Cheng

    2015-11-01

    Evaluating the economies of countries and their relations with products in the global market is a central problem in economics, with far-reaching implications for our theoretical understanding of international trade as well as for practical applications, such as policy making and financial investment planning. The recent Economic Complexity approach aims to quantify the competitiveness of countries and the quality of the exported products based on the empirical observation that the most competitive countries have diversified exports, whereas developing countries export only a few low-quality products - typically those exported by many other countries. Two different metrics, Fitness-Complexity and the Method of Reflections, have been proposed to measure country and product scores in the Economic Complexity framework. We use international trade data and a recent ranking evaluation measure to quantitatively compare the ability of the two metrics to rank countries and products according to their importance in the network. The results show that the Fitness-Complexity metric outperforms the Method of Reflections in both the ranking of products and the ranking of countries. We also investigate a generalization of the Fitness-Complexity metric and show that it can produce improved rankings provided that the input data are reliable.
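The Fitness-Complexity iteration can be sketched on a toy nested export matrix. This follows the standard formulation of the algorithm (coupled nonlinear maps with mean normalization); the small floor on the values is a numerical convenience added here, since low-fitness values shrink toward zero as the iteration proceeds:

```python
import numpy as np

def fitness_complexity(M, n_iter=30, floor=1e-12):
    """Fitness-Complexity iteration on a binary country x product export matrix."""
    F = np.ones(M.shape[0])  # country fitness
    Q = np.ones(M.shape[1])  # product complexity
    for _ in range(n_iter):
        F_new = M @ Q                    # diversification, weighted by product quality
        Q_new = 1.0 / (M.T @ (1.0 / F))  # penalize products exported by weak countries
        F = np.maximum(F_new / F_new.mean(), floor)  # normalize; floor keeps values finite
        Q = np.maximum(Q_new / Q_new.mean(), floor)
    return F, Q

# Nested toy matrix: country 0 exports everything, country 2 only the ubiquitous product.
M = np.array([[1, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1]], dtype=float)
F, Q = fitness_complexity(M)
```

On this nested matrix the iteration recovers the intended ordering: the fully diversified country ranks highest, and the product exported by everyone, including the weakest country, receives the lowest complexity.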

  15. Graph Theoretical Analysis of Functional Brain Networks: Test-Retest Evaluation on Short- and Long-Term Resting-State Functional MRI Data

    PubMed Central

    Wang, Jin-Hui; Zuo, Xi-Nian; Gohel, Suril; Milham, Michael P.; Biswal, Bharat B.; He, Yong

    2011-01-01

    Graph-based computational network analysis has proven a powerful tool to quantitatively characterize functional architectures of the brain. However, the test-retest (TRT) reliability of graph metrics of functional networks has not been systematically examined. Here, we investigated the TRT reliability of topological metrics of functional brain networks derived from resting-state functional magnetic resonance imaging data. Specifically, we evaluated both short-term (<1 hour apart) and long-term (>5 months apart) TRT reliability for 12 global and 6 local nodal network metrics. We found that reliability of global network metrics was overall low, threshold-sensitive and dependent on several factors: scanning time interval (TI, long-term>short-term), network membership (NM, networks excluding negative correlations>networks including negative correlations) and network type (NT, binarized networks>weighted networks). This dependence was modulated by a further factor, node definition (ND) strategy. Local nodal reliability exhibited large variability across nodal metrics and a spatially heterogeneous distribution. Nodal degree was the most reliable metric and varied the least across the factors above. Hub regions in association and limbic/paralimbic cortices showed moderate TRT reliability. Importantly, nodal reliability was robust to the four factors mentioned above. Simulation analysis revealed that global network metrics were extremely sensitive (though to varying degrees) to noise in functional connectivity, and that weighted networks generated numerically more reliable results compared with binarized networks. Nodal network metrics showed high resistance to noise in functional connectivity, and no NT-related differences were found in this resistance. These findings have important implications for how to choose reliable analytical schemes and network metrics of interest. PMID:21818285
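As a simplified stand-in for the reliability analysis above (TRT studies of this kind typically use intraclass correlation coefficients, which are richer than this), a between-session agreement of a nodal metric can be sketched as a plain correlation; the subject values below are hypothetical:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sessions of a metric."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

# Nodal degree of one region for 6 subjects at scan 1 and scan 2 (toy values).
scan1 = [12.0, 15.0, 9.0, 20.0, 11.0, 17.0]
scan2 = [13.0, 14.0, 10.0, 19.0, 12.0, 18.0]
r = pearson_r(scan1, scan2)
```

A metric whose between-subject ordering is preserved across sessions scores near 1; unstable metrics drift toward 0, which is the pattern the study reports for many global metrics.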

  16. Flight Tasks and Metrics to Evaluate Laser Eye Protection in Flight Simulators

    DTIC Science & Technology

    2017-07-07

    AFRL-RH-FS-TR-2017-0026: Flight Tasks and Metrics to Evaluate Laser Eye Protection in Flight Simulators. Thomas K. Kuyk; Peter A. Smith; Solangia … "Flight Tasks and Metrics to Evaluate Laser Eye Protection in Flight Simulators" (AFRL-RH-FS-TR-2017-0026) … Contract number FA8650-14-D-6519

  17. Person Re-Identification via Distance Metric Learning With Latent Variables.

    PubMed

    Sun, Chong; Wang, Dong; Lu, Huchuan

    2017-01-01

    In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to the latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning an effective metric matrix, which can be solved in an iterative manner: once the latent information is specified, the metric matrix can be obtained based on some typical metric learning methods; with the computed metric matrix, the latent variables can be determined by searching the state space exhaustively. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.

  18. Metrication report to the Congress

    NASA Technical Reports Server (NTRS)

    1990-01-01

    The principal NASA metrication activities for FY 1989 were a revision of NASA metric policy and evaluation of the impact of using the metric system of measurement for the design and construction of the Space Station Freedom. Additional studies provided a basis for focusing follow-on activity. In FY 1990, emphasis will shift to implementation of metric policy and development of a long-range metrication plan. The report which follows addresses Policy Development, Planning and Program Evaluation, and Supporting Activities for the past and coming year.

  19. The Effect of Training Data Set Composition on the Performance of a Neural Image Caption Generator

    DTIC Science & Technology

    2017-09-01

    … objects was compared using the Metric for Evaluation of Translation with Explicit Ordering (METEOR) and Consensus-Based Image Description Evaluation … using automated scoring systems. Many such systems exist, including Bilingual Evaluation Understudy (BLEU) and Consensus-Based Image Description Evaluation … shown to be essential to automated scoring, which correlates highly with human precision. CIDEr uses a system of consensus among the captions and

  20. An Extension of BLANC to System Mentions.

    PubMed

    Luo, Xiaoqiang; Pradhan, Sameer; Recasens, Marta; Hovy, Eduard

    2014-06-01

    BLANC is a link-based coreference evaluation metric for measuring the quality of coreference systems on gold mentions. This paper extends the original BLANC ("BLANC-gold" henceforth) to system mentions, removing the gold mention assumption. The proposed BLANC falls back seamlessly to the original one if system mentions are identical to gold mentions, and it is shown to strongly correlate with existing metrics on the 2011 and 2012 CoNLL data.
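The original gold-mention BLANC that the paper extends averages F-scores over coreference links and non-coreference links. The sketch below implements that original formulation on toy clusterings; the system-mention extension that is the paper's contribution is not shown:

```python
from itertools import combinations

def link_sets(clusters):
    """Coreference and non-coreference mention-pair links of a clustering."""
    mentions, coref = set(), set()
    for c in clusters:
        mentions |= set(c)
        coref |= {frozenset(p) for p in combinations(sorted(c), 2)}
    every_pair = {frozenset(p) for p in combinations(sorted(mentions), 2)}
    return coref, every_pair - coref

def f_score(overlap, n_sys, n_gold):
    p = overlap / n_sys if n_sys else 0.0
    r = overlap / n_gold if n_gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def blanc_gold(gold, system):
    """Average of F-scores over coreference and non-coreference links."""
    gc, gn = link_sets(gold)
    sc, sn = link_sets(system)
    return 0.5 * (f_score(len(gc & sc), len(sc), len(gc)) +
                  f_score(len(gn & sn), len(sn), len(gn)))

gold = [["m1", "m2", "m3"], ["m4"]]
system = [["m1", "m2"], ["m3"], ["m4"]]
score = blanc_gold(gold, system)
```

Because both link types contribute equally, BLANC rewards getting singleton and non-coreferent pairs right, not just the coreferent ones, which link-counting metrics alone can miss.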

  1. Application of Support Vector Machine to Forex Monitoring

    NASA Astrophysics Data System (ADS)

    Kamruzzaman, Joarder; Sarker, Ruhul A.

    Previous studies have demonstrated the superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from historical data using six simple technical indicators, and presents a comparison with an ANN-based model trained by the scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results for six different currencies against the Australian dollar reveal the superior performance of the SVM model with a simple linear kernel over the ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.
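The two families of performance metrics mentioned above, closeness of prediction and correctness of directional change, can be illustrated as follows. These particular formulas are illustrative; the abstract does not name all five of the paper's metrics, and the rate series below is hypothetical:

```python
def rmse(actual, predicted):
    """Root-mean-square error: closeness of prediction."""
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5

def directional_accuracy(actual, predicted):
    """Fraction of steps where the forecast moves in the same direction as the rate."""
    hits = sum((a1 - a0) * (p1 - a0) > 0
               for a0, a1, p1 in zip(actual, actual[1:], predicted[1:]))
    return hits / (len(actual) - 1)

# Toy exchange-rate series and one-step-ahead forecasts (hypothetical values).
rate = [1.00, 1.10, 1.05, 1.20]
forecast = [1.00, 1.08, 1.12, 1.30]
closeness = rmse(rate, forecast)
direction = directional_accuracy(rate, forecast)  # 2 of 3 moves predicted correctly
```

The two views can disagree: a forecast can be numerically close yet repeatedly miss the direction of the move, which is why trading-oriented evaluations report both.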

  2. Development of a clinician reputation metric to identify appropriate problem-medication pairs in a crowdsourced knowledge base.

    PubMed

    McCoy, Allison B; Wright, Adam; Rogith, Deevakar; Fathiamini, Safa; Ottenbacher, Allison J; Sittig, Dean F

    2014-04-01

    Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are not robust and depend on problems and medications being encoded in particular terminologies. Crowdsourcing represents one approach to generating robust knowledge bases across a variety of terminologies, but more sophisticated approaches are necessary to improve accuracy and reduce manual data review requirements. We sought to develop and evaluate a clinician reputation metric to facilitate the identification of appropriate problem-medication pairs through crowdsourcing without requiring extensive manual review. We retrieved medications from our clinical data warehouse that had been prescribed and manually linked to one or more problems by clinicians during e-prescribing between June 1, 2010 and May 31, 2011. We identified measures likely to be associated with the percentage of accurate problem-medication links made by clinicians. Using logistic regression, we created a metric for identifying clinicians who had made greater than or equal to 95% appropriate links. We evaluated the accuracy of the approach by comparing links made by those physicians identified as having appropriate links to a previously manually validated subset of problem-medication pairs. Of 867 clinicians who asserted a total of 237,748 problem-medication links during the study period, 125 had a reputation metric that predicted the percentage of appropriate links greater than or equal to 95%. These clinicians asserted a total of 2464 linked problem-medication pairs (983 distinct pairs). Compared to a previously validated set of problem-medication pairs, the reputation metric achieved a specificity of 99.5% and marginally improved the sensitivity of previously described knowledge bases. 
A reputation metric may be a valuable measure for identifying high quality clinician-entered, crowdsourced data. Copyright © 2013 Elsevier Inc. All rights reserved.
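
    The paper's exact predictors and coefficients are not given here, but the selection step it describes, using a fitted logistic model to flag clinicians whose predicted share of appropriate links is at least 95%, might look like this sketch (all names, features, and coefficients are hypothetical):

```python
import math

def predicted_share_appropriate(features, coefs, intercept):
    """Logistic-model estimate of a clinician's share of appropriate links."""
    z = intercept + sum(c * x for c, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))

def trusted_clinicians(clinicians, coefs, intercept, threshold=0.95):
    """Keep clinicians whose predicted share of appropriate links
    meets the threshold; their links feed the crowdsourced knowledge base."""
    return [
        cid for cid, feats in clinicians.items()
        if predicted_share_appropriate(feats, coefs, intercept) >= threshold
    ]
```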

  3. Development of a clinician reputation metric to identify appropriate problem-medication pairs in a crowdsourced knowledge base

    PubMed Central

    McCoy, Allison B.; Wright, Adam; Rogith, Deevakar; Fathiamini, Safa; Ottenbacher, Allison J.; Sittig, Dean F.

    2014-01-01

    Background: Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are not robust and depend on problems and medications being encoded in particular terminologies. Crowdsourcing represents one approach to generating robust knowledge bases across a variety of terminologies, but more sophisticated approaches are necessary to improve accuracy and reduce manual data review requirements. Objective: We sought to develop and evaluate a clinician reputation metric to facilitate the identification of appropriate problem-medication pairs through crowdsourcing without requiring extensive manual review. Approach: We retrieved medications from our clinical data warehouse that had been prescribed and manually linked to one or more problems by clinicians during e-prescribing between June 1, 2010 and May 31, 2011. We identified measures likely to be associated with the percentage of accurate problem-medication links made by clinicians. Using logistic regression, we created a metric for identifying clinicians who had made greater than or equal to 95% appropriate links. We evaluated the accuracy of the approach by comparing links made by those physicians identified as having appropriate links to a previously manually validated subset of problem-medication pairs. Results: Of 867 clinicians who asserted a total of 237,748 problem-medication links during the study period, 125 had a reputation metric that predicted the percentage of appropriate links greater than or equal to 95%. These clinicians asserted a total of 2464 linked problem-medication pairs (983 distinct pairs). Compared to a previously validated set of problem-medication pairs, the reputation metric achieved a specificity of 99.5% and marginally improved the sensitivity of previously described knowledge bases. Conclusion: A reputation metric may be a valuable measure for identifying high quality clinician-entered, crowdsourced data. PMID:24321170

  4. A comparative study of multi-focus image fusion validation metrics

    NASA Astrophysics Data System (ADS)

    Giansiracusa, Michael; Lutz, Adam; Messer, Neal; Ezekiel, Soundararajan; Alford, Mark; Blasch, Erik; Bubalo, Adnan; Manno, Michael

    2016-05-01

    Fusion of visual information from multiple sources is relevant for security, transportation, and safety applications. Image fusion can be particularly useful when fusing imagery captured at multiple levels of focus. Different focus levels create different visual qualities in different regions of the imagery, so fusing them can provide much more visual information to analysts. Multi-focus image fusion would benefit users through automation, which requires evaluating the fused images to determine whether the focused regions of each image have been properly fused. Many no-reference metrics, such as information-theory-based, image-feature-based, and structural-similarity-based metrics, have been developed for such comparisons. However, accurate assessment of visual quality is hard to scale, which requires validating these metrics for different types of applications. To this end, human-perception-based validation methods have been developed, particularly using receiver operating characteristic (ROC) curves and the area under them (AUC). Our study uses these to analyze the effectiveness of no-reference image fusion metrics applied to multi-resolution fusion methods, in order to determine which should be used when dealing with multi-focus data. Preliminary results show that the Tsallis, SF, and spatial frequency metrics are consistent with image quality and peak signal-to-noise ratio (PSNR).
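
    As a point of reference for the validation above, the peak signal-to-noise ratio (PSNR) against which the fusion metrics were compared can be computed as follows (a standard definition, not code from the study; images are flattened to pixel lists for brevity):

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between a reference and a test image."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```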

  5. A New Metric for Land-Atmosphere Coupling Strength: Applications on Observations and Modeling

    NASA Astrophysics Data System (ADS)

    Tang, Q.; Xie, S.; Zhang, Y.; Phillips, T. J.; Santanello, J. A., Jr.; Cook, D. R.; Riihimaki, L.; Gaustad, K.

    2017-12-01

    A new metric is proposed to quantify land-atmosphere (LA) coupling strength; it is formulated by correlating the surface evaporative fraction with the land and atmosphere variables that influence it (e.g., soil moisture, vegetation, and radiation). Based on multiple linear regression, this approach considers multiple factors simultaneously and thus represents complex LA coupling mechanisms better than existing single-variable metrics. The standardized regression coefficients quantify the relative contributions of individual drivers in a consistent manner, avoiding the potential inconsistency in relative influence of conventional metrics. Moreover, the expandable nature of the new method allows us to verify and explore potentially important coupling mechanisms. Our observation-based application of the new metric shows moderate coupling, with large spatial variations, at the U.S. Southern Great Plains. The relative importance of soil moisture vs. vegetation varies by location. We also show that LA coupling strength is generally underestimated by single-variable methods because of their incompleteness. Finally, we apply the new metric to evaluate the representation of LA coupling in the Accelerated Climate Modeling for Energy (ACME) V1 Contiguous United States (CONUS) regionally refined model (RRM). This work is performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-734201
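
    The regression-based coupling strength described above can be illustrated for the two-predictor case, where standardized coefficients follow directly from pairwise correlations (a textbook formulation, not the authors' code; variable names are illustrative):

```python
def corr(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def standardized_betas(y, x1, x2):
    """Standardized coefficients of a two-predictor linear regression:
    the relative contributions of each driver to the response."""
    r_y1, r_y2, r_12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom
```

    If the response depends only on the first driver and the two drivers are uncorrelated, the betas collapse to (1, 0), recovering the single-variable view as a special case.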

  6. Image quality assessment metric for frame accumulated image

    NASA Astrophysics Data System (ADS)

    Yu, Jianping; Li, Gang; Wang, Shaohui; Lin, Ling

    2018-01-01

    Medical image quality determines the accuracy of diagnosis, and gray-scale resolution is an important parameter of image quality. Current objective metrics, however, are not well suited to assessing medical images obtained by frame-accumulation technology: they pay little attention to gray-scale resolution, are based largely on spatial resolution, and are limited to the 256-level gray scale of existing display devices. This paper therefore proposes a signal-to-noise-based metric, the "mean signal-to-noise ratio" (MSNR), as a more reasonable way to evaluate the quality of frame-accumulated medical images. We demonstrate its potential application through a series of images captured under a constant illumination signal. The mean of a sufficiently large number of images was taken as the reference image, several groups of images were produced by different amounts of frame accumulation, and their MSNR values were calculated. The experimental results show that, compared with other quality assessment methods, the metric is simpler, more effective, and more suitable for assessing frame-accumulated images whose gray scale and precision surpass those of the original image.
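
    The paper's precise MSNR formula is not reproduced here; a plausible reading, frame accumulation followed by a mean signal-to-noise ratio against a many-frame reference, can be sketched as follows (the function names and the SNR form are assumptions):

```python
def accumulate(frames):
    """Average a list of frames (each a flat list of pixel values):
    accumulation suppresses zero-mean noise."""
    n = len(frames)
    return [sum(px) / n for px in zip(*frames)]

def mean_snr(image, reference):
    """Signal-to-noise ratio of an image against a noise-free reference,
    with noise power taken from the pixel-wise deviation."""
    noise = [(i - r) ** 2 for i, r in zip(image, reference)]
    mse = sum(noise) / len(noise)
    signal = sum(r ** 2 for r in reference) / len(reference)
    return signal / mse if mse > 0 else float("inf")
```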

  7. Blind Source Parameters for Performance Evaluation of Despeckling Filters.

    PubMed

    Biradar, Nagashettappa; Dewal, M L; Rohit, ManojKumar; Gowre, Sanjaykumar; Gundge, Yogesh

    2016-01-01

    Speckle noise is inherent to transthoracic echocardiographic images, and a standard noise-free reference echocardiographic image does not exist. The evaluation of filters based on traditional parameters such as peak signal-to-noise ratio, mean square error, and the structural similarity index may therefore not reflect true filter performance on echocardiographic images. Instead, despeckling performance can be evaluated using blind assessment metrics such as the speckle suppression index, the speckle suppression and mean preservation index (SMPI), and the beta metric, which remove the need for a noise-free reference image. This paper presents a comprehensive analysis and evaluation of eleven types of despeckling filters for echocardiographic images in terms of blind and traditional performance parameters, along with clinical validation. Noise is effectively suppressed using logarithmic neighborhood shrinkage (NeighShrink) embedded with Stein's unbiased risk estimation (SURE), which is three times more effective by SMPI than the wavelet-based generalized likelihood estimation approach. The quantitative evaluation and clinical validation reveal that the nonlocal mean, posterior sampling based Bayesian estimation, hybrid median, and probabilistic patch based filters are acceptable, whereas the median, anisotropic diffusion, fuzzy, and Ripplet nonlinear approximation filters have limited applications for echocardiographic images.

  8. Blind Source Parameters for Performance Evaluation of Despeckling Filters

    PubMed Central

    Biradar, Nagashettappa; Dewal, M. L.; Rohit, ManojKumar; Gowre, Sanjaykumar; Gundge, Yogesh

    2016-01-01

    Speckle noise is inherent to transthoracic echocardiographic images, and a standard noise-free reference echocardiographic image does not exist. The evaluation of filters based on traditional parameters such as peak signal-to-noise ratio, mean square error, and the structural similarity index may therefore not reflect true filter performance on echocardiographic images. Instead, despeckling performance can be evaluated using blind assessment metrics such as the speckle suppression index, the speckle suppression and mean preservation index (SMPI), and the beta metric, which remove the need for a noise-free reference image. This paper presents a comprehensive analysis and evaluation of eleven types of despeckling filters for echocardiographic images in terms of blind and traditional performance parameters, along with clinical validation. Noise is effectively suppressed using logarithmic neighborhood shrinkage (NeighShrink) embedded with Stein's unbiased risk estimation (SURE), which is three times more effective by SMPI than the wavelet-based generalized likelihood estimation approach. The quantitative evaluation and clinical validation reveal that the nonlocal mean, posterior sampling based Bayesian estimation, hybrid median, and probabilistic patch based filters are acceptable, whereas the median, anisotropic diffusion, fuzzy, and Ripplet nonlinear approximation filters have limited applications for echocardiographic images. PMID:27298618

  9. Voxel-based statistical analysis of uncertainties associated with deformable image registration

    NASA Astrophysics Data System (ADS)

    Li, Shunshan; Glide-Hurst, Carri; Lu, Mei; Kim, Jinkoo; Wen, Ning; Adams, Jeffrey N.; Gordon, James; Chetty, Indrin J.; Zhong, Hualiang

    2013-09-01

    Deformable image registration (DIR) algorithms have inherent uncertainties in their displacement vector fields (DVFs). The purpose of this study is to develop an optimal metric to estimate DIR uncertainties. Six computational phantoms were developed from the CT images of lung cancer patients using a finite element method (FEM). The FEM-generated DVFs were used as a standard for registrations performed on each of these phantoms. A mechanics-based metric, unbalanced energy (UE), was developed to evaluate the registration DVFs. The potential correlation between UE and DIR errors was explored using multivariate analysis, and the results were validated by a landmark approach and compared with two other error metrics: DVF inverse consistency (IC) and image intensity difference (ID). Landmark-based validation was performed using the POPI model. The results show that the Pearson correlation coefficient between UE and DIR error is r(UE, error) = 0.50. This is higher than r(IC, error) = 0.29 for IC and r(ID, error) = 0.37 for ID. The Pearson correlation coefficient between UE and the product of the DIR displacements and errors is r(UE, error × DVF) = 0.62 for the six patients and 0.73 for the POPI-model data. UE thus has a strong correlation with DIR errors, and the UE metric outperforms the IC and ID metrics in estimating DIR uncertainties. The quantified UE metric can be a useful tool for adaptive treatment strategies, including probability-based adaptive treatment planning.

  10. A comparison of spectral decorrelation techniques and performance evaluation metrics for a wavelet-based, multispectral data compression algorithm

    NASA Technical Reports Server (NTRS)

    Matic, Roy M.; Mosley, Judith I.

    1994-01-01

    Future space-based, remote sensing systems will have data transmission requirements that exceed available downlinks necessitating the use of lossy compression techniques for multispectral data. In this paper, we describe several algorithms for lossy compression of multispectral data which combine spectral decorrelation techniques with an adaptive, wavelet-based, image compression algorithm to exploit both spectral and spatial correlation. We compare the performance of several different spectral decorrelation techniques including wavelet transformation in the spectral dimension. The performance of each technique is evaluated at compression ratios ranging from 4:1 to 16:1. Performance measures used are visual examination, conventional distortion measures, and multispectral classification results. We also introduce a family of distortion metrics that are designed to quantify and predict the effect of compression artifacts on multi spectral classification of the reconstructed data.

  11. Health impact metrics for air pollution management strategies.

    PubMed

    Martenies, Sheena E; Wilkins, Donele; Batterman, Stuart A

    2015-12-01

    Health impact assessments (HIAs) inform policy and decision making by providing information regarding future health concerns, and quantitative HIAs are now being used for local and urban-scale projects. HIA results can be expressed using a variety of metrics that differ in meaningful ways, and guidance is lacking with respect to best practices for the development and use of HIA metrics. This study reviews HIA metrics pertaining to air quality management and presents evaluative criteria for their selection and use. These are illustrated in a case study where PM2.5 concentrations are lowered from 10 to 8 µg/m³ in an urban area of 1.8 million people. Health impact functions are used to estimate the number of premature deaths, unscheduled hospitalizations, and other morbidity outcomes. The most common metric in recent quantitative HIAs has been the number of cases of adverse outcomes avoided. Other metrics include time-based measures, e.g., disability-adjusted life years (DALYs); monetized impacts; functional-unit-based measures, e.g., benefits per ton of emissions reduced; and other economic indicators, e.g., cost-benefit ratios. These metrics are evaluated by considering their comprehensiveness, the spatial and temporal resolution of the analysis, how equity considerations are facilitated, and the analysis and presentation of uncertainty. In the case study, the greatest number of avoided cases occurs for low-severity morbidity outcomes, e.g., asthma exacerbations (n=28,000) and minor-restricted-activity days (n=37,000), while DALYs and monetized impacts are driven by the severity, duration, and value assigned to a relatively low number of premature deaths (n=190 to 230 per year). The selection of appropriate metrics depends on the problem context and boundaries, the severity of impacts, and community values regarding health. The number of avoided cases provides an estimate of the number of people affected, and monetized impacts facilitate additional economic analyses useful to policy analysis. DALYs are commonly used as an aggregate measure of health impacts and can be used to compare impacts across studies. Benefits-per-ton metrics may be appropriate when changes in emissions rates can be estimated. To address community concerns and HIA objectives, a combination of metrics is suggested. Copyright © 2015 Elsevier Ltd. All rights reserved.
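
    The health impact functions mentioned above are conventionally log-linear; a generic form for avoided cases (a standard epidemiological formulation with illustrative parameter names, not figures from the study) is:

```python
import math

def avoided_cases(baseline_rate, beta, delta_c, population):
    """Log-linear health impact function: cases avoided per year when the
    pollutant concentration drops by delta_c, given a baseline incidence
    rate, a concentration-response coefficient beta, and exposed population."""
    return baseline_rate * (1.0 - math.exp(-beta * delta_c)) * population
```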

  12. Zone calculation as a tool for assessing performance outcome in laparoscopic suturing.

    PubMed

    Buckley, Christina E; Kavanagh, Dara O; Nugent, Emmeline; Ryan, Donncha; Traynor, Oscar J; Neary, Paul C

    2015-06-01

    Simulator performance is measured by metrics, which are valued as an objective way of assessing trainees. Certain procedures, such as laparoscopic suturing, however, may not be suitable for assessment under traditionally formulated metrics. Our aim was to assess whether our new metric is a valid method of assessing laparoscopic suturing. A software program was developed in order to create a new metric that calculates the percentage of time spent operating within predefined areas called "zones." Twenty-five candidates (medical students, N = 10; surgical residents, N = 10; and laparoscopic experts, N = 5) performed the laparoscopic suturing task on the ProMIS III® simulator. New metrics of "in-zone" and "out-zone" scores, as well as the traditional metrics of time, path length, and smoothness, were generated. Performance was also assessed by two blinded observers using the OSATS and FLS rating scales. The novel metric was evaluated by comparing it to both the traditional metrics and the subjective scores. There was a significant difference in the average in-zone and out-zone scores between all three experience groups (p < 0.05). The new zone metric scores correlated significantly with the subjective blinded-observer scores of OSATS and FLS (p = 0.0001) and with the traditional metrics of path length, time, and smoothness (p < 0.05). The new metric is a valid tool for assessing laparoscopic suturing objectively and could be incorporated into a competency-based curriculum to monitor resident progression in the simulated setting.
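
    The zone calculation itself reduces to classifying sampled instrument positions against predefined regions; a minimal sketch with rectangular zones (the paper's zone geometry is not specified, so the box representation is an assumption):

```python
def zone_scores(positions, zones):
    """Percentage of sampled instrument positions inside any predefined zone
    (in-zone score) and outside all zones (out-zone score).
    Each zone is an axis-aligned box ((xmin, ymin), (xmax, ymax))."""
    def in_any_zone(p):
        x, y = p
        return any(
            x0 <= x <= x1 and y0 <= y <= y1
            for (x0, y0), (x1, y1) in zones
        )
    inside = sum(1 for p in positions if in_any_zone(p))
    in_pct = 100.0 * inside / len(positions)
    return in_pct, 100.0 - in_pct
```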

  13. Scientist impact factor (SIF): a new metric for improving scientists' evaluation?

    PubMed

    Lippi, Giuseppe; Mattiuzzi, Camilla

    2017-08-01

    The publication of scientific research is the mainstay of knowledge dissemination, but is also an essential criterion in scientists' evaluation for recruiting funds and career progression. Although the most widespread approach for evaluating scientists is currently based on the H-index, the total impact factor (IF), and the overall number of citations, these metrics are plagued by some well-known drawbacks. Therefore, with the aim of improving the process of scientists' evaluation, we developed a new and potentially useful indicator of recent scientific output. The new metric, the scientist impact factor (SIF), was calculated as all citations of articles published in the two years following the publication year of the articles, divided by the overall number of articles published in that year. The metric was then tested by analyzing data for the 40 top scientists of the local university. No correlation was found between SIF and H-index (r=0.15; P=0.367) or 2-year H-index (r=-0.01; P=0.933), whereas the H-index and 2-year H-index values were found to be highly correlated (r=0.57; P<0.001). A highly significant correlation was also observed between the articles published in one year and the total number of citations to these articles in the two following years (r=0.62; P<0.001). According to our data, the SIF may be a useful measure to complement current metrics for evaluating scientific output. Its use may be especially helpful for young scientists, for whom the SIF reflects the scientific output over the past two years, thus increasing their chances of applying for and obtaining competitive funding.
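
    The SIF definition quoted above is concrete enough to compute directly; a sketch (the data layout, mapping each article to its publication year and per-year citation counts, is illustrative):

```python
def scientist_impact_factor(articles, year):
    """SIF for a given publication year: citations received in the two
    following years divided by the number of articles published that year.
    `articles` maps article id -> (pub_year, {citing_year: citations})."""
    pubs = [cites for pub_year, cites in articles.values() if pub_year == year]
    if not pubs:
        return 0.0
    citations = sum(
        cites.get(year + 1, 0) + cites.get(year + 2, 0) for cites in pubs
    )
    return citations / len(pubs)
```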

  14. Scientist impact factor (SIF): a new metric for improving scientists’ evaluation?

    PubMed Central

    Mattiuzzi, Camilla

    2017-01-01

    Background: The publication of scientific research is the mainstay of knowledge dissemination, but is also an essential criterion in scientists' evaluation for recruiting funds and career progression. Although the most widespread approach for evaluating scientists is currently based on the H-index, the total impact factor (IF), and the overall number of citations, these metrics are plagued by some well-known drawbacks. Therefore, with the aim of improving the process of scientists' evaluation, we developed a new and potentially useful indicator of recent scientific output. Methods: The new metric, the scientist impact factor (SIF), was calculated as all citations of articles published in the two years following the publication year of the articles, divided by the overall number of articles published in that year. The metric was then tested by analyzing data for the 40 top scientists of the local university. Results: No correlation was found between SIF and H-index (r=0.15; P=0.367) or 2-year H-index (r=−0.01; P=0.933), whereas the H-index and 2-year H-index values were found to be highly correlated (r=0.57; P<0.001). A highly significant correlation was also observed between the articles published in one year and the total number of citations to these articles in the two following years (r=0.62; P<0.001). Conclusions: According to our data, the SIF may be a useful measure to complement current metrics for evaluating scientific output. Its use may be especially helpful for young scientists, for whom the SIF reflects the scientific output over the past two years, thus increasing their chances of applying for and obtaining competitive funding. PMID:28856143

  15. Development of a multimetric index for assessing the biological condition of the Ohio River

    USGS Publications Warehouse

    Emery, E.B.; Simon, T.P.; McCormick, F.H.; Angermeier, P.L.; Deshon, J.E.; Yoder, C.O.; Sanders, R.E.; Pearson, W.D.; Hickman, G.D.; Reash, R.J.; Thomas, J.A.

    2003-01-01

    The use of fish communities to assess environmental quality is common for streams, but a standard methodology for large rivers is as yet largely undeveloped. We developed an index to assess the condition of fish assemblages along 1,580 km of the Ohio River. Representative samples of fish assemblages were collected from 709 Ohio River reaches, including 318 "least-impacted" sites, from 1991 to 2001 by means of standardized nighttime boat-electrofishing techniques. We evaluated 55 candidate metrics based on attributes of fish assemblage structure and function to derive a multimetric index of river health. We examined the spatial (by river kilometer) and temporal variability of these metrics and assessed their responsiveness to anthropogenic disturbances, namely, effluents, turbidity, and highly embedded substrates. The resulting Ohio River Fish Index (ORFIn) comprises 13 metrics selected because they responded predictably to measures of human disturbance or reflected desirable features of the Ohio River. We retained two metrics (the number of intolerant species and the number of sucker species [family Catostomidae]) from Karr's original index of biotic integrity. Six metrics were modified from indices developed for the upper Ohio River (the number of native species; number of great-river species; number of centrarchid species; the number of deformities, eroded fins and barbels, lesions, and tumors; percent individuals as simple lithophils; and percent individuals as tolerant species). We also incorporated three trophic metrics (the percent of individuals as detritivores, invertivores, and piscivores), one metric based on catch per unit effort, and one metric based on the percent of individuals as nonindigenous fish species. The ORFIn declined significantly where anthropogenic effects on substrate and water quality were prevalent and was significantly lower in the first 500 m below point source discharges than at least-impacted sites nearby. 
Although additional research on the temporal stability of the metrics and index will likely enhance the reliability of the ORFIn, its incorporation into Ohio River assessments still represents an improvement over current physicochemical protocols.
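
    The abstract does not give the ORFIn scoring rules, but IBI-style indices typically scale each metric against least-impacted reference expectations and sum the scores; a generic sketch (the bounds, directions, and 0-10 scaling are assumptions, not the published calibration):

```python
def score_metric(value, ref_low, ref_high, decreases_with_disturbance=True):
    """Scale a raw metric value to 0-10 against least-impacted reference
    bounds; metrics that rise with disturbance (e.g., % tolerant
    individuals) are inverted so higher scores always mean better condition."""
    frac = (value - ref_low) / (ref_high - ref_low)
    if not decreases_with_disturbance:
        frac = 1.0 - frac
    return 10.0 * min(1.0, max(0.0, frac))

def multimetric_index(metric_values, reference_bounds, directions):
    """Sum of scaled metric scores, as in IBI-style indices."""
    return sum(
        score_metric(metric_values[m], *reference_bounds[m], directions[m])
        for m in metric_values
    )
```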

  16. Process-oriented Observational Metrics for CMIP6 Climate Model Assessments

    NASA Astrophysics Data System (ADS)

    Jiang, J. H.; Su, H.

    2016-12-01

    Observational metrics based on satellite observations have been developed and effectively applied during post-CMIP5 model evaluation and improvement projects. As new physics and parameterizations continue to be included in models for the upcoming CMIP6, it is important to continue objective comparisons between observations and model results. This talk summarizes the process-oriented observational metrics and methodologies for constraining climate models with A-Train satellite observations in support of CMIP6 model assessments. We target parameters and processes related to atmospheric clouds and water vapor, which are critically important for Earth's radiative budget, climate feedbacks, and water and energy cycles, and whose improved representation reduces uncertainties in climate models.

  17. Accounting for regional variation in both natural environment and human disturbance to improve performance of multimetric indices of lotic benthic diatoms.

    PubMed

    Tang, Tao; Stevenson, R Jan; Infante, Dana M

    2016-10-15

    Regional variation in both the natural environment and human disturbance can influence the performance of ecological assessments. In this study we calculated 5 types of benthic diatom multimetric indices (MMIs), using 3 different approaches to account for such variation: site groups defined by ecoregions or by diatom typologies; the same or different sets of metrics among site groups; and unmodeled or modeled MMIs, where models accounted for natural variation in metrics within site groups by calculating an expected reference condition for each metric at each site. We used data from the USEPA's National Rivers and Streams Assessment to calculate the MMIs and evaluate changes in MMI performance. Performance was evaluated with indices of precision, bias, responsiveness, sensitivity, and relevancy, measured respectively as MMI variation among reference sites, effects of natural variables on MMIs, the difference between MMIs at reference and highly disturbed sites, the percentage of highly disturbed sites properly classified, and the relation of MMIs to human disturbance and stressors. All 5 types of MMIs showed considerable discrimination ability. Using different metrics among ecoregions sometimes reduced precision, but it consistently increased responsiveness, sensitivity, and relevancy. Site-specific metric modeling reduced bias and increased responsiveness. Combined use of different metrics among site groups and site-specific modeling significantly improved MMI performance irrespective of the site-grouping approach. Compared to ecoregion site classification, grouping sites by diatom typologies improved precision, but did not improve overall MMI performance once natural variation in metrics was accounted for with site-specific models. We conclude that using different metrics among ecoregions and site-specific metric modeling improve MMI performance, particularly when used together. Applying these MMI approaches in ecological assessments introduces a tradeoff with assessment consistency when metrics differ across site groups, but it justifies the convenient and consistent use of ecoregions. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. CUQI: cardiac ultrasound video quality index

    PubMed Central

    Razaak, Manzoor; Martini, Maria G.

    2016-01-01

    Medical images and videos are now increasingly part of modern telecommunication applications, including telemedicine, favored by advancements in video compression and communication technologies. Medical video quality evaluation is essential for modern applications, since compression and transmission processes often compromise video quality. Several state-of-the-art video quality metrics assess the perceptual quality of the video; for a medical video, however, assessing quality in terms of "diagnostic" value rather than "perceptual" quality is more important. We present a diagnostic-quality-oriented video quality metric for quality evaluation of cardiac ultrasound videos, which are characterized by rapid repetitive cardiac motions and distinct structural information characteristics that the proposed metric exploits. The cardiac ultrasound video quality index (CUQI) is a full-reference metric that uses the motion and edge information of the cardiac ultrasound video to evaluate video quality. The metric was evaluated for its performance in approximating the quality of cardiac ultrasound videos by testing its correlation with the subjective scores of medical experts. The results of our tests showed that the metric has high correlation with medical expert opinions and in several cases outperforms the state-of-the-art video quality metrics considered in our tests. PMID:27014715

  19. Scale-dependent complementarity of climatic velocity and environmental diversity for identifying priority areas for conservation under climate change.

    PubMed

    Carroll, Carlos; Roberts, David R; Michalak, Julia L; Lawler, Joshua J; Nielsen, Scott E; Stralberg, Diana; Hamann, Andreas; McRae, Brad H; Wang, Tongli

    2017-11-01

    As most regions of the earth transition to altered climatic conditions, new methods are needed to identify refugia and other areas whose conservation would facilitate persistence of biodiversity under climate change. We compared several common approaches to conservation planning focused on climate resilience over a broad range of ecological settings across North America and evaluated how commonalities in the priority areas identified by different methods varied with regional context and spatial scale. Our results indicate that priority areas based on different environmental diversity metrics differed substantially from each other and from priorities based on spatiotemporal metrics such as climatic velocity. Refugia identified by diversity or velocity metrics were not strongly associated with the current protected area system, suggesting the need for additional conservation measures including protection of refugia. Despite the inherent uncertainties in predicting future climate, we found that variation among climatic velocities derived from different general circulation models and emissions pathways was less than the variation among the suite of environmental diversity metrics. To address uncertainty created by this variation, planners can combine priorities identified by alternative metrics at a single resolution and downweight areas of high variation between metrics. Alternately, coarse-resolution velocity metrics can be combined with fine-resolution diversity metrics in order to leverage the respective strengths of the two groups of metrics as tools for identification of potential macro- and microrefugia that in combination maximize both transient and long-term resilience to climate change. 
Planners should compare and integrate approaches that span a range of model complexity and spatial scale to match the range of ecological and physical processes influencing persistence of biodiversity and identify a conservation network resilient to threats operating at multiple scales. © 2017 The Authors. Global Change Biology Published by John Wiley & Sons Ltd.

  20. Task-based detectability comparison of exponential transformation of free-response operating characteristic (EFROC) curve and channelized Hotelling observer (CHO)

    NASA Astrophysics Data System (ADS)

    Khobragade, P.; Fan, Jiahua; Rupcich, Franco; Crotty, Dominic J.; Gilat Schmidt, Taly

    2016-03-01

    This study quantitatively evaluated the performance of the exponential transformation of the free-response operating characteristic curve (EFROC) metric, with the channelized Hotelling observer (CHO) as a reference. The CHO has been used for image quality assessment of reconstruction algorithms and imaging systems, and is often applied to signal-location-known cases. The CHO also requires a large set of images to estimate the covariance matrix. For clinical applications, this assumption and requirement may be unrealistic. The newly developed location-unknown EFROC detectability metric is estimated from the confidence scores reported by a model observer. Unlike the CHO, EFROC does not require a channelization step and is a non-parametric detectability metric. There are few quantitative studies available on the application of the EFROC metric, most of which are based on simulation data. This study investigated the EFROC metric using experimental CT data. A phantom with four low-contrast objects, 3 mm (14 HU), 5 mm (7 HU), 7 mm (5 HU), and 10 mm (3 HU), was scanned at dose levels ranging from 25 mAs to 270 mAs and reconstructed using filtered backprojection. The area under the curve values for the CHO (AUC) and EFROC (AFE) were plotted with respect to the different dose levels. The number of images required to estimate the non-parametric AFE metric was calculated for varying tasks and found to be less than the number of images required for parametric CHO estimation. The AFE metric was found to be more sensitive to changes in dose than the CHO metric. This increased sensitivity and the assumption of unknown signal location may be useful for investigating and optimizing CT imaging methods. Future work is required to validate the AFE metric against human observers.
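
    A minimal numerical sketch of the CHO computation referenced above may help make the comparison concrete. The channel set (simple difference-of-Gaussians channels), lesion, and white-noise model below are invented stand-ins, not the phantom or scanner data used in the study; only the Hotelling template and the SNR-to-AUC conversion follow the standard CHO formulation.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def dog_channels(n, n_channels=4):
    # Toy difference-of-Gaussians frequency channels (an assumption;
    # the study's actual channel set is not specified in the abstract).
    y, x = np.mgrid[:n, :n] - n // 2
    r = np.hypot(x, y)
    chans = []
    for k in range(n_channels):
        s1, s2 = 2.0 * 2**k, 2.0 * 2**(k + 1)
        u = np.exp(-r**2 / (2 * s1**2)) - np.exp(-r**2 / (2 * s2**2))
        chans.append(u.ravel() / np.linalg.norm(u))
    return np.array(chans).T  # shape (n*n, n_channels)

def cho_auc(sig_imgs, noise_imgs, U):
    # Channelize, build the Hotelling template from class means and the
    # pooled channel covariance, then convert detectability SNR to AUC.
    vs = sig_imgs.reshape(len(sig_imgs), -1) @ U
    vn = noise_imgs.reshape(len(noise_imgs), -1) @ U
    dv = vs.mean(axis=0) - vn.mean(axis=0)
    S = 0.5 * (np.cov(vs.T) + np.cov(vn.T))
    snr2 = dv @ np.linalg.solve(S, dv)
    return 0.5 * (1 + erf(sqrt(snr2) / 2))  # AUC under Gaussian statistics

n = 32
yy, xx = np.mgrid[:n, :n] - n // 2
lesion = np.where(np.hypot(xx, yy) < 4, 0.5, 0.0)  # low-contrast disc
U = dog_channels(n)
noise_only = rng.normal(0, 1, (200, n, n))
signal_present = rng.normal(0, 1, (200, n, n)) + lesion
auc = cho_auc(signal_present, noise_only, U)
print(round(auc, 3))
```

    Note that the covariance estimate here needs all 400 sample images; the abstract's point is that the non-parametric AFE estimate can get by with fewer.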

  1. A meta-analysis of asbestos-related cancer risk that addresses fiber size and mineral type.

    PubMed

    Berman, D Wayne; Crump, Kenny S

    2008-01-01

    Quantitative estimates of the risk of lung cancer or mesothelioma in humans from asbestos exposure made by the U.S. Environmental Protection Agency (EPA) make use of estimates of potency factors based on phase-contrast microscopy (PCM) and obtained from cohorts exposed to asbestos in different occupational environments. These potency factors exhibit substantial variability. The most likely reasons for this variability appear to be differences among environments in fiber size and mineralogy not accounted for by PCM. In this article, the U.S. Environmental Protection Agency (EPA) models for asbestos-related lung cancer and mesothelioma are expanded to allow the potency of fibers to depend upon their mineralogical types and sizes. This is accomplished by positing exposure metrics composed of nonoverlapping fiber categories and assigning each category its own unique potency. These category-specific potencies are estimated in a meta-analysis that fits the expanded models to potencies for lung cancer (KL's) or mesothelioma (KM's) based on PCM that were calculated for multiple epidemiological studies in our previous paper (Berman and Crump, 2008). Epidemiological study-specific estimates of exposures to fibers in the different fiber size categories of an exposure metric are estimated using distributions for fiber size based on transmission electron microscopy (TEM) obtained from the literature and matched to the individual epidemiological studies. The fraction of total asbestos exposure in a given environment respectively represented by chrysotile and amphibole asbestos is also estimated from information in the literature for that environment. Adequate information was found to allow KL's from 15 epidemiological studies and KM's from 11 studies to be included in the meta-analysis. 
    Since the range of exposure metrics that could be considered was severely restricted by limitations in the published TEM fiber size distributions, it was decided to focus attention on four exposure metrics distinguished by fiber width: "all widths," widths > 0.2 μm, widths < 0.4 μm, and widths < 0.2 μm, each of which has historical relevance. Each such metric defined by width was composed of four categories of fibers: chrysotile or amphibole asbestos with lengths between 5 μm and 10 μm or longer than 10 μm. Using these metrics, three parameters were estimated for lung cancer and, separately, for mesothelioma: KLA, the potency of longer (length > 10 μm) amphibole fibers; rpc, the potency of pure chrysotile (uncontaminated by amphibole) relative to amphibole asbestos; and rps, the potency of shorter fibers (5 μm < length < 10 μm) relative to longer fibers. For mesothelioma, the hypothesis that chrysotile and amphibole asbestos are equally potent (rpc = 1) was strongly rejected by every metric, and the hypothesis that (pure) chrysotile is nonpotent for mesothelioma was not rejected by any metric. Best estimates for the relative potency of chrysotile ranged from zero to about 1/200th that of amphibole asbestos (depending on metric). For lung cancer, the hypothesis that chrysotile and amphibole asbestos are equally potent (rpc = 1) was rejected (p ≤ .05) by the two metrics based on thin fibers (widths < 0.4 μm and < 0.2 μm) but not by the metrics based on thicker fibers. The "all widths" and widths < 0.4 μm metrics provide the best fits to both the lung cancer and mesothelioma data over the other metrics evaluated, although the improvements are only marginal for lung cancer. 
    That these two metrics provide equivalent (for mesothelioma) and nearly equivalent (for lung cancer) fits to the data suggests that the available data sets may not be sufficiently rich (in variation of exposure characteristics) to fully evaluate the effects of fiber width on potency. Compared to the metric with widths > 0.2 μm with both rps and rpc fixed at 1 (which is nominally equivalent to the traditional PCM metric), the "all widths" and widths < 0.4 μm metrics provide substantially better fits for both lung cancer and, especially, mesothelioma. Although the best estimates of the potency of shorter fibers (5 μm < length < 10 μm) are zero for the "all widths" and widths < 0.4 μm metrics (or a small fraction of that of longer fibers for the widths > 0.2 μm metric for mesothelioma), the hypothesis that these shorter fibers are nonpotent could not be rejected for any of these metrics. Expansion of these metrics to include a category for fibers with lengths < 5 μm did not find any consistent evidence for any potency of these shortest fibers for either lung cancer or mesothelioma. Despite the substantial improvements in fit over that provided by the traditional use of PCM, neither the "all widths" nor the widths < 0.4 μm metric (nor any of the other metrics evaluated) completely resolves the differences in potency factors estimated in different occupational studies. Unresolved in particular is the discrepancy in potency factors for lung cancer from Quebec chrysotile miners and workers at the Charleston, SC, textile mill, which mainly processed chrysotile from Quebec. A leading hypothesis for this discrepancy is limitations in the fiber size distributions available for this analysis. Dement et al. (2007) recently analyzed by TEM archived air samples from the South Carolina plant to determine a detailed distribution of fiber lengths up to 40 μm and greater. 
If similar data become available for Quebec, perhaps these two size distributions can be used to eliminate the discrepancy between these two studies.

  2. Automatic Summarization of MEDLINE Citations for Evidence–Based Medical Treatment: A Topic-Oriented Evaluation

    PubMed Central

    Fiszman, Marcelo; Demner-Fushman, Dina; Kilicoglu, Halil; Rindflesch, Thomas C.

    2009-01-01

    As the number of electronic biomedical textual resources increases, it becomes harder for physicians to find useful answers at the point of care. Information retrieval applications provide access to databases; however, little research has been done on using automatic summarization to help navigate the documents returned by these systems. After presenting a semantic abstraction automatic summarization system for MEDLINE citations, we concentrate on evaluating its ability to identify useful drug interventions for fifty-three diseases. The evaluation methodology uses existing sources of evidence-based medicine as surrogates for a physician-annotated reference standard. Mean average precision (MAP) and a clinical usefulness score developed for this study were computed as performance metrics. The automatic summarization system significantly outperformed the baseline in both metrics. The MAP gain was 0.17 (p < 0.01) and the increase in the overall score of clinical usefulness was 0.39 (p < 0.05). PMID:19022398

  3. A Comparison of Evaluation Metrics for Biomedical Journals, Articles, and Websites in Terms of Sensitivity to Topic

    PubMed Central

    Fu, Lawrence D.; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F.

    2011-01-01

    Evaluating the quality of the biomedical literature and of health-related websites is a challenging information retrieval task. Current commonly used methods include the impact factor for journals, PubMed’s clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics, while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic-adjusted metrics and other topic-robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations. PMID:21419864

  4. Bibliometrics: tracking research impact by selecting the appropriate metrics.

    PubMed

    Agarwal, Ashok; Durairajanayagam, Damayanthi; Tatagari, Sindhuja; Esteves, Sandro C; Harlev, Avi; Henkel, Ralf; Roychoudhury, Shubhadeep; Homa, Sheryl; Puchalt, Nicolás Garrido; Ramasamy, Ranjith; Majzoub, Ahmad; Ly, Kim Dao; Tvrda, Eva; Assidi, Mourad; Kesari, Kavindra; Sharma, Reecha; Banihani, Saleem; Ko, Edmund; Abu-Elmagd, Muhammad; Gosalvez, Jaime; Bashiri, Asher

    2016-01-01

    Traditionally, the success of a researcher is assessed by the number of publications he or she publishes in peer-reviewed, indexed, high-impact journals. This essential yardstick, often referred to as the impact of a specific researcher, is assessed through the use of various metrics. While researchers may be acquainted with such metrics, many do not know how to use them to enhance their careers. In addition to these metrics, a number of other factors should be taken into consideration to objectively evaluate a scientist's profile as a researcher and academician. Moreover, each metric has its own limitations that need to be considered when selecting an appropriate metric for evaluation. This paper provides a broad overview of the wide array of metrics currently in use in academia and research. Popular metrics are discussed and defined, including traditional metrics and article-level metrics, some of which are applied to researchers for a greater understanding of a particular concept, including varicocele, which is the thematic area of this Special Issue of the Asian Journal of Andrology. We recommend the combined use of quantitative and qualitative evaluation using judiciously selected metrics for a more objective assessment of scholarly output and research impact.

  5. Bibliometrics: tracking research impact by selecting the appropriate metrics

    PubMed Central

    Agarwal, Ashok; Durairajanayagam, Damayanthi; Tatagari, Sindhuja; Esteves, Sandro C; Harlev, Avi; Henkel, Ralf; Roychoudhury, Shubhadeep; Homa, Sheryl; Puchalt, Nicolás Garrido; Ramasamy, Ranjith; Majzoub, Ahmad; Ly, Kim Dao; Tvrda, Eva; Assidi, Mourad; Kesari, Kavindra; Sharma, Reecha; Banihani, Saleem; Ko, Edmund; Abu-Elmagd, Muhammad; Gosalvez, Jaime; Bashiri, Asher

    2016-01-01

    Traditionally, the success of a researcher is assessed by the number of publications he or she publishes in peer-reviewed, indexed, high-impact journals. This essential yardstick, often referred to as the impact of a specific researcher, is assessed through the use of various metrics. While researchers may be acquainted with such metrics, many do not know how to use them to enhance their careers. In addition to these metrics, a number of other factors should be taken into consideration to objectively evaluate a scientist's profile as a researcher and academician. Moreover, each metric has its own limitations that need to be considered when selecting an appropriate metric for evaluation. This paper provides a broad overview of the wide array of metrics currently in use in academia and research. Popular metrics are discussed and defined, including traditional metrics and article-level metrics, some of which are applied to researchers for a greater understanding of a particular concept, including varicocele, which is the thematic area of this Special Issue of the Asian Journal of Andrology. We recommend the combined use of quantitative and qualitative evaluation using judiciously selected metrics for a more objective assessment of scholarly output and research impact. PMID:26806079

  6. Performance Evaluation of EnKF-based Hydrogeological Site Characterization using Color Coherent Vectors

    NASA Astrophysics Data System (ADS)

    Moslehi, M.; de Barros, F.

    2017-12-01

    Complexity of hydrogeological systems arises from the multi-scale heterogeneity and insufficient measurements of their underlying parameters, such as hydraulic conductivity and porosity. An inadequate characterization of hydrogeological properties can significantly decrease the trustworthiness of numerical models that predict groundwater flow and solute transport. Therefore, a variety of data assimilation methods have been proposed to estimate hydrogeological parameters from spatially scarce data by incorporating the governing physical models. In this work, we propose a novel framework for evaluating the performance of these estimation methods. We focus on the Ensemble Kalman Filter (EnKF), a widely used data assimilation technique that reconciles multiple sources of measurements to sequentially estimate model parameters such as the hydraulic conductivity. Several methods have been used in the literature to quantify the accuracy of the estimations obtained by EnKF, including rank histograms, RMSE, and ensemble spread. However, these commonly used methods do not account for the spatial structure and variability of geological formations; as a result, hydraulic conductivity fields with very different spatial structures can have similar histograms or RMSE. We propose a vision-based approach that quantifies the accuracy of estimations by considering the spatial structure embedded in the estimated fields. Our approach adapts a metric from computer vision, Color Coherent Vectors (CCV), to evaluate the accuracy of the fields estimated by EnKF. CCV is a histogram-based technique for comparing images that incorporates spatial information. We represent estimated fields as digital three-channel images and use CCV to compare and quantify the accuracy of estimations. The sensitivity of CCV to spatial information makes it a suitable metric for assessing the performance of spatial data assimilation techniques. 
    Considering various data assimilation settings, such as the number, layout, and type of measurements, we compare the performance of CCV with that of other metrics such as RMSE. By simulating hydrogeological processes using estimated and true fields, we observe that CCV outperforms the other existing evaluation metrics.
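
    The core argument, that histogram-level comparisons cannot distinguish fields with identical value distributions but different spatial structure while a coherence-based vector can, can be sketched as follows. This is a toy adaptation of the coherence-vector idea to a single-channel 2-D field; the bin count, coherence threshold, and example fields are illustrative choices, not those of the study.

```python
import numpy as np
from scipy.ndimage import label

def ccv(field, n_bins=8, tau=20):
    # Coherence vector of a 2-D field: per quantization bin, count pixels
    # in connected regions of at least tau pixels (coherent) vs. smaller
    # regions (incoherent). tau and n_bins are illustrative choices.
    edges = np.linspace(field.min(), field.max(), n_bins + 1)[1:-1]
    q = np.digitize(field, edges)
    vec = np.zeros((n_bins, 2))
    for b in range(n_bins):
        lab, n = label(q == b)
        for c in range(1, n + 1):
            size = (lab == c).sum()
            vec[b, 0 if size >= tau else 1] += size
    return vec

def ccv_distance(a, b):
    return np.abs(ccv(a) - ccv(b)).sum()

# Two conductivity-like fields with identical value histograms but very
# different spatial structure: a smooth layered field vs. a random shuffle.
rng = np.random.default_rng(1)
layered = np.repeat(np.linspace(0, 1, 16), 64).reshape(16, 64)
shuffled = rng.permutation(layered.ravel()).reshape(16, 64)
truth = layered
# A purely histogram-based comparison cannot tell these apart; CCV can.
print(ccv_distance(truth, layered), ccv_distance(truth, shuffled))
```

    The first distance is zero (identical structure) while the second is large, even though both candidate fields contain exactly the same values.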

  7. Value redefined for inflammatory bowel disease patients: a choice-based conjoint analysis of patients' preferences.

    PubMed

    van Deen, Welmoed K; Nguyen, Dominic; Duran, Natalie E; Kane, Ellen; van Oijen, Martijn G H; Hommes, Daniel W

    2017-02-01

    Value-based healthcare is an emerging field. The core idea is to evaluate care based on achieved outcomes divided by the costs. Unfortunately, the optimal way to evaluate outcomes is ill-defined. In this study, we aim to develop a single, preference-based outcome metric that can be used to quantify overall health value in inflammatory bowel disease (IBD). IBD patients filled out a choice-based conjoint (CBC) questionnaire in which patients chose preferable outcome scenarios with different levels of disease control (DC), quality of life (QoL), and productivity (Pr). A CBC analysis was performed to estimate the relative value of DC, QoL, and Pr, and a patient-centered composite score was developed that was weighted according to the stated preferences. We included 210 IBD patients. Large differences in stated preferences were observed. Increases from low to intermediate outcome levels were valued more than increases from intermediate to high outcome levels. Overall, QoL was more important to patients than DC or Pr. Individual outcome scores were calculated based on the stated preferences. In patients with active disease, this score differed significantly from a score not weighted by patient preferences. We showed the feasibility of creating a single outcome metric in IBD that incorporates patients' values using a CBC. Because this metric changes significantly when weighted according to patients' values, we propose that success in healthcare should be measured accordingly.
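
    The preference-weighted composite score can be illustrated with a small sketch. The part-worth utilities below are invented numbers chosen only to mirror the stated pattern (low-to-intermediate gains valued most, QoL weighted highest); they are not the estimates from the 210-patient study.

```python
# Hypothetical part-worth utilities per outcome level. Note the low->mid
# step is worth more than the mid->high step, and QoL dominates, matching
# the qualitative findings; all numbers are invented for illustration.
utilities = {
    "disease_control": {"low": 0.0, "mid": 0.8, "high": 1.0},
    "quality_of_life": {"low": 0.0, "mid": 1.4, "high": 1.8},
    "productivity":    {"low": 0.0, "mid": 0.6, "high": 0.8},
}

def composite_score(outcomes):
    # Preference-weighted composite: sum of part-worths for the patient's
    # achieved levels, rescaled so the best possible profile scores 100.
    total = sum(utilities[attr][lvl] for attr, lvl in outcomes.items())
    best = sum(max(levels.values()) for levels in utilities.values())
    return 100 * total / best

patient = {"disease_control": "mid", "quality_of_life": "high",
           "productivity": "low"}
print(round(composite_score(patient), 1))  # -> 72.2
```

    An unweighted score would average the three levels equally; the weighted version shifts the result toward whichever domain patients value most, which is why the two diverge for patients with active disease.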

  8. Guidelines for evaluating performance of oyster habitat restoration

    USGS Publications Warehouse

    Baggett, Lesley P.; Powers, Sean P.; Brumbaugh, Robert D.; Coen, Loren D.; DeAngelis, Bryan M.; Greene, Jennifer K.; Hancock, Boze T.; Morlock, Summer M.; Allen, Brian L.; Breitburg, Denise L.; Bushek, David; Grabowski, Jonathan H.; Grizzle, Raymond E.; Grosholz, Edwin D.; LaPeyre, Megan K.; Luckenbach, Mark W.; McGraw, Kay A.; Piehler, Michael F.; Westby, Stephanie R.; zu Ermgassen, Philine S. E.

    2015-01-01

    Restoration of degraded ecosystems is an important societal goal, yet inadequate monitoring and the absence of clear performance metrics are common criticisms of many habitat restoration projects. Funding limitations can prevent adequate monitoring, but we suggest that the lack of accepted metrics to address the diversity of restoration objectives also presents a serious challenge to the monitoring of restoration projects. A working group with experience in designing and monitoring oyster reef projects was used to develop standardized monitoring metrics, units, and performance criteria that would allow for comparison among restoration sites and projects of various construction types. A set of four universal metrics (reef areal dimensions, reef height, oyster density, and oyster size–frequency distribution) and a set of three universal environmental variables (water temperature, salinity, and dissolved oxygen) are recommended to be monitored for all oyster habitat restoration projects regardless of their goal(s). In addition, restoration goal-based metrics specific to four commonly cited ecosystem service-based restoration goals are recommended, along with an optional set of seven supplemental ancillary metrics that could provide information useful to the interpretation of prerestoration and postrestoration monitoring data. Widespread adoption of a common set of metrics with standardized techniques and units to assess well-defined goals not only allows practitioners to gauge the performance of their own projects but also allows for comparison among projects, which is both essential to the advancement of the field of oyster restoration and can provide new knowledge about the structure and ecological function of oyster reef ecosystems.

  9. Adapting the ISO 20462 softcopy ruler method for online image quality studies

    NASA Astrophysics Data System (ADS)

    Burns, Peter D.; Phillips, Jonathan B.; Williams, Don

    2013-01-01

    In this paper we address the problem of no-reference image quality assessment, focusing on JPEG-corrupted images. In general, no-reference metrics do not perform equally well across the full range of distortions or across different image contents; the crosstalk between content and distortion signals influences human perception. We propose two strategies to improve the correlation between subjective and objective quality data. The first strategy is based on grouping the images according to their spatial complexity; the second is based on a frequency analysis. Both strategies are tested on two databases available in the literature. The results show an improvement in the correlations between no-reference metrics and psycho-visual data, evaluated in terms of the Pearson correlation coefficient.

  10. Quantitative radiomics: impact of stochastic effects on textural feature analysis implies the need for standards

    PubMed Central

    Nyflot, Matthew J.; Yang, Fei; Byrd, Darrin; Bowen, Stephen R.; Sandison, George A.; Kinahan, Paul E.

    2015-01-01

    Image heterogeneity metrics such as textural features are an active area of research for evaluating clinical outcomes with positron emission tomography (PET) imaging and other modalities. However, the effects of stochastic image acquisition noise on these metrics are poorly understood. We performed a simulation study by generating 50 statistically independent PET images of the NEMA IQ phantom with realistic noise and resolution properties. Heterogeneity metrics based on gray-level intensity histograms, co-occurrence matrices, neighborhood difference matrices, and zone size matrices were evaluated within regions of interest surrounding the lesions. The impact of stochastic variability was evaluated with percent difference from the mean of the 50 realizations, coefficient of variation and estimated sample size for clinical trials. Additionally, sensitivity studies were performed to simulate the effects of patient size and image reconstruction method on the quantitative performance of these metrics. Complex trends in variability were revealed as a function of textural feature, lesion size, patient size, and reconstruction parameters. In conclusion, the sensitivity of PET textural features to normal stochastic image variation and imaging parameters can be large and is feature-dependent. Standards are needed to ensure that prospective studies that incorporate textural features are properly designed to measure true effects that may impact clinical outcomes. PMID:26251842

  11. Quantitative radiomics: impact of stochastic effects on textural feature analysis implies the need for standards.

    PubMed

    Nyflot, Matthew J; Yang, Fei; Byrd, Darrin; Bowen, Stephen R; Sandison, George A; Kinahan, Paul E

    2015-10-01

    Image heterogeneity metrics such as textural features are an active area of research for evaluating clinical outcomes with positron emission tomography (PET) imaging and other modalities. However, the effects of stochastic image acquisition noise on these metrics are poorly understood. We performed a simulation study by generating 50 statistically independent PET images of the NEMA IQ phantom with realistic noise and resolution properties. Heterogeneity metrics based on gray-level intensity histograms, co-occurrence matrices, neighborhood difference matrices, and zone size matrices were evaluated within regions of interest surrounding the lesions. The impact of stochastic variability was evaluated with percent difference from the mean of the 50 realizations, coefficient of variation and estimated sample size for clinical trials. Additionally, sensitivity studies were performed to simulate the effects of patient size and image reconstruction method on the quantitative performance of these metrics. Complex trends in variability were revealed as a function of textural feature, lesion size, patient size, and reconstruction parameters. In conclusion, the sensitivity of PET textural features to normal stochastic image variation and imaging parameters can be large and is feature-dependent. Standards are needed to ensure that prospective studies that incorporate textural features are properly designed to measure true effects that may impact clinical outcomes.
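
    The study design, recomputing a co-occurrence-based texture feature over many statistically independent noise realizations of the same object and summarizing its variability, can be sketched as below. The lesion image, noise level, and quantization settings are invented for illustration and are far simpler than the NEMA IQ phantom simulations used in the paper.

```python
import numpy as np

def glcm_contrast(img, levels=8):
    # Quantize to gray levels, accumulate a co-occurrence matrix over
    # horizontal neighbors at distance 1, and return the GLCM "contrast"
    # feature. A minimal sketch, not a full radiomics implementation.
    q = np.floor(levels * (img - img.min()) / (np.ptp(img) + 1e-12)).astype(int)
    q = np.clip(q, 0, levels - 1)
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    return ((i - j) ** 2 * glcm).sum()

# 50 statistically independent noisy realizations of the same lesion ROI,
# mirroring the study's design; the smooth "lesion" and the noise sigma
# are invented stand-ins for the simulated PET data.
rng = np.random.default_rng(2)
lesion = np.outer(np.hanning(24), np.hanning(24))
features = [glcm_contrast(lesion + rng.normal(0, 0.05, lesion.shape))
            for _ in range(50)]
cov_pct = 100 * np.std(features) / np.mean(features)  # coefficient of variation
print(f"contrast CoV across 50 realizations: {cov_pct:.1f}%")
```

    Even with a fixed underlying object, the feature fluctuates from realization to realization; the coefficient of variation is the kind of summary the study uses to judge whether a feature change is a true effect or noise.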

  12. Expedited CT-Based Methods for Evaluating Fracture Severity to Assess Risk of Post-Traumatic Osteoarthritis After Articular Fractures.

    PubMed

    Anderson, Donald D; Kilburg, Anthony T; Thomas, Thaddeus P; Marsh, J Lawrence

    2016-01-01

    Post-traumatic osteoarthritis (PTOA) is common after intra-articular fractures of the tibial plafond. An objective CT-based measure of fracture severity was previously found to reliably predict whether PTOA developed following surgical treatment of such fractures. However, the extended time required to obtain the fracture energy metric, and its reliance upon a CT of the intact contralateral limb, limited its clinical applicability. The objective of this study was to establish an expedited fracture severity metric that provided comparable PTOA predictive ability without the prior limitations. An expedited fracture severity metric was computed from the CT scans of 30 tibial plafond fractures using textural analysis to quantify disorder in CT images. The expedited method utilized an intact surrogate model to enable severity assessment without requiring a contralateral limb CT. Agreement between the expedited fracture severity metric and the Kellgren-Lawrence (KL) radiographic OA score at two-year follow-up was assessed using concordance. The ability of the metric to differentiate between patients who did or did not develop PTOA was assessed using the Wilcoxon rank-sum test. The expedited severity metric agreed well (75.2% concordance) with the KL scores. The initial fracture severity of cases that developed PTOA differed significantly (p = 0.004) from those that did not. Receiver operating characteristic analysis showed that the expedited severity metric could accurately predict PTOA outcome in 80% of the cases. The time required to obtain the expedited severity metric averaged 14.9 minutes/case, and the metric was obtained without using an intact contralateral CT. The expedited CT-based methods for fracture severity assessment present a solution to issues limiting the utility of prior methods. 
    In a relatively short amount of time, the expedited methodology provided a severity score capable of predicting PTOA risk without needing the intact contralateral limb included in the CT scan. The described methods provide surgeons an objective, quantitative representation of the severity of a fracture. Obtained prior to surgery, it provides a reasonable alternative to current subjective classification systems. The expedited severity metric offers surgeons an objective means of factoring the severity of joint insult into treatment decision-making.

  13. Learning Compositional Shape Models of Multiple Distance Metrics by Information Projection.

    PubMed

    Luo, Ping; Lin, Liang; Liu, Xiaobai

    2016-07-01

    This paper presents a novel compositional contour-based shape model that incorporates multiple distance metrics to account for varying shape distortions and deformations. Our approach contains two key steps: 1) contour feature generation and 2) generative model pursuit. For each category, we first densely sample an ensemble of local prototype contour segments from a few positive shape examples and describe each segment using three different types of distance metrics. These metrics are diverse and complementary to each other, capturing various shape deformations. We regard the parameterized contour segment plus an additive residual ϵ as a basic subspace, namely an ϵ-ball, in the sense that it represents local shape variance under a certain distance metric. Using these ϵ-balls as features, we then propose a generative learning algorithm to pursue the compositional shape model, which greedily selects the most representative features under the information projection principle. In experiments, we evaluate our model on several challenging public data sets and demonstrate that the integration of multiple shape distance metrics is capable of dealing with various shape deformations, articulations, and background clutter, hence boosting system performance.

  14. Decomposition-based transfer distance metric learning for image classification.

    PubMed

    Luo, Yong; Liu, Tongliang; Tao, Dacheng; Xu, Chao

    2014-09-01

    Distance metric learning (DML) is a critical factor for image analysis and pattern recognition. To learn a robust distance metric for a target task, we need abundant side information (i.e., similarity/dissimilarity pairwise constraints over the labeled data), which is usually unavailable in practice due to the high labeling cost. This paper considers the transfer learning setting, exploiting the large quantity of side information from certain related, but different, source tasks to help with target metric learning (with only a little side information). The state-of-the-art metric learning algorithms usually fail in this setting because the data distributions of the source task and target task are often quite different. We address this problem by assuming that the target distance metric lies in the space spanned by the eigenvectors of the source metrics (or other randomly generated bases). The target metric is represented as a combination of the base metrics, which are computed using the decomposed components of the source metrics (or simply a set of random bases); we call the proposed method decomposition-based transfer DML (DTDML). In particular, DTDML learns a sparse combination of the base metrics to construct the target metric by forcing the target metric to be close to an integration of the source metrics. The main advantage of the proposed method over existing transfer metric learning approaches is that we directly learn the base metric coefficients instead of the target metric, so far fewer variables need to be learned. We therefore obtain more reliable solutions given the limited side information, and the optimization tends to be faster. Experiments on the popular handwritten image (digit, letter) classification and challenging natural image annotation tasks demonstrate the effectiveness of the proposed method.
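
    The central construction, representing the target metric in the span of base metrics built from the source metrics' eigenvectors and learning only the few combination coefficients, can be sketched as follows. The least-squares fit with nonnegativity clipping is a simplified stand-in for the paper's sparse optimization, and all matrices here are randomly generated.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 6

def random_psd(d):
    A = rng.normal(size=(d, d))
    return A @ A.T

# Two hypothetical source-task metrics, and a "target" metric that we can
# only observe noisily (standing in for scarce target side information).
sources = [random_psd(d) for _ in range(2)]
target_noisy = sources[0] + 0.3 * sources[1] + 0.1 * random_psd(d)

# Base metrics: rank-one outer products of the sources' eigenvectors.
bases = []
for M in sources:
    _, vecs = np.linalg.eigh(M)
    bases += [np.outer(v, v) for v in vecs.T]

# Learn only the base-metric coefficients (far fewer than the d*d entries
# of a full metric) by least squares; clipping to nonnegative coefficients
# keeps the combination positive semidefinite.
A = np.stack([B.ravel() for B in bases], axis=1)       # (d*d, n_bases)
theta, *_ = np.linalg.lstsq(A, target_noisy.ravel(), rcond=None)
theta = np.maximum(theta, 0)                           # crude projection
M_hat = sum(t * B for t, B in zip(theta, bases))

print(len(bases), "coefficients vs", d * d, "metric entries")
print(np.linalg.eigvalsh(M_hat).min() >= -1e-9)        # PSD by construction
```

    The dimensionality argument from the abstract is visible directly: twelve coefficients are fitted instead of a 6x6 metric, which is what makes the estimate more reliable under limited side information.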

  15. Rendering-based video-CT registration with physical constraints for image-guided endoscopic sinus surgery

    NASA Astrophysics Data System (ADS)

    Otake, Y.; Leonard, S.; Reiter, A.; Rajan, P.; Siewerdsen, J. H.; Ishii, M.; Taylor, R. H.; Hager, G. D.

    2015-03-01

    We present a system for registering the coordinate frame of an endoscope to pre- or intra-operatively acquired CT data by optimizing the similarity metric between an endoscopic image and an image predicted via rendering of the CT. Our method is robust and semi-automatic because it takes account of physical constraints, specifically collisions between the endoscope and the anatomy, to initialize and constrain the search. The proposed optimization method is based on a stochastic optimization algorithm that evaluates a large number of similarity metric functions in parallel on a graphics processing unit. Images from a cadaver and a patient were used for evaluation. The registration error was 0.83 mm for the cadaver images and 1.97 mm for the patient images. The average registration time over 60 trials was 4.4 seconds. The patient study demonstrated the robustness of the proposed algorithm against moderate anatomical deformation.

  16. Definition and classification of evaluation units for tertiary structure prediction in CASP12 facilitated through semi-automated metrics.

    PubMed

    Abriata, Luciano A; Kinch, Lisa N; Tamò, Giorgio E; Monastyrskyy, Bohdan; Kryshtafovych, Andriy; Dal Peraro, Matteo

    2018-03-01

    For assessment purposes, CASP targets are split into evaluation units. We herein present the official definition of CASP12 evaluation units (EUs) and their classification into difficulty categories. Each target can be evaluated as one EU (the whole target) and/or several EUs (separate structural domains or groups of structural domains). The specific scenario for a target split is determined by the domain organization of available templates, the difference in server performance on separate domains versus combinations of domains, and visual inspection. In the end, 71 targets were split into 96 EUs. Classification of the EUs into difficulty categories was done semi-automatically with the assistance of metrics provided by the Prediction Center. These metrics account for sequence and structural similarities of the EUs to potential structural templates from the Protein Data Bank, and for the baseline performance of automated server predictions. The metrics readily separate the 96 EUs into 38 EUs that should be straightforward for template-based modeling (TBM) and 39 that are expected to be hard for homology modeling and are thus left for free modeling (FM). The remaining 19 borderline evaluation units were dubbed FM/TBM and were inspected case by case. The article also reviews structural and evolutionary features of selected targets relevant to our accompanying article presenting the assessment of FM and FM/TBM predictions, as well as structural features of the hardest evaluation units from the FM category. We finally suggest improvements for the EU definition and classification procedures. © 2017 Wiley Periodicals, Inc.

  17. Assessing agreement among alternative climate change projections to inform conservation recommendations in the contiguous United States.

    PubMed

    Belote, R Travis; Carroll, Carlos; Martinuzzi, Sebastián; Michalak, Julia; Williams, John W; Williamson, Matthew A; Aplet, Gregory H

    2018-06-21

    Addressing uncertainties in climate vulnerability remains a challenge for conservation planning. We evaluate how confidence in conservation recommendations may change with agreement among alternative climate projections and metrics of climate exposure. We assessed agreement among three multivariate estimates of climate exposure (forward velocity, backward velocity, and climate dissimilarity) using 18 alternative climate projections for the contiguous United States. For each metric, we classified maps into quartiles for each alternative climate projection, and calculated the frequency of quartiles assigned to each gridded location (high quartile frequency = more agreement among climate projections). We evaluated recommendations using a recent climate adaptation heuristic framework that recommends emphasizing different conservation strategies for land based on current conservation value and expected climate exposure. We found that the areas where conservation strategies would be confidently assigned based on high agreement among climate projections varied substantially across regions. In general, there was more agreement among alternative projections in forward and backward velocity estimates than in estimates of local dissimilarity. Consensus of climate projections resulted in the same conservation recommendation assignments in a few areas, but patterns varied by climate exposure metric. This work demonstrates an approach for explicitly evaluating alternative predictions of geographic patterns of climate change.
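
The quartile-agreement computation described above can be sketched as follows; the quartile edges and the modal-class bookkeeping are implementation details assumed here, not taken from the paper.

```python
import numpy as np

def quartile_agreement(exposure_maps):
    """exposure_maps: (n_projections, n_cells) array of one climate
    exposure metric under alternative projections. Returns, per cell,
    the modal quartile class and its frequency across projections
    (high frequency = more agreement among climate projections)."""
    maps = np.asarray(exposure_maps, dtype=float)
    n_proj, n_cells = maps.shape
    # Classify each projection's map into quartiles (0..3).
    q = np.empty_like(maps, dtype=int)
    for i, m in enumerate(maps):
        edges = np.quantile(m, [0.25, 0.5, 0.75])
        q[i] = np.searchsorted(edges, m, side="right")
    # Frequency of the most common quartile assignment per cell.
    counts = np.stack([(q == k).sum(axis=0) for k in range(4)])
    modal = counts.argmax(axis=0)
    freq = counts.max(axis=0) / n_proj
    return modal, freq
```

Cells where `freq` is near 1.0 are the places where a conservation strategy could be assigned with high confidence regardless of which projection turns out to be right.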

  18. Integrated computational biology analysis to evaluate target genes for chronic myelogenous leukemia.

    PubMed

    Zheng, Yu; Wang, Yu-Ping; Cao, Hongbao; Chen, Qiusheng; Zhang, Xi

    2018-06-05

    Although hundreds of genes have been linked to chronic myelogenous leukemia (CML), many of the results lack reproducibility. In the present study, data across multiple modalities were integrated to evaluate 579 CML candidate genes, including literature‑based CML‑gene relation data, Gene Expression Omnibus RNA expression data and pathway‑based gene‑gene interaction data. The expression data included samples from 76 patients with CML and 73 healthy controls. For each target gene, four metrics were proposed and tested with case/control classification. The effectiveness of the four metrics presented was demonstrated by the high classification accuracy (94.63%; P<2×10⁻⁴). Cross‑metric analysis suggested nine top candidate genes for CML: epidermal growth factor receptor, tumor protein p53, catenin β 1, Janus kinase 2, tumor necrosis factor, Abelson murine leukemia viral oncogene homolog 1, vascular endothelial growth factor A, B‑cell lymphoma 2 and proto‑oncogene tyrosine‑protein kinase. In addition, 145 CML candidate pathways enriched with 485 out of 579 genes were identified (P<8.2×10⁻¹¹; q=0.005). In conclusion, weighted genetic networks generated using computational biology may be complementary to biological experiments for the evaluation of known or novel CML target genes.

  19. Researcher and Author Impact Metrics: Variety, Value, and Context.

    PubMed

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Duisenova, Akmaral; Trukhachev, Vladimir I; Kostyukova, Elena I; Kitas, George D

    2018-04-30

    Numerous quantitative indicators are currently available for evaluating research productivity. No single metric is suitable for comprehensive evaluation of author-level impact. The choice of particular metrics depends on the purpose and context of the evaluation. The aim of this article is to overview some of the widely employed author impact metrics and highlight perspectives on their optimal use. The h-index is one of the most popular metrics for research evaluation; it is easy to calculate and understandable for non-experts. It is automatically displayed on researcher and author profiles in citation databases such as Scopus and Web of Science. Its main advantage is its combined approach to the quantification of publication and citation counts. This index is increasingly cited globally. Although it is an appropriate indicator of the publication and citation activity of highly productive and successfully promoted authors, the h-index has been criticized primarily for disadvantaging early-career researchers and authors with few indexed publications. Numerous variants of the index have been proposed to overcome its limitations. Alternative metrics have also emerged to highlight 'societal impact.' However, each of these traditional and alternative metrics has its own drawbacks, necessitating careful analysis of the context of social attention and of the value of publication and citation sets. Optimal use of researcher and author metrics depends on the evaluation purpose and is compounded by information sourced from various global, national, and specialist bibliographic databases.
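
For readers unfamiliar with the calculation, the h-index discussed here reduces to a few lines of code (a straightforward sketch of the standard definition):

```python
def h_index(citations):
    """h = the largest h such that at least h of the author's papers
    have been cited at least h times each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank          # the rank-th paper still has >= rank citations
        else:
            break
    return h
```

The definition makes the criticism in the abstract concrete: an early-career author with three papers can never exceed h = 3, however highly those papers are cited.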

  20. Investigation of designated eye position and viewing zone for a two-view autostereoscopic display.

    PubMed

    Huang, Kuo-Chung; Chou, Yi-Heng; Lin, Lang-chin; Lin, Hoang Yan; Chen, Fu-Hao; Liao, Ching-Chiu; Chen, Yi-Han; Lee, Kuen; Hsu, Wan-Hsuan

    2014-02-24

    Designated eye position (DEP) and viewing zone (VZ) are important optical parameters for designing a two-view autostereoscopic display. Although much research has been done to date, little empirical evidence has established a direct relationship between design and measurement. More rigorous studies and verifications to investigate DEP and to ascertain the VZ criterion would be valuable. We propose evaluation metrics based on equivalent luminance (EL) and binocular luminance (BL) to determine DEP and VZ for a two-view autostereoscopic display. Simulation and experimental results show that our proposed evaluation metrics can be used to find the DEP and VZ accurately.

  1. Validation of simulated earthquake ground motions based on evolution of intensity and frequency content

    USGS Publications Warehouse

    Rezaeian, Sanaz; Zhong, Peng; Hartzell, Stephen; Zareian, Farzin

    2015-01-01

    Simulated earthquake ground motions can be used in many recent engineering applications that require time series as input excitations. However, the applicability and validation of simulations are subjects of debate in the seismological and engineering communities. We propose a validation methodology at the waveform level, based directly on characteristics that are expected to influence most structural and geotechnical response parameters. In particular, three time-dependent validation metrics are used to evaluate the evolving intensity, frequency, and bandwidth of a waveform. These validation metrics capture nonstationarities in the intensity and frequency content of waveforms, making them well suited to address the nonlinear response of structural systems. A two-component error vector is proposed to quantify the average and shape differences between these validation metrics for a simulated and recorded ground-motion pair. Because these metrics are directly related to the waveform characteristics, they provide easily interpretable feedback to seismologists for modifying their ground-motion simulation models. To further simplify the use and interpretation of these metrics for engineers, it is shown how six scalar key parameters, including duration, intensity, and predominant frequency, can be extracted from the validation metrics. The proposed validation methodology is a step toward the use of simulated ground motions in engineering practice and is demonstrated using examples of recorded and simulated ground motions from the 1994 Northridge, California, earthquake.

  2. "Can you see me now?" An objective metric for predicting intelligibility of compressed American Sign Language video

    NASA Astrophysics Data System (ADS)

    Ciaramello, Francis M.; Hemami, Sheila S.

    2007-02-01

    For members of the Deaf Community in the United States, current communication tools include TTY/TDD services, video relay services, and text-based communication. With the growth of cellular technology, mobile sign language conversations are becoming a possibility. Proper coding techniques must be employed to compress American Sign Language (ASL) video for low-rate transmission while maintaining the quality of the conversation. In order to evaluate these techniques, an appropriate quality metric is needed. This paper demonstrates that traditional video quality metrics, such as PSNR, fail to predict subjective intelligibility scores. By considering the unique structure of ASL video, an appropriate objective metric is developed. Face and hand segmentation is performed using skin-color detection techniques. The distortions in the face and hand regions are optimally weighted and pooled across all frames to create an objective intelligibility score for a distorted sequence. The objective intelligibility metric performs significantly better than PSNR in terms of correlation with subjective responses.
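
The weighting-and-pooling step can be sketched as below. The region weights here are illustrative placeholders, not the optimally fitted values from the paper, and per-region MSE stands in for whatever distortion measure is used per frame.

```python
import numpy as np

def region_mse(ref, dist, mask):
    """Mean squared error restricted to a boolean region mask."""
    diff = (ref.astype(float) - dist.astype(float)) ** 2
    return diff[mask].mean() if mask.any() else 0.0

def intelligibility_score(ref_frames, dist_frames, face_masks, hand_masks,
                          w_face=0.7, w_hand=0.3):
    """Pool face/hand distortions across all frames into one objective
    score. Weights are assumptions, not the paper's fitted values.
    Higher score = higher predicted intelligibility."""
    per_frame = []
    for r, d, fm, hm in zip(ref_frames, dist_frames, face_masks, hand_masks):
        mse = w_face * region_mse(r, d, fm) + w_hand * region_mse(r, d, hm)
        per_frame.append(mse)
    pooled = float(np.mean(per_frame))
    return -pooled   # less distortion in face/hand regions -> higher score
```

Restricting the distortion measurement to the segmented face and hand regions is what lets the metric track intelligibility where frame-global PSNR does not.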

  3. Model-Based Referenceless Quality Metric of 3D Synthesized Images Using Local Image Description.

    PubMed

    Gu, Ke; Jakhetiya, Vinit; Qiao, Jun-Fei; Li, Xiaoli; Lin, Weisi; Thalmann, Daniel

    2017-07-28

    New challenges have arisen with the emergence of 3D-related technologies such as virtual reality (VR), augmented reality (AR), and mixed reality (MR). Free viewpoint video (FVV), with applications in remote surveillance, remote education, and other areas based on the flexible selection of direction and viewpoint, has been perceived as the development direction of next-generation video technologies and has drawn a wide range of researchers' attention. Since FVV images are synthesized via a depth image-based rendering (DIBR) procedure in a "blind" environment (without reference images), a reliable real-time blind quality evaluation and monitoring system is urgently required. However, existing assessment metrics do not reflect human judgments faithfully, mainly because of the geometric distortions generated by DIBR. To this end, this paper proposes a novel referenceless quality metric for DIBR-synthesized images using an autoregression (AR)-based local image description. It was found that, after AR prediction, the reconstruction error between a DIBR-synthesized image and its AR-predicted image can accurately capture the geometric distortion. Visual saliency is then leveraged to improve the proposed blind quality metric by a sizable margin. Experiments validate the superiority of our no-reference quality method compared with prevailing full-, reduced- and no-reference models.
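
The AR-based local description can be illustrated with a minimal sketch: fit one set of autoregression coefficients that predict each pixel from its 8-neighbourhood by least squares over the whole image, and use the absolute prediction error as the distortion map. (The published metric additionally weights this by visual saliency, which is omitted here, and its exact AR formulation may differ.)

```python
import numpy as np

def ar_residual_map(img):
    """Fit one set of AR coefficients predicting each interior pixel
    from its 8 neighbours (least squares over the image), then return
    the absolute prediction-error map. Large residuals flag regions the
    local AR model cannot explain, e.g. geometric distortions."""
    img = img.astype(float)
    H, W = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    # Each column of X holds one shifted copy of the interior pixels.
    X = np.stack([img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx].ravel()
                  for dy, dx in offsets], axis=1)
    y = img[1:-1, 1:-1].ravel()
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.abs(y - X @ coef).reshape(H - 2, W - 2)
```

On smooth, naturally structured content the residual is near zero; DIBR warping artifacts break the local AR relationship and stand out in the map.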

  4. Human-centric predictive model of task difficulty for human-in-the-loop control tasks

    PubMed Central

    Majewicz Fey, Ann

    2018-01-01

    Quantitatively measuring the difficulty of a manipulation task in human-in-the-loop control systems is ill-defined. Currently, systems are typically evaluated through task-specific performance measures and post-experiment user surveys; however, these methods do not capture the real-time experience of human users. In this study, we propose to analyze and predict the difficulty of a bivariate pointing task, with a haptic device interface, using human-centric measurement data in terms of cognition, physical effort, and motion kinematics. Noninvasive sensors were used to record the multimodal responses of 14 human subjects performing the task. A data-driven approach for predicting task difficulty was implemented based on several task-independent metrics. We compare four possible models for predicting task difficulty to evaluate the roles of the various types of metrics: (I) a movement time model, (II) a fusion model using both physiological and kinematic metrics, (III) a model with only kinematic metrics, and (IV) a model with only physiological metrics. The results show significant correlation between task difficulty and the user sensorimotor response. The fusion model, integrating user physiology and motion kinematics, provided the best estimate of task difficulty (R2 = 0.927), followed by the model using only kinematic metrics (R2 = 0.921). Both models were better predictors of task difficulty than the movement time model (R2 = 0.847), derived from Fitts' law, a well-studied difficulty model for human psychomotor control. PMID:29621301
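
The movement time model (model I) derives from Fitts' law. A minimal sketch, using the common Shannon formulation of the index of difficulty (the abstract does not say which formulation the authors used), is:

```python
import numpy as np

def index_of_difficulty(distance, width):
    """Fitts' index of difficulty (Shannon formulation), in bits."""
    return np.log2(distance / width + 1.0)

def fit_fitts(distances, widths, movement_times):
    """Least-squares fit of MT = a + b * ID; returns (a, b, r_squared)."""
    ID = index_of_difficulty(np.asarray(distances, float),
                             np.asarray(widths, float))
    MT = np.asarray(movement_times, float)
    A = np.column_stack([np.ones_like(ID), ID])
    (a, b), *_ = np.linalg.lstsq(A, MT, rcond=None)
    pred = a + b * ID
    ss_res = ((MT - pred) ** 2).sum()
    ss_tot = ((MT - MT.mean()) ** 2).sum()
    return a, b, 1.0 - ss_res / ss_tot
```

The R² returned here is the quantity being compared across the four models in the abstract: the fusion model's 0.927 versus the movement-time model's 0.847.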

  5. NEW CATEGORICAL METRICS FOR AIR QUALITY MODEL EVALUATION

    EPA Science Inventory

    Traditional categorical metrics used in model evaluations are "clear-cut" measures in that the model's ability to predict an exceedance is defined by a fixed threshold concentration and the metrics are defined by observation-forecast sets that are paired both in space and time. T...

  6. Workshop summary: 'Integrating air quality and climate mitigation - is there a need for new metrics to support decision making?'

    NASA Astrophysics Data System (ADS)

    von Schneidemesser, E.; Schmale, J.; Van Aardenne, J.

    2013-12-01

    Air pollution and climate change are often treated at national and international levels as separate problems under different regulatory or thematic frameworks and different policy departments. With air pollution and climate change being strongly linked with regard to their causes, effects and mitigation options, the integration of policies that steer air pollutant and greenhouse gas emission reductions might result in cost-efficient, more effective and thus more sustainable tackling of the two problems. Supporting informed decision making, and working towards an integrated air quality and climate change mitigation policy, requires the identification, quantification and communication of present-day and potential future co-benefits and trade-offs. The identification of co-benefits and trade-offs requires the application of appropriate metrics that are well rooted in science, easy to understand and reflect the needs of policy, industry and the public for informed decision making. For the purpose of this workshop, metrics were loosely defined as quantified measures of effect or impact used to inform decision making and to evaluate mitigation measures. The workshop, held on October 9 and 10 and co-organized by the European Environment Agency and the Institute for Advanced Sustainability Studies, brought together representatives from science, policy, NGOs, and industry to discuss whether currently available metrics are 'fit for purpose' or whether there is a need to develop alternative metrics or to reassess the way current metrics are used and communicated.
Based on the workshop outcome, the presentation will (a) summarize the informational needs and current application of metrics by the end-users, who, depending on their field and area of operation, might require health, policy, and/or economically relevant parameters at different scales, (b) provide an overview of the state of the science of currently used and newly developed metrics, and the scientific validity of these metrics, and (c) identify gaps in the current information base, whether in the scientific development of metrics or in their application by different users.

  7. Coreference Resolution With Reconcile

    DTIC Science & Technology

    2010-07-01

    ...evaluation of coreference resolvers across a variety of benchmark data sets and standard scoring metrics. We describe Reconcile and present experimental... scores vary wildly across data sets, evaluation metrics, and system configurations. We believe that one root cause of these disparities is the high...

  8. A methodology to enable rapid evaluation of aviation environmental impacts and aircraft technologies

    NASA Astrophysics Data System (ADS)

    Becker, Keith Frederick

    Commercial aviation has become an integral part of modern society, enabling unprecedented global connectivity for business, cultural, and personal exchange. In the decades following World War II, passenger travel through commercial aviation grew quickly, at a rate of roughly 8% per year globally. The FAA's most recent Terminal Area Forecast predicts growth to continue at a rate of 2.5% domestically, and the market outlooks produced by Airbus and Boeing generally predict growth to continue at a rate of 5% per year globally over the next several decades, which translates into a need for up to 30,000 new aircraft produced by 2025. With such large numbers of new aircraft potentially entering service, any negative consequences of commercial aviation must be examined and mitigated by governing bodies so that growth may still be achieved. Options to grow while simultaneously reducing environmental impact include evolution of the commercial fleet through changes in operations, aircraft mix, and technology adoption. Methods to rapidly evaluate fleet environmental metrics are needed to enable decision makers to quickly compare the impact of different scenarios and weigh multiple policy options. As the fleet evolves, interdependencies may emerge in the form of tradeoffs between improvements in different environmental metrics as new technologies are brought into service. To include the impacts of these interdependencies on fleet evolution, physics-based modeling is required at the appropriate level of fidelity. Evaluation of environmental metrics in a physics-based manner can be done at the individual aircraft level, but doing so does not capture aggregate fleet metrics.
In contrast, evaluation of environmental metrics at the fleet level is already being done for aircraft in the commercial fleet, but current tools and approaches require enhancement because they capture technology implementation through post-processing, which misses physical interdependencies that may arise at the aircraft level. The goal of this work was to develop a methodology for constructing surrogate fleet approaches that leverage the capability of physics-based aircraft models, together with connectivity to fleet-level analysis tools, to enable rapid evaluation of fuel burn and emissions metrics. Instead of requiring development of an individual physics-based model for each vehicle in the fleet, the surrogate fleet approaches seek to reduce the number of such models needed while still accurately capturing the performance of the fleet. By reducing the number of models, both development time and execution time to generate fleet-level results may also be reduced. The initial steps leading to surrogate fleet formulation were a characterization of the commercial fleet into groups based on capability, followed by the selection of a reference vehicle model and a reference set of operations for each group. Next, three potential surrogate fleet approaches were formulated: the parametric correction factor approach, in which the results of a reference vehicle model are corrected to match the aggregate results of each group; the average replacement approach, in which a new vehicle model is developed to generate the aggregate results of each group; and the best-in-class replacement approach, in which results for a reference vehicle are simply substituted for the entire group. Once candidate surrogate fleet approaches were developed, each was applied to and evaluated over the set of reference operations. Each approach was then evaluated for its ability to model variations in operations.
Finally, the ability of each surrogate fleet approach to capture the implementation of different technology suites, along with corresponding interdependencies between fuel burn and emissions, was evaluated using the concept of a virtual fleet to simulate the technology response of multiple aircraft families. The results of experimentation led to a down-selection of the best approach for rapidly and accurately characterizing the performance of the commercial fleet, judged against the acceptability of current fleet evaluation methods. The parametric correction factor and average replacement approaches were shown to be successful in capturing reference fleet results as well as fleet performance under variations in operations. The best-in-class replacement approach was shown to be unacceptable as a model for the larger fleet in each of the scenarios tested. Finally, the average replacement approach was the only one that successfully captured the impact of technologies on a larger fleet. These results are meaningful because they show that it is possible to calculate the fuel burn and emissions of a larger fleet with a reduced number of physics-based models within acceptable bounds of accuracy. At the same time, the physics-based modeling also provides the ability to evaluate the impact of technologies on fleet-level fuel burn and emissions metrics. The value of such a capability is that multiple future fleet scenarios, involving changes in both aircraft operations and technology levels, may now be rapidly evaluated to inform policy makers of the implications of such changes on fleet-level metrics.

  9. SU-E-T-436: Fluence-Based Trajectory Optimization for Non-Coplanar VMAT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smyth, G; Bamber, JC; Bedford, JL

    2015-06-15

    Purpose: To investigate a fluence-based trajectory optimization technique for non-coplanar VMAT for brain cancer. Methods: Single-arc non-coplanar VMAT trajectories were determined using a heuristic technique for five patients. Organ at risk (OAR) volume intersected during raytracing was minimized for two cases: absolute volume and the sum of relative volumes weighted by OAR importance. These trajectories and coplanar VMAT formed starting points for the fluence-based optimization method. Iterative least squares optimization was performed on control points 24° apart in gantry rotation. Optimization minimized the root-mean-square (RMS) deviation of PTV dose from the prescription (relative importance 100), maximum dose to the brainstem (10), optic chiasm (5), globes (5) and optic nerves (5), plus mean dose to the lenses (5), hippocampi (3), temporal lobes (2), cochleae (1) and brain excluding other regions of interest (1). Control point couch rotations were varied in steps of up to 10° and accepted if the cost function improved. Final treatment plans were optimized with the same objectives in an in-house planning system and evaluated using a composite metric - the sum of optimization metrics weighted by importance. Results: The composite metric decreased with fluence-based optimization in 14 of the 15 plans. In the remaining case its overall value, and the PTV and OAR components, were unchanged but the balance of OAR sparing differed. PTV RMS deviation was improved in 13 cases and unchanged in two. The OAR component was reduced in 13 plans. In one case the OAR component increased but the composite metric decreased - a 4 Gy increase in OAR metrics was balanced by a reduction in PTV RMS deviation from 2.8% to 2.6%. Conclusion: Fluence-based trajectory optimization improved plan quality as defined by the composite metric. While dose differences were case specific, fluence-based optimization improved both PTV and OAR dosimetry in 80% of cases.
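
The plan-evaluation logic, an importance-weighted composite metric plus an accept-if-improved search over couch rotations, can be sketched as below. This is a toy illustration of the acceptance rule only, not the authors' full trajectory optimizer, and the metric names are placeholders.

```python
def composite_metric(metrics, weights):
    """Importance-weighted sum of individual optimization metrics
    (lower is better), mirroring the plan-evaluation score above."""
    return sum(weights[name] * value for name, value in metrics.items())

def greedy_couch_search(cost_fn, start_angle, step=10.0, n_iter=20):
    """Vary one control point's couch rotation in fixed steps and accept
    a move only if the cost function improves."""
    angle, best = start_angle, cost_fn(start_angle)
    for _ in range(n_iter):
        for candidate in (angle + step, angle - step):
            c = cost_fn(candidate)
            if c < best:
                angle, best = candidate, c
                break
    return angle, best
```

In the actual study this acceptance test is applied per control point against the full dose-based cost, so a couch move is kept only when the weighted PTV/OAR objectives improve.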

  10. A novel critical infrastructure resilience assessment approach using dynamic Bayesian networks

    NASA Astrophysics Data System (ADS)

    Cai, Baoping; Xie, Min; Liu, Yonghong; Liu, Yiliu; Ji, Renjie; Feng, Qiang

    2017-10-01

    The word resilience originates from the Latin "resilire", which means "to bounce back". The concept has been used in various fields, such as ecology, economics, psychology, and sociology, with different definitions. In the field of critical infrastructure, although several resilience metrics have been proposed, they differ substantially from one another because each is tied to the performance of its particular object of evaluation. Here we bridge the gap by developing a universal critical infrastructure resilience metric from the perspective of reliability engineering. A dynamic Bayesian networks-based assessment approach is proposed to calculate the resilience value. Series, parallel, and voting systems are used to demonstrate the application of the developed resilience metric and assessment approach.
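
As one concrete example of what a reliability-engineering resilience metric can look like (an illustrative integral-based formulation common in the literature, not necessarily the one proposed in this paper), resilience can be taken as the ratio of actual to baseline performance integrated over the observation window:

```python
import numpy as np

def resilience(times, performance, baseline=1.0):
    """Area under the actual performance curve divided by the area under
    the undisrupted baseline over the same window (trapezoidal rule).
    1.0 = no performance loss; lower values = slower 'bounce back'."""
    t = np.asarray(times, float)
    p = np.asarray(performance, float)
    area = float(((p[1:] + p[:-1]) / 2.0 * np.diff(t)).sum())
    return area / (baseline * (t[-1] - t[0]))
```

The paper's contribution is to compute such a value through a dynamic Bayesian network, which lets the performance trajectory itself be inferred probabilistically rather than observed directly.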

  11. Combining satellite data and appropriate objective functions for improved spatial pattern performance of a distributed hydrologic model

    NASA Astrophysics Data System (ADS)

    Demirel, Mehmet C.; Mai, Juliane; Mendiguren, Gorka; Koch, Julian; Samaniego, Luis; Stisen, Simon

    2018-02-01

    Satellite-based earth observations offer great opportunities to improve spatial model predictions by means of spatial-pattern-oriented model evaluations. In this study, observed spatial patterns of actual evapotranspiration (AET) are utilised for spatial model calibration tailored to target the pattern performance of the model. The proposed calibration framework combines temporally aggregated observed spatial patterns with a new spatial performance metric and a flexible spatial parameterisation scheme. The mesoscale hydrologic model (mHM) is used to simulate streamflow and AET and has been selected due to its soil parameter distribution approach based on pedo-transfer functions and its built-in multi-scale parameter regionalisation. In addition, two new spatial parameter distribution options have been incorporated in the model in order to increase the flexibility of the root fraction coefficient and potential evapotranspiration correction parameterisations, based on soil type and vegetation density. These parameterisations are utilised as they are most relevant for the AET patterns simulated by the hydrologic model. Due to the fundamental challenges encountered when evaluating spatial pattern performance using standard metrics, we developed a simple but highly discriminative spatial metric, i.e. one comprising three easily interpretable components that measure co-location, variation and distribution of the spatial data. The study shows that with flexible spatial model parameterisation used in combination with the appropriate objective functions, the simulated spatial patterns of actual evapotranspiration become substantially more similar to the satellite-based estimates. Overall, 26 parameters are identified for calibration through a sequential screening approach based on a combination of streamflow and spatial pattern metrics.
The robustness of the calibrations is tested using an ensemble of nine calibrations based on different seed numbers for the shuffled complex evolution optimiser. The calibration results reveal a limited trade-off between streamflow dynamics and spatial patterns, illustrating the benefit of combining separate observation types and objective functions. At the same time, the simulated spatial patterns of AET improved significantly when an objective function based on observed AET patterns and the novel spatial performance metric was included, compared with traditional streamflow-only calibration. Since the overall water balance is usually a crucial goal in hydrologic modelling, spatial-pattern-oriented optimisation should always be accompanied by traditional discharge measurements. In such a multi-objective framework, the current study promotes the use of a novel bias-insensitive spatial pattern metric, which exploits the key information contained in the observed patterns while allowing the water balance to be informed by discharge observations.
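
A three-component, bias-insensitive spatial metric of the kind described (co-location, variation, distribution) can be sketched as follows. This follows the structure of the SPAEF spatial efficiency metric associated with this group of authors; the bin count and z-scoring details here are assumptions.

```python
import numpy as np

def spaef(sim, obs, bins=100):
    """Spatial efficiency score from three components:
    alpha: co-location (Pearson correlation),
    beta:  variation (ratio of coefficients of variation),
    gamma: distribution (histogram overlap of z-scored fields).
    1 is a perfect match; z-scoring makes the score bias-insensitive."""
    s, o = np.ravel(sim).astype(float), np.ravel(obs).astype(float)
    alpha = np.corrcoef(s, o)[0, 1]
    beta = (s.std() / s.mean()) / (o.std() / o.mean())
    zs, zo = (s - s.mean()) / s.std(), (o - o.mean()) / o.std()
    lo, hi = min(zs.min(), zo.min()), max(zs.max(), zo.max())
    hs, _ = np.histogram(zs, bins=bins, range=(lo, hi))
    ho, _ = np.histogram(zo, bins=bins, range=(lo, hi))
    gamma = np.minimum(hs, ho).sum() / ho.sum()
    return 1.0 - np.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```

Because the mean is removed before comparing distributions, a simulated AET field with a uniform bias but the right pattern still scores well, which is exactly the property needed when discharge observations are left to constrain the water balance.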

  12. Implementation of a channelized Hotelling observer model to assess image quality of x-ray angiography systems.

    PubMed

    Favazza, Christopher P; Fetterly, Kenneth A; Hangiandreou, Nicholas J; Leng, Shuai; Schueler, Beth A

    2015-01-01

    Evaluation of flat-panel angiography equipment through conventional image quality metrics is limited by the scope of standard spatial-domain metrics, such as contrast-to-noise ratio and spatial resolution, or by restricted access to the data needed to calculate Fourier-domain measurements, such as the modulation transfer function, noise power spectrum, and detective quantum efficiency. Observer models have been shown to be capable of overcoming these limitations and can comprehensively evaluate medical imaging systems. We present a spatial-domain channelized Hotelling observer model to calculate the detectability index (DI) of different-sized disk objects and compare the performance of different imaging conditions and angiography systems. When appropriate, changes in DIs were compared to expectations based on the classical Rose model of signal detection to assess the linearity of the model with quantum signal-to-noise ratio (SNR) theory. For these experiments, the estimated uncertainty of the DIs was less than 3%, allowing for precise comparison of imaging systems or conditions. For most experimental variables, DI changes were linear with expectations based on quantum SNR theory. DIs calculated for the smallest objects demonstrated nonlinearity with quantum SNR theory due to system blur. Two angiography systems with different detector element sizes were shown to perform similarly across the majority of the detection tasks.
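
The channelized Hotelling observer computation at the heart of such a model can be sketched as follows. The channel matrix (e.g. Gabor or Laguerre-Gauss channels) is left to the caller, and this is a generic CHO formulation rather than the authors' exact implementation.

```python
import numpy as np

def cho_detectability(signal_imgs, noise_imgs, channels):
    """Channelized Hotelling observer detectability index.
    signal_imgs / noise_imgs: (n, d) flattened image samples with and
    without the disk object; channels: (d, c) channel matrix.
    Returns d' = sqrt(dv^T S^-1 dv) computed on channel outputs."""
    vs = signal_imgs @ channels            # channelized responses (n, c)
    vn = noise_imgs @ channels
    dv = vs.mean(axis=0) - vn.mean(axis=0)             # mean signal response
    S = 0.5 * (np.cov(vs, rowvar=False) + np.cov(vn, rowvar=False))
    return float(np.sqrt(dv @ np.linalg.solve(S, dv)))
```

Channelizing reduces the covariance estimate from d×d pixels to c×c channels, which is what makes the DI estimable (with the <3% uncertainty cited above) from a practical number of repeated acquisitions.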

  13. Distributed Space Mission Design for Earth Observation Using Model-Based Performance Evaluation

    NASA Technical Reports Server (NTRS)

    Nag, Sreeja; LeMoigne-Stewart, Jacqueline; Cervantes, Ben; DeWeck, Oliver

    2015-01-01

    Distributed Space Missions (DSMs) are gaining momentum in their application to earth observation missions owing to their unique ability to increase observation sampling in multiple dimensions. DSM design is a complex problem with many design variables, multiple objectives determining performance and cost, and emergent, often unexpected, behaviors. There are very few open-access tools available to explore the tradespace of variables, minimize cost and maximize performance for pre-defined science goals, and thereby select the optimal design. This paper presents a software tool that can generate multiple DSM architectures based on pre-defined design variable ranges and size those architectures in terms of pre-defined science and cost metrics. The tool will help a user select Pareto-optimal DSM designs based on design-of-experiments techniques. The tool will be applied to several earth observation examples to demonstrate its applicability in making key decisions between different performance and cost metrics early in the design lifecycle.

  14. Parameter Search Algorithms for Microwave Radar-Based Breast Imaging: Focal Quality Metrics as Fitness Functions.

    PubMed

    O'Loughlin, Declan; Oliveira, Bárbara L; Elahi, Muhammad Adnan; Glavin, Martin; Jones, Edward; Popović, Milica; O'Halloran, Martin

    2017-12-06

    Inaccurate estimation of average dielectric properties can have a tangible impact on microwave radar-based breast images. Despite this, recent patient imaging studies have used a fixed estimate although this is known to vary from patient to patient. Parameter search algorithms are a promising technique for estimating the average dielectric properties from the reconstructed microwave images themselves without additional hardware. In this work, qualities of accurately reconstructed images are identified from point spread functions. As the qualities of accurately reconstructed microwave images are similar to the qualities of focused microscopic and photographic images, this work proposes the use of focal quality metrics for average dielectric property estimation. The robustness of the parameter search is evaluated using experimental dielectrically heterogeneous phantoms on the three-dimensional volumetric image. Based on a very broad initial estimate of the average dielectric properties, this paper shows how these metrics can be used as suitable fitness functions in parameter search algorithms to reconstruct clear and focused microwave radar images.

  15. Evaluation of BLAST-based edge-weighting metrics used for homology inference with the Markov Clustering algorithm.

    PubMed

    Gibbons, Theodore R; Mount, Stephen M; Cooper, Endymion D; Delwiche, Charles F

    2015-07-10

    Clustering protein sequences according to inferred homology is a fundamental step in the analysis of many large data sets. Since the publication of the Markov Clustering (MCL) algorithm in 2002, it has been the centerpiece of several popular applications. Each of these approaches generates an undirected graph that represents sequences as nodes connected to each other by edges weighted with a BLAST-based metric. MCL is then used to infer clusters of homologous proteins by analyzing these graphs. The various approaches differ only by how they weight the edges, yet there has been very little direct examination of the relative performance of alternative edge-weighting metrics. This study compares the performance of four BLAST-based edge-weighting metrics: the bit score, bit score ratio (BSR), bit score over anchored length (BAL), and negative common log of the expectation value (NLE). Performance is tested using the Extended CEGMA KOGs (ECK) database, which we introduce here. All metrics performed similarly when analyzing full-length sequences, but dramatic differences emerged as progressively larger fractions of the test sequences were split into fragments. The BSR and BAL successfully rescued subsets of clusters by strengthening certain types of alignments between fragmented sequences, but also shifted the largest correct scores down near the range of scores generated from spurious alignments. This penalty outweighed the benefits in most test cases, and was greatly exacerbated by increasing the MCL inflation parameter, making these metrics less robust than the bit score or the more popular NLE. Notably, the bit score performed as well or better than the other three metrics in all scenarios. The results provide a strong case for use of the bit score, which appears to offer equivalent or superior performance to the more popular NLE. 
The insight that MCL-based clustering methods can be improved using a more tractable edge-weighting metric will greatly simplify future implementations. We demonstrate this with our own minimalist Python implementation, Porthos, which uses only standard libraries and can process a graph with 25M+ edges connecting the 60k+ KOG sequences in half a minute using less than half a gigabyte of memory.
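    All four edge weights compared above are cheap functions of standard BLAST output fields. A minimal sketch follows; note the cap applied when BLAST reports an E-value of exactly 0, and the choice of which self-alignment score normalises the BSR, are assumptions here rather than the paper's exact conventions.

```python
import math

def nle(evalue, cap=200.0):
    """Negative common log of the expectation value (NLE).
    BLAST reports an E-value of 0 for very strong hits; cap those."""
    return cap if evalue == 0 else -math.log10(evalue)

def bit_score_ratio(bit_score, self_score):
    """Bit score ratio (BSR): the hit's bit score normalised by the
    self-alignment bit score of one of the two sequences."""
    return bit_score / self_score

def bit_over_anchored_length(bit_score, anchored_length):
    """Bit score over anchored length (BAL)."""
    return bit_score / anchored_length
```

    Each weight would then be attached to a sequence-pair edge of the undirected graph before running MCL.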

  16. The Relationship between Student Engagement and Alumni Involvement

    ERIC Educational Resources Information Center

    Randazza, Paula T.

    2017-01-01

    As issues of access, affordability, and accountability continue to make the national agenda, students and families are evaluating colleges and universities based on metrics that help determine a return on investment. While measures such as student loan debt and graduation rates are important factors in evaluating any institution, it is also…

  17. Pattern Activity Clustering and Evaluation (PACE)

    NASA Astrophysics Data System (ADS)

    Blasch, Erik; Banas, Christopher; Paul, Michael; Bussjager, Becky; Seetharaman, Guna

    2012-06-01

    With the vast amount of network information available on the activities of people (e.g., motions, transportation routes, and site visits), there is a need to explore the salient properties of data that detect and discriminate the behavior of individuals. Recent machine learning approaches include methods of data mining, statistical analysis, clustering, and estimation that support activity-based intelligence. We seek to explore contemporary methods in activity analysis using machine learning techniques that discover and characterize behaviors that enable grouping, anomaly detection, and adversarial intent prediction. To evaluate these methods, we describe the mathematics and potential information-theoretic metrics to characterize behavior. A scenario is presented to demonstrate the concept and metrics that could be useful for layered sensing behavior pattern learning and analysis. We leverage work on group tracking, learning, and clustering approaches, and utilize information-theoretic metrics for classification, behavioral and event pattern recognition, and activity and entity analysis. The performance evaluation of activity analysis supports high-level information fusion of user alerts, data queries, and sensor management for data extraction, relations discovery, and situation analysis of existing data.

  18. Generating One Biometric Feature from Another: Faces from Fingerprints

    PubMed Central

    Ozkaya, Necla; Sagiroglu, Seref

    2010-01-01

    This study presents a new approach based on artificial neural networks for generating one biometric feature (faces) from another (only fingerprints). An automatic and intelligent system was designed and developed to analyze the relationships among fingerprints and faces and also to model and to improve the existence of the relationships. The new proposed system is the first study that generates all parts of the face including eyebrows, eyes, nose, mouth, ears and face border from only fingerprints. It is also unique and different from similar studies recently presented in the literature with some superior features. The parameter settings of the system were achieved with the help of Taguchi experimental design technique. The performance and accuracy of the system have been evaluated with 10-fold cross validation technique using qualitative evaluation metrics in addition to the expanded quantitative evaluation metrics. Consequently, the results were presented on the basis of the combination of these objective and subjective metrics for illustrating the qualitative properties of the proposed methods as well as a quantitative evaluation of their performances. Experimental results have shown that one biometric feature can be determined from another. These results have once more indicated that there is a strong relationship between fingerprints and faces. PMID:22399877

  19. Researcher and Author Impact Metrics: Variety, Value, and Context

    PubMed Central

    2018-01-01

    Numerous quantitative indicators are currently available for evaluating research productivity. No single metric is suitable for comprehensive evaluation of author-level impact. The choice of particular metrics depends on the purpose and context of the evaluation. The aim of this article is to overview some of the widely employed author impact metrics and highlight perspectives on their optimal use. The h-index is one of the most popular metrics for research evaluation, being easy to calculate and understandable for non-experts. It is automatically displayed on researcher and author profiles in citation databases such as Scopus and Web of Science. Its main advantage relates to its combined approach to the quantification of publication and citation counts. This index is increasingly cited globally. While an appropriate indicator of the publication and citation activity of highly productive and successfully promoted authors, the h-index has been criticized primarily for disadvantaging early career researchers and authors with few indexed publications. Numerous variants of the index have been proposed to overcome its limitations. Alternative metrics have also emerged to highlight 'societal impact.' However, each of these traditional and alternative metrics has its own drawbacks, necessitating careful analysis of the context of social attention and the value of publication and citation sets. Perspectives on the optimal use of researcher and author metrics depend on evaluation purposes and are compounded by information sourced from various global, national, and specialist bibliographic databases. PMID:29713258
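    The h-index discussed above is simple enough to compute directly from a list of citation counts, which is part of why it is easy to calculate and understandable for non-experts:

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank   # the rank-th paper still has at least rank citations
        else:
            break
    return h
```

    For example, citation counts [10, 8, 5, 4, 3] give an h-index of 4: four papers have at least four citations each, but there are not five papers with at least five.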

  20. Combining ground-based measurements and satellite-based spectral vegetation indices to track biomass accumulation in post-fire chaparral

    NASA Astrophysics Data System (ADS)

    Uyeda, K. A.; Stow, D. A.; Roberts, D. A.; Riggan, P. J.

    2015-12-01

    Multi-temporal satellite imagery can provide valuable information on patterns of vegetation growth over large spatial extents and long time periods, but corresponding ground-referenced biomass information is often difficult to acquire, especially at an annual scale. In this study, we test the relationship between annual biomass estimated using shrub growth rings and metrics of seasonal growth derived from Moderate Resolution Imaging Spectroradiometer (MODIS) spectral vegetation indices (SVIs) for a small area of southern California chaparral, to evaluate the potential for mapping biomass at larger spatial extents. The site had most recently burned in 2002, and annual biomass accumulation measurements were available for years 5-11 post-fire. We tested metrics of seasonal growth using six SVIs (Normalized Difference Vegetation Index, Enhanced Vegetation Index, Soil Adjusted Vegetation Index, Normalized Difference Water Index, Normalized Difference Infrared Index 6, and Vegetation Atmospherically Resistant Index). While additional research would be required to determine which of these metrics and SVIs are most promising over larger spatial extents, several of the seasonal growth metric/SVI combinations have a very strong relationship with annual biomass, and all SVIs have a strong relationship with annual biomass for at least one of the seasonal growth metrics.
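    As an illustration of the quantities involved, NDVI and a few typical seasonal growth metrics can be sketched as follows. The abstract does not list the study's exact metric definitions, so the peak/amplitude/integral choices below are common examples, not the paper's formulas.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def seasonal_growth_metrics(svi_series):
    """Illustrative seasonal growth metrics for one year of an SVI series:
    peak value, amplitude above the dormant-season baseline, and a crude
    baseline-corrected sum as a proxy for seasonal productivity."""
    base = min(svi_series)
    peak = max(svi_series)
    return {
        "peak": peak,
        "amplitude": peak - base,
        "integral": sum(v - base for v in svi_series),
    }
```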

  1. Climate Classification is an Important Factor in ­Assessing Hospital Performance Metrics

    NASA Astrophysics Data System (ADS)

    Boland, M. R.; Parhi, P.; Gentine, P.; Tatonetti, N. P.

    2017-12-01

    Context/Purpose: Climate is a known modulator of disease, but its impact on hospital performance metrics remains unstudied. Methods: We assess the relationship between Köppen-Geiger climate classification and hospital performance metrics, specifically 30-day mortality as reported in Hospital Compare, collected for the period July 2013 through June 2014 (7/1/2013 - 06/30/2014). A hospital-level multivariate linear regression analysis was performed, controlling for known socioeconomic factors, to explore the relationship between all-cause mortality and climate. Hospital performance scores were obtained from 4,524 hospitals belonging to 15 distinct Köppen-Geiger climates and 2,373 unique counties. Results: Model results revealed that hospital performance metrics for mortality showed significant climate dependence (p<0.001) after adjusting for socioeconomic factors. Interpretation: Currently, hospitals are reimbursed by government agencies using 30-day mortality rates along with 30-day readmission rates. These metrics allow government agencies to rank hospitals according to their 'performance'. Various socioeconomic factors are taken into consideration when determining an individual hospital's performance. However, no climate-based adjustment is made within the existing framework. Our results indicate that climate-based variability in 30-day mortality rates exists even after socioeconomic confounder adjustment. Use of a standardized high-level climate classification system (such as Köppen-Geiger) would be useful to incorporate in future metrics. Conclusion: Climate is a significant factor in evaluating hospital 30-day mortality rates. These results demonstrate that climate classification is an important factor when comparing hospital performance across the United States.

  2. Metrics for evaluating performance and uncertainty of Bayesian network models

    Treesearch

    Bruce G. Marcot

    2012-01-01

    This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...

  3. Spin-lock imaging of exogenous exchange-based contrast agents to assess tissue pH.

    PubMed

    Zu, Zhongliang; Li, Hua; Jiang, Xiaoyu; Gore, John C

    2018-01-01

    Some X-ray contrast agents contain exchangeable protons that give rise to exchange-based effects on MRI, including chemical exchange saturation transfer (CEST). However, CEST has poor specificity to explicit exchange parameters. Spin-lock sequences at high field are also sensitive to chemical exchange. Here, we evaluate whether spin-locking techniques can detect the contrast agent iohexol in vivo after intravenous administration, and their potential for measuring changes in tissue pH. Two metrics of contrast based on R1ρ, the spin-lattice relaxation rate in the rotating frame, were derived from the behavior of R1ρ at different locking fields. Solutions containing iohexol at different concentrations and pH were used to evaluate the ability of the two metrics to quantify exchange effects. Images were also acquired from rat brains bearing tumors before and after intravenous injections of iohexol to evaluate the potential of spin-lock techniques for detecting the agent and pH variations. The two metrics were found to depend separately on either agent concentration or pH. Spin-lock imaging may therefore provide specific quantification of iohexol concentration and the iohexol-water exchange rate, which reports on pH. Spin-lock techniques may be used to assess the dynamics of intravenous contrast agents and detect extracellular acidification. Magn Reson Med 79:298-305, 2018. © 2017 International Society for Magnetic Resonance in Medicine.

  4. Regime-based evaluation of cloudiness in CMIP5 models

    NASA Astrophysics Data System (ADS)

    Jin, Daeho; Oreopoulos, Lazaros; Lee, Dongmin

    2017-01-01

    The concept of cloud regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 models from phase 5 of the Coupled Model Intercomparison Project (CMIP5). Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator, which generates in each grid cell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics, such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product (the long-term average total cloud amount, TCA), cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies also yield substantial TCA errors. Our results support previous findings that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is still not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer cloud observations evaluated against ISCCP as if they were another model's output. Lastly, contrasting cloud simulation performance against each model's equilibrium climate sensitivity, in order to gain insight into whether good cloud simulation pairs with particular values of this parameter, yields no clear conclusions.

  5. The relationship between assessment methods and self-directed learning readiness in medical education.

    PubMed

    Monroe, Katherine S

    2016-03-11

    This research explored the assessment of self-directed learning readiness within the comprehensive evaluation of medical students' knowledge and skills, and the extent to which several variables predicted participants' self-directed learning readiness prior to their graduation. Five metrics for evaluating medical students were considered in a multiple regression analysis. Fourth-year medical students at a competitive US medical school received an informed consent form and an online survey. Participants voluntarily completed a self-directed learning readiness scale that assessed four subsets of self-directed learning readiness and consented to the release of their academic records. The assortment of metrics considered in this study only vaguely captured students' self-directedness. The strongest predictors were faculty evaluations of students' performance on clerkship rotations. Specific clerkship grades were mildly predictive of three subscales: the Pediatrics clerkship modestly predicted critical self-evaluation (r=-.30, p=.01), the Psychiatry clerkship mildly predicted learning self-efficacy (r=-.30, p=.01), and the Junior Surgery clerkship nominally correlated with participants' effective organization for learning (r=.21, p=.05). Other metrics examined did not contribute to predicting participants' readiness for self-directed learning. Given individual differences among participants for the variables considered, no combination of students' grades and/or test scores overwhelmingly predicted their aptitude for self-directed learning. Considering the importance of fostering medical students' self-directed learning skills, schools need a reliable and pragmatic approach to measuring them. This data analysis, however, offered no clear-cut way of documenting students' self-directed learning readiness based on the evaluation metrics included.

  6. Weber-aware weighted mutual information evaluation for infrared-visible image fusion

    NASA Astrophysics Data System (ADS)

    Luo, Xiaoyan; Wang, Shining; Yuan, Ding

    2016-10-01

    A performance metric for infrared and visible image fusion is proposed based on Weber's law. To indicate the stimulus of source images, two Weber components are provided. One is differential excitation to reflect the spectral signal of visible and infrared images, and the other is orientation to capture the scene structure feature. By comparing the corresponding Weber component in infrared and visible images, the source pixels can be marked with different dominant properties in intensity or structure. If the pixels have the same dominant property label, the pixels are grouped to calculate the mutual information (MI) on the corresponding Weber components between dominant source and fused images. Then, the final fusion metric is obtained via weighting the group-wise MI values according to the number of pixels in different groups. Experimental results demonstrate that the proposed metric performs well on popular image fusion cases and outperforms other image fusion metrics.
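    A minimal sketch of the two Weber components for a single 3x3 patch, in the spirit of Weber local descriptors; the paper's exact formulation (weighting, quantisation into dominant-property groups, and the group-wise MI computation) is not reproduced here.

```python
import math

def weber_components(patch):
    """Weber components of a 3x3 patch given as 9 intensities in
    row-major order (centre pixel at index 4, assumed non-zero)."""
    c = patch[4]
    # differential excitation: arctan of the summed relative change
    # of the neighbourhood against the centre pixel
    diff_exc = math.atan(sum((p - c) / c for p in patch))
    # orientation: gradient direction from the four edge neighbours
    # (top minus bottom vs right minus left)
    orientation = math.atan2(patch[1] - patch[7], patch[5] - patch[3])
    return diff_exc, orientation
```

    Comparing these components between infrared and visible source pixels is what allows each pixel to be labelled intensity-dominant or structure-dominant before the mutual information is computed per group.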

  7. Deviations in gait metrics in patients with chronic ankle instability: a case control study.

    PubMed

    Gigi, Roy; Haim, Amir; Luger, Elchanan; Segal, Ganit; Melamed, Eyal; Beer, Yiftah; Nof, Matityahu; Nyska, Meir; Elbaz, Avi

    2015-01-01

    Gait metric alterations have been previously reported in patients suffering from chronic ankle instability (CAI). Previous studies of gait in this population have comprised relatively small cohorts, and their findings are not uniform. The objective of the present study was to examine spatiotemporal gait metrics in patients with CAI and to examine the relationship between self-reported disease severity and the magnitude of gait abnormalities. Forty-four patients with CAI were identified and compared to 53 healthy controls. Patients were evaluated with spatiotemporal gait analysis via a computerized mat and with the Short Form (SF)-36 health survey. Patients with CAI were found to walk with approximately 16% slower walking velocity, 9% lower cadence, and approximately 7% shorter step length. Furthermore, the base of support during walking in the CAI group was approximately 43% wider, and the single-limb support phase was 3.5% shorter, compared to the control group. All eight SF-36 subscales, as well as the SF-36 physical and mental component summaries, were significantly lower in patients with CAI compared to the control group. Finally, significant correlations were found between most of the objective gait measures and the SF-36 mental and physical component summaries. The results outline a gait profile for patients suffering from CAI. Significant differences were found in most spatiotemporal gait metrics. An important finding was a significantly wider base of support. It may be speculated that these gait alterations reflect a strategy to deal with imbalance and pain. These findings suggest the usefulness of gait metrics, alongside self-evaluation questionnaires, in assessing disease severity in patients with CAI.

  8. Identifying signature of chemical applications on indigenous and invasive nontarget arthropod communities in vineyards.

    PubMed

    Nash, Michael A; Hoffmann, Ary A; Thomson, Linda J

    2010-09-01

    Communities of arthropods providing ecosystem services (e.g., pest control, pollination, and soil nutrient cycling) to agricultural production systems are influenced by pesticide inputs, yet the impact of pesticide applications on nontarget organisms is normally evaluated through standardized sets of laboratory tests involving individual pesticides applied to a few representative species. By combining season-long pesticide applications of various insecticides and fungicides into a metric based on the International Organization for Biological and Integrated Control (IOBC) toxicity ratings, we evaluate season-long pesticide impacts on communities of indigenous and exotic arthropods across 61 vineyards assessed for an entire growing season. The composition of arthropod communities, identified mostly at the family level, but in some cases at the species level, was altered depending on season-long pesticide use. Numbers of mostly indigenous parasitoids, predatory mites, and coccinellids in the canopy, as well as carabid/tenebrionid beetles and some spider families on the ground, were decreased at higher cumulative pesticide metric scores. In contrast, numbers of one invasive millipede species (Ommatoiulus moreletti Lucas, Julida: Julidae) increased under higher cumulative pesticide metric scores. These changing community patterns were detected despite the absence of broad-spectrum insecticide applications in the vineyards. Pesticide effects were mostly due to indoxacarb and sulphur, applied as a fungicide. The reduction of beneficial arthropods and increase in an invasive herbivorous millipede under high cumulative pesticide metric scores highlights the need to manage nontarget season-long pesticide impacts in vineyards. 
A cumulative pesticide metric, based on IOBC toxicity ratings, provides a way of assessing overall toxicity effects, giving managers a means to estimate and consider potential negative season-long pesticide impacts on ecosystem services provided through arthropod communities.
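In its simplest form, such a cumulative metric is a season-long sum of per-application toxicity ratings. The sketch below is illustrative only: the ratings shown are hypothetical placeholders, and the paper's actual IOBC-based weighting may differ.

```python
def cumulative_toxicity_score(applications, iobc_rating):
    """Season-long cumulative pesticide metric: the sum of IOBC
    toxicity ratings (1 = harmless .. 4 = harmful) over every
    application made during the season."""
    return sum(iobc_rating[product] for product in applications)

# hypothetical ratings and spray diary for one vineyard
ratings = {"sulphur": 2, "indoxacarb": 3, "copper": 1}
season = ["sulphur", "sulphur", "indoxacarb", "copper"]
score = cumulative_toxicity_score(season, ratings)   # 2 + 2 + 3 + 1 = 8
```

Vineyards could then be compared or ranked by this single score, as in the 61-vineyard analysis described above.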

  9. A proposed global metric to aid mercury pollution policy

    NASA Astrophysics Data System (ADS)

    Selin, Noelle E.

    2018-05-01

    The Minamata Convention on Mercury entered into force in August 2017, committing its currently 92 parties to take action to protect human health and the environment from anthropogenic emissions and releases of mercury. But how can we tell whether the convention is achieving its objective? Although the convention requires periodic effectiveness evaluation (1), scientific uncertainties challenge our ability to trace how mercury policies translate into reduced human and wildlife exposure and impacts. Mercury emissions to air and releases to land and water follow a complex path through the environment before accumulating as methylmercury in fish, mammals, and birds. As these environmental processes are both uncertain and variable, analyzing existing data alone does not currently provide a clear signal of whether policies are effective. A global-scale metric to assess the impact of mercury emissions policies would help parties assess progress toward the convention's goal. Here, I build on the example of the Montreal Protocol on Substances that Deplete the Ozone Layer to identify criteria for a mercury metric. I then summarize why existing mercury data are insufficient and present and discuss a proposed new metric based on mercury emissions to air. Finally, I identify key scientific uncertainties that challenge future effectiveness evaluation.

  10. Citizen science: A new perspective to advance spatial pattern evaluation in hydrology.

    PubMed

    Koch, Julian; Stisen, Simon

    2017-01-01

    Citizen science opens new pathways that can complement traditional scientific practice. Intuition and reasoning often make humans more effective than computer algorithms in various realms of problem solving. In particular, a simple visual comparison of spatial patterns is a task where humans are often considered more reliable than computer algorithms. In practice, however, science still largely depends on computer-based solutions, which bring benefits such as speed and the possibility of automating processes. Nevertheless, human vision can be harnessed to evaluate the reliability of algorithms that are tailored to quantify similarity in spatial patterns. We established a citizen science project that employs human perception to rate similarity and dissimilarity between simulated spatial patterns of several scenarios of a hydrological catchment model. In total, more than 2,500 volunteers provided over 43,000 classifications of 1,095 individual subjects. We investigate the capability of a set of advanced statistical performance metrics to mimic the human ability to distinguish between similarity and dissimilarity. Results suggest that more complex metrics are not necessarily better at emulating human perception, but they clearly provide auxiliary information that is valuable for model diagnostics. The metrics differ markedly in their ability to unambiguously distinguish between similar and dissimilar patterns, which is regarded as a key feature of a reliable metric. The obtained dataset can provide an insightful benchmark for the community to test novel spatial metrics.

  11. How to measure ecosystem stability? An evaluation of the reliability of stability metrics based on remote sensing time series across the major global ecosystems.

    PubMed

    De Keersmaecker, Wanda; Lhermitte, Stef; Honnay, Olivier; Farifteh, Jamshid; Somers, Ben; Coppin, Pol

    2014-07-01

    Increasing frequency of extreme climate events is likely to impose increased stress on ecosystems and to jeopardize the services that ecosystems provide. Therefore, it is of major importance to assess the effects of extreme climate events on the temporal stability (i.e., the resistance, the resilience, and the variance) of ecosystem properties. Most time series of ecosystem properties are, however, affected by varying data characteristics, uncertainties, and noise, which complicate the comparison of ecosystem stability metrics (ESMs) between locations. Therefore, there is a strong need for a more comprehensive understanding of the reliability of stability metrics and how they can be used to compare ecosystem stability globally. The objective of this study was to evaluate the performance of temporal ESMs based on time series of the Moderate Resolution Imaging Spectroradiometer (MODIS)-derived Normalized Difference Vegetation Index for 15 global land-cover types. We provide a framework (i) to assess the reliability of ESMs as a function of data characteristics, uncertainties, and noise and (ii) to integrate reliability estimates into future global studies of ecosystem stability against climate disturbances. The performance of our framework was tested through (i) a global ecosystem comparison and (ii) a comparison of ecosystem stability in response to the 2003 drought. The results show the influence of data quality on the accuracy of ecosystem stability metrics. White noise, biased noise, and trends have a stronger effect on the accuracy of stability metrics than the length of the time series, temporal resolution, or amount of missing values. Moreover, we demonstrate the importance of integrating reliability estimates to interpret stability metrics within confidence limits.
Based on these confidence limits, other studies dealing with specific ecosystem types or locations can be put into context, and a more reliable assessment of ecosystem stability against environmental disturbances can be obtained. © 2013 John Wiley & Sons Ltd.
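The three stability components named above (resistance, resilience, and variance) can be given simple operational definitions on a series of standardized NDVI anomalies. These are illustrative definitions only; the paper's exact formulations may differ.

```python
def stability_metrics(anomalies):
    """Illustrative temporal ecosystem stability metrics from a list of
    standardized NDVI anomalies (at least two values, not all equal):
      variance   -- overall variability of the anomalies
      resistance -- inverse of the largest absolute anomaly
      resilience -- lag-1 autocorrelation (lower = faster recovery)
    """
    n = len(anomalies)
    mean = sum(anomalies) / n
    dev = [a - mean for a in anomalies]
    variance = sum(d * d for d in dev) / n
    resistance = 1.0 / max(abs(a) for a in anomalies)
    resilience = (sum(dev[i] * dev[i + 1] for i in range(n - 1))
                  / sum(d * d for d in dev))
    return {"variance": variance,
            "resistance": resistance,
            "resilience": resilience}
```

Confidence limits of the kind the study derives would then be attached to each of these numbers as a function of noise level, time-series length, and missing data.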

  12. A comparison of evaluation metrics for biomedical journals, articles, and websites in terms of sensitivity to topic.

    PubMed

    Fu, Lawrence D; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F

    2011-08-01

    Evaluating the biomedical literature and health-related websites for quality is a challenging information retrieval task. Commonly used methods include the impact factor for journals, PubMed's clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics, while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic-adjusted metrics and other topic-robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations. Copyright © 2011 Elsevier Inc. All rights reserved.

  13. Methodology, Methods, and Metrics for Testing and Evaluating Augmented Cognition Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Greitzer, Frank L.

    The augmented cognition research community seeks cognitive neuroscience-based solutions to improve warfighter performance by applying and managing mitigation strategies to reduce workload and improve the throughput and quality of decisions. The focus of augmented cognition mitigation research is to define, demonstrate, and exploit neuroscience and behavioral measures that support inferences about the warfighter's cognitive state that prescribe the nature and timing of mitigation. A research challenge is to develop valid evaluation methodologies, metrics, and measures to assess the impact of augmented cognition mitigations. Two considerations are external validity, which is the extent to which the results apply to operational contexts; and internal validity, which reflects the reliability of performance measures and the conclusions based on analysis of results. The scientific rigor of the research methodology employed in conducting empirical investigations largely affects the validity of the findings. External validity requirements also compel us to demonstrate the operational significance of mitigations. Thus it is important to demonstrate the effectiveness of mitigations under specific conditions. This chapter reviews some cognitive science and methodological considerations in designing augmented cognition research studies and associated human performance metrics and analysis methods to assess the impact of augmented cognition mitigations.

  14. Using community-level metrics to monitor the effects of marine protected areas on biodiversity.

    PubMed

    Soykan, Candan U; Lewison, Rebecca L

    2015-06-01

    Marine protected areas (MPAs) are used to protect species, communities, and their associated habitats, among other goals. Measuring MPA efficacy can be challenging, however, particularly when considering responses at the community level. We gathered 36 abundance and 14 biomass data sets on fish assemblages and used meta-analysis to evaluate the ability of 22 distinct community diversity metrics to detect differences in community structure between MPAs and nearby control sites. We also considered the effects of 6 covariates (MPA size, MPA age, the MPA size and age interaction, latitude, total species richness, and level of protection) on each metric. Some common metrics, such as species richness and Shannon diversity, did not differ consistently between MPA and control sites, whereas other metrics, such as total abundance and biomass, were consistently different across studies. Metric responses derived from the biomass data sets were more consistent than those based on the abundance data sets, suggesting that community-level biomass differs more predictably than abundance between MPA and control sites. Covariate analyses indicated that level of protection, latitude, MPA size, and the interaction between MPA size and age affect metric performance. These results highlight a handful of metrics, several of which are little known, that could be used to meet the increasing demand for community-level indicators of MPA effectiveness. © 2015 Society for Conservation Biology.
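
    For reference, two of the common metrics named above (species richness and Shannon diversity) plus total abundance can be computed directly from per-species counts. The fish counts below are invented, and this is only a small subset of the 22 metrics evaluated in the study.

```python
import math

def community_metrics(counts):
    """Total abundance, species richness, and Shannon diversity (natural log)
    from a list of per-species abundance counts."""
    counts = [c for c in counts if c > 0]      # drop absent species
    total = sum(counts)                        # total abundance
    richness = len(counts)                     # species richness
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    return total, richness, shannon

mpa_counts     = [34, 12, 9, 5, 5, 2, 1]   # hypothetical counts inside an MPA
control_counts = [20, 10, 6, 3, 1]         # hypothetical nearby control site
mpa_metrics     = community_metrics(mpa_counts)
control_metrics = community_metrics(control_counts)
```

    Shannon diversity is bounded above by the log of richness, reaching the bound only for a perfectly even community, which is one reason richness and diversity can respond differently to protection.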

  15. A consistent conceptual framework for applying climate metrics in technology life cycle assessment

    NASA Astrophysics Data System (ADS)

    Mallapragada, Dharik; Mignone, Bryan K.

    2017-07-01

    Comparing the potential climate impacts of different technologies is challenging for several reasons, including the fact that any given technology may be associated with emissions of multiple greenhouse gases when evaluated on a life cycle basis. In general, analysts must decide how to aggregate the climatic effects of different technologies, taking into account differences in the properties of the gases (differences in atmospheric lifetimes and instantaneous radiative efficiencies) as well as different technology characteristics (differences in emission factors and technology lifetimes). Available metrics proposed in the literature have incorporated these features in different ways and have arrived at different conclusions. In this paper, we develop a general framework for classifying metrics based on whether they measure: (a) cumulative or end point impacts, (b) impacts over a fixed time horizon or up to a fixed end year, and (c) impacts from a single emissions pulse or from a stream of pulses over multiple years. We then use the comparison between compressed natural gas and gasoline-fueled vehicles to illustrate how the choice of metric can affect conclusions about technologies. Finally, we consider tradeoffs involved in selecting a metric, show how the choice of metric depends on the framework that is assumed for climate change mitigation, and suggest which subset of metrics are likely to be most analytically self-consistent.
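
    Two of the classification axes above, a single pulse over a fixed time horizon versus a stream of pulses up to a fixed end year, can be sketched with a one-lifetime forcing model. The exponential-decay assumption and all parameter values below are purely illustrative (real CO2, in particular, requires a multi-exponential impulse response), so this is a sketch of the metric structure, not of any published metric.

```python
import math

def agwp(radiative_efficiency, lifetime, horizon):
    """Cumulative (time-integrated) forcing of a unit emission pulse of a gas
    whose abundance decays exponentially with the given lifetime, integrated
    over `horizon` years: A * tau * (1 - exp(-H / tau))."""
    return radiative_efficiency * lifetime * (1 - math.exp(-horizon / lifetime))

def stream_agwp(radiative_efficiency, lifetime, emission_years, end_year):
    """Cumulative forcing up to a fixed end year from a stream of annual unit
    pulses, e.g. a vehicle emitting in every year of its service life."""
    return sum(agwp(radiative_efficiency, lifetime, end_year - t)
               for t in range(emission_years))

pulse  = agwp(1.0, 12.0, 100)             # one pulse, 100-year horizon
stream = stream_agwp(1.0, 12.0, 15, 100)  # 15 annual pulses, fixed end year
```

    Because each pulse in the stream is integrated over a different remaining horizon, fixed-horizon and fixed-end-year metrics generally rank technologies differently, which is the tradeoff the paper formalizes.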

  16. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Science Inventory

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...

  17. Combining control input with flight path data to evaluate pilot performance in transport aircraft.

    PubMed

    Ebbatson, Matt; Harris, Don; Huddlestone, John; Sears, Rodney

    2008-11-01

    When deriving an objective assessment of piloting performance from flight data records, it is common to employ metrics which purely evaluate errors in flight path parameters. The adequacy of pilot performance is evaluated from the flight path of the aircraft. However, in large jet transport aircraft these measures may be insensitive and require supplementing with frequency-based measures of control input parameters. Flight path and control input data were collected from pilots undertaking a jet transport aircraft conversion course during a series of symmetric and asymmetric approaches in a flight simulator. The flight path data were analyzed for deviations around the optimum flight path while flying an instrument landing approach. Manipulation of the flight controls was subject to analysis using a series of power spectral density measures. The flight path metrics showed no significant differences in performance between the symmetric and asymmetric approaches. However, control input frequency domain measures revealed that the pilots employed highly different control strategies in the pitch and yaw axes. The results demonstrate that to evaluate pilot performance fully in large aircraft, it is necessary to employ performance metrics targeted at both the outer control loop (flight path) and the inner control loop (flight control) parameters in parallel, evaluating both the product and process of a pilot's performance.
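
    A minimal sketch of the frequency-domain analysis described above: a plain DFT power spectrum applied to a synthetic control-input trace. The study used power spectral density measures; Welch-style averaging (e.g. scipy.signal.welch) would be the usual production choice, and the sampling rate and component frequencies below are invented.

```python
import cmath
import math

def power_spectrum(x, dt):
    """Single-sided power spectrum of a control-input trace via a plain DFT
    (O(n^2); fine for a short illustration, use an FFT in practice)."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]               # remove the DC offset
    freqs, power = [], []
    for k in range(1, n // 2):
        X = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k / (n * dt))
        power.append(abs(X) ** 2 / n)
    return freqs, power

# Synthetic yoke trace: slow path corrections (0.2 Hz) plus smaller
# higher-frequency compensatory inputs (1.5 Hz), sampled at 8 Hz
dt = 1 / 8
trace = [math.sin(2 * math.pi * 0.2 * t * dt)
         + 0.4 * math.sin(2 * math.pi * 1.5 * t * dt) for t in range(128)]
freqs, power = power_spectrum(trace, dt)
peak_hz = freqs[power.index(max(power))]    # dominant control frequency
```

    Differences in where the spectral power concentrates (low-frequency path tracking versus high-frequency compensatory activity) are exactly what distinguishes control strategies that flight-path error metrics alone cannot see.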

  18. Development and Evaluation of Alternative Metrics of Ambient Air Pollution Exposure for Use in Epidemiologic Studies

    EPA Science Inventory

    Population-based epidemiologic studies of air pollution have traditionally relied upon imperfect surrogates of personal exposures, such as area-wide ambient air pollution levels based on readily available outdoor concentrations from central monitoring sites. This practice may in...

  19. Evidence-Based Policy Making: Assessment of the American Heart Association's Strategic Policy Portfolio: A Policy Statement From the American Heart Association.

    PubMed

    Labarthe, Darwin R; Goldstein, Larry B; Antman, Elliott M; Arnett, Donna K; Fonarow, Gregg C; Alberts, Mark J; Hayman, Laura L; Khera, Amit; Sallis, James F; Daniels, Stephen R; Sacco, Ralph L; Li, Suhui; Ku, Leighton; Lantz, Paula M; Robinson, Jennifer G; Creager, Mark A; Van Horn, Linda; Kris-Etherton, Penny; Bhatnagar, Aruni; Whitsel, Laurie P

    2016-05-03

    American Heart Association (AHA) public policy advocacy strategies are based on its Strategic Impact Goals. The writing group appraised the evidence behind AHA's policies to determine how well they address the association's 2020 cardiovascular health (CVH) metrics and cardiovascular disease (CVD) management indicators and identified research needed to fill gaps in policy and support further policy development. The AHA policy research department first identified current AHA policies specific to each CVH metric and CVD management indicator and the evidence underlying each policy. Writing group members then reviewed each policy and the related metrics and indicators. The results of each review were summarized, and topic-specific priorities and overarching themes for future policy research were proposed. There was generally close alignment between current AHA policies and the 2020 CVH metrics and CVD management indicators; however, certain specific policies still lack a robust evidence base. For CVH metrics, the distinction between policies for adults (age ≥20 years) and children (<20 years) was often not considered, although policy approaches may differ importantly by age. Inclusion of all those <20 years of age as a single group also ignores important differences in policy needs for infants, children, adolescents, and young adults. For CVD management indicators, specific quantitative targets analogous to criteria for ideal, intermediate, and poor CVH are lacking but needed to assess progress toward the 2020 goal to reduce deaths from CVDs and stroke. New research in support of current policies needs to focus on the evaluation of their translation and implementation through expanded application of implementation science. Focused basic, clinical, and population research is required to expand and strengthen the evidence base for the development of new policies. 
Evaluation of the impact of targeted improvements in population health through strengthened surveillance of CVD and stroke events, determination of the cost-effectiveness of policy interventions, and measurement of the extent to which vulnerable populations are reached must be assessed for all policies. Additional attention should be paid to the social determinants of health outcomes. AHA's public policies are generally robust and well aligned with its 2020 CVH metrics and CVD indicators. Areas for further policy development to fill gaps, overarching research strategies, and topic-specific priority areas are proposed. © 2016 American Heart Association, Inc.

  20. Research on quality metrics of wireless adaptive video streaming

    NASA Astrophysics Data System (ADS)

    Li, Xuefei

    2018-04-01

    With the development of wireless networks and intelligent terminals, video traffic has increased dramatically. Adaptive video streaming has become one of the most promising video transmission technologies. For this type of service, a good QoS (Quality of Service) of the wireless network does not always guarantee that all customers have a good experience. Thus, new quality metrics have been widely studied recently. Taking this into account, the objective of this paper is to investigate quality metrics for wireless adaptive video streaming. In this paper, a wireless video streaming simulation platform with a DASH mechanism and a multi-rate video generator is established. Based on this platform, a PSNR model, an SSIM model and a Quality Level model are implemented. The Quality Level model considers QoE (Quality of Experience) factors such as image quality, stalling and switching frequency, while the PSNR and SSIM models mainly consider the quality of the video. To evaluate the performance of these QoE models, three performance metrics (SROCC, PLCC and RMSE), which compare subjective and predicted MOS (Mean Opinion Score), are calculated. From these performance metrics, the monotonicity, linearity and accuracy of the quality metrics can be observed.
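
    The three performance metrics named above can be implemented directly. The MOS values below are invented, and the Spearman implementation assumes no tied values for brevity.

```python
import math

def pearson(x, y):
    """PLCC: Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank(x):
    """Ranks 1..n of the values in x (no tie handling)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    for pos, i in enumerate(order):
        r[i] = pos + 1.0
    return r

def srocc(x, y):
    """SROCC: Spearman rank-order correlation = Pearson on the ranks."""
    return pearson(rank(x), rank(y))

def rmse(x, y):
    """Root-mean-square error between two score vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

subjective_mos = [1.2, 2.5, 3.1, 3.9, 4.6]   # invented ratings
predicted_mos  = [1.5, 2.2, 3.3, 4.1, 4.4]
plcc = pearson(subjective_mos, predicted_mos)
sr   = srocc(subjective_mos, predicted_mos)
err  = rmse(subjective_mos, predicted_mos)
```

    SROCC captures monotonicity, PLCC linearity, and RMSE accuracy, which is why the three are reported together.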

  1. Homogeneity and EPR metrics for assessment of regular grids used in CW EPR powder simulations.

    PubMed

    Crăciun, Cora

    2014-08-01

    CW EPR powder spectra may be approximated numerically using a spherical grid and a Voronoi tessellation-based cubature. For a given spin system, the quality of simulated EPR spectra depends on the grid type, size, and orientation in the molecular frame. In previous work, the grids used in CW EPR powder simulations have been compared mainly from a geometric perspective. However, some grids with similar homogeneity degrees generate simulated spectra of different quality. This paper evaluates the grids from an EPR perspective, by defining two metrics that depend on the spin system characteristics and the grid's Voronoi tessellation. The first metric determines whether the grid points are EPR-centred in their Voronoi cells, based on the resonance magnetic field variations inside these cells. The second metric verifies whether the adjacent Voronoi cells of the tessellation are EPR-overlapping, by computing the common range of their resonance magnetic field intervals. Besides a series of well-known regular grids, the paper investigates a modified ZCW grid and a Fibonacci spherical code, which are new in the context of EPR simulations. For the investigated grids, the EPR metrics bring more information than the homogeneity quantities and are better related to the grids' EPR behaviour, for different spin system symmetries. The metrics' efficiency and limits are finally verified for grids generated from the initial ones, by using the original or magnetic-field-constrained variants of the Spherical Centroidal Voronoi Tessellation method. Copyright © 2014 Elsevier Inc. All rights reserved.
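
    The Fibonacci spherical code mentioned above admits a compact construction. The variant below (golden-angle azimuth with a stratified polar cosine) is a standard formulation and not necessarily the exact code investigated in the paper.

```python
import math

def fibonacci_sphere(n):
    """n near-uniform orientations on the unit sphere: a low-discrepancy
    grid commonly used for powder-average integrations."""
    golden = (1 + math.sqrt(5)) / 2
    points = []
    for i in range(n):
        z = 1 - (2 * i + 1) / n             # stratify the polar cosine
        phi = 2 * math.pi * i / golden      # golden-angle azimuthal spacing
        r = math.sqrt(1 - z * z)
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

pts = fibonacci_sphere(144)   # one orientation per powder-average node
```

    Geometric near-uniformity alone, as the paper argues, does not guarantee good EPR behaviour; that is what the two proposed EPR metrics test.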

  2. Quantifying seascape structure: Extending terrestrial spatial pattern metrics to the marine realm

    USGS Publications Warehouse

    Wedding, L.M.; Christopher, L.A.; Pittman, S.J.; Friedlander, A.M.; Jorgensen, S.

    2011-01-01

    Spatial pattern metrics have routinely been applied to characterize and quantify structural features of terrestrial landscapes and have demonstrated great utility in landscape ecology and conservation planning. The important role of spatial structure in ecology and management is now commonly recognized, and recent advances in marine remote sensing technology have facilitated the application of spatial pattern metrics to the marine environment. However, it is not yet clear whether concepts, metrics, and statistical techniques developed for terrestrial ecosystems are relevant for marine species and seascapes. To address this gap in our knowledge, we reviewed, synthesized, and evaluated the utility and application of spatial pattern metrics in the marine science literature over the past 30 yr (1980 to 2010). In total, 23 studies characterized seascape structure, of which 17 quantified spatial patterns using a 2-dimensional patch-mosaic model and 5 used a continuously varying 3-dimensional surface model. Most seascape studies followed terrestrial-based studies in their search for ecological patterns and applied or modified existing metrics. Only 1 truly unique metric was found (hydrodynamic aperture applied to Pacific atolls). While there are still relatively few studies using spatial pattern metrics in the marine environment, they have suffered from similar misuse as reported for terrestrial studies, such as the lack of a priori considerations or the problem of collinearity between metrics. Spatial pattern metrics offer great potential for ecological research and environmental management in marine systems, and future studies should focus on (1) the dynamic boundary between the land and sea; (2) quantifying 3-dimensional spatial patterns; and (3) assessing and monitoring seascape change. © Inter-Research 2011.

  3. Using iPads as a Data Collection Tool in Extension Programming Evaluation

    ERIC Educational Resources Information Center

    Rowntree, J. E.; Witman, R. R.; Lindquist, G. L.; Raven, M. R.

    2013-01-01

    Program evaluation is an important part of Extension, especially with the increased emphasis on metrics and accountability. Agents are often the point persons for evaluation data collection, and Web-based surveys are a commonly used tool. The iPad tablet with Internet access has the potential to be an effective survey tool. iPads were field tested…

  4. WE-AB-202-02: Incorporating Regional Ventilation Function in Predicting Radiation Fibrosis After Concurrent Chemoradiotherapy for Lung Cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lan, F; Jeudy, J; Tseng, H

    Purpose: To investigate the incorporation of pre-therapy regional ventilation function in predicting radiation fibrosis (RF) in stage III non-small-cell lung cancer (NSCLC) patients treated with concurrent thoracic chemoradiotherapy. Methods: 37 stage III NSCLC patients were retrospectively studied. Patients received one cycle of cisplatin-gemcitabine, followed by two to three cycles of cisplatin-etoposide concurrently with involved-field thoracic radiotherapy between 46 and 66 Gy (2 Gy per fraction). Pre-therapy regional ventilation images of the lung were derived from 4DCT via a density-change-based image registration algorithm with mass correction. RF was evaluated at 6 months post-treatment using radiographic scoring based on airway dilation and volume loss. Three types of ipsilateral lung metrics were studied: (1) conventional dose-volume metrics (V20, V30, V40, and mean lung dose (MLD)), (2) dose-function metrics (fV20, fV30, fV40, and functional mean lung dose (fMLD), generated by combining regional ventilation and dose), and (3) dose-subvolume metrics (sV20, sV30, sV40, and subvolume mean lung dose (sMLD), defined as the dose-volume metrics computed on the subvolume of the lung with at least 60% of the quantified maximum ventilation status). Receiver operating characteristic (ROC) curve analysis and logistic regression analysis were used to evaluate the predictability of these metrics for RF. Results: In predicting airway dilation, the area under the ROC curve (AUC) values for (V20, MLD), (fV20, fMLD), and (sV20, sMLD) were (0.76, 0.70), (0.80, 0.74) and (0.82, 0.80), respectively. The logistic regression p-values were (0.09, 0.18), (0.02, 0.05) and (0.004, 0.006), respectively. With regard to volume loss, the corresponding AUC values for these metrics were (0.66, 0.57), (0.67, 0.61) and (0.71, 0.69), and p-values were (0.95, 0.90), (0.43, 0.64) and (0.08, 0.12), respectively.
Conclusion: The inclusion of regional ventilation function improved the predictability of radiation fibrosis. Dose-subvolume metrics provide a promising method for incorporating functional information into conventional dose-volume parameters for outcome assessment.
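
    For a single scalar metric, the ROC analysis above reduces to the Mann-Whitney statistic: the AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. A self-contained sketch with invented dose-metric values (not data from this study):

```python
def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic; ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented V20-style values for patients with (1) / without (0) fibrosis
fibrosis = [1, 1, 1, 0, 0, 0, 0, 1]
v20      = [34.0, 28.5, 31.0, 22.0, 25.5, 19.0, 29.0, 26.0]
auc = roc_auc(fibrosis, v20)
```

    An AUC of 0.5 means the metric is uninformative; values approaching 1.0, as for the dose-subvolume metrics above, indicate strong separation.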

  5. Measuring scientific impact beyond academia: An assessment of existing impact metrics and proposed improvements.

    PubMed

    Ravenscroft, James; Liakata, Maria; Clare, Amanda; Duma, Daniel

    2017-01-01

    How does scientific research affect the world around us? Being able to answer this question is of great importance in order to appropriately channel efforts and resources in science. The impact by scientists in academia is currently measured by citation based metrics such as h-index, i-index and citation counts. These academic metrics aim to represent the dissemination of knowledge among scientists rather than the impact of the research on the wider world. In this work we are interested in measuring scientific impact beyond academia, on the economy, society, health and legislation (comprehensive impact). Indeed scientists are asked to demonstrate evidence of such comprehensive impact by authoring case studies in the context of the Research Excellence Framework (REF). We first investigate the extent to which existing citation based metrics can be indicative of comprehensive impact. We have collected all recent REF impact case studies from 2014 and we have linked these to papers in citation networks that we constructed and derived from CiteSeerX, arXiv and PubMed Central using a number of text processing and information retrieval techniques. We have demonstrated that existing citation-based metrics for impact measurement do not correlate well with REF impact results. We also consider metrics of online attention surrounding scientific works, such as those provided by the Altmetric API. We argue that in order to be able to evaluate wider non-academic impact we need to mine information from a much wider set of resources, including social media posts, press releases, news articles and political debates stemming from academic work. We also provide our data as a free and reusable collection for further analysis, including the PubMed citation network and the correspondence between REF case studies, grant applications and the academic literature.
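
    Of the citation-based metrics mentioned above, the h-index has a particularly simple definition: the largest h such that h of an author's papers have at least h citations each. A minimal sketch:

```python
def h_index(citations):
    """h-index: largest h such that h papers have >= h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(cites, start=1):
        if c >= i:
            h = i        # the i-th most-cited paper still has >= i citations
        else:
            break
    return h
```

    For citation counts [10, 8, 5, 4, 3] the h-index is 4: four papers have at least 4 citations, but not five papers with at least 5. As the paper argues, such purely citation-based quantities say little about impact outside academia.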

  6. Measuring scientific impact beyond academia: An assessment of existing impact metrics and proposed improvements

    PubMed Central

    Liakata, Maria; Clare, Amanda; Duma, Daniel

    2017-01-01

    How does scientific research affect the world around us? Being able to answer this question is of great importance in order to appropriately channel efforts and resources in science. The impact by scientists in academia is currently measured by citation based metrics such as h-index, i-index and citation counts. These academic metrics aim to represent the dissemination of knowledge among scientists rather than the impact of the research on the wider world. In this work we are interested in measuring scientific impact beyond academia, on the economy, society, health and legislation (comprehensive impact). Indeed scientists are asked to demonstrate evidence of such comprehensive impact by authoring case studies in the context of the Research Excellence Framework (REF). We first investigate the extent to which existing citation based metrics can be indicative of comprehensive impact. We have collected all recent REF impact case studies from 2014 and we have linked these to papers in citation networks that we constructed and derived from CiteSeerX, arXiv and PubMed Central using a number of text processing and information retrieval techniques. We have demonstrated that existing citation-based metrics for impact measurement do not correlate well with REF impact results. We also consider metrics of online attention surrounding scientific works, such as those provided by the Altmetric API. We argue that in order to be able to evaluate wider non-academic impact we need to mine information from a much wider set of resources, including social media posts, press releases, news articles and political debates stemming from academic work. We also provide our data as a free and reusable collection for further analysis, including the PubMed citation network and the correspondence between REF case studies, grant applications and the academic literature. PMID:28278243

  7. Advancing Efforts to Achieve Health Equity: Equity Metrics for Health Impact Assessment Practice

    PubMed Central

    Heller, Jonathan; Givens, Marjory L.; Yuen, Tina K.; Gould, Solange; Benkhalti Jandu, Maria; Bourcier, Emily; Choi, Tim

    2014-01-01

    Equity is a core value of Health Impact Assessment (HIA). Many compelling moral, economic, and health arguments exist for prioritizing and incorporating equity considerations in HIA practice. Decision-makers, stakeholders, and HIA practitioners see the value of HIAs in uncovering the impacts of policy and planning decisions on various population subgroups, developing and prioritizing specific actions that promote or protect health equity, and using the process to empower marginalized communities. Several HIA frameworks have been developed to guide the inclusion of equity considerations. However, the field lacks clear indicators for measuring whether an HIA advanced equity. This article describes the development of a set of equity metrics that aim to guide and evaluate progress toward equity in HIA practice. These metrics are also intended to push the field to deepen its practice of, and commitment to, equity in each phase of an HIA. Over the course of a year, the Society of Practitioners of Health Impact Assessment (SOPHIA) Equity Working Group took part in a consensus process to develop these process and outcome metrics. The metrics were piloted, reviewed, and refined based on feedback from reviewers. The Equity Metrics comprise 23 measures of equity organized into four outcomes: (1) the HIA process and products focused on equity; (2) the HIA process built the capacity and ability of communities facing health inequities to engage in future HIAs and in decision-making more generally; (3) the HIA resulted in a shift in power benefiting communities facing inequities; and (4) the HIA contributed to changes that reduced health inequities and inequities in the social and environmental determinants of health. Each metric comprises a measurement scale, examples of high-scoring activities, potential data sources, and example interview questions to gather data and guide evaluators in scoring the metric. PMID:25347193

  8. NASA Aviation Safety Program Systems Analysis/Program Assessment Metrics Review

    NASA Technical Reports Server (NTRS)

    Louis, Garrick E.; Anderson, Katherine; Ahmad, Tisan; Bouabid, Ali; Siriwardana, Maya; Guilbaud, Patrick

    2003-01-01

    The goal of this project is to evaluate the metrics and processes used by NASA's Aviation Safety Program in assessing technologies that contribute to NASA's aviation safety goals. There were three objectives for reaching this goal. First, NASA's main objectives for aviation safety were documented and checked for consistency against the main objectives of the Aviation Safety Program. Next, the metrics used for technology investment by the Program Assessment function of AvSP were evaluated. Finally, other metrics that could be used by the Program Assessment Team (PAT) were identified and evaluated. This investigation revealed that the objectives are in fact consistent across organizational levels at NASA and with the FAA. Some of the major issues discussed in this study, which should be further investigated, are the removal of the Cost and Return-on-Investment metrics, the lack of metrics to measure the balance between investment and technology, the interdependencies between some of the metric risk-driver categories, and the conflict between 'fatal accident rate' and 'accident rate' in the language of the Aviation Safety goal as stated in different sources.

  9. Evaluation of Two Crew Module Boilerplate Tests Using Newly Developed Calibration Metrics

    NASA Technical Reports Server (NTRS)

    Horta, Lucas G.; Reaves, Mercedes C.

    2012-01-01

    The paper discusses an application of multi-dimensional calibration metrics to evaluate pressure data from water drop tests of the Max Launch Abort System (MLAS) crew module boilerplate. Specifically, three metrics are discussed: 1) a metric to assess the probability of enveloping the measured data with the model, 2) a multi-dimensional orthogonality metric to assess model adequacy between test and analysis, and 3) a prediction error metric to conduct sensor placement so as to minimize pressure prediction errors. Data from similar (nearly repeated) capsule drop tests show significant variability in the measured pressure responses. When compared to the expected variability from model predictions, it is demonstrated that the measured variability cannot be explained by the model under the current uncertainty assumptions.

  10. Hyperbolic Harmonic Mapping for Surface Registration

    PubMed Central

    Shi, Rui; Zeng, Wei; Su, Zhengyu; Jiang, Jian; Damasio, Hanna; Lu, Zhonglin; Wang, Yalin; Yau, Shing-Tung; Gu, Xianfeng

    2016-01-01

    Automatic computation of surface correspondence via harmonic maps is an active research field in computer vision, computer graphics, and computational geometry. It may help document and understand physical and biological phenomena, and it also has broad applications in the biometrics, medical imaging, and motion capture industries. Although numerous studies have been devoted to harmonic map research, limited progress has been made in computing a diffeomorphic harmonic map on general topology surfaces with landmark constraints. This work addresses the problem by changing the Riemannian metric on the target surface to a hyperbolic metric, so that the harmonic mapping is guaranteed to be a diffeomorphism under landmark constraints. The computational algorithms are based on Ricci flow and nonlinear heat diffusion methods. The approach is general and robust. We employ our algorithm to study the constrained surface registration problem, which applies to both computer vision and medical imaging applications. Experimental results demonstrate that, by changing the Riemannian metric, the registrations are always diffeomorphic and achieve relatively high performance when evaluated with some popular surface registration evaluation standards. PMID:27187948

  11. Using Algal Metrics and Biomass to Evaluate Multiple Ways of Defining Concentration-Based Nutrient Criteria in Streams and their Ecological Relevance

    EPA Science Inventory

    We examined the utility of nutrient criteria derived solely from total phosphorus (TP) concentrations in streams (regression models and percentile distributions) and evaluated their ecological relevance to diatom and algal biomass responses. We used a variety of statistics to cha...

  12. Subjective Performance Evaluation in the Public Sector: Evidence from School Inspections. CEE DP 135

    ERIC Educational Resources Information Center

    Hussain, Iftikhar

    2012-01-01

    Performance measurement in the public sector is largely based on objective metrics, which may be subject to gaming behaviour. This paper investigates a novel subjective performance evaluation system where independent inspectors visit schools at very short notice, publicly disclose their findings and sanction schools rated fail. First, I…

  13. Commentary on New Metrics, Measures, and Uses for Fluency Data

    ERIC Educational Resources Information Center

    Christ, Theodore J.; Ardoin, Scott P.

    2015-01-01

    Fluency and rate-based assessments, such as curriculum-based measurement, are frequently used to screen and evaluate student progress. The application of such measures are especially prevalent within special education and response to intervention models of prevention and early intervention. Although there is an extensive research and professional…

  14. Perceived crosstalk assessment on patterned retarder 3D display

    NASA Astrophysics Data System (ADS)

    Zou, Bochao; Liu, Yue; Huang, Yi; Wang, Yongtian

    2014-03-01

    CONTEXT: Nowadays, almost all stereoscopic displays suffer from crosstalk, which is one of the most dominant degradation factors of image quality and visual comfort for 3D display devices. To deal with such problems, it is worthwhile to quantify the amount of perceived crosstalk. OBJECTIVE: Crosstalk measurements are usually based on certain test patterns, but scene content effects are ignored. To evaluate the perceived crosstalk level for various scenes, a subjective test may give a more correct evaluation. However, it is a time-consuming approach and is unsuitable for real-time applications. Therefore, an objective metric that can reliably predict the perceived crosstalk is needed. A correct objective assessment of crosstalk for different scene contents would benefit the development of crosstalk minimization and cancellation algorithms, which could be used to bring a good quality of experience to viewers. METHOD: A patterned retarder 3D display is used to present 3D images in our experiment. By considering the mechanism of this kind of device, an appropriate simulation of crosstalk is realized by image processing techniques to assign different levels of crosstalk between image pairs. It can be seen from the literature that the structures of scenes have a significant impact on the perceived crosstalk, so we first extract the differences in structural information between original and distorted image pairs through the Structural SIMilarity (SSIM) algorithm, which can directly evaluate the structural changes between two complex-structured signals. The structural changes of the left view and the right view are then computed separately and combined into an overall distortion map. Under 3D viewing conditions, because of the added value of depth, the crosstalk of pop-out objects may be more perceptible. To model this effect, the depth map of a stereo pair is generated and the depth information is filtered by the distortion map.
Moreover, human attention is one of important factors for crosstalk assessment due to the fact that when viewing 3D contents, perceptual salient regions are highly likely to be a major contributor to determining the quality of experience of 3D contents. To take this into account, perceptual significant regions are extracted, and a spatial pooling technique is used to combine structural distortion map, depth map and visual salience map together to predict the perceived crosstalk more precisely. To verify the performance of the proposed crosstalk assessment metric, subjective experiments are conducted with 24 participants viewing and rating 60 simuli (5 scenes * 4 crosstalk levels * 3 camera distances). After an outliers removal and statistical process, the correlation with subjective test is examined using Pearson and Spearman rank-order correlation coefficient. Furthermore, the proposed method is also compared with two traditional 2D metrics, PSNR and SSIM. The objective score is mapped to subjective scale using a nonlinear fitting function to directly evaluate the performance of the metric. RESULIS: After the above-mentioned processes, the evaluation results demonstrate that the proposed metric is highly correlated with the subjective score when compared with the existing approaches. Because the Pearson coefficient of the proposed metric is 90.3%, it is promising for objective evaluation of the perceived crosstalk. NOVELTY: The main goal of our paper is to introduce an objective metric for stereo crosstalk assessment. The novelty contributions are twofold. First, an appropriate simulation of crosstalk by considering the characteristics of patterned retarder 3D display is developed. Second, an objective crosstalk metric based on visual attention model is introduced.
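
    The spatial pooling step described above can be sketched as follows. The combination rule (weighting the structural distortion map by the depth and salience maps, then normalized summation) and all array values are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def pooled_crosstalk_score(distortion, depth, saliency, eps=1e-8):
    """Weighted spatial pooling of a structural distortion map.

    All inputs are 2-D arrays of the same shape with values in [0, 1];
    salient, pop-out (large-depth) regions get larger pooling weights.
    """
    weight = saliency * depth
    weight = weight / (weight.sum() + eps)
    return float((distortion * weight).sum())

# Toy maps: distortion concentrated where the scene is salient and pops out.
d = np.zeros((4, 4)); d[1:3, 1:3] = 0.8   # structural distortion map
s = np.full((4, 4), 0.1); s[1:3, 1:3] = 1.0   # visual salience map
z = np.full((4, 4), 0.5); z[1:3, 1:3] = 1.0   # normalized depth map
score = pooled_crosstalk_score(d, s, z)
```

    Because the distorted region coincides with the salient, near-viewer region, the pooled score comes out well above the plain mean of the distortion map.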

  15. [Applicability of traditional landscape metrics in evaluating urban heat island effect].

    PubMed

    Chen, Ai-Lian; Sun, Ran-Hao; Chen, Li-Ding

    2012-08-01

    By using 24 landscape metrics, this paper evaluated the urban heat island effect in parts of Beijing's downtown area. QuickBird (QB) images were used to extract landscape type information, and the thermal bands from Landsat Enhanced Thematic Mapper Plus (ETM+) images were used to extract the land surface temperature (LST) in four seasons of the same year. The 24 landscape pattern metrics were calculated at landscape and class levels in a fixed 120 m × 120 m window, and the applicability of these traditional landscape metrics in evaluating the urban heat island effect was examined. Among the 24 landscape metrics, only the percentage composition of landscape (PLAND), patch density (PD), largest patch index (LPI), coefficient of Euclidean nearest-neighbor distance variance (ENN_CV), and landscape division index (DIVISION) at the landscape level were significantly correlated with the LST in March, May, and November, while the PLAND, LPI, DIVISION, percentage of like adjacencies, and interspersion and juxtaposition index at the class level showed significant correlations with the LST in March, May, July, and December, especially in July. Some metrics, such as PD, edge density, clumpiness index, patch cohesion index, effective mesh size, splitting index, aggregation index, and normalized landscape shape index, showed varying correlations with the LST at different class levels. The traditional landscape metrics were not appropriate for evaluating the effects of the river on LST; some of the metrics could be useful in characterizing urban LST and analyzing the urban heat island effect, but they should be screened and examined before use.
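
    As an illustration of how a class-level metric such as PLAND can be related to window LST, the sketch below computes PLAND on toy landscape grids and correlates it with synthetic window temperatures; the grids, class coding, and LST model are invented for the example:

```python
import numpy as np

def pland(window, cls):
    """Percentage of Landscape (PLAND): share of window cells in class `cls`."""
    w = np.asarray(window)
    return 100.0 * (w == cls).mean()

rng = np.random.default_rng(7)
plands, lsts = [], []
for _ in range(50):  # 50 hypothetical 120 m x 120 m windows
    frac = rng.uniform(0.1, 0.9)
    grid = (rng.random((12, 12)) < frac).astype(int)  # class 1 = impervious surface
    plands.append(pland(grid, 1))
    lsts.append(25.0 + 0.1 * plands[-1] + rng.normal(0.0, 0.5))  # synthetic LST
r = np.corrcoef(plands, lsts)[0, 1]  # Pearson correlation of PLAND with LST
```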

  16. A Survey and Analysis of Aircraft Maintenance Metrics: A Balanced Scorecard Approach

    DTIC Science & Technology

    2014-03-27

    Metrics Set Theory/Framework … Balanced Scorecard overview … a useful form. Figure 1: Metric Evaluation Criteria (Caplice & Sheffi, 1994, p. 14). Metrics Set Theory/Framework: The researcher included an examination of established theory and frameworks on how metrics sets are constructed in the literature review. The purpose of this examination was to…

  17. Multiscale entropy-based methods for heart rate variability complexity analysis

    NASA Astrophysics Data System (ADS)

    Silva, Luiz Eduardo Virgilio; Cabella, Brenno Caetano Troca; Neves, Ubiraci Pereira da Costa; Murta Junior, Luiz Otavio

    2015-03-01

    Physiologic complexity is an important concept for characterizing time series from biological systems, and combined with multiscale analysis it can contribute to the comprehension of many complex phenomena. Although multiscale entropy has been applied to physiological time series, it measures irregularity as a function of scale. In this study we propose and evaluate a set of three complexity metrics as a function of time scale. The complexity metrics are derived from nonadditive entropy supported by the generation of surrogate data, i.e. SDiffqmax, qmax, and qzero. In order to assess the accuracy of the proposed complexity metrics, receiver operating characteristic (ROC) curves were built and the areas under the curves were computed for three physiological situations. Heart rate variability (HRV) time series from normal sinus rhythm, atrial fibrillation, and congestive heart failure datasets were analyzed. Results show that the proposed complexity metrics are accurate and robust when compared to classic entropic irregularity metrics. Furthermore, SDiffqmax is the most accurate at lower scales, whereas qmax and qzero are the most accurate when higher time scales are considered. The multiscale complexity analysis described here shows potential for assessing complex physiological time series and deserves further investigation in a wider context.
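
    A minimal sketch of the multiscale procedure, using classic sample entropy on coarse-grained series rather than the nonadditive-entropy metrics (SDiffqmax, qmax, qzero) proposed here; the parameter choices are conventional defaults, not the study's:

```python
import numpy as np

def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale` (multiscale step)."""
    n = len(x) // scale
    return x[:n * scale].reshape(n, scale).mean(axis=1)

def sample_entropy(x, m=2, r_frac=0.2):
    """Classic SampEn(m, r): -ln(A/B) with tolerance r = r_frac * std(x)."""
    x = np.asarray(x, dtype=float)
    r = r_frac * x.std()
    def count(mm):
        templ = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        d = np.max(np.abs(templ[:, None, :] - templ[None, :, :]), axis=2)
        return ((d <= r).sum() - len(templ)) / 2  # exclude self-matches
    B, A = count(m), count(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

rng = np.random.default_rng(0)
noise = rng.standard_normal(1000)
mse = [sample_entropy(coarse_grain(noise, s)) for s in (1, 2, 4)]
```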

  18. Noisy EEG signals classification based on entropy metrics. Performance assessment using first and second generation statistics.

    PubMed

    Cuesta-Frau, David; Miró-Martínez, Pau; Jordán Núñez, Jorge; Oltra-Crespo, Sandra; Molina Picó, Antonio

    2017-08-01

    This paper evaluates the performance of first-generation entropy metrics, represented by the well-known and widely used Approximate Entropy (ApEn) and Sample Entropy (SampEn) metrics, and of what can be considered an evolution of these, Fuzzy Entropy (FuzzyEn), in the context of Electroencephalogram (EEG) signal classification. The study uses the commonest artifacts found in real EEGs, such as white noise and muscular, cardiac, and ocular artifacts. Using two different sets of publicly available EEG records and a realistic range of amplitudes for interfering artifacts, this work optimises and assesses the robustness of these metrics against artifacts in terms of class segmentation probability. The results show that the qualitative behaviour of the two datasets is similar, with SampEn and FuzzyEn performing best, and that noise and muscular artifacts are the most confounding factors. In contrast, there is wide variability with regard to initialization parameters. The poor performance achieved by ApEn suggests that this metric should not be used in these contexts. Copyright © 2017 Elsevier Ltd. All rights reserved.
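
    A rough sketch of Fuzzy Entropy, which replaces SampEn's hard tolerance threshold with an exponential membership function; the fuzzy power, tolerance, and test signals below are illustrative choices, not the optimized values from the study:

```python
import numpy as np

def fuzzy_entropy(x, m=2, r_frac=0.2):
    """FuzzyEn sketch: like SampEn, but template similarity is graded by an
    exponential membership exp(-d^2 / r) instead of a hard threshold."""
    x = np.asarray(x, dtype=float)
    r = r_frac * x.std()
    def phi(mm):
        t = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        t = t - t.mean(axis=1, keepdims=True)      # remove local baseline
        d = np.max(np.abs(t[:, None, :] - t[None, :, :]), axis=2)
        sim = np.exp(-(d ** 2) / r)
        return (sim.sum() - len(t)) / (len(t) * (len(t) - 1))
    return -np.log(phi(m + 1) / phi(m))

rng = np.random.default_rng(5)
clean = np.sin(np.linspace(0, 20 * np.pi, 1000))   # regular signal
noisy = clean + rng.normal(0.0, 0.5, 1000)         # artifact-corrupted signal
e_clean, e_noisy = fuzzy_entropy(clean), fuzzy_entropy(noisy)
```

    As expected, the corrupted signal yields a higher entropy value than the clean one.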

  19. Stability and Performance Metrics for Adaptive Flight Control

    NASA Technical Reports Server (NTRS)

    Stepanyan, Vahram; Krishnakumar, Kalmanje; Nguyen, Nhan; VanEykeren, Luarens

    2009-01-01

    This paper addresses the problem of verifying adaptive control techniques for enabling safe flight in the presence of adverse conditions. Since adaptive systems are nonlinear by design, the existing control verification metrics are not applicable to adaptive controllers. Moreover, these systems are in general highly uncertain, so their characteristics cannot be evaluated by relying on the available dynamical models. This necessitates the development of control verification metrics based on the system's input-output information. From this point of view, a set of metrics is introduced that compares the uncertain aircraft's input-output behavior under the action of an adaptive controller to that of a closed-loop linear reference model to be followed by the aircraft. This reference model is constructed for each specific maneuver using the exact aerodynamic and mass properties of the aircraft to meet the stability and performance requirements commonly accepted in flight control. The proposed metrics are unified in the sense that they are model-independent and not restricted to any specific adaptive control method. As an example, we present simulation results for a wing-damaged generic transport aircraft with several existing adaptive controllers.
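
    The input-output comparison idea can be illustrated with a toy metric scoring how closely a response tracks its reference model; the normalization below is an assumption for illustration, not one of the paper's metrics:

```python
import numpy as np

def tracking_metric(y_actual, y_ref, eps=1e-12):
    """Normalized output-error score in (0, 1]; 1 means perfect tracking
    of the closed-loop reference model."""
    err = np.linalg.norm(y_actual - y_ref)
    return 1.0 / (1.0 + err / (np.linalg.norm(y_ref) + eps))

t = np.linspace(0.0, 5.0, 501)
y_ref = 1 - np.exp(-t)          # first-order reference-model step response
y_act = 1 - np.exp(-0.9 * t)    # slightly slower response under adaptation
m = tracking_metric(y_act, y_ref)
```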

  20. Metrics, The Measure of Your Future: Evaluation Report, 1977.

    ERIC Educational Resources Information Center

    North Carolina State Dept. of Public Instruction, Raleigh. Div. of Development.

    The primary goal of the Metric Education Project was the systematic development of a replicable educational model to facilitate the system-wide conversion to the metric system during the next five to ten years. This document is an evaluation of that project. Three sets of statistical evidence exist to support the fact that the project has been…

  1. In the pursuit of a semantic similarity metric based on UMLS annotations for articles in PubMed Central Open Access.

    PubMed

    Garcia Castro, Leyla Jael; Berlanga, Rafael; Garcia, Alexander

    2015-10-01

    Although full-text articles are provided by publishers in electronic formats, it remains a challenge to find related work beyond the title and abstract context. Identifying related articles based on their abstracts is indeed a good starting point; this process is straightforward and does not consume as many resources as full-text-based similarity would require. However, further analyses may require an in-depth understanding of the full content, and two articles with highly related abstracts can differ substantially in their full content. How similarity differs when considering title-and-abstract versus full-text, and which semantic similarity metric provides better results when dealing with full-text articles, are the main issues addressed in this manuscript. We benchmarked three similarity metrics, BM25, PMRA, and Cosine, in order to determine which performs best when using concept-based annotations on full-text documents. We also evaluated variations in similarity values based on title-and-abstract against those relying on full-text. Our test dataset comprises the Genomics track article collection from the 2005 Text Retrieval Conference. Initially, we used entity recognition software to semantically annotate titles and abstracts as well as full text with concepts defined in the Unified Medical Language System (UMLS®). For each article, we created a document profile, i.e., a set of identified concepts, term frequency, and inverse document frequency; we then applied the similarity metrics to those document profiles. We considered correlation, precision, recall, and F1 in order to determine which similarity metric performs best with concept-based annotations. For those full-text articles available in PubMed Central Open Access (PMC-OA), we also performed dispersion analyses in order to understand how similarity varies when considering full-text articles.
    We found that the PubMed Related Articles similarity metric is the most suitable for full-text articles annotated with UMLS concepts. For similarity values above 0.8, all metrics exhibited an F1 around 0.2 and a recall around 0.1; BM25 showed the highest precision, close to 1; in all cases the concept-based metrics performed better than the word-stem-based one. Our experiments show that similarity values vary when considering title-and-abstract versus full-text similarity. Therefore, analyses based on full text become useful when a given research question requires going beyond the title and abstract, particularly regarding connectivity across articles. Visualization available at ljgarcia.github.io/semsim.benchmark/, data available at http://dx.doi.org/10.5281/zenodo.13323. Copyright © 2015 Elsevier Inc. All rights reserved.
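
    One step of such a benchmark can be sketched by building tf-idf document profiles over concept annotations and scoring them with cosine similarity; the concept IDs below are illustrative UMLS-style identifiers:

```python
import math
from collections import Counter

def tfidf_profiles(docs):
    """docs: list of concept-ID lists -> list of {concept: tf-idf weight} dicts."""
    n = len(docs)
    df = Counter(c for d in docs for c in set(d))
    return [{c: tf * math.log(n / df[c]) for c, tf in Counter(d).items()}
            for d in docs]

def cosine(p, q):
    """Cosine similarity between two sparse tf-idf profiles."""
    dot = sum(w * q.get(c, 0.0) for c, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

docs = [["C0018787", "C0027051", "C0018787"],   # illustrative annotations
        ["C0018787", "C0018787", "C0007222"],
        ["C0006104", "C0006104"]]
profs = tfidf_profiles(docs)
sim01, sim02 = cosine(profs[0], profs[1]), cosine(profs[0], profs[2])
```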

  2. SU-E-T-789: Validation of 3DVH Accuracy On Quantifying Delivery Errors Based On Clinical Relevant DVH Metrics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ma, T; Kumaraswamy, L

    Purpose: Detection of treatment delivery errors is important in radiation therapy; however, accurate quantification of delivery errors is also of great importance. This study aims to evaluate the 3DVH software's ability to accurately quantify delivery errors. Methods: Three VMAT plans (prostate, H&N and brain) were randomly chosen for this study. First, we evaluated whether delivery errors could be detected by gamma evaluation. Conventional per-beam IMRT QA was performed with the ArcCHECK diode detector for the original plans and for the following modified plans: (1) induced dose difference errors up to ±4.0%, (2) control point (CP) deletion (3 to 10 CPs were deleted), and (3) gantry angle shift error (3-degree uniform shift). 2D and 3D gamma evaluations were performed for all plans through SNC Patient and 3DVH, respectively. Subsequently, we investigated the accuracy of 3DVH analysis for all cases. This part evaluated, using the Eclipse TPS plans as the standard, whether 3DVH can accurately model the changes in clinically relevant metrics caused by the delivery errors. Results: 2D evaluation seemed to be more sensitive to delivery errors. The average differences between Eclipse-predicted and 3DVH results for each pair of specific DVH constraints were within 2% for all three types of error-induced treatment plans, illustrating that 3DVH is fairly accurate in quantifying the delivery errors. Another interesting observation was that even though the gamma pass rates for the error plans were high, the DVHs showed significant differences between the original plan and the error-induced plans in both Eclipse and 3DVH analysis. Conclusion: The 3DVH software is shown to accurately quantify the error in delivered dose based on clinically relevant DVH metrics, where conventional gamma-based pre-treatment QA might not necessarily detect it.
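
    As an example of the kind of clinically relevant DVH metric compared here, the sketch below computes D95 (the minimum dose received by the hottest 95% of a structure's volume) from a synthetic voxel dose distribution and shows how a 2% dose delivery error shifts it; the dose values and error magnitude are invented:

```python
import numpy as np

def dvh_metric_d(dose_voxels, pct):
    """D_pct: minimum dose received by the hottest pct% of the structure."""
    d = np.sort(np.asarray(dose_voxels, dtype=float))[::-1]
    k = max(1, int(round(pct / 100.0 * len(d))))
    return d[k - 1]

# Toy target-volume dose distribution around a 70 Gy prescription.
rng = np.random.default_rng(1)
ptv = rng.normal(70.0, 1.5, 10_000)
d95 = dvh_metric_d(ptv, 95)                       # coverage metric (Gy)
delta = abs(d95 - dvh_metric_d(ptv * 1.02, 95))   # shift from a +2% dose error
```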

  3. Curriculum-Based Measures in Writing: A School-Based Evaluation of Predictive Validity

    ERIC Educational Resources Information Center

    Terenzi, Christina M.

    2009-01-01

    Recent research in the area of Curriculum-Based Measures (CBM) in writing has shown that traditionally used metrics, such as total words written and total words correct, may not be the best tools for measuring writing performance, for both secondary and elementary aged children (e.g., Gansle, Noell, VanDerHeyden, Naquin, & Slider, 2002; Tindal…

  4. Validation of sea ice models using an uncertainty-based distance metric for multiple model variables: NEW METRIC FOR SEA ICE MODEL VALIDATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Urrego-Blanco, Jorge R.; Hunke, Elizabeth C.; Urban, Nathan M.

    Here, we implement a variance-based distance metric (Dn) to objectively assess the skill of sea ice models when multiple output variables or uncertainties in both model predictions and observations need to be considered. The metric compares observation and model data pairs on common spatial and temporal grids, improving upon highly aggregated metrics (e.g., total sea ice extent or volume) by capturing the spatial character of model skill. The Dn metric is a gamma-distributed statistic that is more general than the χ2 statistic commonly used to assess model fit, which requires the assumption that the model is unbiased and can only incorporate observational error in the analysis. The Dn statistic does not assume that the model is unbiased, and allows the incorporation of multiple observational data sets for the same variable, and simultaneously for different variables, along with different types of variances that can characterize uncertainties in both observations and the model. This approach represents a step toward establishing a systematic framework for probabilistic validation of sea ice models. The methodology is also useful for model tuning, by using the Dn metric as a cost function and incorporating model parametric uncertainty as part of a scheme to optimize model functionality. We apply this approach to evaluate different configurations of the standalone Los Alamos sea ice model (CICE) encompassing the parametric uncertainty in the model, and to find new sets of model configurations that produce better agreement than previous configurations between model and observational estimates of sea ice concentration and thickness.
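
    A simplified per-gridpoint sketch of such a variance-weighted distance; the paper's exact Dn definition may differ, and the form below merely illustrates weighting squared model-observation differences by the sum of the model and observational variances:

```python
import numpy as np

def dn_metric(model, obs, var_model, var_obs):
    """Variance-weighted mean squared distance between model and observation
    fields; unlike chi-square, both model and observational variances enter
    the weighting."""
    m, o = np.asarray(model, float), np.asarray(obs, float)
    w = np.asarray(var_model, float) + np.asarray(var_obs, float)
    return float(np.mean((m - o) ** 2 / w))

# Toy sea-ice concentration fields on a common grid.
obs = np.array([0.9, 0.8, 0.5, 0.2])
good = dn_metric([0.88, 0.82, 0.48, 0.25], obs, 0.01, 0.01)  # skillful config
bad = dn_metric([0.60, 0.50, 0.90, 0.60], obs, 0.01, 0.01)   # poor config
```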

  5. Estimating uncertainty in ambient and saturation nutrient uptake metrics from nutrient pulse releases in stream ecosystems

    DOE PAGES

    Brooks, Scott C.; Brandt, Craig C.; Griffiths, Natalie A.

    2016-10-07

    Nutrient spiraling is an important ecosystem process characterizing nutrient transport and uptake in streams. Various nutrient addition methods are used to estimate uptake metrics; however, uncertainty in the metrics is not often evaluated. A method was developed to quantify uncertainty in ambient and saturation nutrient uptake metrics estimated from saturating pulse nutrient additions (Tracer Additions for Spiraling Curve Characterization; TASCC). Using a Monte Carlo (MC) approach, the 95% confidence interval (CI) was estimated for ambient uptake lengths (Sw-amb) and maximum areal uptake rates (Umax) based on 100,000 datasets generated from each of four nitrogen and five phosphorus TASCC experiments conducted seasonally in a forest stream in eastern Tennessee, U.S.A. Uncertainty estimates from the MC approach were compared to the CIs estimated from the ordinary least squares (OLS) and non-linear least squares (NLS) models used to calculate Sw-amb and Umax, respectively, from the TASCC method. The CIs for Sw-amb and Umax were large, but were not consistently larger using the MC method. Despite the large CIs, significant differences (based on nonoverlapping CIs) in nutrient metrics among seasons were found, with more significant differences using the OLS/NLS vs. the MC method. Lastly, we suggest that the MC approach is a robust way to estimate uncertainty, as the calculation of Sw-amb and Umax violates assumptions of OLS/NLS while the MC approach is free of these assumptions. The MC approach can be applied to other ecosystem metrics that are calculated from multiple parameters, providing a more robust estimate of these metrics and their associated uncertainties.
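
    The MC idea can be sketched with a linear uptake model: perturb the observations by their assumed measurement error, refit each draw, and take percentile confidence intervals; the data and error model below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(42)
conc = np.linspace(1.0, 20.0, 15)             # added nutrient concentration
true_slope, true_icept = 0.8, 2.0
uptake = true_icept + true_slope * conc + rng.normal(0.0, 0.5, conc.size)

# Monte Carlo CI: perturb observations by their assumed error and refit.
slopes = []
for _ in range(10_000):
    pert = uptake + rng.normal(0.0, 0.5, uptake.size)
    slopes.append(np.polyfit(conc, pert, 1)[0])
lo, hi = np.percentile(slopes, [2.5, 97.5])   # 95% CI on the uptake slope
```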

  6. Validation of sea ice models using an uncertainty-based distance metric for multiple model variables: NEW METRIC FOR SEA ICE MODEL VALIDATION

    DOE PAGES

    Urrego-Blanco, Jorge R.; Hunke, Elizabeth C.; Urban, Nathan M.; ...

    2017-04-01

    Here, we implement a variance-based distance metric (Dn) to objectively assess the skill of sea ice models when multiple output variables or uncertainties in both model predictions and observations need to be considered. The metric compares observation and model data pairs on common spatial and temporal grids, improving upon highly aggregated metrics (e.g., total sea ice extent or volume) by capturing the spatial character of model skill. The Dn metric is a gamma-distributed statistic that is more general than the χ2 statistic commonly used to assess model fit, which requires the assumption that the model is unbiased and can only incorporate observational error in the analysis. The Dn statistic does not assume that the model is unbiased, and allows the incorporation of multiple observational data sets for the same variable, and simultaneously for different variables, along with different types of variances that can characterize uncertainties in both observations and the model. This approach represents a step toward establishing a systematic framework for probabilistic validation of sea ice models. The methodology is also useful for model tuning, by using the Dn metric as a cost function and incorporating model parametric uncertainty as part of a scheme to optimize model functionality. We apply this approach to evaluate different configurations of the standalone Los Alamos sea ice model (CICE) encompassing the parametric uncertainty in the model, and to find new sets of model configurations that produce better agreement than previous configurations between model and observational estimates of sea ice concentration and thickness.

  7. Estimating uncertainty in ambient and saturation nutrient uptake metrics from nutrient pulse releases in stream ecosystems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brooks, Scott C.; Brandt, Craig C.; Griffiths, Natalie A.

    Nutrient spiraling is an important ecosystem process characterizing nutrient transport and uptake in streams. Various nutrient addition methods are used to estimate uptake metrics; however, uncertainty in the metrics is not often evaluated. A method was developed to quantify uncertainty in ambient and saturation nutrient uptake metrics estimated from saturating pulse nutrient additions (Tracer Additions for Spiraling Curve Characterization; TASCC). Using a Monte Carlo (MC) approach, the 95% confidence interval (CI) was estimated for ambient uptake lengths (Sw-amb) and maximum areal uptake rates (Umax) based on 100,000 datasets generated from each of four nitrogen and five phosphorus TASCC experiments conducted seasonally in a forest stream in eastern Tennessee, U.S.A. Uncertainty estimates from the MC approach were compared to the CIs estimated from the ordinary least squares (OLS) and non-linear least squares (NLS) models used to calculate Sw-amb and Umax, respectively, from the TASCC method. The CIs for Sw-amb and Umax were large, but were not consistently larger using the MC method. Despite the large CIs, significant differences (based on nonoverlapping CIs) in nutrient metrics among seasons were found, with more significant differences using the OLS/NLS vs. the MC method. Lastly, we suggest that the MC approach is a robust way to estimate uncertainty, as the calculation of Sw-amb and Umax violates assumptions of OLS/NLS while the MC approach is free of these assumptions. The MC approach can be applied to other ecosystem metrics that are calculated from multiple parameters, providing a more robust estimate of these metrics and their associated uncertainties.

  8. Braided river flow and invasive vegetation dynamics in the Southern Alps, New Zealand.

    PubMed

    Caruso, Brian S; Edmondson, Laura; Pithie, Callum

    2013-07-01

    In mountain braided rivers, extreme flow variability, floods, and high flow pulses are fundamental elements of natural flow regimes and drivers of floodplain processes, an understanding of which is essential for management and restoration. This study evaluated flow dynamics and invasive vegetation characteristics and changes in the Ahuriri River, a free-flowing braided, gravel-bed river in the Southern Alps of New Zealand's South Island. Sixty-seven flow metrics based on indicators of hydrologic alteration and environmental flow components (extreme low flows, low flows, high flow pulses, small floods, and large floods) were analyzed using a 48-year flow record. Changes in the areal cover of floodplain and invasive vegetation classes and patch characteristics over 20 years (1991-2011) were quantified using five sets of aerial photographs, and the correlations between flow metrics and cover changes were evaluated. The river exhibits considerable hydrologic variability characteristic of mountain braided rivers, with large variation in floods and other flow regime metrics. The flow regime, including flood and high flow pulses, has variable effects on floodplain invasive vegetation and creates dynamic patch mosaics that demonstrate the concepts of a shifting mosaic steady state and biogeomorphic succession. As much as 25% of the vegetation cover was removed by the largest flood on record (570 m³/s, ~50-year return period), with preferential removal of lupin and less removal of willow. However, most of the vegetation regenerated and spread relatively quickly after floods. Some of the flow metrics analyzed were highly correlated with vegetation cover; key metrics included the peak magnitude of the largest flood, flood frequency, and time since the last flood in the interval between photos. These metrics provided a simple multiple regression model of invasive vegetation cover in the aerial photos evaluated.
    Our analysis of the relationships among flow regimes and invasive vegetation cover has implications for braided rivers impacted by hydroelectric power production, where increases in invasive vegetation cover are typically greater than in unimpacted rivers.

  9. Metric Evaluation Pipeline for 3d Modeling of Urban Scenes

    NASA Astrophysics Data System (ADS)

    Bosch, M.; Leichtman, A.; Chilcott, D.; Goldberg, H.; Brown, M.

    2017-05-01

    Publicly available benchmark data and metric evaluation approaches have been instrumental in enabling research to advance state-of-the-art methods for remote sensing applications in urban 3D modeling. Most publicly available benchmark datasets have consisted of high-resolution airborne imagery and lidar suitable for 3D modeling on a relatively modest scale. To enable research in larger-scale 3D mapping, we have recently released a public benchmark dataset with multi-view commercial satellite imagery and metrics to compare 3D point clouds with lidar ground truth. We now define a more complete metric evaluation pipeline, developed as publicly available open source software, to assess semantically labeled 3D models of complex urban scenes derived from multi-view commercial satellite imagery. Evaluation metrics in our pipeline include horizontal and vertical accuracy and completeness, volumetric completeness and correctness, perceptual quality, and model simplicity. Sources of ground truth include airborne lidar and overhead imagery, and we demonstrate a semi-automated process for producing accurate ground truth shape files to characterize building footprints. We validate our current metric evaluation pipeline using 3D models produced with open source multi-view stereo methods. Data and software are made publicly available to enable further research and planned benchmarking activities.
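
    Volumetric completeness and correctness can be sketched on voxelized models as the recall and precision of occupied voxels; this is a simplified stand-in for the pipeline's actual metric definitions:

```python
import numpy as np

def completeness_correctness(pred, truth):
    """Voxelized model vs. ground truth (boolean arrays of equal shape).
    completeness = recall of truth voxels; correctness = precision of
    model voxels."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    inter = np.logical_and(pred, truth).sum()
    return inter / truth.sum(), inter / pred.sum()

truth = np.zeros((10, 10, 10), bool); truth[2:8, 2:8, 2:8] = True  # 216 voxels
pred = np.zeros_like(truth); pred[3:9, 3:9, 2:8] = True            # shifted model
comp, corr = completeness_correctness(pred, truth)
```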

  10. Integrated framework for developing search and discrimination metrics

    NASA Astrophysics Data System (ADS)

    Copeland, Anthony C.; Trivedi, Mohan M.

    1997-06-01

    This paper presents an experimental framework for evaluating target signature metrics as models of human visual search and discrimination. This framework is based on a prototype eye tracking testbed, the Integrated Testbed for Eye Movement Studies (ITEMS). ITEMS determines an observer's visual fixation point while he studies a displayed image scene, by processing video of the observer's eye. The utility of this framework is illustrated with an experiment using gray-scale images of outdoor scenes that contain randomly placed targets. Each target is a square region of a specific size containing pixel values from another image of an outdoor scene. The real-world analogy of this experiment is that of a military observer looking upon the sensed image of a static scene to find camouflaged enemy targets that are reported to be in the area. ITEMS provides the data necessary to compute various statistics for each target to describe how easily the observers located it, including the likelihood the target was fixated or identified and the time required to do so. The computed values of several target signature metrics are compared to these statistics, and a second-order metric based on a model of image texture was found to be the most highly correlated.

  11. Degraded visual environment image/video quality metrics

    NASA Astrophysics Data System (ADS)

    Baumgartner, Dustin D.; Brown, Jeremy B.; Jacobs, Eddie L.; Schachter, Bruce J.

    2014-06-01

    A number of image quality metrics (IQMs) and video quality metrics (VQMs) have been proposed in the literature for evaluating techniques and systems for mitigating degraded visual environments. Some require both pristine and corrupted imagery. Others require patterned target boards in the scene. None of these metrics relates well to the task of landing a helicopter in conditions such as a brownout dust cloud. We have developed and used a variety of IQMs and VQMs related to the pilot's ability to detect hazards in the scene and to maintain situational awareness. Some of these metrics can be made agnostic to sensor type. Not only are the metrics suitable for evaluating algorithm and sensor variation, they are also suitable for choosing the most cost-effective solution to improve operating conditions in degraded visual environments.

  12. Application of constrained k-means clustering in ground motion simulation validation

    NASA Astrophysics Data System (ADS)

    Khoshnevis, N.; Taborda, R.

    2017-12-01

    The validation of ground motion synthetics has received increased attention over the last few years due to advances in physics-based deterministic and hybrid simulation methods. Unlike for low-frequency simulations (f ≤ 0.5 Hz), for which it has become reasonable to expect a good match between synthetics and data, in the case of high-frequency simulations (f ≥ 1 Hz) it is not possible to match results on a wiggle-by-wiggle basis. This is mostly due to the various complexities and uncertainties involved in earthquake ground motion modeling. Therefore, in order to compare synthetics with data we turn to different time series metrics, which are used as a means to characterize how well the synthetics match the data in a qualitative and statistical sense. In general, these metrics provide goodness-of-fit (GOF) scores that measure the level of similarity in the time and frequency domains. It is common for these scores to be scaled from 0 to 10, with 10 representing a perfect match. Although using individual metrics for particular applications is considered more adequate, there is no consensus or unified method to classify the comparison between a set of synthetic and recorded seismograms when the various metrics offer different scores. We study the relationship among these metrics through a constrained k-means clustering approach. We define 4 hypothetical stations with scores of 3, 5, 7, and 9 for all metrics and place these stations in the category of cannot-link constraints. We generate the dataset through the validation of results from a deterministic (physics-based) ground motion simulation of a moderate-magnitude earthquake in the greater Los Angeles basin using three velocity models. The maximum frequency of the simulation is 4 Hz. The dataset involves over 300 stations and 11 metrics, or features, as they are understood in the clustering process, where the metrics form a multi-dimensional space.
    We address the high-dimensional feature effects with a subspace-clustering analysis, generate a final labeled dataset of stations, and discuss the within-class statistical characteristics of each metric. Labeling these stations is a first step towards developing a unified metric to evaluate ground motion simulations in an application-independent manner.
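
    A simplified stand-in for the clustering step: plain Lloyd's k-means over per-station GOF score vectors, initialized at the four hypothetical anchor stations; the cannot-link machinery and the real simulation dataset are omitted, and the data here are synthetic:

```python
import numpy as np

def kmeans_with_anchors(X, anchors, iters=50):
    """Lloyd's k-means initialized at fixed anchor stations (a simplified
    stand-in for the constrained variant described above)."""
    C = np.array(anchors, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        for k in range(len(C)):
            if np.any(labels == k):
                C[k] = X[labels == k].mean(axis=0)
    return labels, C

rng = np.random.default_rng(3)
# 300 synthetic stations x 11 GOF metrics, scores roughly on the 0-10 scale.
X = np.clip(rng.normal(6.0, 2.0, (300, 11)), 0.0, 10.0)
anchors = [np.full(11, s) for s in (3, 5, 7, 9)]  # hypothetical anchor stations
labels, centroids = kmeans_with_anchors(X, anchors)
```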

  13. Ideal cardiovascular health status and its association with socioeconomic factors in Chinese adults in Shandong, China.

    PubMed

    Ren, J; Guo, X L; Lu, Z L; Zhang, J Y; Tang, J L; Chen, X; Gao, C C; Xu, C X; Xu, A Q

    2016-09-07

    Cardiovascular disease (CVD) is the leading cause of morbidity and mortality in the world. In 2010, the American Heart Association (AHA) released a goal focused on the primary reduction of cardiovascular risk. Data collected from 7683 men and 7667 women aged 18-69 years were analyzed. The distribution of ideal cardiovascular health metrics, based on 7 cardiovascular disease risk factors or health behaviors according to the AHA definition, was evaluated among the subjects. The association of socioeconomic factors with the prevalence of meeting 5 or more ideal cardiovascular health metrics was estimated by logistic regression analysis, and a chi-square test for categorical variables and the general linear model (GLM) procedure for continuous variables were used to compare differences in prevalence and in means between genders. Seven of 15350 participants (0.05 %) met all 7 cardiovascular health metrics. Women had a higher proportion of meeting 5 or more ideal health metrics than men (32.67 vs. 14.27 %). Subjects with higher education and income levels had a higher proportion of meeting 5 or more ideal health metrics than those with lower education and income levels. Comparing subjects meeting 5 or more ideal cardiovascular health metrics with those meeting 4 or fewer, the adjusted odds ratio [OR, 95 % confidence interval (95 % CI)] for higher education and income was 1.42 (0.95, 2.21) in men and 2.59 (1.74, 3.87) in women. The prevalence of meeting all 7 cardiovascular health metrics was low in the adult population. Women, young subjects, and those with higher levels of education or income tended to have a greater number of ideal cardiovascular health metrics. Higher socioeconomic status was associated with an increased prevalence of meeting 5 or more cardiovascular health metrics in women but not in men. It is urgent to develop comprehensive population-based interventions to improve cardiovascular risk factors in Shandong Province, China.
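    The adjusted odds ratios reported above come from logistic regression; the crude calculation they generalize can be illustrated from a 2 × 2 table. The sketch below is illustrative only, with made-up counts rather than the study's data:

```python
def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Crude odds ratio from a 2x2 table: (a/b) / (c/d) = ad / bc."""
    a = exposed_cases                      # exposed, outcome present
    b = exposed_total - exposed_cases      # exposed, outcome absent
    c = unexposed_cases                    # unexposed, outcome present
    d = unexposed_total - unexposed_cases  # unexposed, outcome absent
    return (a * d) / (b * c)

# hypothetical counts: 20/100 "cases" among the exposed, 10/100 among the unexposed
crude_or = odds_ratio(20, 100, 10, 100)  # -> 2.25
```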

  14. Detection of blur artifacts in histopathological whole-slide images of endomyocardial biopsies.

    PubMed

    Hang Wu; Phan, John H; Bhatia, Ajay K; Cundiff, Caitlin A; Shehata, Bahig M; Wang, May D

    2015-01-01

    Histopathological whole-slide images (WSIs) have emerged as an objective and quantitative means for image-based disease diagnosis. However, WSIs may contain acquisition artifacts that affect downstream image feature extraction and quantitative disease diagnosis. We develop a method for detecting blur artifacts in WSIs using distributions of local blur metrics. As features, these distributions enable accurate classification of WSI regions as sharp or blurry. We evaluate our method using over 1000 portions of an endomyocardial biopsy (EMB) WSI. Results indicate that local blur metrics accurately detect blurry image regions.
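    The abstract does not specify which local blur metric is used; a common choice is the variance of a local Laplacian, whose per-tile distribution can serve as classifier features. A minimal sketch under that assumption:

```python
import numpy as np

def laplacian_variance(tile):
    """Variance of a 4-neighbour Laplacian response, a common local sharpness metric."""
    lap = (-4 * tile[1:-1, 1:-1]
           + tile[:-2, 1:-1] + tile[2:, 1:-1]
           + tile[1:-1, :-2] + tile[1:-1, 2:])
    return lap.var()

def blur_features(image, tile=64):
    """Summarize the distribution of local blur metrics over non-overlapping tiles.

    The summary statistics of this distribution can feed a sharp/blurry classifier.
    """
    h, w = image.shape
    vals = np.array([laplacian_variance(image[r:r + tile, c:c + tile])
                     for r in range(0, h - tile + 1, tile)
                     for c in range(0, w - tile + 1, tile)])
    return {"mean": vals.mean(),
            "median": float(np.median(vals)),
            "p10": float(np.percentile(vals, 10))}
```

A blurred region suppresses high spatial frequencies, so its Laplacian-variance distribution shifts toward zero relative to a sharp region of the same slide.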

  15. Semantic Pattern Analysis for Verbal Fluency Based Assessment of Neurological Disorders

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas R; Ainsworth, Keela C; Brown, Tyler C

    In this paper, we present preliminary results of semantic pattern analysis of verbal fluency tests used for assessing cognitive psychological and neuropsychological disorders. We posit that recent advances in semantic reasoning and artificial intelligence can be combined to create a standardized computer-aided diagnosis tool to automatically evaluate and interpret verbal fluency tests. Towards that goal, we derive novel semantic similarity (phonetic, phonemic and conceptual) metrics and present the predictive capability of these metrics on a de-identified dataset of participants with and without neurological disorders.
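    The phonetic and phonemic similarity metrics are not specified in the abstract; as a hypothetical stand-in, a character-level sequence similarity illustrates how word pairs produced in a fluency test might be scored:

```python
from difflib import SequenceMatcher

def word_similarity(a, b):
    """Character-level similarity in [0, 1], a crude stand-in for the
    phonetic/phonemic metrics (which the abstract does not define)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()
```

In a real implementation one would compare phoneme sequences (e.g., from a pronunciation dictionary) rather than raw spellings.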

  16. Implementation of a channelized Hotelling observer model to assess image quality of x-ray angiography systems

    PubMed Central

    Favazza, Christopher P.; Fetterly, Kenneth A.; Hangiandreou, Nicholas J.; Leng, Shuai; Schueler, Beth A.

    2015-01-01

    Evaluation of flat-panel angiography equipment through conventional image quality metrics is limited by the scope of standard spatial-domain image quality metrics, such as contrast-to-noise ratio and spatial resolution, or by restricted access to the data needed to calculate Fourier-domain measurements, such as the modulation transfer function, noise power spectrum, and detective quantum efficiency. Observer models have been shown capable of overcoming these limitations and are able to comprehensively evaluate medical-imaging systems. We present a spatial-domain channelized Hotelling observer model to calculate the detectability index (DI) of disks of different sizes and to compare the performance of different imaging conditions and angiography systems. When appropriate, changes in DIs were compared to expectations based on the classical Rose model of signal detection to assess linearity of the model with quantum signal-to-noise ratio (SNR) theory. For these experiments, the estimated uncertainty of the DIs was less than 3%, allowing for precise comparison of imaging systems or conditions. For most experimental variables, DI changes were linear with expectations based on quantum SNR theory. DIs calculated for the smallest objects demonstrated nonlinearity with quantum SNR theory due to system blur. Two angiography systems with different detector element sizes were shown to perform similarly across the majority of the detection tasks. PMID:26158086
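    A channelized Hotelling observer reduces each image to a small set of channel outputs and computes the detectability index from the mean difference and pooled covariance of those outputs. A minimal numerical sketch (the channel matrix and data below are illustrative, not the paper's channel set):

```python
import numpy as np

def cho_detectability(present, absent, channels):
    """Detectability index d' of a channelized Hotelling observer.

    present / absent : (n_images, n_pixels) signal-present / signal-absent ROIs
    channels         : (n_pixels, n_channels) channel matrix (e.g. a Gabor bank)
    """
    vp = present @ channels                 # channelized outputs, signal present
    va = absent @ channels                  # channelized outputs, signal absent
    dv = vp.mean(axis=0) - va.mean(axis=0)  # mean channel-output difference
    # pooled intra-class covariance of the channel outputs
    S = 0.5 * (np.cov(vp, rowvar=False) + np.cov(va, rowvar=False))
    return float(np.sqrt(dv @ np.linalg.solve(S, dv)))
```

With unit-variance noise and a signal confined to one channel, d' tracks the signal amplitude, which is the linearity with quantum SNR theory the abstract tests.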

  17. Regime-Based Evaluation of Cloudiness in CMIP5 Models

    NASA Technical Reports Server (NTRS)

    Jin, Daeho; Oreopoulos, Lazaros; Lee, Dong Min

    2016-01-01

    The concept of Cloud Regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 models of the fifth phase of the Coupled Model Intercomparison Project (CMIP5). Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator, which generates for each gridcell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics, such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product (long-term average total cloud amount [TCA]), cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies also yield substantial TCA errors. Our findings support previous studies showing that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite their shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer (MODIS) cloud observations evaluated against ISCCP as if they were another model output. Lastly, cloud simulation performance is contrasted with each model's equilibrium climate sensitivity (ECS) in order to gain insight into whether good cloud simulation pairs with particular values of this parameter.

  18. Evaluation of a Metric Booklet as a Supplement to Teaching the Metric System to Undergraduate Non-Science Majors.

    ERIC Educational Resources Information Center

    Exum, Kenith Gene

    Examined is the effectiveness of a method of teaching the metric system using the booklet, Metric Supplement to Mathematics, in combination with a physical science textbook. The participants in the study were randomly selected undergraduates in a non-science oriented program of study. Instruments used included the Metric Supplement to Mathematics…

  19. Light Water Reactor Sustainability Program Operator Performance Metrics for Control Room Modernization: A Practical Guide for Early Design Evaluation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ronald Boring; Roger Lew; Thomas Ulrich

    2014-03-01

    As control rooms are modernized with new digital systems at nuclear power plants, it is necessary to evaluate operator performance using these systems as part of a verification and validation process. There are no standard, predefined metrics available for assessing what constitutes satisfactory operator interaction with new systems, especially during the early design stages of a new system. This report identifies the process and metrics for evaluating human system interfaces as part of control room modernization. The report includes background information on design and evaluation, a thorough discussion of human performance measures, and a practical example of how the process and metrics have been used as part of a turbine control system upgrade during the formative stages of design. The process and metrics are geared toward generalizability to other applications and serve as a template for utilities undertaking their own control room modernization activities.

  20. Evaluating software development characteristics: Assessment of software measures in the Software Engineering Laboratory. [reliability engineering

    NASA Technical Reports Server (NTRS)

    Basili, V. R.

    1981-01-01

    Work on metrics is discussed. Factors that affect software quality are reviewed. Metrics are discussed in terms of criteria achievement, reliability, and fault tolerance. Subjective and objective metrics are distinguished. Product/process and cost/quality metrics are characterized and discussed.

  1. A Patient-Centered Framework for Evaluating Digital Maturity of Health Services: A Systematic Review

    PubMed Central

    Callahan, Ryan; Darzi, Ara; Mayer, Erik

    2016-01-01

    Background: Digital maturity is the extent to which digital technologies are used as enablers to deliver a high-quality health service. Extensive literature exists about how to assess the components of digital maturity, but it has not been used to design a comprehensive framework for evaluation. Consequently, the measurement systems that do exist are limited to evaluating digital programs within one service or care setting, meaning that digital maturity evaluation is not accounting for the needs of patients across their care pathways. Objective: The objective of our study was to identify the best methods and metrics for evaluating digital maturity and to create a novel, evidence-based tool for evaluating digital maturity across patient care pathways. Methods: We systematically reviewed the literature to find the best methods and metrics for evaluating digital maturity. We searched the PubMed database for all papers relevant to digital maturity evaluation. Papers were selected if they provided insight into how to appraise digital systems within the health service and if they indicated the factors that constitute or facilitate digital maturity. Papers were analyzed to identify methodology for evaluating digital maturity and indicators of digitally mature systems. We then used the resulting information about methodology to design an evaluation framework. Following that, the indicators of digital maturity were extracted, grouped into increasing levels of maturity, and operationalized as metrics within the evaluation framework. Results: We identified 28 papers as relevant to evaluating digital maturity, from which we derived 5 themes. The first theme concerned general evaluation methodology for constructing the framework (7 papers). The following 4 themes were the increasing levels of digital maturity: resources and ability (6 papers), usage (7 papers), interoperability (3 papers), and impact (5 papers). The framework includes metrics for each of these levels at each stage of the typical patient care pathway. Conclusions: The framework uses a patient-centric model that departs from traditional service-specific measurements and allows for novel insights into how digital programs benefit patients across the health system. Trial Registration: N/A PMID:27080852

  2. National Quality Forum Colon Cancer Quality Metric Performance: How Are Hospitals Measuring Up?

    PubMed

    Mason, Meredith C; Chang, George J; Petersen, Laura A; Sada, Yvonne H; Tran Cao, Hop S; Chai, Christy; Berger, David H; Massarweh, Nader N

    2017-12-01

    To evaluate the impact of care at high-performing hospitals on the National Quality Forum (NQF) colon cancer metrics. The NQF endorses evaluating ≥12 lymph nodes (LNs), adjuvant chemotherapy (AC) for stage III patients, and AC within 4 months of diagnosis as colon cancer quality indicators. Data on hospital-level metric performance and the association with survival are unclear. Retrospective cohort study of 218,186 patients with resected stage I to III colon cancer in the National Cancer Data Base (2004-2012). High-performing hospitals (>75% achievement) were identified by the proportion of patients achieving each measure. The association between hospital performance and survival was evaluated using Cox shared frailty modeling. Only hospital LN performance improved (15.8% in 2004 vs 80.7% in 2012; trend test, P < 0.001), with 45.9% of hospitals performing well on all 3 measures concurrently in the most recent study year. Overall, 5-year survival was 75.0%, 72.3%, 72.5%, and 69.5% for those treated at hospitals with high performance on 3, 2, 1, and 0 metrics, respectively (log-rank, P < 0.001). Care at hospitals with high metric performance was associated with lower risk of death in a dose-response fashion [0 metrics, reference; 1, hazard ratio (HR) 0.96 (0.89-1.03); 2, HR 0.92 (0.87-0.98); 3, HR 0.85 (0.80-0.90); 2 vs 1, HR 0.96 (0.91-1.01); 3 vs 1, HR 0.89 (0.84-0.93); 3 vs 2, HR 0.95 (0.89-0.95)]. Performance on metrics in combination was associated with lower risk of death [LN + AC, HR 0.86 (0.78-0.95); AC + timely AC, HR 0.92 (0.87-0.98); LN + AC + timely AC, HR 0.85 (0.80-0.90)], whereas individual measures were not [LN, HR 0.95 (0.88-1.04); AC, HR 0.95 (0.87-1.05)]. Less than half of hospitals perform well on these NQF colon cancer metrics concurrently, and high performance on individual measures is not associated with improved survival. 
Quality improvement efforts should shift focus from individual measures to defining composite measures encompassing the overall multimodal care pathway and capturing successful transitions from one care modality to another.

  3. Performance Evaluation of Frequency Transform Based Block Classification of Compound Image Segmentation Techniques

    NASA Astrophysics Data System (ADS)

    Selwyn, Ebenezer Juliet; Florinabel, D. Jemi

    2018-04-01

    Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images mixed with textual, graphical, or pictorial content. In this paper, we present a comparison of two transform-based block classification approaches for compound images, based on metrics such as speed of classification, precision, and recall rate. Block-based classification approaches normally divide the compound images into fixed-size, non-overlapping blocks. A frequency transform, such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT), is then applied to each block. The mean and standard deviation are computed for each 8 × 8 block and are used as a feature set to classify the compound images into text/graphics and picture/background blocks. The classification accuracy of block-classification-based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth backgrounds and with complex backgrounds containing text of varying size, colour, and orientation are considered for testing. Experimental evidence shows that DWT-based segmentation provides recall and precision rates approximately 2.3% higher than DCT-based segmentation, for both smooth and complex background images, at the cost of increased block classification time.
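    The block-classification step can be sketched using the per-block statistics idea from the abstract. The threshold value and the use of the standard deviation alone are illustrative assumptions, not the paper's trained classifier:

```python
import numpy as np

def classify_blocks(image, block=8, std_thresh=40.0):
    """Label each non-overlapping block as 'text' (high-contrast activity)
    or 'picture' (smooth content) from its intensity standard deviation.

    std_thresh is an illustrative value, not one calibrated in the paper.
    """
    h, w = image.shape
    labels = {}
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            b = image[r:r + block, c:c + block].astype(float)
            labels[(r, c)] = "text" if b.std() > std_thresh else "picture"
    return labels

def precision_recall(pred, truth, positive="text"):
    """Evaluation metrics used to score the block classifier."""
    tp = sum(1 for k in pred if pred[k] == positive and truth[k] == positive)
    fp = sum(1 for k in pred if pred[k] == positive and truth[k] != positive)
    fn = sum(1 for k in pred if pred[k] != positive and truth[k] == positive)
    return tp / (tp + fp), tp / (tp + fn)
```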

  4. Model Performance Evaluation and Scenario Analysis ...

    EPA Pesticide Factsheets

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude-and-sequence errors. The performance measures include error analysis, the coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components, and the reconstruction back to time series, provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is due to the quality of the input data or to the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the too
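    Two of the named goodness-of-fit measures, Nash-Sutcliffe efficiency and the coefficient of determination, can be sketched directly. These are the standard textbook definitions, not MPESA's implementation:

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; 0 means the simulation
    is no better than predicting the mean of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    return float(np.corrcoef(obs, sim)[0, 1] ** 2)
```

Note that a high R² alone cannot reveal a systematic bias, which is one reason a suite of complementary metrics and diagnostics is needed.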

  5. Metrics, The Measure of Your Future: Materials Evaluation Forms.

    ERIC Educational Resources Information Center

    Troy, Joan B.

    Three evaluation forms are contained in this publication by the Winston-Salem/Forsyth Metric Education Project to be used in conjunction with their materials. They are: (1) Field-Test Materials Evaluation Form; (2) Student Materials Evaluation Form; and (3) Composite Materials Evaluation Form. The questions in these forms are phrased so they can…

  6. A Metrics-Based Approach to Intrusion Detection System Evaluation for Distributed Real-Time Systems

    DTIC Science & Technology

    2002-04-01

    A Metrics-Based Approach to Intrusion Detection System Evaluation for Distributed Real-Time Systems. Authors: G. A. Fink, B. L. Chappell, T. G. Turner, and ... Distributed, Security. 1 Introduction. Processing and cost requirements are driving future naval combat platforms to use distributed, real-time systems of ... distributed, real-time systems. As these systems grow more complex, the timing requirements do not diminish; indeed, they may become more constrained

  7. MJO simulation in CMIP5 climate models: MJO skill metrics and process-oriented diagnosis

    NASA Astrophysics Data System (ADS)

    Ahn, Min-Seop; Kim, Daehyun; Sperber, Kenneth R.; Kang, In-Sik; Maloney, Eric; Waliser, Duane; Hendon, Harry

    2017-12-01

    The Madden-Julian Oscillation (MJO) simulation diagnostics developed by the MJO Working Group and the process-oriented MJO simulation diagnostics developed by the MJO Task Force are applied to 37 Coupled Model Intercomparison Project phase 5 (CMIP5) models in order to assess model skill in representing amplitude, period, and coherent eastward propagation of the MJO, and to establish a link between MJO simulation skill and parameterized physical processes. Process-oriented diagnostics include the Relative Humidity Composite based on Precipitation (RHCP), Normalized Gross Moist Stability (NGMS), and the Greenhouse Enhancement Factor (GEF). Numerous scalar metrics are developed to quantify the results. Most CMIP5 models underestimate MJO amplitude, especially when outgoing longwave radiation (OLR) is used in the evaluation, and exhibit too fast phase speed while lacking coherence between eastward propagation of precipitation/convection and the wind field. The RHCP-metric, indicative of the sensitivity of simulated convection to low-level environmental moisture, and the NGMS-metric, indicative of the efficiency of a convective atmosphere for exporting moist static energy out of the column, show robust correlations with a large number of MJO skill metrics. The GEF-metric, indicative of the strength of the column-integrated longwave radiative heating due to cloud-radiation interaction, is also correlated with the MJO skill metrics, but shows relatively lower correlations compared to the RHCP- and NGMS-metrics. Our results suggest that modifications to processes associated with moisture-convection coupling and the gross moist stability might be the most fruitful for improving simulations of the MJO.
Though the GEF-metric exhibits lower correlations with the MJO skill metrics, the longwave radiation feedback is highly relevant for simulating the weak precipitation anomaly regime that may be important for the establishment of shallow convection and the transition to deep convection.

  8. MJO simulation in CMIP5 climate models: MJO skill metrics and process-oriented diagnosis

    DOE PAGES

    Ahn, Min-Seop; Kim, Daehyun; Sperber, Kenneth R.; ...

    2017-03-23

    The Madden-Julian Oscillation (MJO) simulation diagnostics developed by the MJO Working Group and the process-oriented MJO simulation diagnostics developed by the MJO Task Force are applied to 37 Coupled Model Intercomparison Project phase 5 (CMIP5) models in order to assess model skill in representing amplitude, period, and coherent eastward propagation of the MJO, and to establish a link between MJO simulation skill and parameterized physical processes. Process-oriented diagnostics include the Relative Humidity Composite based on Precipitation (RHCP), Normalized Gross Moist Stability (NGMS), and the Greenhouse Enhancement Factor (GEF). Numerous scalar metrics are developed to quantify the results. Most CMIP5 models underestimate MJO amplitude, especially when outgoing longwave radiation (OLR) is used in the evaluation, and exhibit too fast phase speed while lacking coherence between eastward propagation of precipitation/convection and the wind field. The RHCP-metric, indicative of the sensitivity of simulated convection to low-level environmental moisture, and the NGMS-metric, indicative of the efficiency of a convective atmosphere for exporting moist static energy out of the column, show robust correlations with a large number of MJO skill metrics. The GEF-metric, indicative of the strength of the column-integrated longwave radiative heating due to cloud-radiation interaction, is also correlated with the MJO skill metrics, but shows relatively lower correlations compared to the RHCP- and NGMS-metrics. Our results suggest that modifications to processes associated with moisture-convection coupling and the gross moist stability might be the most fruitful for improving simulations of the MJO. Though the GEF-metric exhibits lower correlations with the MJO skill metrics, the longwave radiation feedback is highly relevant for simulating the weak precipitation anomaly regime that may be important for the establishment of shallow convection and the transition to deep convection.

  9. MJO simulation in CMIP5 climate models: MJO skill metrics and process-oriented diagnosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Min-Seop; Kim, Daehyun; Sperber, Kenneth R.

    The Madden-Julian Oscillation (MJO) simulation diagnostics developed by the MJO Working Group and the process-oriented MJO simulation diagnostics developed by the MJO Task Force are applied to 37 Coupled Model Intercomparison Project phase 5 (CMIP5) models in order to assess model skill in representing amplitude, period, and coherent eastward propagation of the MJO, and to establish a link between MJO simulation skill and parameterized physical processes. Process-oriented diagnostics include the Relative Humidity Composite based on Precipitation (RHCP), Normalized Gross Moist Stability (NGMS), and the Greenhouse Enhancement Factor (GEF). Numerous scalar metrics are developed to quantify the results. Most CMIP5 models underestimate MJO amplitude, especially when outgoing longwave radiation (OLR) is used in the evaluation, and exhibit too fast phase speed while lacking coherence between eastward propagation of precipitation/convection and the wind field. The RHCP-metric, indicative of the sensitivity of simulated convection to low-level environmental moisture, and the NGMS-metric, indicative of the efficiency of a convective atmosphere for exporting moist static energy out of the column, show robust correlations with a large number of MJO skill metrics. The GEF-metric, indicative of the strength of the column-integrated longwave radiative heating due to cloud-radiation interaction, is also correlated with the MJO skill metrics, but shows relatively lower correlations compared to the RHCP- and NGMS-metrics. Our results suggest that modifications to processes associated with moisture-convection coupling and the gross moist stability might be the most fruitful for improving simulations of the MJO. Though the GEF-metric exhibits lower correlations with the MJO skill metrics, the longwave radiation feedback is highly relevant for simulating the weak precipitation anomaly regime that may be important for the establishment of shallow convection and the transition to deep convection.

  10. Shilling Attacks Detection in Recommender Systems Based on Target Item Analysis

    PubMed Central

    Zhou, Wei; Wen, Junhao; Koh, Yun Sing; Xiong, Qingyu; Gao, Min; Dobbie, Gillian; Alam, Shafiq

    2015-01-01

    Recommender systems are highly vulnerable to shilling attacks, both by individuals and groups. Attackers who introduce biased ratings in order to affect recommendations have been shown to negatively affect collaborative filtering (CF) algorithms. Previous research focuses only on the differences between genuine profiles and attack profiles, ignoring the group characteristics of attack profiles. In this paper, we study the use of statistical metrics to detect the rating patterns of attackers and the group characteristics of attack profiles. A further problem is that most existing detection methods are model-specific. Two metrics, Rating Deviation from Mean Agreement (RDMA) and Degree of Similarity with Top Neighbors (DegSim), are used for analyzing rating patterns between malicious profiles and genuine profiles in attack models. Building upon this, we also propose and evaluate a detection structure called RD-TIA for detecting shilling attacks in recommender systems using a statistical approach. In order to detect more complicated attack models, we propose a novel metric called DegSim’ based on DegSim. The experimental results show that our detection model based on target item analysis is an effective approach for detecting shilling attacks. PMID:26222882
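    The two detection metrics can be sketched for a small user-item rating matrix. This follows the commonly cited definitions of RDMA and DegSim; details such as the handling of missing ratings are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def rdma(ratings, user):
    """Rating Deviation from Mean Agreement for one user.

    ratings: 2-D array (users x items) with np.nan marking missing ratings.
    High RDMA flags users whose ratings deviate strongly from item averages.
    """
    rated = ~np.isnan(ratings[user])
    item_means = np.nanmean(ratings[:, rated], axis=0)
    n_ratings = np.sum(~np.isnan(ratings[:, rated]), axis=0)
    dev = np.abs(ratings[user, rated] - item_means) / n_ratings
    return float(dev.mean())

def degsim(ratings, user, k=2):
    """Average Pearson similarity of a user with the k most similar users.

    Unusually high DegSim can indicate near-duplicate attack profiles.
    """
    sims = []
    for other in range(ratings.shape[0]):
        if other == user:
            continue
        both = ~np.isnan(ratings[user]) & ~np.isnan(ratings[other])
        if both.sum() >= 2:
            c = np.corrcoef(ratings[user, both], ratings[other, both])[0, 1]
            if not np.isnan(c):
                sims.append(c)
    return float(np.mean(sorted(sims, reverse=True)[:k])) if sims else 0.0
```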

  11. Development and comparison of metrics for evaluating climate models and estimation of projection uncertainty

    NASA Astrophysics Data System (ADS)

    Ring, Christoph; Pollinger, Felix; Kaspar-Ott, Irena; Hertig, Elke; Jacobeit, Jucundus; Paeth, Heiko

    2017-04-01

    The COMEPRO project (Comparison of Metrics for Probabilistic Climate Change Projections of Mediterranean Precipitation), funded by the Deutsche Forschungsgemeinschaft (DFG), is dedicated to the development of new evaluation metrics for state-of-the-art climate models. Further, we analyze implications for probabilistic projections of climate change. This study focuses on the results of 4-field matrix metrics. Here, six different approaches are compared. We evaluate 24 models of the Coupled Model Intercomparison Project Phase 3 (CMIP3), 40 of CMIP5, and 18 of the Coordinated Regional Downscaling Experiment (CORDEX). In addition to annual and seasonal precipitation, the mean temperature is analysed. We consider both the 50-year trend and the climatological mean for the second half of the 20th century. For the probabilistic projections of climate change, the A1B and A2 (CMIP3) and RCP4.5 and RCP8.5 (CMIP5, CORDEX) scenarios are used. The eight main study areas are located in the Mediterranean. However, we apply our metrics to globally distributed regions as well. The metrics show high simulation quality of the temperature trend, and of both the precipitation and temperature mean, for most climate models and study areas. In addition, we find high potential for model weighting in order to reduce uncertainty. These results are in line with other accepted evaluation metrics and studies. The comparison of the different 4-field approaches reveals high correlations for most metrics. The results of the metric-weighted probabilistic density functions of climate change are heterogeneous. We find both increases and decreases of uncertainty for different regions and seasons. The analysis of the global study areas is consistent with that of the regional study areas in the Mediterranean.

  12. A Multimetric Benthic Macroinvertebrate Index for the Assessment of Stream Biotic Integrity in Korea

    PubMed Central

    Jun, Yung-Chul; Won, Doo-Hee; Lee, Soo-Hyung; Kong, Dong-Soo; Hwang, Soon-Jin

    2012-01-01

    At a time when anthropogenic activities are increasingly disturbing the overall ecological integrity of freshwater ecosystems, monitoring of biological communities is central to assessing the health and function of streams. This study aimed to use a large nation-wide database to develop a multimetric index (the Korean Benthic macroinvertebrate Index of Biological Integrity, KB-IBI) applicable to the biological assessment of Korean streams. Reference and impaired conditions were determined based on watershed, chemical, and physical criteria. Eight of an initial 34 candidate metrics were selected using a stepwise procedure that evaluated metric variability, redundancy, sensitivity, and responsiveness to environmental gradients. The selected metrics were number of taxa, percent Ephemeroptera-Plecoptera-Trichoptera (EPT) individuals, percent of a dominant taxon, percent taxa abundance without Chironomidae, Shannon’s diversity index, percent gatherer individuals, ratio of filterers and scrapers, and the Korean saprobic index. Our multimetric index successfully distinguished reference from impaired conditions. A scoring system was established for each core metric using its quartile range and response to anthropogenic disturbances. The multimetric index was computed by aggregating the individual metric scores, and the value range was quadrisected to provide a narrative criterion (Poor, Fair, Good, and Excellent) describing the biological integrity of the streams in the study. A validation procedure showed that the index is an effective method for evaluating stream conditions, and thus is appropriate for use in future studies measuring the long-term status of streams and the effectiveness of restoration methods. PMID:23202765
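    The aggregation step, summing the core-metric scores and quadrisecting the attainable range into four narrative classes, can be sketched as follows. The score values and range are illustrative, not the KB-IBI calibration:

```python
def classify_site(metric_scores, max_total):
    """Aggregate per-metric scores and map the total onto four narrative
    classes by quadrisecting the attainable range (illustrative scheme)."""
    total = sum(metric_scores)
    quarter = max_total / 4
    if total < quarter:
        return "Poor"
    if total < 2 * quarter:
        return "Fair"
    if total < 3 * quarter:
        return "Good"
    return "Excellent"

# hypothetical example: 8 metrics scored on a 1-5 scale, so max_total = 40
label = classify_site([5, 4, 4, 3, 5, 4, 3, 4], 40)
```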

  13. Citizen science: A new perspective to advance spatial pattern evaluation in hydrology

    PubMed Central

    Stisen, Simon

    2017-01-01

    Citizen science opens new pathways that can complement traditional scientific practice. Intuition and reasoning often make humans more effective than computer algorithms in various realms of problem solving. In particular, a simple visual comparison of spatial patterns is a task where humans are often considered more reliable than computer algorithms. In practice, however, science still largely depends on computer-based solutions, which offer benefits such as speed and the possibility of automating processes. Human vision can nevertheless be harnessed to evaluate the reliability of algorithms that are tailored to quantify similarity in spatial patterns. We established a citizen science project that employs human perception to rate similarity and dissimilarity between simulated spatial patterns from several scenarios of a hydrological catchment model. In total, more than 2500 volunteers provided over 43,000 classifications of 1095 individual subjects. We investigate the capability of a set of advanced statistical performance metrics to mimic the human ability to distinguish between similarity and dissimilarity. Results suggest that more complex metrics are not necessarily better at emulating human perception, but they clearly provide auxiliary information that is valuable for model diagnostics. The metrics clearly differ in their ability to unambiguously distinguish between similar and dissimilar patterns, which is regarded as a key feature of a reliable metric. The obtained dataset can provide an insightful benchmark for the community to test novel spatial metrics. PMID:28558050
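    As a point of reference for such comparisons, the simplest spatial-pattern metric is the cell-wise Pearson correlation of two fields. It is a naive baseline (the study's metric set is considerably more advanced), but it is exactly the kind of metric such a benchmark dataset can stress-test:

```python
import numpy as np

def pattern_correlation(a, b):
    """Pearson correlation of two 2-D fields, a simple (and often
    insufficient) baseline for spatial-pattern similarity."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
```

Its main weakness is that it ignores spatial structure: shuffling both fields with the same permutation leaves the score unchanged, which is one reason human raters can disagree with it.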

  14. Comparing Anisotropic Output-Based Grid Adaptation Methods by Decomposition

    NASA Technical Reports Server (NTRS)

    Park, Michael A.; Loseille, Adrien; Krakos, Joshua A.; Michal, Todd

    2015-01-01

    Anisotropic grid adaptation is examined by decomposing the steps of flow solution, adjoint solution, error estimation, metric construction, and simplex grid adaptation. Multiple implementations of each of these steps are evaluated by comparison to each other and to expected analytic results when available. For example, grids are adapted to analytic metric fields, and grid measures are computed to illustrate the properties of multiple independent implementations of grid adaptation mechanics. Different implementations of each step in the adaptation process can be evaluated in a system where the other components of the adaptive cycle are fixed. Detailed examination of these properties allows comparison of different methods to identify the current state of the art and where further development should be targeted.

  15. The Albuquerque Seismological Laboratory Data Quality Analyzer

    NASA Astrophysics Data System (ADS)

    Ringler, A. T.; Hagerty, M.; Holland, J.; Gee, L. S.; Wilson, D.

    2013-12-01

    The U.S. Geological Survey's Albuquerque Seismological Laboratory (ASL) has several efforts underway to improve data quality at its stations. The Data Quality Analyzer (DQA) is one such development. The DQA is designed to characterize station data quality in a quantitative and automated manner. Station quality is based on the evaluation of various metrics, such as timing quality, noise levels, sensor coherence, and so on. These metrics are aggregated into a measurable grade for each station. The DQA consists of a website, a metric calculator (Seedscan), and a PostgreSQL database. The website allows the user to make requests for various time periods, review specific networks and stations, adjust weighting of the station's grade, and plot metrics as a function of time. The website dynamically loads all station data from a PostgreSQL database. The database is central to the application; it acts as a hub where metric values and limited station descriptions are stored. Data is stored at the level of one sensor's channel per day. The database is populated by Seedscan. Seedscan reads and processes miniSEED data to generate metric values. Seedscan, written in Java, compares hashes of metadata and data to detect changes and perform subsequent recalculations. This ensures that the metric values are up to date and accurate. Seedscan can be run as a scheduled task or on demand; in either case it computes the metrics specified in its configuration file. While many metrics are currently in development, some are completed and being actively used. These include: availability, timing quality, gap count, deviation from the New Low Noise Model, deviation from a station's noise baseline, inter-sensor coherence, and data-synthetic fits. In all, 20 metrics are planned, but any number could be added. ASL is actively using the DQA on a daily basis for station diagnostics and evaluation. As Seedscan is scheduled to run every night, data quality analysts can then use the website to diagnose changes in noise levels or other anomalous data. This allows errors to be corrected quickly and efficiently. The code is designed to be flexible for adding metrics and portable for use in other networks. We anticipate further development of the DQA by improving the existing web interface, adding more metrics, adding an interface to facilitate the verification of historic station metadata and performance, and adding an interface to allow better monitoring of data quality goals.

  16. Spatial and temporal variation in distribution of mangroves in Moreton Bay, subtropical Australia: a comparison of pattern metrics and change detection analyses based on aerial photographs

    NASA Astrophysics Data System (ADS)

    Manson, F. J.; Loneragan, N. R.; Phinn, S. R.

    2003-07-01

    An assessment of the changes in the distribution and extent of mangroves within Moreton Bay, southeast Queensland, Australia, was carried out. Two assessment methods were evaluated: spatial and temporal pattern metrics analysis, and change detection analysis. Currently, about 15,000 ha of mangroves are present in Moreton Bay. These mangroves are important ecosystems, but are subject to disturbance from a number of sources. Over the past 25 years, there has been a loss of more than 3800 ha, as a result of natural losses and mangrove clearing (e.g. for urban and industrial development, agriculture and aquaculture). However, areas of new mangroves have become established over the same time period, offsetting these losses to create a net loss of about 200 ha. These new mangroves have mainly appeared in the southern bay region and the bay islands, particularly on the landward edge of existing mangroves. In addition, spatial patterns and species composition of mangrove patches have changed. The pattern metrics analysis provided an overview of mangrove distribution and change in the form of single metric values, while the change detection analysis gave a more detailed and spatially explicit description of change. An analysis of the effects of spatial scales on the pattern metrics indicated that they were relatively insensitive to scale at spatial resolutions less than 50 m, but that most metrics became sensitive at coarser resolutions, a finding which has implications for mapping of mangroves based on remotely sensed data.

  17. Development of quality metrics for ambulatory care in pediatric patients with tetralogy of Fallot.

    PubMed

    Villafane, Juan; Edwards, Thomas C; Diab, Karim A; Satou, Gary M; Saarel, Elizabeth; Lai, Wyman W; Serwer, Gerald A; Karpawich, Peter P; Cross, Russell; Schiff, Russell; Chowdhury, Devyani; Hougen, Thomas J

    2017-12-01

    The objective of this study was to develop quality metrics (QMs) relating to the ambulatory care of children after complete repair of tetralogy of Fallot (TOF). A workgroup team (WT) of pediatric cardiologists with expertise in all aspects of ambulatory cardiac management was formed at the request of the American College of Cardiology (ACC) and the Adult Congenital and Pediatric Cardiology Council (ACPC) to review published guidelines and consensus data relating to the ambulatory care of repaired TOF patients under the age of 18 years. A set of QMs was proposed by the WT. The metrics went through a two-step evaluation process. In the first step, the RAND-UCLA modified Delphi methodology was employed, and the metrics were voted on for feasibility and validity by an expert panel. In the second step, the QMs were put through an "open comments" process in which feedback was provided by ACPC members. The final QMs were approved by the ACPC council. The TOF WT formulated 9 QMs, of which only 6 were submitted to the expert panel; 3 QMs passed the modified RAND-UCLA process and went through the "open comments" process. Based on the feedback from the open comments process, only 1 metric was finally approved by the ACPC council. The ACPC Council was thus able to develop a QM for the ambulatory care of children with repaired TOF: these patients should have documented genetic testing for 22q11.2 deletion. However, lack of evidence in the literature made it a challenge to formulate other evidence-based QMs. © 2017 Wiley Periodicals, Inc.

  18. A probability metric for identifying high-performing facilities: an application for pay-for-performance programs.

    PubMed

    Shwartz, Michael; Peköz, Erol A; Burgess, James F; Christiansen, Cindy L; Rosen, Amy K; Berlowitz, Dan

    2014-12-01

    Two approaches are commonly used for identifying high-performing facilities on a performance measure: one, that the facility is in a top quantile (eg, quintile or quartile); and two, that a confidence interval is below (or above) the average of the measure for all facilities. This type of yes/no designation often does not do well in distinguishing high-performing from average-performing facilities. To illustrate an alternative continuous-valued metric for profiling facilities, the probability that a facility is in a top quantile, and to show the implications of using this metric for profiling and pay-for-performance. We created a composite measure of quality from fiscal year 2007 data based on 28 quality indicators from 112 Veterans Health Administration nursing homes. A Bayesian hierarchical multivariate normal-binomial model was used to estimate shrunken rates of the 28 quality indicators, which were combined into a composite measure using opportunity-based weights. Rates were estimated using Markov Chain Monte Carlo methods as implemented in WinBUGS. The probability metric was calculated from the simulation replications. Our probability metric allowed better discrimination of high performers than the point or interval estimate of the composite score. In a pay-for-performance program, a smaller top quantile (eg, a quintile) resulted in more resources being allocated to the highest performers, whereas a larger top quantile (eg, being above the median) distinguished less among high performers and allocated more resources to average performers. The probability metric has potential but needs to be evaluated by stakeholders in different types of delivery systems.
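
    The probability metric falls out directly from posterior simulation replications. A minimal sketch, assuming hypothetical MCMC draws in place of the paper's WinBUGS output and composite-score model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical posterior draws of a composite quality score for 10
# facilities (rows: simulation replications, columns: facilities),
# standing in for the MCMC output described in the abstract.
n_draws, n_fac = 4000, 10
true_quality = np.linspace(0.4, 0.6, n_fac)
draws = rng.normal(true_quality, 0.05, size=(n_draws, n_fac))

# Probability metric: the share of replications in which a facility's
# score falls in the top quintile (top 2 of 10) for that replication.
top_k = n_fac // 5
ranks = np.argsort(np.argsort(-draws, axis=1), axis=1)  # 0 = best
p_top = (ranks < top_k).mean(axis=0)
print(np.round(p_top, 2))
```

    Unlike a yes/no quantile assignment, the resulting probabilities vary continuously between 0 and 1, so near-ties among high performers remain visible.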

  19. Metrics for quantifying antimicrobial use in beef feedlots.

    PubMed

    Benedict, Katharine M; Gow, Sheryl P; Reid-Smith, Richard J; Booker, Calvin W; Morley, Paul S

    2012-08-01

    Accurate antimicrobial drug use data are needed to inform discussions regarding the impact of antimicrobial drug use in agriculture. The primary objective of this study was to investigate the perceived accuracy and clarity of different methods for reporting antimicrobial drug use information collected from beef feedlots. Producers, veterinarians, industry representatives, public health officials, and other knowledgeable beef industry leaders were invited to complete a web-based survey. A total of 156 participants in 33 US states, 4 Canadian provinces, and 8 other countries completed the survey. No single metric was considered universally optimal for all use circumstances or for all audiences. To communicate antimicrobial drug use data effectively, it is critical to evaluate the target audience when presenting the information. The most accurate metrics need to be carefully and repeatedly explained to the audience.

  20. Methods for Evaluating Wetland Condition #12: Using Amphibians in Bioassessments of Wetlands

    USGS Publications Warehouse

    Sparling, D.W.; Richter, K.O.; Calhoun, A.; Micacchion, M.

    2001-01-01

    Because amphibians have both aquatic and terrestrial life stages, they can serve in a unique way among vertebrates as sources of information for bioassessments of both wetlands and surrounding habitats. Although there are many data gaps in our knowledge about the habitat requirements and ecology of many amphibian species, it is apparent that community composition, the presence and frequency of abnormalities, various mensural characteristics (e.g. snout-vent length divided by body weight), and laboratory diagnostics (e.g. cholinesterase activity, blood chemistry) can be used in developing metrics for an index of biotic integrity. In addition, potential metrics can be derived from the various life stages that most amphibians experience, such as egg clusters; embryonic development and hatching rates; tadpole growth, development, and survival; progress and success of metamorphosis; and breeding behavior and presence of adults. It is important, however, to focus on regional biodiversity and species assemblages of amphibians in the development of metrics rather than to strive for broad-scale application of common metrics. This report discusses the procedures for developing an index of biotic integrity based on amphibians, explains potential pitfalls in using amphibians in bioassessments, and identifies where more research is needed to enhance the use of amphibians in evaluating wetland conditions.

  1. Comparison of two laboratory-based systems for evaluation of halos in intraocular lenses

    PubMed Central

    Alexander, Elsinore; Wei, Xin; Lee, Shinwook

    2018-01-01

    Purpose: Multifocal intraocular lenses (IOLs) can be associated with unwanted visual phenomena, including halos. Predicting the potential for halos is desirable when designing new multifocal IOLs. Halo images from 6 IOL models were compared using the Optikos modulation transfer function bench system and a new high dynamic range (HDR) system. Materials and methods: One monofocal, 1 extended depth of focus, and 4 multifocal IOLs were evaluated. An off-the-shelf optical bench was used to simulate a distant (>50 m) car headlight and record images. A custom HDR system was constructed using an imaging photometer to simulate headlight images and to measure quantitative halo luminance data. A metric was developed to characterize halo luminance properties. Clinical relevance was investigated by correlating halo measurements to visual outcomes questionnaire data. Results: The Optikos system produced halo images useful for visual comparisons; however, measurements were relative and not quantitative. The HDR halo system provided objective and quantitative measurements used to create a metric from the area under the curve (AUC) of the logarithmic normalized halo profile. This proposed metric differentiated between IOL models, and linear regression analysis found strong correlations between AUC and subjective clinical ratings of halos. Conclusion: The HDR system produced quantitative, preclinical metrics that correlated with patients’ subjective perception of halos. PMID:29503526
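
    A sketch of the AUC-style halo metric, assuming a toy radial luminance profile and a simple normalization (the HDR photometer measurements and the paper's exact normalization are not reproduced here):

```python
import numpy as np

# Hypothetical halo profile: luminance as a function of angular
# distance from the simulated headlight. An inverse-square fall-off
# stands in for measured data.
radius = np.linspace(0.5, 5.0, 50)               # degrees from source (assumed)
luminance = 1.0 / radius ** 2                     # toy fall-off profile
profile = np.log10(luminance / luminance.max())   # log-normalized, <= 0

# Trapezoidal area under the log-normalized profile. Because the
# profile is <= 0, the AUC is negative; a less negative AUC would
# indicate a brighter, more extended halo.
auc = (((profile[1:] + profile[:-1]) / 2) * np.diff(radius)).sum()
print(round(auc, 3))
```

    Comparing such AUC values across IOL models is the kind of single-number summary the abstract describes correlating with subjective halo ratings.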

  2. Climate Data Analytics Workflow Management

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Lee, S.; Pan, L.; Mattmann, C. A.; Lee, T. J.

    2016-12-01

    In this project, we aim to create a sustainable building block for Earth science big data analytics and knowledge sharing. By closely studying how Earth scientists conduct data analytics research in their daily work, we have developed a provenance model to record their activities and a technology to automatically generate workflows for scientists from that provenance. On top of this, we have built a prototype of a data-centric provenance repository and established a PDSW (People, Data, Service, Workflow) knowledge network to support workflow recommendation. To ensure the scalability and performance of the expected recommendation system, we have leveraged Apache OODT technology. The community-approved, metrics-based performance evaluation web service will allow a user to select a metric from a list of several community-approved metrics and to evaluate model performance using that metric together with a reference dataset. This service will facilitate the use of reference datasets generated in support of model-data intercomparison projects such as Obs4MIPs and Ana4MIPs. The data-centric repository infrastructure will allow us to capture richer provenance to further facilitate knowledge sharing and scientific collaboration in the Earth science community. This project is part of the Apache incubator CMDA project.

  3. Comparison of the spatial and temporal variability of macroinvertebrate and periphyton-based metrics in a macrophyte-dominated shallow lake

    NASA Astrophysics Data System (ADS)

    Zhang, Lulu; Liu, Jingling; Li, Yi

    2015-03-01

    The influence of spatial differences, which are caused by different anthropogenic disturbances, and temporal changes, which are caused by natural conditions, on macroinvertebrate and periphyton communities in Baiyangdian Lake was compared. Periphyton and macrobenthos assemblage samples were collected simultaneously on four occasions during 2009 and 2010. Based on the physical and chemical attributes of the water and sediment, the 8 sampling sites could be divided into 5 habitat types using cluster analysis. According to coefficient of variation (CV) analysis, three primary conclusions can be drawn: (1) the metrics of Hilsenhoff Biotic Index (HBI), Percent Tolerant Taxa (PTT), Percent Dominant Taxon (PDT), and community loss index (CLI), based on macroinvertebrates, and the metrics of algal density (AD), the proportion of chlorophyta (CHL), and the proportion of cyanophyta (CYA), based on periphyton, were mostly constant throughout our study; (2) in terms of spatial variation, the CV values of the macroinvertebrate-based metrics were lower than those of the periphyton-based metrics, which may be caused by the effects of changes in environmental factors, whereas in terms of temporal variation the CV values of the macroinvertebrate-based metrics were higher than those of the periphyton-based metrics, which may be linked to the influences of phenology and life history patterns of the macroinvertebrate individuals; and (3) the CV values for the functional-based metrics were higher than those for the structural-based metrics. Therefore, spatial and temporal variation of metrics should be considered when applying these biometrics for assessment.
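
    The coefficient of variation used throughout the comparison is simply the standard deviation scaled by the mean. A minimal sketch with hypothetical metric values (the numbers below are illustrative, not the study's data):

```python
import statistics

def coefficient_of_variation(values):
    """CV = standard deviation / mean, often reported as a percent.

    A low CV means a biotic metric is stable across sites (spatial)
    or sampling occasions (temporal); a high CV means it is variable.
    """
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical metric values from repeated sampling occasions:
hbi_scores = [6.1, 6.3, 5.9, 6.2]       # a stable metric -> low CV
algal_density = [120, 450, 80, 900]     # a variable metric -> high CV
print(coefficient_of_variation(hbi_scores))
print(coefficient_of_variation(algal_density))
```

    Ranking metrics by CV in this way is how one would decide which biometrics are robust enough for assessment at a given spatial or temporal scale.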

  4. Not seeing the forest for the trees: size of the minimum spanning trees (MSTs) forest and branch significance in MST-based phylogenetic analysis.

    PubMed

    Teixeira, Andreia Sofia; Monteiro, Pedro T; Carriço, João A; Ramirez, Mário; Francisco, Alexandre P

    2015-01-01

    Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. It may not be apparent, however, that a presented tree is just one hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs where a given edge is present. The metric provides a per-edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric concerning both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing (MLST) data and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selection of the edge to be represented using bootstrap could lead to unreliable results, since alternative edges are present in the same fraction of equivalent MSTs. The choice of which MST to present results from criteria implemented in the algorithm, which must be based on biologically plausible models.
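
    The spanning edge betweenness can be illustrated by brute force on a toy graph (the paper computes it exactly via the Kirchhoff matrix tree theorem; the graph below is hypothetical and small enough to enumerate all spanning trees directly):

```python
from itertools import combinations

# Toy weighted graph with tied weights, so several equivalent MSTs
# exist. Nodes and edges are illustrative, not from the paper's data.
nodes = {0, 1, 2, 3}
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 2)]

def is_spanning_tree(subset):
    """Check that an (n-1)-edge subset is acyclic, hence a spanning tree."""
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for u, v, _ in subset:
        ru, rv = find(u), find(v)
        if ru == rv:        # adding the edge would close a cycle
            return False
        parent[ru] = rv
    return True

# Enumerate all spanning trees, then keep only those of minimum total
# weight: the set of "equivalent MSTs".
trees = [t for t in combinations(edges, len(nodes) - 1) if is_spanning_tree(t)]
min_w = min(sum(w for _, _, w in t) for t in trees)
msts = [t for t in trees if sum(w for _, _, w in t) == min_w]

# Spanning edge betweenness: fraction of equivalent MSTs containing
# each edge. Edge (2, 3) is a bridge, so its betweenness is 1.0; the
# three tied triangle edges each appear in 2 of the 3 equivalent MSTs.
seb = {(u, v): sum((u, v, w) in t for t in msts) / len(msts)
       for u, v, w in edges}
print(seb)
```

    An edge with betweenness 1.0 is supported by every equivalent MST, much like a high bootstrap value; lower fractions flag edges whose selection was essentially arbitrary among ties.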

  5. MASTtreedist: visualization of tree space based on maximum agreement subtree.

    PubMed

    Huang, Hong; Li, Yongji

    2013-01-01

    The phylogenetic tree construction process might produce many candidate trees as the "best estimates." As the number of constructed phylogenetic trees grows, the need to efficiently compare their topological or physical structures arises. One tree-comparison software tool, Mesquite's Tree Set Viz module, allows the rapid and efficient visualization of tree-comparison distances using multidimensional scaling (MDS). Tree-distance measures, such as Robinson-Foulds (RF), for the topological distance among different trees have been implemented in Tree Set Viz. Newer, more sophisticated measures, such as the Maximum Agreement Subtree (MAST), can be built on top of Tree Set Viz. MAST can detect the common substructures among trees and provide more precise information on the similarity of the trees, but computing it is NP-hard and difficult to implement. In this article, we present a practical tree-distance metric: MASTtreedist, a MAST-based comparison metric in Mesquite's Tree Set Viz module. In this metric, efficient optimizations for the maximum weight clique problem are applied. The results suggest that the proposed method can efficiently compute the MAST distances among trees, and such tree topological differences can be translated into a scatter of points in two-dimensional (2D) space. We also provide statistical evaluation of the provided measures with respect to RF, using experimental data sets. This new comparison module provides a new pairwise tree-comparison metric based on the differences in the number of MAST leaves among constructed phylogenetic trees. Such a new phylogenetic tree comparison metric improves the visualization of taxa differences by discriminating small divergences of subtree structures for phylogenetic tree reconstruction.

  6. Construct validity of individual and summary performance metrics associated with a computer-based laparoscopic simulator.

    PubMed

    Rivard, Justin D; Vergis, Ashley S; Unger, Bertram J; Hardy, Krista M; Andrew, Chris G; Gillman, Lawrence M; Park, Jason

    2014-06-01

    Computer-based surgical simulators capture a multitude of metrics based on different aspects of performance, such as speed, accuracy, and movement efficiency. However, without rigorous assessment, it may be unclear whether all, some, or none of these metrics actually reflect technical skill, which can compromise educational efforts on these simulators. We assessed the construct validity of individual performance metrics on the LapVR simulator (Immersion Medical, San Jose, CA, USA) and used these data to create task-specific summary metrics. Medical students with no prior laparoscopic experience (novices, N = 12), junior surgical residents with some laparoscopic experience (intermediates, N = 12), and experienced surgeons (experts, N = 11) all completed three repetitions of four LapVR simulator tasks. The tasks included three basic skills (peg transfer, cutting, clipping) and one procedural skill (adhesiolysis). We selected 36 individual metrics on the four tasks that assessed six different aspects of performance, including speed, motion path length, respect for tissue, accuracy, task-specific errors, and successful task completion. Four of seven individual metrics assessed for peg transfer, six of ten metrics for cutting, four of nine metrics for clipping, and three of ten metrics for adhesiolysis discriminated between experience levels. Time and motion path length were significant on all four tasks. We used the validated individual metrics to create summary equations for each task, which successfully distinguished between the different experience levels. Educators should maintain some skepticism when reviewing the plethora of metrics captured by computer-based simulators, as some but not all are valid. We showed the construct validity of a limited number of individual metrics and developed summary metrics for the LapVR. The summary metrics provide a succinct way of assessing skill with a single metric for each task, but require further validation.

  7. A guide to calculating habitat-quality metrics to inform conservation of highly mobile species

    USGS Publications Warehouse

    Bieri, Joanna A.; Sample, Christine; Thogmartin, Wayne E.; Diffendorfer, James E.; Earl, Julia E.; Erickson, Richard A.; Federico, Paula; Flockhart, D. T. Tyler; Nicol, Sam; Semmens, Darius J.; Skraber, T.; Wiederholt, Ruscena; Mattsson, Brady J.

    2018-01-01

    Many metrics exist for quantifying the relative value of habitats and pathways used by highly mobile species. Properly selecting and applying such metrics requires substantial background in mathematics and understanding the relevant management arena. To address this multidimensional challenge, we demonstrate and compare three measurements of habitat quality: graph-, occupancy-, and demographic-based metrics. Each metric provides insights into system dynamics, at the expense of increasing amounts and complexity of data and models. Our descriptions and comparisons of diverse habitat-quality metrics provide means for practitioners to overcome the modeling challenges associated with management or conservation of such highly mobile species. Whereas previous guidance for applying habitat-quality metrics has been scattered in diversified tracks of literature, we have brought this information together into an approachable format including accessible descriptions and a modeling case study for a typical example that conservation professionals can adapt for their own decision contexts and focal populations. Considerations for Resource Managers: Management objectives, proposed actions, data availability and quality, and model assumptions are all relevant considerations when applying and interpreting habitat-quality metrics. Graph-based metrics answer questions related to habitat centrality and connectivity, are suitable for populations with any movement pattern, quantify basic spatial and temporal patterns of occupancy and movement, and require the least data. Occupancy-based metrics answer questions about likelihood of persistence or colonization, are suitable for populations that undergo localized extinctions, quantify spatial and temporal patterns of occupancy and movement, and require a moderate amount of data. Demographic-based metrics answer questions about relative or absolute population size, are suitable for populations with any movement pattern, quantify demographic processes and population dynamics, and require the most data. More real-world examples applying occupancy-based, agent-based, and continuous-based metrics to seasonally migratory species are needed to better understand challenges and opportunities for applying these metrics more broadly.

  8. An alternative mechanism for international health aid: evaluating a Global Social Protection Fund.

    PubMed

    Basu, Sanjay; Stuckler, David; McKee, Martin

    2014-01-01

    Several public health groups have called for the creation of a global fund for 'social protection': a fund that produces the international equivalent of domestic tax collection and safety net systems to finance care for the ill and disabled and related health costs. All participating countries would pay into a global fund based on a metric of their ability to pay and withdraw from the common pool based on a metric of their need for funds. We assessed how alternative strategies and metrics for operating such a fund would affect its size and its impact on health system financing. Using a mathematical model, we found that common targets for health funding in low-income countries require higher levels of aid expenditure than are presently distributed. Some mechanisms exist that may incentivize the reduction of domestic health inequalities and direct most funds towards the poorest populations. Payments from high-income countries are also likely to decrease over time as middle-income countries' economies grow.

  9. A lighting metric for quantitative evaluation of accent lighting systems

    NASA Astrophysics Data System (ADS)

    Acholo, Cyril O.; Connor, Kenneth A.; Radke, Richard J.

    2014-09-01

    Accent lighting is critical for lighting artwork and sculpture in museums and for subject lighting on stage and in film and television. The research problem of designing effective lighting in such settings has been revived recently with the rise of light-emitting-diode-based solid-state lighting. In this work, we propose an easy-to-apply quantitative measure of a scene's visual quality as perceived by human viewers. We consider a well-accent-lit scene as one which maximizes the information about the scene (in an information-theoretic sense) available to the viewer. We propose a metric based on the entropy of the distribution of colors, which are extracted from an image of the scene from the viewer's perspective. We demonstrate that optimizing the metric as a function of illumination configuration (i.e., position, orientation, and spectral composition) results in natural, pleasing accent lighting. We use a photorealistic simulation tool to validate our proposed approach, showing its successful application to two- and three-dimensional scenes.
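
    A minimal sketch of the entropy-of-colors idea, assuming a simple uniform quantization of RGB values (the bin count and extraction details here are illustrative, not the authors' exact procedure):

```python
import numpy as np

def color_entropy(image, bins=8):
    """Shannon entropy (bits) of the color distribution of an RGB image.

    Quantize each channel into `bins` levels, histogram the resulting
    colors, and compute the entropy of that distribution. A flat,
    single-color scene carries no color information; a richly lit
    scene spreads probability over many color bins.
    """
    # image: H x W x 3 array of uint8 values in [0, 255]
    quantized = (image // (256 // bins)).astype(int).reshape(-1, 3)
    # Encode each quantized (r, g, b) triple as a single bin index.
    codes = quantized[:, 0] * bins * bins + quantized[:, 1] * bins + quantized[:, 2]
    counts = np.bincount(codes, minlength=bins ** 3)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(np.sum(-p * np.log2(p)))

rng = np.random.default_rng(0)
flat = np.full((64, 64, 3), 128, dtype=np.uint8)             # one color: entropy 0
noisy = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)    # near-max entropy
print(color_entropy(flat), color_entropy(noisy))
```

    Optimizing luminaire position and spectrum to raise this entropy is the kind of search the photorealistic simulation in the abstract performs.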

  10. Efficacy of single and multi-metric fish-based indices in tracking anthropogenic pressures in estuaries: An 8-year case study.

    PubMed

    Martinho, Filipe; Nyitrai, Daniel; Crespo, Daniel; Pardal, Miguel A

    2015-12-15

    In the face of a generalized increase in water degradation, several programmes have been implemented to protect and enhance water quality and the associated wildlife; these rely on ecological indicators to assess the degree of deviation from a pristine state. Here, single-metric (species number, Shannon-Wiener H', Pielou J') and multi-metric (Estuarine Fish Assessment Index, EFAI) community-based ecological quality measures were evaluated in a temperate estuary over an 8-year period (2005-2012), and their relationships with an anthropogenic pressure index (API) were established. Single-metric indices were highly variable and concordant neither amongst themselves nor with the EFAI. The EFAI was the only index significantly correlated with the API, indicating that higher ecological quality was associated with lower anthropogenic pressure. Pressure scenarios were related to specific fish community compositions, as a result of distinct food web complexity and the nursery functioning of the estuary. Results are discussed in the scope of the implementation of water protection programmes. Copyright © 2015 Elsevier Ltd. All rights reserved.
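
    The single metrics named in the abstract are standard diversity indices. A minimal sketch with hypothetical abundance data (the counts below are illustrative, not the study's catch data):

```python
import math

def shannon_wiener(counts):
    """Shannon-Wiener diversity H' from species abundance counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S), where S is the species number."""
    s = sum(1 for c in counts if c > 0)
    return shannon_wiener(counts) / math.log(s)

# Hypothetical catches: a perfectly even community scores J' = 1.0,
# a community dominated by one species scores much lower.
even = [10, 10, 10, 10]
skewed = [97, 1, 1, 1]
print(shannon_wiener(even), pielou_evenness(even))   # ln(4) ≈ 1.386, J' = 1.0
print(pielou_evenness(skewed))                        # well below 1
```

    The abstract's point is that such single metrics, tracked alone, fluctuated without tracking the pressure index, while the multi-metric EFAI did.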

  11. Healthcare4VideoStorm: Making Smart Decisions Based on Storm Metrics.

    PubMed

    Zhang, Weishan; Duan, Pengcheng; Chen, Xiufeng; Lu, Qinghua

    2016-04-23

    Storm-based stream processing is widely used for real-time large-scale distributed processing. Knowing the run-time status and ensuring performance is critical to providing the expected dependability for some applications, e.g., continuous video processing for security surveillance. Existing scheduling strategies are too coarse-grained to achieve good performance, and they consider only network resources, not computing resources, when scheduling. In this paper, we propose Healthcare4Storm, a framework that derives Storm insights from Storm metrics to gain knowledge of the health status of an application, ultimately arriving at smart scheduling decisions. It takes into account both network and computing resources and conducts scheduling at a fine-grained level using tuples instead of topologies. A comprehensive evaluation shows that the proposed framework performs well and can improve the dependability of Storm-based applications.

  12. To Control False Positives in Gene-Gene Interaction Analysis: Two Novel Conditional Entropy-Based Approaches

    PubMed Central

    Lin, Meihua; Li, Haoli; Zhao, Xiaolei; Qin, Jiheng

    2013-01-01

    Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue to identify the missing genetic components that cannot be detected by current single-point association analysis. Recently, several model-free methods (e.g. the commonly used information-based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they are potentially at risk of inflated false-positive error, in particular when the main effects at one or both loci are salient. In this study, we proposed two conditional entropy-based metrics to address this limitation. Extensive simulations demonstrated that, provided the disease is rare, the two proposed metrics could maintain a consistently correct false-positive rate. In scenarios for a common disease, our proposed metrics achieved better or comparable control of false-positive error, compared to four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results or the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising as alternatives to current model-based approaches for detecting genuine epistatic effects. PMID:24339984
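
    The conditional-entropy building block underlying such metrics can be sketched as a generic estimator from samples (this is the standard identity H(Y|X) = H(X, Y) - H(X), not the authors' exact test statistics):

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """Estimate H(Y|X) = H(X, Y) - H(X) from (x, y) samples, in bits."""
    def entropy(items):
        counts = Counter(items)
        n = sum(counts.values())
        return -sum(c / n * math.log2(c / n) for c in counts.values())
    return entropy(pairs) - entropy([x for x, _ in pairs])

# Toy genotype/phenotype data: when y is fully determined by x,
# knowing x leaves no uncertainty about y, so H(Y|X) = 0.
pairs = [("AA", 0), ("AA", 0), ("Aa", 1), ("aa", 1)]
print(conditional_entropy(pairs))  # → 0.0
```

    Conditioning on the marginal genotype distribution in this way is what lets such metrics discount strong main effects that inflate the false-positive rate of unconditional information-based tests.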

  13. Mapping multiple components of malaria risk for improved targeting of elimination interventions.

    PubMed

    Cohen, Justin M; Le Menach, Arnaud; Pothin, Emilie; Eisele, Thomas P; Gething, Peter W; Eckhoff, Philip A; Moonen, Bruno; Schapira, Allan; Smith, David L

    2017-11-13

    There is a long history of considering the constituent components of malaria risk and the malaria transmission cycle via the use of mathematical models, yet strategic planning in endemic countries tends not to take full advantage of available disease intelligence to tailor interventions. National malaria programmes typically make operational decisions about where to implement vector control and surveillance activities based upon simple categorizations of annual parasite incidence. With technological advances, an enormous opportunity exists to better target specific malaria interventions to the places where they will have greatest impact by mapping and evaluating metrics related to a variety of risk components, each of which describes a different facet of the transmission cycle. Here, these components and their implications for operational decision-making are reviewed. For each component, related mappable malaria metrics are also described which may be measured and evaluated by malaria programmes seeking to better understand the determinants of malaria risk. Implementing tailored programmes based on knowledge of the heterogeneous distribution of the drivers of malaria transmission rather than only consideration of traditional metrics such as case incidence has the potential to result in substantial improvements in decision-making. As programmes improve their ability to prioritize their available tools to the places where evidence suggests they will be most effective, elimination aspirations may become increasingly feasible.

  14. Performance Evaluation of the Approaches and Algorithms for Hamburg Airport Operations

    NASA Technical Reports Server (NTRS)

    Zhu, Zhifan; Jung, Yoon; Lee, Hanbong; Schier, Sebastian; Okuniek, Nikolai; Gerdes, Ingrid

    2016-01-01

    In this work, fast-time simulations have been conducted using SARDA tools at Hamburg airport by NASA and real-time simulations using CADEO and TRACC with the NLR ATM Research Simulator (NARSIM) by DLR. The outputs are analyzed using a set of common metrics developed collaboratively by DLR and NASA. The proposed metrics are derived from the International Civil Aviation Organization (ICAO)'s Key Performance Areas (KPAs) in capacity, efficiency, predictability and environment, and adapted to simulation studies. The results are examined to explore and compare the merits and shortcomings of the two approaches using the common performance metrics. Particular attention is paid to the concept of closed-loop, trajectory-based taxi as well as the application of the US concept to a European airport. Both teams consider the trajectory-based surface operation concept a critical technology advance, not only in addressing current surface traffic management problems, but also for potential application in unmanned vehicle maneuvering on the airport surface, such as autonomous towing or TaxiBot [6][7] and even Remotely Piloted Aircraft (RPA). Based on this work, a future integration of TRACC and SOSS is described, aiming at bringing the conflict-free trajectory-based operation concept to US airports.

  15. FAST COGNITIVE AND TASK ORIENTED, ITERATIVE DATA DISPLAY (FACTOID)

    DTIC Science & Technology

    2017-06-01

    approaches. As a result, the following assumptions guided our efforts in developing modeling and descriptive metrics for evaluation purposes... Application Evaluation. Our analytic workflow for evaluation is to first provide descriptive statistics about applications across metrics (performance... distributions for evaluation purposes because the goal of evaluation is accurate description, not inference (e.g., prediction). Outliers depicted

  16. Strategy quantification using body worn inertial sensors in a reactive agility task.

    PubMed

    Eke, Chika U; Cain, Stephen M; Stirling, Leia A

    2017-11-07

    Agility performance is often evaluated using time-based metrics, which provide little information about which factors aid or limit success. The objective of this study was to better understand agility strategy by identifying biomechanical metrics that were sensitive to performance speed, which were calculated with data from an array of body-worn inertial sensors. Five metrics were defined (normalized number of foot contacts, stride length variance, arm swing variance, mean normalized stride frequency, and number of body rotations) that corresponded to agility terms defined by experts working in athletic, clinical, and military environments. Eighteen participants donned 13 sensors to complete a reactive agility task, which involved navigating a set of cones in response to a vocal cue. Participants were grouped into fast, medium, and slow performance based on their completion time. Participants in the fast group had the smallest number of foot contacts (normalizing by height), highest stride length variance (normalizing by height), highest forearm angular velocity variance, and highest stride frequency (normalizing by height). The number of body rotations was not sensitive to speed and may have been determined by hand and foot dominance while completing the agility task. The results of this study have the potential to inform the development of a composite agility score constructed from the list of significant metrics. By quantifying the agility terms previously defined by expert evaluators through an agility score, this study can assist in strategy development for training and rehabilitation across athletic, clinical, and military domains. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Assessment of a Pesticide Exposure Intensity Algorithm in the Agricultural Health Study

    EPA Science Inventory

    The accuracy of the exposure assessment is a critical factor in epidemiological investigations of pesticide exposures and health in agricultural populations. However, few studies have been conducted to evaluate questionnaire-based exposure metrics. The Agricultural Health Study...

  18. SU-E-J-159: Intra-Patient Deformable Image Registration Uncertainties Quantified Using the Distance Discordance Metric

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saleh, Z; Thor, M; Apte, A

    2014-06-01

    Purpose: The quantitative evaluation of deformable image registration (DIR) is currently challenging due to the lack of a ground truth. In this study we test a new method proposed for quantifying multiple-image-based DIR-related uncertainties, for DIR of pelvic images. Methods: 19 patients who had previously received radiotherapy for prostate cancer were analyzed, each with 6 CT scans. Manually delineated structures for rectum and bladder, which served as ground truth structures, were delineated on the planning CT and each subsequent scan. For each patient, voxel-by-voxel DIR-related uncertainties were evaluated, following B-spline based DIR, by applying a previously developed metric, the distance discordance metric (DDM; Saleh et al., PMB (2014) 59:733). The DDM map was superimposed on the first acquired CT scan and DDM statistics were assessed, also relative to two metrics estimating the agreement between the propagated and the manually delineated structures. Results: The highest DDM values, which correspond to the greatest spatial uncertainties, were observed near the body surface and in the bowel due to the presence of gas. The mean rectal and bladder DDM values ranged from 1.1-11.1 mm and 1.5-12.7 mm, respectively. There was a strong correlation in the DDMs between the rectum and bladder (Pearson R = 0.68 for the max DDM). For both structures, DDM was correlated with the ratio between the DIR-propagated and manually delineated volumes (R = 0.74 for the max rectal DDM). The maximum rectal DDM was negatively correlated with the Dice Similarity Coefficient between the propagated and the manually delineated volumes (R = −0.52). Conclusion: The multiple-image-based DDM map quantified considerable DIR variability across different structures and among patients. Besides using the DDM for quantifying DIR-related uncertainties, it could potentially be used to adjust for uncertainties in DIR-based accumulated dose distributions.

  19. Evaluation of Daily Evapotranspiration Over Orchards Using METRIC Approach and Landsat Satellite Observations

    NASA Astrophysics Data System (ADS)

    He, R.; Jin, Y.; Daniele, Z.; Kandelous, M. M.; Kent, E. R.

    2016-12-01

    The pistachio and almond acreage in California has been rapidly growing in the past 10 years, raising concerns about competition for limited water resources in California. A robust and cost-effective mapping of crop water use, mostly evapotranspiration (ET), by orchards is needed for improved farm-level irrigation management and regional water planning. METRIC™, a satellite-based surface energy balance approach, has been widely used to map field-scale crop ET, mostly over row crops. We here aim to apply METRIC with Landsat satellite observations over California's orchards and evaluate the ET estimates by comparing with field measurements in the South San Joaquin Valley, California. Reference ET of grass (ETo) from California Irrigation Management Information System (CIMIS) stations was used to estimate daily ET of commercial almond and pistachio orchards. Our comparisons showed that METRIC-Landsat daily ET estimates agreed well with ET measured by the eddy covariance and surface renewal stations, with an RMSE of 1.25 and a correlation coefficient of 0.84 for the pistachio orchard. A slight positive bias of the satellite-based ET estimates was found for both pistachio and almond orchards. We also found that the time series of NDVI was highly correlated with ET temporal dynamics within each field, but the correlation was reduced to 0.56 when all fields were pooled together. Net radiation, however, remained highly correlated with ET across all the fields. The METRIC ET was able to distinguish the differences in ET between salt- and non-salt-affected pistachio orchards; e.g., mean daily ET during the growing season in salt-affected orchards was lower than that of non-salt-affected orchards by 0.87 mm/day. The remote sensing based ET estimate will support a variety of state and local interests in water use and management, for both planning and regulatory/compliance purposes, and provide farmers observation-based guidance for site-specific and time-sensitive irrigation management.

  20. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Pesticide Factsheets

    The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude-and-sequence errors.
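
    The three error families can be illustrated with standard statistics (the tutorial's exact metric set is not listed in this abstract, so these are representative choices): bias responds to magnitude only, rank correlation to sequence only, and RMSE to both.

```python
import numpy as np

def bias(obs, sim):
    """Magnitude-only error: mean difference, blind to ordering in time."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float((sim - obs).mean())

def spearman_rho(obs, sim):
    """Sequence-only agreement: rank correlation ignores magnitudes.
    (Simple double-argsort ranking; assumes no ties.)"""
    def ranks(x):
        return np.argsort(np.argsort(np.asarray(x)))
    return float(np.corrcoef(ranks(obs), ranks(sim))[0, 1])

def rmse(obs, sim):
    """Combined magnitude-and-sequence error."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(((sim - obs) ** 2).mean()))
```

    A simulation shifted by a constant has nonzero bias and RMSE but perfect rank correlation, which is exactly the kind of distinction such a metric suite is built to expose.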

  1. Association of airborne moisture-indicating microorganisms withbuilding-related symptoms and water damage in 100 U.S. office buildings:Analyses of the U.S. EPA BASE data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mendell, Mark J.; Lei, Quanhong; Cozen, Myrna O.

    2003-10-01

    Metrics of culturable airborne microorganisms for either total organisms or suspected harmful subgroups have generally not been associated with symptoms among building occupants. However, the visible presence of moisture damage or mold in residences and other buildings has consistently been associated with respiratory symptoms and other health effects. This relationship is presumably caused by adverse but uncharacterized exposures to moisture-related microbiological growth. In order to assess this hypothesis, we studied relationships in U.S. office buildings between the prevalence of respiratory and irritant symptoms, the concentrations of airborne microorganisms that require moist surfaces on which to grow, and the presence of visible water damage. For these analyses we used data on buildings, indoor environments, and occupants collected from a representative sample of 100 U.S. office buildings in the U.S. Environmental Protection Agency's Building Assessment Survey and Evaluation (EPA BASE) study. We created 19 alternate metrics, using scales ranging from 3-10 units, that summarized the concentrations of airborne moisture-indicating microorganisms (AMIMOs) as indicators of moisture in buildings. Two were constructed to resemble a metric previously reported to be associated with lung function changes in building occupants; the others were based on another metric from the same group of Finnish researchers, concentration cutpoints from other studies, and professional judgment. We assessed three types of associations: between AMIMO metrics and symptoms in office workers, between evidence of water damage and symptoms, and between water damage and AMIMO metrics. We estimated (as odds ratios (ORs) with 95% confidence intervals) the unadjusted and adjusted associations between the 19 metrics and two types of weekly, work-related symptoms--lower respiratory and mucous membrane--using logistic regression models. 
Analyses used the original AMIMO metrics and were repeated with simplified dichotomized metrics. The multivariate models adjusted for other potential confounding variables associated with respondents, occupied spaces, buildings, or ventilation systems. Models excluded covariates for moisture-related risks hypothesized to increase AMIMO levels. We also estimated the association of water damage (using variables for specific locations in the study space or building, or summary variables) with the two symptom outcomes. Finally, using selected AMIMO metrics as outcomes, we constructed logistic regression models with observations at the building level to estimate unadjusted and adjusted associations of evident water damage with AMIMO metrics. All original AMIMO metrics showed little overall pattern of unadjusted or adjusted association with either symptom outcome. The 3-category metric resembling that previously used by others, which of all constructed metrics had the largest number of buildings in its top category, was not associated with symptoms in these buildings. However, most metrics with few buildings in their highest category showed increased risk for both symptoms in that category, especially metrics using cutpoints of >100 but <500 colony-forming units (CFU)/m{sup 3} for concentration of total culturable fungi. With AMIMO metrics dichotomized to compare the highest category with all lower categories combined, four metrics had unadjusted ORs between 1.4 and 1.6 for both symptom outcomes. The same four metrics had adjusted ORs of 1.7-2.1 for both symptom outcomes. In models of water damage and symptoms, several specific locations of past water damage had significant associations with outcomes, with ORs ranging from 1.4-1.6. In bivariate models of water damage and selected AMIMO metrics, a number of specific types of water damage and several summary variables for water damage were very strongly associated with AMIMO metrics (significant ORs ranging above 15). 
    Multivariate modeling with the dichotomous AMIMO metrics was not possible due to limited numbers of observations.

  2. An Innovative Metric to Evaluate Satellite Precipitation's Spatial Distribution

    NASA Astrophysics Data System (ADS)

    Liu, H.; Chu, W.; Gao, X.; Sorooshian, S.

    2011-12-01

    Thanks to their capability to cover the mountains, where ground measurement instruments cannot reach, satellites provide a good means of estimating precipitation over mountainous regions. In regions with complex terrain, accurate information on the high-resolution spatial distribution of precipitation is critical for many important issues, such as flood/landslide warning, reservoir operation, water system planning, etc. Therefore, in order to be useful in many practical applications, satellite precipitation products should possess high quality in characterizing spatial distribution. However, most existing validation metrics, which are based on point/grid comparison using simple statistics, cannot effectively measure a satellite's skill in capturing the spatial patterns of precipitation fields. This deficiency results from the fact that point/grid-wise comparison does not take into account the spatial coherence of precipitation fields. Furthermore, another weakness of many metrics is that they can barely provide information on why satellite products perform well or poorly. Motivated by our recent findings of the consistent spatial patterns of the precipitation field over the western U.S., we developed a new metric utilizing EOF analysis and Shannon entropy. The metric can be derived through two steps: 1) capture the dominant spatial patterns of precipitation fields from both satellite products and reference data through EOF analysis, and 2) compute the similarities between the corresponding dominant patterns using a mutual information measure defined with Shannon entropy. Instead of working with individual points/grids, the new metric treats the entire precipitation field simultaneously, naturally taking advantage of spatial dependence. Since the dominant spatial patterns are shaped by physical processes, the new metric can shed light on why a satellite product can or cannot capture the spatial patterns. 
    For demonstration, an experiment was carried out to evaluate a satellite precipitation product, CMORPH, against the U.S. daily precipitation analysis of the Climate Prediction Center (CPC) at a daily, 0.25° scale over the Western U.S.
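
    A minimal sketch of the two steps, using SVD-based EOFs and a histogram estimate of mutual information (the authors' exact estimator, binning, and preprocessing are not specified in the abstract):

```python
import numpy as np

def leading_eofs(field, k=3):
    """Dominant spatial patterns (EOFs) of a (time x gridpoint)
    field, via SVD of the temporal anomalies."""
    anom = field - field.mean(axis=0)
    _, _, vt = np.linalg.svd(anom, full_matrices=False)
    return vt[:k]                        # k orthonormal spatial patterns

def mutual_information(x, y, bins=16):
    """Shannon mutual information (bits) between two pattern vectors,
    estimated from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    outer = np.outer(pxy.sum(axis=1), pxy.sum(axis=0))
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / outer[nz])).sum())
```

    The metric would then compare, pattern by pattern, the mutual information between the satellite product's EOFs and the reference analysis's EOFs: high values indicate the product reproduces the physically driven spatial structure.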

  3. Evaluation of Deposited Sediment and Macroinvertebrate Metrics Used to Quantify Biological Response to Excessive Sedimentation in Agricultural Streams

    NASA Astrophysics Data System (ADS)

    Sutherland, Andrew B.; Culp, Joseph M.; Benoy, Glenn A.

    2012-07-01

    The objective of this study was to evaluate which macroinvertebrate and deposited sediment metrics are best for determining effects of excessive sedimentation on stream integrity. Fifteen instream sediment metrics, with the strongest relationship to land cover, were compared to riffle macroinvertebrate metrics in streams ranging across a gradient of land disturbance. Six deposited sediment metrics were strongly related to the relative abundance of Ephemeroptera, Plecoptera and Trichoptera and six were strongly related to the modified family biotic index (MFBI). Few functional feeding groups and habit groups were significantly related to deposited sediment, and this may be related to the focus on riffle, rather than reach-wide macroinvertebrates, as reach-wide sediment metrics were more closely related to human land use. Our results suggest that the coarse-level deposited sediment metric, visual estimate of fines, and the coarse-level biological index, MFBI, may be useful in biomonitoring efforts aimed at determining the impact of anthropogenic sedimentation on stream biotic integrity.

  4. Evaluation of deposited sediment and macroinvertebrate metrics used to quantify biological response to excessive sedimentation in agricultural streams.

    PubMed

    Sutherland, Andrew B; Culp, Joseph M; Benoy, Glenn A

    2012-07-01

    The objective of this study was to evaluate which macroinvertebrate and deposited sediment metrics are best for determining effects of excessive sedimentation on stream integrity. Fifteen instream sediment metrics, with the strongest relationship to land cover, were compared to riffle macroinvertebrate metrics in streams ranging across a gradient of land disturbance. Six deposited sediment metrics were strongly related to the relative abundance of Ephemeroptera, Plecoptera and Trichoptera and six were strongly related to the modified family biotic index (MFBI). Few functional feeding groups and habit groups were significantly related to deposited sediment, and this may be related to the focus on riffle, rather than reach-wide macroinvertebrates, as reach-wide sediment metrics were more closely related to human land use. Our results suggest that the coarse-level deposited sediment metric, visual estimate of fines, and the coarse-level biological index, MFBI, may be useful in biomonitoring efforts aimed at determining the impact of anthropogenic sedimentation on stream biotic integrity.

  5. Apparatus and method for determining microscale interactions based on compressive sensors such as crystal structures

    DOEpatents

    McAdams, Harley; AlQuraishi, Mohammed

    2015-04-21

    Techniques for determining values for a metric of microscale interactions include determining a mesoscale metric for a plurality of mesoscale interaction types, wherein a value of the mesoscale metric for each mesoscale interaction type is based on a corresponding function of values of the microscale metric for the plurality of the microscale interaction types. A plurality of observations that indicate the values of the mesoscale metric are determined for the plurality of mesoscale interaction types. Values of the microscale metric are determined for the plurality of microscale interaction types based on the plurality of observations and the corresponding functions and compressed sensing.

  6. Important LiDAR metrics for discriminating forest tree species in Central Europe

    NASA Astrophysics Data System (ADS)

    Shi, Yifang; Wang, Tiejun; Skidmore, Andrew K.; Heurich, Marco

    2018-03-01

    Numerous airborne LiDAR-derived metrics have been proposed for classifying tree species. Yet an in-depth ecological and biological understanding of the significance of these metrics for tree species mapping remains largely unexplored. In this paper, we evaluated the performance of 37 frequently used LiDAR metrics derived under leaf-on and leaf-off conditions, respectively, for discriminating six different tree species in a natural forest in Germany. We first assessed the correlation between these metrics. Then we applied a Random Forest algorithm to classify the tree species and evaluated the importance of the LiDAR metrics. Finally, we identified the most important LiDAR metrics and tested their robustness and transferability. Our results indicated that about 60% of the LiDAR metrics were highly correlated with each other (|r| > 0.7). There was no statistically significant difference in tree species mapping accuracy between the use of leaf-on and leaf-off LiDAR metrics. However, combining leaf-on and leaf-off LiDAR metrics significantly increased the overall accuracy from 58.2% (leaf-on) and 62.0% (leaf-off) to 66.5%, as well as the kappa coefficient from 0.47 (leaf-on) and 0.51 (leaf-off) to 0.58. Radiometric features, especially intensity-related metrics, provided more consistent and significant contributions than geometric features for tree species discrimination. Specifically, the mean intensity of first-or-single returns as well as the mean value of echo width were identified as the most robust LiDAR metrics for tree species discrimination. These results indicate that metrics derived from airborne LiDAR data, especially radiometric metrics, can aid in discriminating tree species in a mixed temperate forest, and represent candidate metrics for tree species classification and monitoring in Central Europe.
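
    The correlation screen in the first step can be sketched as follows (illustrative only; metric names here are hypothetical placeholders, and the paper's analysis used 37 real LiDAR metrics):

```python
import numpy as np

def redundant_pairs(X, names, threshold=0.7):
    """List pairs of metrics (columns of X) whose absolute Pearson
    correlation exceeds the threshold; the paper flagged |r| > 0.7."""
    r = np.corrcoef(X, rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                flagged.append((names[i], names[j], float(r[i, j])))
    return flagged
```

    Dropping one metric from each highly correlated pair before classification reduces redundancy and makes Random Forest importance scores easier to interpret.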

  7. Application of Bounded Linear Stability Analysis Method for Metrics-Driven Adaptive Control

    NASA Technical Reports Server (NTRS)

    Bakhtiari-Nejad, Maryam; Nguyen, Nhan T.; Krishnakumar, Kalmanje

    2009-01-01

    This paper presents the application of the Bounded Linear Stability Analysis (BLSA) method for metrics-driven adaptive control. The BLSA method is used for analyzing the stability of adaptive control models without linearizing the adaptive laws. Metrics-driven adaptive control introduces the notion that adaptation should be driven by stability metrics to achieve robustness. By applying the BLSA method, the adaptive gain is adjusted during adaptation in order to meet certain phase margin requirements. Metrics-driven adaptive control is evaluated for a second-order system that represents the pitch attitude control of a generic transport aircraft. The analysis shows that the system with the metrics-conforming variable adaptive gain becomes more robust to unmodeled dynamics or time delay. The effect of the analysis time window for BLSA is also evaluated in order to meet the stability margin criteria.

  8. Dose-distance metric that predicts late rectal bleeding in patients receiving radical prostate external-beam radiotherapy

    NASA Astrophysics Data System (ADS)

    Lee, Richard; Chan, Elisa K.; Kosztyla, Robert; Liu, Mitchell; Moiseenko, Vitali

    2012-12-01

    The relationship between rectal dose distribution and the incidence of late rectal complications following external-beam radiotherapy has been previously studied using dose-volume histograms or dose-surface histograms. However, they do not account for the spatial dose distribution. This study proposes a metric based on both surface dose and distance that can predict the incidence of rectal bleeding in prostate cancer patients treated with radical radiotherapy. One hundred and forty-four patients treated with radical radiotherapy for prostate cancer were prospectively followed to record the incidence of grade ≥2 rectal bleeding. Radiotherapy plans were used to evaluate a dose-distance metric that accounts for the dose and its spatial distribution on the rectal surface, characterized by a logistic weighting function with slope a and inflection point d0. This was compared to the effective dose obtained from dose-surface histograms, characterized by the parameter n which describes sensitivity to hot spots. The log-rank test was used to determine statistically significant (p < 0.05) cut-off values for the dose-distance metric and effective dose that predict for the occurrence of rectal bleeding. For the dose-distance metric, only d0 = 25 and 30 mm combined with a > 5 led to statistical significant cut-offs. For the effective dose metric, only values of n in the range 0.07-0.35 led to statistically significant cut-offs. The proposed dose-distance metric is a predictor of rectal bleeding in prostate cancer patients treated with radiotherapy. Both the dose-distance metric and the effective dose metric indicate that the incidence of grade ≥2 rectal bleeding is sensitive to localized damage to the rectal surface.
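
    A sketch of a logistic distance weighting with slope a and inflection point d0, and a weighted aggregation of surface dose. The abstract does not give the exact functional form or the aggregation rule, so this is only one plausible reading:

```python
import numpy as np

def logistic_weight(d, a, d0):
    """Logistic weighting of distance d (mm) on the rectal surface:
    near 1 for d well below the inflection point d0, near 0 well
    above it, with steepness set by the slope parameter a.
    One plausible form; the paper defines the exact function."""
    return 1.0 / (1.0 + np.exp(a * (np.asarray(d, dtype=float) - d0)))

def dose_distance_metric(doses, distances, a=5.0, d0=25.0):
    """Illustrative distance-weighted aggregate of surface dose."""
    w = logistic_weight(distances, a, d0)
    return float((w * np.asarray(doses, dtype=float)).sum() / w.sum())
```

    With a steep slope (a > 5, as in the significant cut-offs reported), the weight behaves almost like a hard cutoff at d0, so the metric effectively measures dose within a localized patch of the rectal surface.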

  9. Brain tumor classification and segmentation using sparse coding and dictionary learning.

    PubMed

    Salman Al-Shaikhli, Saif Dawood; Yang, Michael Ying; Rosenhahn, Bodo

    2016-08-01

    This paper presents a novel fully automatic framework for multi-class brain tumor classification and segmentation using a sparse coding and dictionary learning method. The proposed framework consists of two steps: classification and segmentation. The classification of the brain tumors is based on brain topology and texture. The segmentation is based on voxel values of the image data. Using K-SVD, two types of dictionaries are learned from the training data and their associated ground truth segmentation: feature dictionary and voxel-wise coupled dictionaries. The feature dictionary consists of global image features (topological and texture features). The coupled dictionaries consist of coupled information: gray scale voxel values of the training image data and their associated label voxel values of the ground truth segmentation of the training data. For quantitative evaluation, the proposed framework is evaluated using different metrics. The segmentation results of the brain tumor segmentation (MICCAI-BraTS-2013) database are evaluated using five different metric scores, which are computed using the online evaluation tool provided by the BraTS-2013 challenge organizers. Experimental results demonstrate that the proposed approach achieves an accurate brain tumor classification and segmentation and outperforms the state-of-the-art methods.

  10. Use of two population metrics clarifies biodiversity dynamics in large-scale monitoring: the case of trees in Japanese old-growth forests: the need for multiple population metrics in large-scale monitoring.

    PubMed

    Ogawa, Mifuyu; Yamaura, Yuichi; Abe, Shin; Hoshino, Daisuke; Hoshizaki, Kazuhiko; Iida, Shigeo; Katsuki, Toshio; Masaki, Takashi; Niiyama, Kaoru; Saito, Satoshi; Sakai, Takeshi; Sugita, Hisashi; Tanouchi, Hiroyuki; Amano, Tatsuya; Taki, Hisatomo; Okabe, Kimiko

    2011-07-01

    Many indicators/indices provide information on whether the 2010 biodiversity target of reducing declines in biodiversity has been achieved. The strengths and limitations of these various measures are now being discussed. Biodiversity dynamics are often evaluated by a single biological population metric, such as the abundance of each species. Here we examined tree population dynamics of 52 families (192 species) at 11 research sites (three vegetation zones) of Japanese old-growth forests using two population metrics: number of stems and basal area. We calculated indices that track the rate of change in all species of tree by taking the geometric mean of changes in population metrics between the 1990s and the 2000s at the national level and at the levels of the vegetation zone and family. We specifically focused on whether indices based on these two metrics behaved similarly. The indices showed that (1) the number of stems declined, whereas basal area did not change at the national level and (2) the degree of change in the indices varied by vegetation zone and family. These results suggest that Japanese old-growth forests have not degraded and may even be developing in some vegetation zones, and indicate that the use of a single population metric (or indicator/index) may be insufficient to precisely understand the state of biodiversity. It is therefore important to incorporate more metrics into monitoring schemes to overcome the risk of misunderstanding or misrepresenting biodiversity dynamics.
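
    The geometric-mean index of change described above can be sketched in a few lines (a generic form; the paper's exact aggregation across families and zones may differ):

```python
import numpy as np

def geometric_mean_index(before, after):
    """Index of change: geometric mean of per-group ratios of a
    population metric between two survey periods. Values above 1
    indicate overall increase, below 1 overall decline."""
    ratios = np.asarray(after, float) / np.asarray(before, float)
    return float(np.exp(np.log(ratios).mean()))

# A doubling and a halving cancel out: the index is 1 (no net change).
# This is why two metrics (stems vs. basal area) can tell different
# stories about the same forest.
print(geometric_mean_index([10.0, 10.0], [20.0, 5.0]))  # → 1.0
```
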

  11. Aboveground Biomass Estimation Using Reconstructed Feature of Airborne Discrete-Return LIDAR by Auto-Encoder Neural Network

    NASA Astrophysics Data System (ADS)

    Li, T.; Wang, Z.; Peng, J.

    2018-04-01

    Aboveground biomass (AGB) estimation is critical for quantifying carbon stocks and essential for evaluating the carbon cycle. In recent years, airborne LiDAR has shown great ability for high-precision AGB estimation. Most studies estimate AGB from feature metrics extracted from the canopy height distribution of the point cloud, which is calculated based on a precise digital terrain model (DTM). However, if forest canopy density is high, the probability of the LiDAR signal penetrating the canopy is lower, resulting in too few ground points to establish the DTM. The distribution of forest canopy height then becomes imprecise, and some critical feature metrics that correlate strongly with biomass, such as percentiles, maximums, means and standard deviations of the canopy point cloud, can hardly be extracted correctly. In order to address this issue, we propose a strategy of first reconstructing LiDAR feature metrics through an Auto-Encoder neural network and then using the reconstructed feature metrics to estimate AGB. To assess the prediction ability of the reconstructed feature metrics, both original and reconstructed feature metrics were regressed against field-observed AGB using multiple stepwise regression (MS) and partial least squares regression (PLS), respectively. The results showed that the estimation models using reconstructed feature metrics improved R2 by 5.44% and 18.09%, decreased RMSE by 10.06% and 22.13%, and reduced cross-validated RMSE (RMSEcv) by 10.00% and 21.70% for AGB, respectively. Therefore, reconstructing LiDAR point feature metrics has potential for addressing the AGB estimation challenge in dense canopy areas.
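
    The height-distribution feature metrics named above (percentiles, maximum, mean, standard deviation) are standard and can be computed directly from a DTM-normalized point cloud; the chosen percentile set here is an assumption for illustration:

```python
import numpy as np

def canopy_metrics(heights, percentiles=(25, 50, 75, 95)):
    """Height-distribution feature metrics from a DTM-normalized
    point cloud (heights above ground), of the kind the paper
    reconstructs when sparse ground returns make the height
    distribution unreliable."""
    h = np.asarray(heights, dtype=float)
    feats = {f"p{p}": float(np.percentile(h, p)) for p in percentiles}
    feats.update(max=float(h.max()), mean=float(h.mean()),
                 std=float(h.std()))
    return feats
```

    In the proposed pipeline these vectors, corrupted by DTM errors under dense canopy, are passed through the autoencoder, and the reconstructed versions feed the MS/PLS regressions against field-observed AGB.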

  12. Standardization of a Videofluoroscopic Swallow Study Protocol to Investigate Dysphagia in Dogs.

    PubMed

    Harris, R A; Grobman, M E; Allen, M J; Schachtel, J; Rawson, N E; Bennett, B; Ledyayev, J; Hopewell, B; Coates, J R; Reinero, C R; Lever, T E

    2017-03-01

    Videofluoroscopic swallow study (VFSS) is the gold standard for diagnosis of dysphagia in veterinary medicine but lacks standardized protocols that emulate physiologic feeding practices. Age impacts swallow function in humans but has not been evaluated by VFSS in dogs. To develop a protocol with custom kennels designed to allow free-feeding of 3 optimized formulations of contrast media and diets that address limitations of current VFSS protocols. We hypothesized that dogs evaluated by a free-feeding VFSS protocol would show differences in objective swallow metrics based on age. Healthy juvenile, adult, and geriatric dogs (n = 24). Prospective, experimental study. Custom kennels were developed to maintain natural feeding behaviors during VFSS. Three food consistencies (thin liquid, pureed food, and dry kibble) were formulated with either iohexol or barium to maximize palatability and voluntary prehension. Dogs were evaluated by 16 swallow metrics and compared across age groups. Development of a standardized VFSS protocol resulted in successful collection of swallow data in healthy dogs. No significant differences in swallow metrics were observed among age groups. Substantial variability was observed in healthy dogs when evaluated under these physiologic conditions. Features typically attributed to pathologic states, such as gastric reflux, were seen in healthy dogs. Development of a VFSS protocol that reflects natural feeding practices may allow emulation of physiology resulting in clinical signs of dysphagia. Age did not result in significant changes in swallow metrics, but additional studies are needed, particularly in light of substantial normal variation. Copyright © 2017 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.

  13. Neural decoding with kernel-based metric learning.

    PubMed

    Brockmeier, Austin J; Choi, John S; Kriminger, Evan G; Francis, Joseph T; Principe, Jose C

    2014-06-01

    In studies of the nervous system, the choice of metric for the neural responses is a pivotal assumption. For instance, a well-suited distance metric enables us to gauge the similarity of neural responses to various stimuli and assess the variability of responses to a repeated stimulus, both exploratory steps in understanding how the stimuli are encoded neurally. Here we introduce an approach where the metric is tuned for a particular neural decoding task. Neural spike train metrics have been used to quantify the information content carried by the timing of action potentials. While a number of metrics for individual neurons exist, a method to optimally combine single-neuron metrics into multineuron, or population-based, metrics is lacking. We pose the problem of optimizing multineuron metrics and other metrics using centered alignment, a kernel-based dependence measure. The approach is demonstrated on invasively recorded neural data consisting of both spike trains and local field potentials. The experimental paradigm consists of decoding the location of tactile stimulation on the forepaws of anesthetized rats. We show that the optimized metrics highlight the distinguishing dimensions of the neural response, significantly increase the decoding accuracy, and improve nonlinear dimensionality reduction methods for exploratory neural analysis.
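
    Centered alignment between two Gram matrices has a standard closed form. The sketch below is a generic implementation of that dependence measure only, not the paper's full multineuron metric optimization:

```python
import numpy as np

def centered_alignment(K, L):
    """Centered kernel alignment between two n x n Gram matrices K and L."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    Kc, Lc = H @ K @ H, H @ L @ H                  # center both kernels
    return np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc))
```

    Values near 1 indicate that the two kernels induce a similar similarity structure; a metric can be tuned by maximizing alignment between its induced kernel and an ideal label kernel.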

  14. EVALUATION OF ORAL AND INTRAVENOUS ROUTE PHARMACOKINETICS, PLASMA PROTEIN BINDING AND UTERINE TISSUE DOSE METRICS OF BPA: A PHYSIOLOGICALLY BASED PHARMACOKINETIC APPROACH

    EPA Science Inventory

    Bisphenol A (BPA) is a weakly estrogenic monomer used in the production of polycarbonate plastics and epoxy resins, both of which are used in food contact applications. A physiologically based pharmacokinetic (PBPK) model of BPA pharmacokinetics in rats and humans was developed t...

  16. Active Duty C-17 Aircraft Commander Fuel Efficiency Metrics and Goal Evaluation

    DTIC Science & Technology

    2015-03-26

    document/AFD-140304-043.pdf. AMC/A3F. “AMC Fuel Metrics,” Air Mobility Command, 2014. Bandura, A. “Social Cognitive Theory of Self-Regulation” ... qualitative criteria analysis, picked the most effective metric, and utilized Goal Setting Theory (GST) to couple the metric with an attainable goal ... Goal-Setting Theory

  17. "Your Model Is Predictive-- but Is It Useful?" Theoretical and Empirical Considerations of a New Paradigm for Adaptive Tutoring Evaluation

    ERIC Educational Resources Information Center

    González-Brenes, José P.; Huang, Yun

    2015-01-01

    Classification evaluation metrics are often used to evaluate adaptive tutoring systems, programs that teach and adapt to humans. Unfortunately, it is not clear how intuitive these metrics are for practitioners with little machine learning background. Moreover, our experiments suggest that existing conventions for evaluating tutoring systems may…

  18. Orientation estimation of anatomical structures in medical images for object recognition

    NASA Astrophysics Data System (ADS)

    Bağci, Ulaş; Udupa, Jayaram K.; Chen, Xinjian

    2011-03-01

    Recognition of anatomical structures is an important step in model based medical image segmentation. It provides pose estimation of objects and information about roughly “where” the objects are in the image, distinguishing them from other object-like entities. In [1], we presented a general method of model-based multi-object recognition to assist in segmentation (delineation) tasks. It exploits the pose relationship that can be encoded, via the concept of ball scale (b-scale), between the binary training objects and their associated grey images. The goal was to place the model, in a single shot, close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. Unlike position and scale parameters, we observe that orientation parameters require more attention when estimating the pose of the model, as even small differences in orientation parameters can lead to inappropriate recognition. Motivated by the non-Euclidean nature of the pose information, we propose in this paper the use of non-Euclidean metrics to estimate orientation of the anatomical structures for more accurate recognition and segmentation. We statistically analyze and evaluate the following metrics for orientation estimation: Euclidean, Log-Euclidean, Root-Euclidean, Procrustes Size-and-Shape, and mean Hermitian metrics. The results show that the mean Hermitian and Cholesky decomposition metrics provide more accurate orientation estimates than the other Euclidean and non-Euclidean metrics.
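
    As one example of the non-Euclidean metrics considered, the Log-Euclidean distance between symmetric positive-definite matrices has a simple eigendecomposition form. This sketch is generic and not the paper's exact pipeline:

```python
import numpy as np

def spd_log(M):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.log(w)) @ V.T

def log_euclidean_distance(A, B):
    """Log-Euclidean distance: Frobenius norm of the difference of matrix logs."""
    return np.linalg.norm(spd_log(A) - spd_log(B))
```

    Unlike the plain Euclidean (Frobenius) distance, this metric respects the curved geometry of the positive-definite cone, which is why small orientation differences are handled more gracefully.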

  19. The SPAtial EFficiency metric (SPAEF): multiple-component evaluation of spatial patterns for optimization of hydrological models

    NASA Astrophysics Data System (ADS)

    Koch, Julian; Cüneyd Demirel, Mehmet; Stisen, Simon

    2018-05-01

    The process of model evaluation is not only an integral part of model development and calibration but also of paramount importance when communicating modelling results to the scientific community and stakeholders. The modelling community has a large and well-tested toolbox of metrics to evaluate temporal model performance. In contrast, spatial performance evaluation has not kept pace with the wide availability of spatial observations or with the sophisticated model codes simulating the spatial variability of complex hydrological processes. This study makes a contribution towards advancing spatial-pattern-oriented model calibration by rigorously testing a multiple-component performance metric. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation and histogram overlap. This multiple-component approach is found to be advantageous for the complex task of comparing spatial patterns. SPAEF, its three components individually and two alternative spatial performance metrics, i.e. connectivity analysis and fractions skill score, are applied in a spatial-pattern-oriented model calibration of a catchment model in Denmark. Results suggest the importance of multiple-component metrics because stand-alone metrics tend to fail to provide holistic pattern information. The three SPAEF components are found to be independent, which allows them to complement each other in a meaningful way. In order to optimally exploit spatial observations made available by remote sensing platforms, this study suggests applying bias-insensitive metrics, which further allow comparison of variables that are related but differ in units. This study applies SPAEF in the hydrological context using the mesoscale Hydrologic Model (mHM; version 5.8), but we see great potential across disciplines related to spatially distributed earth system modelling.
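
    The three components combine into a single score by Euclidean distance from the ideal point (1, 1, 1). A hedged sketch of the published formulation follows; the bin count and z-scoring details here are assumptions:

```python
import numpy as np

def spaef(obs, sim, bins=100):
    """SPAtial EFficiency: 1 - sqrt((alpha-1)^2 + (beta-1)^2 + (gamma-1)^2)."""
    obs, sim = np.ravel(obs).astype(float), np.ravel(sim).astype(float)
    alpha = np.corrcoef(obs, sim)[0, 1]                          # correlation
    beta = (sim.std() / sim.mean()) / (obs.std() / obs.mean())   # ratio of CVs
    zo = (obs - obs.mean()) / obs.std()                          # z-score both fields
    zs = (sim - sim.mean()) / sim.std()
    lo, hi = min(zo.min(), zs.min()), max(zo.max(), zs.max())
    ho, _ = np.histogram(zo, bins=bins, range=(lo, hi))
    hs, _ = np.histogram(zs, bins=bins, range=(lo, hi))
    gamma = np.minimum(ho, hs).sum() / ho.sum()                  # histogram overlap
    return 1.0 - np.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```

    Because the histogram overlap is computed on z-scored fields, the score is insensitive to bias, which is what permits comparing related variables that differ in units.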

  20. Quantitative evaluation of muscle synergy models: a single-trial task decoding approach

    PubMed Central

    Delis, Ioannis; Berret, Bastien; Pozzo, Thierry; Panzeri, Stefano

    2013-01-01

    Muscle synergies, i.e., invariant coordinated activations of groups of muscles, have been proposed as building blocks that the central nervous system (CNS) uses to construct the patterns of muscle activity utilized for executing movements. Several efficient dimensionality reduction algorithms that extract putative synergies from electromyographic (EMG) signals have been developed. Typically, the quality of synergy decompositions is assessed by computing the Variance Accounted For (VAF). Yet, little is known about the extent to which the combination of those synergies encodes task-discriminating variations of muscle activity in individual trials. To address this question, here we conceive and develop a novel computational framework to evaluate muscle synergy decompositions in task space. Unlike previous methods that consider the total variance of muscle patterns (VAF-based metrics), our approach focuses on the variance that discriminates execution of different tasks. The procedure is based on single-trial task decoding from muscle synergy activation features. The task decoding based metric quantitatively evaluates the mapping between synergy recruitment and task identification and automatically determines the minimal number of synergies that captures all the task-discriminating variability in the synergy activations. In this paper, we first validate the method on plausibly simulated EMG datasets. We then show that it can be applied to different types of muscle synergy decomposition and illustrate its applicability to real data by using it for the analysis of EMG recordings during an arm pointing task. We find that time-varying and synchronous synergies with a similar number of parameters are equally efficient in task decoding, suggesting that in this experimental paradigm they are equally valid representations of muscle synergies. Overall, these findings stress the effectiveness of the decoding metric in systematically assessing muscle synergy decompositions in task space.
PMID:23471195
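
    The VAF baseline mentioned above has a one-line (uncentered) form; a minimal sketch, while the task-decoding metric itself is more involved:

```python
import numpy as np

def vaf(emg, reconstruction):
    """Variance Accounted For: 1 - SSE / total sum of squares of the EMG data."""
    emg = np.asarray(emg, float)
    rec = np.asarray(reconstruction, float)
    return 1.0 - np.sum((emg - rec) ** 2) / np.sum(emg ** 2)
```

    A perfect synergy reconstruction gives VAF = 1; a reconstruction of all zeros gives VAF = 0 under this uncentered convention.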

  1. Graphical CONOPS Prototype to Demonstrate Emerging Methods, Processes, and Tools at ARDEC

    DTIC Science & Technology

    2013-07-17

    Concept Engineering Framework (ICEF), an extensive literature review was conducted to discover metrics that exist for evaluating concept engineering ... language to ICEF to SysML ... Table 5, Artifact metrics ... Table 6, Collaboration metrics

  2. Metrics for quantifying antimicrobial use in beef feedlots

    PubMed Central

    Benedict, Katharine M.; Gow, Sheryl P.; Reid-Smith, Richard J.; Booker, Calvin W.; Morley, Paul S.

    2012-01-01

    Accurate antimicrobial drug use data are needed to inform discussions regarding the impact of antimicrobial drug use in agriculture. The primary objective of this study was to investigate the perceived accuracy and clarity of different methods for reporting antimicrobial drug use information collected from beef feedlots. Producers, veterinarians, industry representatives, public health officials, and other knowledgeable beef industry leaders were invited to complete a web-based survey. A total of 156 participants in 33 US states, 4 Canadian provinces, and 8 other countries completed the survey. No single metric was considered universally optimal for all use circumstances or for all audiences. To communicate antimicrobial drug use data effectively, consideration of the target audience is critical when presenting the information. The metrics that are most accurate need to be carefully and repeatedly explained to the audience. PMID:23372190

  3. Machine learning classifier using abnormal brain network topological metrics in major depressive disorder.

    PubMed

    Guo, Hao; Cao, Xiaohua; Liu, Zhifen; Li, Haifang; Chen, Junjie; Zhang, Kerang

    2012-12-05

    Resting state functional brain networks have been widely studied in brain disease research. However, it is currently unclear whether abnormal resting state functional brain network metrics can be used with machine learning for the classification of brain diseases. Resting state functional brain networks were constructed for 28 healthy controls and 38 major depressive disorder patients by thresholding partial correlation matrices of 90 regions. Three nodal metrics were calculated using graph theory-based approaches. Nonparametric permutation tests were then used for group comparisons of topological metrics, which were in turn used as classification features in six different algorithms. We used statistical significance as the threshold for selecting features and measured the accuracies of the six classifiers with different numbers of features. A sensitivity analysis method was used to evaluate the importance of different features. The results indicated that some of the regions exhibited significantly abnormal nodal centralities, including the limbic system, basal ganglia, medial temporal, and prefrontal regions. The support vector machine with radial basis kernel function and the neural network algorithm exhibited the highest average accuracies (79.27% and 78.22%, respectively) with 28 features (P<0.05). Correlation analysis between feature importance and the statistical significance of metrics revealed a strong positive correlation between them. Overall, the current study demonstrated that major depressive disorder is associated with abnormal functional brain network topological metrics, and that statistically significant nodal metrics can be successfully used for feature selection in classification algorithms.
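
    As an illustration of a nodal metric of the kind used here, degree centrality can be read directly off a thresholded binary connectivity matrix (the study also used other centrality measures):

```python
import numpy as np

def degree_centrality(adj):
    """Per-node degree centrality from a binary, symmetric adjacency matrix."""
    adj = np.asarray(adj)
    n = adj.shape[0]
    return adj.sum(axis=1) / (n - 1)   # fraction of possible neighbours
```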

  4. Mass, surface area and number metrics in diesel occupational exposure assessment.

    PubMed

    Ramachandran, Gurumurthy; Paulsen, Dwane; Watts, Winthrop; Kittelson, David

    2005-07-01

    While diesel aerosol exposure assessment has traditionally been based on the mass concentration metric, recent studies have suggested that particle number and surface area concentrations may be more health-relevant. In this study, we evaluated the exposures of three occupational groups (bus drivers, parking garage attendants, and bus mechanics) using the mass concentration of elemental carbon (EC) as well as surface area and number concentrations. These occupational groups are exposed to mixtures of diesel and gasoline exhaust on a regular basis in various ratios. The three groups had significantly different exposures to workshift TWA EC, with the highest levels observed in the bus garage mechanics and the lowest levels in the parking ramp booth attendants. In terms of surface area, parking ramp attendants had significantly greater exposures than bus garage mechanics, who in turn had significantly greater exposures than bus drivers. In terms of number concentrations, the exposures of garage mechanics exceeded those of ramp booth attendants by a factor of 5-6. Depending on the exposure metric chosen, the three occupational groups had quite different exposure rankings. This illustrates the importance of the choice of exposure metric in epidemiological studies. If these three occupational groups were part of an epidemiological study, then depending on the metric used, they may or may not be part of the same similarly exposed group (SEG). The exposure rankings (e.g., low, medium, or high) of the three groups also change with the metric used. If the incorrect metric is used, significant misclassification errors may occur.
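
    The divergence between metrics follows from how number, surface area, and mass weight particle diameter; a toy sketch for spherical particles of unit density:

```python
import math

def aerosol_metrics(counts_by_diameter):
    """Number, surface, and mass proxies from a particle size distribution.

    counts_by_diameter maps particle diameter to particle count.
    """
    number = surface = mass = 0.0
    for d, n in counts_by_diameter.items():
        number += n
        surface += n * math.pi * d ** 2        # surface area scales as d^2
        mass += n * math.pi / 6.0 * d ** 3     # volume (mass) scales as d^3
    return number, surface, mass
```

    A population dominated by ultrafine particles ranks high on the number metric while contributing little mass, so group rankings can invert when the metric changes.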

  5. A case-based reasoning system based on weighted heterogeneous value distance metric for breast cancer diagnosis.

    PubMed

    Gu, Dongxiao; Liang, Changyong; Zhao, Huimin

    2017-03-01

    We present the implementation and application of a case-based reasoning (CBR) system for breast cancer related diagnoses. By retrieving similar cases in a breast cancer decision support system, oncologists can obtain powerful information or knowledge, complementing their own experiential knowledge, in their medical decision making. We observed two problems in applying standard CBR to this context: the abundance of different types of attributes and the difficulty in eliciting appropriate attribute weights from human experts. We therefore used a distance measure named weighted heterogeneous value distance metric, which can better deal with both continuous and discrete attributes simultaneously than the standard Euclidean distance, and a genetic algorithm for learning the attribute weights involved in this distance measure automatically. We evaluated our CBR system in two case studies, related to benign/malignant tumor prediction and secondary cancer prediction, respectively. Weighted heterogeneous value distance metric with genetic algorithm for weight learning outperformed several alternative attribute matching methods and several classification methods by at least 3.4%, reaching 0.938, 0.883, 0.933, and 0.984 in the first case study, and 0.927, 0.842, 0.939, and 0.989 in the second case study, in terms of accuracy, sensitivity×specificity, F measure, and area under the receiver operating characteristic curve, respectively. The evaluation result indicates the potential of CBR in the breast cancer diagnosis domain. Copyright © 2017 Elsevier B.V. All rights reserved.
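
    A minimal sketch of the weighted heterogeneous value distance may clarify the idea; here the attribute weights are simply passed in, whereas the paper learns them with a genetic algorithm:

```python
import numpy as np

def whvdm(a, b, weights, cont_mask, stds, vdm):
    """Weighted heterogeneous value distance between two cases a and b.

    cont_mask[j] -- True if attribute j is continuous
    stds[j]      -- standard deviation of continuous attribute j
    vdm[j]       -- for discrete j: dict mapping value -> vector of P(class | value)
    """
    d2 = 0.0
    for j, w in enumerate(weights):
        if cont_mask[j]:                            # continuous: sigma-normalised difference
            dj = abs(a[j] - b[j]) / (4.0 * stds[j])
        else:                                       # discrete: value difference metric
            dj = np.abs(vdm[j][a[j]] - vdm[j][b[j]]).sum()
        d2 += w * dj ** 2
    return np.sqrt(d2)
```

    The discrete branch treats two values as close when they have similar class-conditional probability profiles, which is how the metric handles mixed attribute types better than a plain Euclidean distance.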

  6. Toward better public health reporting using existing off the shelf approaches: The value of medical dictionaries in automated cancer detection using plaintext medical data.

    PubMed

    Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J

    2017-05-01

    Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary based feature sourcing approaches and "off the shelf" tools could predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary based models to models built using features derived from medical dictionaries. We evaluated the detection of cancer cases from free text pathology reports using decision models built with combinations of dictionary or non-dictionary based feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70 and 90%. The source of features and feature subset size had no impact on the performance of a decision model. Our study suggests there is little value in leveraging medical dictionaries for extracting features for decision model building. Decision models built using features extracted from the plaintext reports themselves achieve comparable results to those built using medical dictionaries. Overall, this suggests that existing "off the shelf" approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection and modeling approaches. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Image navigation and registration performance assessment tool set for the GOES-R Advanced Baseline Imager and Geostationary Lightning Mapper

    NASA Astrophysics Data System (ADS)

    De Luccia, Frank J.; Houchin, Scott; Porter, Brian C.; Graybill, Justin; Haas, Evan; Johnson, Patrick D.; Isaacson, Peter J.; Reth, Alan D.

    2016-05-01

    The GOES-R Flight Project has developed an Image Navigation and Registration (INR) Performance Assessment Tool Set (IPATS) for measuring Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) INR performance metrics in the post-launch period for performance evaluation and long term monitoring. For ABI, these metrics are the 3-sigma errors in navigation (NAV), channel-to-channel registration (CCR), frame-to-frame registration (FFR), swath-to-swath registration (SSR), and within frame registration (WIFR) for the Level 1B image products. For GLM, the single metric of interest is the 3-sigma error in the navigation of background images (GLM NAV) used by the system to navigate lightning strikes. 3-sigma errors are estimates of the 99.73rd percentile of the errors accumulated over a 24 hour data collection period. IPATS utilizes a modular algorithmic design to allow user selection of data processing sequences optimized for generation of each INR metric. This novel modular approach minimizes duplication of common processing elements, thereby maximizing code efficiency and speed. Fast processing is essential given the large number of sub-image registrations required to generate INR metrics for the many images produced over a 24 hour evaluation period. Another aspect of the IPATS design that vastly reduces execution time is the off-line propagation of Landsat based truth images to the fixed grid coordinates system for each of the three GOES-R satellite locations, operational East and West and initial checkout locations. This paper describes the algorithmic design and implementation of IPATS and provides preliminary test results.
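
    The 3-sigma figures are thus empirical percentiles rather than parameters of a fitted Gaussian; a minimal sketch:

```python
import numpy as np

def three_sigma_error(errors):
    """99.73rd percentile of absolute error over a data collection period."""
    return np.percentile(np.abs(np.asarray(errors, float)), 99.73)
```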

  8. Image Navigation and Registration (INR) Performance Assessment Tool Set (IPATS) for the GOES-R Advanced Baseline Imager and Geostationary Lightning Mapper

    NASA Technical Reports Server (NTRS)

    DeLuccia, Frank J.; Houchin, Scott; Porter, Brian C.; Graybill, Justin; Haas, Evan; Johnson, Patrick D.; Isaacson, Peter J.; Reth, Alan D.

    2016-01-01

    The GOES-R Flight Project has developed an Image Navigation and Registration (INR) Performance Assessment Tool Set (IPATS) for measuring Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) INR performance metrics in the post-launch period for performance evaluation and long term monitoring. For ABI, these metrics are the 3-sigma errors in navigation (NAV), channel-to-channel registration (CCR), frame-to-frame registration (FFR), swath-to-swath registration (SSR), and within frame registration (WIFR) for the Level 1B image products. For GLM, the single metric of interest is the 3-sigma error in the navigation of background images (GLM NAV) used by the system to navigate lightning strikes. 3-sigma errors are estimates of the 99.73rd percentile of the errors accumulated over a 24 hour data collection period. IPATS utilizes a modular algorithmic design to allow user selection of data processing sequences optimized for generation of each INR metric. This novel modular approach minimizes duplication of common processing elements, thereby maximizing code efficiency and speed. Fast processing is essential given the large number of sub-image registrations required to generate INR metrics for the many images produced over a 24 hour evaluation period. Another aspect of the IPATS design that vastly reduces execution time is the off-line propagation of Landsat based truth images to the fixed grid coordinates system for each of the three GOES-R satellite locations, operational East and West and initial checkout locations. This paper describes the algorithmic design and implementation of IPATS and provides preliminary test results.

  10. Evaluation of motion artifact metrics for coronary CT angiography.

    PubMed

    Ma, Hongfeng; Gros, Eric; Szabo, Aniko; Baginski, Scott G; Laste, Zachary R; Kulkarni, Naveen M; Okerlund, Darin; Schmidt, Taly G

    2018-02-01

    This study quantified the performance of coronary artery motion artifact metrics relative to human observer ratings. Motion artifact metrics have been used as part of motion correction and best-phase selection algorithms for Coronary Computed Tomography Angiography (CCTA). However, the lack of ground truth makes it difficult to validate how well the metrics quantify the level of motion artifact. This study investigated five motion artifact metrics, including two novel metrics, using a dynamic phantom, clinical CCTA images, and an observer study that provided ground-truth motion artifact scores from a series of pairwise comparisons. Five motion artifact metrics were calculated for the coronary artery regions on both phantom and clinical CCTA images: positivity, entropy, normalized circularity, Fold Overlap Ratio (FOR), and Low-Intensity Region Score (LIRS). CT images were acquired of a dynamic cardiac phantom that simulated cardiac motion and contained six iodine-filled vessels of varying diameter and with regions of soft plaque and calcifications. Scans were repeated with different gantry start angles. Images were reconstructed at five phases of the motion cycle. Clinical images were acquired from 14 CCTA exams with patient heart rates ranging from 52 to 82 bpm. The vessel and shading artifacts were manually segmented by three readers and combined to create ground-truth artifact regions. Motion artifact levels were also assessed by readers using a pairwise comparison method to establish a ground-truth reader score. The Kendall's Tau coefficients were calculated to evaluate the statistical agreement in ranking between the motion artifacts metrics and reader scores. Linear regression between the reader scores and the metrics was also performed. 
On phantom images, the Kendall's Tau coefficients of the five motion artifact metrics were 0.50 (normalized circularity), 0.35 (entropy), 0.82 (positivity), 0.77 (FOR), and 0.77 (LIRS), where higher Kendall's Tau signifies higher agreement. The FOR, LIRS, and transformed positivity (the fourth root of the positivity) were further evaluated in the study of clinical images. The Kendall's Tau coefficients of the selected metrics were 0.59 (FOR), 0.53 (LIRS), and 0.21 (transformed positivity). In the study of clinical data, a Motion Artifact Score, defined as the product of the FOR and LIRS metrics, further improved agreement with reader scores, with a Kendall's Tau coefficient of 0.65. The metrics of FOR, LIRS, and the product of the two metrics provided the highest agreement in motion artifact ranking when compared to the readers, and the highest linear correlation to the reader scores. The validated motion artifact metrics may be useful for developing and evaluating methods to reduce motion in Coronary Computed Tomography Angiography (CCTA) images. © 2017 American Association of Physicists in Medicine.
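
Kendall's Tau can be computed directly from pairwise comparisons; the sketch below implements the simple tau-a variant and ignores ties, which the study's analysis may have handled differently:

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs, ignoring ties."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```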

  11. Evaluation of Vehicle-Based Crash Severity Metrics.

    PubMed

    Tsoi, Ada H; Gabler, Hampton C

    2015-01-01

    Vehicle change in velocity (delta-v) is a widely used crash severity metric used to estimate occupant injury risk. Despite its widespread use, delta-v has several limitations. Of most concern, delta-v is a vehicle-based metric which does not consider the crash pulse or the performance of occupant restraints, e.g. seatbelts and airbags. Such criticisms have prompted the search for alternative impact severity metrics based upon vehicle kinematics. The purpose of this study was to assess the ability of the occupant impact velocity (OIV), acceleration severity index (ASI), vehicle pulse index (VPI), and maximum delta-v to predict serious injury in real world crashes. The study was based on the analysis of event data recorders (EDRs) downloaded from the National Automotive Sampling System / Crashworthiness Data System (NASS-CDS) 2000-2013 cases. All vehicles in the sample were GM passenger cars and light trucks involved in a frontal collision. Rollover crashes were excluded. Vehicles were restricted to single-event crashes that caused an airbag deployment. All EDR data were checked for a successful, completed recording of the event and that the crash pulse was complete. The maximum abbreviated injury scale (MAIS) was used to describe occupant injury outcome. Drivers were categorized into either a non-seriously injured group (MAIS2-) or a seriously injured group (MAIS3+), based on the severity of any injuries to the thorax, abdomen, and spine. ASI and OIV were calculated according to the Manual for Assessing Safety Hardware. VPI was calculated according to ISO/TR 12353-3, with vehicle-specific parameters determined from U.S. New Car Assessment Program crash tests. Using binary logistic regression, the cumulative probability of injury risk was determined for each metric and assessed for statistical significance, goodness-of-fit, and prediction accuracy. The dataset included 102,744 vehicles.
A Wald chi-square test showed each vehicle-based crash severity metric estimate to be a significant predictor in the model (p < 0.05). For the belted drivers, both OIV and VPI were significantly better predictors of serious injury than delta-v (p < 0.05). For the unbelted drivers, there was no statistically significant difference between delta-v, OIV, VPI, and ASI. The broad findings of this study suggest it is feasible to improve injury prediction if we consider adding restraint performance to classic measures, e.g. delta-v. Applications, such as advanced automatic crash notification, should consider the use of different metrics for belted versus unbelted occupants.
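
The cumulative injury probability from a binary logistic model has a closed form; in this hedged sketch the coefficients beta0 and beta1 are hypothetical placeholders for the values fitted separately for each severity metric:

```python
import math

def injury_probability(metric_value, beta0, beta1):
    """P(MAIS3+ injury) from a binary logistic model on one severity metric."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * metric_value)))
```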

  12. Predicting the difficulty of pure, strict, epistatic models: metrics for simulated model selection.

    PubMed

    Urbanowicz, Ryan J; Kiralis, Jeff; Fisher, Jonathan M; Moore, Jason H

    2012-09-26

    Algorithms designed to detect complex genetic disease associations are initially evaluated using simulated datasets. Typical evaluations vary constraints that influence the correct detection of underlying models (i.e. number of loci, heritability, and minor allele frequency). Such studies neglect to account for model architecture (i.e. the unique specification and arrangement of penetrance values comprising the genetic model), which alone can influence the detectability of a model. In order to design a simulation study which efficiently takes architecture into account, a reliable metric is needed for model selection. We evaluate three metrics as predictors of relative model detection difficulty derived from previous works: (1) Penetrance table variance (PTV), (2) customized odds ratio (COR), and (3) our own Ease of Detection Measure (EDM), calculated from the penetrance values and respective genotype frequencies of each simulated genetic model. We evaluate the reliability of these metrics across three very different data search algorithms, each with the capacity to detect epistatic interactions. We find that a model's EDM and COR are each stronger predictors of model detection success than heritability. This study formally identifies and evaluates metrics which quantify model detection difficulty. We utilize these metrics to intelligently select models from a population of potential architectures. This allows for an improved simulation study design which accounts for differences in detection difficulty attributed to model architecture. We implement the calculation and utilization of EDM and COR into GAMETES, an algorithm which rapidly and precisely generates pure, strict, n-locus epistatic models.
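
    A frequency-weighted variance of a penetrance table can be sketched as below; the paper's exact PTV and EDM formulas may differ in detail:

```python
import numpy as np

def penetrance_table_variance(penetrances, genotype_freqs):
    """Genotype-frequency-weighted variance of a penetrance table (PTV-style)."""
    p = np.asarray(penetrances, float).ravel()
    f = np.asarray(genotype_freqs, float).ravel()
    f = f / f.sum()                      # normalise frequencies
    mean = np.sum(f * p)                 # population-level penetrance
    return float(np.sum(f * (p - mean) ** 2))
```

    A flat table (identical penetrance for every genotype) carries no detectable signal and scores zero, which matches the intuition that higher spread should be easier to detect.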

  13. Texture metric that predicts target detection performance

    NASA Astrophysics Data System (ADS)

    Culpepper, Joanne B.

    2015-12-01

    Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict probability of detection and mean search time. The two texture metrics are local clutter metrics based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance for the nine military vehicles in complex natural scenes found in the Search_2 dataset is presented. Comparison is also made with four other common clutter metrics found in the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.
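
GLCE itself is the authors' custom statistic, but its building block — the gray-level co-occurrence matrix and its energy (angular second moment) — is standard. A self-contained numpy sketch, with the quantization level and pixel offset chosen arbitrarily for illustration:

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Normalized gray-level co-occurrence matrix for a single pixel offset."""
    img = np.asarray(image)
    # quantize to `levels` gray levels
    q = np.minimum((img.astype(float) / (img.max() + 1e-12) * levels).astype(int),
                   levels - 1)
    P = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[q[y, x], q[y + dy, x + dx]] += 1
    return P / P.sum()

def glcm_energy(image, **kw):
    """Energy (angular second moment): sum of squared co-occurrence probabilities."""
    P = glcm(image, **kw)
    return float(np.sum(P ** 2))

rng = np.random.default_rng(0)
flat = np.full((32, 32), 100)            # uniform patch: energy = 1.0
noisy = rng.integers(0, 256, (32, 32))   # cluttered patch: energy near 1/levels^2
```

Uniform (low-clutter) regions concentrate the co-occurrence mass in one cell and score high energy; cluttered regions spread it out and score low, which is why energy-style statistics track clutter.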

  14. Foresters' Metric Conversions program (version 1.0). [Computer program]

    Treesearch

    Jefferson A. Palmer

    1999-01-01

    The conversion of scientific measurements has become commonplace in the fields of engineering, research, and forestry. Foresters' Metric Conversions is a Windows-based computer program that quickly converts user-defined measurements from English to metric and from metric to English. Foresters' Metric Conversions was derived from the publication "Metric...

  15. Objective Situation Awareness Measurement Based on Performance Self-Evaluation

    NASA Technical Reports Server (NTRS)

    DeMaio, Joe

    1998-01-01

    The research was conducted in support of the NASA Safe All-Weather Flight Operations for Rotorcraft (SAFOR) program. The purpose of the work was to investigate the utility of two measurement tools developed by the British Defense Evaluation Research Agency. These tools were a subjective workload assessment scale, the DRA Workload Scale, and a situation awareness measurement tool. The situation awareness tool compares the crew's self-evaluation of performance against actual performance in order to determine what information the crew attended to during the task. These two measurement tools were evaluated in the context of a test of an innovative approach to alerting the crew by way of a helmet-mounted display. The situation assessment data are reported here. The performance self-evaluation metric of situation awareness was found to be highly effective. It was used to evaluate situation awareness on a tank reconnaissance task, a tactical navigation task, and a stylized task used to evaluate handling qualities. Using the self-evaluation metric, it was possible to evaluate situation awareness without exact knowledge of the relevant information in some cases, and to identify information to which the crew attended or failed to attend in others.

  16. Repeatability of quantitative 18F-FLT uptake measurements in solid tumors: an individual patient data multi-center meta-analysis.

    PubMed

    Kramer, G M; Liu, Y; de Langen, A J; Jansma, E P; Trigonis, I; Asselin, M-C; Jackson, A; Kenny, L; Aboagye, E O; Hoekstra, O S; Boellaard, R

    2018-06-01

    3'-Deoxy-3'-[18F]fluorothymidine (18F-FLT) positron emission tomography (PET) provides a non-invasive method to assess cellular proliferation and response to antitumor therapy. Quantitative 18F-FLT uptake metrics are being used for evaluation of proliferative response in investigational settings; however, multi-center repeatability needs to be established. The aim of this study was to determine the repeatability of 18F-FLT tumor uptake metrics by re-analyzing individual patient data from previously published reports using the same tumor segmentation method and repeatability metrics across cohorts. A systematic search in PubMed, EMBASE.com, and the Cochrane Library from inception to October 2016 yielded five 18F-FLT repeatability cohorts in solid tumors. 18F-FLT-avid lesions were delineated using a 50% isocontour adapted for local background on test and retest scans. SUVmax, SUVmean, SUVpeak, proliferative volume, and total lesion uptake (TLU) were calculated. Repeatability was assessed using the repeatability coefficient (RC = 1.96 × SD of test-retest differences), linear regression analysis, and the intra-class correlation coefficient (ICC). The impact of different lesion selection criteria was also evaluated. Images from four cohorts containing 30 patients with 52 lesions were obtained and analyzed (ten in breast cancer, nine in head and neck squamous cell carcinoma, and 33 in non-small cell lung cancer patients). A good correlation was found between test-retest data for all 18F-FLT uptake metrics (R² ≥ 0.93; ICC ≥ 0.96). The best repeatability was found for SUVpeak (RC: 23.1%), without significant differences in RC between the SUV metrics. Repeatability of proliferative volume (RC: 36.0%) and TLU (RC: 36.4%) was worse than that of the SUV metrics. Lesion selection based on SUVmax ≥ 4.0 improved the repeatability of the volumetric metrics (RC: 26-28%) but did not affect the repeatability of the SUV metrics. In multi-center studies, differences ≥ 25% in 18F-FLT SUV metrics likely represent a true change in tumor uptake. Larger differences are required for 18F-FLT metrics comprising volume estimates when no lesion selection criteria are applied.
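
The repeatability coefficient used above is stated explicitly: RC = 1.96 × SD of the test-retest differences, here expressed as percentage differences relative to the pairwise mean, as is common in test-retest PET studies. A short sketch with hypothetical SUVpeak values:

```python
import numpy as np

def repeatability_coefficient(test, retest):
    """RC = 1.96 x SD of paired test-retest percentage differences.

    Differences are expressed as a percentage of the pairwise mean.
    """
    t = np.asarray(test, dtype=float)
    r = np.asarray(retest, dtype=float)
    pct_diff = 100.0 * (r - t) / ((t + r) / 2.0)
    return 1.96 * np.std(pct_diff, ddof=1)

# Hypothetical SUVpeak values for 6 lesions, each scanned twice
suv_test   = [4.1, 6.3, 2.8, 5.0, 3.4, 7.2]
suv_retest = [4.4, 5.9, 3.0, 4.6, 3.6, 7.8]
rc = repeatability_coefficient(suv_test, suv_retest)
```

The interpretation mirrors the abstract's conclusion: an observed change larger than the RC (in percent) is unlikely to be explained by test-retest noise alone.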

  17. Does Implementation Follow Design? A Case Study of a Workplace Health Promotion Program Using the 4-S Program Design and the PIPE Impact Metric Evaluation Models.

    PubMed

    Äikäs, Antti Hermanni; Pronk, Nicolaas P; Hirvensalo, Mirja Hannele; Absetz, Pilvikki

    2017-08-01

    The aim of this study was to describe the content of a multiyear market-based workplace health promotion (WHP) program and to evaluate its design and implementation processes in a real-world setting. Data were collected from the databases of the employer and the service provider and classified using the 4-S (Size, Scope, Scalability, and Sustainability) and PIPE Impact Metric (Penetration, Implementation) models. Data analysis utilized both qualitative and quantitative methods. The program design covered the evidence-informed best practices well, except for a clear path toward sustainability, cooperation with occupational health care, and support from middle-management supervisors. The penetration rate among participants was high (99%) and the majority (81%) of services were implemented as designed. Study findings indicate that the WHP market would benefit from the use of evidence-based design principles and from deliberate decisions that anticipate a long-term implementation process already during the planning phase.

  18. Does Implementation Follow Design? A Case Study of a Workplace Health Promotion Program Using the 4-S Program Design and the PIPE Impact Metric Evaluation Models

    PubMed Central

    Äikäs, Antti Hermanni; Pronk, Nicolaas P.; Hirvensalo, Mirja Hannele; Absetz, Pilvikki

    2017-01-01

    Objective: The aim of this study was to describe the content of a multiyear market-based workplace health promotion (WHP) program and to evaluate its design and implementation processes in a real-world setting. Methods: Data were collected from the databases of the employer and the service provider and classified using the 4-S (Size, Scope, Scalability, and Sustainability) and PIPE Impact Metric (Penetration, Implementation) models. Data analysis utilized both qualitative and quantitative methods. Results: The program design covered the evidence-informed best practices well, except for a clear path toward sustainability, cooperation with occupational health care, and support from middle-management supervisors. The penetration rate among participants was high (99%) and the majority (81%) of services were implemented as designed. Conclusion: Study findings indicate that the WHP market would benefit from the use of evidence-based design principles and from deliberate decisions that anticipate a long-term implementation process already during the planning phase. PMID:28665839

  19. Using Vision and Speech Features for Automated Prediction of Performance Metrics in Multimodal Dialogs. Research Report. ETS RR-17-20

    ERIC Educational Resources Information Center

    Ramanarayanan, Vikram; Lange, Patrick; Evanini, Keelan; Molloy, Hillary; Tsuprun, Eugene; Qian, Yao; Suendermann-Oeft, David

    2017-01-01

    Predicting and analyzing multimodal dialog user experience (UX) metrics, such as overall call experience, caller engagement, and latency, among other metrics, in an ongoing manner is important for evaluating such systems. We investigate automated prediction of multiple such metrics collected from crowdsourced interactions with an open-source,…

  20. Technical Interchange Meeting Guidelines Breakout

    NASA Technical Reports Server (NTRS)

    Fong, Rob

    2002-01-01

    Along with concept developers, the Systems Evaluation and Assessment (SEA) sub-element of VAMS will develop those scenarios and metrics required for testing the new concepts that reside within the System-Level Integrated Concepts (SLIC) sub-element in the VAMS project. These concepts will come from the NRA process, space act agreements, a university group, and other NASA researchers. The emphasis of those concepts is to increase capacity while at least maintaining the current safety level. The concept providers will initially develop their own scenarios and metrics for self-evaluation. In about a year, the SEA sub-element will become responsible for conducting initial evaluations of the concepts using a common scenario and metric set. This set may derive many components from the scenarios and metrics used by the concept providers. Ultimately, the common scenario/metric set will be used to help determine the most feasible and beneficial concepts. A set of 15 questions and issues, discussed below, pertaining to the scenario and metric set, and its use for assessing concepts, was submitted by the SEA sub-element for consideration during the breakout session. The questions were divided among the three breakout groups. Each breakout group deliberated on its set of questions and provided a report on its discussion.

  1. Toward Developing a New Occupational Exposure Metric Approach for Characterization of Diesel Aerosols

    PubMed Central

    Cauda, Emanuele G.; Ku, Bon Ki; Miller, Arthur L.; Barone, Teresa L.

    2015-01-01

    The extensive use of diesel-powered equipment in mines makes the exposure to diesel aerosols a serious occupational issue. The exposure metric currently used in U.S. underground noncoal mines is based on the measurement of total carbon (TC) and elemental carbon (EC) mass concentration in the air. Recent toxicological evidence suggests that the measurement of mass concentration is not sufficient to correlate ultrafine aerosol exposure with health effects. This urges the evaluation of alternative measurements. In this study, the current exposure metric and two additional metrics, the surface area and the total number concentration, were evaluated by conducting simultaneous measurements of diesel ultrafine aerosols in a laboratory setting. The results showed that the surface area and total number concentration of the particles per unit of mass varied substantially with the engine operating condition. The specific surface area (SSA) and specific number concentration (SNC) normalized with TC varied two and five times, respectively. This implies that miners, whose exposure is measured only as TC, might be exposed to an unknown variable number concentration of diesel particles and commensurate particle surface area. Taken separately, mass, surface area, and number concentration did not completely characterize the aerosols. A comprehensive assessment of diesel aerosol exposure should include all of these elements, but the use of laboratory instruments in underground mines is generally impracticable. The article proposes a new approach to solve this problem: using SSA and SNC calculated from field-type measurements, additional physical properties can be evaluated. PMID:26361400

  2. Dynamic Contrast-enhanced MR Imaging in Renal Cell Carcinoma: Reproducibility of Histogram Analysis on Pharmacokinetic Parameters

    PubMed Central

    Wang, Hai-yi; Su, Zi-hua; Xu, Xiao; Sun, Zhi-peng; Duan, Fei-xue; Song, Yuan-yuan; Li, Lu; Wang, Ying-wei; Ma, Xin; Guo, Ai-tao; Ma, Lin; Ye, Hui-yi

    2016-01-01

    Pharmacokinetic parameters derived from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) have been increasingly used to evaluate the permeability of tumor vessels. Histogram metrics are a recognized, promising method of quantitative MR imaging that has recently been introduced into the analysis of DCE-MRI pharmacokinetic parameters in oncology on account of tumor heterogeneity. In this study, 21 patients with renal cell carcinoma (RCC) underwent paired DCE-MRI studies on a 3.0 T MR system. The extended Tofts model and a population-based arterial input function were used to calculate kinetic parameters of RCC tumors. The mean value and histogram metrics (Mode, Skewness, and Kurtosis) of each pharmacokinetic parameter were generated automatically using ImageJ software. Intra- and inter-observer reproducibility and scan-rescan reproducibility were evaluated using intra-class correlation coefficients (ICCs) and the coefficient of variation (CoV). Our results demonstrated that the histogram method (Mode, Skewness, and Kurtosis) was not superior to the conventional Mean value method for reproducibility evaluation of DCE-MRI pharmacokinetic parameters (Ktrans and Ve) in renal cell carcinoma; Skewness and Kurtosis in particular showed lower intra-observer, inter-observer, and scan-rescan reproducibility than the Mean value. Our findings suggest that additional studies are necessary before wide incorporation of histogram metrics in quantitative analysis of DCE-MRI pharmacokinetic parameters. PMID:27380733
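
The histogram metrics compared above (mean, mode, skewness, kurtosis) and the coefficient of variation have simple closed forms. A hedged numpy sketch (the bin count and the within-subject CoV formula are illustrative choices, not taken from the paper):

```python
import numpy as np

def histogram_metrics(values):
    """Mean, mode (peak of a coarse histogram), skewness, and excess kurtosis.

    Assumes the values have nonzero spread.
    """
    x = np.asarray(values, dtype=float)
    mu, sd = x.mean(), x.std()
    counts, edges = np.histogram(x, bins=20)
    mode = 0.5 * (edges[np.argmax(counts)] + edges[np.argmax(counts) + 1])
    z = (x - mu) / sd
    return {"mean": mu,
            "mode": mode,
            "skewness": np.mean(z ** 3),        # third standardized moment
            "kurtosis": np.mean(z ** 4) - 3.0}  # excess kurtosis

def cov_percent(scan1, scan2):
    """Within-subject coefficient of variation between paired measurements (%)."""
    a, b = np.asarray(scan1, float), np.asarray(scan2, float)
    within_sd = np.sqrt(np.mean((a - b) ** 2) / 2.0)
    return 100.0 * within_sd / np.mean((a + b) / 2.0)

metrics = histogram_metrics([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because skewness and kurtosis are third- and fourth-moment statistics, they amplify noise in the tails, which is consistent with the lower reproducibility reported for them.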

  3. Evaluating reproducibility of differential expression discoveries in microarray studies by considering correlated molecular changes.

    PubMed

    Zhang, Min; Zhang, Lin; Zou, Jinfeng; Yao, Chen; Xiao, Hui; Liu, Qing; Wang, Jing; Wang, Dong; Wang, Chenguang; Guo, Zheng

    2009-07-01

    According to current consistency metrics such as percentage of overlapping genes (POG), lists of differentially expressed genes (DEGs) detected from different microarray studies for a complex disease are often highly inconsistent. This irreproducibility problem also exists in other high-throughput post-genomic areas such as proteomics and metabolomics. A complex disease is often characterized by many coordinated molecular changes, which should be considered when evaluating the reproducibility of discovery lists from different studies. We proposed the metrics percentage of overlapping genes-related (POGR) and normalized POGR (nPOGR) to evaluate the consistency between two DEG lists for a complex disease, considering correlated molecular changes rather than only counting gene overlaps between the lists. Based on microarray datasets of three diseases, we showed that although the POG scores for DEG lists from different studies of each disease are extremely low, the POGR and nPOGR scores can be rather high, suggesting that the apparently inconsistent DEG lists may be highly reproducible in the sense that they are actually significantly correlated. Evaluating different discovery results for a disease with the POGR and nPOGR scores will markedly reduce the uncertainty of microarray studies. The proposed metrics should also be applicable in many other high-throughput post-genomic areas.
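
The contrast between POG and POGR can be sketched directly: POG counts only exact overlaps, while a POGR-style score also credits a gene that is correlated with some gene in the other list. This is an illustrative reading of the abstract, not the paper's exact definition; the gene names and correlation map are hypothetical:

```python
def pog(list_a, list_b):
    """Percentage of overlapping genes between two DEG lists."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / min(len(a), len(b))

def pogr(list_a, list_b, correlated):
    """POGR-style score: a gene in list A also counts as 'shared' if it is
    significantly correlated with some gene in list B.

    `correlated` maps a gene to the set of genes it correlates with
    (a hypothetical precomputed co-expression structure).
    """
    b = set(list_b)
    hits = sum(1 for g in list_a
               if g in b or correlated.get(g, set()) & b)
    return hits / len(list_a)

deg_a = ["g1", "g2", "g3"]
deg_b = ["g2", "g4", "g5"]
co = {"g1": {"g4"}}  # hypothetical: g1's expression tracks g4's
```

Here only g2 overlaps exactly, but g1 is correlated with g4 in the other list, so the POGR-style score exceeds POG — the abstract's central point in miniature.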

  4. Expanded Enlistment Eligibility Metrics (EEEM): Recommendations on a Non-Cognitive Screen for New Soldier Selection

    DTIC Science & Technology

    2010-07-01

    applicants and is pursuing further research on the WPA. An operational test and evaluation (IOT&E) has been initiated to evaluate the new screen... initial operational test and evaluation (IOT&E) starting in fall 2009.

  5. A Case Study: Analyzing City Vitality with Four Pillars of Activity-Live, Work, Shop, and Play.

    PubMed

    Griffin, Matt; Nordstrom, Blake W; Scholes, Jon; Joncas, Kate; Gordon, Patrick; Krivenko, Elliott; Haynes, Winston; Higdon, Roger; Stewart, Elizabeth; Kolker, Natali; Montague, Elizabeth; Kolker, Eugene

    2016-03-01

    This case study evaluates and tracks the vitality of a city (Seattle), based on a data-driven approach, using strategic, robust, and sustainable metrics. This case study was collaboratively conducted by the Downtown Seattle Association (DSA) and CDO Analytics teams. The DSA is a nonprofit organization focused on making the city of Seattle and its Downtown a healthy and vibrant place to Live, Work, Shop, and Play. DSA primarily operates through public policy advocacy, community and business development, and marketing. In 2010, the organization turned to CDO Analytics (cdoanalytics.org) to develop a process that can guide and strategically focus DSA efforts and resources for maximal benefit to the city of Seattle and its Downtown. CDO Analytics was asked to develop clear, easily understood, and robust metrics for a baseline evaluation of the health of the city, as well as for ongoing monitoring and comparisons of the vitality, sustainability, and growth. The DSA and CDO Analytics teams strategized on how to effectively assess and track the vitality of Seattle and its Downtown. The two teams filtered a variety of data sources and evaluated the veracity of multiple diverse metrics. This iterative process resulted in the development of a small number of strategic, simple, reliable, and sustainable metrics across four pillars of activity: Live, Work, Shop, and Play. Data from the 5 years before 2010 were used for the development of the metrics and model and its training, and data from 2010 onward were used for testing and validation. This work enabled DSA to routinely track these strategic metrics, use them to monitor the vitality of Downtown Seattle, prioritize improvements, and identify new value-added programs. As a result, the four-pillar approach became an integral part of the data-driven decision-making and execution of the Seattle community's improvement activities.
The approach described in this case study is actionable, robust, inexpensive, and easy to adopt and sustain. It can be applied to cities, districts, counties, regions, states, or countries, enabling cross-comparisons and improvements of vitality, sustainability, and growth.

  6. A framework for evaluating mixture analysis algorithms

    NASA Astrophysics Data System (ADS)

    Dasaratha, Sridhar; Vignesh, T. S.; Shanmukh, Sarat; Yarra, Malathi; Botonjic-Sehic, Edita; Grassi, James; Boudries, Hacene; Freeman, Ivan; Lee, Young K.; Sutherland, Scott

    2010-04-01

    In recent years, several sensing devices capable of identifying unknown chemical and biological substances have been commercialized. The success of these devices in analyzing real-world samples is dependent on the ability of the on-board identification algorithm to de-convolve spectra of substances that are mixtures. To develop effective de-convolution algorithms, it is critical to characterize the relationship between the spectral features of a substance and its probability of detection within a mixture, as these features may be similar to or overlap with those of other substances in the mixture and in the library. While it has been recognized that these aspects pose challenges to mixture analysis, a systematic effort to quantify spectral characteristics and their impact is generally lacking. In this paper, we propose metrics that can be used to quantify these spectral features. Some of these metrics, such as a modification of the variance inflation factor, are derived from classical statistical measures used in regression diagnostics. We demonstrate that these metrics can be correlated with the accuracy of the substance's identification in a mixture. We also develop a framework for characterizing mixture analysis algorithms using these metrics. Experimental results are then provided to show the application of this framework to the evaluation of various algorithms, including one that has been developed for a commercial device. The illustration is based on synthetic mixtures that are created from pure component Raman spectra measured on a portable device.
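
One of the proposed metrics is described as a modification of the variance inflation factor, a standard regression diagnostic. A sketch of the classical (unmodified) VIF applied to a spectral library stored column-wise, with synthetic data standing in for real Raman spectra; the collinear third column mimics a signature that is a blend of two others:

```python
import numpy as np

def vif(spectra):
    """Variance inflation factor of each library spectrum (column) against
    the rest: VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on all other columns. High VIF = near-collinear signature
    that is hard to identify inside a mixture.
    """
    S = np.asarray(spectra, dtype=float)
    n, k = S.shape
    out = []
    for j in range(k):
        y = S[:, j]
        X = np.column_stack([np.ones(n), np.delete(S, j, axis=1)])  # intercept
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out.append(1.0 / max(1.0 - r2, 1e-12))  # cap to avoid division by zero
    return np.array(out)

# Synthetic "spectra": columns 0-2 are mutually collinear, column 3 is not
rng = np.random.default_rng(1)
a = rng.normal(size=50)
b = rng.normal(size=50)
S = np.column_stack([a, b, a + b, rng.normal(size=50)])
v = vif(S)
```

Note that collinearity is mutual: all three columns in the dependent set inflate, while the independent column stays near 1.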

  7. The five traps of performance measurement.

    PubMed

    Likierman, Andrew

    2009-10-01

    Evaluating a company's performance often entails wading through a thicket of numbers produced by a few simple metrics, writes the author, and senior executives leave measurement to those whose specialty is spreadsheets. To take ownership of performance assessment, those executives should find qualitative, forward-looking measures that will help them avoid five common traps: Measuring against yourself. Find data from outside the company, and reward relative, rather than absolute, performance. Enterprise Rent-A-Car uses a service quality index to measure customers' repeat purchase intentions. Looking backward. Use measures that lead rather than lag the profits in your business. Humana, a health insurer, found that the sickest 10% of its patients account for 80% of its costs; now it offers customers incentives for early screening. Putting your faith in numbers. The soft drinks company Britvic evaluates its executive coaching program not by trying to assign it an ROI number but by tracking participants' careers for a year. Gaming your metrics. The law firm Clifford Chance replaced its single, easy-to-game metric of billable hours with seven criteria on which to base bonuses. Sticking to your numbers too long. Be precise about what you want to assess and explicit about what metrics are assessing it. Such clarity would have helped investors interpret the AAA ratings involved in the financial meltdown. Really good assessment will combine finance managers' relative independence with line managers' expertise.

  8. Evaluation of Ion Mobility-Mass Spectrometry for Comparative Analysis of Monoclonal Antibodies

    NASA Astrophysics Data System (ADS)

    Ferguson, Carly N.; Gucinski-Ruth, Ashley C.

    2016-05-01

    Analytical techniques capable of detecting changes in structure are necessary to monitor the quality of monoclonal antibody drug products. Ion mobility mass spectrometry offers an advanced mode of characterization of protein higher order structure. In this work, we evaluated the reproducibility of ion mobility mass spectrometry measurements and mobiligrams, as well as the suitability of this approach to differentiate between and/or characterize different monoclonal antibody drug products. Four mobiligram-derived metrics were identified to be reproducible across a multi-day window of analysis. These metrics were further applied to comparative studies of monoclonal antibody drug products representing different IgG subclasses, manufacturers, and lots. These comparisons resulted in some differences, based on the four metrics derived from ion mobility mass spectrometry mobiligrams. The use of collision-induced unfolding resulted in more observed differences. Use of summed charge state datasets and the analysis of metrics beyond drift time allowed for a more comprehensive comparative study between different monoclonal antibody drug products. Ion mobility mass spectrometry enabled detection of differences between monoclonal antibodies with the same target protein but different production techniques, as well as products with different targets. These differences were not always detectable by traditional collision cross section studies. Ion mobility mass spectrometry, and the added separation capability of collision-induced unfolding, was highly reproducible and remains a promising technique for advanced analytical characterization of protein therapeutics.

  9. Evaluation of Enhanced Risk Monitors for Use on Advanced Reactors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramuhalli, Pradeep; Veeramany, Arun; Bonebrake, Christopher A.

    This study provides an overview of a methodology for integrating time-dependent failure probabilities into nuclear power reactor risk monitors. This prototypic enhanced risk monitor (ERM) methodology was evaluated using a hypothetical probabilistic risk assessment (PRA) model, generated using a simplified design of a liquid-metal-cooled advanced reactor (AR). Component failure data from industry compilations of failures of components similar to those in the simplified AR model were used to initialize the PRA model. Core damage frequency (CDF) over time was computed and analyzed. In addition, a study on alternative risk metrics for ARs was conducted. Risk metrics that quantify the normalized cost of repairs, replacements, or other operations and management (O&M) actions were defined and used, along with an economic model, to compute the likely economic risk of future actions such as deferred maintenance, based on the anticipated change in CDF due to current component condition and future anticipated degradation. Such integration of conventional risk metrics with alternate risk metrics provides a convenient mechanism for assessing the impact of O&M decisions on the safety and economics of the plant. It is expected that, when integrated with supervisory control algorithms, such integrated risk monitors will provide a mechanism for real-time control decision-making that ensures safety margins are maintained while operating the plant in an economically viable manner.

  10. Design and Implementation of Performance Metrics for Evaluation of Assessments Data

    ERIC Educational Resources Information Center

    Ahmed, Irfan; Bhatti, Arif

    2016-01-01

    Evocative evaluation of assessment data is essential to quantify the achievements at course and program levels. The objective of this paper is to design performance metrics and respective formulas to quantitatively evaluate the achievement of set objectives and expected outcomes at the course levels for program accreditation. Even though…

  11. Focus measure method based on the modulus of the gradient of the color planes for digital microscopy

    NASA Astrophysics Data System (ADS)

    Hurtado-Pérez, Román; Toxqui-Quitl, Carina; Padilla-Vivanco, Alfonso; Aguilar-Valdez, J. Félix; Ortega-Mendoza, Gabriel

    2018-02-01

    The modulus of the gradient of the color planes (MGC) is implemented to transform multichannel information to a grayscale image. This digital technique is used in two applications: (a) focus measurements during autofocusing (AF) process and (b) extending the depth of field (EDoF) by means of multifocus image fusion. In the first case, the MGC procedure is based on an edge detection technique and is implemented in over 15 focus metrics that are typically handled in digital microscopy. The MGC approach is tested on color images of histological sections for the selection of in-focus images. An appealing attribute of all the AF metrics working in the MGC space is their monotonic behavior even up to a magnification of 100×. An advantage of the MGC method is its computational simplicity and inherent parallelism. In the second application, a multifocus image fusion algorithm based on the MGC approach has been implemented on graphics processing units (GPUs). The resulting fused images are evaluated using a nonreference image quality metric. The proposed fusion method reveals a high-quality image independently of faulty illumination during the image acquisition. Finally, the three-dimensional visualization of the in-focus image is shown.
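
The MGC transform has a compact definition: pool the squared x/y gradients over all color planes and take the square root per pixel. A minimal numpy sketch (the noise-versus-blur comparison is an illustrative stand-in for in-focus versus out-of-focus micrographs):

```python
import numpy as np

def mgc(rgb):
    """Modulus of the gradient of the color planes: per-pixel Euclidean norm
    of the x/y gradients pooled over all channels."""
    img = np.asarray(rgb, dtype=float)
    gsq = np.zeros(img.shape[:2])
    for c in range(img.shape[2]):
        gy, gx = np.gradient(img[:, :, c])  # gradients along rows, columns
        gsq += gx ** 2 + gy ** 2
    return np.sqrt(gsq)

def focus_measure(rgb):
    """Scalar focus score: mean MGC. Sharper (in-focus) images score higher."""
    return float(mgc(rgb).mean())

# Sharp texture vs. a 5-point smoothed copy of it (defocus proxy)
rng = np.random.default_rng(0)
sharp = rng.random((24, 24, 3)) * 255
blurred = (sharp
           + np.roll(sharp, 1, axis=0) + np.roll(sharp, -1, axis=0)
           + np.roll(sharp, 1, axis=1) + np.roll(sharp, -1, axis=1)) / 5.0
```

Smoothing suppresses high-frequency content, so the blurred copy scores lower — the monotonic behavior that makes MGC-space metrics usable for autofocus search.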

  12. Multiview marker-free registration of forest terrestrial laser scanner data with embedded confidence metrics

    DOE PAGES

    Kelbe, David; Oak Ridge National Lab.; van Aardt, Jan; ...

    2016-10-18

    Terrestrial laser scanning has demonstrated increasing potential for rapid, comprehensive measurement of forest structure, especially when multiple scans are spatially registered in order to reduce the limitations of occlusion. Although marker-based registration techniques (based on retro-reflective spherical targets) are commonly used in practice, a blind marker-free approach is preferable, insofar as it supports rapid operational data acquisition. To support these efforts, we extend the pairwise registration approach of our earlier work and develop a graph-theoretical framework to perform blind marker-free global registration of multiple point cloud data sets. Pairwise pose estimates are weighted based on their estimated error, in order to overcome pose conflict while exploiting redundant information and improving precision. The proposed approach was tested for eight diverse New England forest sites, with 25 scans collected at each site. Quantitative assessment was provided via a novel embedded confidence metric, with a mean estimated root-mean-square error of 7.2 cm and 89% of scans connected to the reference node. Lastly, this paper assesses the validity of the embedded multiview registration confidence metric and evaluates the performance of the proposed registration algorithm.
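
The idea of weighting pairwise pose estimates by their estimated error can be illustrated with a shortest-path search over the scan graph: each scan is connected to the reference through the chain of pairwise registrations with the smallest accumulated error estimate. This is a simplified sketch of the graph-theoretic step only; the error values are hypothetical, and the paper's actual weighting scheme may differ:

```python
import heapq

def best_registration_paths(n_scans, pair_errors, reference=0):
    """For each scan, find the chain of pairwise registrations to the
    reference node that minimizes accumulated estimated error (Dijkstra).

    pair_errors -- {(i, j): rmse} for successfully registered pairs
                   (treated symmetrically)
    Returns {scan: (total_error, path_to_reference)}.
    """
    adj = {i: [] for i in range(n_scans)}
    for (i, j), e in pair_errors.items():
        adj[i].append((j, e))
        adj[j].append((i, e))
    dist, prev = {reference: 0.0}, {}
    heap = [(0.0, reference)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, e in adj[u]:
            nd = d + e
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    out = {}
    for s, d in dist.items():
        path, cur = [s], s
        while cur != reference:
            cur = prev[cur]
            path.append(cur)
        out[s] = (d, path)
    return out

# Hypothetical pairwise RMS errors (meters) for 3 scans
out = best_registration_paths(3, {(0, 1): 0.05, (1, 2): 0.05, (0, 2): 0.2})
```

Here scan 2 reaches the reference via scan 1 (0.05 + 0.05) rather than directly (0.2), showing how a noisy direct pose estimate is bypassed through redundant pairwise links.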

  13. Metrics for linear kinematic features in sea ice

    NASA Astrophysics Data System (ADS)

    Levy, G.; Coon, M.; Sulsky, D.

    2006-12-01

    The treatment of leads as cracks or discontinuities (see Coon et al. presentation) requires some shift in the procedure of evaluation and comparison of lead-resolving models and their validation against observations. Common metrics used to evaluate ice model skill are by and large an adaptation of a least-squares "metric" adopted from operational numerical weather prediction data assimilation systems, and are most appropriate for continuous fields and Eulerian systems where the observations and predictions are commensurate. However, this class of metrics suffers from some flaws in areas of sharp gradients and discontinuities (e.g., leads) and when Lagrangian treatments are more natural. After a brief review of these metrics and their performance in areas of sharp gradients, we present two new metrics specifically designed to measure model accuracy in representing linear features (e.g., leads). The indices developed circumvent the requirement that the observations and model variables be commensurate (i.e., measured with the same units) by considering the frequencies of the features of interest/importance. We illustrate the metrics by scoring several hypothetical "simulated" discontinuity fields against the leads interpreted from RGPS observations.

  14. Sound quality evaluation of air conditioning sound rating metric

    NASA Astrophysics Data System (ADS)

    Hodgdon, Kathleen K.; Peters, Jonathan A.; Burkhardt, Russell C.; Atchley, Anthony A.; Blood, Ingrid M.

    2003-10-01

    A product's success can depend on its acoustic signature as much as on the product's performance. The consumer's perception can strongly influence their satisfaction with and confidence in the product. A metric that can rate the content of the spectrum and predict consumer preference is a valuable tool for manufacturers. The current method of assessing acoustic signatures from residential air conditioning units is defined in the Air Conditioning and Refrigeration Institute (ARI 270) 1995 Standard for Sound Rating of Outdoor Unitary Equipment. The ARI 270 metric, and modified versions of that metric, were implemented in software with the flexibility to modify the features applied. Numerous product signatures were analyzed to generate a set of synthesized spectra targeting spectral configurations that challenged the metric's abilities. A subjective jury evaluation was conducted to establish the consumer preference for those spectra. Statistical correlations were conducted to assess the degree of relationship between the subjective preferences and the various metric calculations. Recommendations were made for modifications to improve the current metric's ability to predict subjective preference. [Research supported by the Air Conditioning and Refrigeration Institute.]

  15. SU-C-BRA-03: An Automated and Quick Contour Error Detection for Auto Segmentation in Online Adaptive Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, J; Ates, O; Li, X

    Purpose: To develop a tool that can quickly and automatically assess the quality of contours generated by auto segmentation during online adaptive replanning. Methods: Due to the strict time requirement of online replanning and the lack of ‘ground truth’ contours in daily images, our method starts by assessing image registration accuracy, focusing on the surface of the organ in question. Several metrics tightly related to registration accuracy, including Jacobian maps, contour shell deformation, and voxel-based root mean square (RMS) analysis, were computed. To identify correct contours, additional metrics and an adaptive decision tree are introduced. As a proof of principle, tests were performed with CT sets: planning and daily CTs acquired using a CT-on-rails system during routine CT-guided RT delivery for 20 prostate cancer patients. The contours generated on daily CTs using an auto-segmentation tool (ADMIRE, Elekta, MIM) based on deformable image registration of the planning CT and daily CT were tested. Results: The deformed contours of 20 patients, with a total of 60 structures, were manually checked as baselines; 49% of the contours were incorrect. To evaluate the quality of local deformation, the Jacobian determinant on contours (1.047±0.045) was analyzed. In an analysis of deformed rectum contour shells, a higher contour-error detection rate (0.41) was obtained, compared with 0.32 for the manual check. All automated detections took less than 5 seconds. Conclusion: The proposed method can effectively detect contour errors at both micro and macro scales by evaluating multiple deformable registration metrics in a parallel computing process. Future work will focus on improving practicability and optimizing the calculation algorithms and metric selection.
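    The Jacobian-map check described in this abstract can be made concrete. The sketch below (illustrative only; function and variable names are not from the paper) computes the voxel-wise Jacobian determinant of a 2D displacement field, the quantity whose mean on contours the abstract reports as 1.047±0.045:

```python
import numpy as np

def jacobian_determinant(dx, dy):
    """Voxel-wise Jacobian determinant of a 2D displacement field.

    dx, dy: 2D arrays of displacement (in pixels) along each axis.
    The transform is phi(x) = x + u(x), so the Jacobian is I + grad(u);
    det > 0 means locally invertible, det = 1 means volume-preserving.
    """
    # np.gradient returns derivatives along axis 0 (rows) then axis 1 (cols)
    dudy, dudx = np.gradient(dx)
    dvdy, dvdx = np.gradient(dy)
    return (1.0 + dudx) * (1.0 + dvdy) - dudy * dvdx

# Identity deformation: determinant is exactly 1 everywhere
zero = np.zeros((32, 32))
assert np.allclose(jacobian_determinant(zero, zero), 1.0)
```

    Values far from 1 on the organ surface flag suspicious local deformation, which is the signal the decision tree in the abstract exploits.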

  16. Evaluating Quality Metrics and Cost After Discharge: A Population-based Cohort Study of Value in Health Care Following Elective Major Vascular Surgery.

    PubMed

    de Mestral, Charles; Salata, Konrad; Hussain, Mohamad A; Kayssi, Ahmed; Al-Omran, Mohammed; Roche-Nagle, Graham

    2018-04-18

    Early readmission to hospital after surgery is an omnipresent quality metric across surgical fields. We sought to understand the relative importance of hospital readmission among all health services received after hospital discharge. The aim of this study was to characterize 30-day postdischarge cost and risk of an emergency department (ED) visit, readmission, or death after hospitalization for elective major vascular surgery. This is a population-based retrospective cohort study of patients who underwent elective major vascular surgery - carotid endarterectomy, EVAR, open AAA repair, bypass for lower extremity peripheral arterial disease - in Ontario, Canada, between 2004 and 2015. The outcomes of interest included quality metrics - ED visit, readmission, death - and cost to the Ministry of Health, within 30 days of discharge. Costs after discharge included those attributable to hospital readmission, ED visits, rehab, physician billing, outpatient nursing and allied health care, medications, interventions, and tests. Multivariable regression models characterized the association of pre-discharge characteristics with the above-mentioned postdischarge quality metrics and cost. A total of 30,752 patients were identified. Within 30 days of discharge, 2588 (8.4%) patients were readmitted to hospital and 13 patients died (0.04%). Another 4145 (13.5%) patients visited an ED without requiring admission. Across all patients, over half of 30-day postdischarge costs were attributable to outpatient care. Patients at an increased risk of an ED visit, readmission, or death within 30 days of discharge differed from those patients with relatively higher 30-day costs. Events occurring outside the hospital setting should be integral to the evaluation of quality of care and cost after hospitalization for major vascular surgery.

  17. Novel methods to estimate antiretroviral adherence: protocol for a longitudinal study.

    PubMed

    Saberi, Parya; Ming, Kristin; Legnitto, Dominique; Neilands, Torsten B; Gandhi, Monica; Johnson, Mallory O

    2018-01-01

    There is currently no gold standard for assessing antiretroviral (ARV) adherence, so researchers often resort to the most feasible and cost-effective methods possible (eg, self-report), which may be biased or inaccurate. The goal of our study was to evaluate the feasibility and acceptability of innovative and remote methods to estimate ARV adherence, which can potentially be conducted with less time and financial resources in a wide range of clinic and research settings. Here, we describe the research protocol for studying these novel methods and some lessons learned. The 6-month pilot study aimed to examine the feasibility and acceptability of a remotely conducted study to evaluate the correlation between: 1) text-messaged photographs of pharmacy refill dates for refill-based adherence; 2) text-messaged photographs of pills for pill count-based adherence; and 3) home-collected hair sample measures of ARV concentration for pharmacologic-based adherence. Participants were sent monthly automated text messages to collect refill dates and pill counts that were taken and sent via mobile telephone photographs, and hair collection kits every 2 months by mail. At the study end, feasibility was calculated by specific metrics, such as the receipt of hair samples and responses to text messages. Participants completed a quantitative survey and qualitative exit interviews to examine the acceptability of these adherence evaluation methods. The relationship between the 3 novel metrics of adherence and self-reported adherence will be assessed. Investigators conducting adherence research are often limited to using either self-reported adherence, which is subjective, biased, and often overestimated, or other more complex methods. Here, we describe the protocol for evaluating the feasibility and acceptability of 3 novel and remote methods of estimating adherence, with the aim of evaluating the relationships between them. 
Additionally, we note the lessons learned from the protocol implementation to date. We expect that these novel measures will be feasible and acceptable. The implications of this research will be the identification and evaluation of innovative and accurate metrics of ARV adherence for future implementation.

  18. Novel methods to estimate antiretroviral adherence: protocol for a longitudinal study

    PubMed Central

    Saberi, Parya; Ming, Kristin; Legnitto, Dominique; Neilands, Torsten B; Gandhi, Monica; Johnson, Mallory O

    2018-01-01

    Background: There is currently no gold standard for assessing antiretroviral (ARV) adherence, so researchers often resort to the most feasible and cost-effective methods possible (eg, self-report), which may be biased or inaccurate. The goal of our study was to evaluate the feasibility and acceptability of innovative and remote methods to estimate ARV adherence, which can potentially be conducted with less time and financial resources in a wide range of clinic and research settings. Here, we describe the research protocol for studying these novel methods and some lessons learned. Methods: The 6-month pilot study aimed to examine the feasibility and acceptability of a remotely conducted study to evaluate the correlation between: 1) text-messaged photographs of pharmacy refill dates for refill-based adherence; 2) text-messaged photographs of pills for pill count-based adherence; and 3) home-collected hair sample measures of ARV concentration for pharmacologic-based adherence. Participants were sent monthly automated text messages to collect refill dates and pill counts that were taken and sent via mobile telephone photographs, and hair collection kits every 2 months by mail. At the study end, feasibility was calculated by specific metrics, such as the receipt of hair samples and responses to text messages. Participants completed a quantitative survey and qualitative exit interviews to examine the acceptability of these adherence evaluation methods. The relationship between the 3 novel metrics of adherence and self-reported adherence will be assessed. Discussion: Investigators conducting adherence research are often limited to using either self-reported adherence, which is subjective, biased, and often overestimated, or other more complex methods. Here, we describe the protocol for evaluating the feasibility and acceptability of 3 novel and remote methods of estimating adherence, with the aim of evaluating the relationships between them. 
Additionally, we note the lessons learned from the protocol implementation to date. We expect that these novel measures will be feasible and acceptable. The implications of this research will be the identification and evaluation of innovative and accurate metrics of ARV adherence for future implementation. PMID:29950816

  19. Information Geometry for Landmark Shape Analysis: Unifying Shape Representation and Deformation

    PubMed Central

    Peter, Adrian M.; Rangarajan, Anand

    2010-01-01

    Shape matching plays a prominent role in the comparison of similar structures. We present a unifying framework for shape matching that uses mixture models to couple both the shape representation and deformation. The theoretical foundation is drawn from information geometry wherein information matrices are used to establish intrinsic distances between parametric densities. When a parameterized probability density function is used to represent a landmark-based shape, the modes of deformation are automatically established through the information matrix of the density. We first show that given two shapes parameterized by Gaussian mixture models (GMMs), the well-known Fisher information matrix of the mixture model is also a Riemannian metric (actually, the Fisher-Rao Riemannian metric) and can therefore be used for computing shape geodesics. The Fisher-Rao metric has the advantage of being an intrinsic metric and invariant to reparameterization. The geodesic—computed using this metric—establishes an intrinsic deformation between the shapes, thus unifying both shape representation and deformation. A fundamental drawback of the Fisher-Rao metric is that it is not available in closed form for the GMM. Consequently, shape comparisons are computationally very expensive. To address this, we develop a new Riemannian metric based on generalized ϕ-entropy measures. In sharp contrast to the Fisher-Rao metric, the new metric is available in closed form. Geodesic computations using the new metric are considerably more efficient. We validate the performance and discriminative capabilities of these new information geometry-based metrics by pairwise matching of corpus callosum shapes. We also study the deformations of fish shapes that have various topological properties. 
A comprehensive comparative analysis is also provided using other landmark-based distances, including the Hausdorff distance, the Procrustes metric, landmark-based diffeomorphisms, and the bending energies of the thin-plate spline (TPS) and Wendland splines. PMID:19110497

  20. The association between four citation metrics and peer rankings of research influence of Australian researchers in six fields of public health.

    PubMed

    Derrick, Gemma Elizabeth; Haynes, Abby; Chapman, Simon; Hall, Wayne D

    2011-04-06

    Doubt about the relevance, appropriateness and transparency of peer review has promoted the use of citation metrics as a viable adjunct or alternative in the assessment of research impact. It is also commonly acknowledged that research metrics will not replace peer review unless they are shown to correspond with the assessment of peers. This paper evaluates the relationship between researchers' influence as evaluated by their peers and various citation metrics representing different aspects of research output in 6 fields of public health in Australia. For four fields, the results showed a modest positive correlation between different research metrics and peer assessments of research influence. However, for two fields, tobacco and injury, negative or no correlations were found. This suggests a peer understanding of research influence within these fields differed from visibility in the mainstream, peer-reviewed scientific literature. This research therefore recommends the use of both peer review and metrics in a combined approach in assessing research influence. Future research evaluation frameworks intent on incorporating metrics should first analyse each field closely to determine what measures of research influence are valued highly by members of that research community. This will aid the development of comprehensive and relevant frameworks with which to fairly and transparently distribute research funds or approve promotion applications.

  1. The Association between Four Citation Metrics and Peer Rankings of Research Influence of Australian Researchers in Six Fields of Public Health

    PubMed Central

    Derrick, Gemma Elizabeth; Haynes, Abby; Chapman, Simon; Hall, Wayne D.

    2011-01-01

    Doubt about the relevance, appropriateness and transparency of peer review has promoted the use of citation metrics as a viable adjunct or alternative in the assessment of research impact. It is also commonly acknowledged that research metrics will not replace peer review unless they are shown to correspond with the assessment of peers. This paper evaluates the relationship between researchers' influence as evaluated by their peers and various citation metrics representing different aspects of research output in 6 fields of public health in Australia. For four fields, the results showed a modest positive correlation between different research metrics and peer assessments of research influence. However, for two fields, tobacco and injury, negative or no correlations were found. This suggests a peer understanding of research influence within these fields differed from visibility in the mainstream, peer-reviewed scientific literature. This research therefore recommends the use of both peer review and metrics in a combined approach in assessing research influence. Future research evaluation frameworks intent on incorporating metrics should first analyse each field closely to determine what measures of research influence are valued highly by members of that research community. This will aid the development of comprehensive and relevant frameworks with which to fairly and transparently distribute research funds or approve promotion applications. PMID:21494691

  2. Hybrid Pixel-Based Method for Cardiac Ultrasound Fusion Based on Integration of PCA and DWT.

    PubMed

    Mazaheri, Samaneh; Sulaiman, Puteri Suhaiza; Wirza, Rahmita; Dimon, Mohd Zamrin; Khalid, Fatimah; Moosavi Tayebi, Rohollah

    2015-01-01

    Medical image fusion is the procedure of combining several images from one or multiple imaging modalities. In spite of numerous attempts at automating ventricle segmentation and tracking in echocardiography, the problem remains challenging due to low-quality images with missing anatomical details or speckle noise and a restricted field of view. This paper presents a fusion method which particularly intends to increase the segment-ability of echocardiography features such as the endocardium and to improve image contrast. In addition, it aims to expand the field of view, decrease the impact of noise and artifacts, and enhance the signal-to-noise ratio of the echo images. The proposed algorithm weights the image information according to an integration feature between all the overlapping images, by using a combination of principal component analysis (PCA) and the discrete wavelet transform (DWT). For evaluation, a comparison has been made between the results of some well-known techniques and the proposed method. In addition, different metrics are implemented to evaluate the performance of the proposed algorithm. It was concluded that the presented pixel-based method based on the integration of PCA and DWT gives the best results for the segment-ability of cardiac ultrasound images and better performance on all metrics.
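    As a rough illustration of the PCA half of such a scheme, the numpy-only sketch below derives blending weights from the leading principal component of two overlapping images (an assumption about the scheme's realisation, not the paper's exact algorithm; a full PCA+DWT pipeline would apply such weights per wavelet subband):

```python
import numpy as np

def pca_fusion_weights(img_a, img_b):
    """Fusion weights from the leading principal component of two images.

    Treats the two images as a 2-variable dataset, takes the dominant
    eigenvector of their 2x2 covariance matrix, and normalises it into
    non-negative blending weights.
    """
    data = np.vstack([img_a.ravel(), img_b.ravel()])
    cov = np.cov(data)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    v = np.abs(eigvecs[:, -1])               # dominant principal component
    w = v / v.sum()
    return w[0], w[1]

def fuse(img_a, img_b):
    """Pixel-wise weighted blend of the two overlapping images."""
    wa, wb = pca_fusion_weights(img_a, img_b)
    return wa * img_a + wb * img_b
```

    The image carrying more variance (typically more structure) receives the larger weight, which is the usual motivation for PCA-based fusion rules.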

  3. Assessment of the effects of nickel on benthic macroinvertebrates in the field.

    PubMed

    Peters, Adam; Simpson, Peter; Merrington, Graham; Schlekat, Chris; Rogevich-Garman, Emily

    2014-01-01

    A field-based evaluation of the biological effects of potential nickel (Ni) exposures was conducted using monitoring data for benthic macroinvertebrates and water chemistry parameters for streams in England and Wales. Observed benthic community metrics were compared to expected community metrics under reference conditions using RIVPACS III+ software. In order to evaluate relationships between Ni concentrations and benthic community metrics, bioavailable Ni concentrations were also calculated for each site. A limiting effect from Ni on the 90th percentile of the maximum achievable ecological quality was derived at "bioavailable Ni" exposures of 10.3 μg l(-1). As snails have been identified as particularly sensitive to nickel exposure, snail abundance in the field in response to nickel exposure, relative to reference conditions, was also analysed. A "low effects" threshold for snail abundance based on an average of spring and autumn data was derived as 3.9 μg l(-1) bioavailable Ni. There was no apparent effect of Ni exposure on the abundance of Ephemeroptera (mayflies), Plecoptera (stoneflies) or Trichoptera (caddisflies) when expressed relative to a reference condition within the range of "bioavailable Ni" exposures observed within the dataset. Nickel exposure concentrations co-vary with the concentrations of other stressors in the dataset, and high concentrations of Ni are also associated with elevated concentrations of other contaminants.

  4. An Adaptive Handover Prediction Scheme for Seamless Mobility Based Wireless Networks

    PubMed Central

    Safa Sadiq, Ali; Fisal, Norsheila Binti; Ghafoor, Kayhan Zrar; Lloret, Jaime

    2014-01-01

    We propose an adaptive handover prediction (AHP) scheme for seamless mobility based wireless networks. That is, the AHP scheme incorporates fuzzy logic into the AP prediction process in order to lend cognitive capability to handover decision making. Selection metrics, including received signal strength, the mobile node's relative direction towards the access points in the vicinity, and access point load, are collected and considered as inputs to the fuzzy decision-making system in order to select the most preferable AP among the surrounding WLANs. The handover decision, which is based on the quality cost calculated by the fuzzy inference system, also relies on adaptable rather than fixed coefficients. In other words, the mean and the standard deviation of the normalized network prediction metrics of the fuzzy inference system, collected from the available WLANs, are obtained adaptively. Accordingly, they are applied as statistical information to adjust or adapt the coefficients of the membership functions. In addition, we propose an adjustable weight vector concept for the input metrics in order to cope with the continuous, unpredictable variation in their membership degrees. Furthermore, handover decisions are performed independently in each MN once the RSS, direction towards APs, and AP load are known. Finally, performance evaluation of the proposed scheme shows its superiority compared with representative prediction approaches. PMID:25574490

  5. An adaptive handover prediction scheme for seamless mobility based wireless networks.

    PubMed

    Sadiq, Ali Safa; Fisal, Norsheila Binti; Ghafoor, Kayhan Zrar; Lloret, Jaime

    2014-01-01

    We propose an adaptive handover prediction (AHP) scheme for seamless mobility based wireless networks. That is, the AHP scheme incorporates fuzzy logic into the AP prediction process in order to lend cognitive capability to handover decision making. Selection metrics, including received signal strength, the mobile node's relative direction towards the access points in the vicinity, and access point load, are collected and considered as inputs to the fuzzy decision-making system in order to select the most preferable AP among the surrounding WLANs. The handover decision, which is based on the quality cost calculated by the fuzzy inference system, also relies on adaptable rather than fixed coefficients. In other words, the mean and the standard deviation of the normalized network prediction metrics of the fuzzy inference system, collected from the available WLANs, are obtained adaptively. Accordingly, they are applied as statistical information to adjust or adapt the coefficients of the membership functions. In addition, we propose an adjustable weight vector concept for the input metrics in order to cope with the continuous, unpredictable variation in their membership degrees. Furthermore, handover decisions are performed independently in each MN once the RSS, direction towards APs, and AP load are known. Finally, performance evaluation of the proposed scheme shows its superiority compared with representative prediction approaches.

  6. Benchmark data sets for structure-based computational target prediction.

    PubMed

    Schomburg, Karen T; Rarey, Matthias

    2014-08-25

    Structure-based computational target prediction methods identify potential targets for a bioactive compound. Methods based on protein-ligand docking still face many challenges, the greatest probably being the ranking of true targets in a large data set of protein structures. Currently, no standard data sets for evaluation exist, rendering the comparison of methods and the demonstration of improvements cumbersome. Therefore, we propose two data sets and evaluation strategies for a meaningful evaluation of new target prediction methods, i.e., a small data set consisting of three target classes for detailed proof-of-concept and selectivity studies, and a large data set consisting of 7992 protein structures and 72 drug-like ligands allowing statistical evaluation with performance metrics on a drug-like chemical space. Both data sets are built from openly available resources, and any information needed to perform the described experiments is reported. We describe the composition of the data sets, the setup of the screening experiments, and the evaluation strategy. Performance metrics capable of measuring the early recognition of enrichment, like AUC, BEDROC, and NSLR, are proposed. We apply a sequence-based target prediction method to the large data set to analyze its content of nontrivial evaluation cases. The proposed data sets are used for method evaluation of our new inverse screening method iRAISE. The small data set reveals the method's capability and limitations to selectively distinguish between rather similar protein structures. The large data set simulates real target identification scenarios. iRAISE achieved excellent or good enrichment in 55% of cases, a median AUC of 0.67, and RMSDs below 2.0 Å for 74% of cases, and was able to predict the first true target within the top 2% of the protein data set of about 8000 structures in 59 out of 72 cases.

  7. Development of an epiphyte indicator of nutrient enrichment: Threshold values for seagrass epiphyte load

    EPA Science Inventory

    Metrics of epiphyte load on macrophytes were evaluated for use as quantitative biological indicators for nutrient impacts in estuarine waters, based on review and analysis of the literature on epiphytes and macrophytes, primarily seagrasses, but including some brackish and freshw...

  8. Objective assessment based on motion-related metrics and technical performance in laparoscopic suturing.

    PubMed

    Sánchez-Margallo, Juan A; Sánchez-Margallo, Francisco M; Oropesa, Ignacio; Enciso, Silvia; Gómez, Enrique J

    2017-02-01

    The aim of this study is to present the construct and concurrent validity of a motion-tracking method of laparoscopic instruments based on an optical pose tracker and determine its feasibility as an objective assessment tool of psychomotor skills during laparoscopic suturing. A group of novice ([Formula: see text] laparoscopic procedures), intermediate (11-100 laparoscopic procedures) and experienced ([Formula: see text] laparoscopic procedures) surgeons performed three intracorporeal sutures on an ex vivo porcine stomach. Motion analysis metrics were recorded using the proposed tracking method, which employs an optical pose tracker to determine the laparoscopic instruments' position. Construct validation was measured for all 10 metrics across the three groups and between pairs of groups. Concurrent validation was measured against a previously validated suturing checklist. Checklists were completed by two independent surgeons over blinded video recordings of the task. Eighteen novices, 15 intermediates and 11 experienced surgeons took part in this study. Execution time and path length travelled by the laparoscopic dissector presented construct validity. Experienced surgeons required significantly less time ([Formula: see text]), travelled less distance using both laparoscopic instruments ([Formula: see text]) and made more efficient use of the work space ([Formula: see text]) compared with novice and intermediate surgeons. Concurrent validation showed strong correlation between both the execution time and path length and the checklist score ([Formula: see text] and [Formula: see text], [Formula: see text]). The suturing performance was successfully assessed by the motion analysis method. Construct and concurrent validity of the motion-based assessment method has been demonstrated for the execution time and path length metrics. This study demonstrates the efficacy of the presented method for objective evaluation of psychomotor skills in laparoscopic suturing. 
However, this method does not take into account the quality of the suture. Thus, future works will focus on developing new methods combining motion analysis and qualitative outcome evaluation to provide a complete performance assessment to trainees.
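    Of the two validated metrics, path length is straightforward to compute from the tracked instrument positions. A minimal sketch (assuming an (N, 3) array of tip coordinates from the optical tracker; names are illustrative, not from the study):

```python
import numpy as np

def path_length(positions):
    """Total distance travelled by an instrument tip.

    positions: (N, 3) array of tracked tip coordinates (e.g. in mm),
    sampled at the tracker's frame rate. Sums the Euclidean distances
    between consecutive samples; at equal task success, a shorter path
    generally indicates more economical motion.
    """
    positions = np.asarray(positions, float)
    return float(np.linalg.norm(np.diff(positions, axis=0), axis=1).sum())

# Straight 2 mm path sampled at 1 mm steps
assert path_length([[0, 0, 0], [1, 0, 0], [2, 0, 0]]) == 2.0
```

    In practice such raw trajectories are usually low-pass filtered first, since tracker jitter inflates the summed distance.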

  9. SU-C-9A-02: Structured Noise Index as An Automated Quality Control for Nuclear Medicine: A Two Year Experience

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, J; Christianson, O; Samei, E

    Purpose: Flood-field uniformity evaluation is an essential element in the assessment of nuclear medicine (NM) gamma cameras. It serves as the central element of the quality control (QC) program, acquired and analyzed on a daily basis prior to clinical imaging. Uniformity images are traditionally analyzed using pixel value-based metrics which often fail to capture subtle structure and patterns caused by changes in gamma camera performance, requiring additional visual inspection which is subjective and time demanding. The goal of this project was to develop and implement a robust QC metrology for NM that is effective in identifying non-uniformity issues, reporting issues in a timely manner for efficient correction prior to clinical involvement, all incorporated into an automated effortless workflow, and to characterize the program over a two-year period. Methods: A new quantitative uniformity analysis metric was developed based on 2D noise power spectrum metrology and confirmed based on expert observer visual analysis. The metric, termed Structured Noise Index (SNI), was then integrated into an automated program to analyze, archive, and report on daily NM QC uniformity images. The effectiveness of the program was evaluated over a period of 2 years. Results: The SNI metric successfully identified visually apparent non-uniformities overlooked by the pixel value-based analysis methods. Implementation of the program has resulted in non-uniformity identification in about 12% of daily flood images. In addition, due to the vigilance of staff response, the percentage of days exceeding the trigger value shows a decline over time. Conclusion: The SNI provides a robust quantification of the NM performance of gamma camera uniformity. It operates seamlessly across a fleet of multiple camera models. The automated process provides effective workflow within the NM spectra between physicist, technologist, and clinical engineer. 
The reliability of this process has made it the preferred platform for NM uniformity analysis.
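    The core of an NPS-based uniformity metric can be sketched as follows. This is assumption-laden: the published SNI additionally weights the spectrum with a human visual response function and normalises structured power against the quantum-noise NPS, none of which is shown here:

```python
import numpy as np

def noise_power_spectrum_2d(flood):
    """2D noise power spectrum (NPS) of a uniformity (flood) image.

    Subtracts a background estimate (here simply the global mean of an
    assumed-uniform flood) and returns |FFT|^2 normalised by the pixel
    count. Structured non-uniformity appears as off-origin power, which
    an SNI-style index would compare against the power expected from
    Poisson (quantum) noise alone.
    """
    resid = flood - flood.mean()
    nps = np.abs(np.fft.fftshift(np.fft.fft2(resid))) ** 2
    return nps / flood.size
```

    A perfectly flat flood yields an all-zero spectrum; any periodic artifact (e.g. from a damaged photomultiplier pattern) concentrates power at its spatial frequency, where pixel-histogram metrics see nothing.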

  10. Regional Lung Function Profiles of Stage I and III Lung Cancer Patients: An Evaluation for Functional Avoidance Radiation Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vinogradskiy, Yevgeniy, E-mail: yevgeniy.vinogradskiy@ucdenver.edu; Schubert, Leah; Diot, Quentin

    2016-07-15

    Purpose: The development of clinical trials is underway to use 4-dimensional computed tomography (4DCT) ventilation imaging to preferentially spare functional lung in patients undergoing radiation therapy. The purpose of this work was to generate data to aid clinical trial design by retrospectively characterizing dosimetric and functional profiles for patients with different stages of lung cancer. Methods and Materials: A total of 118 lung cancer patients (36% stage I and 64% stage III) from 2 institutions were used for the study. A 4DCT-ventilation map was calculated using the patient's 4DCT imaging, deformable image registration, and a density-change–based algorithm. To assess each patient's spatial ventilation profile, both quantitative and qualitative metrics were developed, including an observer-based defect observation and metrics based on the ventilation in each lung third. For each patient we used the clinical doses to calculate functionally weighted mean lung doses and metrics that assessed the interplay between the spatial location of the dose and high-functioning lung. Results: Both qualitative and quantitative metrics revealed a significant difference in functional profiles between the 2 stage groups (P<.01). We determined that 65% of stage III and 28% of stage I patients had ventilation defects. Average functionally weighted mean lung dose was 19.6 Gy and 5.4 Gy for stage III and I patients, respectively, with both groups containing patients with large spatial overlap between dose and high-function regions. Conclusion: Our 118-patient retrospective study found that 65% of stage III patients have regionally variant ventilation profiles that are suitable for functional avoidance. 
Our results suggest that regardless of disease stage, it is possible to have unique spatial interplay between dose and high-functional lung, highlighting the importance of evaluating the function of each patient and developing a personalized functional avoidance treatment approach.

  11. SU-F-T-231: Improving the Efficiency of a Radiotherapy Peer-Review System for Quality Assurance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hsu, S; Basavatia, A; Garg, M

    Purpose: To improve the efficiency of a radiotherapy peer-review system using a commercially available software application for plan quality evaluation and documentation. Methods: A commercial application, FullAccess (Radialogica LLC, Version 1.4.4), was implemented in a Citrix platform for the peer-review process and patient documentation. This application can display images, isodose lines, and dose-volume histograms and create plan reports for the peer-review process. Dose metrics in the report can also be benchmarked for plan quality evaluation. Site-specific templates were generated based on departmental treatment planning policies and procedures for each disease site, which generally follow RTOG protocols as well as published prospective clinical trial data, including both conventional fractionation and hypo-fractionation schema. Once a plan is ready for review, the planner exports the plan to FullAccess, applies the site-specific template, and presents the report for plan review. The plan is still reviewed in the treatment planning system, as that is the legal record. Upon the physician’s approval of a plan, the plan is packaged for peer review with the plan report, and dose metrics are saved to the database. Results: The reports show dose metrics of PTVs and critical organs for the plans and also indicate whether or not the metrics are within tolerance. Graphical results with green, yellow, and red lights display whether planning objectives have been met. In addition, benchmarking statistics are collected to see where the current plan falls compared to all historical plans on each metric. All physicians in peer review can easily verify constraints by these reports. 
Conclusion: We have demonstrated the improvement in a radiotherapy peer-review system, which allows physicians to easily verify planning constraints for different disease sites and fractionation schema, allows for standardization in the clinic to ensure that departmental policies are maintained, and builds a comprehensive database for potential clinical outcome evaluation.

  12. Green Chemistry Metrics with Special Reference to Green Analytical Chemistry.

    PubMed

    Tobiszewski, Marek; Marć, Mariusz; Gałuszka, Agnieszka; Namieśnik, Jacek

    2015-06-12

    The concept of green chemistry is widely recognized in chemical laboratories. To properly measure the environmental impact of chemical processes, dedicated assessment tools are required. This paper summarizes the current state of knowledge in the field of development of green chemistry and green analytical chemistry metrics. The diverse methods used for evaluation of the greenness of organic synthesis, such as eco-footprint, E-Factor, EATOS, and Eco-Scale, are described. Both the well-established and recently developed green analytical chemistry metrics, including NEMI labeling and the analytical Eco-scale, are presented. Additionally, this paper focuses on the possibility of using multivariate statistics in evaluation of the environmental impact of analytical procedures. All the above metrics are compared and discussed in terms of their advantages and disadvantages. The current needs and future perspectives in green chemistry metrics are also discussed.
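The E-Factor mentioned above is the simplest of these mass-based greenness measures: the ratio of waste mass to product mass. A minimal sketch, with hypothetical figures not taken from the paper:

```python
def e_factor(total_waste_kg, product_kg):
    """Sheldon's E-Factor: kg of waste generated per kg of product (lower is greener)."""
    return total_waste_kg / product_kg

# A hypothetical fine-chemical synthesis producing 25 kg of waste per 1 kg of product
print(e_factor(25.0, 1.0))  # 25.0
```

Bulk chemicals typically score in the single digits on this metric, while pharmaceutical syntheses can run into the hundreds, which is why it is a common first screen.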

  13. Implementation and Evaluation of Multiple Adaptive Control Technologies for a Generic Transport Aircraft Simulation

    NASA Technical Reports Server (NTRS)

    Campbell, Stefan F.; Kaneshige, John T.; Nguyen, Nhan T.; Krishnakumar, Kalmanje S.

    2010-01-01

    Presented here is the evaluation of multiple adaptive control technologies for a generic transport aircraft simulation. For this study, seven model reference adaptive control (MRAC) based technologies were considered. Each technology was integrated into an identical dynamic-inversion control architecture and tuned using a methodology based on metrics and specific design requirements. Simulation tests were then performed to evaluate each technology's sensitivity to time delay, flight condition, model uncertainty, and artificially induced cross-coupling. The resulting robustness and performance characteristics were used to identify potential strengths, weaknesses, and integration challenges of the individual adaptive control technologies.

  14. Adaptive metric learning with deep neural networks for video-based facial expression recognition

    NASA Astrophysics Data System (ADS)

    Liu, Xiaofeng; Ge, Yubin; Yang, Chao; Jia, Ping

    2018-01-01

    Video-based facial expression recognition has become increasingly important for many real-world applications. Although numerous efforts have been made for single sequences, balancing the complex distribution of intra- and interclass variations between sequences has remained a great difficulty in this area. We propose the adaptive (N+M)-tuplet clusters loss function and optimize it jointly with the softmax loss during the training phase. The variations introduced by personal attributes are alleviated using similarity measurements of multiple samples in the feature space, with far fewer comparisons than conventional deep metric learning approaches, which enables metric calculations for large data applications (e.g., videos). Both the spatial and temporal relations are well explored by a unified framework that consists of an Inception-ResNet network with long short-term memory and a structure of two fully connected layer branches. Our proposed method has been evaluated on three well-known databases, and the experimental results show that it outperforms many state-of-the-art approaches.

  15. Perceptual color difference metric including a CSF based on the perception threshold

    NASA Astrophysics Data System (ADS)

    Rosselli, Vincent; Larabi, Mohamed-Chaker; Fernandez-Maloigne, Christine

    2008-01-01

    The study of the Human Visual System (HVS) is of great interest for quantifying the quality of a picture, predicting which information will be perceived in it, and applying suitably adapted tools. The Contrast Sensitivity Function (CSF) is one of the major ways to integrate HVS properties into an imaging system. It characterizes the sensitivity of the visual system to spatial and temporal frequencies and predicts the behavior of the three channels. Common constructions of the CSF estimate the detection threshold beyond which it is possible to perceive a stimulus. In this work, we developed a novel approach for spatio-chromatic CSF construction based on matching experiments that estimate the perception threshold. It consists of matching the contrast of a test stimulus with that of a reference stimulus. The obtained results differ markedly from the standard approaches, as the chromatic CSFs show band-pass rather than low-pass behavior. The obtained model has been integrated into a perceptual color difference metric inspired by s-CIELAB. The metric is then evaluated with both objective and subjective procedures.

  16. An Investigation of Candidate Sensor-Observable Wake Vortex Strength Parameters for the NASA Aircraft Vortex Spacing System (AVOSS)

    NASA Technical Reports Server (NTRS)

    Tatnall, Christopher R.

    1998-01-01

    The counter-rotating pair of wake vortices shed by flying aircraft can pose a threat to ensuing aircraft, particularly on landing approach. To allow adequate time for the vortices to disperse or decay, landing aircraft are required to maintain certain fixed separation distances. The Aircraft Vortex Spacing System (AVOSS), under development at NASA, is designed to prescribe safe aircraft landing approach separation distances appropriate to the ambient weather conditions. A key component of the AVOSS is a ground sensor that ensures safety by making wake observations to verify predicted behavior. This task requires knowledge of a flowfield strength metric which gauges the severity of disturbance an encountering aircraft could potentially experience. Several proposed strength metric concepts are defined and evaluated for various combinations of metric parameters and sensor line-of-sight elevation angles. Representative populations of generating and following aircraft types are selected, and their associated wake flowfields are modeled using various wake geometry definitions. Strength metric candidates are then rated and compared based on the correspondence of their computed values to associated aircraft response values, using basic statistical analyses.

  17. Multi-version software reliability through fault-avoidance and fault-tolerance

    NASA Technical Reports Server (NTRS)

    Vouk, Mladen A.; Mcallister, David F.

    1989-01-01

    A number of experimental and theoretical issues associated with the practical use of multi-version software to provide run-time tolerance to software faults were investigated. A specialized tool was developed and evaluated for measuring testing coverage for a variety of metrics. The tool was used to collect information on the relationships between software faults and the coverage provided by the testing process as measured by different metrics (including data flow metrics). Considerable correlation was found between the coverage provided by some higher metrics and the elimination of faults in the code. Work continued on back-to-back testing as an efficient mechanism for removal of uncorrelated faults and common-cause faults of variable span. Work also continued on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and the code coverage provided through testing. New fault tolerance models were formulated. Simulation studies of the Acceptance Voting and Multi-stage Voting algorithms were completed, and it was found that these two schemes for software fault tolerance are superior in many respects to some commonly used schemes. Particularly encouraging are the safety properties of the Acceptance testing scheme.

  18. Measures of node centrality in mobile social networks

    NASA Astrophysics Data System (ADS)

    Gao, Zhenxiang; Shi, Yan; Chen, Shanzhi

    2015-02-01

    Mobile social networks exploit human mobility and consequent device-to-device contact to opportunistically create data paths over time. While links in mobile social networks are time-varied and strongly impacted by human mobility, discovering influential nodes is one of the important issues for efficient information propagation in mobile social networks. Although traditional centrality definitions give metrics to identify the nodes with central positions in static binary networks, they cannot effectively identify the influential nodes for information propagation in mobile social networks. In this paper, we address the problems of discovering the influential nodes in mobile social networks. We first use the temporal evolution graph model which can more accurately capture the topology dynamics of the mobile social network over time. Based on the model, we explore human social relations and mobility patterns to redefine three common centrality metrics: degree centrality, closeness centrality and betweenness centrality. We then employ empirical traces to evaluate the benefits of the proposed centrality metrics, and discuss the predictability of nodes' global centrality ranking by nodes' local centrality ranking. Results demonstrate the efficiency of the proposed centrality metrics.
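The three classical definitions that the paper redefines for the temporal setting can be sketched, in their static form, in a few lines of Python. The toy graph and the normalisations below are illustrative of the standard definitions, not the paper's temporal variants:

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of other nodes each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj, v):
    """(n-1) divided by the sum of shortest-path distances from v (connected graph)."""
    dist = {v: 0}
    q = deque([v])
    while q:  # breadth-first search for hop distances
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return (len(adj) - 1) / sum(d for u, d in dist.items() if u != v)

# Toy path graph a - b - c - d
adj = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
print(round(degree_centrality(adj)['b'], 3))  # 0.667
print(closeness_centrality(adj, 'b'))         # 0.75
```

Betweenness centrality (the fraction of shortest paths passing through a node) follows the same pattern but requires path counting; the paper's contribution is replacing these static hop distances with time-respecting paths over the temporal evolution graph.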

  19. Lovelock branes

    NASA Astrophysics Data System (ADS)

    Kastor, David; Ray, Sourya; Traschen, Jennie

    2017-10-01

    We study the problem of finding brane-like solutions to Lovelock gravity, adopting a general approach to establish conditions that a lower dimensional base metric must satisfy in order that a solution to a given Lovelock theory can be constructed in one higher dimension. We find that for Lovelock theories with generic values of the coupling constants, the Lovelock tensors (higher curvature generalizations of the Einstein tensor) of the base metric must all be proportional to the metric. Hence, allowed base metrics form a subclass of Einstein metrics. This subclass includes so-called ‘universal metrics’, which have been previously investigated as solutions to quantum-corrected field equations. For specially tuned values of the Lovelock couplings, we find that the Lovelock tensors of the base metric need to satisfy fewer constraints. For example, for Lovelock theories with a unique vacuum there is only a single such constraint, a case previously identified in the literature, and brane solutions can be straightforwardly constructed.

  20. Performance evaluation of PCA-based spike sorting algorithms.

    PubMed

    Adamos, Dimitrios A; Kosmidis, Efstratios K; Theophilidis, George

    2008-09-01

    Deciphering the electrical activity of individual neurons from multi-unit noisy recordings is critical for understanding complex neural systems. A widely used spike sorting algorithm is evaluated for single-electrode nerve trunk recordings. The algorithm is based on principal component analysis (PCA) for spike feature extraction. In the neuroscience literature it is generally assumed that the use of the first two, or most commonly three, principal components is sufficient. We estimate the optimum PCA-based feature space by evaluating the algorithm's performance on simulated series of action potentials. A number of modifications are made to the open source nev2lkit software to enable systematic investigation of the parameter space. We introduce a new metric to define clustering error, considering over-clustering more favorable than under-clustering, as proposed by experimentalists for our data. Both the program patch and the metric are available online. Correlated and white Gaussian noise processes are superimposed to account for biological and artificial jitter in the recordings. We report that the employment of more than three principal components is in general beneficial for all noise cases considered. Finally, we apply our results to experimental data and verify that the sorting process with four principal components is in agreement with a panel of electrophysiology experts.
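The PCA feature-extraction step at the heart of such spike sorters can be sketched with NumPy. This is a generic illustration of projecting spike waveforms onto their leading principal components, not the nev2lkit implementation; the waveform data here are random placeholders:

```python
import numpy as np

def pca_features(spikes, n_components=3):
    """Project spike waveforms (one per row) onto the top principal components."""
    X = spikes - spikes.mean(axis=0)         # center each time sample
    cov = np.cov(X, rowvar=False)            # covariance across time samples
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_components] # keep the leading components
    return X @ top

rng = np.random.default_rng(0)
spikes = rng.normal(size=(100, 32))          # 100 spikes, 32 samples each
feats = pca_features(spikes, n_components=3)
print(feats.shape)  # (100, 3)
```

The resulting low-dimensional features are then clustered; the article's finding is that stopping at three components, as is conventional, can discard discriminative information.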

  1. Evaluating Descriptive Metrics of the Human Cone Mosaic

    PubMed Central

    Cooper, Robert F.; Wilk, Melissa A.; Tarima, Sergey; Carroll, Joseph

    2016-01-01

    Purpose To evaluate how metrics used to describe the cone mosaic change in response to simulated photoreceptor undersampling (i.e., cell loss or misidentification). Methods Using an adaptive optics ophthalmoscope, we acquired images of the cone mosaic from the center of fixation to 10° along the temporal, superior, inferior, and nasal meridians in 20 healthy subjects. Regions of interest (n = 1780) were extracted at regular intervals along each meridian. Cone mosaic geometry was assessed using a variety of metrics: density, density recovery profile distance (DRPD), nearest neighbor distance (NND), intercell distance (ICD), farthest neighbor distance (FND), percentage of six-sided Voronoi cells, nearest neighbor regularity (NNR), number of neighbors regularity (NoNR), and Voronoi cell area regularity (VCAR). The “performance” of each metric was evaluated by determining the level of simulated loss necessary to obtain 80% statistical power. Results Of the metrics assessed, NND and DRPD were the least sensitive to undersampling, classifying mosaics that lost 50% of their coordinates as indistinguishable from normal. The NoNR was the most sensitive, detecting a significant deviation from normal with only a 10% cell loss. Conclusions The robustness of cone spacing metrics makes them unsuitable for reliably detecting small deviations from normal or for tracking small changes in the mosaic over time. In contrast, regularity metrics are more sensitive to diffuse loss and, therefore, better suited for detecting such changes, provided the fraction of misidentified cells is minimal. Combining metrics with a variety of sensitivities may provide a more complete picture of the integrity of the photoreceptor mosaic. PMID:27273598
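One of the simpler spacing metrics above, nearest neighbor distance (NND), reduces to a pairwise distance computation over the identified cone coordinates. A minimal sketch assuming 2-D coordinates (the toy mosaic is illustrative):

```python
import math

def mean_nnd(points):
    """Mean nearest-neighbor distance over a set of 2-D cone coordinates."""
    nearest = []
    for i, p in enumerate(points):
        # distance from p to its closest other cone
        nearest.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    return sum(nearest) / len(nearest)

# Toy 2x2 square mosaic: every cone's nearest neighbor is one unit away
mosaic = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(mean_nnd(mosaic))  # 1.0
```

The article's point is visible even in this sketch: deleting one coordinate from a regular mosaic barely changes the surviving cones' nearest-neighbor distances, which is why NND is insensitive to undersampling.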

  2. Translating glucose variability metrics into the clinic via Continuous Glucose Monitoring: a Graphical User Interface for Diabetes Evaluation (CGM-GUIDE©).

    PubMed

    Rawlings, Renata A; Shi, Hang; Yuan, Lo-Hua; Brehm, William; Pop-Busui, Rodica; Nelson, Patrick W

    2011-12-01

    Several metrics of glucose variability have been proposed to date, but an integrated approach that provides a complete and consistent assessment of glycemic variation is missing. As a consequence, and because of the tedious coding necessary during quantification, most investigators and clinicians have not yet adopted the use of multiple glucose variability metrics to evaluate glycemic variation. We compiled the most extensively used statistical techniques and glucose variability metrics, with adjustable hyper- and hypoglycemic limits and metric parameters, to create a user-friendly Continuous Glucose Monitoring Graphical User Interface for Diabetes Evaluation (CGM-GUIDE©). In addition, we introduce and demonstrate a novel transition density profile that emphasizes the dynamics of transitions between defined glucose states. Our combined dashboard of numerical statistics and graphical plots supports the task of providing an integrated approach to describing glycemic variability. We integrated existing metrics, such as SD, area under the curve, and mean amplitude of glycemic excursion, with novel metrics such as the slopes across critical transitions and the transition density profile to assess the severity and frequency of glucose transitions per day as they move between critical glycemic zones. By presenting the above-mentioned metrics and graphics in a concise aggregate format, CGM-GUIDE provides an easy-to-use tool to compare quantitative measures of glucose variability. This tool can be used by researchers and clinicians to develop new algorithms of insulin delivery for patients with diabetes and to better explore the link between glucose variability and chronic diabetes complications.
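Two of the simplest variability summaries of this kind can be sketched directly. This is a generic illustration, not CGM-GUIDE code; the readings and the 70 to 180 mg/dL target band are assumed values of the adjustable-limit sort the tool exposes:

```python
import statistics

def glucose_sd(readings):
    """Sample standard deviation of CGM readings, a basic variability metric."""
    return statistics.stdev(readings)

def time_in_range(readings, lo=70, hi=180):
    """Fraction of readings inside the target glycemic band [lo, hi] mg/dL."""
    return sum(lo <= g <= hi for g in readings) / len(readings)

cgm = [95, 110, 150, 210, 180, 65, 120, 140]  # hypothetical mg/dL trace
print(round(glucose_sd(cgm), 1))
print(time_in_range(cgm))  # 0.75
```

Metrics such as mean amplitude of glycemic excursion or the paper's transition density profile build on the same trace but additionally track movement between defined glucose states over time.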

  3. Translating Glucose Variability Metrics into the Clinic via Continuous Glucose Monitoring: A Graphical User Interface for Diabetes Evaluation (CGM-GUIDE©)

    PubMed Central

    Rawlings, Renata A.; Shi, Hang; Yuan, Lo-Hua; Brehm, William; Pop-Busui, Rodica

    2011-01-01

    Background Several metrics of glucose variability have been proposed to date, but an integrated approach that provides a complete and consistent assessment of glycemic variation is missing. As a consequence, and because of the tedious coding necessary during quantification, most investigators and clinicians have not yet adopted the use of multiple glucose variability metrics to evaluate glycemic variation. Methods We compiled the most extensively used statistical techniques and glucose variability metrics, with adjustable hyper- and hypoglycemic limits and metric parameters, to create a user-friendly Continuous Glucose Monitoring Graphical User Interface for Diabetes Evaluation (CGM-GUIDE©). In addition, we introduce and demonstrate a novel transition density profile that emphasizes the dynamics of transitions between defined glucose states. Results Our combined dashboard of numerical statistics and graphical plots supports the task of providing an integrated approach to describing glycemic variability. We integrated existing metrics, such as SD, area under the curve, and mean amplitude of glycemic excursion, with novel metrics such as the slopes across critical transitions and the transition density profile to assess the severity and frequency of glucose transitions per day as they move between critical glycemic zones. Conclusions By presenting the above-mentioned metrics and graphics in a concise aggregate format, CGM-GUIDE provides an easy-to-use tool to compare quantitative measures of glucose variability. This tool can be used by researchers and clinicians to develop new algorithms of insulin delivery for patients with diabetes and to better explore the link between glucose variability and chronic diabetes complications. PMID:21932986

  4. The ranking of scientists based on scientific publications assessment.

    PubMed

    Zerem, Enver

    2017-11-01

    It is generally accepted that the scientific impact factor (Web of Science) and the total number of citations of the articles published in a journal are the most relevant parameters of the journal's significance. However, the significance of scientists is much more complicated to establish, and the value of their scientific production cannot be directly reflected by the importance of the journals in which their articles are published. Evaluating the significance of scientists' accomplishments involves more complicated metrics than just their publication records. Based on long-term academic experience, the author proposes objective criteria to estimate the scientific merit of an individual's publication record. This metric can serve as a pragmatic tool and a nidus for discussion within the readership of this journal. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Measuring the impact of pharmacoepidemiologic research using altmetrics: A case study of a CNODES drug-safety article.

    PubMed

    Gamble, J M; Traynor, Robyn L; Gruzd, Anatoliy; Mai, Philip; Dormuth, Colin R; Sketris, Ingrid S

    2018-03-24

    To provide an overview of altmetrics, including their potential benefits and limitations, how they may be obtained, and their role in assessing pharmacoepidemiologic research impact. Our review was informed by compiling relevant literature identified through searching multiple health research databases (PubMed, Embase, and CINAHL) and grey literature sources (websites, blogs, and reports). We demonstrate how pharmacoepidemiologists, in particular, may use altmetrics to understand scholarly impact and knowledge translation by providing a case study of a drug-safety study conducted by the Canadian Network for Observational Drug Effect Studies. A common approach to measuring research impact is the use of citation-based metrics, such as an article's citation count or a journal's impact factor. "Alternative" metrics, or altmetrics, are increasingly supported as a complementary measure of research uptake in the age of social media. Altmetrics are nontraditional indicators that capture a diverse set of traceable, online research-related artifacts including peer-reviewed publications and other research outputs (software, datasets, blogs, videos, posters, policy documents, presentations, social media posts, wiki entries, etc.). Compared with traditional citation-based metrics, altmetrics take a more holistic view of research impact, attempting to capture the activity and engagement of both scholarly and nonscholarly communities. Despite the limited theoretical underpinnings, possible commercial influence, potential for gaming and manipulation, and numerous data quality-related issues, altmetrics are promising as a supplement to more traditional citation-based metrics because they can ingest and process a larger set of data points related to the flow and reach of scholarly communication from an expanded pool of stakeholders.
Unlike citation-based metrics, altmetrics are not inherently rooted in the research publication process, which includes peer review; it is unclear to what extent they should be used for research evaluation. © 2018 The Authors. Pharmacoepidemiology and Drug Safety. Published by John Wiley & Sons, Ltd.

  6. Quality Measurements in Radiology: A Systematic Review of the Literature and Survey of Radiology Benefit Management Groups.

    PubMed

    Narayan, Anand; Cinelli, Christina; Carrino, John A; Nagy, Paul; Coresh, Josef; Riese, Victoria G; Durand, Daniel J

    2015-11-01

    As the US health care system transitions toward value-based reimbursement, there is an increasing need for metrics to quantify health care quality. Within radiology, many quality metrics are in use, and still more have been proposed, but there have been limited attempts to systematically inventory these measures and classify them using a standard framework. The purpose of this study was to develop an exhaustive inventory of public and private sector imaging quality metrics classified according to the classic Donabedian framework (structure, process, and outcome). A systematic review was performed in which eligibility criteria included published articles (from 2000 onward) from multiple databases. Studies were double-read, with discrepancies resolved by consensus. For the radiology benefit management group (RBM) survey, the six known companies nationally were surveyed. Outcome measures were organized on the basis of standard categories (structure, process, and outcome) and reported using Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The search strategy yielded 1,816 citations; review yielded 110 reports (29 included for final analysis). Three of six RBMs (50%) responded to the survey; the websites of the other RBMs were searched for additional metrics. Seventy-five unique metrics were reported: 35 structure (46%), 20 outcome (27%), and 20 process (27%) metrics. For RBMs, 35 metrics were reported: 27 structure (77%), 4 process (11%), and 4 outcome (11%) metrics. The most commonly cited structure, process, and outcome metrics included ACR accreditation (37%), ACR Appropriateness Criteria (85%), and peer review (95%), respectively. Imaging quality metrics are more likely to be structural (46%) than process (27%) or outcome (27%) based (P < .05). 
As national value-based reimbursement programs increasingly emphasize outcome-based metrics, radiologists must keep pace by developing the data infrastructure required to collect outcome-based quality metrics. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  7. Measuring the Value of Public Health Systems: The Disconnect Between Health Economists and Public Health Practitioners

    PubMed Central

    Jacobson, Peter D.; Palmer, Jennifer A.

    2008-01-01

    We investigated ways of defining and measuring the value of services provided by governmental public health systems. Our data sources included literature syntheses and qualitative interviews of public health professionals. Our examination of the health economic literature revealed growing attempts to measure value of public health services explicitly, but few studies have addressed systems or infrastructure. Interview responses demonstrated no consensus on metrics and no connection to the academic literature. Key challenges for practitioners include developing rigorous, data-driven methods and skilled staff; being politically willing to base allocation decisions on economic evaluation; and developing metrics to capture “intangibles” (e.g., social justice and reassurance value). Academic researchers evaluating the economics of public health investments should increase focus on the working needs of public health professionals. PMID:18923123

  8. Evaluation of metrics for benchmarking antimicrobial use in the UK dairy industry.

    PubMed

    Mills, Harriet L; Turner, Andrea; Morgans, Lisa; Massey, Jonathan; Schubert, Hannah; Rees, Gwen; Barrett, David; Dowsey, Andrew; Reyher, Kristen K

    2018-03-31

    The issue of antimicrobial resistance is of global concern across human and animal health. In 2016, the UK government committed to new targets for reducing antimicrobial use (AMU) in livestock. Although a number of metrics for quantifying AMU are defined in the literature, all give slightly different interpretations. This paper evaluates a selection of metrics for AMU in the dairy industry: total mg, total mg/kg, daily dose and daily course metrics. Although the focus is on their application to the dairy industry, the metrics and issues discussed are relevant across livestock sectors. In order to be used widely, a metric should be understandable and relevant to the veterinarians and farmers who are prescribing and using antimicrobials. This means that clear methods, assumptions (and possible biases), standardised values and exceptions should be published for all metrics. Particularly relevant are assumptions around the number and weight of cattle at risk of treatment and definitions of dose rates and course lengths; incorrect assumptions can mean metrics over-represent or under-represent AMU. The authors recommend that the UK dairy industry work towards the UK-specific metrics using the UK-specific medicine dose and course regimens as well as cattle weights in order to monitor trends nationally. © British Veterinary Association (unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
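A total mg/kg-style metric reduces to simple arithmetic once the population at risk is fixed; a sketch with hypothetical herd figures (the assumed animal numbers and standard weights are exactly the kind of assumption the authors caution can bias such metrics):

```python
def amu_mg_per_kg(total_mg, n_animals, avg_weight_kg):
    """Total antimicrobial use normalised by the liveweight of animals at risk."""
    return total_mg / (n_animals * avg_weight_kg)

# Hypothetical herd: 1.2 kg of active ingredient across 200 cows at 425 kg each
print(round(amu_mg_per_kg(1_200_000, 200, 425), 2))  # 14.12
```

Dose- and course-based metrics replace the denominator's raw liveweight with standardised daily doses or course lengths, which is why published standard values and exceptions matter for comparability.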

  9. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    PubMed

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (e.g., symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal to that of the dedicated text classifier (83.1%) and better than that of the online tool (40.7%). Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
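The recall, precision, and F-measure triplet used to score VaeTM follows the standard definitions and can be sketched directly (the true-positive, false-positive, and false-negative counts below are hypothetical, not figures from the study):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard text-mining metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp)       # fraction of extracted features that are correct
    recall = tp / (tp + fn)          # fraction of gold features that were extracted
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

p, r, f = precision_recall_f1(tp=90, fp=10, fn=20)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.9 0.818 0.857
```

The "lenient" F-measures quoted in the abstract relax the matching criterion (partial overlaps count as true positives) before applying these same formulas.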

  10. Evaluating the application of multipollutant exposure metrics in air pollution health studies

    EPA Science Inventory

    Background: Health effects associated with air pollution are typically evaluated using a single-pollutant approach, yet people are exposed to mixtures consisting of multiple pollutants that may have independent or combined effects on human health. Development of metrics that re...

  11. Stochastic HKMDHE: A multi-objective contrast enhancement algorithm

    NASA Astrophysics Data System (ADS)

    Pratiher, Sawon; Mukhopadhyay, Sabyasachi; Maity, Srideep; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.

    2018-02-01

    This contribution proposes a novel extension of the existing `Hyper Kurtosis based Modified Duo-Histogram Equalization' (HKMDHE) algorithm, for multi-objective contrast enhancement of biomedical images. A novel modified objective function has been formulated by joint optimization of the individual histogram equalization objectives. The optimal adequacy of the proposed methodology with respect to image quality metrics such as brightness preserving abilities, peak signal-to-noise ratio (PSNR), Structural Similarity Index (SSIM) and universal image quality metric has been experimentally validated. The performance analysis of the proposed Stochastic HKMDHE with existing histogram equalization methodologies like Global Histogram Equalization (GHE) and Contrast Limited Adaptive Histogram Equalization (CLAHE) has been given for comparative evaluation.
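Of the quality metrics listed, PSNR is the most direct to compute; a minimal NumPy sketch (the toy images are illustrative placeholders):

```python
import numpy as np

def psnr(original, enhanced, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images (higher is better)."""
    mse = np.mean((original.astype(float) - enhanced.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

a = np.full((8, 8), 100, dtype=np.uint8)  # flat toy "image"
b = a.copy()
b[0, 0] = 110                             # one pixel perturbed by 10 levels
print(round(psnr(a, b), 2))  # 46.19
```

SSIM and brightness-preservation measures are more involved, comparing local structure and mean intensity rather than raw pixel error, which is why contrast-enhancement papers typically report several of these metrics together.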

  12. Performance evaluation of a distance learning program.

    PubMed

    Dailey, D J; Eno, K R; Brinkley, J F

    1994-01-01

    This paper presents a performance metric which uses a single number to characterize the response time for a non-deterministic client-server application operating over the Internet. When applied to a Macintosh-based distance learning application called the Digital Anatomist Browser, the metric allowed us to observe that "A typical student doing a typical mix of Browser commands on a typical data set will experience the same delay if they use a slow Macintosh on a local network or a fast Macintosh on the other side of the country accessing the data over the Internet." The methodology presented is applicable to other client-server applications that are rapidly appearing on the Internet.

  13. The model for Fundamentals of Endovascular Surgery (FEVS) successfully defines the competent endovascular surgeon.

    PubMed

    Duran, Cassidy; Estrada, Sean; O'Malley, Marcia; Sheahan, Malachi G; Shames, Murray L; Lee, Jason T; Bismuth, Jean

    2015-12-01

    Fundamental skills testing is now required for certification in general surgery. No model for assessing fundamental endovascular skills exists. Our objective was to develop a model that tests the fundamental endovascular skills and differentiates competent from noncompetent performance. The Fundamentals of Endovascular Surgery model was developed in silicone and virtual-reality versions. Twenty individuals (with a range of experience) performed four tasks on each model in three separate sessions. Tasks on the silicone model were performed under fluoroscopic guidance, and electromagnetic tracking captured motion metrics for catheter tip position. Image processing captured tool tip position and motion on the virtual model. Performance was evaluated using a global rating scale, blinded video assessment of error metrics, and catheter tip movement and position. Motion analysis was based on derivations of speed and position that define proficiency of movement (spectral arc length, duration of submovements, and number of submovements). Performance was significantly different between competent and noncompetent interventionalists for the three performance measures of motion metrics, error metrics, and global rating scale. The mean error metric score was 6.83 for noncompetent individuals and 2.51 for the competent group (P < .0001). Median global rating scores were 2.25 for the noncompetent group and 4.75 for the competent users (P < .0001). The Fundamentals of Endovascular Surgery model successfully differentiates competent and noncompetent performance of fundamental endovascular skills based on a series of objective performance measures. This model could serve as a platform for skills testing for all trainees. Copyright © 2015 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.

  14. Driver Injury Risk Variability in Finite Element Reconstructions of Crash Injury Research and Engineering Network (CIREN) Frontal Motor Vehicle Crashes.

    PubMed

    Gaewsky, James P; Weaver, Ashley A; Koya, Bharath; Stitzel, Joel D

    2015-01-01

    A 3-phase real-world motor vehicle crash (MVC) reconstruction method was developed to analyze injury variability as a function of precrash occupant position for 2 full-frontal Crash Injury Research and Engineering Network (CIREN) cases. Phase I: A finite element (FE) simplified vehicle model (SVM) was developed and tuned to mimic the frontal crash characteristics of the CIREN case vehicle (Camry or Cobalt) using frontal New Car Assessment Program (NCAP) crash test data. Phase II: The Toyota HUman Model for Safety (THUMS) v4.01 was positioned in 120 precrash configurations per case within the SVM. Five occupant positioning variables were varied using a Latin hypercube design of experiments: seat track position, seat back angle, D-ring height, steering column angle, and steering column telescoping position. An additional baseline simulation was performed that aimed to match the precrash occupant position documented in CIREN for each case. Phase III: FE simulations were then performed using kinematic boundary conditions from each vehicle's event data recorder (EDR). HIC15, combined thoracic index (CTI), femur forces, and strain-based injury metrics in the lung and lumbar vertebrae were evaluated to predict injury. Tuning the SVM to specific vehicle models resulted in close matches between simulated and test injury metric data, allowing the tuned SVM to be used in each case reconstruction with EDR-derived boundary conditions. Simulations with the most rearward seats and reclined seat backs had the greatest HIC15, head injury risk, CTI, and chest injury risk. Calculated injury risks for the head, chest, and femur closely correlated to the CIREN occupant injury patterns. CTI in the Camry case yielded a 54% probability of Abbreviated Injury Scale (AIS) 2+ chest injury in the baseline case simulation and ranged from 34 to 88% (mean = 61%) risk in the least and most dangerous occupant positions. 
The greater than 50% probability was consistent with the case occupant's AIS 2 hemomediastinum. Stress-based metrics were used to predict injury to the lower leg of the Camry case occupant. The regional-level injury metrics evaluated for the Cobalt case occupant indicated a low risk of injury; however, strain-based injury metrics better predicted pulmonary contusion. Approximately 49% of the Cobalt occupant's left lung was contused, though the baseline simulation predicted 40.5% of the lung to be injured. A method to compute injury metrics and risks as functions of precrash occupant position was developed and applied to 2 CIREN MVC FE reconstructions. The reconstruction process allows for quantification of the sensitivity and uncertainty of the injury risk predictions based on occupant position to further understand important factors that lead to more severe MVC injuries.
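The five-variable, 120-configuration sweep described above is a standard Latin hypercube design. A minimal sketch of how such a design can be generated (variable names, units, and ranges below are invented for illustration, not taken from the CIREN cases):

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng=None):
    """Latin hypercube sample: exactly one point per stratum per variable."""
    rng = np.random.default_rng(rng)
    n_vars = len(bounds)
    # Stratify [0, 1) into n_samples equal bins and jitter within each bin.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_vars))) / n_samples
    # Shuffle strata independently per variable so rows are decorrelated.
    for j in range(n_vars):
        rng.shuffle(u[:, j])
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    return lo + u * (hi - lo)

# Hypothetical ranges for the five occupant-positioning variables.
bounds = [(0.0, 240.0),   # seat track position, mm
          (15.0, 35.0),   # seat back angle, deg
          (0.0, 80.0),    # D-ring height, mm
          (20.0, 40.0),   # steering column angle, deg
          (0.0, 50.0)]    # steering column telescoping position, mm
X = latin_hypercube(120, bounds, rng=0)
print(X.shape)  # (120, 5): one precrash configuration per row
```

Each row is then one precrash THUMS positioning to simulate.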

  15. Development of an Objective Space Suit Mobility Performance Metric Using Metabolic Cost and Functional Tasks

    NASA Technical Reports Server (NTRS)

    McFarland, Shane M.; Norcross, Jason

    2016-01-01

    Existing methods for evaluating EVA suit performance and mobility have historically concentrated on isolated joint range of motion and torque. However, these techniques do little to evaluate how well a suited crewmember can actually perform during an EVA. An alternative method of characterizing suited mobility through measurement of metabolic cost to the wearer has been evaluated at Johnson Space Center over the past several years. The most recent study involved six test subjects completing multiple trials of various functional tasks in each of three different space suits; the results indicated it was often possible to discern between different suit designs on the basis of metabolic cost alone. However, other variables may have an effect on real-world suited performance; namely, completion time of the task, the gravity field in which the task is completed, etc. While previous results have analyzed completion time, metabolic cost, and metabolic cost normalized to system mass individually, it is desirable to develop a single metric comprising these (and potentially other) performance metrics. This paper outlines the background upon which this single-score metric is determined to be feasible, and initial efforts to develop such a metric. Forward work includes variable coefficient determination and verification of the metric through repeated testing.
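One plausible shape for such a single-score metric is a weighted sum of standardized component metrics; the abstract leaves coefficient determination as forward work, so the weights and suit data below are purely illustrative:

```python
import numpy as np

def composite_score(metrics, weights):
    """Combine heterogeneous performance metrics into one score.

    Each column is standardized (z-score) so units drop out, then a
    weighted sum is returned; lower is better for every input here.
    """
    m = np.asarray(metrics, dtype=float)
    z = (m - m.mean(axis=0)) / m.std(axis=0)
    return z @ np.asarray(weights)

# Hypothetical per-suit measurements: metabolic cost (W/kg), task
# completion time (s), metabolic cost normalized to system mass.
suits = np.array([
    [4.1, 310.0, 0.031],   # suit A
    [5.0, 355.0, 0.040],   # suit B
    [3.6, 290.0, 0.027],   # suit C
])
scores = composite_score(suits, weights=[0.5, 0.3, 0.2])
print(scores.argmin())  # -> 2: suit C performs best on all components
```

Standardizing first keeps any one metric's units from dominating; the open question, as the abstract notes, is what the weights should be.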

  16. Feasibility of Turing-Style Tests for Autonomous Aerial Vehicle "Intelligence"

    NASA Technical Reports Server (NTRS)

    Young, Larry A.

    2007-01-01

    A new approach is suggested to define and evaluate key metrics as to autonomous aerial vehicle performance. This approach entails the conceptual definition of a "Turing Test" for UAVs. Such a "UAV Turing test" would be conducted by means of mission simulations and/or tailored flight demonstrations of vehicles under the guidance of their autonomous system software. These autonomous vehicle mission simulations and flight demonstrations would also have to be benchmarked against missions "flown" with pilots/human-operators in the loop. In turn, scoring criteria for such testing could be based both upon quantitative mission success metrics (unique to each mission) and upon analog "handling quality" metrics similar to the well-known Cooper-Harper pilot ratings used for manned aircraft. Autonomous aerial vehicles would be considered to have successfully passed this "UAV Turing Test" if the aggregate mission success metrics and handling qualities for the autonomous aerial vehicle matched or exceeded the equivalent metrics for missions conducted with pilots/human-operators in the loop. Alternatively, an independent, knowledgeable observer could provide the "UAV Turing Test" ratings of whether a vehicle is autonomous or "piloted." This observer ideally would, in the more sophisticated mission simulations, also have the enhanced capability of being able to override the scripted mission scenario and instigate failure modes and change of flight profile/plans. If a majority of mission tasks are rated as "piloted" by the observer, when in reality the vehicle/simulation is fully or semi-autonomously controlled, then the vehicle/simulation "passes" the "UAV Turing Test." In this regard, this second "UAV Turing Test" approach is more consistent with Turing's original "imitation game" proposal. 
The overall feasibility, and important considerations and limitations, of such an approach for judging/evaluating autonomous aerial vehicle "intelligence" will be discussed from a theoretical perspective.

  17. Changing to the Metric System.

    ERIC Educational Resources Information Center

    Chambers, Donald L.; Dowling, Kenneth W.

    This report examines educational aspects of the conversion to the metric system of measurement in the United States. Statements of positions on metrication and basic mathematical skills are given from various groups. Base units, symbols, prefixes, and style of the metric system are outlined. Guidelines for teaching metric concepts are given,…

  18. Designing a Robust Micromixer Based on Fluid Stretching

    NASA Astrophysics Data System (ADS)

    Mott, David; Gautam, Dipesh; Voth, Greg; Oran, Elaine

    2010-11-01

    A metric for measuring fluid stretching based on finite-time Lyapunov exponents is described, and the use of this metric for optimizing mixing in microfluidic components is explored. The metric is implemented within an automated design approach called the Computational Toolbox (CTB). The CTB designs components by adding geometric features, such as grooves of various shapes, to a microchannel. The transport produced by each of these features in isolation was pre-computed and stored as an "advection map" for that feature, and the flow through a composite geometry that combines these features is calculated rapidly by applying the corresponding maps in sequence. A genetic algorithm search then chooses the feature combination that optimizes a user-specified metric. Metrics based on the variance of concentration generally require the user to specify the fluid distributions at inflow, which leads to different mixer designs for different inflow arrangements. The stretching metric is independent of the fluid arrangement at inflow. Mixers designed using the stretching metric are compared to those designed using a variance of concentration metric and show excellent performance across a variety of inflow distributions and diffusivities.
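The map-composition idea can be sketched in one dimension: each feature's transport is stored as a precomputed displacement field ("advection map"), flow through a feature sequence is the composition of those maps, and stretching is estimated from the separation growth of initially close tracer pairs. Grid size and map shapes below are invented for illustration:

```python
import numpy as np

# Cross-section is the periodic interval [0, 1), discretized on a grid.
N = 64
x = np.linspace(0.0, 1.0, N, endpoint=False)

def make_map(shift):
    """Advection map for one toy feature: a smooth periodic shear."""
    return shift * np.sin(2 * np.pi * x)   # displacement field on the grid

def apply_maps(p, maps):
    """Flow through a feature sequence = compose the stored maps."""
    for disp in maps:
        p = (p + np.interp(p, x, disp, period=1.0)) % 1.0
    return p

def mean_stretch(maps, eps=1e-5):
    """Finite-difference stretching metric: mean log separation growth of
    tracer pairs, the ingredient of a finite-time Lyapunov exponent."""
    hi = apply_maps((x + eps) % 1.0, maps)
    lo = apply_maps(x.copy(), maps)
    d = np.abs(hi - lo)
    d = np.minimum(d, 1.0 - d)               # periodic distance
    return float(np.log(np.maximum(d, 1e-300) / eps).mean())

grooves = [make_map(0.2), make_map(-0.3), make_map(0.25)]
print(mean_stretch(grooves))                 # larger = more stretching
```

A genetic algorithm would then search over feature sequences to maximize this score, which needs no assumed inflow distribution.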

  19. Developing a Metric for the Cost of Green House Gas Abatement

    DOT National Transportation Integrated Search

    2017-02-28

    The authors introduce the levelized cost of carbon (LCC), a metric that can be used to evaluate MassDOT CO2 abatement projects in terms of their cost-effectiveness. The study presents ways in which the metric can be used to rank projects. The data ar...

  20. Interactive 3D segmentation of the prostate in magnetic resonance images using shape and local appearance similarity analysis

    NASA Astrophysics Data System (ADS)

    Shahedi, Maysam; Fenster, Aaron; Cool, Derek W.; Romagnoli, Cesare; Ward, Aaron D.

    2013-03-01

    3D segmentation of the prostate in medical images is useful to prostate cancer diagnosis and therapy guidance, but is time-consuming to perform manually. Clinical translation of computer-assisted segmentation algorithms for this purpose requires a comprehensive and complementary set of evaluation metrics that are informative to the clinical end user. We have developed an interactive 3D prostate segmentation method for 1.5T and 3.0T T2-weighted magnetic resonance imaging (T2W MRI) acquired using an endorectal coil. We evaluated our method against manual segmentations of 36 3D images using complementary boundary-based (mean absolute distance; MAD), regional overlap (Dice similarity coefficient; DSC) and volume difference (ΔV) metrics. Our technique is based on inter-subject prostate shape and local boundary appearance similarity. In the training phase, we calculated a point distribution model (PDM) and a set of local mean intensity patches centered on the prostate border to capture shape and appearance variability. To segment an unseen image, we defined a set of rays - one corresponding to each of the mean intensity patches computed in training - emanating from the prostate centre. We used a radial-based search strategy and translated each mean intensity patch along its corresponding ray, selecting as a candidate the boundary point with the highest normalized cross correlation along each ray. These boundary points were then regularized using the PDM. For the whole gland, we measured a mean ± std MAD of 2.5 ± 0.7 mm, DSC of 80 ± 4%, and ΔV of 1.1 ± 8.8 cc. We also provided an anatomic breakdown of these metrics within the prostatic base, mid-gland, and apex.
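Per ray, the search step reduces to sliding a learned mean intensity patch along the ray's intensity profile and keeping the position of maximum normalized cross correlation. A 1D sketch with a synthetic step-edge profile (all values invented):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-length patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_boundary(profile, template):
    """Slide the mean intensity patch along one ray's intensity profile;
    return the center index with the highest NCC (candidate boundary)."""
    k = len(template)
    scores = [ncc(profile[i:i + k], template)
              for i in range(len(profile) - k + 1)]
    return int(np.argmax(scores)) + k // 2

# Synthetic ray: darker gland interior, brighter exterior, edge at index 30.
profile = np.concatenate([np.full(30, 0.2), np.full(30, 0.9)])
profile += np.random.default_rng(0).normal(0.0, 0.02, 60)
template = np.concatenate([np.full(5, 0.2), np.full(5, 0.9)])  # "learned" patch
print(best_boundary(profile, template))  # near index 30
```

In the actual method these per-ray candidates are then regularized jointly by the PDM rather than accepted independently.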

  1. Towards a Framework for Evaluating and Comparing Diagnosis Algorithms

    NASA Technical Reports Server (NTRS)

    Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia,David; Kuhn, Lukas; deKleer, Johan; vanGemund, Arjan; Feldman, Alexander

    2009-01-01

    Diagnostic inference involves the detection of anomalous system behavior and the identification of its cause, possibly down to a failed unit or to a parameter of a failed unit. Traditional approaches to solving this problem include expert/rule-based, model-based, and data-driven methods. Each approach (and various techniques within each approach) uses different representations of the knowledge required to perform the diagnosis. The sensor data is expected to be combined with these internal representations to produce the diagnosis result. In spite of the availability of various diagnosis technologies, there have been only minimal efforts to develop a standardized software framework to run, evaluate, and compare different diagnosis technologies on the same system. This paper presents a framework that defines a standardized representation of the system knowledge, the sensor data, and the form of the diagnosis results and provides a run-time architecture that can execute diagnosis algorithms, send sensor data to the algorithms at appropriate time steps from a variety of sources (including the actual physical system), and collect resulting diagnoses. We also define a set of metrics that can be used to evaluate and compare the performance of the algorithms, and provide software to calculate the metrics.

  2. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics

    DOE PAGES

    Webb-Robertson, Bobbie-Jo M.; Wiberg, Holli K.; Matzke, Melissa M.; ...

    2015-04-09

    In this review, we apply selected imputation strategies to label-free liquid chromatography–mass spectrometry (LC–MS) proteomics datasets to evaluate the accuracy with respect to metrics of variance and classification. We evaluate several commonly used imputation approaches for individual merits and discuss the caveats of each approach with respect to the example LC–MS proteomics data. In general, local similarity-based approaches, such as the regularized expectation maximization and least-squares adaptive algorithms, yield the best overall performances with respect to metrics of accuracy and robustness. However, no single algorithm consistently outperforms the remaining approaches, and in some cases performing classification without imputation yielded the most accurate classification. Thus, because of the complex mechanisms of missing data in proteomics, which also vary from peptide to protein, no individual method is a single solution for imputation. In summary, on the basis of the observations in this review, the goal for imputation in the field of computational proteomics should be to develop new approaches that work generically for this data type and new strategies to guide users in the selection of the best imputation for their dataset and analysis objectives.
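A toy version of such an evaluation: censor low abundances from a known matrix, impute with two common simple strategies (global column mean and half-minimum, both cruder than the local similarity-based methods the review favors), and score each by RMSE against the withheld truth:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy abundance matrix (samples x peptides), with values missing
# not-at-random near a detection limit, as is typical for LC-MS data.
X_true = rng.normal(20.0, 2.0, size=(40, 30))
X = X_true.copy()
X[X < 18.5] = np.nan                       # censor low abundances

def impute_global_mean(M):
    """Replace each NaN with its column (peptide) mean."""
    out = M.copy()
    col_mean = np.nanmean(out, axis=0)
    r, c = np.where(np.isnan(out))
    out[r, c] = col_mean[c]
    return out

def impute_half_min(M):
    """Left-censoring heuristic: replace NaN with half the column minimum."""
    out = M.copy()
    col_min = np.nanmin(out, axis=0)
    r, c = np.where(np.isnan(out))
    out[r, c] = col_min[c] / 2.0
    return out

def rmse(est):
    mask = np.isnan(X)
    return float(np.sqrt(np.mean((est[mask] - X_true[mask]) ** 2)))

print(rmse(impute_global_mean(X)), rmse(impute_half_min(X)))
```

Which strategy wins depends entirely on the missingness mechanism, which is the review's central point: no single imputation is a universal solution.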

  3. The LSST metrics analysis framework (MAF)

    NASA Astrophysics Data System (ADS)

    Jones, R. L.; Yoachim, Peter; Chandrasekharan, Srinivasan; Connolly, Andrew J.; Cook, Kem H.; Ivezic, Željko; Krughoff, K. S.; Petry, Catherine; Ridgway, Stephen T.

    2014-07-01

    We describe the Metrics Analysis Framework (MAF), an open-source Python framework developed to provide a user-friendly, customizable, easily extensible set of tools for analyzing data sets. MAF is part of the Large Synoptic Survey Telescope (LSST) Simulations effort. Its initial goal is to provide a tool to evaluate LSST Operations Simulation (OpSim) simulated surveys to help understand the effects of telescope scheduling on survey performance; however, MAF can be applied to a much wider range of datasets. The building blocks of the framework are Metrics (algorithms to analyze a given quantity of data), Slicers (subdividing the overall data set into smaller data slices as relevant for each Metric), and Database classes (to access the dataset and read data into memory). We describe how these building blocks work together, and provide an example of using MAF to evaluate different dithering strategies. We also outline how users can write their own custom Metrics and use these within the framework.
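The Metric/Slicer decomposition can be illustrated with a minimal sketch; the class names and the fake visit table below are illustrative only, not the actual MAF API:

```python
import numpy as np

class MeanAirmassMetric:
    """Metric: reduce one slice of visits to a single number."""
    def run(self, data_slice):
        return float(np.mean(data_slice["airmass"]))

class FilterSlicer:
    """Slicer: subdivide the survey data into per-filter slices."""
    def slice(self, data):
        for f in np.unique(data["filter"]):
            yield f, data[data["filter"] == f]

# Fake OpSim-like visit table (structured array standing in for a Database).
data = np.array([(1.2, "r"), (1.5, "r"), (1.1, "g"), (1.3, "g")],
                dtype=[("airmass", "f8"), ("filter", "U1")])

metric, slicer = MeanAirmassMetric(), FilterSlicer()
results = {key: metric.run(s) for key, s in slicer.slice(data)}
print(results)  # mean airmass per filter
```

Because any Metric can be paired with any Slicer, a new analysis is usually just one small new class rather than a new pipeline.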

  4. 1984–2010 trends in fire burn severity and area for the conterminous US

    USGS Publications Warehouse

    Picotte, Joshua J.; Peterson, Birgit E.; Meier, Gretchen; Howard, Stephen M.

    2016-01-01

    Burn severity products created by the Monitoring Trends in Burn Severity (MTBS) project were used to analyse historical trends in burn severity. Using a severity metric calculated by modelling the cumulative distribution of differenced Normalized Burn Ratio (dNBR) and Relativized dNBR (RdNBR) data, we examined burn area and burn severity of 4893 historical fires (1984–2010) distributed across the conterminous US (CONUS) and mapped by MTBS. Yearly mean burn severity values (weighted by area), maximum burn severity metric values, mean area of burn, maximum burn area and total burn area were evaluated within 27 US National Vegetation Classification macrogroups. Time series assessments of burned area and severity were performed using Mann–Kendall tests. Burned area and severity varied by vegetation classification, but most vegetation groups showed no detectable change during the 1984–2010 period. Of the 27 analysed vegetation groups, trend analysis revealed that burned area increased in eight and burn severity increased in seven. This study suggests that burned area and severity, as measured by the severity metric based on dNBR or RdNBR, have not changed substantially for most vegetation groups evaluated within CONUS.
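The Mann–Kendall test used for the time series assessments is straightforward to sketch (no-ties variance formula; the synthetic burned-area series is invented for illustration):

```python
import numpy as np

def mann_kendall(y):
    """Mann-Kendall trend test: S statistic and normal-approximation Z.

    |Z| > 1.96 indicates a monotonic trend at the 5% level. This sketch
    uses the no-ties variance formula only.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    s = 0.0
    for i in range(n - 1):
        s += np.sign(y[i + 1:] - y[i]).sum()   # concordant minus discordant pairs
    var = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / np.sqrt(var)
    elif s < 0:
        z = (s + 1) / np.sqrt(var)
    else:
        z = 0.0
    return float(s), float(z)

# Synthetic yearly burned area, 1984-2010, with an upward drift plus noise.
rng = np.random.default_rng(2)
years = np.arange(1984, 2011)
burned = 100 + 3.0 * (years - 1984) + rng.normal(0, 10, len(years))
s, z = mann_kendall(burned)
print(s, z > 1.96)  # significant increasing trend
```

Being rank-based, the test is insensitive to outlier fire years, which is why it suits these skewed burned-area series.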

  5. Local-metrics error-based Shepard interpolation as surrogate for highly non-linear material models in high dimensions

    NASA Astrophysics Data System (ADS)

    Lorenzi, Juan M.; Stecher, Thomas; Reuter, Karsten; Matera, Sebastian

    2017-10-01

    Many problems in computational materials science and chemistry require the evaluation of expensive functions with locally rapid changes, such as the turn-over frequency of first principles kinetic Monte Carlo models for heterogeneous catalysis. Because of the high computational cost, it is often desirable to replace the original with a surrogate model, e.g., for use in coupled multiscale simulations. The construction of surrogates becomes particularly challenging in high-dimensions. Here, we present a novel version of the modified Shepard interpolation method which can overcome the curse of dimensionality for such functions to give faithful reconstructions even from very modest numbers of function evaluations. The introduction of local metrics allows us to take advantage of the fact that, on a local scale, rapid variation often occurs only across a small number of directions. Furthermore, we use local error estimates to weigh different local approximations, which helps avoid artificial oscillations. Finally, we test our approach on a number of challenging analytic functions as well as a realistic kinetic Monte Carlo model. Our method not only outperforms existing isotropic metric Shepard methods but also state-of-the-art Gaussian process regression.
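For reference, the classic isotropic-metric Shepard (inverse-distance-weighted) interpolation that the local-metrics variant improves on can be sketched as follows, with a cheap analytic function standing in for an expensive kinetic Monte Carlo evaluation:

```python
import numpy as np

def shepard(x_query, X, f, p=4, eps=1e-12):
    """Inverse-distance-weighted (Shepard) interpolation with a single
    isotropic metric. X: (n, d) nodes, f: (n,) function values."""
    d = np.linalg.norm(X - x_query, axis=1)
    if d.min() < eps:                      # query coincides with a node
        return f[int(d.argmin())]
    w = 1.0 / d ** p                       # weights decay with distance
    return float(w @ f / w.sum())

rng = np.random.default_rng(3)
X = rng.random((200, 2))                   # sampled inputs in [0, 1)^2
f = np.sin(3 * X[:, 0]) * np.cos(3 * X[:, 1])   # stand-in for an expensive
                                                # turn-over-frequency model
q = np.array([0.4, 0.6])
est = shepard(q, X, f)
true = np.sin(3 * 0.4) * np.cos(3 * 0.6)
print(abs(est - true))                     # small interpolation error
```

The paper's modification replaces the single Euclidean distance here with per-node local metrics and error-weighted local approximations, which is what lets it cope with anisotropic, locally rapid variation in high dimensions.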

  6. Local-metrics error-based Shepard interpolation as surrogate for highly non-linear material models in high dimensions.

    PubMed

    Lorenzi, Juan M; Stecher, Thomas; Reuter, Karsten; Matera, Sebastian

    2017-10-28

    Many problems in computational materials science and chemistry require the evaluation of expensive functions with locally rapid changes, such as the turn-over frequency of first principles kinetic Monte Carlo models for heterogeneous catalysis. Because of the high computational cost, it is often desirable to replace the original with a surrogate model, e.g., for use in coupled multiscale simulations. The construction of surrogates becomes particularly challenging in high-dimensions. Here, we present a novel version of the modified Shepard interpolation method which can overcome the curse of dimensionality for such functions to give faithful reconstructions even from very modest numbers of function evaluations. The introduction of local metrics allows us to take advantage of the fact that, on a local scale, rapid variation often occurs only across a small number of directions. Furthermore, we use local error estimates to weigh different local approximations, which helps avoid artificial oscillations. Finally, we test our approach on a number of challenging analytic functions as well as a realistic kinetic Monte Carlo model. Our method not only outperforms existing isotropic metric Shepard methods but also state-of-the-art Gaussian process regression.

  7. Authorship attribution of source code by using back propagation neural network based on particle swarm optimization

    PubMed Central

    Xu, Guoai; Li, Qi; Guo, Yanhui; Zhang, Miao

    2017-01-01

    Authorship attribution is the task of identifying the most likely author of a given sample among a set of known candidate authors. It can be applied not only to discover the original author of plain text, such as novels, blogs, emails, posts etc., but also to identify source code programmers. Authorship attribution of source code is required in diverse applications, ranging from malicious code tracking to authorship disputes and software plagiarism detection. This paper aims to propose a new method to identify the programmer of Java source code samples with higher accuracy. To this end, it first introduces a back propagation (BP) neural network based on particle swarm optimization (PSO) into authorship attribution of source code. It begins by computing a set of defined feature metrics, including lexical and layout metrics and structure and syntax metrics, totaling 19 dimensions. These metrics are then input to the neural network for supervised learning, whose weights are produced by the hybrid PSO-BP algorithm. The effectiveness of the proposed method is evaluated on a collected dataset of 3,022 Java files belonging to 40 authors. Experimental results show that the proposed method achieves 91.060% accuracy. A comparison with previous work on authorship attribution of source code for Java illustrates that the proposed method outperforms the others overall, also with an acceptable overhead. PMID:29095934

  8. Developing a Security Metrics Scorecard for Healthcare Organizations.

    PubMed

    Elrefaey, Heba; Borycki, Elizabeth; Kushniruk, Andrea

    2015-01-01

    In healthcare, information security is a key aspect of protecting a patient's privacy and ensuring systems availability to support patient care. Security managers need to measure the performance of security systems and this can be achieved by using evidence-based metrics. In this paper, we describe the development of an evidence-based security metrics scorecard specific to healthcare organizations. Study participants were asked to comment on the usability and usefulness of a prototype of a security metrics scorecard that was developed based on current research in the area of general security metrics. Study findings revealed that scorecards need to be customized for the healthcare setting in order for the security information to be useful and usable in healthcare organizations. The study findings resulted in the development of a security metrics scorecard that matches the healthcare security experts' information requirements.

  9. Development of retrospective quantitative and qualitative job-exposure matrices for exposures at a beryllium processing facility.

    PubMed

    Couch, James R; Petersen, Martin; Rice, Carol; Schubauer-Berigan, Mary K

    2011-05-01

    To construct a job-exposure matrix (JEM) for an Ohio beryllium processing facility between 1953 and 2006 and to evaluate temporal changes in airborne beryllium exposures. Quantitative area- and breathing-zone-based exposure measurements of airborne beryllium were made between 1953 and 2006 and used by plant personnel to estimate daily weighted average (DWA) exposure concentrations for sampled departments and operations. These DWA measurements were used to create a JEM with 18 exposure metrics, which was linked to the plant cohort consisting of 18,568 unique job, department and year combinations. The exposure metrics ranged from quantitative metrics (annual arithmetic/geometric average DWA exposures, maximum DWA and peak exposures) to descriptive qualitative metrics (chemical beryllium species and physical form) to qualitative assignment of exposure to other risk factors (yes/no). Twelve collapsed job titles with long-term consistent industrial hygiene samples were evaluated using regression analysis for time trends in DWA estimates. Annual arithmetic mean DWA estimates (overall plant-wide exposures including administration, non-production, and production estimates) for the data by decade ranged from a high of 1.39 μg/m(3) in the 1950s to a low of 0.33 μg/m(3) in the 2000s. Of the 12 jobs evaluated for temporal trend, the average arithmetic DWA mean was 2.46 μg/m(3) and the average geometric mean DWA was 1.53 μg/m(3). After the DWA calculations were log-transformed, 11 of the 12 had a statistically significant (p < 0.05) decrease in reported exposure over time. The constructed JEM successfully differentiated beryllium exposures across jobs and over time. This is the only quantitative JEM containing exposure estimates (average and peak) for the entire plant history.
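The temporal-trend check amounts to regressing log-transformed annual DWA estimates on calendar year; a significantly negative slope means exposures declined. A sketch with invented exposure values for one job title:

```python
import numpy as np

# Invented annual arithmetic-mean DWA estimates (ug/m^3) for one collapsed
# job title, 1953-2006: an exponential decline with lognormal scatter.
years = np.arange(1953, 2007)
rng = np.random.default_rng(4)
dwa = 2.5 * np.exp(-0.03 * (years - 1953)) * rng.lognormal(0.0, 0.2, len(years))

# Log-transform, then fit a linear trend (as in the paper's regression step).
slope, intercept = np.polyfit(years - 1953, np.log(dwa), 1)
halving_time = np.log(2) / -slope          # years for exposure to halve
print(round(slope, 3), round(halving_time, 1))
```

On the log scale the slope is a constant relative decline per year, which is the natural summary for exposures spanning an order of magnitude over five decades.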

  10. Perception of the importance of chemistry research papers and comparison to citation rates.

    PubMed

    Borchardt, Rachel; Moran, Cullen; Cantrill, Stuart; Chemjobber; Oh, See Arr; Hartings, Matthew R

    2018-01-01

    Chemistry researchers are frequently evaluated on the perceived significance of their work, with the citation count as the most commonly used metric for gauging this property. Recent studies have called for a broader evaluation of significance that includes more nuanced bibliometrics as well as altmetrics to more completely evaluate scientific research. To better understand the relationship between metrics and peer judgements of significance in chemistry, we have conducted a survey of chemists to investigate their perceptions of previously published research. Focusing on a specific issue of the Journal of the American Chemical Society published in 2003, respondents were asked to select which articles they thought best matched importance and significance given several contexts: highest number of citations, most significant (subjectively defined), most likely to share among chemists, and most likely to share with a broader audience. The answers to the survey can be summed up in several observations. The ability of respondents to predict the citation counts of established research is markedly lower than the ability of those counts to be predicted by the h-index of the corresponding author of each article. This observation is conserved even when only considering responses from chemists whose expertise falls within the subdiscipline that best describes the work performed in an article. Respondents view both cited papers and significant papers differently than papers that should be shared with chemists. We conclude from our results that peer judgements of importance and significance differ from metrics-based measurements, and that chemists should work with bibliometricians to develop metrics that better capture the nuance of opinions on the importance of a given piece of research.

  11. Visualizing Similarity of Appearance by Arrangement of Cards

    PubMed Central

    Nakatsuji, Nao; Ihara, Hisayasu; Seno, Takeharu; Ito, Hiroshi

    2016-01-01

    This study proposes a novel method to extract the configuration of the psychological space by directly measuring subjects' similarity ratings without computational work. Although multidimensional scaling (MDS) is well known as a conventional method for extracting the psychological space, it requires many pairwise evaluations; the time taken for evaluations increases in proportion to the square of the number of objects in MDS. The proposed method asks subjects to arrange cards on a poster sheet according to the degree of similarity of the objects. To compare the performance of the proposed method with the conventional one, we developed similarity maps of typefaces through the proposed method and through non-metric MDS. We calculated the trace correlation coefficient among all combinations of the configurations for both methods to evaluate the degree of similarity in the obtained configurations. The threshold value of the trace correlation coefficient for statistically discriminating similar configurations was decided based on random data. The ratio of trace correlation coefficients exceeding the threshold value was 62.0%, indicating that the configurations of the typefaces obtained by the proposed method closely resembled those obtained by non-metric MDS. The required duration for the proposed method was approximately one third of the non-metric MDS's duration. In addition, all distances between objects in all the data for both methods were calculated. The frequency of short distances in the proposed method was lower than that of the non-metric MDS, so that relatively small differences among objects were likely to be emphasized in the configuration produced by the proposed method. The card arrangement method we propose here thus serves as an easier and more time-saving tool to obtain psychological structures in fields related to similarity of appearance. PMID:27242611

  12. A New Arbiter PUF for Enhancing Unpredictability on FPGA

    PubMed Central

    Machida, Takanori; Yamamoto, Dai; Iwamoto, Mitsugu; Sakiyama, Kazuo

    2015-01-01

    In general, conventional Arbiter-based Physically Unclonable Functions (PUFs) generate responses with low unpredictability. The N-XOR Arbiter PUF, proposed in 2007, is a well-known technique for improving this unpredictability. In this paper, we propose a novel design for Arbiter PUF, called Double Arbiter PUF, to enhance the unpredictability on field programmable gate arrays (FPGAs), and we compare our design to conventional N-XOR Arbiter PUFs. One metric for judging the unpredictability of responses is to measure their tolerance to machine-learning attacks. Although our previous work showed the superiority of Double Arbiter PUFs regarding unpredictability, its details were not clarified. We evaluate the dependency on the number of training samples for machine learning, and we discuss the reason why Double Arbiter PUFs are more tolerant than the N-XOR Arbiter PUFs by evaluating intrachip variation. Further, the conventional Arbiter PUFs and proposed Double Arbiter PUFs are evaluated according to other metrics, namely, their uniqueness, randomness, and steadiness. We demonstrate that the 3-1 Double Arbiter PUF achieves the best performance overall. PMID:26491720

  13. Improving Climate Projections Using "Intelligent" Ensembles

    NASA Technical Reports Server (NTRS)

    Baker, Noel C.; Taylor, Patrick C.

    2015-01-01

    Recent changes in the climate system have led to growing concern, especially in communities which are highly vulnerable to resource shortages and weather extremes. There is an urgent need for better climate information to develop solutions and strategies for adapting to a changing climate. Climate models provide excellent tools for studying the current state of climate and making future projections. However, these models are subject to biases created by structural uncertainties. Performance metrics, or the systematic determination of model biases, succinctly quantify aspects of climate model behavior. Efforts to standardize climate model experiments and collect simulation data, such as the Coupled Model Intercomparison Project (CMIP), provide the means to directly compare and assess model performance. Performance metrics have been used to show that some models reproduce present-day climate better than others. Simulation data from multiple models are often used to add value to projections by creating a consensus projection from the model ensemble, in which each model is given an equal weight. It has been shown that the ensemble mean generally outperforms any single model. It is possible to use unequal weights to produce ensemble means, in which models are weighted based on performance (called "intelligent" ensembles). Can performance metrics be used to improve climate projections? Previous work introduced a framework for comparing the utility of model performance metrics, showing that the best metrics are related to the variance of top-of-atmosphere outgoing longwave radiation. These metrics improve present-day climate simulations of Earth's energy budget using the "intelligent" ensemble method. The current project identifies several approaches for testing whether performance metrics can be applied to future simulations to create "intelligent" ensemble-mean climate projections. 
It is shown that certain performance metrics test key climate processes in the models, and that these metrics can be used to evaluate model quality in both current and future climate states. This information will be used to produce new consensus projections and provide communities with improved climate projections for urgent decision-making.

  14. Aligning compensation with education: design and implementation of the Educational Value Unit (EVU) system in an academic internal medicine department.

    PubMed

    Stites, Steven; Vansaghi, Lisa; Pingleton, Susan; Cox, Glendon; Paolo, Anthony

    2005-12-01

    The authors report the development of a new metric for distributing university funds to support faculty efforts in education in the department of internal medicine at the University of Kansas School of Medicine. In 2003, a committee defined the educational value unit (EVU), which describes and measures the specific types of educational work done by faculty members, such as core education, clinical teaching, and administration of educational programs. The specific work profile of each faculty member was delineated. A dollar value was calculated for each 0.1 EVU. The metric was prospectively applied and a faculty survey was performed to evaluate the faculty's perception of the metric. Application of the metric resulted in a decrease in university support for 34 faculty and an increase in funding for 23 faculty. Total realignment of funding was US$1.6 million, or an absolute value of US$29,072 +/- 38,320.00 in average shift of university salary support per faculty member. Survey results showed that understanding of the purpose of university funding was enhanced, and that faculty members perceived a more equitable alignment of teaching effort with funding. The EVU metric resulted in a dramatic realignment of university funding for educational efforts in the department of internal medicine. The metric was easily understood, quickly implemented, and perceived to be fair by the faculty. By aligning specific salary support with faculty's educational responsibilities, a foundation was created for applying mission-based incentive programs.

  15. Introduction to the Special Collection of Papers on the San Luis Basin Sustainability Metrics Project: A Methodology for Evaluating Regional Sustainability

    EPA Science Inventory

    This paper introduces a collection of four articles describing the San Luis Basin Sustainability Metrics Project. The Project developed a methodology for evaluating regional sustainability. This introduction provides the necessary background information for the project, descripti...

  16. The Evaluation of Alternative Exposure Metrics for Traffic-related Air Pollutant Exposure in North Carolina

    EPA Science Inventory

    Transportation plays an important role in the modern society but can cause significant health impacts. To quantify the associated health impacts, an appropriate traffic-related air pollution exposure metric is required. In this study, we evaluate the suitability of four exposure ...

  17. Attenuation-based size metric for estimating organ dose to patients undergoing tube current modulated CT exams

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bostani, Maryam, E-mail: mbostani@mednet.ucla.edu; McMillan, Kyle; Lu, Peiyun

    2015-02-15

    Purpose: Task Group 204 introduced effective diameter (ED) as the patient size metric used to correlate size-specific dose estimates. However, this size metric fails to account for patient attenuation properties, and it has been suggested that it be replaced by an attenuation-based size metric, the water equivalent diameter (DW). The purpose of this study is to investigate the two size metrics, effective diameter and water equivalent diameter, in combination with regional descriptions of scanner output, to establish the most appropriate size metric to use as a predictor of organ dose in tube current modulated CT exams. Methods: 101 thoracic and 82 abdomen/pelvis scans from clinically indicated CT exams were collected retrospectively from a multidetector row CT (Sensation 64, Siemens Healthcare) with Institutional Review Board approval to generate voxelized patient models. Fully irradiated organs (lung and breasts in thoracic scans and liver, kidneys, and spleen in abdominal scans) were segmented and used as tally regions in Monte Carlo simulations for reporting organ dose. Along with image data, raw projection data were collected to obtain tube current information for simulating tube current modulation scans using Monte Carlo methods. Additionally, previously described patient size metrics [ED, DW, and approximated water equivalent diameter (DWa)] were calculated for each patient and reported in three different ways: a single value averaged over the entire scan, a single value averaged over the region of interest, and a single value from a location in the middle of the scan volume. Organ doses were normalized by an appropriate mAs-weighted CTDIvol to reflect regional variation of tube current. Linear regression analysis was used to evaluate the correlations between normalized organ doses and each size metric. 
Results: For the abdominal organs, the correlations between normalized organ dose and size metric were overall slightly higher for all three differently reported (global, regional, and middle-slice) DW and DWa values than for ED, but the differences were not statistically significant. However, for lung dose, correlations computed using the water equivalent diameter calculated in the middle of the image data (DW,middle) and averaged over the low-attenuating region of the lung (DW,regional) were statistically significantly higher than the correlations of normalized lung dose with ED. Conclusions: Effective diameter and water equivalent diameter are very similar in abdominal regions; however, their difference becomes noticeable in the lungs. Water equivalent diameter, specifically reported as a regional average or at the middle of the scan volume, was shown to be a better predictor of lung dose. Therefore, an attenuation-based size metric (water equivalent diameter) is recommended because it is more robust across different anatomic regions. Additionally, it was observed that the regional size metric reported as a single value averaged over a region of interest and the size metric calculated from a single slice/image chosen from the middle of the scan volume are highly correlated for these specific patient models and scan types.
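
    The two size metrics compared above are straightforward to compute from CT data. The sketch below (Python; the function names are illustrative, not the authors' code) follows the standard definitions: TG-204's effective diameter as the geometric mean of the AP and lateral dimensions, and the water equivalent diameter from the mean CT number inside an ROI, as defined in AAPM Report 220.

```python
import numpy as np

def water_equivalent_diameter(ct_slice_hu, pixel_area_mm2, roi_mask=None):
    """Approximate water equivalent diameter (Dw) for one axial CT slice.

    Follows the AAPM Report 220 definition: the mean CT number inside the
    ROI is converted to a water-equivalent area, and Dw is the diameter of
    the circle with that area. ct_slice_hu is a 2-D array of Hounsfield
    units; roi_mask (e.g. the patient contour) defaults to the whole slice.
    """
    if roi_mask is None:
        roi_mask = np.ones_like(ct_slice_hu, dtype=bool)
    mean_hu = ct_slice_hu[roi_mask].mean()
    roi_area = roi_mask.sum() * pixel_area_mm2            # mm^2
    water_equiv_area = (mean_hu / 1000.0 + 1.0) * roi_area
    return 2.0 * np.sqrt(water_equiv_area / np.pi)        # mm

def effective_diameter(ap_mm, lat_mm):
    """TG-204 effective diameter: geometric mean of AP and lateral extents."""
    return np.sqrt(ap_mm * lat_mm)
```

    A useful sanity check: for a uniform water region (0 HU everywhere inside the mask), Dw reduces to the geometric diameter of the ROI.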

  18. Measures and Metrics for Feasibility of Proof-of-Concept Studies With Human Immunodeficiency Virus Rapid Point-of-Care Technologies: The Evidence and the Framework.

    PubMed

    Pant Pai, Nitika; Chiavegatti, Tiago; Vijh, Rohit; Karatzas, Nicolaos; Daher, Jana; Smallwood, Megan; Wong, Tom; Engel, Nora

    2017-12-01

    Pilot (feasibility) studies form a vast majority of diagnostic studies with point-of-care technologies but often lack clear measures/metrics and a consistent framework for reporting and evaluation. To fill this gap, we systematically reviewed data to (a) catalog feasibility measures/metrics and (b) propose a framework. For the period January 2000 to March 2014, 2 reviewers searched 4 databases (MEDLINE, EMBASE, CINAHL, Scopus), retrieved 1441 citations, and abstracted data from 81 studies. We observed 2 major categories of measures, that is, implementation centered and patient centered, and 4 subcategories of measures, that is, feasibility, acceptability, preference, and patient experience. We defined and delineated metrics and measures for a feasibility framework. We documented impact measures for a comparison. We observed heterogeneity in the reporting of metrics as well as misclassification and misuse of metrics within measures. Although we observed poorly defined measures and metrics for feasibility, preference, and patient experience, the acceptability measure, in contrast, was the best defined. For example, within feasibility, metrics such as consent, completion, new infection, linkage rates, and turnaround times were misclassified and reported. Similarly, patient experience was variously reported as test convenience, comfort, pain, and/or satisfaction. In contrast, within impact measures, all the metrics were well documented, thus serving as a good baseline comparator. With our framework, we classified, delineated, and defined quantitative measures and metrics for feasibility. Our framework, with its defined measures/metrics, could reduce misclassification and improve the overall quality of reporting for monitoring and evaluation of rapid point-of-care technology strategies and their context-driven optimization.

  20. Approaches to chronic disease management evaluation in use in Europe: a review of current methods and performance measures.

    PubMed

    Conklin, Annalijn; Nolte, Ellen; Vrijhoef, Hubertus

    2013-01-01

    An overview was produced of approaches currently used to evaluate chronic disease management in selected European countries. The study aims to describe the methods and metrics used in Europe as a first step toward advancing the methodological basis for their assessment. A common template for the collection of evaluation methods and performance measures was sent to key informants in twelve European countries; responses were summarized in tables based on the template's evaluation categories. Extracted data were descriptively analyzed. Approaches to the evaluation of chronic disease management vary widely in objectives, designs, metrics, observation period, and data collection methods. Half of the reported studies used noncontrolled designs. The majority measured clinical process measures, patient behavior and satisfaction, and cost and utilization; several also used a range of structural indicators. Effects are usually observed over 1 to 3 years on patient populations with a single, commonly prevalent, chronic disease. There is wide variation within and between European countries in approaches to evaluating chronic disease management, in their objectives, designs, indicators, target audiences, and actors involved. This study is the first extensive, international overview of the area reported in the literature.

  1. Evaluation of dissolution profile similarity - Comparison between the f2, the multivariate statistical distance and the f2 bootstrapping methods.

    PubMed

    Paixão, Paulo; Gouveia, Luís F; Silva, Nuno; Morais, José A G

    2017-03-01

    A simulation study is presented, evaluating the performance of the f2, the model-independent multivariate statistical distance, and the f2 bootstrap methods in their ability to conclude similarity between two dissolution profiles. Different dissolution profiles, based on the Noyes-Whitney equation and spanning theoretical f2 values between 100 and 40, were simulated. Variability was introduced into the dissolution model parameters in increasing order, ranging from a situation complying with the European guidelines' requirements for the use of the f2 metric to several situations where the f2 metric could no longer be used. Results have shown that the f2 is an acceptable metric when used according to the regulatory requirements, but it loses its applicability as variability increases. The multivariate statistical distance produced contradictory results in several of the simulation scenarios, which makes it an unreliable metric for dissolution profile comparisons. The bootstrap f2, although conservative in its conclusions, is a suitable alternative method. Overall, as variability increases, all of the discussed methods reveal problems that can only be solved by increasing the number of dosage form units used in the comparison, which is usually not practical or feasible. Additionally, experimental corrective measures may be undertaken to reduce the overall variability, particularly when it is shown to be mainly due to the dissolution assessment rather than intrinsic to the dosage form.
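
    For reference, the f2 similarity factor itself has a simple closed form, 50·log10(100/√(1 + mean squared difference between the profiles)); a minimal sketch (not the authors' simulation code) is:

```python
import math

def f2_similarity(reference, test):
    """f2 similarity factor for two dissolution profiles (% dissolved at
    matched time points). Identical profiles give f2 = 100, and an average
    point-wise difference of about 10% gives f2 near 50, the conventional
    similarity threshold."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and the same length")
    msd = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + msd))
```

    The bootstrap variant discussed in the paper repeatedly resamples the individual dosage-unit profiles and takes a lower confidence bound of the resulting f2 distribution rather than the single point estimate above.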

  2. Backward Registration Based Aspect Ratio Similarity (ARS) for Image Retargeting Quality Assessment.

    PubMed

    Zhang, Yabin; Fang, Yuming; Lin, Weisi; Zhang, Xinfeng; Li, Leida

    2016-06-28

    During the past few years, various content-aware image retargeting operators have been proposed for image resizing. However, the lack of effective objective retargeting quality assessment metrics limits the further development of image retargeting techniques. Unlike traditional Image Quality Assessment (IQA) metrics, the quality degradation during image retargeting is caused by artificial retargeting modifications, and the difficulty for Image Retargeting Quality Assessment (IRQA) lies in the alteration of the image resolution and content, which makes it impossible to directly evaluate the quality degradation as traditional IQA does. In this paper, we interpret image retargeting in a unified framework of resampling grid generation and forward resampling. We show that geometric change estimation is an efficient way to clarify the relationship between the images. We formulate the geometric change estimation as a backward registration problem with a Markov Random Field (MRF) and provide an effective solution. The estimated geometric change provides evidence of how the original image is resized into the target image. Under the guidance of the geometric change, we develop a novel Aspect Ratio Similarity (ARS) metric to evaluate the visual quality of retargeted images by exploiting local block changes with a visual importance pooling strategy. Experimental results on the publicly available MIT RetargetMe and CUHK datasets demonstrate that the proposed ARS predicts the visual quality of retargeted images more accurately than state-of-the-art IRQA metrics.

  3. NASA metric transition plan

    NASA Technical Reports Server (NTRS)

    1992-01-01

    NASA science publications have used the metric system of measurement since 1970. Although NASA has maintained a metric use policy since 1979, practical constraints have restricted actual use of metric units. In 1988, an amendment to the Metric Conversion Act of 1975 required the Federal Government to adopt the metric system except where impractical. In response to Public Law 100-418 and Executive Order 12770, NASA revised its metric use policy and developed this Metric Transition Plan. NASA's goal is to use the metric system for program development and functional support activities to the greatest practical extent by the end of 1995. The introduction of the metric system into new flight programs will determine the pace of the metric transition. Transition of institutional capabilities and support functions will be phased to enable use of the metric system in flight program development and operations. Externally oriented elements of this plan will introduce and actively support use of the metric system in education, public information, and small business programs. The plan also establishes a procedure for evaluating and approving waivers and exceptions to the required use of the metric system for new programs. Coordination with other Federal agencies and departments (through the Interagency Council on Metric Policy) and industry (directly and through professional societies and interest groups) will identify sources of external support and minimize duplication of effort.

  4. The Insight ToolKit image registration framework

    PubMed Central

    Avants, Brian B.; Tustison, Nicholas J.; Stauffer, Michael; Song, Gang; Wu, Baohua; Gee, James C.

    2014-01-01

    Publicly available scientific resources help establish evaluation standards, provide a platform for teaching, and improve reproducibility. Version 4 of the Insight ToolKit (ITK4) seeks to establish new standards in publicly available image registration methodology. ITK4 makes several advances in comparison to previous versions of ITK. It supports both multivariate images and objective functions; it also unifies high-dimensional (deformation field) and low-dimensional (affine) transformations with metrics that are reusable across transform types and with composite transforms that allow arbitrary series of geometric mappings to be chained together seamlessly. Metrics and optimizers take advantage of multi-core resources, when available. Furthermore, ITK4 reduces the parameter optimization burden via principled heuristics that automatically set scaling across disparate parameter types (rotations vs. translations). A related approach also constrains step sizes for gradient-based optimizers. The result is that tuning for different metrics and/or image pairs is rarely necessary, allowing the researcher to focus more easily on the design and comparison of registration strategies. In total, the ITK4 contribution is intended as a structure to support reproducible research practices; it will provide a more extensive foundation against which to evaluate new work in image registration and also give application-level programmers a broad suite of tools on which to build. Finally, we contextualize this work with a reference registration evaluation study with application to pediatric brain labeling. PMID:24817849

  5. Climate impacts of energy technologies depend on emissions timing

    NASA Astrophysics Data System (ADS)

    Edwards, Morgan R.; Trancik, Jessika E.

    2014-05-01

    Energy technologies emit greenhouse gases with differing radiative efficiencies and atmospheric lifetimes. Standard practice for evaluating technologies, which uses the global warming potential (GWP) to compare the integrated radiative forcing of emitted gases over a fixed time horizon, does not acknowledge the importance of a changing background climate relative to climate change mitigation targets. Here we demonstrate that the GWP misvalues the impact of CH4-emitting technologies as mid-century approaches, and we propose a new class of metrics to evaluate technologies based on their time of use. The instantaneous climate impact (ICI) compares gases in an expected radiative forcing stabilization year, and the cumulative climate impact (CCI) compares their time-integrated radiative forcing up to a stabilization year. Using these dynamic metrics, we quantify the climate impacts of technologies and show that high-CH4-emitting energy sources become less advantageous over time. The impact of natural gas for transportation, with CH4 leakage, exceeds that of gasoline within 1-2 decades for a commonly cited 3 W m^-2 stabilization target. The impact of algae biodiesel overtakes that of corn ethanol within 2-3 decades, where algae co-products are used to produce biogas and corn co-products are used for animal feed. The proposed metrics capture the changing importance of CH4 emissions as a climate threshold is approached, thereby addressing a major shortcoming of the GWP for technology evaluation.
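
    The distinction between the GWP and the proposed time-of-use metrics can be illustrated with a toy single-exponential decay model (an assumption for illustration only; real gas cycles, especially CO2's, are multi-exponential). An ICI-style quantity is the forcing remaining at the stabilization year, and a CCI-style quantity integrates forcing from the emission year up to that year:

```python
import math

def instantaneous_impact(t_star, t_emit, radiative_efficiency, lifetime):
    """ICI-style quantity: radiative forcing remaining at the stabilization
    year t_star from a unit emission pulse at t_emit, assuming a
    single-exponential atmospheric decay with the given lifetime (years)."""
    dt = t_star - t_emit
    return radiative_efficiency * math.exp(-dt / lifetime) if dt >= 0 else 0.0

def cumulative_impact(t_star, t_emit, radiative_efficiency, lifetime):
    """CCI-style quantity: forcing integrated from the emission year up to
    t_star. The conventional GWP instead integrates over a fixed horizon
    (e.g. 100 years) regardless of how close t_emit is to t_star."""
    dt = max(t_star - t_emit, 0.0)
    return radiative_efficiency * lifetime * (1.0 - math.exp(-dt / lifetime))
```

    Because the remaining window shrinks as the emission year approaches the stabilization year, a short-lived gas such as CH4 emitted near that year retains much more of its impact than a fixed-horizon GWP would suggest.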

  6. Dose-volume metrics and their relation to memory performance in pediatric brain tumor patients: A preliminary study.

    PubMed

    Raghubar, Kimberly P; Lamba, Michael; Cecil, Kim M; Yeates, Keith Owen; Mahone, E Mark; Limke, Christina; Grosshans, David; Beckwith, Travis J; Ris, M Douglas

    2018-06-01

    Advances in radiation treatment (RT), specifically volumetric planning with detailed dose and volumetric data for specific brain structures, have provided new opportunities to study neurobehavioral outcomes of RT in children treated for brain tumor. The present study examined the relationship between biophysical and physical dose metrics and neurocognitive ability, namely learning and memory, 2 years post-RT in pediatric brain tumor patients. The sample consisted of 26 pediatric patients with brain tumor, 14 of whom completed neuropsychological evaluations on average 24 months post-RT. Prescribed dose and dose-volume metrics for specific brain regions were calculated including physical metrics (i.e., mean dose and maximum dose) and biophysical metrics (i.e., integral biological effective dose and generalized equivalent uniform dose). We examined the associations between dose-volume metrics (whole brain, right and left hippocampus), and performance on measures of learning and memory (Children's Memory Scale). Biophysical dose metrics were highly correlated with the physical metric of mean dose but not with prescribed dose. Biophysical metrics and mean dose, but not prescribed dose, correlated with measures of learning and memory. These preliminary findings call into question the value of prescribed dose for characterizing treatment intensity; they also suggest that biophysical dose has only a limited advantage compared to physical dose when calculated for specific regions of the brain. We discuss the implications of the findings for evaluating and understanding the relation between RT and neurocognitive functioning. © 2018 Wiley Periodicals, Inc.
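
    Of the biophysical metrics named above, the generalized equivalent uniform dose has a standard closed form, gEUD = (Σ v_i d_i^a)^(1/a), computed over the bins of a differential dose-volume histogram. A minimal sketch (the bin format and function name are assumptions, not the study's code):

```python
import numpy as np

def generalized_eud(bin_doses_gy, bin_volumes, a):
    """Generalized equivalent uniform dose (gEUD) from a differential
    dose-volume histogram: gEUD = (sum_i v_i * d_i**a) ** (1/a), with
    v_i the fractional volume receiving dose d_i (assumed positive).

    a = 1 gives the mean dose; large positive a approaches the maximum
    dose (serial-organ behavior); negative a emphasizes cold spots."""
    d = np.asarray(bin_doses_gy, dtype=float)
    v = np.asarray(bin_volumes, dtype=float)
    v = v / v.sum()  # normalize to fractional volumes
    return float((v * d ** a).sum() ** (1.0 / a))
```

    For a uniform dose distribution the gEUD equals that dose for any a, which is the defining property of an "equivalent uniform" dose.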

  7. WHAT INFORMATION DO WE HAVE TO IDENTIFY AND EVALUATE ECOLOGICAL METRICS AND INDICATORS THAT DIRECTLY MATTER TO PEOPLE?

    EPA Science Inventory

    The use of ecological metrics and indicators that matter directly to people makes ecological information more useful. By more useful we mean in communication with people and for social and economic analysis. While the need to specify these metrics and indicators is a view widely ...

  8. Deal or No Deal? Evaluating Big Deals and Their Journals

    ERIC Educational Resources Information Center

    Blecic, Deborah D.; Wiberley, Stephen E., Jr.; Fiscella, Joan B.; Bahnmaier-Blaszczak, Sara; Lowery, Rebecca

    2013-01-01

    This paper presents methods to develop metrics that compare Big Deal journal packages and the journals within those packages. Deal-level metrics guide selection of a Big Deal for termination. Journal-level metrics guide selection of individual subscriptions from journals previously provided by a terminated deal. The paper argues that, while the…

  9. [Clinical trial data management and quality metrics system].

    PubMed

    Chen, Zhao-hua; Huang, Qin; Deng, Ya-zhong; Zhang, Yue; Xu, Yu; Yu, Hao; Liu, Zong-fan

    2015-11-01

    A data quality management system is essential to ensure accurate, complete, consistent, and reliable data collection in clinical research. This paper is devoted to various choices of data quality metrics. They are categorized by study status, e.g., study start-up, conduct, and close-out. In each category, metrics for different purposes are listed according to ALCOA+ principles such as completeness, accuracy, timeliness, and traceability. Some frequently used general quality metrics are also introduced. This paper provides as much detailed information as possible for each metric, including its definition, purpose, evaluation, referenced benchmark, and recommended targets, in favor of real practice. It is important that sponsors and data management service providers establish a robust, integrated clinical trial data quality management system to ensure sustainably high quality of clinical trial deliverables. It will also support enterprise-level data evaluation and benchmarking of data quality across projects, sponsors, and data management service providers by using objective metrics from real clinical trials. We hope this will be a significant input toward accelerating the improvement of clinical trial data quality in the industry.

  10. A unified procedure for meta-analytic evaluation of surrogate end points in randomized clinical trials

    PubMed Central

    Dai, James Y.; Hughes, James P.

    2012-01-01

    The meta-analytic approach to evaluating surrogate end points assesses the predictiveness of treatment effect on the surrogate toward treatment effect on the clinical end point based on multiple clinical trials. Definition and estimation of the correlation of treatment effects were developed in linear mixed models and later extended to binary or failure time outcomes on a case-by-case basis. In a general regression setting that covers nonnormal outcomes, we discuss in this paper several metrics that are useful in the meta-analytic evaluation of surrogacy. We propose a unified 3-step procedure to assess these metrics in settings with binary end points, time-to-event outcomes, or repeated measures. First, the joint distribution of estimated treatment effects is ascertained by an estimating equation approach; second, the restricted maximum likelihood method is used to estimate the means and the variance components of the random treatment effects; finally, confidence intervals are constructed by a parametric bootstrap procedure. The proposed method is evaluated by simulations and applications to 2 clinical trials. PMID:22394448

  11. Processes and Metrics to Evaluate Faculty Practice Activities at US Schools of Pharmacy.

    PubMed

    Haines, Stuart T; Sicat, Brigitte L; Haines, Seena L; MacLaughlin, Eric J; Van Amburgh, Jenny A

    2016-05-25

    Objective. To determine what processes and metrics are employed to measure and evaluate pharmacy practice faculty members at colleges and schools of pharmacy in the United States. Methods. A 23-item web-based questionnaire was distributed to pharmacy practice department chairs at schools of pharmacy fully accredited by the Accreditation Council for Pharmacy Education (ACPE) (n=114). Results. Ninety-three pharmacy practice chairs or designees from 92 institutions responded. Seventy-six percent reported that more than 60% of the department's faculty members were engaged in practice-related activities at least eight hours per week. Fewer than half (47%) had written policies and procedures for conducting practice evaluations. Institutions commonly collected data regarding committee service at practice sites, community service events, educational programs, and number of hours engaged in practice-related activities; however, only 24% used a tool to longitudinally collect practice-related data. Publicly funded institutions were more likely than private schools to have written procedures. Conclusion. Data collection tools and best practice recommendations for conducting faculty practice evaluations are needed.

  12. α-Information Based Registration of Dynamic Scans for Magnetic Resonance Cystography

    PubMed Central

    Han, Hao; Lin, Qin; Li, Lihong; Duan, Chaijie; Lu, Hongbing; Li, Haifang; Yan, Zengmin; Fitzgerald, John

    2015-01-01

    To continue our effort on developing magnetic resonance (MR) cystography, we introduce a novel non-rigid 3D registration method to compensate for bladder wall motion and deformation in dynamic MR scans, which are impaired by a relatively low signal-to-noise ratio in each time frame. The registration method is built on the similarity measure of α-information, which has the potential to achieve higher registration accuracy than the commonly used mutual information (MI) measure for either mono-modality or multi-modality image registration. The α-information metric was also demonstrated to be superior to both the mean-squares and the cross-correlation metrics in multi-modality scenarios. The proposed α-registration method was applied for bladder motion compensation in real patient studies, and its effect on the automatic and accurate segmentation of the bladder wall was also evaluated. Compared with the prevailing MI-based image registration approach, the presented α-information based registration was more effective at capturing bladder wall motion and deformation, which ensured the success of the subsequent bladder wall segmentation toward the goal of evaluating the entire bladder wall for detection and diagnosis of abnormality. PMID:26087506
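
    As a rough illustration of the class of similarity measure involved, a Rényi-divergence form of α-information between the joint intensity distribution of two images and the product of its marginals can be estimated from a joint histogram. This is a generic histogram-based sketch (the function name and binning are illustrative, not the authors' implementation); as α approaches 1 it recovers the familiar Shannon mutual information:

```python
import numpy as np

def renyi_alpha_information(img_a, img_b, alpha=0.7, bins=32):
    """Estimate the order-alpha Renyi divergence between the joint
    intensity distribution p(x, y) and the product of its marginals
    p(x)p(y), a histogram-based alpha-information similarity measure.
    Larger values indicate stronger statistical dependence (better
    alignment); alpha -> 1 recovers Shannon mutual information."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of img_a
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of img_b
    prod = p_x * p_y
    mask = (p_xy > 0) & (prod > 0)          # restrict to the support
    s = (p_xy[mask] ** alpha * prod[mask] ** (1.0 - alpha)).sum()
    return np.log(s) / (alpha - 1.0)
```

    For two identical images with a uniform intensity histogram over the bins, the measure attains its maximum, log(bins), which is a convenient sanity check.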

  13. Academic health sciences library Website navigation: an analysis of forty-one Websites and their navigation tools.

    PubMed

    Brower, Stewart M

    2004-10-01

    The analysis included forty-one academic health sciences library (HSL) Websites as captured in the first two weeks of January 2001. Home pages and persistent navigational tools (PNTs) were analyzed for layout, technology, and links, and other general site metrics were taken. Websites were selected based on rank in the National Network of Libraries of Medicine, with regional and resource libraries given preference on the basis that these libraries are recognized as leaders in their regions and would be the most reasonable source of standards for best practice. A three-page evaluation tool was developed based on previous similar studies. All forty-one sites were evaluated in four specific areas: library general information, Website aids and tools, library services, and electronic resources. Metrics taken for electronic resources included orientation of bibliographic databases alphabetically by title or by subject area and with links to specifically named databases. Based on the results, a formula for determining obligatory links was developed, listing items that should appear on all academic HSL Web home pages and PNTs. These obligatory links demonstrate a series of best practices that may be followed in the design and construction of academic HSL Websites.

  14. Identification of robust statistical downscaling methods based on a comprehensive suite of performance metrics for South Korea

    NASA Astrophysics Data System (ADS)

    Eum, H. I.; Cannon, A. J.

    2015-12-01

    Climate models are a key tool for investigating the impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch in spatial resolution between GCMs and regional applications, particularly for a region characterized by complex terrain such as the Korean peninsula. Therefore, a downscaling procedure is essential for assessing regional impacts of climate change. Numerous statistical downscaling methods have been used, mainly due to their computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. Using a split-sampling scheme, all methods are calibrated with observational station data for 19 years from 1973 to 1991 and tested for the recent 19 years from 1992 to 2010. To assess the skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure the ability to reproduce temporal correlation, distribution, spatial correlation, and extreme events. In addition, we employ the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR and that all methods lead to large improvements across all performance metrics. When TOPSIS is applied to the seasonal performance metrics, MACA is identified as the most reliable and robust method for all variables and seasons. Note that this result is derived from CFSR output, which is regarded as near-perfect climate data in climate studies.
Therefore, the ranking of this study may change when various GCMs are downscaled and evaluated. Nevertheless, it may be informative for end-users (i.e., modelers or water resources managers) in understanding and selecting more suitable downscaling methods corresponding to the priorities of regional applications.
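
The TOPSIS ranking step described above can be sketched in a few lines. The four method names come from the record, but the skill scores, criterion weights, and benefit/cost designations below are hypothetical placeholders, not values from the study.

```python
import numpy as np

def topsis(scores, weights, benefit):
    """Rank alternatives (rows) on criteria (columns) with TOPSIS.

    scores:  (n_alt, n_crit) matrix of performance metrics
    weights: criterion weights (summing to 1)
    benefit: True where larger is better, False where smaller is better
    """
    m = scores / np.linalg.norm(scores, axis=0)   # vector-normalise each column
    v = m * weights                                # weighted normalised matrix
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_pos = np.linalg.norm(v - ideal, axis=1)      # distance to ideal solution
    d_neg = np.linalg.norm(v - anti, axis=1)       # distance to anti-ideal
    return d_neg / (d_pos + d_neg)                 # closeness: higher = better

# Hypothetical skill scores for the four methods on three metrics:
# temporal correlation (benefit), RMSE (cost), extreme-event skill (benefit)
methods = ["BCSD", "BCCA", "MACA", "BCCI"]
scores = np.array([[0.85, 0.10, 0.70],
                   [0.80, 0.15, 0.65],
                   [0.90, 0.08, 0.75],
                   [0.83, 0.12, 0.68]])
cc = topsis(scores, np.array([0.4, 0.3, 0.3]), np.array([True, False, True]))
best = methods[int(np.argmax(cc))]
```

With these made-up scores MACA dominates on every criterion and therefore coincides with the ideal solution, giving it a closeness of exactly 1.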

  15. SU-F-T-600: Influence of Acuros XB and AAA Dose Calculation Algorithms On Plan Quality Metrics and Normal Lung Doses in Lung SBRT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yaparpalvi, R; Mynampati, D; Kuo, H

    Purpose: To study the influence of the superposition-beam model (AAA) and deterministic photon transport solver (Acuros XB) dose calculation algorithms on treatment plan quality metrics and on normal lung dose in lung SBRT. Methods: Treatment plans of 10 lung SBRT patients were randomly selected. Patients were prescribed a total dose of 50-54 Gy in 3-5 fractions (10 Gy x 5 or 18 Gy x 3). Plans were optimized for 6-MV beams using 2 arcs (VMAT). Doses were calculated using the AAA algorithm with heterogeneity correction. For each plan, plan quality metrics in the categories of coverage, homogeneity, conformity, and gradient were quantified. Repeat dosimetry for these AAA treatment plans was performed using the AXB algorithm with heterogeneity correction for the same beam and MU parameters. Plan quality metrics were again evaluated and compared with the AAA plan metrics. For normal lung dose, V20 and V5 of (total lung - GTV) were evaluated. Results: The results are summarized in Supplemental Table 1. PTV volume was a mean of 11.4 (±3.3) cm³. Compared against the RTOG 0813 protocol criteria for conformality, AXB plans yielded, on average, a similar PITV ratio (individual PITV ratio differences varied from -9 to +15%), reduced target coverage (-1.6%), and increased R50% (+2.6%). Comparing normal lung doses, the lung V20 (+3.1%) and V5 (+1.5%) were slightly higher for AXB plans than for AAA plans. High-dose spillage ((V105%PD - PTV)/PTV) was slightly lower for AXB plans, but the low-dose spillage (D2cm) was similar between the two calculation algorithms. Conclusion: The AAA algorithm overestimates lung target dose. Routinely adopting AXB for dose calculations in lung SBRT planning may improve dose calculation accuracy, as AXB-based calculations have been shown to be closer in accuracy to Monte Carlo based dose predictions, with relatively fast computation times.
For clinical practice, revisiting dose-fractionation in lung SBRT to correct for dose overestimates attributable to the algorithm may well be warranted.
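
The conformity and spillage metrics quoted in this record reduce to simple volume ratios. A minimal sketch follows; apart from the 11.4 cm³ mean PTV volume, all volumes below are hypothetical examples, not the study's data.

```python
def pitv(prescription_isodose_vol, ptv_vol):
    """RTOG conformality metric: volume enclosed by the prescription
    isodose surface divided by the PTV volume (ideal is ~1.0)."""
    return prescription_isodose_vol / ptv_vol

def r50(half_dose_vol, ptv_vol):
    """Intermediate-dose gradient: 50% isodose volume / PTV volume."""
    return half_dose_vol / ptv_vol

def high_dose_spillage(v105_vol, ptv_vol):
    """(V105%PD - PTV)/PTV, as defined in the record; assumes the 105%
    isodose volume encloses the PTV."""
    return (v105_vol - ptv_vol) / ptv_vol

ptv = 11.4                        # cm^3, mean PTV volume from the record
conformity = pitv(13.7, ptv)      # hypothetical prescription isodose volume
gradient = r50(52.0, ptv)         # hypothetical 50% isodose volume
spillage = high_dose_spillage(12.5, ptv)  # hypothetical 105% isodose volume
```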

  16. Semantic Metrics for Analysis of Software

    NASA Technical Reports Server (NTRS)

    Etzkorn, Letha H.; Cox, Glenn W.; Farrington, Phil; Utley, Dawn R.; Ghalston, Sampson; Stein, Cara

    2005-01-01

    A recently conceived suite of object-oriented software metrics focuses on semantic aspects of software, in contradistinction to traditional software metrics, which focus on syntactic aspects of software. Semantic metrics represent a more human-oriented view of software than do syntactic metrics. The semantic metrics of a given computer program are calculated by use of the output of a knowledge-based analysis of the program, and are substantially more representative of software quality and more readily comprehensible from a human perspective than are the syntactic metrics.

  17. Model Evaluation of Continuous Data Pharmacometric Models: Metrics and Graphics

    PubMed Central

    Nguyen, THT; Mouksassi, M‐S; Holford, N; Al‐Huniti, N; Freedman, I; Hooker, AC; John, J; Karlsson, MO; Mould, DR; Pérez Ruixo, JJ; Plan, EL; Savic, R; van Hasselt, JGC; Weber, B; Zhou, C; Comets, E

    2017-01-01

    This article represents the first in a series of tutorials on model evaluation in nonlinear mixed effect models (NLMEMs), from the International Society of Pharmacometrics (ISoP) Model Evaluation Group. Numerous tools are available for evaluation of NLMEM, with a particular emphasis on visual assessment. This first basic tutorial focuses on presenting graphical evaluation tools of NLMEM for continuous data. It illustrates graphs for correct or misspecified models, discusses their pros and cons, and recalls the definition of metrics used. PMID:27884052

  18. Use of social media in health promotion: purposes, key performance indicators, and evaluation metrics.

    PubMed

    Neiger, Brad L; Thackeray, Rosemary; Van Wagenen, Sarah A; Hanson, Carl L; West, Joshua H; Barnes, Michael D; Fagen, Michael C

    2012-03-01

    Despite the expanding use of social media, little has been published about its appropriate role in health promotion, and even less has been written about evaluation. The purpose of this article is threefold: (a) outline purposes for social media in health promotion, (b) identify potential key performance indicators associated with these purposes, and (c) propose evaluation metrics for social media related to the key performance indicators. Process evaluation is presented in this article as an overarching evaluation strategy for social media.

  19. Testing Strategies for Model-Based Development

    NASA Technical Reports Server (NTRS)

    Heimdahl, Mats P. E.; Whalen, Mike; Rajan, Ajitha; Miller, Steven P.

    2006-01-01

    This report presents an approach for testing artifacts generated in a model-based development process. This approach divides the traditional testing process into two parts: requirements-based testing (validation testing) which determines whether the model implements the high-level requirements and model-based testing (conformance testing) which determines whether the code generated from a model is behaviorally equivalent to the model. The goals of the two processes differ significantly and this report explores suitable testing metrics and automation strategies for each. To support requirements-based testing, we define novel objective requirements coverage metrics similar to existing specification and code coverage metrics. For model-based testing, we briefly describe automation strategies and examine the fault-finding capability of different structural coverage metrics using tests automatically generated from the model.
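
As an illustration of the idea of objective requirements coverage (not the report's actual metric definitions), one can measure the fraction of formalized requirements, treated as boolean predicates over system inputs, that a test suite exercises in both outcomes. The requirements and test cases below are invented.

```python
def requirements_coverage(requirements, test_cases):
    """Toy requirements-coverage metric: the fraction of requirements
    (boolean predicates over a system state) that at least one test case
    evaluates to True and at least one other evaluates to False --
    analogous to decision coverage taken over the requirements rather
    than over the generated code."""
    covered = 0
    for req in requirements:
        outcomes = {bool(req(tc)) for tc in test_cases}
        if outcomes == {True, False}:
            covered += 1
    return covered / len(requirements)

# Hypothetical requirements over an (altitude, speed) input pair
reqs = [lambda s: s[0] > 1000,                    # above transition altitude
        lambda s: s[1] < 250,                     # speed restriction holds
        lambda s: s[0] > 1000 and s[1] < 250]     # both conditions together
tests = [(500, 200), (2000, 300)]
cov = requirements_coverage(reqs, tests)          # third predicate never True
```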

  20. Road Risk Modeling and Cloud-Aided Safety-Based Route Planning.

    PubMed

    Li, Zhaojian; Kolmanovsky, Ilya; Atkins, Ella; Lu, Jianbo; Filev, Dimitar P; Michelini, John

    2016-11-01

    This paper presents a safety-based route planner that exploits vehicle-to-cloud-to-vehicle (V2C2V) connectivity. Time and road risk index (RRI) are considered as metrics to be balanced based on user preference. To evaluate road segment risk, a road and accident database from the highway safety information system is mined with a hybrid neural network model to predict RRI. Real-time factors such as time of day, day of the week, and weather are included as correction factors to the static RRI prediction. With real-time RRI and expected travel time, route planning is formulated as a multiobjective network flow problem and further reduced to a mixed-integer programming problem. A V2C2V implementation of our safety-based route planning approach is proposed to facilitate access to real-time information and computing resources. A real-world case study, route planning through the city of Columbus, Ohio, is presented. Several scenarios illustrate how the "best" route can be adjusted to favor time versus safety metrics.
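
The time-versus-risk trade-off can be illustrated with a plain Dijkstra search over a weighted edge cost. The graph, travel times, RRI values, and the linear weighting below are hypothetical, and the paper itself formulates the problem as a multiobjective network flow reduced to mixed-integer programming rather than this simplified scalarization.

```python
import heapq

def safest_route(graph, start, goal, alpha=0.5):
    """Dijkstra over edge cost = alpha*time + (1-alpha)*RRI.

    graph: {node: [(neighbour, travel_time, rri), ...]}
    alpha=1 favours time only; alpha=0 favours safety only.
    """
    pq = [(0.0, start, [start])]
    seen = set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, t, rri in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(pq, (cost + alpha * t + (1 - alpha) * rri,
                                    nxt, path + [nxt]))
    return float("inf"), []

# Toy network: A->B->D is fast but risky; A->C->D is slower but safer
g = {"A": [("B", 2, 9), ("C", 4, 1)],
     "B": [("D", 2, 9)],
     "C": [("D", 4, 1)]}
```

Setting alpha to 1 recovers the fastest route A-B-D, while alpha of 0 selects the safer A-C-D, mirroring how the "best" route shifts with user preference.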

  1. Fast and Robust Registration of Multimodal Remote Sensing Images via Dense Orientated Gradient Feature

    NASA Astrophysics Data System (ADS)

    Ye, Y.

    2017-09-01

    This paper presents a fast and robust method for the registration of multimodal remote sensing data (e.g., optical, LiDAR, SAR and map). The proposed method is based on the hypothesis that structural similarity between images is preserved across different modalities. In the definition of the proposed method, we first develop a pixel-wise feature descriptor named Dense Orientated Gradient Histogram (DOGH), which can be computed effectively at every pixel and is robust to non-linear intensity differences between images. Then a fast similarity metric based on DOGH is built in the frequency domain using the Fast Fourier Transform (FFT) technique. Finally, a template matching scheme is applied to detect tie points between images. Experimental results on different types of multimodal remote sensing images show that the proposed similarity metric offers superior matching performance and computational efficiency compared with state-of-the-art methods. Moreover, based on the proposed similarity metric, we also design a fast and robust automatic registration system for multimodal images. This system has been evaluated using a pair of very large SAR and optical images (more than 20000 × 20000 pixels). Experimental results show that our system outperforms two popular commercial software systems (i.e., ENVI and ERDAS) in both registration accuracy and computational efficiency.
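
The record's frequency-domain similarity metric builds on the standard FFT correlation theorem. The sketch below shows only that generic building block (dense template correlation computed in the frequency domain), not the DOGH descriptor itself; the image and template are synthetic.

```python
import numpy as np

def fft_cross_correlation(image, template):
    """Dense cross-correlation of a zero-meaned template against an image,
    via the correlation theorem: corr = IFFT(FFT(image) * conj(FFT(template))).
    The peak of `corr` gives the template's (row, col) offset in the image."""
    h, w = image.shape
    t = template - template.mean()                 # zero-mean the template
    f_img = np.fft.fft2(image)
    f_tpl = np.fft.fft2(t, s=(h, w))               # zero-pad to image size
    return np.real(np.fft.ifft2(f_img * np.conj(f_tpl)))

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))
tpl = img[16:32, 24:40].copy()                     # template cut at (16, 24)
corr = fft_cross_correlation(img, tpl)
peak = np.unravel_index(np.argmax(corr), corr.shape)
```

The FFT route computes the full correlation surface in O(N log N) rather than sliding the template explicitly, which is the source of the speed advantage the record describes.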

  2. Modeling the interannual variability of microbial quality metrics of irrigation water in a Pennsylvania stream.

    PubMed

    Hong, Eun-Mi; Shelton, Daniel; Pachepsky, Yakov A; Nam, Won-Ho; Coppock, Cary; Muirhead, Richard

    2017-02-01

    Knowledge of the microbial quality of irrigation waters is extremely limited. For this reason, the US FDA has promulgated the Produce Rule, mandating the testing of irrigation water sources for many farms. The rule requires the collection and analysis of at least 20 water samples over two to four years to adequately evaluate the quality of water intended for produce irrigation. The objective of this work was to evaluate the effect of interannual weather variability on surface water microbial quality. We used the Soil and Water Assessment Tool model to simulate E. coli concentrations in the Little Cove Creek; this is a perennial creek located in an agricultural watershed in south-eastern Pennsylvania. The model performance was evaluated using the US FDA regulatory microbial water quality metrics of geometric mean (GM) and the statistical threshold value (STV). Using the 90-year time series of weather observations, we simulated and randomly sampled the time series of E. coli concentrations. We found that weather conditions of a specific year may strongly affect the evaluation of microbial quality and that the long-term assessment of microbial water quality may be quite different from the evaluation based on short-term observations. The variations in microbial concentrations and water quality metrics were affected by location, wetness of the hydrological years, and seasonality, with 15.7-70.1% of samples exceeding the regulatory threshold. The results of this work demonstrate the value of using modeling to design and evaluate monitoring protocols to assess the microbial quality of water used for produce irrigation. Copyright © 2016 Elsevier Ltd. All rights reserved.
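
The GM and STV metrics mentioned above can be computed directly from a set of sample counts. In this sketch the STV is estimated as the 90th percentile of a fitted lognormal distribution (z = 1.282), which is the usual construction behind the FDA criteria (GM <= 126, STV <= 410 CFU/100 mL); the sample counts themselves are hypothetical.

```python
import math

def gm_stv(counts):
    """Geometric mean and statistical threshold value of E. coli counts
    (CFU/100 mL). The STV is estimated as the 90th percentile of a
    lognormal fitted to the samples (z = 1.282)."""
    logs = [math.log10(c) for c in counts]
    n = len(logs)
    mu = sum(logs) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in logs) / (n - 1))  # sample std
    gm = 10 ** mu
    stv = 10 ** (mu + 1.282 * sd)
    return gm, stv

samples = [10, 30, 80, 120, 200, 45, 60, 15, 90, 250]  # hypothetical counts
gm, stv = gm_stv(samples)
compliant = gm <= 126 and stv <= 410
```

Because the STV amplifies the spread of the samples, a water source can pass on the geometric mean yet fail on the STV in a wet, high-variability year, which is exactly the interannual sensitivity the study models.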

  3. A novel approach for evaluating the performance of real time quantitative loop-mediated isothermal amplification-based methods.

    PubMed

    Nixon, Gavin J; Svenstrup, Helle F; Donald, Carol E; Carder, Caroline; Stephenson, Judith M; Morris-Jones, Stephen; Huggett, Jim F; Foy, Carole A

    2014-12-01

    Molecular diagnostic measurements are currently underpinned by the polymerase chain reaction (PCR). There are also a number of alternative nucleic acid amplification technologies which, unlike PCR, work at a single temperature. These 'isothermal' methods reportedly offer potential advantages over PCR such as simplicity, speed and resistance to inhibitors, and could also be used for quantitative molecular analysis. However, there are currently limited mechanisms to evaluate their quantitative performance, which would assist assay development and study comparisons. This study uses a sexually transmitted infection diagnostic model in combination with an adapted metric termed isothermal doubling time (IDT), akin to PCR efficiency, to compare quantitative PCR and quantitative loop-mediated isothermal amplification (qLAMP) assays, and to quantify the impact of matrix interference. The performance metric described here facilitates the comparison of qLAMP assays that could assist assay development and validation activities.
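
A doubling-time metric of this kind is analogous to PCR efficiency: fit ln(signal) against time over the exponential phase and take ln(2)/slope. A minimal sketch with idealized, hypothetical read-outs (the paper's exact IDT definition may differ in detail):

```python
import math

def isothermal_doubling_time(times, signals):
    """Estimate the doubling time (minutes) of an amplification curve's
    exponential phase by least-squares fit of ln(signal) vs time;
    doubling time = ln(2) / slope."""
    n = len(times)
    ys = [math.log(s) for s in signals]
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
             / sum((t - t_mean) ** 2 for t in times))
    return math.log(2) / slope

# Hypothetical qLAMP read-outs doubling exactly every 0.5 min
t = [0, 0.5, 1.0, 1.5, 2.0]
sig = [100, 200, 400, 800, 1600]
idt = isothermal_doubling_time(t, sig)  # -> 0.5
```

A longer doubling time in the presence of a sample matrix then quantifies matrix interference, in the same way that reduced PCR efficiency does.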

  4. Investigating relationships between left atrial volume, symmetry, and sphericity

    NASA Astrophysics Data System (ADS)

    Menon, Prahlad G.; Nedios, Sotiris; Hindricks, Gerhard; Bollmann, Andreas

    2016-03-01

    Catheter ablation is a safe and effective therapy for drug-refractory patients symptomatic of atrial fibrillation (AF), with up to 80% of patients experiencing long-term arrhythmia-free survival. However, up to 20-40% of patients require more than one procedure in order to become arrhythmia-free. Therefore, appropriate patient selection is paramount to the effective implementation and long-term success of ablation therapy for patients with AF. In this study, as a precursor to evaluating the clinical significance of specific left atrial (LA) shape metrics as pre-procedural predictors of AF recurrence following ablative pulmonary vein isolation therapy, we report on a computational geometric analysis in a pilot cohort evaluating relationships between various patient-specific metrics of LA shape which might have such predictive value. This study specifically focuses on establishing the relationship between LA volume and sphericity, using a novel methodology for computing atrial sphericity based on regional shape.
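
For reference, the classical sphericity of a closed surface is pi^(1/3) * (6V)^(2/3) / A, which equals 1 for a perfect sphere. The record's regional-shape method is more elaborate, but this baseline quantity can be sketched as:

```python
import math

def sphericity(volume, surface_area):
    """Classical sphericity: pi^(1/3) * (6V)^(2/3) / A.
    Equals 1.0 for a perfect sphere, less for any other shape."""
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / surface_area

# Sanity check on a sphere of radius 2 (hypothetical LA-sized example)
r = 2.0
sph = sphericity(4 / 3 * math.pi * r ** 3, 4 * math.pi * r ** 2)  # -> 1.0
```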

  5. Comparing Two CBM Maze Selection Tools: Considering Scoring and Interpretive Metrics for Universal Screening

    ERIC Educational Resources Information Center

    Ford, Jeremy W.; Missall, Kristen N.; Hosp, John L.; Kuhle, Jennifer L.

    2016-01-01

    Advances in maze selection curriculum-based measurement have led to several published tools with technical information for interpretation (e.g., norms, benchmarks, cut-scores, classification accuracy) that have increased their usefulness for universal screening. A range of scoring practices have emerged for evaluating student performance on maze…

  6. Evaluating Fluorescence-Based Metrics for Early Detection of ...

    EPA Pesticide Factsheets

    Summary: This paper discusses the results of an ongoing Water Research Foundation project on developing a fluorescence sensor system for early detection of distribution system nitrification.

  7. Examining Learning Rates in the Evaluation of Academic Interventions That Target Reading Fluency

    ERIC Educational Resources Information Center

    Solomon, Benjamin G.; Poncy, Brian C.; Caravello, Devin J.; Schweiger, Emily M.

    2018-01-01

    The purpose of the current study is to determine whether single-case intervention studies targeting reading fluency, ranked by traditional outcome metrics (i.e., effect sizes derived from phase differences), were discrepant with rankings based on instructional efficiency, including growth per session and minutes of instruction. Converging with…

  8. Transitioning Technology to Naval Ships

    DTIC Science & Technology

    2010-06-18

    Doerry, Norbert

    [Only report-documentation-page and table-of-contents fragments were extracted for this record; recoverable headings include "6.3.3 Evaluation of IPS and NGIPS", "6.4 Set Based Design", and "7.2 Employ More Robust Metrics".]

  9. Green Net Value Added as a Sustainability Metric Based on Life Cycle Assessment: An Application to Bounty® Paper Towel

    EPA Science Inventory

    Sustainability measurement in economics involves evaluation of environmental and economic impact in an integrated manner. In this study, system level economic data are combined with environmental impact from a life cycle assessment (LCA) of a common product. We are exploring a co...

  10. Lessons for Broadening School Accountability under the Every Student Succeeds Act. Strategy Paper

    ERIC Educational Resources Information Center

    Schanzenbach, Diane Whitmore; Bauer, Lauren; Mumford, Megan

    2016-01-01

    A quality education that promotes learning among all students is a prerequisite for an economy that increases opportunity, prosperity, and growth. School accountability policies, in which school performance is evaluated based on identified metrics, have developed over the past few decades as a strategy central to assessing and achieving progress…

  11. Evaluating motion processing algorithms for use with functional near-infrared spectroscopy data from young children.

    PubMed

    Delgado Reyes, Lourdes M; Bohache, Kevin; Wijeakumar, Sobanawartiny; Spencer, John P

    2018-04-01

    Motion artifacts are often a significant component of the measured signal in functional near-infrared spectroscopy (fNIRS) experiments. A variety of methods have been proposed to address this issue, including principal components analysis (PCA), correlation-based signal improvement (CBSI), wavelet filtering, and spline interpolation. The efficacy of these techniques has been compared using simulated data; however, our understanding of how these techniques fare when dealing with task-based cognitive data is limited. Brigadoi et al. compared motion correction techniques in a sample of adult data measured during a simple cognitive task. Wavelet filtering showed the most promise as an optimal technique for motion correction. Given that fNIRS is often used with infants and young children, it is critical to evaluate the effectiveness of motion correction techniques directly with data from these age groups. This study addresses that problem by evaluating motion correction algorithms implemented in HOMER2. The efficacy of each technique was compared quantitatively using objective metrics related to the physiological properties of the hemodynamic response. Results showed that targeted PCA (tPCA), spline, and CBSI retained a higher number of trials. These techniques also performed well in direct head-to-head comparisons with the other approaches using quantitative metrics. The CBSI method corrected many of the artifacts present in our data; however, this approach sometimes produced unstable hemodynamic response functions (HRFs). The targeted PCA and spline methods proved to be the most robust, performing well across all comparison metrics. When compared head to head, tPCA consistently outperformed spline. We conclude, therefore, that tPCA is an effective technique for correcting motion artifacts in fNIRS data from young children.
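
Of the techniques compared, CBSI has a particularly compact form (after Cui et al., 2010): assuming oxy- and deoxy-haemoglobin are intrinsically anti-correlated, the correlated component shared by both channels is treated as motion and removed. The signals below are synthetic toys, not fNIRS data.

```python
import numpy as np

def cbsi(hbo, hbr):
    """Correlation-based signal improvement (sketch of Cui et al., 2010).

    Assumes HbO and HbR are anti-correlated, so their shared (correlated)
    component is motion: hbo_corrected = (hbo - alpha*hbr)/2 with
    alpha = std(hbo)/std(hbr), and hbr_corrected = -hbo_corrected/alpha.
    """
    alpha = np.std(hbo) / np.std(hbr)
    hbo_c = (hbo - alpha * hbr) / 2.0
    return hbo_c, -hbo_c / alpha

# Toy signals: anti-correlated haemodynamics plus a shared motion spike
t = np.linspace(0, 10, 500)
motion = np.where((t > 4) & (t < 4.2), 5.0, 0.0)
hbo = np.sin(t) + motion
hbr = -np.sin(t) + motion
hbo_c, hbr_c = cbsi(hbo, hbr)   # the shared spike is largely removed
```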

  12. Application of online measures to monitor and evaluate multiplatform fusion performance

    NASA Astrophysics Data System (ADS)

    Stubberud, Stephen C.; Kowalski, Charlene; Klamer, Dale M.

    1999-07-01

    A primary concern of multiplatform data fusion is assessing the quality and utility of data shared among platforms. Constraints such as platform and sensor capability and task load necessitate development of an on-line system that computes a metric to determine which other platform can provide the best data for processing. To determine data quality, we are implementing an approach based on entropy coupled with intelligent agents. Entropy measures quality of processed information such as localization, classification, and ambiguity in measurement-to-track association. Lower entropy scores imply less uncertainty about a particular target. When new information is provided, we compute the level of improvement a particular track obtains from one measurement to another. The measure permits us to evaluate the utility of the new information. We couple entropy with intelligent agents that provide two main data gathering functions: estimation of another platform's performance and evaluation of the new measurement data's quality. Both functions result from the entropy metric. The intelligent agent on a platform makes an estimate of another platform's measurement and provides it to its own fusion system, which can then incorporate it, for a particular target. A resulting entropy measure is then calculated and returned to its own agent. From this metric, the agent determines a perceived value of the offboard platform's measurement. If the value is satisfactory, the agent requests the measurement from the other platform, usually by interacting with the other platform's agent. Once the actual measurement is received, entropy is computed again and the agent assesses its estimation process and refines it accordingly.
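
The entropy metric at the core of this scheme is Shannon entropy over the fusion system's classification beliefs; the information gain of a candidate measurement is the entropy reduction it produces. The before/after distributions below are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a discrete classification distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical target-classification beliefs before and after fusing an
# offboard measurement: lower entropy means less ambiguity about the target.
before = [0.40, 0.30, 0.20, 0.10]
after = [0.85, 0.10, 0.03, 0.02]
gain = entropy(before) - entropy(after)
accept = gain > 0   # request the measurement only if it reduces uncertainty
```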

  13. Evaluation of the Absolute Regional Temperature Potential

    NASA Technical Reports Server (NTRS)

    Shindell, D. T.

    2012-01-01

    The Absolute Regional Temperature Potential (ARTP) is one of the few climate metrics that provides estimates of impacts at a sub-global scale. The ARTP presented here gives the time-dependent temperature response in four latitude bands (90-28degS, 28degS-28degN, 28-60degN and 60-90degN) as a function of emissions based on the forcing in those bands caused by the emissions. It is based on a large set of simulations performed with a single atmosphere-ocean climate model to derive regional forcing/response relationships. Here I evaluate the robustness of those relationships using the forcing/response portion of the ARTP to estimate regional temperature responses to the historic aerosol forcing in three independent climate models. These ARTP results are in good accord with the actual responses in those models. Nearly all ARTP estimates fall within +/-20% of the actual responses, though there are some exceptions for 90-28degS and the Arctic, and in the latter the ARTP may vary with forcing agent. However, for the tropics and the Northern Hemisphere mid-latitudes in particular, the +/-20% range appears to be roughly consistent with the 95% confidence interval. Land areas within these two bands respond 39-45% and 9-39% more than the latitude band as a whole. The ARTP, presented here in a slightly revised form, thus appears to provide a relatively robust estimate for the responses of large-scale latitude bands and land areas within those bands to inhomogeneous radiative forcing and thus potentially to emissions as well. Hence this metric could allow rapid evaluation of the effects of emissions policies at a finer scale than global metrics without requiring use of a full climate model.
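
The forcing/response structure of the ARTP amounts to a band-by-band linear map from forcing to temperature response. The coefficient matrix and forcing vector below are illustrative placeholders only, not the published values.

```python
import numpy as np

# ARTP-style estimate: the temperature response in each latitude band is a
# weighted sum of the forcing applied in every band. Coefficients are
# made-up placeholders (K per W m^-2); rows are the responding band,
# columns the band where the forcing is applied.
bands = ["90-28S", "28S-28N", "28-60N", "60-90N"]
response_per_forcing = np.array([
    [0.6, 0.2, 0.1, 0.0],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.3, 0.6, 0.3],
    [0.0, 0.2, 0.5, 0.8],   # the Arctic responds strongly to NH forcing
])
forcing = np.array([0.0, -0.2, -0.5, -0.3])  # hypothetical aerosol forcing
dT = response_per_forcing @ forcing          # temperature response per band
```

Note how even with zero local forcing, the southern band cools slightly through the off-diagonal terms, illustrating why a regional metric must account for remote forcing.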

  14. SeeDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics

    PubMed Central

    Vartak, Manasi; Rahman, Sajjadur; Madden, Samuel; Parameswaran, Aditya; Polyzotis, Neoklis

    2015-01-01

    Data analysts often build visualizations as the first step in their analytical workflow. However, when working with high-dimensional datasets, identifying visualizations that show relevant or desired trends in data can be laborious. We propose SeeDB, a visualization recommendation engine to facilitate fast visual analysis: given a subset of data to be studied, SeeDB intelligently explores the space of visualizations, evaluates promising visualizations for trends, and recommends those it deems most “useful” or “interesting”. The two major obstacles in recommending interesting visualizations are (a) scale: evaluating a large number of candidate visualizations while responding within interactive time scales, and (b) utility: identifying an appropriate metric for assessing interestingness of visualizations. For the former, SeeDB introduces pruning optimizations to quickly identify high-utility visualizations and sharing optimizations to maximize sharing of computation across visualizations. For the latter, as a first step, we adopt a deviation-based metric for visualization utility, while indicating how we may be able to generalize it to other factors influencing utility. We implement SeeDB as a middleware layer that can run on top of any DBMS. Our experiments show that our framework can identify interesting visualizations with high accuracy. Our optimizations lead to multiple orders of magnitude speedup on relational row and column stores and provide recommendations at interactive time scales. Finally, we demonstrate via a user study the effectiveness of our deviation-based utility metric and the value of recommendations in supporting visual analytics. PMID:26779379
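
SeeDB's deviation-based utility compares the normalized distribution of an aggregate view computed on the target subset against the same view on the reference dataset. The sketch below uses KL divergence as one possible distance function, and all the aggregate counts are made up.

```python
import math

def deviation_utility(target_counts, reference_counts):
    """Deviation-based utility of a candidate view: the distance between the
    normalised aggregate distribution on the target subset and on the
    reference dataset. KL divergence (bits) is used here as one possible
    distance function; larger values mean a more 'interesting' view."""
    t_total, r_total = sum(target_counts), sum(reference_counts)
    utility = 0.0
    for t, r in zip(target_counts, reference_counts):
        p, q = t / t_total, r / r_total
        if p > 0 and q > 0:
            utility += p * math.log2(p / q)
    return utility

# Hypothetical sales-by-region aggregates
subset = [120, 30, 10, 40]     # the studied subset: heavily skewed
full = [100, 95, 105, 100]     # the whole table: nearly uniform
boring = [99, 101, 98, 102]    # a view that barely deviates from `full`

u_interesting = deviation_utility(subset, full)
u_boring = deviation_utility(boring, full)
```

A recommender would rank candidate views by this utility, returning the skewed view well ahead of the flat one.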

  15. SeeDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics.

    PubMed

    Vartak, Manasi; Rahman, Sajjadur; Madden, Samuel; Parameswaran, Aditya; Polyzotis, Neoklis

    2015-09-01

    Data analysts often build visualizations as the first step in their analytical workflow. However, when working with high-dimensional datasets, identifying visualizations that show relevant or desired trends in data can be laborious. We propose SeeDB, a visualization recommendation engine to facilitate fast visual analysis: given a subset of data to be studied, SeeDB intelligently explores the space of visualizations, evaluates promising visualizations for trends, and recommends those it deems most "useful" or "interesting". The two major obstacles in recommending interesting visualizations are (a) scale: evaluating a large number of candidate visualizations while responding within interactive time scales, and (b) utility: identifying an appropriate metric for assessing interestingness of visualizations. For the former, SeeDB introduces pruning optimizations to quickly identify high-utility visualizations and sharing optimizations to maximize sharing of computation across visualizations. For the latter, as a first step, we adopt a deviation-based metric for visualization utility, while indicating how we may be able to generalize it to other factors influencing utility. We implement SeeDB as a middleware layer that can run on top of any DBMS. Our experiments show that our framework can identify interesting visualizations with high accuracy. Our optimizations lead to multiple orders of magnitude speedup on relational row and column stores and provide recommendations at interactive time scales. Finally, we demonstrate via a user study the effectiveness of our deviation-based utility metric and the value of recommendations in supporting visual analytics.

  16. Benchmarking Diagnostic Algorithms on an Electrical Power System Testbed

    NASA Technical Reports Server (NTRS)

    Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Wright, Stephanie

    2009-01-01

    Diagnostic algorithms (DAs) are key to enabling automated health management. These algorithms are designed to detect and isolate anomalies of either a component or the whole system based on observations received from sensors. In recent years a wide range of algorithms, both model-based and data-driven, have been developed to increase autonomy and improve system reliability and affordability. However, the lack of support to perform systematic benchmarking of these algorithms continues to create barriers for effective development and deployment of diagnostic technologies. In this paper, we present our efforts to benchmark a set of DAs on a common platform using a framework that was developed to evaluate and compare various performance metrics for diagnostic technologies. The diagnosed system is an electrical power system, namely the Advanced Diagnostics and Prognostics Testbed (ADAPT) developed and located at the NASA Ames Research Center. The paper presents the fundamentals of the benchmarking framework, the ADAPT system, description of faults and data sets, the metrics used for evaluation, and an in-depth analysis of benchmarking results obtained from testing ten diagnostic algorithms on the ADAPT electrical power system testbed.
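
A benchmarking framework of this kind reduces each scenario run to a few summary metrics. The sketch below computes detection rate, false-alarm rate, and mean detection latency from invented scenario outcomes; the real framework tracks many more metrics (isolation accuracy, computational cost, etc.).

```python
def benchmark(scenarios):
    """Summarise a diagnostic algorithm's performance over scenario runs.

    scenarios: list of (fault_injected, fault_detected, detect_time) tuples.
    Returns (detection_rate, false_alarm_rate, mean_detection_latency).
    """
    faults = [s for s in scenarios if s[0]]        # runs with injected faults
    nominal = [s for s in scenarios if not s[0]]   # fault-free runs
    detected = [s for s in faults if s[1]]
    detection_rate = len(detected) / len(faults)
    false_alarm_rate = sum(1 for s in nominal if s[1]) / len(nominal)
    mean_latency = sum(s[2] for s in detected) / len(detected)
    return detection_rate, false_alarm_rate, mean_latency

runs = [  # hypothetical ADAPT-style scenario outcomes (latency in seconds)
    (True, True, 2.1), (True, True, 3.4), (True, False, None),
    (False, False, None), (False, True, 0.0), (False, False, None),
]
dr, far, lat = benchmark(runs)
```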

  17. Rage against the Machine: Evaluation Metrics in the 21st Century

    ERIC Educational Resources Information Center

    Yang, Charles

    2017-01-01

    I review the classic literature in generative grammar and Marr's three-level program for cognitive science to defend the Evaluation Metric as a psychological theory of language learning. Focusing on well-established facts of language variation, change, and use, I argue that optimal statistical principles embodied in Bayesian inference models are…

  18. Can Multifactor Models of Teaching Improve Teacher Effectiveness Measures?

    ERIC Educational Resources Information Center

    Lazarev, Valeriy; Newman, Denis

    2014-01-01

    NCLB waiver requirements have led to development of teacher evaluation systems, in which student growth is a significant component. Recent empirical research has been focusing on metrics of student growth--value-added scores in particular--and their relationship to other metrics. An extensive set of recent teacher-evaluation studies conducted by…

  19. Vehicle Integrated Prognostic Reasoner (VIPR) Metric Report

    NASA Technical Reports Server (NTRS)

    Cornhill, Dennis; Bharadwaj, Raj; Mylaraswamy, Dinkar

    2013-01-01

    This document outlines a set of metrics for evaluating the diagnostic and prognostic schemes developed for the Vehicle Integrated Prognostic Reasoner (VIPR), a system-level reasoner that encompasses the multiple levels of large, complex systems such as those for aircraft and spacecraft. VIPR health managers are organized hierarchically and operate together to derive diagnostic and prognostic inferences from symptoms and conditions reported by a set of diagnostic and prognostic monitors. For layered reasoners such as VIPR, the overall performance cannot be evaluated by metrics solely directed toward timely detection and accuracy of estimation of the faults in individual components. Among other factors, overall vehicle reasoner performance is governed by the effectiveness of the communication schemes between monitors and reasoners in the architecture, and the ability to propagate and fuse relevant information to make accurate, consistent, and timely predictions at different levels of the reasoner hierarchy. We outline an extended set of diagnostic and prognostics metrics that can be broadly categorized as evaluation measures for diagnostic coverage, prognostic coverage, accuracy of inferences, latency in making inferences, computational cost, and sensitivity to different fault and degradation conditions. We report metrics from Monte Carlo experiments using two variations of an aircraft reference model that supported both flat and hierarchical reasoning.

  20. The importance of metrics for evaluating scientific performance

    NASA Astrophysics Data System (ADS)

    Miyakawa, Tsuyoshi

    Evaluation of scientific performance is a major factor that determines the behavior of both individual researchers and the academic institutes to which they belong. Because the number of researchers heavily outweighs the number of available research posts, and the competitive funding accounts for an ever-increasing proportion of research budget, some objective indicators of research performance have gained recognition for increasing transparency and openness. It is common practice to use metrics and indices to evaluate a researcher's performance or the quality of their grant applications. Such measures include the number of publications, the number of times these papers are cited and, more recently, the h-index, which measures the number of highly-cited papers the researcher has written. However, academic institutions and funding agencies in Japan have been rather slow to adopt such metrics. In this article, I will outline some of the currently available metrics, and discuss why we need to use such objective indicators of research performance more often in Japan. I will also discuss how to promote the use of metrics and what we should keep in mind when using them, as well as their potential impact on the research community in Japan.
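
The h-index mentioned above has a precise statement: the largest h such that h of the researcher's papers each have at least h citations. A minimal sketch with hypothetical citation counts:

```python
def h_index(citations):
    """h-index: the largest h such that the author has h papers
    with at least h citations each."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i          # the i-th best paper still has >= i citations
        else:
            break
    return h

h = h_index([10, 8, 5, 4, 3])  # -> 4 (four papers with >= 4 citations)
```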
