benchmark evaluation project: Topics by Science.gov

Sample records for benchmark evaluation project

Benchmarking and validation activities within JEFF project

NASA Astrophysics Data System (ADS)

Cabellos, O.; Alvarez-Velarde, F.; Angelone, M.; Diez, C. J.; Dyrda, J.; Fiorito, L.; Fischer, U.; Fleming, M.; Haeck, W.; Hill, I.; Ichou, R.; Kim, D. H.; Klix, A.; Kodeli, I.; Leconte, P.; Michel-Sendis, F.; Nunnenmann, E.; Pecchia, M.; Peneliau, Y.; Plompen, A.; Rochman, D.; Romojaro, P.; Stankovskiy, A.; Sublet, J. Ch.; Tamagno, P.; Marck, S. van der

2017-09-01

The challenge for any nuclear data evaluation project is to periodically release a revised, fully consistent and complete library, with all needed data and covariances, and ensure that it is robust and reliable for a variety of applications. Within an evaluation effort, benchmarking activities play an important role in validating proposed libraries. The Joint Evaluated Fission and Fusion (JEFF) Project aims to provide such a nuclear data library, and thus, requires a coherent and efficient benchmarking process. The aim of this paper is to present the activities carried out by the new JEFF Benchmarking and Validation Working Group, and to describe the role of the NEA Data Bank in this context. The paper will also review the status of preliminary benchmarking for the next JEFF-3.3 candidate cross-section files.
Providing Nuclear Criticality Safety Analysis Education through Benchmark Experiment Evaluation

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; David W. Nigg

2009-11-01

One of the challenges that today's new workforce of nuclear criticality safety engineers face is the opportunity to provide assessment of nuclear systems and establish safety guidelines without having received significant experience or hands-on training prior to graduation. Participation in the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and/or the International Reactor Physics Experiment Evaluation Project (IRPhEP) provides students and young professionals the opportunity to gain experience and enhance critical engineering skills.
IT-benchmarking of clinical workflows: concept, implementation, and evaluation.

PubMed

Thye, Johannes; Straede, Matthias-Christopher; Liebe, Jan-David; Hübner, Ursula

2014-01-01

Due to the emerging evidence of health IT as opportunity and risk for clinical workflows, health IT must undergo a continuous measurement of its efficacy and efficiency. IT-benchmarks are a proven means for providing this information. The aim of this study was to enhance the methodology of an existing benchmarking procedure by including, in particular, new indicators of clinical workflows and by proposing new types of visualisation. Drawing on the concept of information logistics, we propose four workflow descriptors that were applied to four clinical processes. General and specific indicators were derived from these descriptors and processes. 199 chief information officers (CIOs) took part in the benchmarking. These hospitals were assigned to reference groups of a similar size and ownership from a total of 259 hospitals. Stepwise and comprehensive feedback was given to the CIOs. Most participants who evaluated the benchmark rated the procedure as very good, good, or rather good (98.4%). Benchmark information was used by CIOs for getting a general overview, advancing IT, preparing negotiations with board members, and arguing for a new IT project.
Evaluation of control strategies using an oxidation ditch benchmark.

PubMed

Abusam, A; Keesman, K J; Spanjers, H; van, Straten G; Meinema, K

2002-01-01

This paper presents validation and implementation results of a benchmark developed for a specific full-scale oxidation ditch wastewater treatment plant. A benchmark is a standard simulation procedure that can be used as a tool in evaluating various control strategies proposed for wastewater treatment plants. It is based on model and performance criteria development. Testing of this benchmark, by comparing benchmark predictions to real measurements of the electrical energy consumptions and amounts of disposed sludge for a specific oxidation ditch WWTP, has shown that it can (reasonably) be used for evaluating the performance of this WWTP. Subsequently, the validated benchmark was then used in evaluating some basic and advanced control strategies. Some of the interesting results obtained are the following: (i) influent flow splitting ratio, between the first and the fourth aerated compartments of the ditch, has no significant effect on the TN concentrations in the effluent, and (ii) for evaluation of long-term control strategies, future benchmarks need to be able to assess settlers' performance.
Model evaluation using a community benchmarking system for land surface models

NASA Astrophysics Data System (ADS)

Mu, M.; Hoffman, F. M.; Lawrence, D. M.; Riley, W. J.; Keppel-Aleks, G.; Kluzek, E. B.; Koven, C. D.; Randerson, J. T.

2014-12-01

Evaluation of atmosphere, ocean, sea ice, and land surface models is an important step in identifying deficiencies in Earth system models and developing improved estimates of future change. For the land surface and carbon cycle, the design of an open-source system has been an important objective of the International Land Model Benchmarking (ILAMB) project. Here we evaluated CMIP5 and CLM models using a benchmarking system that enables users to specify models, data sets, and scoring systems so that results can be tailored to specific model intercomparison projects. Our scoring system used information from four different aspects of global datasets, including climatological mean spatial patterns, seasonal cycle dynamics, interannual variability, and long-term trends. Variable-to-variable comparisons enable investigation of the mechanistic underpinnings of model behavior, and allow for some control of biases in model drivers. Graphics modules allow users to evaluate model performance at local, regional, and global scales. Use of modular structures makes it relatively easy for users to add new variables, diagnostic metrics, benchmarking datasets, or model simulations. Diagnostic results are automatically organized into HTML files, so users can conveniently share results with colleagues. We used this system to evaluate atmospheric carbon dioxide, burned area, global biomass and soil carbon stocks, net ecosystem exchange, gross primary production, ecosystem respiration, terrestrial water storage, evapotranspiration, and surface radiation from CMIP5 historical and ESM historical simulations. We found that the multi-model mean often performed better than many of the individual models for most variables. We plan to publicly release a stable version of the software during fall of 2014 that has land surface, carbon cycle, hydrology, radiation and energy cycle components.
A Better Benchmark Assessment: Multiple-Choice versus Project-Based

ERIC Educational Resources Information Center

Peariso, Jamon F.

2006-01-01

The purpose of this literature review and Ex Post Facto descriptive study was to determine which type of benchmark assessment, multiple-choice or project-based, provides the best indication of general success on the history portion of the CST (California Standards Tests). The result of the study indicates that although the project-based benchmark…
Limitations of Community College Benchmarking and Benchmarks

ERIC Educational Resources Information Center

Bers, Trudy H.

2006-01-01

This chapter distinguishes between benchmarks and benchmarking, describes a number of data and cultural limitations to benchmarking projects, and suggests that external demands for accountability are the dominant reason for growing interest in benchmarking among community colleges.
GROWTH OF THE INTERNATIONAL CRITICALITY SAFETY AND REACTOR PHYSICS EXPERIMENT EVALUATION PROJECTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. Blair Briggs; John D. Bess; Jim Gulliford

2011-09-01

Since the International Conference on Nuclear Criticality Safety (ICNC) 2007, the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and the International Reactor Physics Experiment Evaluation Project (IRPhEP) have continued to expand their efforts and broaden their scope. Eighteen countries participated on the ICSBEP in 2007. Now, there are 20, with recent contributions from Sweden and Argentina. The IRPhEP has also expanded from eight contributing countries in 2007 to 16 in 2011. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Criticality Safety Benchmark Experiments1' have increased from 442 evaluations (38000 pages), containing benchmark specifications for 3955 critical ormore » subcritical configurations to 516 evaluations (nearly 55000 pages), containing benchmark specifications for 4405 critical or subcritical configurations in the 2010 Edition of the ICSBEP Handbook. The contents of the Handbook have also increased from 21 to 24 criticality-alarm-placement/shielding configurations with multiple dose points for each, and from 20 to 200 configurations categorized as fundamental physics measurements relevant to criticality safety applications. Approximately 25 new evaluations and 150 additional configurations are expected to be added to the 2011 edition of the Handbook. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Reactor Physics Benchmark Experiments2' have increased from 16 different experimental series that were performed at 12 different reactor facilities to 53 experimental series that were performed at 30 different reactor facilities in the 2011 edition of the Handbook. Considerable effort has also been made to improve the functionality of the searchable database, DICE (Database for the International Criticality Benchmark Evaluation Project) and verify the accuracy of the data contained therein. DICE will be discussed in separate papers at ICNC 2011. The status of the ICSBEP and
Benchmark Evaluation of HTR-PROTEUS Pebble Bed Experimental Program

DOE PAGES

Bess, John D.; Montierth, Leland; Köberl, Oliver; ...

2014-10-09

Benchmark models were developed to evaluate 11 critical core configurations of the HTR-PROTEUS pebble bed experimental program. Various additional reactor physics measurements were performed as part of this program; currently only a total of 37 absorber rod worth measurements have been evaluated as acceptable benchmark experiments for Cores 4, 9, and 10. Dominant uncertainties in the experimental keff for all core configurations come from uncertainties in the ²³⁵U enrichment of the fuel, impurities in the moderator pebbles, and the density and impurity content of the radial reflector. Calculations of k eff with MCNP5 and ENDF/B-VII.0 neutron nuclear data aremore » greater than the benchmark values but within 1% and also within the 3σ uncertainty, except for Core 4, which is the only randomly packed pebble configuration. Repeated calculations of k eff with MCNP6.1 and ENDF/B-VII.1 are lower than the benchmark values and within 1% (~3σ) except for Cores 5 and 9, which calculate lower than the benchmark eigenvalues within 4σ. The primary difference between the two nuclear data libraries is the adjustment of the absorption cross section of graphite. Simulations of the absorber rod worth measurements are within 3σ of the benchmark experiment values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.« less
Benchmarks for Evaluation of Distributed Denial of Service (DDOS)

DTIC Science & Technology

2008-01-01

publications: [1] E. Arikan , Attack Profiling for DDoS Benchmarks, MS Thesis, University of Delaware, August 2006. [2] J. Mirkovic, A. Hussain, B. Wilson...Sigmetrics 2007, June 2007 [5] J. Mirkovic, E. Arikan , S. Wei, S. Fahmy, R. Thomas, and P. Reiher Benchmarks for DDoS Defense Evaluation, Proceedings of the...Security Experimentation, June 2006. [9] J. Mirkovic, E. Arikan , S. Wei, S. Fahmy, R. Thomas, P. Reiher, Benchmarks for DDoS Defense Evaluation
Quality in E-Learning--A Conceptual Framework Based on Experiences from Three International Benchmarking Projects

ERIC Educational Resources Information Center

Ossiannilsson, E.; Landgren, L.

2012-01-01

Between 2008 and 2010, Lund University took part in three international benchmarking projects, "E-xcellence+," the "eLearning Benchmarking Exercise 2009," and the "First Dual-Mode Distance Learning Benchmarking Club." A comparison of these models revealed a rather high level of correspondence. From this finding and…
OWL2 benchmarking for the evaluation of knowledge based systems.

PubMed

Khan, Sher Afgun; Qadir, Muhammad Abdul; Abbas, Muhammad Azeem; Afzal, Muhammad Tanvir

2017-01-01

OWL2 semantics are becoming increasingly popular for the real domain applications like Gene engineering and health MIS. The present work identifies the research gap that negligible attention has been paid to the performance evaluation of Knowledge Base Systems (KBS) using OWL2 semantics. To fulfil this identified research gap, an OWL2 benchmark for the evaluation of KBS is proposed. The proposed benchmark addresses the foundational blocks of an ontology benchmark i.e. data schema, workload and performance metrics. The proposed benchmark is tested on memory based, file based, relational database and graph based KBS for performance and scalability measures. The results show that the proposed benchmark is able to evaluate the behaviour of different state of the art KBS on OWL2 semantics. On the basis of the results, the end users (i.e. domain expert) would be able to select a suitable KBS appropriate for his domain.
Using Web-Based Peer Benchmarking to Manage the Client-Based Project

ERIC Educational Resources Information Center

Raska, David; Keller, Eileen Weisenbach; Shaw, Doris

2013-01-01

The complexities of integrating client-based projects into marketing courses provide challenges for the instructor but produce richness of context and active learning for the student. This paper explains the integration of Web-based peer benchmarking as a means of improving student performance on client-based projects within a single semester in…
[Potential Benchmarks for Successful Interdisciplinary Collaboration Projects in Germany: A Systematic Review].

PubMed

Weißenborn, Marina; Schulz, Martin; Kraft, Manuel; Haefeli, Walter E; Seidling, Hanna M

2018-06-21

Collaboration between general practitioners and community pharmacists is essential to ensure safe and effective patient care. However, collaboration in primary care is not standardized and varies greatly. This review aims to highlight projects about professional collaboration in ambulatory care in Germany and identifies promising approaches and successful benchmarks that should be considered for future projects. A systematic literature search was performed based on the PRISMA guidelines to identify articles focusing on professional collaboration between general practitioners and pharmacists. A total of 542 articles were retrieved. Six potential premises for successful cooperation projects were identified: GP and CP knowing each other (I), involvement of both health care providers in the project planning (II), sharing of experience or concerns during regular joint meetings enabling continuing evaluation and adaption (III), ensuring (technical) feasibility (IV), particularly by providing incentives (V), and by integrating these projects into existing health care structures (VI). Only few studies have been published in scientific journals. There was no standardized assessment of how the participants perceived their collaboration and how it facilitates their daily work, even when the study aimed to evaluate GP-CP collaboration. Successful cooperation between GP and CP in daily routine care was often characterized by personal contact and longtime relationships. Therefore, collaborative teaching sessions at university might establish sympathy and mutual understanding right from the beginning. There is a strong need to establish standardized tools to evaluate collaboration in future projects and to enable comparability of different studies. © Georg Thieme Verlag KG Stuttgart · New York.
Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias;

2006-01-01

The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray XI, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks are run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.

Benchmarking Evaluation Results for Prototype Extravehicular Activity Gloves

NASA Technical Reports Server (NTRS)

Aitchison, Lindsay; McFarland, Shane

2012-01-01

The Space Suit Assembly (SSA) Development Team at NASA Johnson Space Center has invested heavily in the advancement of rear-entry planetary exploration suit design but largely deferred development of extravehicular activity (EVA) glove designs, and accepted the risk of using the current flight gloves, Phase VI, for unique mission scenarios outside the Space Shuttle and International Space Station (ISS) Program realm of experience. However, as design reference missions mature, the risks of using heritage hardware have highlighted the need for developing robust new glove technologies. To address the technology gap, the NASA Game-Changing Technology group provided start-up funding for the High Performance EVA Glove (HPEG) Project in the spring of 2012. The overarching goal of the HPEG Project is to develop a robust glove design that increases human performance during EVA and creates pathway for future implementation of emergent technologies, with specific aims of increasing pressurized mobility to 60% of barehanded capability, increasing the durability by 100%, and decreasing the potential of gloves to cause injury during use. The HPEG Project focused initial efforts on identifying potential new technologies and benchmarking the performance of current state of the art gloves to identify trends in design and fit leading to establish standards and metrics against which emerging technologies can be assessed at both the component and assembly levels. The first of the benchmarking tests evaluated the quantitative mobility performance and subjective fit of four prototype gloves developed by Flagsuit LLC, Final Frontier Designs, LLC Dover, and David Clark Company as compared to the Phase VI. All of the companies were asked to design and fabricate gloves to the same set of NASA provided hand measurements (which corresponded to a single size of Phase Vi glove) and focus their efforts on improving mobility in the metacarpal phalangeal and carpometacarpal joints. Four test
EBR-II Reactor Physics Benchmark Evaluation Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pope, Chad L.; Lum, Edward S; Stewart, Ryan

This report provides a reactor physics benchmark evaluation with associated uncertainty quantification for the critical configuration of the April 1986 Experimental Breeder Reactor II Run 138B core configuration.
Global Gridded Crop Model Evaluation: Benchmarking, Skills, Deficiencies and Implications.

NASA Technical Reports Server (NTRS)

Muller, Christoph; Elliott, Joshua; Chryssanthacopoulos, James; Arneth, Almut; Balkovic, Juraj; Ciais, Philippe; Deryng, Delphine; Folberth, Christian; Glotter, Michael; Hoek, Steven;

2017-01-01

Crop models are increasingly used to simulate crop yields at the global scale, but so far there is no general framework on how to assess model performance. Here we evaluate the simulation results of 14 global gridded crop modeling groups that have contributed historic crop yield simulations for maize, wheat, rice and soybean to the Global Gridded Crop Model Intercomparison (GGCMI) of the Agricultural Model Intercomparison and Improvement Project (AgMIP). Simulation results are compared to reference data at global, national and grid cell scales and we evaluate model performance with respect to time series correlation, spatial correlation and mean bias. We find that global gridded crop models (GGCMs) show mixed skill in reproducing time series correlations or spatial patterns at the different spatial scales. Generally, maize, wheat and soybean simulations of many GGCMs are capable of reproducing larger parts of observed temporal variability (time series correlation coefficients (r) of up to 0.888 for maize, 0.673 for wheat and 0.643 for soybean at the global scale) but rice yield variability cannot be well reproduced by most models. Yield variability can be well reproduced for most major producing countries by many GGCMs and for all countries by at least some. A comparison with gridded yield data and a statistical analysis of the effects of weather variability on yield variability shows that the ensemble of GGCMs can explain more of the yield variability than an ensemble of regression models for maize and soybean, but not for wheat and rice. We identify future research needs in global gridded crop modeling and for all individual crop modeling groups. In the absence of a purely observation-based benchmark for model evaluation, we propose that the best performing crop model per crop and region establishes the benchmark for all others, and modelers are encouraged to investigate how crop model performance can be increased. We make our evaluation system accessible to all

Learning from Follow Up Surveys of Graduates: The Austin Teacher Program and the Benchmark Project. A Discussion Paper.

ERIC Educational Resources Information Center

Baker, Thomas E.

This paper describes Austin College's (Texas) participation in the Benchmark Project, a collaborative followup study of teacher education graduates and their principals, focusing on the second round of data collection. The Benchmark Project was a collaboration of 11 teacher preparation programs that gathered and analyzed data comparing graduates…
Benchmarking and Performance Measurement.

ERIC Educational Resources Information Center

Town, J. Stephen

This paper defines benchmarking and its relationship to quality management, describes a project which applied the technique in a library context, and explores the relationship between performance measurement and benchmarking. Numerous benchmarking methods contain similar elements: deciding what to benchmark; identifying partners; gathering…

ICSBEP Benchmarks For Nuclear Data Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Briggs, J. Blair

2005-05-24

The International Criticality Safety Benchmark Evaluation Project (ICSBEP) was initiated in 1992 by the United States Department of Energy. The ICSBEP became an official activity of the Organization for Economic Cooperation and Development (OECD) -- Nuclear Energy Agency (NEA) in 1995. Representatives from the United States, United Kingdom, France, Japan, the Russian Federation, Hungary, Republic of Korea, Slovenia, Serbia and Montenegro (formerly Yugoslavia), Kazakhstan, Spain, Israel, Brazil, Poland, and the Czech Republic are now participating. South Africa, India, China, and Germany are considering participation. The purpose of the ICSBEP is to identify, evaluate, verify, and formally document a comprehensive andmore » internationally peer-reviewed set of criticality safety benchmark data. The work of the ICSBEP is published as an OECD handbook entitled ''International Handbook of Evaluated Criticality Safety Benchmark Experiments.'' The 2004 Edition of the Handbook contains benchmark specifications for 3331 critical or subcritical configurations that are intended for use in validation efforts and for testing basic nuclear data. New to the 2004 Edition of the Handbook is a draft criticality alarm / shielding type benchmark that should be finalized in 2005 along with two other similar benchmarks. The Handbook is being used extensively for nuclear data testing and is expected to be a valuable resource for code and data validation and improvement efforts for decades to come. Specific benchmarks that are useful for testing structural materials such as iron, chromium, nickel, and manganese; beryllium; lead; thorium; and 238U are highlighted.« less
Performance Evaluation and Benchmarking of Next Intelligent Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

del Pobil, Angel; Madhavan, Raj; Bonsignorio, Fabio

Performance Evaluation and Benchmarking of Intelligent Systems presents research dedicated to the subject of performance evaluation and benchmarking of intelligent systems by drawing from the experiences and insights of leading experts gained both through theoretical development and practical implementation of intelligent systems in a variety of diverse application domains. This contributed volume offers a detailed and coherent picture of state-of-the-art, recent developments, and further research areas in intelligent systems. The chapters cover a broad range of applications, such as assistive robotics, planetary surveying, urban search and rescue, and line tracking for automotive assembly. Subsystems or components described in this bookmore » include human-robot interaction, multi-robot coordination, communications, perception, and mapping. Chapters are also devoted to simulation support and open source software for cognitive platforms, providing examples of the type of enabling underlying technologies that can help intelligent systems to propagate and increase in capabilities. Performance Evaluation and Benchmarking of Intelligent Systems serves as a professional reference for researchers and practitioners in the field. This book is also applicable to advanced courses for graduate level students and robotics professionals in a wide range of engineering and related disciplines including computer science, automotive, healthcare, manufacturing, and service robotics.« less
A benchmark for fault tolerant flight control evaluation

NASA Astrophysics Data System (ADS)

Smaili, H.; Breeman, J.; Lombaerts, T.; Stroosma, O.

2013-12-01

A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004 2008) for the integrated evaluation of fault detection and identification (FDI) and reconfigurable flight control strategies. The benchmark includes a suitable set of assessment criteria and failure cases, based on reconstructed accident scenarios, to assess the potential of new adaptive control strategies to improve aircraft survivability. The application of reconstruction and modeling techniques, based on accident flight data, has resulted in high-fidelity nonlinear aircraft and fault models to evaluate new Fault Tolerant Flight Control (FTFC) concepts and their real-time performance to accommodate in-flight failures.
INTEGRAL BENCHMARK DATA FOR NUCLEAR DATA TESTING THROUGH THE ICSBEP AND THE NEWLY ORGANIZED IRPHEP

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. Blair Briggs; Lori Scott; Yolanda Rugama

The status of the International Criticality Safety Benchmark Evaluation Project (ICSBEP) was last reported in a nuclear data conference at the International Conference on Nuclear Data for Science and Technology, ND-2004, in Santa Fe, New Mexico. Since that time the number and type of integral benchmarks have increased significantly. Included in the ICSBEP Handbook are criticality-alarm / shielding and fundamental physic benchmarks in addition to the traditional critical / subcritical benchmark data. Since ND 2004, a reactor physics counterpart to the ICSBEP, the International Reactor Physics Experiment Evaluation Project (IRPhEP) was initiated. The IRPhEP is patterned after the ICSBEP, butmore » focuses on other integral measurements, such as buckling, spectral characteristics, reactivity effects, reactivity coefficients, kinetics measurements, reaction-rate and power distributions, nuclide compositions, and other miscellaneous-type measurements in addition to the critical configuration. The status of these two projects is discussed and selected benchmarks highlighted in this paper.« less
Improving HEI Productivity and Performance through Project Management: Implications from a Benchmarking Case Study

ERIC Educational Resources Information Center

Bryde, David; Leighton, Diana

2009-01-01

As higher education institutions (HEIs) look to be more commercial in their outlook they are likely to become more dependent on the successful implementation of projects. This article reports a benchmarking survey of PM maturity in a HEI, with the purpose of assessing its capability to implement projects. Data were collected via questionnaires…
Implementation of Benchmarking Transportation Logistics Practices and Future Benchmarking Organizations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thrower, A.W.; Patric, J.; Keister, M.

2008-07-01

The purpose of the Office of Civilian Radioactive Waste Management's (OCRWM) Logistics Benchmarking Project is to identify established government and industry practices for the safe transportation of hazardous materials which can serve as a yardstick for design and operation of OCRWM's national transportation system for shipping spent nuclear fuel and high-level radioactive waste to the proposed repository at Yucca Mountain, Nevada. The project will present logistics and transportation practices and develop implementation recommendations for adaptation by the national transportation system. This paper will describe the process used to perform the initial benchmarking study, highlight interim findings, and explain how thesemore » findings are being implemented. It will also provide an overview of the next phase of benchmarking studies. The benchmarking effort will remain a high-priority activity throughout the planning and operational phases of the transportation system. The initial phase of the project focused on government transportation programs to identify those practices which are most clearly applicable to OCRWM. These Federal programs have decades of safe transportation experience, strive for excellence in operations, and implement effective stakeholder involvement, all of which parallel OCRWM's transportation mission and vision. The initial benchmarking project focused on four business processes that are critical to OCRWM's mission success, and can be incorporated into OCRWM planning and preparation in the near term. The processes examined were: transportation business model, contract management/out-sourcing, stakeholder relations, and contingency planning. More recently, OCRWM examined logistics operations of AREVA NC's Business Unit Logistics in France. The next phase of benchmarking will focus on integrated domestic and international commercial radioactive logistic operations. The prospective companies represent large scale shippers and have vast
Benchmarking specialty hospitals, a scoping review on theory and practice.

PubMed

Wind, A; van Harten, W H

2017-04-04

Although benchmarking may improve hospital processes, research on this subject is limited. The aim of this study was to provide an overview of publications on benchmarking in specialty hospitals and a description of study characteristics. We searched PubMed and EMBASE for articles published in English in the last 10 years. Eligible articles described a project stating benchmarking as its objective and involving a specialty hospital or specific patient category; or those dealing with the methodology or evaluation of benchmarking. Of 1,817 articles identified in total, 24 were included in the study. Articles were categorized into: pathway benchmarking, institutional benchmarking, articles on benchmark methodology or -evaluation and benchmarking using a patient registry. There was a large degree of variability:(1) study designs were mostly descriptive and retrospective; (2) not all studies generated and showed data in sufficient detail; and (3) there was variety in whether a benchmarking model was just described or if quality improvement as a consequence of the benchmark was reported upon. Most of the studies that described a benchmark model described the use of benchmarking partners from the same industry category, sometimes from all over the world. Benchmarking seems to be more developed in eye hospitals, emergency departments and oncology specialty hospitals. Some studies showed promising improvement effects. However, the majority of the articles lacked a structured design, and did not report on benchmark outcomes. In order to evaluate the effectiveness of benchmarking to improve quality in specialty hospitals, robust and structured designs are needed including a follow up to check whether the benchmark study has led to improvements.
Benchmarking Data Sets for the Evaluation of Virtual Ligand Screening Methods: Review and Perspectives.

PubMed

Lagarde, Nathalie; Zagury, Jean-François; Montes, Matthieu

2015-07-27

Virtual screening methods are commonly used nowadays in drug discovery processes. However, to ensure their reliability, they have to be carefully evaluated. The evaluation of these methods is often realized in a retrospective way, notably by studying the enrichment of benchmarking data sets. To this purpose, numerous benchmarking data sets were developed over the years, and the resulting improvements led to the availability of high quality benchmarking data sets. However, some points still have to be considered in the selection of the active compounds, decoys, and protein structures to obtain optimal benchmarking data sets.
Hospital benchmarking: are U.S. eye hospitals ready?

PubMed

de Korne, Dirk F; van Wijngaarden, Jeroen D H; Sol, Kees J C A; Betz, Robert; Thomas, Richard C; Schein, Oliver D; Klazinga, Niek S

2012-01-01

Benchmarking is increasingly considered a useful management instrument to improve quality in health care, but little is known about its applicability in hospital settings. The aims of this study were to assess the applicability of a benchmarking project in U.S. eye hospitals and compare the results with an international initiative. We evaluated multiple cases by applying an evaluation frame abstracted from the literature to five U.S. eye hospitals that used a set of 10 indicators for efficiency benchmarking. Qualitative analysis entailed 46 semistructured face-to-face interviews with stakeholders, document analyses, and questionnaires. The case studies only partially met the conditions of the evaluation frame. Although learning and quality improvement were stated as overall purposes, the benchmarking initiative was at first focused on efficiency only. No ophthalmic outcomes were included, and clinicians were skeptical about their reporting relevance and disclosure. However, in contrast with earlier findings in international eye hospitals, all U.S. hospitals worked with internal indicators that were integrated in their performance management systems and supported benchmarking. Benchmarking can support performance management in individual hospitals. Having a certain number of comparable institutes provide similar services in a noncompetitive milieu seems to lay fertile ground for benchmarking. International benchmarking is useful only when these conditions are not met nationally. Although the literature focuses on static conditions for effective benchmarking, our case studies show that it is a highly iterative and learning process. The journey of benchmarking seems to be more important than the destination. Improving patient value (health outcomes per unit of cost) requires, however, an integrative perspective where clinicians and administrators closely cooperate on both quality and efficiency issues. If these worlds do not share such a relationship, the added
Benchmarking FEniCS for mantle convection simulations

NASA Astrophysics Data System (ADS)

Vynnytska, L.; Rognes, M. E.; Clark, S. R.

2013-01-01

This paper evaluates the usability of the FEniCS Project for mantle convection simulations by numerical comparison to three established benchmarks. The benchmark problems all concern convection processes in an incompressible fluid induced by temperature or composition variations, and cover three cases: (i) steady-state convection with depth- and temperature-dependent viscosity, (ii) time-dependent convection with constant viscosity and internal heating, and (iii) a Rayleigh-Taylor instability. These problems are modeled by the Stokes equations for the fluid and advection-diffusion equations for the temperature and composition. The FEniCS Project provides a novel platform for the automated solution of differential equations by finite element methods. In particular, it offers a significant flexibility with regard to modeling and numerical discretization choices; we have here used a discontinuous Galerkin method for the numerical solution of the advection-diffusion equations. Our numerical results are in agreement with the benchmarks, and demonstrate the applicability of both the discontinuous Galerkin method and FEniCS for such applications.
Benchmark Evaluation of Start-Up and Zero-Power Measurements at the High-Temperature Engineering Test Reactor

DOE PAGES

Bess, John D.; Fujimoto, Nozomu

2014-10-09

Benchmark models were developed to evaluate six cold-critical and two warm-critical, zero-power measurements of the HTTR. Additional measurements of a fully-loaded subcritical configuration, core excess reactivity, shutdown margins, six isothermal temperature coefficients, and axial reaction-rate distributions were also evaluated as acceptable benchmark experiments. Insufficient information is publicly available to develop finely-detailed models of the HTTR as much of the design information is still proprietary. However, the uncertainties in the benchmark models are judged to be of sufficient magnitude to encompass any biases and bias uncertainties incurred through the simplification process used to develop the benchmark models. Dominant uncertainties in themore » experimental keff for all core configurations come from uncertainties in the impurity content of the various graphite blocks that comprise the HTTR. Monte Carlo calculations of keff are between approximately 0.9 % and 2.7 % greater than the benchmark values. Reevaluation of the HTTR models as additional information becomes available could improve the quality of this benchmark and possibly reduce the computational biases. High-quality characterization of graphite impurities would significantly improve the quality of the HTTR benchmark assessment. Simulation of the other reactor physics measurements are in good agreement with the benchmark experiment values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.« less
Benchmark Evaluation of the HTR-PROTEUS Absorber Rod Worths (Core 4)

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; Leland M. Montierth

2014-06-01

PROTEUS was a zero-power research reactor at the Paul Scherrer Institute (PSI) in Switzerland. The critical assembly was constructed from a large graphite annulus surrounding a central cylindrical cavity. Various experimental programs were investigated in PROTEUS; during the years 1992 through 1996, it was configured as a pebble-bed reactor and designated HTR-PROTEUS. Various critical configurations were assembled with each accompanied by an assortment of reactor physics experiments including differential and integral absorber rod measurements, kinetics, reaction rate distributions, water ingress effects, and small sample reactivity effects [1]. Four benchmark reports were previously prepared and included in the March 2013 editionmore » of the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook) [2] evaluating eleven critical configurations. A summary of that effort was previously provided [3] and an analysis of absorber rod worth measurements for Cores 9 and 10 have been performed prior to this analysis and included in PROTEUS-GCR-EXP-004 [4]. In the current benchmark effort, absorber rod worths measured for Core Configuration 4, which was the only core with a randomly-packed pebble loading, have been evaluated for inclusion as a revision to the HTR-PROTEUS benchmark report PROTEUS-GCR-EXP-002.« less
Benchmarking on the evaluation of major accident-related risk assessment.

PubMed

Fabbri, Luciano; Contini, Sergio

2009-03-15

This paper summarises the main results of a European project BEQUAR (Benchmarking Exercise in Quantitative Area Risk Assessment in Central and Eastern European Countries). This project is among the first attempts to explore how independent evaluations of the same risk study associated with a certain chemical establishment could differ from each other and the consequent effects on the resulting area risk estimate. The exercise specifically aimed at exploring the manner and degree to which independent experts may disagree on the interpretation of quantitative risk assessments for the same entity. The project first compared the results of a number of independent expert evaluations of a quantitative risk assessment study for the same reference chemical establishment. This effort was then followed by a study of the impact of the different interpretations on the estimate of the overall risk on the area concerned. In order to improve the inter-comparability of the results, this exercise was conducted using a single tool for area risk assessment based on the ARIPAR methodology. The results of this study are expected to contribute to an improved understanding of the inspection criteria and practices used by the different national authorities responsible for the implementation of the Seveso II Directive in their countries. The activity was funded under the Enlargement and Integration Action of the Joint Research Centre (JRC), that aims at providing scientific and technological support for promoting integration of the New Member States and assisting the Candidate Countries on their way towards accession to the European Union.
Thermal Performance Benchmarking: Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moreno, Gilbert

2016-04-08

The goal for this project is to thoroughly characterize the performance of state-of-the-art (SOA) automotive power electronics and electric motor thermal management systems. Information obtained from these studies will be used to: Evaluate advantages and disadvantages of different thermal management strategies; establish baseline metrics for the thermal management systems; identify methods of improvement to advance the SOA; increase the publicly available information related to automotive traction-drive thermal management systems; help guide future electric drive technologies (EDT) research and development (R&D) efforts. The performance results combined with component efficiency and heat generation information obtained by Oak Ridge National Laboratory (ORNL) maymore » then be used to determine the operating temperatures for the EDT components under drive-cycle conditions. In FY15, the 2012 Nissan LEAF power electronics and electric motor thermal management systems were benchmarked. Testing of the 2014 Honda Accord Hybrid power electronics thermal management system started in FY15; however, due to time constraints it was not possible to include results for this system in this report. The focus of this project is to benchmark the thermal aspects of the systems. ORNL's benchmarking of electric and hybrid electric vehicle technology reports provide detailed descriptions of the electrical and packaging aspects of these automotive systems.« less
Benchmarking reference services: step by step.

PubMed

Buchanan, H S; Marshall, J G

1996-01-01

This article is a companion to an introductory article on benchmarking published in an earlier issue of Medical Reference Services Quarterly. Librarians interested in benchmarking often ask the following questions: How do I determine what to benchmark; how do I form a benchmarking team; how do I identify benchmarking partners; what's the best way to collect and analyze benchmarking information; and what will I do with the data? Careful planning is a critical success factor of any benchmarking project, and these questions must be answered before embarking on a benchmarking study. This article summarizes the steps necessary to conduct benchmarking research. Relevant examples of each benchmarking step are provided.
Benchmark Testing of a New 56Fe Evaluation for Criticality Safety Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leal, Luiz C; Ivanov, E.

2015-01-01

The SAMMY code was used to evaluate resonance parameters of the 56Fe cross section in the resolved resonance energy range of 0–2 MeV using transmission data, capture, elastic, inelastic, and double differential elastic cross sections. The resonance analysis was performed with the code SAMMY that fits R-matrix resonance parameters using the generalized least-squares technique (Bayes’ theory). The evaluation yielded a set of resonance parameters that reproduced the experimental data very well, along with a resonance parameter covariance matrix for data uncertainty calculations. Benchmark tests were conducted to assess the evaluation performance in benchmark calculations.
Benchmarking forensic mental health organizations.

PubMed

Coombs, Tim; Taylor, Monica; Pirkis, Jane

2011-04-01

This paper describes the forensic mental health forums that were conducted as part of the National Mental Health Benchmarking Project (NMHBP). These forums encouraged participating organizations to compare their performance on a range of key performance indicators (KPIs) with that of their peers. Four forensic mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against previously agreed KPIs. They also undertook three special projects which explored some of the factors that might explain inter-organizational variation in performance. The inter-organizational range for many of the indicators was substantial. Observing this led participants to conduct the special projects to explore three factors which might help explain the variability - seclusion practices, delivery of community mental health services, and provision of court liaison services. The process of conducting the special projects gave participants insights into the practices and structures employed by their counterparts, and provided them with some important lessons for quality improvement. The forensic mental health benchmarking forums have demonstrated that benchmarking is feasible and likely to be useful in improving service performance and quality.
Overview of the 2014 Edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook)

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; Jim Gulliford

2014-10-01

The International Reactor Physics Experiment Evaluation Project (IRPhEP) is a widely recognized world class program. The work of the IRPhEP is documented in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook). Integral data from the IRPhEP Handbook is used by reactor safety and design, nuclear data, criticality safety, and analytical methods development specialists, worldwide, to perform necessary validations of their calculational techniques. The IRPhEP Handbook is among the most frequently quoted reference in the nuclear industry and is expected to be a valuable resource for future decades.
Performance Evaluation and Benchmarking of Intelligent Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madhavan, Raj; Messina, Elena; Tunstel, Edward

To design and develop capable, dependable, and affordable intelligent systems, their performance must be measurable. Scientific methodologies for standardization and benchmarking are crucial for quantitatively evaluating the performance of emerging robotic and intelligent systems technologies. There is currently no accepted standard for quantitatively measuring the performance of these systems against user-defined requirements; and furthermore, there is no consensus on what objective evaluation procedures need to be followed to understand the performance of these systems. The lack of reproducible and repeatable test methods has precluded researchers working towards a common goal from exchanging and communicating results, inter-comparing system performance, and leveragingmore » previous work that could otherwise avoid duplication and expedite technology transfer. Currently, this lack of cohesion in the community hinders progress in many domains, such as manufacturing, service, healthcare, and security. By providing the research community with access to standardized tools, reference data sets, and open source libraries of solutions, researchers and consumers will be able to evaluate the cost and benefits associated with intelligent systems and associated technologies. In this vein, the edited book volume addresses performance evaluation and metrics for intelligent systems, in general, while emphasizing the need and solutions for standardized methods. To the knowledge of the editors, there is not a single book on the market that is solely dedicated to the subject of performance evaluation and benchmarking of intelligent systems. Even books that address this topic do so only marginally or are out of date. The research work presented in this volume fills this void by drawing from the experiences and insights of experts gained both through theoretical development and practical implementation of intelligent systems in a variety of diverse application domains. The book
Benchmarking in emergency health systems.

PubMed

Kennedy, Marcus P; Allen, Jacqueline; Allen, Greg

2002-12-01

This paper discusses the role of benchmarking as a component of quality management. It describes the historical background of benchmarking, its competitive origin and the requirement in today's health environment for a more collaborative approach. The classical 'functional and generic' types of benchmarking are discussed with a suggestion to adopt a different terminology that describes the purpose and practicalities of benchmarking. Benchmarking is not without risks. The consequence of inappropriate focus and the need for a balanced overview of process is explored. The competition that is intrinsic to benchmarking is questioned and the negative impact it may have on improvement strategies in poorly performing organizations is recognized. The difficulty in achieving cross-organizational validity in benchmarking is emphasized, as is the need to scrutinize benchmarking measures. The cost effectiveness of benchmarking projects is questioned and the concept of 'best value, best practice' in an environment of fixed resources is examined.

Benchmarking for the Effective Use of Student Evaluation Data

ERIC Educational Resources Information Center

Smithson, John; Birks, Melanie; Harrison, Glenn; Nair, Chenicheri Sid; Hitchins, Marnie

2015-01-01

Purpose: The purpose of this paper is to examine current approaches to interpretation of student evaluation data and present an innovative approach to developing benchmark targets for the effective and efficient use of these data. Design/Methodology/Approach: This article discusses traditional approaches to gathering and using student feedback…
Ground truth and benchmarks for performance evaluation

NASA Astrophysics Data System (ADS)

Takeuchi, Ayako; Shneier, Michael; Hong, Tsai Hong; Chang, Tommy; Scrapper, Christopher; Cheok, Geraldine S.

2003-09-01

Progress in algorithm development and transfer of results to practical applications such as military robotics requires the setup of standard tasks, of standard qualitative and quantitative measurements for performance evaluation and validation. Although the evaluation and validation of algorithms have been discussed for over a decade, the research community still faces a lack of well-defined and standardized methodology. The range of fundamental problems include a lack of quantifiable measures of performance, a lack of data from state-of-the-art sensors in calibrated real-world environments, and a lack of facilities for conducting realistic experiments. In this research, we propose three methods for creating ground truth databases and benchmarks using multiple sensors. The databases and benchmarks will provide researchers with high quality data from suites of sensors operating in complex environments representing real problems of great relevance to the development of autonomous driving systems. At NIST, we have prototyped a High Mobility Multi-purpose Wheeled Vehicle (HMMWV) system with a suite of sensors including a Riegl ladar, GDRS ladar, stereo CCD, several color cameras, Global Position System (GPS), Inertial Navigation System (INS), pan/tilt encoders, and odometry . All sensors are calibrated with respect to each other in space and time. This allows a database of features and terrain elevation to be built. Ground truth for each sensor can then be extracted from the database. The main goal of this research is to provide ground truth databases for researchers and engineers to evaluate algorithms for effectiveness, efficiency, reliability, and robustness, thus advancing the development of algorithms.
Linking user and staff perspectives in the evaluation of innovative transition projects for youth with disabilities.

PubMed

McAnaney, Donal F; Wynne, Richard F

2016-06-01

A key challenge in formative evaluation is to gather appropriate evidence to inform the continuous improvement of initiatives. In the absence of outcome data, the programme evaluator often must rely on the perceptions of beneficiaries and staff in generating insight into what is making a difference. The article describes the approach adopted in an evaluation of 15 innovative projects supporting school-leavers with disabilities in making the transition to education, work and life in community settings. Two complementary processes provided an insight into what project staff and leadership viewed as the key project activities and features that facilitated successful transition as well as the areas of quality of life (QOL) that participants perceived as having been impacted positively by the projects. A comparison was made between participants' perceptions of QOL impact with the views of participants in services normally offered by the wider system. This revealed that project participants were significantly more positive in their views than participants in traditional services. In addition, the processes and activities of the more highly rated projects were benchmarked against less highly rated projects and also with usually available services. Even in the context of a range of intervening variables such as level and complexity of participant needs and variations in the stage of development of individual projects, the benchmarking process indicated a number of project characteristics that were highly valued by participants. © The Author(s) 2016.
Benchmarking on Tsunami Currents with ComMIT

NASA Astrophysics Data System (ADS)

Sharghi vand, N.; Kanoglu, U.

2015-12-01

There were no standards for the validation and verification of tsunami numerical models before 2004 Indian Ocean tsunami. Even, number of numerical models has been used for inundation mapping effort, evaluation of critical structures, etc. without validation and verification. After 2004, NOAA Center for Tsunami Research (NCTR) established standards for the validation and verification of tsunami numerical models (Synolakis et al. 2008 Pure Appl. Geophys. 165, 2197-2228), which will be used evaluation of critical structures such as nuclear power plants against tsunami attack. NCTR presented analytical, experimental and field benchmark problems aimed to estimate maximum runup and accepted widely by the community. Recently, benchmark problems were suggested by the US National Tsunami Hazard Mitigation Program Mapping & Modeling Benchmarking Workshop: Tsunami Currents on February 9-10, 2015 at Portland, Oregon, USA (http://nws.weather.gov/nthmp/index.html). These benchmark problems concentrated toward validation and verification of tsunami numerical models on tsunami currents. Three of the benchmark problems were: current measurement of the Japan 2011 tsunami in Hilo Harbor, Hawaii, USA and in Tauranga Harbor, New Zealand, and single long-period wave propagating onto a small-scale experimental model of the town of Seaside, Oregon, USA. These benchmark problems were implemented in the Community Modeling Interface for Tsunamis (ComMIT) (Titov et al. 2011 Pure Appl. Geophys. 168, 2121-2131), which is a user-friendly interface to the validated and verified Method of Splitting Tsunami (MOST) (Titov and Synolakis 1995 J. Waterw. Port Coastal Ocean Eng. 121, 308-316) model and is developed by NCTR. The modeling results are compared with the required benchmark data, providing good agreements and results are discussed. Acknowledgment: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

1993-01-01

A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
The Isprs Benchmark on Indoor Modelling

NASA Astrophysics Data System (ADS)

Khoshelham, K.; Díaz Vilariño, L.; Peter, M.; Kang, Z.; Acharya, D.

2017-09-01

Automated generation of 3D indoor models from point cloud data has been a topic of intensive research in recent years. While results on various datasets have been reported in literature, a comparison of the performance of different methods has not been possible due to the lack of benchmark datasets and a common evaluation framework. The ISPRS benchmark on indoor modelling aims to address this issue by providing a public benchmark dataset and an evaluation framework for performance comparison of indoor modelling methods. In this paper, we present the benchmark dataset comprising several point clouds of indoor environments captured by different sensors. We also discuss the evaluation and comparison of indoor modelling methods based on manually created reference models and appropriate quality evaluation criteria. The benchmark dataset is available for download at: benchmark-on-indoor-modelling.html"target="_blank">http://www2.isprs.org/commissions/comm4/wg5/benchmark-on-indoor-modelling.html.
Benchmarking facilities providing care: An international overview of initiatives

PubMed Central

Thonon, Frédérique; Watson, Jonathan; Saghatchian, Mahasti

2015-01-01

We performed a literature review of existing benchmarking projects of health facilities to explore (1) the rationales for those projects, (2) the motivation for health facilities to participate, (3) the indicators used and (4) the success and threat factors linked to those projects. We studied both peer-reviewed and grey literature. We examined 23 benchmarking projects of different medical specialities. The majority of projects used a mix of structure, process and outcome indicators. For some projects, participants had a direct or indirect financial incentive to participate (such as reimbursement by Medicaid/Medicare or litigation costs related to quality of care). A positive impact was reported for most projects, mainly in terms of improvement of practice and adoption of guidelines and, to a lesser extent, improvement in communication. Only 1 project reported positive impact in terms of clinical outcomes. Success factors and threats are linked to both the benchmarking process (such as organisation of meetings, link with existing projects) and indicators used (such as adjustment for diagnostic-related groups). The results of this review will help coordinators of a benchmarking project to set it up successfully. PMID:26770800
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

1991-01-01

A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification-all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
Collected notes from the Benchmarks and Metrics Workshop

NASA Technical Reports Server (NTRS)

Drummond, Mark E.; Kaelbling, Leslie P.; Rosenschein, Stanley J.

1991-01-01

In recent years there has been a proliferation of proposals in the artificial intelligence (AI) literature for integrated agent architectures. Each architecture offers an approach to the general problem of constructing an integrated agent. Unfortunately, the ways in which one architecture might be considered better than another are not always clear. There has been a growing realization that many of the positive and negative aspects of an architecture become apparent only when experimental evaluation is performed and that to progress as a discipline, we must develop rigorous experimental methods. In addition to the intrinsic intellectual interest of experimentation, rigorous performance evaluation of systems is also a crucial practical concern to our research sponsors. DARPA, NASA, and AFOSR (among others) are actively searching for better ways of experimentally evaluating alternative approaches to building intelligent agents. One tool for experimental evaluation involves testing systems on benchmark tasks in order to assess their relative performance. As part of a joint DARPA and NASA funded project, NASA-Ames and Teleos Research are carrying out a research effort to establish a set of benchmark tasks and evaluation metrics by which the performance of agent architectures may be determined. As part of this project, we held a workshop on Benchmarks and Metrics at the NASA Ames Research Center on June 25, 1990. The objective of the workshop was to foster early discussion on this important topic. We did not achieve a consensus, nor did we expect to. Collected here is some of the information that was exchanged at the workshop. Given here is an outline of the workshop, a list of the participants, notes taken on the white-board during open discussions, position papers/notes from some participants, and copies of slides used in the presentations.
Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mishra, Alok; Li, Lingda; Kong, Martin

Here, the latest OpenMP standard offers automatic device offloading capabilities which facilitate GPU programming. Despite this, there remain many challenges. One of these is the unified memory feature introduced in recent GPUs. GPUs in current and future HPC systems have enhanced support for unified memory space. In such systems, CPU and GPU can access each other's memory transparently, that is, the data movement is managed automatically by the underlying system software and hardware. Memory over subscription is also possible in these systems. However, there is a significant lack of knowledge about how this mechanism will perform, and how programmers shouldmore » use it. We have modified several benchmarks codes, in the Rodinia benchmark suite, to study the behavior of OpenMP accelerator extensions and have used them to explore the impact of unified memory in an OpenMP context. We moreover modified the open source LLVM compiler to allow OpenMP programs to exploit unified memory. The results of our evaluation reveal that, while the performance of unified memory is comparable with that of normal GPU offloading for benchmarks with little data reuse, it suffers from significant overhead when GPU memory is over subcribed for benchmarks with large amount of data reuse. Based on these results, we provide several guidelines for programmers to achieve better performance with unified memory.« less
Benchmarking and Its Relevance to the Library and Information Sector. Interim Findings of "Best Practice Benchmarking in the Library and Information Sector," a British Library Research and Development Department Project.

ERIC Educational Resources Information Center

Kinnell, Margaret; Garrod, Penny

This British Library Research and Development Department study assesses current activities and attitudes toward quality management in library and information services (LIS) in the academic sector as well as the commercial/industrial sector. Definitions and types of benchmarking are described, and the relevance of benchmarking to LIS is evaluated.…
Availability of Neutronics Benchmarks in the ICSBEP and IRPhEP Handbooks for Computational Tools Testing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Briggs, J. Blair; Ivanova, Tatiana

2017-02-01

In the past several decades, numerous experiments have been performed worldwide to support reactor operations, measurements, design, and nuclear safety. Those experiments represent an extensive international investment in infrastructure, expertise, and cost, representing significantly valuable resources of data supporting past, current, and future research activities. Those valuable assets represent the basis for recording, development, and validation of our nuclear methods and integral nuclear data [1]. The loss of these experimental data, which has occurred all too much in the recent years, is tragic. The high cost to repeat many of these measurements can be prohibitive, if not impossible, to surmount.more » Two international projects were developed, and are under the direction of the Organisation for Co-operation and Development Nuclear Energy Agency (OECD NEA) to address the challenges of not just data preservation, but evaluation of the data to determine its merit for modern and future use. The International Criticality Safety Benchmark Evaluation Project (ICSBEP) was established to identify and verify comprehensive critical benchmark data sets; evaluate the data, including quantification of biases and uncertainties; compile the data and calculations in a standardized format; and formally document the effort into a single source of verified benchmark data [2]. Similarly, the International Reactor Physics Experiment Evaluation Project (IRPhEP) was established to preserve integral reactor physics experimental data, including separate or special effects data for nuclear energy and technology applications [3]. Annually, contributors from around the world continue to collaborate in the evaluation and review of select benchmark experiments for preservation and dissemination. The extensively peer-reviewed integral benchmark data can then be utilized to support nuclear design and safety analysts to validate the analytical tools, methods, and data needed for next
The PATH project in eight European countries: an evaluation.

PubMed

Veillard, Jeremy Henri Maurice; Schiøtz, Michaela Louise; Guisset, Ann-Lise; Brown, Adalsteinn Davidson; Klazinga, Niek S

2013-01-01

This paper's aim is to evaluate the perceived impact and the enabling factors and barriers experienced by hospital staff participating in an international hospital performance measurement project focused on internal quality improvement. Semi-structured interviews involving international hospital performance measurement project coordinators, including 140 hospitals from eight European countries (Belgium, Estonia, France, Germany, Hungary, Poland, Slovakia and Slovenia). Inductively analyzing the interview transcripts was carried out using the grounded theory approach. Even when public reporting is absent, the project was perceived as having stimulated performance measurement and quality improvement initiatives in participating hospitals. Attention should be paid to leadership/ownership, context, content (project intrinsic features) and processes supporting elements. Generalizing the findings is limited by the study's small sample size. Possible implications for the WHO European Regional Office and for participating hospitals would be to assess hospital preparedness to participate in the PATH project, depending on context, process and structural elements; and enhance performance and practice benchmarking through suggested approaches. This research gathered rich and unique material related to an international performance measurement project. It derived actionable findings.
BENCHMARKING SUSTAINABILITY ENGINEERING EDUCATION

EPA Science Inventory

The goals of this project are to develop and apply a methodology for benchmarking curricula in sustainability engineering and to identify individuals active in sustainability engineering education.
Benchmarks: The Development of a New Approach to Student Evaluation.

ERIC Educational Resources Information Center

Larter, Sylvia

The Toronto Board of Education Benchmarks are libraries of reference materials that demonstrate student achievement at various levels. Each library contains video benchmarks, print benchmarks, a staff handbook, and summary and introductory documents. This book is about the development and the history of the benchmark program. It has taken over 3…
Benchmark Evaluation of Dounreay Prototype Fast Reactor Minor Actinide Depletion Measurements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hess, J. D.; Gauld, I. C.; Gulliford, J.

2017-01-01

Historic measurements of actinide samples in the Dounreay Prototype Fast Reactor (PFR) are of interest for modern nuclear data and simulation validation. Samples of various higher-actinide isotopes were irradiated for 492 effective full-power days and radiochemically assayed at Oak Ridge National Laboratory (ORNL) and Japan Atomic Energy Research Institute (JAERI). Limited data were available regarding the PFR irradiation; a six-group neutron spectra was available with some power history data to support a burnup depletion analysis validation study. Under the guidance of the Organisation for Economic Co-Operation and Development Nuclear Energy Agency (OECD NEA), the International Reactor Physics Experiment Evaluation Projectmore » (IRPhEP) and Spent Fuel Isotopic Composition (SFCOMPO) Project are collaborating to recover all measurement data pertaining to these measurements, including collaboration with the United Kingdom to obtain pertinent reactor physics design and operational history data. These activities will produce internationally peer-reviewed benchmark data to support validation of minor actinide cross section data and modern neutronic simulation of fast reactors with accompanying fuel cycle activities such as transportation, recycling, storage, and criticality safety.« less
Outpatient echocardiography in the evaluation of innocent murmurs in children: utilisation benchmarking.

PubMed

Frias, Patricio A; Oster, Matthew; Daley, Patricia A; Boris, Jeffrey R

2016-03-01

We sought to benchmark the utilisation of echocardiography in the outpatient evaluation of heart murmurs by evaluating two large paediatric cardiology centres. Although criteria exist for appropriate use of echocardiography, there are no benchmarking data demonstrating its utilisation. We performed a retrospective cohort study of outpatients aged between 0 and 18 years at the Sibley Heart Center Cardiology and the Children's Hospital of Philadelphia Division of Cardiology, given a sole diagnosis of "innocent murmur" from 1 July, 2007 to 31 October, 2010. Using internal claims data, we compared the utilisation of echocardiography according to centre, patient age, and physician years of service. Of 23,114 eligible patients (Sibley Heart Center Cardiology: 12,815, Children's Hospital of Philadelphia Division of Cardiology: 10,299), 43.1% (Sibley Heart Center Cardiology: 45.2%, Children's Hospital of Philadelphia Division of Cardiology: 40.4%; p1-5 years had the lowest utilisation (32.7%). In two large paediatric cardiology practices, the overall utilisation of echocardiography by physicians with a sole diagnosis of innocent murmur was similar. There was significant and similar variability in utilisation by provider at both centres. Although these data serve as initial benchmarking, the variability in utilisation highlights the importance of appropriate use criteria.
Benchmarking child and adolescent mental health organizations.

PubMed

Brann, Peter; Walter, Garry; Coombs, Tim

2011-04-01

This paper describes aspects of the child and adolescent benchmarking forums that were part of the National Mental Health Benchmarking Project (NMHBP). These forums enabled participating child and adolescent mental health organizations to benchmark themselves against each other, with a view to understanding variability in performance against a range of key performance indicators (KPIs). Six child and adolescent mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against relevant KPIs. They also undertook two special projects designed to help them understand the variation in performance on given KPIs. There was considerable inter-organization variability on many of the KPIs. Even within organizations, there was often substantial variability over time. The variability in indicator data raised many questions for participants. This challenged participants to better understand and describe their local processes, prompted them to collect additional data, and stimulated them to make organizational comparisons. These activities fed into a process of reflection about their performance. Benchmarking has the potential to illuminate intra- and inter-organizational performance in the child and adolescent context.
PMLB: a large benchmark suite for machine learning evaluation and comparison.

PubMed

Olson, Randal S; La Cava, William; Orzechowski, Patryk; Urbanowicz, Ryan J; Moore, Jason H

2017-01-01

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
Antibody-protein interactions: benchmark datasets and prediction tools evaluation

PubMed Central

Ponomarenko, Julia V; Bourne, Philip E

2007-01-01

Background The ability to predict antibody binding sites (aka antigenic determinants or B-cell epitopes) for a given protein is a precursor to new vaccine design and diagnostics. Among the various methods of B-cell epitope identification X-ray crystallography is one of the most reliable methods. Using these experimental data computational methods exist for B-cell epitope prediction. As the number of structures of antibody-protein complexes grows, further interest in prediction methods using 3D structure is anticipated. This work aims to establish a benchmark for 3D structure-based epitope prediction methods. Results Two B-cell epitope benchmark datasets inferred from the 3D structures of antibody-protein complexes were defined. The first is a dataset of 62 representative 3D structures of protein antigens with inferred structural epitopes. The second is a dataset of 82 structures of antibody-protein complexes containing different structural epitopes. Using these datasets, eight web-servers developed for antibody and protein binding sites prediction have been evaluated. In no method did performance exceed a 40% precision and 46% recall. The values of the area under the receiver operating characteristic curve for the evaluated methods were about 0.6 for ConSurf, DiscoTope, and PPI-PRED methods and above 0.65 but not exceeding 0.70 for protein-protein docking methods when the best of the top ten models for the bound docking were considered; the remaining methods performed close to random. The benchmark datasets are included as a supplement to this paper. Conclusion It may be possible to improve epitope prediction methods through training on datasets which include only immune epitopes and through utilizing more features characterizing epitopes, for example, the evolutionary conservation score. Notwithstanding, overall poor performance may reflect the generality of antigenicity and hence the inability to decipher B-cell epitopes as an intrinsic feature of the protein. It

Cloud-Based Evaluation of Anatomical Structure Segmentation and Landmark Detection Algorithms: VISCERAL Anatomy Benchmarks.

PubMed

Jimenez-Del-Toro, Oscar; Muller, Henning; Krenn, Markus; Gruenberg, Katharina; Taha, Abdel Aziz; Winterstein, Marianne; Eggel, Ivan; Foncubierta-Rodriguez, Antonio; Goksel, Orcun; Jakab, Andras; Kontokotsios, Georgios; Langs, Georg; Menze, Bjoern H; Salas Fernandez, Tomas; Schaer, Roger; Walleyo, Anna; Weber, Marc-Andre; Dicente Cid, Yashin; Gass, Tobias; Heinrich, Mattias; Jia, Fucang; Kahl, Fredrik; Kechichian, Razmig; Mai, Dominic; Spanier, Assaf B; Vincent, Graham; Wang, Chunliang; Wyeth, Daniel; Hanbury, Allan

2016-11-01

Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease. Automatic tools can help automate parts of this manual process. A cloud-based evaluation framework is presented in this paper including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud where participants can only access the training data and can be run privately by the benchmark administrators to objectively compare their performance in an unseen common test set. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed with automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and Silver Corpus generated with the fusion of the participant algorithms on a larger set of non-manually-annotated medical images are available to the research community.
Design and development of a community carbon cycle benchmarking system for CMIP5 models

NASA Astrophysics Data System (ADS)

Mu, M.; Hoffman, F. M.; Lawrence, D. M.; Riley, W. J.; Keppel-Aleks, G.; Randerson, J. T.

2013-12-01

Benchmarking has been widely used to assess the ability of atmosphere, ocean, sea ice, and land surface models to capture the spatial and temporal variability of observations during the historical period. For the carbon cycle and terrestrial ecosystems, the design and development of an open-source community platform has been an important goal as part of the International Land Model Benchmarking (ILAMB) project. Here we designed and developed a software system that enables the user to specify the models, benchmarks, and scoring systems so that results can be tailored to specific model intercomparison projects. We used this system to evaluate the performance of CMIP5 Earth system models (ESMs). Our scoring system used information from four different aspects of climate, including the climatological mean spatial pattern of gridded surface variables, seasonal cycle dynamics, the amplitude of interannual variability, and long-term decadal trends. We used this system to evaluate burned area, global biomass stocks, net ecosystem exchange, gross primary production, and ecosystem respiration from CMIP5 historical simulations. Initial results indicated that the multi-model mean often performed better than many of the individual models for most of the observational constraints.
Benchmark and Framework for Encouraging Research on Multi-Threaded Testing Tools

NASA Technical Reports Server (NTRS)

Havelund, Klaus; Stoller, Scott D.; Ur, Shmuel

2003-01-01

A problem that has been getting prominence in testing is that of looking for intermittent bugs. Multi-threaded code is becoming very common, mostly on the server side. As there is no silver bullet solution, research focuses on a variety of partial solutions. In this paper (invited by PADTAD 2003) we outline a proposed project to facilitate research. The project goals are as follows. The first goal is to create a benchmark that can be used to evaluate different solutions. The benchmark, apart from containing programs with documented bugs, will include other artifacts, such as traces, that are useful for evaluating some of the technologies. The second goal is to create a set of tools with open API s that can be used to check ideas without building a large system. For example an instrumentor will be available, that could be used to test temporal noise making heuristics. The third goal is to create a focus for the research in this area around which a community of people who try to solve similar problems with different techniques, could congregate.
Benchmarking of municipal waste water treatment plants (an Austrian project).

PubMed

Lindtner, S; Kroiss, H; Nowak, O

2004-01-01

An Austrian research project focused on the development of process indicators for treatment plants with different process and operation modes. The whole treatment scheme was subdivided into four processes, i.e. mechanical pretreatment (Process 1), mechanical-biological waste water treatment (Process 2), sludge thickening and stabilisation (Process 3) and further sludge treatment and disposal (Process 4). In order to get comparable process indicators it was necessary to subdivide the sample of 76 individual treatment plants all over Austria into five groups according to their mean organic load (COD) in the influent. The specific total yearly costs, the yearly operating costs and the yearly capital costs of the four processes have been related to the yearly average of the measured organic load expressed in COD (110 g COD/pe/d). The specific investment costs for the whole treatment plant and for Process 2 have been related to a calculated standard design capacity of the mechanical-biological part of the treatment plant expressed in COD. The capital costs of processes 1, 3 and 4 have been related to the design capacity of the treatment plant. For each group (related to the size of the plant) a benchmark band has been defined for the total yearly costs, the total yearly operational costs and the total yearly capital costs. For the operational costs of the Processes 1 to 4 one benchmark ([see symbol in text] per pe/year) has been defined for each group. In addition a theoretical cost reduction potential has been calculated. The cost efficiency in regard to water protection and some special sub-processes such as aeration and sludge dewatering has been analysed.
Authentic e-Learning in a Multicultural Context: Virtual Benchmarking Cases from Five Countries

ERIC Educational Resources Information Center

Leppisaari, Irja; Herrington, Jan; Vainio, Leena; Im, Yeonwook

2013-01-01

The implementation of authentic learning elements at education institutions in five countries, eight online courses in total, is examined in this paper. The International Virtual Benchmarking Project (2009-2010) applied the elements of authentic learning developed by Herrington and Oliver (2000) as criteria to evaluate authenticity. Twelve…
A benchmarking tool to evaluate computer tomography perfusion infarct core predictions against a DWI standard.

PubMed

Cereda, Carlo W; Christensen, Søren; Campbell, Bruce Cv; Mishra, Nishant K; Mlynash, Michael; Levi, Christopher; Straka, Matus; Wintermark, Max; Bammer, Roland; Albers, Gregory W; Parsons, Mark W; Lansberg, Maarten G

2016-10-01

Differences in research methodology have hampered the optimization of Computer Tomography Perfusion (CTP) for identification of the ischemic core. We aim to optimize CTP core identification using a novel benchmarking tool. The benchmarking tool consists of an imaging library and a statistical analysis algorithm to evaluate the performance of CTP. The tool was used to optimize and evaluate an in-house developed CTP-software algorithm. Imaging data of 103 acute stroke patients were included in the benchmarking tool. Median time from stroke onset to CT was 185 min (IQR 180-238), and the median time between completion of CT and start of MRI was 36 min (IQR 25-79). Volumetric accuracy of the CTP-ROIs was optimal at an rCBF threshold of <38%; at this threshold, the mean difference was 0.3 ml (SD 19.8 ml), the mean absolute difference was 14.3 (SD 13.7) ml, and CTP was 67% sensitive and 87% specific for identification of DWI positive tissue voxels. The benchmarking tool can play an important role in optimizing CTP software as it provides investigators with a novel method to directly compare the performance of alternative CTP software packages. © The Author(s) 2015.
A Competitive Benchmarking Study of Noncredit Program Administration.

ERIC Educational Resources Information Center

Alstete, Jeffrey W.

1996-01-01

A benchmarking project to measure administrative processes and financial ratios received 57 usable replies from 300 noncredit continuing education programs. Programs with strong financial surpluses were identified and their processes benchmarked (including response to inquiries, registrants, registrant/staff ratio, new courses, class size,…
Benchmarking reference services: an introduction.

PubMed

Marshall, J G; Buchanan, H S

1995-01-01

Benchmarking is based on the common sense idea that someone else, either inside or outside of libraries, has found a better way of doing certain things and that your own library's performance can be improved by finding out how others do things and adopting the best practices you find. Benchmarking is one of the tools used for achieving continuous improvement in Total Quality Management (TQM) programs. Although benchmarking can be done on an informal basis, TQM puts considerable emphasis on formal data collection and performance measurement. Used to its full potential, benchmarking can provide a common measuring stick to evaluate process performance. This article introduces the general concept of benchmarking, linking it whenever possible to reference services in health sciences libraries. Data collection instruments that have potential application in benchmarking studies are discussed and the need to develop common measurement tools to facilitate benchmarking is emphasized.
Data Race Benchmark Collection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liao, Chunhua; Lin, Pei-Hung; Asplund, Joshua

2017-03-21

This project is a benchmark suite of Open-MP parallel codes that have been checked for data races. The programs are marked to show which do and do not have races. This allows them to be leveraged while testing and developing race detection tools.
HPC Analytics Support. Requirements for Uncertainty Quantification Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paulson, Patrick R.; Purohit, Sumit; Rodriguez, Luke R.

2015-05-01

This report outlines techniques for extending benchmark generation products so they support uncertainty quantification by benchmarked systems. We describe how uncertainty quantification requirements can be presented to candidate analytical tools supporting SPARQL. We describe benchmark data sets for evaluating uncertainty quantification, as well as an approach for using our benchmark generator to produce data sets for generating benchmark data sets.
Benchmarking expert system tools

NASA Technical Reports Server (NTRS)

Riley, Gary

1988-01-01

As part of its evaluation of new technologies, the Artificial Intelligence Section of the Mission Planning and Analysis Div. at NASA-Johnson has made timing tests of several expert system building tools. Among the production systems tested were Automated Reasoning Tool, several versions of OPS5, and CLIPS (C Language Integrated Production System), an expert system builder developed by the AI section. Also included in the test were a Zetalisp version of the benchmark along with four versions of the benchmark written in Knowledge Engineering Environment, an object oriented, frame based expert system tool. The benchmarks used for testing are studied.
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark
A European benchmarking system to evaluate in-hospital mortality rates in acute coronary syndrome: the EURHOBOP project.

PubMed

Dégano, Irene R; Subirana, Isaac; Torre, Marina; Grau, María; Vila, Joan; Fusco, Danilo; Kirchberger, Inge; Ferrières, Jean; Malmivaara, Antti; Azevedo, Ana; Meisinger, Christa; Bongard, Vanina; Farmakis, Dimitros; Davoli, Marina; Häkkinen, Unto; Araújo, Carla; Lekakis, John; Elosua, Roberto; Marrugat, Jaume

2015-03-01

Hospital performance models in acute myocardial infarction (AMI) are useful to assess patient management. While models are available for individual countries, mainly US, cross-European performance models are lacking. Thus, we aimed to develop a system to benchmark European hospitals in AMI and percutaneous coronary intervention (PCI), based on predicted in-hospital mortality. We used the EURopean HOspital Benchmarking by Outcomes in ACS Processes (EURHOBOP) cohort to develop the models, which included 11,631 AMI patients and 8276 acute coronary syndrome (ACS) patients who underwent PCI. Models were validated with a cohort of 55,955 European ACS patients. Multilevel logistic regression was used to predict in-hospital mortality in European hospitals for AMI and PCI. Administrative and clinical models were constructed with patient- and hospital-level covariates, as well as hospital- and country-based random effects. Internal cross-validation and external validation showed good discrimination at the patient level and good calibration at the hospital level, based on the C-index (0.736-0.819) and the concordance correlation coefficient (55.4%-80.3%). Mortality ratios (MRs) showed excellent concordance between administrative and clinical models (97.5% for AMI and 91.6% for PCI). Exclusion of transfers and hospital stays ≤1day did not affect in-hospital mortality prediction in sensitivity analyses, as shown by MR concordance (80.9%-85.4%). Models were used to develop a benchmarking system to compare in-hospital mortality rates of European hospitals with similar characteristics. The developed system, based on the EURHOBOP models, is a simple and reliable tool to compare in-hospital mortality rates between European hospitals in AMI and PCI. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Towards Systematic Benchmarking of Climate Model Performance

NASA Astrophysics Data System (ADS)

Gleckler, P. J.

2014-12-01

The process by which climate models are evaluated has evolved substantially over the past decade, with the Coupled Model Intercomparison Project (CMIP) serving as a centralizing activity for coordinating model experimentation and enabling research. Scientists with a broad spectrum of expertise have contributed to the CMIP model evaluation process, resulting in many hundreds of publications that have served as a key resource for the IPCC process. For several reasons, efforts are now underway to further systematize some aspects of the model evaluation process. First, some model evaluation can now be considered routine and should not require "re-inventing the wheel" or a journal publication simply to update results with newer models. Second, the benefit of CMIP research to model development has not been optimal because the publication of results generally takes several years and is usually not reproducible for benchmarking newer model versions. And third, there are now hundreds of model versions and many thousands of simulations, but there is no community-based mechanism for routinely monitoring model performance changes. An important change in the design of CMIP6 can help address these limitations. CMIP6 will include a small set standardized experiments as an ongoing exercise (CMIP "DECK": ongoing Diagnostic, Evaluation and Characterization of Klima), so that modeling groups can submit them at any time and not be overly constrained by deadlines. In this presentation, efforts to establish routine benchmarking of existing and future CMIP simulations will be described. To date, some benchmarking tools have been made available to all CMIP modeling groups to enable them to readily compare with CMIP5 simulations during the model development process. A natural extension of this effort is to make results from all CMIP simulations widely available, including the results from newer models as soon as the simulations become available for research. Making the results from routine
Benchmarking Using Basic DBMS Operations

NASA Astrophysics Data System (ADS)

Crolotte, Alain; Ghazal, Ahmad

The TPC-H benchmark proved to be successful in the decision support area. Many commercial database vendors and their related hardware vendors used these benchmarks to show the superiority and competitive edge of their products. However, over time, the TPC-H became less representative of industry trends as vendors keep tuning their database to this benchmark-specific workload. In this paper, we present XMarq, a simple benchmark framework that can be used to compare various software/hardware combinations. Our benchmark model is currently composed of 25 queries that measure the performance of basic operations such as scans, aggregations, joins and index access. This benchmark model is based on the TPC-H data model due to its maturity and well-understood data generation capability. We also propose metrics to evaluate single-system performance and compare two systems. Finally we illustrate the effectiveness of this model by showing experimental results comparing two systems under different conditions.
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres.

PubMed

van Lent, Wineke A M; de Beer, Relinde D; van Harten, Wim H

2010-08-31

Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC), three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. Per multiple case study a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments of case study 3, were considering implementing the recommendations.Additionally, success factors, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results and, adapt the identified better working methods to the own setting, were found. The improved benchmarking process and the success
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres

PubMed Central

2010-01-01

Background Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Methods Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC), three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. Per multiple case study a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. Results We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments of case study 3, were considering implementing the recommendations. Additionally, success factors, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results and, adapt the identified better working methods to the own setting, were found. Conclusions The improved
Computational Chemistry Comparison and Benchmark Database

National Institute of Standards and Technology Data Gateway

SRD 101 NIST Computational Chemistry Comparison and Benchmark Database (Web, free access) The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of molecules. The goals are to provide a benchmark set of molecules for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of thermochemical properties.
Managing for Results in America's Great City Schools 2014: Results from Fiscal Year 2012-13. A Report of the Performance Measurement and Benchmarking Project

ERIC Educational Resources Information Center

Council of the Great City Schools, 2014

2014-01-01

In 2002 the "Council of the Great City Schools" and its members set out to develop performance measures that could be used to improve business operations in urban public school districts. The Council launched the "Performance Measurement and Benchmarking Project" to achieve these objectives. The purposes of the project was to:…
Design and Application of a Community Land Benchmarking System for Earth System Models

NASA Astrophysics Data System (ADS)

Mu, M.; Hoffman, F. M.; Lawrence, D. M.; Riley, W. J.; Keppel-Aleks, G.; Koven, C. D.; Kluzek, E. B.; Mao, J.; Randerson, J. T.

2015-12-01

Benchmarking has been widely used to assess the ability of climate models to capture the spatial and temporal variability of observations during the historical era. For the carbon cycle and terrestrial ecosystems, the design and development of an open-source community platform has been an important goal as part of the International Land Model Benchmarking (ILAMB) project. Here we developed a new benchmarking software system that enables the user to specify the models, benchmarks, and scoring metrics, so that results can be tailored to specific model intercomparison projects. Evaluation data sets included soil and aboveground carbon stocks, fluxes of energy, carbon and water, burned area, leaf area, and climate forcing and response variables. We used this system to evaluate simulations from the 5th Phase of the Coupled Model Intercomparison Project (CMIP5) with prognostic atmospheric carbon dioxide levels over the period from 1850 to 2005 (i.e., esmHistorical simulations archived on the Earth System Grid Federation). We found that the multi-model ensemble had a high bias in incoming solar radiation across Asia, likely as a consequence of incomplete representation of aerosol effects in this region, and in South America, primarily as a consequence of a low bias in mean annual precipitation. The reduced precipitation in South America had a larger influence on gross primary production than the high bias in incoming light, and as a consequence gross primary production had a low bias relative to the observations. Although model to model variations were large, the multi-model mean had a positive bias in atmospheric carbon dioxide that has been attributed in past work to weak ocean uptake of fossil emissions. In mid latitudes of the northern hemisphere, most models overestimate latent heat fluxes in the early part of the growing season, and underestimate these fluxes in mid-summer and early fall, whereas sensible heat fluxes show the opposite trend.

Algorithm and Architecture Independent Benchmarking with SEAK

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tallent, Nathan R.; Manzano Franco, Joseph B.; Gawande, Nitin A.

2016-05-23

Many applications of high performance embedded computing are limited by performance or power bottlenecks. We have designed the Suite for Embedded Applications & Kernels (SEAK), a new benchmark suite, (a) to capture these bottlenecks in a way that encourages creative solutions; and (b) to facilitate rigorous, objective, end-user evaluation for their solutions. To avoid biasing solutions toward existing algorithms, SEAK benchmarks use a mission-centric (abstracted from a particular algorithm) and goal-oriented (functional) specification. To encourage solutions that are any combination of software or hardware, we use an end-user black-box evaluation that can capture tradeoffs between performance, power, accuracy, size, andmore » weight. The tradeoffs are especially informative for procurement decisions. We call our benchmarks future proof because each mission-centric interface and evaluation remains useful despite shifting algorithmic preferences. It is challenging to create both concise and precise goal-oriented specifications for mission-centric problems. This paper describes the SEAK benchmark suite and presents an evaluation of sample solutions that highlights power and performance tradeoffs.« less
Benchmarking a geostatistical procedure for the homogenisation of annual precipitation series

NASA Astrophysics Data System (ADS)

Caineta, Júlio; Ribeiro, Sara; Henriques, Roberto; Soares, Amílcar; Costa, Ana Cristina

2014-05-01

The European project COST Action ES0601, Advances in homogenisation methods of climate series: an integrated approach (HOME), has brought to attention the importance of establishing reliable homogenisation methods for climate data. In order to achieve that, a benchmark data set, containing monthly and daily temperature and precipitation data, was created to be used as a comparison basis for the effectiveness of those methods. Several contributions were submitted and evaluated by a number of performance metrics, validating the results against realistic inhomogeneous data. HOME also led to the development of new homogenisation software packages, which included feedback and lessons learned during the project. Preliminary studies have suggested a geostatistical stochastic approach, which uses Direct Sequential Simulation (DSS), as a promising methodology for the homogenisation of precipitation data series. Based on the spatial and temporal correlation between the neighbouring stations, DSS calculates local probability density functions at a candidate station to detect inhomogeneities. The purpose of the current study is to test and compare this geostatistical approach with the methods previously presented in the HOME project, using surrogate precipitation series from the HOME benchmark data set. The benchmark data set contains monthly precipitation surrogate series, from which annual precipitation data series were derived. These annual precipitation series were subject to exploratory analysis and to a thorough variography study. The geostatistical approach was then applied to the data set, based on different scenarios for the spatial continuity. Implementing this procedure also promoted the development of a computer program that aims to assist on the homogenisation of climate data, while minimising user interaction. Finally, in order to compare the effectiveness of this methodology with the homogenisation methods submitted during the HOME project, the obtained results
Benchmarking NNWSI flow and transport codes: COVE 1 results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hayden, N.K.

1985-06-01

The code verification (COVE) activity of the Nevada Nuclear Waste Storage Investigations (NNWSI) Project is the first step in certification of flow and transport codes used for NNWSI performance assessments of a geologic repository for disposing of high-level radioactive wastes. The goals of the COVE activity are (1) to demonstrate and compare the numerical accuracy and sensitivity of certain codes, (2) to identify and resolve problems in running typical NNWSI performance assessment calculations, and (3) to evaluate computer requirements for running the codes. This report describes the work done for COVE 1, the first step in benchmarking some of themore » codes. Isothermal calculations for the COVE 1 benchmarking have been completed using the hydrologic flow codes SAGUARO, TRUST, and GWVIP; the radionuclide transport codes FEMTRAN and TRUMP; and the coupled flow and transport code TRACR3D. This report presents the results of three cases of the benchmarking problem solved for COVE 1, a comparison of the results, questions raised regarding sensitivities to modeling techniques, and conclusions drawn regarding the status and numerical sensitivities of the codes. 30 refs.« less
How to achieve and prove performance improvement - 15 years of experience in German wastewater benchmarking.

PubMed

Bertzbach, F; Franz, T; Möller, K

2012-01-01

This paper shows the results of performance improvement, which have been achieved in benchmarking projects in the wastewater industry in Germany over the last 15 years. A huge number of changes in operational practice and also in achieved annual savings can be shown, induced in particular by benchmarking at process level. Investigation of this question produces some general findings for the inclusion of performance improvement in a benchmarking project and for the communication of its results. Thus, we elaborate on the concept of benchmarking at both utility and process level, which is still a necessary distinction for the integration of performance improvement into our benchmarking approach. To achieve performance improvement via benchmarking it should be made quite clear that this outcome depends, on one hand, on a well conducted benchmarking programme and, on the other, on the individual situation within each participating utility.
MoMaS reactive transport benchmark using PFLOTRAN

NASA Astrophysics Data System (ADS)

Park, H.

2017-12-01

MoMaS benchmark was developed to enhance numerical simulation capability for reactive transport modeling in porous media. The benchmark was published in late September of 2009; it is not taken from a real chemical system, but realistic and numerically challenging tests. PFLOTRAN is a state-of-art massively parallel subsurface flow and reactive transport code that is being used in multiple nuclear waste repository projects at Sandia National Laboratories including Waste Isolation Pilot Plant and Used Fuel Disposition. MoMaS benchmark has three independent tests with easy, medium, and hard chemical complexity. This paper demonstrates how PFLOTRAN is applied to this benchmark exercise and shows results of the easy benchmark test case which includes mixing of aqueous components and surface complexation. Surface complexations consist of monodentate and bidentate reactions which introduces difficulty in defining selectivity coefficient if the reaction applies to a bulk reference volume. The selectivity coefficient becomes porosity dependent for bidentate reaction in heterogeneous porous media. The benchmark is solved by PFLOTRAN with minimal modification to address the issue and unit conversions were made properly to suit PFLOTRAN.
Conceptual Soundness, Metric Development, Benchmarking, and Targeting for PATH Subprogram Evaluation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mosey. G.; Doris, E.; Coggeshall, C.

The objective of this study is to evaluate the conceptual soundness of the U.S. Department of Housing and Urban Development (HUD) Partnership for Advancing Technology in Housing (PATH) program's revised goals and establish and apply a framework to identify and recommend metrics that are the most useful for measuring PATH's progress. This report provides an evaluative review of PATH's revised goals, outlines a structured method for identifying and selecting metrics, proposes metrics and benchmarks for a sampling of individual PATH programs, and discusses other metrics that potentially could be developed that may add value to the evaluation process. The frameworkmore » and individual program metrics can be used for ongoing management improvement efforts and to inform broader program-level metrics for government reporting requirements.« less
Closed-Loop Neuromorphic Benchmarks

PubMed Central

Stewart, Terrence C.; DeWolf, Travis; Kleinhans, Ashley; Eliasmith, Chris

2015-01-01

Evaluating the effectiveness and performance of neuromorphic hardware is difficult. It is even more difficult when the task of interest is a closed-loop task; that is, a task where the output from the neuromorphic hardware affects some environment, which then in turn affects the hardware's future input. However, closed-loop situations are one of the primary potential uses of neuromorphic hardware. To address this, we present a methodology for generating closed-loop benchmarks that makes use of a hybrid of real physical embodiment and a type of “minimal” simulation. Minimal simulation has been shown to lead to robust real-world performance, while still maintaining the practical advantages of simulation, such as making it easy for the same benchmark to be used by many researchers. This method is flexible enough to allow researchers to explicitly modify the benchmarks to identify specific task domains where particular hardware excels. To demonstrate the method, we present a set of novel benchmarks that focus on motor control for an arbitrary system with unknown external forces. Using these benchmarks, we show that an error-driven learning rule can consistently improve motor control performance across a randomly generated family of closed-loop simulations, even when there are up to 15 interacting joints to be controlled. PMID:26696820
Evaluation of the ACEC Benchmark Suite for Real-Time Applications

DTIC Science & Technology

1990-07-23

1.0 benchmark suite waSanalyzed with respect to its measuring of Ada real-time features such as tasking, memory management, input/output, scheduling...and delay statement, Chapter 13 features , pragmas, interrupt handling, subprogram overhead, numeric computations etc. For most of the features that...meant for programming real-time systems. The ACEC benchmarks have been analyzed extensively with respect to their measuring of Ada real-time features
International land Model Benchmarking (ILAMB) Package v002.00

DOE Data Explorer

Collier, Nathaniel [Oak Ridge National Laboratory; Hoffman, Forrest M. [Oak Ridge National Laboratory; Mu, Mingquan [University of California, Irvine; Randerson, James T. [University of California, Irvine; Riley, William J. [Lawrence Berkeley National Laboratory

2016-05-09

As a contribution to International Land Model Benchmarking (ILAMB) Project, we are providing new analysis approaches, benchmarking tools, and science leadership. The goal of ILAMB is to assess and improve the performance of land models through international cooperation and to inform the design of new measurement campaigns and field studies to reduce uncertainties associated with key biogeochemical processes and feedbacks. ILAMB is expected to be a primary analysis tool for CMIP6 and future model-data intercomparison experiments. This team has developed initial prototype benchmarking systems for ILAMB, which will be improved and extended to include ocean model metrics and diagnostics.
International land Model Benchmarking (ILAMB) Package v001.00

DOE Data Explorer

Mu, Mingquan [University of California, Irvine; Randerson, James T. [University of California, Irvine; Riley, William J. [Lawrence Berkeley National Laboratory; Hoffman, Forrest M. [Oak Ridge National Laboratory

2016-05-02

As a contribution to International Land Model Benchmarking (ILAMB) Project, we are providing new analysis approaches, benchmarking tools, and science leadership. The goal of ILAMB is to assess and improve the performance of land models through international cooperation and to inform the design of new measurement campaigns and field studies to reduce uncertainties associated with key biogeochemical processes and feedbacks. ILAMB is expected to be a primary analysis tool for CMIP6 and future model-data intercomparison experiments. This team has developed initial prototype benchmarking systems for ILAMB, which will be improved and extended to include ocean model metrics and diagnostics.
All inclusive benchmarking.

PubMed

Ellis, Judith

2006-07-01

The aim of this article is to review published descriptions of benchmarking activity and synthesize benchmarking principles to encourage the acceptance and use of Essence of Care as a new benchmarking approach to continuous quality improvement, and to promote its acceptance as an integral and effective part of benchmarking activity in health services. The Essence of Care, was launched by the Department of Health in England in 2001 to provide a benchmarking tool kit to support continuous improvement in the quality of fundamental aspects of health care, for example, privacy and dignity, nutrition and hygiene. The tool kit is now being effectively used by some frontline staff. However, use is inconsistent, with the value of the tool kit, or the support clinical practice benchmarking requires to be effective, not always recognized or provided by National Health Service managers, who are absorbed with the use of quantitative benchmarking approaches and measurability of comparative performance data. This review of published benchmarking literature, was obtained through an ever-narrowing search strategy commencing from benchmarking within quality improvement literature through to benchmarking activity in health services and including access to not only published examples of benchmarking approaches and models used but the actual consideration of web-based benchmarking data. This supported identification of how benchmarking approaches have developed and been used, remaining true to the basic benchmarking principles of continuous improvement through comparison and sharing (Camp 1989). Descriptions of models and exemplars of quantitative and specifically performance benchmarking activity in industry abound (Camp 1998), with far fewer examples of more qualitative and process benchmarking approaches in use in the public services and then applied to the health service (Bullivant 1998). The literature is also in the main descriptive in its support of the effectiveness of
Processor Emulator with Benchmark Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lloyd, G. Scott; Pearce, Roger; Gokhale, Maya

2015-11-13

A processor emulator and a suite of benchmark applications have been developed to assist in characterizing the performance of data-centric workloads on current and future computer architectures. Some of the applications have been collected from other open source projects. For more details on the emulator and an example of its usage, see reference [1].
Medical school benchmarking - from tools to programmes.

PubMed

Wilkinson, Tim J; Hudson, Judith N; Mccoll, Geoffrey J; Hu, Wendy C Y; Jolly, Brian C; Schuwirth, Lambert W T

2015-02-01

Benchmarking among medical schools is essential, but may result in unwanted effects. To apply a conceptual framework to selected benchmarking activities of medical schools. We present an analogy between the effects of assessment on student learning and the effects of benchmarking on medical school educational activities. A framework by which benchmarking can be evaluated was developed and applied to key current benchmarking activities in Australia and New Zealand. The analogy generated a conceptual framework that tested five questions to be considered in relation to benchmarking: what is the purpose? what are the attributes of value? what are the best tools to assess the attributes of value? what happens to the results? and, what is the likely "institutional impact" of the results? If the activities were compared against a blueprint of desirable medical graduate outcomes, notable omissions would emerge. Medical schools should benchmark their performance on a range of educational activities to ensure quality improvement and to assure stakeholders that standards are being met. Although benchmarking potentially has positive benefits, it could also result in perverse incentives with unforeseen and detrimental effects on learning if it is undertaken using only a few selected assessment tools.
SP2Bench: A SPARQL Performance Benchmark

NASA Astrophysics Data System (ADS)

Schmidt, Michael; Hornung, Thomas; Meier, Michael; Pinkel, Christoph; Lausen, Georg

A meaningful analysis and comparison of both existing storage schemes for RDF data and evaluation approaches for SPARQL queries necessitates a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is settled in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror vital key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries and performance metrics.
Benchmark Dataset for Whole Genome Sequence Compression.

PubMed

C L, Biji; S Nair, Achuthsankar

2017-01-01

The research in DNA data compression lacks a standard dataset to test out compression tools specific to DNA. This paper argues that the current state of achievement in DNA compression is unable to be benchmarked in the absence of such scientifically compiled whole genome sequence dataset and proposes a benchmark dataset using multistage sampling procedure. Considering the genome sequence of organisms available in the National Centre for Biotechnology and Information (NCBI) as the universe, the proposed dataset selects 1,105 prokaryotes, 200 plasmids, 164 viruses, and 65 eukaryotes. This paper reports the results of using three established tools on the newly compiled dataset and show that their strength and weakness are evident only with a comparison based on the scientifically compiled benchmark dataset. The sample dataset and the respective links are available @ https://sourceforge.net/projects/benchmarkdnacompressiondataset/.
Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Ye; Ma, Xiaosong; Liu, Qing Gary

2015-01-01

Parallel application benchmarks are indispensable for evaluating/optimizing HPC software and hardware. However, it is very challenging and costly to obtain high-fidelity benchmarks reflecting the scale and complexity of state-of-the-art parallel applications. Hand-extracted synthetic benchmarks are time-and labor-intensive to create. Real applications themselves, while offering most accurate performance evaluation, are expensive to compile, port, reconfigure, and often plainly inaccessible due to security or ownership concerns. This work contributes APPRIME, a novel tool for trace-based automatic parallel benchmark generation. Taking as input standard communication-I/O traces of an application's execution, it couples accurate automatic phase identification with statistical regeneration of event parameters tomore » create compact, portable, and to some degree reconfigurable parallel application benchmarks. Experiments with four NAS Parallel Benchmarks (NPB) and three real scientific simulation codes confirm the fidelity of APPRIME benchmarks. They retain the original applications' performance characteristics, in particular the relative performance across platforms.« less
Research Project Evaluation-Learnings from the PATHWAYS Project Experience.

PubMed

Galas, Aleksander; Pilat, Aleksandra; Leonardi, Matilde; Tobiasz-Adamczyk, Beata

2018-05-25

Every research project faces challenges regarding how to achieve its goals in a timely and effective manner. The purpose of this paper is to present a project evaluation methodology gathered during the implementation of the Participation to Healthy Workplaces and Inclusive Strategies in the Work Sector (the EU PATHWAYS Project). The PATHWAYS project involved multiple countries and multi-cultural aspects of re/integrating chronically ill patients into labor markets in different countries. This paper describes key project's evaluation issues including: (1) purposes, (2) advisability, (3) tools, (4) implementation, and (5) possible benefits and presents the advantages of a continuous monitoring. Project evaluation tool to assess structure and resources, process, management and communication, achievements, and outcomes. The project used a mixed evaluation approach and included Strengths (S), Weaknesses (W), Opportunities (O), and Threats (SWOT) analysis. A methodology for longitudinal EU projects' evaluation is described. The evaluation process allowed to highlight strengths and weaknesses and highlighted good coordination and communication between project partners as well as some key issues such as: the need for a shared glossary covering areas investigated by the project, problematic issues related to the involvement of stakeholders from outside the project, and issues with timing. Numerical SWOT analysis showed improvement in project performance over time. The proportion of participating project partners in the evaluation varied from 100% to 83.3%. There is a need for the implementation of a structured evaluation process in multidisciplinary projects involving different stakeholders in diverse socio-environmental and political conditions. Based on the PATHWAYS experience, a clear monitoring methodology is suggested as essential in every multidisciplinary research projects.
Middleware Evaluation and Benchmarking for Use in Mission Operations Centers

NASA Technical Reports Server (NTRS)

Antonucci, Rob; Waktola, Waka

2005-01-01

Middleware technologies have been promoted as timesaving, cost-cutting alternatives to the point-to-point communication used in traditional mission operations systems. However, missions have been slow to adopt the new technology. The lack of existing middleware-based missions has given rise to uncertainty about middleware's ability to perform in an operational setting. Most mission architects are also unfamiliar with the technology and do not know the benefits and detriments to architectural choices - or even what choices are available. We will present the findings of a study that evaluated several middleware options specifically for use in a mission operations system. We will address some common misconceptions regarding the applicability of middleware-based architectures, and we will identify the design decisions and tradeoffs that must be made when choosing a middleware solution. The Middleware Comparison and Benchmark Study was conducted at NASA Goddard Space Flight Center to comprehensively evaluate candidate middleware products, compare and contrast the performance of middleware solutions with the traditional point- to-point socket approach, and assess data delivery and reliability strategies. The study focused on requirements of the Global Precipitation Measurement (GPM) mission, validating the potential use of middleware in the GPM mission ground system. The study was jointly funded by GPM and the Goddard Mission Services Evolution Center (GMSEC), a virtual organization for providing mission enabling solutions and promoting the use of appropriate new technologies for mission support. The study was broken into two phases. To perform the generic middleware benchmarking and performance analysis, a network was created with data producers and consumers passing data between themselves. The benchmark monitored the delay, throughput, and reliability of the data as the characteristics were changed. Measurements were taken under a variety of topologies, data demands
Radiation Detection Computational Benchmark Scenarios

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shaver, Mark W.; Casella, Andrew M.; Wittman, Richard S.

2013-09-24

Modeling forms an important component of radiation detection development, allowing for testing of new detector designs, evaluation of existing equipment against a wide variety of potential threat sources, and assessing operation performance of radiation detection systems. This can, however, result in large and complex scenarios which are time consuming to model. A variety of approaches to radiation transport modeling exist with complementary strengths and weaknesses for different problems. This variety of approaches, and the development of promising new tools (such as ORNL’s ADVANTG) which combine benefits of multiple approaches, illustrates the need for a means of evaluating or comparing differentmore » techniques for radiation detection problems. This report presents a set of 9 benchmark problems for comparing different types of radiation transport calculations, identifying appropriate tools for classes of problems, and testing and guiding the development of new methods. The benchmarks were drawn primarily from existing or previous calculations with a preference for scenarios which include experimental data, or otherwise have results with a high level of confidence, are non-sensitive, and represent problem sets of interest to NA-22. From a technical perspective, the benchmarks were chosen to span a range of difficulty and to include gamma transport, neutron transport, or both and represent different important physical processes and a range of sensitivity to angular or energy fidelity. Following benchmark identification, existing information about geometry, measurements, and previous calculations were assembled. Monte Carlo results (MCNP decks) were reviewed or created and re-run in order to attain accurate computational times and to verify agreement with experimental data, when present. Benchmark information was then conveyed to ORNL in order to guide testing and development of hybrid calculations. The results of those ADVANTG calculations were then sent to
Quality management benchmarking: FDA compliance in pharmaceutical industry.

PubMed

Jochem, Roland; Landgraf, Katja

2010-01-01

By analyzing and comparing industry and business best practice, processes can be optimized and become more successful mainly because efficiency and competitiveness increase. This paper aims to focus on some examples. Case studies are used to show knowledge exchange in the pharmaceutical industry. Best practice solutions were identified in two companies using a benchmarking method and five-stage model. Despite large administrations, there is much potential regarding business process organization. This project makes it possible for participants to fully understand their business processes. The benchmarking method gives an opportunity to critically analyze value chains (a string of companies or players working together to satisfy market demands for a special product). Knowledge exchange is interesting for companies that like to be global players. Benchmarking supports information exchange and improves competitive ability between different enterprises. Findings suggest that the five-stage model improves efficiency and effectiveness. Furthermore, the model increases the chances for reaching targets. The method gives security to partners that did not have benchmarking experience. The study identifies new quality management procedures. Process management and especially benchmarking is shown to support pharmaceutical industry improvements.

Benchmarking routine psychological services: a discussion of challenges and methods.

PubMed

Delgadillo, Jaime; McMillan, Dean; Leach, Chris; Lucock, Mike; Gilbody, Simon; Wood, Nick

2014-01-01

Policy developments in recent years have led to important changes in the level of access to evidence-based psychological treatments. Several methods have been used to investigate the effectiveness of these treatments in routine care, with different approaches to outcome definition and data analysis. To present a review of challenges and methods for the evaluation of evidence-based treatments delivered in routine mental healthcare. This is followed by a case example of a benchmarking method applied in primary care. High, average and poor performance benchmarks were calculated through a meta-analysis of published data from services working under the Improving Access to Psychological Therapies (IAPT) Programme in England. Pre-post treatment effect sizes (ES) and confidence intervals were estimated to illustrate a benchmarking method enabling services to evaluate routine clinical outcomes. High, average and poor performance ES for routine IAPT services were estimated to be 0.91, 0.73 and 0.46 for depression (using PHQ-9) and 1.02, 0.78 and 0.52 for anxiety (using GAD-7). Data from one specific IAPT service exemplify how to evaluate and contextualize routine clinical performance against these benchmarks. The main contribution of this report is to summarize key recommendations for the selection of an adequate set of psychometric measures, the operational definition of outcomes, and the statistical evaluation of clinical performance. A benchmarking method is also presented, which may enable a robust evaluation of clinical performance against national benchmarks. Some limitations concerned significant heterogeneity among data sources, and wide variations in ES and data completeness.
Benchmarking short sequence mapping tools

PubMed Central

2013-01-01

Background The development of next-generation sequencing instruments has led to the generation of millions of short sequences in a single run. The process of aligning these reads to a reference genome is time consuming and demands the development of fast and accurate alignment tools. However, the current proposed tools make different compromises between the accuracy and the speed of mapping. Moreover, many important aspects are overlooked while comparing the performance of a newly developed tool to the state of the art. Therefore, there is a need for an objective evaluation method that covers all the aspects. In this work, we introduce a benchmarking suite to extensively analyze sequencing tools with respect to various aspects and provide an objective comparison. Results We applied our benchmarking tests on 9 well known mapping tools, namely, Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST) using synthetic data and real RNA-Seq data. MAQ and RMAP are based on building hash tables for the reads, whereas the remaining tools are based on indexing the reference genome. The benchmarking tests reveal the strengths and weaknesses of each tool. The results show that no single tool outperforms all others in all metrics. However, Bowtie maintained the best throughput for most of the tests while BWA performed better for longer read lengths. The benchmarking tests are not restricted to the mentioned tools and can be further applied to others. Conclusion The mapping process is still a hard problem that is affected by many factors. In this work, we provided a benchmarking suite that reveals and evaluates the different factors affecting the mapping process. Still, there is no tool that outperforms all of the others in all the tests. Therefore, the end user should clearly specify his needs in order to choose the tool that provides the best results. PMID:23758764
Parallel Ada benchmarks for the SVMS

NASA Technical Reports Server (NTRS)

Collard, Philippe E.

1990-01-01

The use of parallel processing paradigm to design and develop faster and more reliable computers appear to clearly mark the future of information processing. NASA started the development of such an architecture: the Spaceborne VHSIC Multi-processor System (SVMS). Ada will be one of the languages used to program the SVMS. One of the unique characteristics of Ada is that it supports parallel processing at the language level through the tasking constructs. It is important for the SVMS project team to assess how efficiently the SVMS architecture will be implemented, as well as how efficiently Ada environment will be ported to the SVMS. AUTOCLASS II, a Bayesian classifier written in Common Lisp, was selected as one of the benchmarks for SVMS configurations. The purpose of the R and D effort was to provide the SVMS project team with the version of AUTOCLASS II, written in Ada, that would make use of Ada tasking constructs as much as possible so as to constitute a suitable benchmark. Additionally, a set of programs was developed that would measure Ada tasking efficiency on parallel architectures as well as determine the critical parameters influencing tasking efficiency. All this was designed to provide the SVMS project team with a set of suitable tools in the development of the SVMS architecture.
Development of a Computer-based Benchmarking and Analytical Tool. Benchmarking and Energy & Water Savings Tool in Dairy Plants (BEST-Dairy)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Tengfang; Flapper, Joris; Ke, Jing

The overall goal of the project is to develop a computer-based benchmarking and energy and water savings tool (BEST-Dairy) for use in the California dairy industry – including four dairy processes – cheese, fluid milk, butter, and milk powder.
Benchmarks for Psychotherapy Efficacy in Adult Major Depression

ERIC Educational Resources Information Center

Minami, Takuya; Wampold, Bruce E.; Serlin, Ronald C.; Kircher, John C.; Brown, George S.

2007-01-01

This study estimates pretreatment-posttreatment effect size benchmarks for the treatment of major depression in adults that may be useful in evaluating psychotherapy effectiveness in clinical practice. Treatment efficacy benchmarks for major depression were derived for 3 different types of outcome measures: the Hamilton Rating Scale for Depression…
Benchmarking U.S. Small Wind Costs with the Distributed Wind Taxonomy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Orrell, Alice C.; Poehlman, Eric A.

The objective of this report is to benchmark costs for small wind projects installed in the United States using a distributed wind taxonomy. Consequently, this report is a starting point to help expand the U.S. distributed wind market by informing potential areas for small wind cost-reduction opportunities and providing a benchmark to track future small wind cost-reduction progress.
Evaluating the Effect of Labeled Benchmarks on Children's Number Line Estimation Performance and Strategy Use.

PubMed

Peeters, Dominique; Sekeris, Elke; Verschaffel, Lieven; Luwel, Koen

2017-01-01

Some authors argue that age-related improvements in number line estimation (NLE) performance result from changes in strategy use. More specifically, children's strategy use develops from only using the origin of the number line, to using the origin and the endpoint, to eventually also relying on the midpoint of the number line. Recently, Peeters et al. (unpublished) investigated whether the provision of additional unlabeled benchmarks at 25, 50, and 75% of the number line, positively affects third and fifth graders' NLE performance and benchmark-based strategy use. It was found that only the older children benefitted from the presence of these benchmarks at the quartiles of the number line (i.e., 25 and 75%), as they made more use of these benchmarks, leading to more accurate estimates. A possible explanation for this lack of improvement in third graders might be their inability to correctly link the presented benchmarks with their corresponding numerical values. In the present study, we investigated whether labeling these benchmarks with their corresponding numerical values, would have a positive effect on younger children's NLE performance and quartile-based strategy use as well. Third and sixth graders were assigned to one of three conditions: (a) a control condition with an empty number line bounded by 0 at the origin and 1,000 at the endpoint, (b) an unlabeled condition with three additional external benchmarks without numerical labels at 25, 50, and 75% of the number line, and (c) a labeled condition in which these benchmarks were labeled with 250, 500, and 750, respectively. Results indicated that labeling the benchmarks has a positive effect on third graders' NLE performance and quartile-based strategy use, whereas sixth graders already benefited from the mere provision of unlabeled benchmarks. These findings imply that children's benchmark-based strategy use can be stimulated by adding additional externally provided benchmarks on the number line, but that
BENCHMARK DOSES FOR CHEMICAL MIXTURES: EVALUATION OF A MIXTURE OF 18 PHAHS.

EPA Science Inventory

Benchmark doses (BMDs), defined as doses of a substance that are expected to result in a pre-specified level of "benchmark" response (BMR), have been used for quantifying the risk associated with exposure to environmental hazards. The lower confidence limit of the BMD is used as...
A chemical EOR benchmark study of different reservoir simulators

NASA Astrophysics Data System (ADS)

Goudarzi, Ali; Delshad, Mojdeh; Sepehrnoori, Kamy

2016-09-01

Interest in chemical EOR processes has intensified in recent years due to the advancements in chemical formulations and injection techniques. Injecting Polymer (P), surfactant/polymer (SP), and alkaline/surfactant/polymer (ASP) are techniques for improving sweep and displacement efficiencies with the aim of improving oil production in both secondary and tertiary floods. There has been great interest in chemical flooding recently for different challenging situations. These include high temperature reservoirs, formations with extreme salinity and hardness, naturally fractured carbonates, and sandstone reservoirs with heavy and viscous crude oils. More oil reservoirs are reaching maturity where secondary polymer floods and tertiary surfactant methods have become increasingly important. This significance has added to the industry's interest in using reservoir simulators as tools for reservoir evaluation and management to minimize costs and increase the process efficiency. Reservoir simulators with special features are needed to represent coupled chemical and physical processes present in chemical EOR processes. The simulators need to be first validated against well controlled lab and pilot scale experiments to reliably predict the full field implementations. The available data from laboratory scale include 1) phase behavior and rheological data; and 2) results of secondary and tertiary coreflood experiments for P, SP, and ASP floods under reservoir conditions, i.e. chemical retentions, pressure drop, and oil recovery. Data collected from corefloods are used as benchmark tests comparing numerical reservoir simulators with chemical EOR modeling capabilities such as STARS of CMG, ECLIPSE-100 of Schlumberger, REVEAL of Petroleum Experts. The research UTCHEM simulator from The University of Texas at Austin is also included since it has been the benchmark for chemical flooding simulation for over 25 years. The results of this benchmark comparison will be utilized to improve
Analysis of a benchmark suite to evaluate mixed numeric and symbolic processing

NASA Technical Reports Server (NTRS)

Ragharan, Bharathi; Galant, David

1992-01-01

The suite of programs that formed the benchmark for a proposed advanced computer is described and analyzed. The features of the processor and its operating system that are tested by the benchmark are discussed. The computer codes and the supporting data for the analysis are given as appendices.
SU-E-J-30: Benchmark Image-Based TCP Calculation for Evaluation of PTV Margins for Lung SBRT Patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, M; Chetty, I; Zhong, H

2014-06-01

Purpose: Tumor control probability (TCP) calculated with accumulated radiation doses may help design appropriate treatment margins. Image registration errors, however, may compromise the calculated TCP. The purpose of this study is to develop benchmark CT images to quantify registration-induced errors in the accumulated doses and their corresponding TCP. Methods: 4DCT images were registered from end-inhale (EI) to end-exhale (EE) using a “demons” algorithm. The demons DVFs were corrected by an FEM model to get realistic deformation fields. The FEM DVFs were used to warp the EI images to create the FEM-simulated images. The two images combined with the FEM DVFmore » formed a benchmark model. Maximum intensity projection (MIP) images, created from the EI and simulated images, were used to develop IMRT plans. Two plans with 3 and 5 mm margins were developed for each patient. With these plans, radiation doses were recalculated on the simulated images and warped back to the EI images using the FEM DVFs to get the accumulated doses. The Elastix software was used to register the FEM-simulated images to the EI images. TCPs calculated with the Elastix-accumulated doses were compared with those generated by the FEM to get the TCP error of the Elastix registrations. Results: For six lung patients, the mean Elastix registration error ranged from 0.93 to 1.98 mm. Their relative dose errors in PTV were between 0.28% and 6.8% for 3mm margin plans, and between 0.29% and 6.3% for 5mm-margin plans. As the PTV margin reduced from 5 to 3 mm, the mean TCP error of the Elastix-reconstructed doses increased from 2.0% to 2.9%, and the mean NTCP errors decreased from 1.2% to 1.1%. Conclusion: Patient-specific benchmark images can be used to evaluate the impact of registration errors on the computed TCPs, and may help select appropriate PTV margins for lung SBRT patients.« less
Benchmarking the cost efficiency of community care in Australian child and adolescent mental health services: implications for future benchmarking.

PubMed

Furber, Gareth; Brann, Peter; Skene, Clive; Allison, Stephen

2011-06-01

The purpose of this study was to benchmark the cost efficiency of community care across six child and adolescent mental health services (CAMHS) drawn from different Australian states. Organizational, contact and outcome data from the National Mental Health Benchmarking Project (NMHBP) data-sets were used to calculate cost per "treatment hour" and cost per episode for the six participating organizations. We also explored the relationship between intake severity as measured by the Health of the Nations Outcome Scales for Children and Adolescents (HoNOSCA) and cost per episode. The average cost per treatment hour was $223, with cost differences across the six services ranging from a mean of $156 to $273 per treatment hour. The average cost per episode was $3349 (median $1577) and there were significant differences in the CAMHS organizational medians ranging from $388 to $7076 per episode. HoNOSCA scores explained at best 6% of the cost variance per episode. These large cost differences indicate that community CAMHS have the potential to make substantial gains in cost efficiency through collaborative benchmarking. Benchmarking forums need considerable financial and business expertise for detailed comparison of business models for service provision.
Benchmark simulation Model no 2 in Matlab-simulink: towards plant-wide WWTP control strategy evaluation.

PubMed

Vreck, D; Gernaey, K V; Rosen, C; Jeppsson, U

2006-01-01

In this paper, implementation of the Benchmark Simulation Model No 2 (BSM2) within Matlab-Simulink is presented. The BSM2 is developed for plant-wide WWTP control strategy evaluation on a long-term basis. It consists of a pre-treatment process, an activated sludge process and sludge treatment processes. Extended evaluation criteria are proposed for plant-wide control strategy assessment. Default open-loop and closed-loop strategies are also proposed to be used as references with which to compare other control strategies. Simulations indicate that the BM2 is an appropriate tool for plant-wide control strategy evaluation.
Evaluation and optimization of virtual screening workflows with DEKOIS 2.0--a public library of challenging docking benchmark sets.

PubMed

Bauer, Matthias R; Ibrahim, Tamer M; Vogel, Simon M; Boeckler, Frank M

2013-06-24

The application of molecular benchmarking sets helps to assess the actual performance of virtual screening (VS) workflows. To improve the efficiency of structure-based VS approaches, the selection and optimization of various parameters can be guided by benchmarking. With the DEKOIS 2.0 library, we aim to further extend and complement the collection of publicly available decoy sets. Based on BindingDB bioactivity data, we provide 81 new and structurally diverse benchmark sets for a wide variety of different target classes. To ensure a meaningful selection of ligands, we address several issues that can be found in bioactivity data. We have improved our previously introduced DEKOIS methodology with enhanced physicochemical matching, now including the consideration of molecular charges, as well as a more sophisticated elimination of latent actives in the decoy set (LADS). We evaluate the docking performance of Glide, GOLD, and AutoDock Vina with our data sets and highlight existing challenges for VS tools. All DEKOIS 2.0 benchmark sets will be made accessible at http://www.dekois.com.
A One-group, One-dimensional Transport Benchmark in Cylindrical Geometry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barry Ganapol; Abderrafi M. Ougouag

A 1-D, 1-group computational benchmark in cylndrical geometry is described. This neutron transport benchmark is useful for evaluating reactor concepts that possess azimuthal symmetry such as a pebble-bed reactor.
Benchmarking infrastructure for mutation text mining

PubMed Central

2014-01-01

Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600
Benchmarking infrastructure for mutation text mining.

PubMed

Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

2014-02-25

Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
Developing a benchmark for emotional analysis of music

PubMed Central

Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

Music emotion recognition (MER) field rapidly expanded in the last decade. Many new methods and new audio features are developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the data representation diversity and scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the ‘Emotion in Music’ task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER. PMID:28282400
Decoys Selection in Benchmarking Datasets: Overview and Perspectives

PubMed Central

Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

2018-01-01

Virtual Screening (VS) is designed to prospectively help identifying potential hits, i.e., compounds capable of interacting with a given target and potentially modulate its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is the most adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compounds subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds that has considerably changed over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoys selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets. PMID:29416509
Developing a benchmark for emotional analysis of music.

PubMed

Aljanaki, Anna; Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

Music emotion recognition (MER) field rapidly expanded in the last decade. Many new methods and new audio features are developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the data representation diversity and scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the 'Emotion in Music' task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER.

Project Pride Evaluation Report.

ERIC Educational Resources Information Center

Jennewein, Marilyn; And Others

Project PRIDE (Probe, Research, Inquire, Discover, and Evaluate) is evaluated in this report to provide data to be used as a learning tool for project staff and student participants. Major objectives of the project are to provide an inter-disciplinary, objective approach to the study of the American heritage, and to incorporate methods and…
A benchmarking program to reduce red blood cell outdating: implementation, evaluation, and a conceptual framework.

PubMed

Barty, Rebecca L; Gagliardi, Kathleen; Owens, Wendy; Lauzon, Deborah; Scheuermann, Sheena; Liu, Yang; Wang, Grace; Pai, Menaka; Heddle, Nancy M

2015-07-01

Benchmarking is a quality improvement tool that compares an organization's performance to that of its peers for selected indicators, to improve practice. Processes to develop evidence-based benchmarks for red blood cell (RBC) outdating in Ontario hospitals, based on RBC hospital disposition data from Canadian Blood Services, have been previously reported. These benchmarks were implemented in 160 hospitals provincewide with a multifaceted approach, which included hospital education, inventory management tools and resources, summaries of best practice recommendations, recognition of high-performing sites, and audit tools on the Transfusion Ontario website (http://transfusionontario.org). In this study we describe the implementation process and the impact of the benchmarking program on RBC outdating. A conceptual framework for continuous quality improvement of a benchmarking program was also developed. The RBC outdating rate for all hospitals trended downward continuously from April 2006 to February 2012, irrespective of hospitals' transfusion rates or their distance from the blood supplier. The highest annual outdating rate was 2.82%, at the beginning of the observation period. Each year brought further reductions, with a nadir outdating rate of 1.02% achieved in 2011. The key elements of the successful benchmarking strategy included dynamic targets, a comprehensive and evidence-based implementation strategy, ongoing information sharing, and a robust data system to track information. The Ontario benchmarking program for RBC outdating resulted in continuous and sustained quality improvement. Our conceptual iterative framework for benchmarking provides a guide for institutions implementing a benchmarking program. © 2015 AABB.
Experimental power density distribution benchmark in the TRIGA Mark II reactor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Snoj, L.; Stancar, Z.; Radulovic, V.

2012-07-01

In order to improve the power calibration process and to benchmark the existing computational model of the TRIGA Mark II reactor at the Josef Stefan Inst. (JSI), a bilateral project was started as part of the agreement between the French Commissariat a l'energie atomique et aux energies alternatives (CEA) and the Ministry of higher education, science and technology of Slovenia. One of the objectives of the project was to analyze and improve the power calibration process of the JSI TRIGA reactor (procedural improvement and uncertainty reduction) by using absolutely calibrated CEA fission chambers (FCs). This is one of the fewmore » available power density distribution benchmarks for testing not only the fission rate distribution but also the absolute values of the fission rates. Our preliminary calculations indicate that the total experimental uncertainty of the measured reaction rate is sufficiently low that the experiments could be considered as benchmark experiments. (authors)« less
Developing integrated benchmarks for DOE performance measurement

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barancik, J.I.; Kramer, C.F.; Thode, Jr. H.C.

1992-09-30

The objectives of this task were to describe and evaluate selected existing sources of information on occupational safety and health with emphasis on hazard and exposure assessment, abatement, training, reporting, and control identifying for exposure and outcome in preparation for developing DOE performance benchmarks. Existing resources and methodologies were assessed for their potential use as practical performance benchmarks. Strengths and limitations of current data resources were identified. Guidelines were outlined for developing new or improved performance factors, which then could become the basis for selecting performance benchmarks. Data bases for non-DOE comparison populations were identified so that DOE performance couldmore » be assessed relative to non-DOE occupational and industrial groups. Systems approaches were described which can be used to link hazards and exposure, event occurrence, and adverse outcome factors, as needed to generate valid, reliable, and predictive performance benchmarks. Data bases were identified which contain information relevant to one or more performance assessment categories . A list of 72 potential performance benchmarks was prepared to illustrate the kinds of information that can be produced through a benchmark development program. Current information resources which may be used to develop potential performance benchmarks are limited. There is need to develop an occupational safety and health information and data system in DOE, which is capable of incorporating demonstrated and documented performance benchmarks prior to, or concurrent with the development of hardware and software. A key to the success of this systems approach is rigorous development and demonstration of performance benchmark equivalents to users of such data before system hardware and software commitments are institutionalized.« less
Benchmarking: measuring the outcomes of evidence-based practice.

PubMed

DeLise, D C; Leasure, A R

2001-01-01

Measurement of the outcomes associated with implementation of evidence-based practice changes is becoming increasingly emphasized by multiple health care disciplines. A final step to the process of implementing and sustaining evidence-supported practice changes is that of outcomes evaluation and monitoring. The comparison of outcomes to internal and external measures is known as benchmarking. This article discusses evidence-based practice, provides an overview of outcomes evaluation, and describes the process of benchmarking to improve practice. A case study is used to illustrate this concept.
Performance evaluation of tile-based Fisher Ratio analysis using a benchmark yeast metabolome dataset.

PubMed

Watson, Nathanial E; Parsons, Brendon A; Synovec, Robert E

2016-08-12

Performance of tile-based Fisher Ratio (F-ratio) data analysis, recently developed for discovery-based studies using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC-TOFMS), is evaluated with a metabolomics dataset that had been previously analyzed in great detail, but while taking a brute force approach. The previously analyzed data (referred to herein as the benchmark dataset) were intracellular extracts from Saccharomyces cerevisiae (yeast), either metabolizing glucose (repressed) or ethanol (derepressed), which define the two classes in the discovery-based analysis to find metabolites that are statistically different in concentration between the two classes. Beneficially, this previously analyzed dataset provides a concrete means to validate the tile-based F-ratio software. Herein, we demonstrate and validate the significant benefits of applying tile-based F-ratio analysis. The yeast metabolomics data are analyzed more rapidly in about one week versus one year for the prior studies with this dataset. Furthermore, a null distribution analysis is implemented to statistically determine an adequate F-ratio threshold, whereby the variables with F-ratio values below the threshold can be ignored as not class distinguishing, which provides the analyst with confidence when analyzing the hit table. Forty-six of the fifty-four benchmarked changing metabolites were discovered by the new methodology while consistently excluding all but one of the benchmarked nineteen false positive metabolites previously identified. Copyright © 2016 Elsevier B.V. All rights reserved.
Ultracool dwarf benchmarks with Gaia primaries

NASA Astrophysics Data System (ADS)

Marocco, F.; Pinfield, D. J.; Cook, N. J.; Zapatero Osorio, M. R.; Montes, D.; Caballero, J. A.; Gálvez-Ortiz, M. C.; Gromadzki, M.; Jones, H. R. A.; Kurtev, R.; Smart, R. L.; Zhang, Z.; Cabrera Lavers, A. L.; García Álvarez, D.; Qi, Z. X.; Rickard, M. J.; Dover, L.

2017-10-01

We explore the potential of Gaia for the field of benchmark ultracool/brown dwarf companions, and present the results of an initial search for metal-rich/metal-poor systems. A simulated population of resolved ultracool dwarf companions to Gaia primary stars is generated and assessed. Of the order of ˜24 000 companions should be identifiable outside of the Galactic plane (|b| > 10 deg) with large-scale ground- and space-based surveys including late M, L, T and Y types. Our simulated companion parameter space covers 0.02 ≤ M/M⊙ ≤ 0.1, 0.1 ≤ age/Gyr ≤ 14 and -2.5 ≤ [Fe/H] ≤ 0.5, with systems required to have a false alarm probability <10-4, based on projected separation and expected constraints on common distance, common proper motion and/or common radial velocity. Within this bulk population, we identify smaller target subsets of rarer systems whose collective properties still span the full parameter space of the population, as well as systems containing primary stars that are good age calibrators. Our simulation analysis leads to a series of recommendations for candidate selection and observational follow-up that could identify ˜500 diverse Gaia benchmarks. As a test of the veracity of our methodology and simulations, our initial search uses UKIRT Infrared Deep Sky Survey and Sloan Digital Sky Survey to select secondaries, with the parameters of primaries taken from Tycho-2, Radial Velocity Experiment, Large sky Area Multi-Object fibre Spectroscopic Telescope and Tycho-Gaia Astrometric Solution. We identify and follow up 13 new benchmarks. These include M8-L2 companions, with metallicity constraints ranging in quality, but robust in the range -0.39 ≤ [Fe/H] ≤ +0.36, and with projected physical separation in the range 0.6 < s/kau < 76. Going forward, Gaia offers a very high yield of benchmark systems, from which diverse subsamples may be able to calibrate a range of foundational ultracool/sub-stellar theory and observation.
How do I know if my forecasts are better? Using benchmarks in hydrological ensemble prediction

NASA Astrophysics Data System (ADS)

Pappenberger, F.; Ramos, M. H.; Cloke, H. L.; Wetterhall, F.; Alfieri, L.; Bogner, K.; Mueller, A.; Salamon, P.

2015-03-01

The skill of a forecast can be assessed by comparing the relative proximity of both the forecast and a benchmark to the observations. Example benchmarks include climatology or a naïve forecast. Hydrological ensemble prediction systems (HEPS) are currently transforming the hydrological forecasting environment but in this new field there is little information to guide researchers and operational forecasters on how benchmarks can be best used to evaluate their probabilistic forecasts. In this study, it is identified that the forecast skill calculated can vary depending on the benchmark selected and that the selection of a benchmark for determining forecasting system skill is sensitive to a number of hydrological and system factors. A benchmark intercomparison experiment is then undertaken using the continuous ranked probability score (CRPS), a reference forecasting system and a suite of 23 different methods to derive benchmarks. The benchmarks are assessed within the operational set-up of the European Flood Awareness System (EFAS) to determine those that are 'toughest to beat' and so give the most robust discrimination of forecast skill, particularly for the spatial average fields that EFAS relies upon. Evaluating against an observed discharge proxy the benchmark that has most utility for EFAS and avoids the most naïve skill across different hydrological situations is found to be meteorological persistency. This benchmark uses the latest meteorological observations of precipitation and temperature to drive the hydrological model. Hydrological long term average benchmarks, which are currently used in EFAS, are very easily beaten by the forecasting system and the use of these produces much naïve skill. When decomposed into seasons, the advanced meteorological benchmarks, which make use of meteorological observations from the past 20 years at the same calendar date, have the most skill discrimination. They are also good at discriminating skill in low flows and for all
A benchmark for subduction zone modeling

NASA Astrophysics Data System (ADS)

van Keken, P.; King, S.; Peacock, S.

2003-04-01

Our understanding of subduction zones hinges critically on the ability to discern its thermal structure and dynamics. Computational modeling has become an essential complementary approach to observational and experimental studies. The accurate modeling of subduction zones is challenging due to the unique geometry, complicated rheological description and influence of fluid and melt formation. The complicated physics causes problems for the accurate numerical solution of the governing equations. As a consequence it is essential for the subduction zone community to be able to evaluate the ability and limitations of various modeling approaches. The participants of a workshop on the modeling of subduction zones, held at the University of Michigan at Ann Arbor, MI, USA in 2002, formulated a number of case studies to be developed into a benchmark similar to previous mantle convection benchmarks (Blankenbach et al., 1989; Busse et al., 1991; Van Keken et al., 1997). Our initial benchmark focuses on the dynamics of the mantle wedge and investigates three different rheologies: constant viscosity, diffusion creep, and dislocation creep. In addition we investigate the ability of codes to accurate model dynamic pressure and advection dominated flows. Proceedings of the workshop and the formulation of the benchmark are available at www.geo.lsa.umich.edu/~keken/subduction02.html We strongly encourage interested research groups to participate in this benchmark. At Nice 2003 we will provide an update and first set of benchmark results. Interested researchers are encouraged to contact one of the authors for further details.
New Reactor Physics Benchmark Data in the March 2012 Edition of the IRPhEP Handbook

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; Jim Gulliford

2012-11-01

The International Reactor Physics Experiment Evaluation Project (IRPhEP) was established to preserve integral reactor physics experimental data, including separate or special effects data for nuclear energy and technology applications. Numerous experiments that have been performed worldwide, represent a large investment of infrastructure, expertise, and cost, and are valuable resources of data for present and future research. These valuable assets provide the basis for recording, development, and validation of methods. If the experimental data are lost, the high cost to repeat many of these measurements may be prohibitive. The purpose of the IRPhEP is to provide an extensively peer-reviewed set ofmore » reactor physics-related integral data that can be used by reactor designers and safety analysts to validate the analytical tools used to design next-generation reactors and establish the safety basis for operation of these reactors. Contributors from around the world collaborate in the evaluation and review of selected benchmark experiments for inclusion in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook) [1]. Several new evaluations have been prepared for inclusion in the March 2012 edition of the IRPhEP Handbook.« less
In response to an open invitation for comments on AAAS project 2061's Benchmark books on science. Part 1: documentation of serious errors in cell biology.

PubMed

Ling, Gilbert

2006-01-01

Project 2061 was founded by the American Association for the Advancement of Science (AAAS) to improve secondary school science education. An in-depth study of ten 9 to 12th grade biology textbooks led to the verdict that none conveyed "Big Ideas" that would give coherence and meaning to the profusion of lavishly illustrated isolated details. However, neither the Project report itself nor the Benchmark books put out earlier by the Project carries what deserves the designation of "Big Ideas." Worse, in the two earliest-published Benchmark books, the basic unit of all life forms--the living cell--is described as a soup enclosed by a cell membrane, that determines what can enter or leave the cell. This is astonishing since extensive experimental evidence has unequivocally disproved this idea 60 years ago. A "new" version of the membrane theory brought in to replace the discredited (sieve) version is the pump model--currently taught as established truth in all high-school and college biology textbooks--was also unequivocally disproved 40 years ago. This comment is written partly in response to Bechmark's gracious open invitation for ideas to improve the books and through them, to improve US secondary school science education.
Evaluating the Effect of Labeled Benchmarks on Children’s Number Line Estimation Performance and Strategy Use

PubMed Central

Peeters, Dominique; Sekeris, Elke; Verschaffel, Lieven; Luwel, Koen

2017-01-01

Some authors argue that age-related improvements in number line estimation (NLE) performance result from changes in strategy use. More specifically, children’s strategy use develops from only using the origin of the number line, to using the origin and the endpoint, to eventually also relying on the midpoint of the number line. Recently, Peeters et al. (unpublished) investigated whether the provision of additional unlabeled benchmarks at 25, 50, and 75% of the number line, positively affects third and fifth graders’ NLE performance and benchmark-based strategy use. It was found that only the older children benefitted from the presence of these benchmarks at the quartiles of the number line (i.e., 25 and 75%), as they made more use of these benchmarks, leading to more accurate estimates. A possible explanation for this lack of improvement in third graders might be their inability to correctly link the presented benchmarks with their corresponding numerical values. In the present study, we investigated whether labeling these benchmarks with their corresponding numerical values, would have a positive effect on younger children’s NLE performance and quartile-based strategy use as well. Third and sixth graders were assigned to one of three conditions: (a) a control condition with an empty number line bounded by 0 at the origin and 1,000 at the endpoint, (b) an unlabeled condition with three additional external benchmarks without numerical labels at 25, 50, and 75% of the number line, and (c) a labeled condition in which these benchmarks were labeled with 250, 500, and 750, respectively. Results indicated that labeling the benchmarks has a positive effect on third graders’ NLE performance and quartile-based strategy use, whereas sixth graders already benefited from the mere provision of unlabeled benchmarks. These findings imply that children’s benchmark-based strategy use can be stimulated by adding additional externally provided benchmarks on the number line
Performance of Landslide-HySEA tsunami model for NTHMP benchmarking validation process

NASA Astrophysics Data System (ADS)

Macias, Jorge

2017-04-01

In its FY2009 Strategic Plan, the NTHMP required that all numerical tsunami inundation models be verified as accurate and consistent through a model benchmarking process. This was completed in 2011, but only for seismic tsunami sources and in a limited manner for idealized solid underwater landslides. Recent work by various NTHMP states, however, has shown that landslide tsunami hazard may be dominant along significant parts of the US coastline, as compared to hazards from other tsunamigenic sources. To perform the above-mentioned validation process, a set of candidate benchmarks were proposed. These benchmarks are based on a subset of available laboratory date sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) close the list of proposed benchmarks. The Landslide-HySEA model has participated in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017. The aim of this presentation is to show some of the numerical results obtained for Landslide-HySEA in the framework of this benchmarking validation/verification effort. Acknowledgements. This research has been partially supported by the Junta de Andalucía research project TESELA (P11-RNM7069), the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
Nutrient cycle benchmarks for earth system land model

NASA Astrophysics Data System (ADS)

Zhu, Q.; Riley, W. J.; Tang, J.; Zhao, L.

2017-12-01

Projecting future biosphere-climate feedbacks using Earth system models (ESMs) relies heavily on robust modeling of land surface carbon dynamics. More importantly, soil nutrient (particularly, nitrogen (N) and phosphorus (P)) dynamics strongly modulate carbon dynamics, such as plant sequestration of atmospheric CO2. Prevailing ESM land models all consider nitrogen as a potentially limiting nutrient, and several consider phosphorus. However, including nutrient cycle processes in ESM land models potentially introduces large uncertainties that could be identified and addressed by improved observational constraints. We describe the development of two nutrient cycle benchmarks for ESM land models: (1) nutrient partitioning between plants and soil microbes inferred from 15N and 33P tracers studies and (2) nutrient limitation effects on carbon cycle informed by long-term fertilization experiments. We used these benchmarks to evaluate critical hypotheses regarding nutrient cycling and their representation in ESMs. We found that a mechanistic representation of plant-microbe nutrient competition based on relevant functional traits best reproduced observed plant-microbe nutrient partitioning. We also found that for multiple-nutrient models (i.e., N and P), application of Liebig's law of the minimum is often inaccurate. Rather, the Multiple Nutrient Limitation (MNL) concept better reproduces observed carbon-nutrient interactions.
U.S. Solar Photovoltaic System Cost Benchmark: Q1 2017

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fu, Ran; Feldman, David; Margolis, Robert

This report benchmarks U.S. solar photovoltaic (PV) system installed costs as of the first quarter of 2017 (Q1 2017). We use a bottom-up methodology, accounting for all system and projectdevelopment costs incurred during the installation to model the costs for residential, commercial, and utility-scale systems. In general, we attempt to model the typical installation techniques and business operations from an installed-cost perspective. Costs are represented from the perspective of the developer/installer; thus, all hardware costs represent the price at which components are purchased by the developer/installer, not accounting for preexisting supply agreements or other contracts. Importantly, the benchmark also representsmore » the sales price paid to the installer; therefore, it includes profit in the cost of the hardware, 1 along with the profit the installer/developer receives, as a separate cost category. However, it does not include any additional net profit, such as a developer fee or price gross-up, which is common in the marketplace. We adopt this approach owing to the wide variation in developer profits in all three sectors, where project pricing is highly dependent on region and project specifics such as local retail electricity rate structures, local rebate and incentive structures, competitive environment, and overall project or deal structures. Finally, our benchmarks are national averages weighted by state installed capacities.« less
Teaching Medical Students at a Distance: Using Distance Learning Benchmarks to Plan and Evaluate a Web-Enhanced Medical Student Curriculum

ERIC Educational Resources Information Center

Olney, Cynthia A.; Chumley, Heidi; Parra, Juan M.

2004-01-01

A team designing a Web-enhanced third-year medical education didactic curriculum based their course planning and evaluation activities on the Institute for Higher Education Policy's (2000) 24 benchmarks for online distance learning. The authors present the team's blueprint for planning and evaluating the Web-enhanced curriculum, which incorporates…
5 CFR 470.317 - Project evaluation.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 5 Administrative Personnel 1 2014-01-01 2014-01-01 false Project evaluation. 470.317 Section 470... MANAGEMENT RESEARCH PROGRAMS AND DEMONSTRATIONS PROJECTS Regulatory Requirements Pertaining to Demonstration Projects § 470.317 Project evaluation. (a) Compliance evaluation. OPM will review the operation of the...
5 CFR 470.317 - Project evaluation.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 5 Administrative Personnel 1 2012-01-01 2012-01-01 false Project evaluation. 470.317 Section 470... MANAGEMENT RESEARCH PROGRAMS AND DEMONSTRATIONS PROJECTS Regulatory Requirements Pertaining to Demonstration Projects § 470.317 Project evaluation. (a) Compliance evaluation. OPM will review the operation of the...
A review on the benchmarking concept in Malaysian construction safety performance

NASA Astrophysics Data System (ADS)

Ishak, Nurfadzillah; Azizan, Muhammad Azizi

2018-02-01

Construction industry is one of the major industries that propels Malaysia's economy in highly contributes to our nation's GDP growth, yet the high fatality rates on construction sites have caused concern among safety practitioners and the stakeholders. Hence, there is a need of benchmarking in performance of Malaysia's construction industry especially in terms of safety. This concept can create a fertile ground for ideas, but only in a receptive environment, organization that share good practices and compare their safety performance against other benefit most to establish improvement in safety culture. This research was conducted to study the awareness important, evaluate current practice and improvement, and also identify the constraint in implement of benchmarking on safety performance in our industry. Additionally, interviews with construction professionals were come out with different views on this concept. Comparison has been done to show the different understanding of benchmarking approach and how safety performance can be benchmarked. But, it's viewed as one mission, which to evaluate objectives identified through benchmarking that will improve the organization's safety performance. Finally, the expected result from this research is to help Malaysia's construction industry implement best practice in safety performance management through the concept of benchmarking.
Benchmarking the evaluated proton differential cross sections suitable for the EBS analysis of natSi and 16O

NASA Astrophysics Data System (ADS)

Kokkoris, M.; Dede, S.; Kantre, K.; Lagoyannis, A.; Ntemou, E.; Paneta, V.; Preketes-Sigalas, K.; Provatas, G.; Vlastou, R.; Bogdanović-Radović, I.; Siketić, Z.; Obajdin, N.

2017-08-01

The evaluated proton differential cross sections suitable for the Elastic Backscattering Spectroscopy (EBS) analysis of natSi and 16O, as obtained from SigmaCalc 2.0, have been benchmarked over a wide energy and angular range at two different accelerator laboratories, namely at N.C.S.R. 'Demokritos', Athens, Greece and at Ruđer Bošković Institute (RBI), Zagreb, Croatia, using a variety of high-purity thick targets of known stoichiometry. The results are presented in graphical and tabular forms, while the observed discrepancies, as well as, the limits in accuracy of the benchmarking procedure, along with target related effects, are thoroughly discussed and analysed. In the case of oxygen the agreement between simulated and experimental spectra was generally good, while for silicon serious discrepancies were observed above Ep,lab = 2.5 MeV, suggesting that a further tuning of the appropriate nuclear model parameters in the evaluated differential cross-section datasets is required.

Benchmarking high performance computing architectures with CMS’ skeleton framework

NASA Astrophysics Data System (ADS)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-10-01

In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta, Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.
Benchmarking: A Method for Continuous Quality Improvement in Health

PubMed Central

Ettorchi-Tardy, Amina; Levif, Marie; Michel, Philippe

2012-01-01

Benchmarking, a management approach for implementing best practices at best cost, is a recent concept in the healthcare system. The objectives of this paper are to better understand the concept and its evolution in the healthcare sector, to propose an operational definition, and to describe some French and international experiences of benchmarking in the healthcare sector. To this end, we reviewed the literature on this approach's emergence in the industrial sector, its evolution, its fields of application and examples of how it has been used in the healthcare sector. Benchmarking is often thought to consist simply of comparing indicators and is not perceived in its entirety, that is, as a tool based on voluntary and active collaboration among several organizations to create a spirit of competition and to apply best practices. The key feature of benchmarking is its integration within a comprehensive and participatory policy of continuous quality improvement (CQI). Conditions for successful benchmarking focus essentially on careful preparation of the process, monitoring of the relevant indicators, staff involvement and inter-organizational visits. Compared to methods previously implemented in France (CQI and collaborative projects), benchmarking has specific features that set it apart as a healthcare innovation. This is especially true for healthcare or medical–social organizations, as the principle of inter-organizational visiting is not part of their culture. Thus, this approach will need to be assessed for feasibility and acceptability before it is more widely promoted. PMID:23634166
Benchmarking: a method for continuous quality improvement in health.

PubMed

Ettorchi-Tardy, Amina; Levif, Marie; Michel, Philippe

2012-05-01

Benchmarking, a management approach for implementing best practices at best cost, is a recent concept in the healthcare system. The objectives of this paper are to better understand the concept and its evolution in the healthcare sector, to propose an operational definition, and to describe some French and international experiences of benchmarking in the healthcare sector. To this end, we reviewed the literature on this approach's emergence in the industrial sector, its evolution, its fields of application and examples of how it has been used in the healthcare sector. Benchmarking is often thought to consist simply of comparing indicators and is not perceived in its entirety, that is, as a tool based on voluntary and active collaboration among several organizations to create a spirit of competition and to apply best practices. The key feature of benchmarking is its integration within a comprehensive and participatory policy of continuous quality improvement (CQI). Conditions for successful benchmarking focus essentially on careful preparation of the process, monitoring of the relevant indicators, staff involvement and inter-organizational visits. Compared to methods previously implemented in France (CQI and collaborative projects), benchmarking has specific features that set it apart as a healthcare innovation. This is especially true for healthcare or medical-social organizations, as the principle of inter-organizational visiting is not part of their culture. Thus, this approach will need to be assessed for feasibility and acceptability before it is more widely promoted.
The DLESE Evaluation Toolkit Project

NASA Astrophysics Data System (ADS)

Buhr, S. M.; Barker, L. J.; Marlino, M.

2002-12-01

The Evaluation Toolkit and Community project is a new Digital Library for Earth System Education (DLESE) collection designed to raise awareness of project evaluation within the geoscience education community, and to enable principal investigators, teachers, and evaluators to implement project evaluation more readily. This new resource is grounded in the needs of geoscience educators, and will provide a virtual home for a geoscience education evaluation community. The goals of the project are to 1) provide a robust collection of evaluation resources useful for Earth systems educators, 2) establish a forum and community for evaluation dialogue within DLESE, and 3) disseminate the resources through the DLESE infrastructure and through professional society workshops and proceedings. Collaboration and expertise in education, geoscience and evaluation are necessary if we are to conduct the best possible geoscience education. The Toolkit allows users to engage in evaluation at whichever level best suits their needs, get more evaluation professional development if desired, and access the expertise of other segments of the community. To date, a test web site has been built and populated, initial community feedback from the DLESE and broader community is being garnered, and we have begun to heighten awareness of geoscience education evaluation within our community. The web site contains features that allow users to access professional development about evaluation, search and find evaluation resources, submit resources, find or offer evaluation services, sign up for upcoming workshops, take the user survey, and submit calendar items. The evaluation resource matrix currently contains resources that have met our initial review. The resources are currently organized by type; they will become searchable on multiple dimensions of project type, audience, objectives and evaluation resource type as efforts to develop a collection-specific search engine mature. The peer review
Benchmarking for maximum value.

PubMed

Baldwin, Ed

2009-03-01

Speaking at the most recent Healthcare Estates conference, Ed Baldwin, of international built asset consultancy EC Harris LLP, examined the role of benchmarking and market-testing--two of the key methods used to evaluate the quality and cost-effectiveness of hard and soft FM services provided under PFI healthcare schemes to ensure they are offering maximum value for money.
Reducing accounts receivable through benchmarking and best practices identification.

PubMed

Berkey, T

1998-01-01

As HIM professionals look for ways to become more competitive and achieve the best results, the importance of discovering best practices becomes more apparent. Here's how one team used a benchmarking project to provide specific best practices that reduced accounts receivable days.
ComprehensiveBench: a Benchmark for the Extensive Evaluation of Global Scheduling Algorithms

NASA Astrophysics Data System (ADS)

Pilla, Laércio L.; Bozzetti, Tiago C.; Castro, Márcio; Navaux, Philippe O. A.; Méhaut, Jean-François

2015-10-01

Parallel applications that present tasks with imbalanced loads or complex communication behavior usually do not exploit the underlying resources of parallel platforms to their full potential. In order to mitigate this issue, global scheduling algorithms are employed. As finding the optimal task distribution is an NP-Hard problem, identifying the most suitable algorithm for a specific scenario and comparing algorithms are not trivial tasks. In this context, this paper presents ComprehensiveBench, a benchmark for global scheduling algorithms that enables the variation of a vast range of parameters that affect performance. ComprehensiveBench can be used to assist in the development and evaluation of new scheduling algorithms, to help choose a specific algorithm for an arbitrary application, to emulate other applications, and to enable statistical tests. We illustrate its use in this paper with an evaluation of Charm++ periodic load balancers that stresses their characteristics.
NASA Software Engineering Benchmarking Study

NASA Technical Reports Server (NTRS)

Rarick, Heather L.; Godfrey, Sara H.; Kelly, John C.; Crumbley, Robert T.; Wifl, Joel M.

2013-01-01

To identify best practices for the improvement of software engineering on projects, NASA's Offices of Chief Engineer (OCE) and Safety and Mission Assurance (OSMA) formed a team led by Heather Rarick and Sally Godfrey to conduct this benchmarking study. The primary goals of the study are to identify best practices that: Improve the management and technical development of software intensive systems; Have a track record of successful deployment by aerospace industries, universities [including research and development (R&D) laboratories], and defense services, as well as NASA's own component Centers; and Identify candidate solutions for NASA's software issues. Beginning in the late fall of 2010, focus topics were chosen and interview questions were developed, based on the NASA top software challenges. Between February 2011 and November 2011, the Benchmark Team interviewed a total of 18 organizations, consisting of five NASA Centers, five industry organizations, four defense services organizations, and four university or university R and D laboratory organizations. A software assurance representative also participated in each of the interviews to focus on assurance and software safety best practices. Interviewees provided a wealth of information on each topic area that included: software policy, software acquisition, software assurance, testing, training, maintaining rigor in small projects, metrics, and use of the Capability Maturity Model Integration (CMMI) framework, as well as a number of special topics that came up in the discussions. NASA's software engineering practices compared favorably with the external organizations in most benchmark areas, but in every topic, there were ways in which NASA could improve its practices. Compared to defense services organizations and some of the industry organizations, one of NASA's notable weaknesses involved communication with contractors regarding its policies and requirements for acquired software. One of NASA's strengths
Benchmarking nitrogen removal suspended-carrier biofilm systems using dynamic simulation.

PubMed

Vanhooren, H; Yuan, Z; Vanrolleghem, P A

2002-01-01

We are witnessing an enormous growth in biological nitrogen removal from wastewater. It presents specific challenges beyond traditional COD (carbon) removal. A possibility for optimised process design is the use of biomass-supporting media. In this paper, attached growth processes (AGP) are evaluated using dynamic simulations. The advantages of these systems that were qualitatively described elsewhere, are validated quantitatively based on a simulation benchmark for activated sludge treatment systems. This simulation benchmark is extended with a biofilm model that allows for fast and accurate simulation of the conversion of different substrates in a biofilm. The economic feasibility of this system is evaluated using the data generated with the benchmark simulations. Capital savings due to volume reduction and reduced sludge production are weighed out against increased aeration costs. In this evaluation, effluent quality is integrated as well.
Energy saving in WWTP: Daily benchmarking under uncertainty and data availability limitations.

PubMed

Torregrossa, D; Schutz, G; Cornelissen, A; Hernández-Sancho, F; Hansen, J

2016-07-01

Efficient management of Waste Water Treatment Plants (WWTPs) can produce significant environmental and economic benefits. Energy benchmarking can be used to compare WWTPs, identify targets and use these to improve their performance. Different authors have performed benchmark analysis on monthly or yearly basis but their approaches suffer from a time lag between an event, its detection, interpretation and potential actions. The availability of on-line measurement data on many WWTPs should theoretically enable the decrease of the management response time by daily benchmarking. Unfortunately this approach is often impossible because of limited data availability. This paper proposes a methodology to perform a daily benchmark analysis under database limitations. The methodology has been applied to the Energy Online System (EOS) developed in the framework of the project "INNERS" (INNovative Energy Recovery Strategies in the urban water cycle). EOS calculates a set of Key Performance Indicators (KPIs) for the evaluation of energy and process performances. In EOS, the energy KPIs take in consideration the pollutant load in order to enable the comparison between different plants. For example, EOS does not analyse the energy consumption but the energy consumption on pollutant load. This approach enables the comparison of performances for plants with different loads or for a single plant under different load conditions. The energy consumption is measured by on-line sensors, while the pollutant load is measured in the laboratory approximately every 14 days. Consequently, the unavailability of the water quality parameters is the limiting factor in calculating energy KPIs. In this paper, in order to overcome this limitation, the authors have developed a methodology to estimate the required parameters and manage the uncertainty in the estimation. By coupling the parameter estimation with an interval based benchmark approach, the authors propose an effective, fast and reproducible
Benchmarking high performance computing architectures with CMS’ skeleton framework

DOE PAGES

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-11-23

Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta,more » Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.« less
Benchmarking high performance computing architectures with CMS’ skeleton framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta,more » Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.« less
Refining multi-model projections of temperature extremes by evaluation against land-atmosphere coupling diagnostics

NASA Astrophysics Data System (ADS)

Sippel, Sebastian; Zscheischler, Jakob; Mahecha, Miguel D.; Orth, Rene; Reichstein, Markus; Vogel, Martha; Seneviratne, Sonia I.

2017-05-01

The Earth's land surface and the atmosphere are strongly interlinked through the exchange of energy and matter. This coupled behaviour causes various land-atmosphere feedbacks, and an insufficient understanding of these feedbacks contributes to uncertain global climate model projections. For example, a crucial role of the land surface in exacerbating summer heat waves in midlatitude regions has been identified empirically for high-impact heat waves, but individual climate models differ widely in their respective representation of land-atmosphere coupling. Here, we compile an ensemble of 54 combinations of observations-based temperature (T) and evapotranspiration (ET) benchmarking datasets and investigate coincidences of T anomalies with ET anomalies as a proxy for land-atmosphere interactions during periods of anomalously warm temperatures. First, we demonstrate that a large fraction of state-of-the-art climate models from the Coupled Model Intercomparison Project (CMIP5) archive produces systematically too frequent coincidences of high T anomalies with negative ET anomalies in midlatitude regions during the warm season and in several tropical regions year-round. These coincidences (high T, low ET) are closely related to the representation of temperature variability and extremes across the multi-model ensemble. Second, we derive a land-coupling constraint based on the spread of the T-ET datasets and consequently retain only a subset of CMIP5 models that produce a land-coupling behaviour that is compatible with these benchmark estimates. The constrained multi-model simulations exhibit more realistic temperature extremes of reduced magnitude in present climate in regions where models show substantial spread in T-ET coupling, i.e. biases in the model ensemble are consistently reduced. Also the multi-model simulations for the coming decades display decreased absolute temperature extremes in the constrained ensemble. On the other hand, the differences between projected
An evidence-based approach to benchmarking the fairness of health-sector reform in developing countries.

PubMed Central

Daniels, Norman; Flores, Walter; Pannarunothai, Supasit; Ndumbe, Peter N.; Bryant, John H.; Ngulube, T. J.; Wang, Yuankun

2005-01-01

The Benchmarks of Fairness instrument is an evidence-based policy tool developed in generic form in 2000 for evaluating the effects of health-system reforms on equity, efficiency and accountability. By integrating measures of these effects on the central goal of fairness, the approach fills a gap that has hampered reform efforts for more than two decades. Over the past three years, projects in developing countries on three continents have adapted the generic version of these benchmarks for use at both national and subnational levels. Interdisciplinary teams of managers, providers, academics and advocates agree on the relevant criteria for assessing components of fairness and, depending on which aspects of reform they wish to evaluate, select appropriate indicators that rely on accessible information; they also agree on scoring rules for evaluating the diverse changes in the indicators. In contrast to a comprehensive index that aggregates all measured changes into a single evaluation or rank, the pattern of changes revealed by the benchmarks is used to inform policy deliberation aboutwhich aspects of the reforms have been successfully implemented, and it also allows for improvements to be made in the reforms. This approach permits useful evidence about reform to be gathered in settings where existing information is underused and where there is a weak information infrastructure. Brief descriptions of early results from Cameroon, Ecuador, Guatemala, Thailand and Zambia demonstrate that the method can produce results that are useful for policy and reveal the variety of purposes to which the approach can be put. Collaboration across sites can yield a catalogue of indicators that will facilitate further work. PMID:16175828
The philosophy of benchmark testing a standards-based picture archiving and communications system.

PubMed

Richardson, N E; Thomas, J A; Lyche, D K; Romlein, J; Norton, G S; Dolecek, Q E

1999-05-01

The Department of Defense issued its requirements for a Digital Imaging Network-Picture Archiving and Communications System (DIN-PACS) in a Request for Proposals (RFP) to industry in January 1997, with subsequent contracts being awarded in November 1997 to the Agfa Division of Bayer and IBM Global Government Industry. The Government's technical evaluation process consisted of evaluating a written technical proposal as well as conducting a benchmark test of each proposed system at the vendor's test facility. The purpose of benchmark testing was to evaluate the performance of the fully integrated system in a simulated operational environment. The benchmark test procedures and test equipment were developed through a joint effort between the Government, academic institutions, and private consultants. Herein the authors discuss the resources required and the methods used to benchmark test a standards-based PACS.
NDEC: A NEA platform for nuclear data testing, verification and benchmarking

NASA Astrophysics Data System (ADS)

Díez, C. J.; Michel-Sendis, F.; Cabellos, O.; Bossant, M.; Soppera, N.

2017-09-01

The selection, testing, verification and benchmarking of evaluated nuclear data consists, in practice, in putting an evaluated file through a number of checking steps where different computational codes verify that the file and the data it contains complies with different requirements. These requirements range from format compliance to good performance in application cases, while at the same time physical constraints and the agreement with experimental data are verified. At NEA, the NDEC (Nuclear Data Evaluation Cycle) platform aims at providing, in a user friendly interface, a thorough diagnose of the quality of a submitted evaluated nuclear data file. Such diagnose is based on the results of different computational codes and routines which carry out the mentioned verifications, tests and checks. NDEC also searches synergies with other existing NEA tools and databases, such as JANIS, DICE or NDaST, including them into its working scheme. Hence, this paper presents NDEC, its current development status and its usage in the JEFF nuclear data project.
Using Benchmarking To Strengthen the Assessment of Persistence.

PubMed

McLachlan, Michael S; Zou, Hongyan; Gouin, Todd

2017-01-03

Chemical persistence is a key property for assessing chemical risk and chemical hazard. Current methods for evaluating persistence are based on laboratory tests. The relationship between the laboratory based estimates and persistence in the environment is often unclear, in which case the current methods for evaluating persistence can be questioned. Chemical benchmarking opens new possibilities to measure persistence in the field. In this paper we explore how the benchmarking approach can be applied in both the laboratory and the field to deepen our understanding of chemical persistence in the environment and create a firmer scientific basis for laboratory to field extrapolation of persistence test results.
Multirate Flutter Suppression System Design for the Benchmark Active Controls Technology Wing. Part 1; Theory and Design Procedure

NASA Technical Reports Server (NTRS)

Mason, Gregory S.; Berg, Martin C.; Mukhopadhyay, Vivek

2002-01-01

To study the effectiveness of various control system design methodologies, the NASA Langley Research Center initiated the Benchmark Active Controls Project. In this project, the various methodologies were applied to design a flutter suppression system for the Benchmark Active Controls Technology (BACT) Wing. This report describes a project at the University of Washington to design a multirate suppression system for the BACT wing. The objective of the project was two fold. First, to develop a methodology for designing robust multirate compensators, and second, to demonstrate the methodology by applying it to the design of a multirate flutter suppression system for the BACT wing.
Benchmarking Diagnostic Algorithms on an Electrical Power System Testbed

NASA Technical Reports Server (NTRS)

Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Wright, Stephanie

2009-01-01

Diagnostic algorithms (DAs) are key to enabling automated health management. These algorithms are designed to detect and isolate anomalies of either a component or the whole system based on observations received from sensors. In recent years a wide range of algorithms, both model-based and data-driven, have been developed to increase autonomy and improve system reliability and affordability. However, the lack of support to perform systematic benchmarking of these algorithms continues to create barriers for effective development and deployment of diagnostic technologies. In this paper, we present our efforts to benchmark a set of DAs on a common platform using a framework that was developed to evaluate and compare various performance metrics for diagnostic technologies. The diagnosed system is an electrical power system, namely the Advanced Diagnostics and Prognostics Testbed (ADAPT) developed and located at the NASA Ames Research Center. The paper presents the fundamentals of the benchmarking framework, the ADAPT system, description of faults and data sets, the metrics used for evaluation, and an in-depth analysis of benchmarking results obtained from testing ten diagnostic algorithms on the ADAPT electrical power system testbed.
An automated protocol for performance benchmarking a widefield fluorescence microscope.

PubMed

Halter, Michael; Bier, Elianna; DeRose, Paul C; Cooksey, Gregory A; Choquette, Steven J; Plant, Anne L; Elliott, John T

2014-11-01

Widefield fluorescence microscopy is a highly used tool for visually assessing biological samples and for quantifying cell responses. Despite its widespread use in high content analysis and other imaging applications, few published methods exist for evaluating and benchmarking the analytical performance of a microscope. Easy-to-use benchmarking methods would facilitate the use of fluorescence imaging as a quantitative analytical tool in research applications, and would aid the determination of instrumental method validation for commercial product development applications. We describe and evaluate an automated method to characterize a fluorescence imaging system's performance by benchmarking the detection threshold, saturation, and linear dynamic range to a reference material. The benchmarking procedure is demonstrated using two different materials as the reference material, uranyl-ion-doped glass and Schott 475 GG filter glass. Both are suitable candidate reference materials that are homogeneously fluorescent and highly photostable, and the Schott 475 GG filter glass is currently commercially available. In addition to benchmarking the analytical performance, we also demonstrate that the reference materials provide for accurate day to day intensity calibration. Published 2014 Wiley Periodicals Inc. Published 2014 Wiley Periodicals Inc. This article is a US government work and, as such, is in the public domain in the United States of America.

Multirate Flutter Suppression System Design for the Benchmark Active Controls Technology Wing. Part 2; Methodology Application Software Toolbox

NASA Technical Reports Server (NTRS)

Mason, Gregory S.; Berg, Martin C.; Mukhopadhyay, Vivek

2002-01-01

To study the effectiveness of various control system design methodologies, the NASA Langley Research Center initiated the Benchmark Active Controls Project. In this project, the various methodologies were applied to design a flutter suppression system for the Benchmark Active Controls Technology (BACT) Wing. This report describes the user's manual and software toolbox developed at the University of Washington to design a multirate flutter suppression control law for the BACT wing.
BENCHMARK DOSE TECHNICAL GUIDANCE DOCUMENT ...

EPA Pesticide Factsheets

The U.S. EPA conducts risk assessments for an array of health effects that may result from exposure to environmental agents, and that require an analysis of the relationship between exposure and health-related outcomes. The dose-response assessment is essentially a two-step process, the first being the definition of a point of departure (POD), and the second extrapolation from the POD to low environmentally-relevant exposure levels. The benchmark dose (BMD) approach provides a more quantitative alternative to the first step in the dose-response assessment than the current NOAEL/LOAEL process for noncancer health effects, and is similar to that for determining the POD proposed for cancer endpoints. As the Agency moves toward harmonization of approaches for human health risk assessment, the dichotomy between cancer and noncancer health effects is being replaced by consideration of mode of action and whether the effects of concern are likely to be linear or nonlinear at low doses. Thus, the purpose of this project is to provide guidance for the Agency and the outside community on the application of the BMD approach in determining the POD for all types of health effects data, whether a linear or nonlinear low dose extrapolation is used. A guidance document is being developed under the auspices of EPA's Risk Assessment Forum. The purpose of this project is to provide guidance for the Agency and the outside community on the application of the benchmark dose (BMD) appr
Sensitivity Analysis of OECD Benchmark Tests in BISON

DOE Office of Scientific and Technical Information (OSTI.GOV)

Swiler, Laura Painton; Gamble, Kyle; Schmidt, Rodney C.

2015-09-01

This report summarizes a NEAMS (Nuclear Energy Advanced Modeling and Simulation) project focused on sensitivity analysis of a fuels performance benchmark problem. The benchmark problem was defined by the Uncertainty Analysis in Modeling working group of the Nuclear Science Committee, part of the Nuclear Energy Agency of the Organization for Economic Cooperation and Development (OECD ). The benchmark problem involv ed steady - state behavior of a fuel pin in a Pressurized Water Reactor (PWR). The problem was created in the BISON Fuels Performance code. Dakota was used to generate and analyze 300 samples of 17 input parameters defining coremore » boundary conditions, manuf acturing tolerances , and fuel properties. There were 24 responses of interest, including fuel centerline temperatures at a variety of locations and burnup levels, fission gas released, axial elongation of the fuel pin, etc. Pearson and Spearman correlatio n coefficients and Sobol' variance - based indices were used to perform the sensitivity analysis. This report summarizes the process and presents results from this study.« less
Benchmark problems for numerical implementations of phase field models

DOE PAGES

Jokisaari, A. M.; Voorhees, P. W.; Guyer, J. E.; ...

2016-10-01

Here, we present the first set of benchmark problems for phase field models that are being developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST). While many scientific research areas use a limited set of well-established software, the growing phase field community continues to develop a wide variety of codes and lacks benchmark problems to consistently evaluate the numerical performance of new implementations. Phase field modeling has become significantly more popular as computational power has increased and is now becoming mainstream, driving the need for benchmark problems to validate and verifymore » new implementations. We follow the example set by the micromagnetics community to develop an evolving set of benchmark problems that test the usability, computational resources, numerical capabilities and physical scope of phase field simulation codes. In this paper, we propose two benchmark problems that cover the physics of solute diffusion and growth and coarsening of a second phase via a simple spinodal decomposition model and a more complex Ostwald ripening model. We demonstrate the utility of benchmark problems by comparing the results of simulations performed with two different adaptive time stepping techniques, and we discuss the needs of future benchmark problems. The development of benchmark problems will enable the results of quantitative phase field models to be confidently incorporated into integrated computational materials science and engineering (ICME), an important goal of the Materials Genome Initiative.« less
NASA PC software evaluation project

NASA Technical Reports Server (NTRS)

Dominick, Wayne D. (Editor); Kuan, Julie C.

1986-01-01

The USL NASA PC software evaluation project is intended to provide a structured framework for facilitating the development of quality NASA PC software products. The project will assist NASA PC development staff to understand the characteristics and functions of NASA PC software products. Based on the results of the project teams' evaluations and recommendations, users can judge the reliability, usability, acceptability, maintainability and customizability of all the PC software products. The objective here is to provide initial, high-level specifications and guidelines for NASA PC software evaluation. The primary tasks to be addressed in this project are as follows: to gain a strong understanding of what software evaluation entails and how to organize a structured software evaluation process; to define a structured methodology for conducting the software evaluation process; to develop a set of PC software evaluation criteria and evaluation rating scales; and to conduct PC software evaluations in accordance with the identified methodology. Communication Packages, Network System Software, Graphics Support Software, Environment Management Software, General Utilities. This report represents one of the 72 attachment reports to the University of Southwestern Louisiana's Final Report on NASA Grant NGT-19-010-900. Accordingly, appropriate care should be taken in using this report out of context of the full Final Report.
Aircraft Engine Gas Path Diagnostic Methods: Public Benchmarking Results

NASA Technical Reports Server (NTRS)

Simon, Donald L.; Borguet, Sebastien; Leonard, Olivier; Zhang, Xiaodong (Frank)

2013-01-01

Recent technology reviews have identified the need for objective assessments of aircraft engine health management (EHM) technologies. To help address this issue, a gas path diagnostic benchmark problem has been created and made publicly available. This software tool, referred to as the Propulsion Diagnostic Method Evaluation Strategy (ProDiMES), has been constructed based on feedback provided by the aircraft EHM community. It provides a standard benchmark problem enabling users to develop, evaluate and compare diagnostic methods. This paper will present an overview of ProDiMES along with a description of four gas path diagnostic methods developed and applied to the problem. These methods, which include analytical and empirical diagnostic techniques, will be described and associated blind-test-case metric results will be presented and compared. Lessons learned along with recommendations for improving the public benchmarking processes will also be presented and discussed.
Benchmarking novel approaches for modelling species range dynamics

PubMed Central

Zurell, Damaris; Thuiller, Wilfried; Pagel, Jörn; Cabral, Juliano S; Münkemüller, Tamara; Gravel, Dominique; Dullinger, Stefan; Normand, Signe; Schiffers, Katja H.; Moore, Kara A.; Zimmermann, Niklaus E.

2016-01-01

Increasing biodiversity loss due to climate change is one of the most vital challenges of the 21st century. To anticipate and mitigate biodiversity loss, models are needed that reliably project species’ range dynamics and extinction risks. Recently, several new approaches to model range dynamics have been developed to supplement correlative species distribution models (SDMs), but applications clearly lag behind model development. Indeed, no comparative analysis has been performed to evaluate their performance. Here, we build on process-based, simulated data for benchmarking five range (dynamic) models of varying complexity including classical SDMs, SDMs coupled with simple dispersal or more complex population dynamic models (SDM hybrids), and a hierarchical Bayesian process-based dynamic range model (DRM). We specifically test the effects of demographic and community processes on model predictive performance. Under current climate, DRMs performed best, although only marginally. Under climate change, predictive performance varied considerably, with no clear winners. Yet, all range dynamic models improved predictions under climate change substantially compared to purely correlative SDMs, and the population dynamic models also predicted reasonable extinction risks for most scenarios. When benchmarking data were simulated with more complex demographic and community processes, simple SDM hybrids including only dispersal often proved most reliable. Finally, we found that structural decisions during model building can have great impact on model accuracy, but prior system knowledge on important processes can reduce these uncertainties considerably. Our results reassure the clear merit in using dynamic approaches for modelling species’ response to climate change but also emphasise several needs for further model and data improvement. We propose and discuss perspectives for improving range projections through combination of multiple models and for making these approaches
Development and application of freshwater sediment-toxicity benchmarks for currently used pesticides

USGS Publications Warehouse

Nowell, Lisa H.; Norman, Julia E.; Ingersoll, Christopher G.; Moran, Patrick W.

2016-01-01

Sediment-toxicity benchmarks are needed to interpret the biological significance of currently used pesticides detected in whole sediments. Two types of freshwater sediment benchmarks for pesticides were developed using spiked-sediment bioassay (SSB) data from the literature. These benchmarks can be used to interpret sediment-toxicity data or to assess the potential toxicity of pesticides in whole sediment. The Likely Effect Benchmark (LEB) defines a pesticide concentration in whole sediment above which there is a high probability of adverse effects on benthic invertebrates, and the Threshold Effect Benchmark (TEB) defines a concentration below which adverse effects are unlikely. For compounds without available SSBs, benchmarks were estimated using equilibrium partitioning (EqP). When a sediment sample contains a pesticide mixture, benchmark quotients can be summed for all detected pesticides to produce an indicator of potential toxicity for that mixture. Benchmarks were developed for 48 pesticide compounds using SSB data and 81 compounds using the EqP approach. In an example application, data for pesticides measured in sediment from 197 streams across the United States were evaluated using these benchmarks, and compared to measured toxicity from whole-sediment toxicity tests conducted with the amphipod Hyalella azteca (28-d exposures) and the midge Chironomus dilutus (10-d exposures). Amphipod survival, weight, and biomass were significantly and inversely related to summed benchmark quotients, whereas midge survival, weight, and biomass showed no relationship to benchmarks. Samples with LEB exceedances were rare (n = 3), but all were toxic to amphipods (i.e., significantly different from control). Significant toxicity to amphipods was observed for 72% of samples exceeding one or more TEBs, compared to 18% of samples below all TEBs. Factors affecting toxicity below TEBs may include the presence of contaminants other than pesticides, physical
Benchmark Simulation Model No 2: finalisation of plant layout and default control strategy.

PubMed

Nopens, I; Benedetti, L; Jeppsson, U; Pons, M-N; Alex, J; Copp, J B; Gernaey, K V; Rosen, C; Steyer, J-P; Vanrolleghem, P A

2010-01-01

The COST/IWA Benchmark Simulation Model No 1 (BSM1) has been available for almost a decade. Its primary purpose has been to create a platform for control strategy benchmarking of activated sludge processes. The fact that the research work related to the benchmark simulation models has resulted in more than 300 publications worldwide demonstrates the interest in and need of such tools within the research community. Recent efforts within the IWA Task Group on "Benchmarking of control strategies for WWTPs" have focused on an extension of the benchmark simulation model. This extension aims at facilitating control strategy development and performance evaluation at a plant-wide level and, consequently, includes both pretreatment of wastewater as well as the processes describing sludge treatment. The motivation for the extension is the increasing interest and need to operate and control wastewater treatment systems not only at an individual process level but also on a plant-wide basis. To facilitate the changes, the evaluation period has been extended to one year. A prolonged evaluation period allows for long-term control strategies to be assessed and enables the use of control handles that cannot be evaluated in a realistic fashion in the one week BSM1 evaluation period. In this paper, the finalised plant layout is summarised and, as was done for BSM1, a default control strategy is proposed. A demonstration of how BSM2 can be used to evaluate control strategies is also given.
Thermo-hydro-mechanical-chemical processes in fractured-porous media: Benchmarks and examples

NASA Astrophysics Data System (ADS)

Kolditz, O.; Shao, H.; Görke, U.; Kalbacher, T.; Bauer, S.; McDermott, C. I.; Wang, W.

2012-12-01

The book comprises an assembly of benchmarks and examples for porous media mechanics collected over the last twenty years. Analysis of thermo-hydro-mechanical-chemical (THMC) processes is essential to many applications in environmental engineering, such as geological waste deposition, geothermal energy utilisation, carbon capture and storage, water resources management, hydrology, even climate change. In order to assess the feasibility as well as the safety of geotechnical applications, process-based modelling is the only tool to put numbers, i.e. to quantify future scenarios. This charges a huge responsibility concerning the reliability of computational tools. Benchmarking is an appropriate methodology to verify the quality of modelling tools based on best practices. Moreover, benchmarking and code comparison foster community efforts. The benchmark book is part of the OpenGeoSys initiative - an open source project to share knowledge and experience in environmental analysis and scientific computation.
Benchmark matrix and guide: Part III.

PubMed

1992-01-01

The final article in the "Benchmark Matrix and Guide" series developed by Headquarters Air Force Logistics Command completes the discussion of the last three categories that are essential ingredients of a successful total quality management (TQM) program. Detailed behavioral objectives are listed in the areas of recognition, process improvement, and customer focus. These vertical categories are meant to be applied to the levels of the matrix that define the progressive stages of the TQM: business as usual, initiation, implementation, expansion, and integration. By charting the horizontal progress level and the vertical TQM category, the quality management professional can evaluate the current state of TQM in any given organization. As each category is completed, new goals can be defined in order to advance to a higher level. The benchmarking process is integral to quality improvement efforts because it focuses on the highest possible standards to evaluate quality programs.
The KMAT: Benchmarking Knowledge Management.

ERIC Educational Resources Information Center

de Jager, Martha

Provides an overview of knowledge management and benchmarking, including the benefits and methods of benchmarking (e.g., competitive, cooperative, collaborative, and internal benchmarking). Arthur Andersen's KMAT (Knowledge Management Assessment Tool) is described. The KMAT is a collaborative benchmarking tool, designed to help organizations make…
Benchmarking. Issues in the Design and Implementation of a Benchmarking System for Employment and Training Programs for Young People.

ERIC Educational Resources Information Center

Coughlin, David C.; Bielen, Rhonda P.

This paper has been prepared to assist the United States Department of Labor to explore new approaches to evaluating and measuring the performance of employment and training activities for youth. As one of several tools for evaluating success of local youth training programs, "benchmarking" provides a system for measuring the development…
Benchmarks for target tracking

NASA Astrophysics Data System (ADS)

Dunham, Darin T.; West, Philip D.

2011-09-01

The term benchmark originates from the chiseled horizontal marks that surveyors made, into which an angle-iron could be placed to bracket ("bench") a leveling rod, thus ensuring that the leveling rod can be repositioned in exactly the same place in the future. A benchmark in computer terms is the result of running a computer program, or a set of programs, in order to assess the relative performance of an object by running a number of standard tests and trials against it. This paper will discuss the history of simulation benchmarks that are being used by multiple branches of the military and agencies of the US government. These benchmarks range from missile defense applications to chemical biological situations. Typically, a benchmark is used with Monte Carlo runs in order to tease out how algorithms deal with variability and the range of possible inputs. We will also describe problems that can be solved by a benchmark.
Benchmarking Ensemble Streamflow Prediction Skill in the UK

NASA Astrophysics Data System (ADS)

Harrigan, Shaun; Smith, Katie; Parry, Simon; Tanguy, Maliko; Prudhomme, Christel

2017-04-01

Skilful hydrological forecasts at weekly to seasonal lead times would be extremely beneficial for decision-making in operational water management, especially during drought conditions. Hydro-meteorological ensemble forecasting systems are an attractive approach as they use two sources of streamflow predictability: (i) initial hydrologic conditions (IHCs), where soil moisture, groundwater and snow storage states can provide an estimate of future streamflow situations, and (ii) atmospheric predictability, where skilful forecasts of weather and climate variables can be used to force hydrological models. In the UK, prediction of rainfall at long lead times and for summer months in particular is notoriously difficult given the large degree of natural climate variability in ocean influenced mid-latitude regions, but recent research has uncovered exciting prospects for improved rainfall skill at seasonal lead times due to improved prediction of the North Atlantic Oscillation. However, before we fully understand what this improved atmospheric predictability might mean in terms of improved hydrological forecasts, we must first evaluate how much skill can be gained from IHCs alone. Ensemble Streamflow Prediction (ESP) is a well-established method for generating an ensemble of streamflow forecasts in the absence of skilful future meteorological predictions. The aim of this study is therefore to benchmark when (lead time/forecast initialisation month) and where (spatial pattern/catchment characteristics) ESP is skilful across a diverse set of catchments in the UK. Forecast skill was evaluated seamlessly from lead times of 1-day to 12-months and forecasts were initialised at the first of each month over the 1965-2015 hindcast period. This ESP output also provides a robust benchmark against which to assess how much improvement in skill can be achieved when meteorological forecasts are incorporated (next steps). To provide a 'tough to beat' benchmark, several variants of ESP with
Assessment of the monitoring and evaluation system for integrated community case management (ICCM) in Ethiopia: a comparison against global benchmark indicators.

PubMed

Mamo, Dereje; Hazel, Elizabeth; Lemma, Israel; Guenther, Tanya; Bekele, Abeba; Demeke, Berhanu

2014-10-01

Program managers require feasible, timely, reliable, and valid measures of iCCM implementation to identify problems and assess progress. The global iCCM Task Force developed benchmark indicators to guide implementers to develop or improve monitoring and evaluation (M&E) systems. To assesses Ethiopia's iCCM M&E system by determining the availability and feasibility of the iCCM benchmark indicators. We conducted a desk review of iCCM policy documents, monitoring tools, survey reports, and other rele- vant documents; and key informant interviews with government and implementing partners involved in iCCM scale-up and M&E. Currently, Ethiopia collects data to inform most (70% [33/47]) iCCM benchmark indicators, and modest extra effort could boost this to 83% (39/47). Eight (17%) are not available given the current system. Most benchmark indicators that track coordination and policy, human resources, service delivery and referral, supervision, and quality assurance are available through the routine monitoring systems or periodic surveys. Indicators for supply chain management are less available due to limited consumption data and a weak link with treatment data. Little information is available on iCCM costs. Benchmark indicators can detail the status of iCCM implementation; however, some indicators may not fit country priorities, and others may be difficult to collect. The government of Ethiopia and partners should review and prioritize the benchmark indicators to determine which should be included in the routine M&E system, especially since iCCMdata are being reviewed for addition to the HMIS. Moreover, the Health Extension Worker's reporting burden can be minimized by an integrated reporting approach.
Automated benchmarking of peptide-MHC class I binding predictions.

PubMed

Trolle, Thomas; Metushi, Imir G; Greenbaum, Jason A; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

2015-07-01

Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. mniel@cbs.dtu.dk or bpeters@liai.org Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Automated benchmarking of peptide-MHC class I binding predictions

PubMed Central

Trolle, Thomas; Metushi, Imir G.; Greenbaum, Jason A.; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

2015-01-01

Motivation: Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. Results: The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Availability and implementation: Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. Contact: mniel@cbs.dtu.dk or bpeters@liai.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25717196
Evaluation of Project Symbiosis: An Interdisciplinary Science Education Project.

ERIC Educational Resources Information Center

Altschuld, James W.

1993-01-01

The goal of this report is to provide a summary of the evaluation of Project Symbiosis which focused on enhancing the teaching of science principles in high school agriculture courses. The project initially involved 15 teams of science and agriculture teachers and was characterized by an extensive evaluation component consisting of six formal…
A proposed benchmark problem for cargo nuclear threat monitoring

NASA Astrophysics Data System (ADS)

Wesley Holmes, Thomas; Calderon, Adan; Peeples, Cody R.; Gardner, Robin P.

2011-10-01

There is currently a great deal of technical and political effort focused on reducing the risk of potential attacks on the United States involving radiological dispersal devices or nuclear weapons. This paper proposes a benchmark problem for gamma-ray and X-ray cargo monitoring with results calculated using MCNP5, v1.51. The primary goal is to provide a benchmark problem that will allow researchers in this area to evaluate Monte Carlo models for both speed and accuracy in both forward and inverse calculational codes and approaches for nuclear security applications. A previous benchmark problem was developed by one of the authors (RPG) for two similar oil well logging problems (Gardner and Verghese, 1991, [1]). One of those benchmarks has recently been used by at least two researchers in the nuclear threat area to evaluate the speed and accuracy of Monte Carlo codes combined with variance reduction techniques. This apparent need has prompted us to design this benchmark problem specifically for the nuclear threat researcher. This benchmark consists of conceptual design and preliminary calculational results using gamma-ray interactions on a system containing three thicknesses of three different shielding materials. A point source is placed inside the three materials lead, aluminum, and plywood. The first two materials are in right circular cylindrical form while the third is a cube. The entire system rests on a sufficiently thick lead base so as to reduce undesired scattering events. The configuration was arranged in such a manner that as gamma-ray moves from the source outward it first passes through the lead circular cylinder, then the aluminum circular cylinder, and finally the wooden cube before reaching the detector. A 2 in.×4 in.×16 in. box style NaI (Tl) detector was placed 1 m from the point source located in the center with the 4 in.×16 in. side facing the system. The two sources used in the benchmark are 137Cs and 235U.

Benchmarking and the laboratory

PubMed Central

Galloway, M; Nadin, L

2001-01-01

This article describes how benchmarking can be used to assess laboratory performance. Two benchmarking schemes are reviewed, the Clinical Benchmarking Company's Pathology Report and the College of American Pathologists' Q-Probes scheme. The Clinical Benchmarking Company's Pathology Report is undertaken by staff based in the clinical management unit, Keele University with appropriate input from the professional organisations within pathology. Five annual reports have now been completed. Each report is a detailed analysis of 10 areas of laboratory performance. In this review, particular attention is focused on the areas of quality, productivity, variation in clinical practice, skill mix, and working hours. The Q-Probes scheme is part of the College of American Pathologists programme in studies of quality assurance. The Q-Probes scheme and its applicability to pathology in the UK is illustrated by reviewing two recent Q-Probe studies: routine outpatient test turnaround time and outpatient test order accuracy. The Q-Probes scheme is somewhat limited by the small number of UK laboratories that have participated. In conclusion, as a result of the government's policy in the UK, benchmarking is here to stay. Benchmarking schemes described in this article are one way in which pathologists can demonstrate that they are providing a cost effective and high quality service. Key Words: benchmarking • pathology PMID:11477112
Benchmarking for Higher Education.

ERIC Educational Resources Information Center

Jackson, Norman, Ed.; Lund, Helen, Ed.

The chapters in this collection explore the concept of benchmarking as it is being used and developed in higher education (HE). Case studies and reviews show how universities in the United Kingdom are using benchmarking to aid in self-regulation and self-improvement. The chapters are: (1) "Introduction to Benchmarking" (Norman Jackson…
Overview of TPC Benchmark E: The Next Generation of OLTP Benchmarks

NASA Astrophysics Data System (ADS)

Hogan, Trish

Set to replace the aging TPC-C, the TPC Benchmark E is the next generation OLTP benchmark, which more accurately models client database usage. TPC-E addresses the shortcomings of TPC-C. It has a much more complex workload, requires the use of RAID-protected storage, generates much less I/O, and is much cheaper and easier to set up, run, and audit. After a period of overlap, it is expected that TPC-E will become the de facto OLTP benchmark.
Benchmark Evaluation of Fuel Effect and Material Worth Measurements for a Beryllium-Reflected Space Reactor Mockup

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marshall, Margaret A.; Bess, John D.

2015-02-01

The critical configuration of the small, compact critical assembly (SCCA) experiments performed at the Oak Ridge Critical Experiments Facility (ORCEF) in 1962-1965 have been evaluated as acceptable benchmark experiments for inclusion in the International Handbook of Evaluated Criticality Safety Benchmark Experiments. The initial intent of these experiments was to support the design of the Medium Power Reactor Experiment (MPRE) program, whose purpose was to study “power plants for the production of electrical power in space vehicles.” The third configuration in this series of experiments was a beryllium-reflected assembly of stainless-steel-clad, highly enriched uranium (HEU)-O 2 fuel mockup of a potassium-cooledmore » space power reactor. Reactivity measurements cadmium ratio spectral measurements and fission rate measurements were measured through the core and top reflector. Fuel effect worth measurements and neutron moderating and absorbing material worths were also measured in the assembly fuel region. The cadmium ratios, fission rate, and worth measurements were evaluated for inclusion in the International Handbook of Evaluated Criticality Safety Benchmark Experiments. The fuel tube effect and neutron moderating and absorbing material worth measurements are the focus of this paper. Additionally, a measurement of the worth of potassium filling the core region was performed but has not yet been evaluated Pellets of 93.15 wt.% enriched uranium dioxide (UO 2) were stacked in 30.48 cm tall stainless steel fuel tubes (0.3 cm tall end caps). Each fuel tube had 26 pellets with a total mass of 295.8 g UO 2 per tube. 253 tubes were arranged in 1.506-cm triangular lattice. An additional 7-tube cluster critical configuration was also measured but not used for any physics measurements. The core was surrounded on all side by a beryllium reflector. The fuel effect worths were measured by removing fuel tubes at various radius. An accident scenario was also simulated by
Rethinking the reference collection: exploring benchmarks and e-book availability.

PubMed

Husted, Jeffrey T; Czechowski, Leslie J

2012-01-01

Librarians in the Health Sciences Library System at the University of Pittsburgh explored the possibility of developing an electronic reference collection that would replace the print reference collection, thus providing access to these valuable materials to a widely dispersed user population. The librarians evaluated the print reference collection and standard collection development lists as potential benchmarks for the electronic collection, and they determined which books were available in electronic format. They decided that the low availability of electronic versions of titles in each benchmark group rendered the creation of an electronic reference collection using either benchmark impractical.
An Alignment of the Canadian Language Benchmarks to the BC ESL Articulation Levels. Final Report - January 2007

ERIC Educational Resources Information Center

Barbour, Ross; Ostler, Catherine; Templeman, Elizabeth; West, Elizabeth

2007-01-01

The British Columbia (BC) English as a Second Language (ESL) Articulation Committee's Canadian Language Benchmarks project was precipitated by ESL instructors' desire to address transfer difficulties of ESL students within the BC transfer system and to respond to the recognition that the Canadian Language Benchmarks, a descriptive scale of ESL…
U.S. Solar Photovoltaic System Cost Benchmark: Q1 2017

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fu, Ran; Feldman, David J.; Margolis, Robert M.

NREL has been modeling U.S. photovoltaic (PV) system costs since 2009. This year, our report benchmarks costs of U.S. solar PV for residential, commercial, and utility-scale systems built in the first quarter of 2017 (Q1 2017). Costs are represented from the perspective of the developer/installer, thus all hardware costs represent the price at which components are purchased by the developer/installer, not accounting for preexisting supply agreements or other contracts. Importantly, the benchmark this year (2017) also represents the sales price paid to the installer; therefore, it includes profit in the cost of the hardware, along with the profit the installer/developermore » receives, as a separate cost category. However, it does not include any additional net profit, such as a developer fee or price gross-up, which are common in the marketplace. We adopt this approach owing to the wide variation in developer profits in all three sectors, where project pricing is highly dependent on region and project specifics such as local retail electricity rate structures, local rebate and incentive structures, competitive environment, and overall project or deal structures.« less
INL Results for Phases I and III of the OECD/NEA MHTGR-350 Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom; Javier Ortensi; Sonat Sen

2013-09-01

The Idaho National Laboratory (INL) Very High Temperature Reactor (VHTR) Technology Development Office (TDO) Methods Core Simulation group led the construction of the Organization for Economic Cooperation and Development (OECD) Modular High Temperature Reactor (MHTGR) 350 MW benchmark for comparing and evaluating prismatic VHTR analysis codes. The benchmark is sponsored by the OECD's Nuclear Energy Agency (NEA), and the project will yield a set of reference steady-state, transient, and lattice depletion problems that can be used by the Department of Energy (DOE), the Nuclear Regulatory Commission (NRC), and vendors to assess their code suits. The Methods group is responsible formore » defining the benchmark specifications, leading the data collection and comparison activities, and chairing the annual technical workshops. This report summarizes the latest INL results for Phase I (steady state) and Phase III (lattice depletion) of the benchmark. The INSTANT, Pronghorn and RattleSnake codes were used for the standalone core neutronics modeling of Exercise 1, and the results obtained from these codes are compared in Section 4. Exercise 2 of Phase I requires the standalone steady-state thermal fluids modeling of the MHTGR-350 design, and the results for the systems code RELAP5-3D are discussed in Section 5. The coupled neutronics and thermal fluids steady-state solution for Exercise 3 are reported in Section 6, utilizing the newly developed Parallel and Highly Innovative Simulation for INL Code System (PHISICS)/RELAP5-3D code suit. Finally, the lattice depletion models and results obtained for Phase III are compared in Section 7. The MHTGR-350 benchmark proved to be a challenging simulation set of problems to model accurately, and even with the simplifications introduced in the benchmark specification this activity is an important step in the code-to-code verification of modern prismatic VHTR codes. A final OECD/NEA comparison report will compare the Phase I and III
Model benchmarking and reference signals for angled-beam shear wave ultrasonic nondestructive evaluation (NDE) inspections

NASA Astrophysics Data System (ADS)

Aldrin, John C.; Hopkins, Deborah; Datuin, Marvin; Warchol, Mark; Warchol, Lyudmila; Forsyth, David S.; Buynak, Charlie; Lindgren, Eric A.

2017-02-01

For model benchmark studies, the accuracy of the model is typically evaluated based on the change in response relative to a selected reference signal. The use of a side drilled hole (SDH) in a plate was investigated as a reference signal for angled beam shear wave inspection for aircraft structure inspections of fastener sites. Systematic studies were performed with varying SDH depth and size, and varying the ultrasonic probe frequency, focal depth, and probe height. Increased error was observed with the simulation of angled shear wave beams in the near-field. Even more significant, asymmetry in real probes and the inherent sensitivity of signals in the near-field to subtle test conditions were found to provide a greater challenge with achieving model agreement. To achieve quality model benchmark results for this problem, it is critical to carefully align the probe with the part geometry, to verify symmetry in probe response, and ideally avoid using reference signals from the near-field response. Suggested reference signals for angled beam shear wave inspections include using the `through hole' corner specular reflection signal and the full skip' signal off of the far wall from the side drilled hole.
Benchmarking Multilayer-HySEA model for landslide generated tsunami. HTHMP validation process.

NASA Astrophysics Data System (ADS)

Macias, J.; Escalante, C.; Castro, M. J.

2017-12-01

Landslide tsunami hazard may be dominant along significant parts of the coastline around the world, in particular in the USA, as compared to hazards from other tsunamigenic sources. This fact motivated NTHMP about the need of benchmarking models for landslide generated tsunamis, following the same methodology already used for standard tsunami models when the source is seismic. To perform the above-mentioned validation process, a set of candidate benchmarks were proposed. These benchmarks are based on a subset of available laboratory data sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) close the list of proposed benchmarks. A total of 7 benchmarks. The Multilayer-HySEA model including non-hydrostatic effects has been used to perform all the benchmarking problems dealing with laboratory experiments proposed in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017 by NTHMP. The aim of this presentation is to show some of the latest numerical results obtained with the Multilayer-HySEA (non-hydrostatic) model in the framework of this validation effort.Acknowledgements. This research has been partially supported by the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and University of Malaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
Integral Full Core Multi-Physics PWR Benchmark with Measured Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Forget, Benoit; Smith, Kord; Kumar, Shikhar

In recent years, the importance of modeling and simulation has been highlighted extensively in the DOE research portfolio with concrete examples in nuclear engineering with the CASL and NEAMS programs. These research efforts and similar efforts worldwide aim at the development of high-fidelity multi-physics analysis tools for the simulation of current and next-generation nuclear power reactors. Like all analysis tools, verification and validation is essential to guarantee proper functioning of the software and methods employed. The current approach relies mainly on the validation of single physic phenomena (e.g. critical experiment, flow loops, etc.) and there is a lack of relevantmore » multiphysics benchmark measurements that are necessary to validate high-fidelity methods being developed today. This work introduces a new multi-cycle full-core Pressurized Water Reactor (PWR) depletion benchmark based on two operational cycles of a commercial nuclear power plant that provides a detailed description of fuel assemblies, burnable absorbers, in-core fission detectors, core loading and re-loading patterns. This benchmark enables analysts to develop extremely detailed reactor core models that can be used for testing and validation of coupled neutron transport, thermal-hydraulics, and fuel isotopic depletion. The benchmark also provides measured reactor data for Hot Zero Power (HZP) physics tests, boron letdown curves, and three-dimensional in-core flux maps from 58 instrumented assemblies. The benchmark description is now available online and has been used by many groups. However, much work remains to be done on the quantification of uncertainties and modeling sensitivities. This work aims to address these deficiencies and make this benchmark a true non-proprietary international benchmark for the validation of high-fidelity tools. This report details the BEAVRS uncertainty quantification for the first two cycle of operations and serves as the final report of the
Project Change Evaluation Research Brief.

ERIC Educational Resources Information Center

Leiderman, Sally A.; Dupree, David M.

Project Change is a community-driven anti-racism initiative operating in four communities: Albuquerque, New Mexico; El Paso, Texas; Knoxville, Tennessee; and Valdosta, Georgia. The formative evaluation of Project Change began in 1994 when all of the sites were still in planning or early action phases. Findings from the summative evaluation will be…
Team Projects and Peer Evaluations

ERIC Educational Resources Information Center

Doyle, John Kevin; Meeker, Ralph D.

2008-01-01

The authors assign semester- or quarter-long team-based projects in several Computer Science and Finance courses. This paper reports on our experience in designing, managing, and evaluating such projects. In particular, we discuss the effects of team size and of various peer evaluation schemes on team performance and student learning. We report…
Benchmarking in Academic Pharmacy Departments

PubMed Central

Chisholm-Burns, Marie; Nappi, Jean; Gubbins, Paul O.; Ross, Leigh Ann

2010-01-01

Benchmarking in academic pharmacy, and recommendations for the potential uses of benchmarking in academic pharmacy departments are discussed in this paper. Benchmarking is the process by which practices, procedures, and performance metrics are compared to an established standard or best practice. Many businesses and industries use benchmarking to compare processes and outcomes, and ultimately plan for improvement. Institutions of higher learning have embraced benchmarking practices to facilitate measuring the quality of their educational and research programs. Benchmarking is used internally as well to justify the allocation of institutional resources or to mediate among competing demands for additional program staff or space. Surveying all chairs of academic pharmacy departments to explore benchmarking issues such as department size and composition, as well as faculty teaching, scholarly, and service productivity, could provide valuable information. To date, attempts to gather this data have had limited success. We believe this information is potentially important, urge that efforts to gather it should be continued, and offer suggestions to achieve full participation. PMID:21179251
Benchmarking in academic pharmacy departments.

PubMed

Bosso, John A; Chisholm-Burns, Marie; Nappi, Jean; Gubbins, Paul O; Ross, Leigh Ann

2010-10-11

Benchmarking in academic pharmacy, and recommendations for the potential uses of benchmarking in academic pharmacy departments are discussed in this paper. Benchmarking is the process by which practices, procedures, and performance metrics are compared to an established standard or best practice. Many businesses and industries use benchmarking to compare processes and outcomes, and ultimately plan for improvement. Institutions of higher learning have embraced benchmarking practices to facilitate measuring the quality of their educational and research programs. Benchmarking is used internally as well to justify the allocation of institutional resources or to mediate among competing demands for additional program staff or space. Surveying all chairs of academic pharmacy departments to explore benchmarking issues such as department size and composition, as well as faculty teaching, scholarly, and service productivity, could provide valuable information. To date, attempts to gather this data have had limited success. We believe this information is potentially important, urge that efforts to gather it should be continued, and offer suggestions to achieve full participation.
FDA Benchmark Medical Device Flow Models for CFD Validation.

PubMed

Malinauskas, Richard A; Hariharan, Prasanna; Day, Steven W; Herbertson, Luke H; Buesen, Martin; Steinseifer, Ulrich; Aycock, Kenneth I; Good, Bryan C; Deutsch, Steven; Manning, Keefe B; Craven, Brent A

Computational fluid dynamics (CFD) is increasingly being used to develop blood-contacting medical devices. However, the lack of standardized methods for validating CFD simulations and blood damage predictions limits its use in the safety evaluation of devices. Through a U.S. Food and Drug Administration (FDA) initiative, two benchmark models of typical device flow geometries (nozzle and centrifugal blood pump) were tested in multiple laboratories to provide experimental velocities, pressures, and hemolysis data to support CFD validation. In addition, computational simulations were performed by more than 20 independent groups to assess current CFD techniques. The primary goal of this article is to summarize the FDA initiative and to report recent findings from the benchmark blood pump model study. Discrepancies between CFD predicted velocities and those measured using particle image velocimetry most often occurred in regions of flow separation (e.g., downstream of the nozzle throat, and in the pump exit diffuser). For the six pump test conditions, 57% of the CFD predictions of pressure head were within one standard deviation of the mean measured values. Notably, only 37% of all CFD submissions contained hemolysis predictions. This project aided in the development of an FDA Guidance Document on factors to consider when reporting computational studies in medical device regulatory submissions. There is an accompanying podcast available for this article. Please visit the journal's Web site (www.asaiojournal.com) to listen.
Benchmarking strategies for measuring the quality of healthcare: problems and prospects.

PubMed

Lovaglio, Pietro Giorgio

2012-01-01

Over the last few years, increasing attention has been directed toward the problems inherent to measuring the quality of healthcare and implementing benchmarking strategies. Besides offering accreditation and certification processes, recent approaches measure the performance of healthcare institutions in order to evaluate their effectiveness, defined as the capacity to provide treatment that modifies and improves the patient's state of health. This paper, dealing with hospital effectiveness, focuses on research methods for effectiveness analyses within a strategy comparing different healthcare institutions. The paper, after having introduced readers to the principle debates on benchmarking strategies, which depend on the perspective and type of indicators used, focuses on the methodological problems related to performing consistent benchmarking analyses. Particularly, statistical methods suitable for controlling case-mix, analyzing aggregate data, rare events, and continuous outcomes measured with error are examined. Specific challenges of benchmarking strategies, such as the risk of risk adjustment (case-mix fallacy, underreporting, risk of comparing noncomparable hospitals), selection bias, and possible strategies for the development of consistent benchmarking analyses, are discussed. Finally, to demonstrate the feasibility of the illustrated benchmarking strategies, an application focused on determining regional benchmarks for patient satisfaction (using 2009 Lombardy Region Patient Satisfaction Questionnaire) is proposed.
Creating Robust Evaluation of ATE Projects

ERIC Educational Resources Information Center

Eddy, Pamela L.

2017-01-01

Funded grant projects all involve some form of evaluation, and Advanced Technological Education (ATE) grants are no exception. Program evaluation serves as a critical component not only for evaluating if a project has met its intended and desired outcomes, but the evaluation process is also a central feature of the grant application itself.…
Species management benchmarking: outcomes over outputs in a changing operating environment.

PubMed

Hogg, Carolyn J; Hibbard, Chris; Ford, Claire; Embury, Amanda

2013-03-01

Species management has been utilized by the zoo and aquarium industry, since the mid-1990s, to ensure the ongoing genetic and demographic viability of populations, which can be difficult to maintain in the ever-changing operating environments of zoos. In 2009, the Zoo and Aquarium Association Australasia reviewed their species management services, focusing on addressing issues that had arisen as a result of the managed programs maturing and operating environments evolving. In summary, the project examined resourcing, policies, processes, and species to be managed. As a result, a benchmarking tool was developed (Health Check Report, HCR), which evaluated the programs against a set of broad criteria. A comparison of managed programs (n = 98), between 2008 and 2011, was undertaken to ascertain the tool's effectiveness. There was a marked decrease in programs that were designated as weak (37 down to 13); and an increase in excellent programs (24 up to 49) between the 2 years. Further, there were significant improvements in the administration benchmarking area (submission of reports, captive management plan development) across a number of taxon advisory groups. This HCR comparison showed that a benchmarking tool enables a program's performance to be quickly assessed and any remedial measures applied. The increases observed in program health were mainly due to increased management goals being attained. The HCR will be an ongoing program, as the management of the programs increases and goals are achieved, criteria will be refined to better highlight ongoing issues and ways in which these can be resolved. © 2012 Wiley Periodicals, Inc.
Benchmarking: applications to transfusion medicine.

PubMed

Apelseth, Torunn Oveland; Molnar, Laura; Arnold, Emmy; Heddle, Nancy M

2012-10-01

Benchmarking is as a structured continuous collaborative process in which comparisons for selected indicators are used to identify factors that, when implemented, will improve transfusion practices. This study aimed to identify transfusion medicine studies reporting on benchmarking, summarize the benchmarking approaches used, and identify important considerations to move the concept of benchmarking forward in the field of transfusion medicine. A systematic review of published literature was performed to identify transfusion medicine-related studies that compared at least 2 separate institutions or regions with the intention of benchmarking focusing on 4 areas: blood utilization, safety, operational aspects, and blood donation. Forty-five studies were included: blood utilization (n = 35), safety (n = 5), operational aspects of transfusion medicine (n = 5), and blood donation (n = 0). Based on predefined criteria, 7 publications were classified as benchmarking, 2 as trending, and 36 as single-event studies. Three models of benchmarking are described: (1) a regional benchmarking program that collects and links relevant data from existing electronic sources, (2) a sentinel site model where data from a limited number of sites are collected, and (3) an institutional-initiated model where a site identifies indicators of interest and approaches other institutions. Benchmarking approaches are needed in the field of transfusion medicine. Major challenges include defining best practices and developing cost-effective methods of data collection. For those interested in initiating a benchmarking program, the sentinel site model may be most effective and sustainable as a starting point, although the regional model would be the ideal goal. Copyright © 2012 Elsevier Inc. All rights reserved.

Results Oriented Benchmarking: The Evolution of Benchmarking at NASA from Competitive Comparisons to World Class Space Partnerships

NASA Technical Reports Server (NTRS)

Bell, Michael A.

1999-01-01

Informal benchmarking using personal or professional networks has taken place for many years at the Kennedy Space Center (KSC). The National Aeronautics and Space Administration (NASA) recognized early on, the need to formalize the benchmarking process for better utilization of resources and improved benchmarking performance. The need to compete in a faster, better, cheaper environment has been the catalyst for formalizing these efforts. A pioneering benchmarking consortium was chartered at KSC in January 1994. The consortium known as the Kennedy Benchmarking Clearinghouse (KBC), is a collaborative effort of NASA and all major KSC contractors. The charter of this consortium is to facilitate effective benchmarking, and leverage the resulting quality improvements across KSC. The KBC acts as a resource with experienced facilitators and a proven process. One of the initial actions of the KBC was to develop a holistic methodology for Center-wide benchmarking. This approach to Benchmarking integrates the best features of proven benchmarking models (i.e., Camp, Spendolini, Watson, and Balm). This cost-effective alternative to conventional Benchmarking approaches has provided a foundation for consistent benchmarking at KSC through the development of common terminology, tools, and techniques. Through these efforts a foundation and infrastructure has been built which allows short duration benchmarking studies yielding results gleaned from world class partners that can be readily implemented. The KBC has been recognized with the Silver Medal Award (in the applied research category) from the International Benchmarking Clearinghouse.
FireHose Streaming Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karl Anderson, Steve Plimpton

2015-01-27

The FireHose Streaming Benchmarks are a suite of stream-processing benchmarks defined to enable comparison of streaming software and hardware, both quantitatively vis-a-vis the rate at which they can process data, and qualitatively by judging the effort involved to implement and run the benchmarks. Each benchmark has two parts. The first is a generator which produces and outputs datums at a high rate in a specific format. The second is an analytic which reads the stream of datums and is required to perform a well-defined calculation on the collection of datums, typically to find anomalous datums that have been created inmore » the stream by the generator. The FireHose suite provides code for the generators, sample code for the analytics (which users are free to re-implement in their own custom frameworks), and a precise definition of each benchmark calculation.« less
Evaluating success levels of mega-projects

NASA Technical Reports Server (NTRS)

Kumaraswamy, Mohan M.

1994-01-01

Today's mega-projects transcend the traditional trajectories traced within national and technological limitations. Powers unleashed by internationalization of initiatives, in for example space exploration and environmental protection, are arguably only temporarily suppressed by narrower national, economic, and professional disagreements as to how best they should be harnessed. While the world gets its act together there is time to develop the technologies of such supra-mega-project management that will synergize truly diverse resources and smoothly mesh their interfaces. Such mega-projects and their management need to be realistically evaluated, when implementing such improvements. This paper examines current approaches to evaluating mega-projects and questions the validity of extrapolations to the supra-mega-projects of the future. Alternatives to improve such evaluations are proposed and described.
Benchmarking novel approaches for modelling species range dynamics.

PubMed

Zurell, Damaris; Thuiller, Wilfried; Pagel, Jörn; Cabral, Juliano S; Münkemüller, Tamara; Gravel, Dominique; Dullinger, Stefan; Normand, Signe; Schiffers, Katja H; Moore, Kara A; Zimmermann, Niklaus E

2016-08-01

Increasing biodiversity loss due to climate change is one of the most vital challenges of the 21st century. To anticipate and mitigate biodiversity loss, models are needed that reliably project species' range dynamics and extinction risks. Recently, several new approaches to model range dynamics have been developed to supplement correlative species distribution models (SDMs), but applications clearly lag behind model development. Indeed, no comparative analysis has been performed to evaluate their performance. Here, we build on process-based, simulated data for benchmarking five range (dynamic) models of varying complexity including classical SDMs, SDMs coupled with simple dispersal or more complex population dynamic models (SDM hybrids), and a hierarchical Bayesian process-based dynamic range model (DRM). We specifically test the effects of demographic and community processes on model predictive performance. Under current climate, DRMs performed best, although only marginally. Under climate change, predictive performance varied considerably, with no clear winners. Yet, all range dynamic models improved predictions under climate change substantially compared to purely correlative SDMs, and the population dynamic models also predicted reasonable extinction risks for most scenarios. When benchmarking data were simulated with more complex demographic and community processes, simple SDM hybrids including only dispersal often proved most reliable. Finally, we found that structural decisions during model building can have great impact on model accuracy, but prior system knowledge on important processes can reduce these uncertainties considerably. Our results reassure the clear merit in using dynamic approaches for modelling species' response to climate change but also emphasize several needs for further model and data improvement. We propose and discuss perspectives for improving range projections through combination of multiple models and for making these approaches
Evaluation of microfinance projects.

PubMed

Johnson, S

1999-08-01

This paper criticizes the quick system proposed by Henk Moll for evaluating microfinance projects in the article ¿How to Pre-Evaluate Credit Projects in Ten Minutes¿. The author contended that there is a need to emphasize the objectives of the project. The procedure used by Moll, he contended, is applicable only to projects that have only two key objectives, such as credit operations, and the provision of services. Arguments are presented on the three specific questions proposed by Moll, ranging from the availability of externally audited financial reports, the performance of interest rate on loans vis-a-vis the inflation rate, and the provision of loans according to the individual requirements of the borrowers. Lastly, the author emphasizes that the overall approach is not useful and suggests that careful considerations should be observed in the use or abuse of a simple scoring system or checklist such as the one proposed by Moll.
Piloting a Process Maturity Model as an e-Learning Benchmarking Method

ERIC Educational Resources Information Center

Petch, Jim; Calverley, Gayle; Dexter, Hilary; Cappelli, Tim

2007-01-01

As part of a national e-learning benchmarking initiative of the UK Higher Education Academy, the University of Manchester is carrying out a pilot study of a method to benchmark e-learning in an institution. The pilot was designed to evaluate the operational viability of a method based on the e-Learning Maturity Model developed at the University of…
A benchmarking method to measure dietary absorption efficiency of chemicals by fish.

PubMed

Xiao, Ruiyang; Adolfsson-Erici, Margaretha; Åkerman, Gun; McLachlan, Michael S; MacLeod, Matthew

2013-12-01

Understanding the dietary absorption efficiency of chemicals in the gastrointestinal tract of fish is important from both a scientific and a regulatory point of view. However, reported fish absorption efficiencies for well-studied chemicals are highly variable. In the present study, the authors developed and exploited an internal chemical benchmarking method that has the potential to reduce uncertainty and variability and, thus, to improve the precision of measurements of fish absorption efficiency. The authors applied the benchmarking method to measure the gross absorption efficiency for 15 chemicals with a wide range of physicochemical properties and structures. They selected 2,2',5,6'-tetrachlorobiphenyl (PCB53) and decabromodiphenyl ethane as absorbable and nonabsorbable benchmarks, respectively. Quantities of chemicals determined in fish were benchmarked to the fraction of PCB53 recovered in fish, and quantities of chemicals determined in feces were benchmarked to the fraction of decabromodiphenyl ethane recovered in feces. The performance of the benchmarking procedure was evaluated based on the recovery of the test chemicals and precision of absorption efficiency from repeated tests. Benchmarking did not improve the precision of the measurements; after benchmarking, however, the median recovery for 15 chemicals was 106%, and variability of recoveries was reduced compared with before benchmarking, suggesting that benchmarking could account for incomplete extraction of chemical in fish and incomplete collection of feces from different tests. © 2013 SETAC.
Issues in Benchmarking Human Reliability Analysis Methods: A Literature Review

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ronald L. Boring; Stacey M. L. Hendrickson; John A. Forester

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessments (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study comparing and evaluating HRA methods in assessing operator performance in simulator experiments is currently underway. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing pastmore » benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.« less
Benchmark Report on Key Outage Attributes: An Analysis of Outage Improvement Opportunities and Priorities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Germain, Shawn St.; Farris, Ronald

2014-09-01

Advanced Outage Control Center (AOCC), is a multi-year pilot project targeted at Nuclear Power Plant (NPP) outage improvement. The purpose of this pilot project is to improve management of NPP outages through the development of an AOCC that is specifically designed to maximize the usefulness of communication and collaboration technologies for outage coordination and problem resolution activities. This report documents the results of a benchmarking effort to evaluate the transferability of technologies demonstrated at Idaho National Laboratory and the primary pilot project partner, Palo Verde Nuclear Generating Station. The initial assumption for this pilot project was that NPPs generally domore » not take advantage of advanced technology to support outage management activities. Several researchers involved in this pilot project have commercial NPP experience and believed that very little technology has been applied towards outage communication and collaboration. To verify that the technology options researched and demonstrated through this pilot project would in fact have broad application for the US commercial nuclear fleet, and to look for additional outage management best practices, LWRS program researchers visited several additional nuclear facilities.« less
Linking Project Evaluation and Goals-Based Teacher Evaluation: Evaluating the Accelerated Schools Project in South Carolina.

ERIC Educational Resources Information Center

Finnan, Christine; Davis, Sara Calhoun

This paper describes efforts to design an evaluation system that has as its primary objective helping schools effect positive change through the Accelerated Schools Project. Three characteristics were deemed essential: (1) that the evaluation be useful and meaningful; (2) that it be sensitive to local conditions; and (3) that evaluations of…
23 CFR 505.11 - Project evaluation and rating.

Code of Federal Regulations, 2010 CFR

2010-04-01

... MANAGEMENT PROJECTS OF NATIONAL AND REGIONAL SIGNIFICANCE EVALUATION AND RATING § 505.11 Project evaluation and rating. (a) The Secretary shall evaluate and rate each proposed project as “highly recommended... 23 Highways 1 2010-04-01 2010-04-01 false Project evaluation and rating. 505.11 Section 505.11...
Multilevel Evaluation Systems Project. Final Report.

ERIC Educational Resources Information Center

Herman, Joan L.

Several studies were conducted in 1987 by the Multilevel Evaluation Systems Project, which focuses on developing a model for a multi-purpose, multi-user evaluation system to facilitate educational decision making and evaluation. The project model emphasizes on-going integrated assessment of individuals, classes, and programs using a variety of…
Outcome Benchmarks for Adaptations of Research-Supported Treatments for Adult Traumatic Stress

ERIC Educational Resources Information Center

Rubin, Allen; Parrish, Danielle E.; Washburn, Micki

2016-01-01

This article provides benchmark data on within-group effect sizes from published randomized controlled trials (RCTs) that evaluated the efficacy of research-supported treatments (RSTs) for adult traumatic stress. Agencies can compare these benchmarks to their treatment group effect size to inform their decisions as to whether the way they are…
Benchmarking, benchmarks, or best practices? Applying quality improvement principles to decrease surgical turnaround time.

PubMed

Mitchell, L

1996-01-01

The processes of benchmarking, benchmark data comparative analysis, and study of best practices are distinctly different. The study of best practices is explained with an example based on the Arthur Andersen & Co. 1992 "Study of Best Practices in Ambulatory Surgery". The results of a national best practices study in ambulatory surgery were used to provide our quality improvement team with the goal of improving the turnaround time between surgical cases. The team used a seven-step quality improvement problem-solving process to improve the surgical turnaround time. The national benchmark for turnaround times between surgical cases in 1992 was 13.5 minutes. The initial turnaround time at St. Joseph's Medical Center was 19.9 minutes. After the team implemented solutions, the time was reduced to an average of 16.3 minutes, an 18% improvement. Cost-benefit analysis showed a potential enhanced revenue of approximately $300,000, or a potential savings of $10,119. Applying quality improvement principles to benchmarking, benchmarks, or best practices can improve process performance. Understanding which form of benchmarking the institution wishes to embark on will help focus a team and use appropriate resources. Communicating with professional organizations that have experience in benchmarking will save time and money and help achieve the desired results.
Benchmarking Strategies for Measuring the Quality of Healthcare: Problems and Prospects

PubMed Central

Lovaglio, Pietro Giorgio

2012-01-01

Over the last few years, increasing attention has been directed toward the problems inherent to measuring the quality of healthcare and implementing benchmarking strategies. Besides offering accreditation and certification processes, recent approaches measure the performance of healthcare institutions in order to evaluate their effectiveness, defined as the capacity to provide treatment that modifies and improves the patient's state of health. This paper, dealing with hospital effectiveness, focuses on research methods for effectiveness analyses within a strategy comparing different healthcare institutions. The paper, after having introduced readers to the principle debates on benchmarking strategies, which depend on the perspective and type of indicators used, focuses on the methodological problems related to performing consistent benchmarking analyses. Particularly, statistical methods suitable for controlling case-mix, analyzing aggregate data, rare events, and continuous outcomes measured with error are examined. Specific challenges of benchmarking strategies, such as the risk of risk adjustment (case-mix fallacy, underreporting, risk of comparing noncomparable hospitals), selection bias, and possible strategies for the development of consistent benchmarking analyses, are discussed. Finally, to demonstrate the feasibility of the illustrated benchmarking strategies, an application focused on determining regional benchmarks for patient satisfaction (using 2009 Lombardy Region Patient Satisfaction Questionnaire) is proposed. PMID:22666140
Benchmarking of HEU Mental Annuli Critical Assemblies with Internally Reflected Graphite Cylinder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiaobo, Liu; Bess, John D.; Marshall, Margaret A.

Three experimental configurations of critical assemblies, performed in 1963 at the Oak Ridge Critical Experiment Facility, which are assembled using three different diameter HEU annuli (15-9 inches, 15-7 inches and 13-7 inches) metal annuli with internally reflected graphite cylinder are evaluated and benchmarked. The experimental uncertainties which are 0.00055, 0.00055 and 0.00055 respectively, and biases to the detailed benchmark models which are -0.00179, -0.00189 and -0.00114 respectively, were determined, and the experimental benchmark keff results were obtained for both detailed and simplified model. The calculation results for both detailed and simplified models using MCNP6-1.0 and ENDF VII.1 agree well tomore » the benchmark experimental results with a difference of less than 0.2%. These are acceptable benchmark experiments for inclusion in the ICSBEP Handbook.« less
Phase field benchmark problems for dendritic growth and linear elasticity

DOE PAGES

Jokisaari, Andrea M.; Voorhees, P. W.; Guyer, Jonathan E.; ...

2018-03-26

We present the second set of benchmark problems for phase field models that are being jointly developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST) along with input from other members in the phase field community. As the integrated computational materials engineering (ICME) approach to materials design has gained traction, there is an increasing need for quantitative phase field results. New algorithms and numerical implementations increase computational capabilities, necessitating standard problems to evaluate their impact on simulated microstructure evolution as well as their computational performance. We propose one benchmark problem formore » solidifiication and dendritic growth in a single-component system, and one problem for linear elasticity via the shape evolution of an elastically constrained precipitate. We demonstrate the utility and sensitivity of the benchmark problems by comparing the results of 1) dendritic growth simulations performed with different time integrators and 2) elastically constrained precipitate simulations with different precipitate sizes, initial conditions, and elastic moduli. As a result, these numerical benchmark problems will provide a consistent basis for evaluating different algorithms, both existing and those to be developed in the future, for accuracy and computational efficiency when applied to simulate physics often incorporated in phase field models.« less
Phase field benchmark problems for dendritic growth and linear elasticity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jokisaari, Andrea M.; Voorhees, P. W.; Guyer, Jonathan E.

We present the second set of benchmark problems for phase field models that are being jointly developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST) along with input from other members in the phase field community. As the integrated computational materials engineering (ICME) approach to materials design has gained traction, there is an increasing need for quantitative phase field results. New algorithms and numerical implementations increase computational capabilities, necessitating standard problems to evaluate their impact on simulated microstructure evolution as well as their computational performance. We propose one benchmark problem formore » solidifiication and dendritic growth in a single-component system, and one problem for linear elasticity via the shape evolution of an elastically constrained precipitate. We demonstrate the utility and sensitivity of the benchmark problems by comparing the results of 1) dendritic growth simulations performed with different time integrators and 2) elastically constrained precipitate simulations with different precipitate sizes, initial conditions, and elastic moduli. As a result, these numerical benchmark problems will provide a consistent basis for evaluating different algorithms, both existing and those to be developed in the future, for accuracy and computational efficiency when applied to simulate physics often incorporated in phase field models.« less
Uav Cameras: Overview and Geometric Calibration Benchmark

NASA Astrophysics Data System (ADS)

Cramer, M.; Przybilla, H.-J.; Zurhorst, A.

2017-08-01

Different UAV platforms and sensors are used in mapping already, many of them equipped with (sometimes) modified cameras as known from the consumer market. Even though these systems normally fulfil their requested mapping accuracy, the question arises, which system performs best? This asks for a benchmark, to check selected UAV based camera systems in well-defined, reproducible environments. Such benchmark is tried within this work here. Nine different cameras used on UAV platforms, representing typical camera classes, are considered. The focus is laid on the geometry here, which is tightly linked to the process of geometrical calibration of the system. In most applications the calibration is performed in-situ, i.e. calibration parameters are obtained as part of the project data itself. This is often motivated because consumer cameras do not keep constant geometry, thus, cannot be seen as metric cameras. Still, some of the commercial systems are quite stable over time, as it was proven from repeated (terrestrial) calibrations runs. Already (pre-)calibrated systems may offer advantages, especially when the block geometry of the project does not allow for a stable and sufficient in-situ calibration. Especially for such scenario close to metric UAV cameras may have advantages. Empirical airborne test flights in a calibration field have shown how block geometry influences the estimated calibration parameters and how consistent the parameters from lab calibration can be reproduced.
ENDF/B-VII.1 Neutron Cross Section Data Testing with Critical Assembly Benchmarks and Reactor Experiments

NASA Astrophysics Data System (ADS)

Kahler, A. C.; MacFarlane, R. E.; Mosteller, R. D.; Kiedrowski, B. C.; Frankle, S. C.; Chadwick, M. B.; McKnight, R. D.; Lell, R. M.; Palmiotti, G.; Hiruta, H.; Herman, M.; Arcilla, R.; Mughabghab, S. F.; Sublet, J. C.; Trkov, A.; Trumbull, T. H.; Dunn, M.

2011-12-01

The ENDF/B-VII.1 library is the latest revision to the United States' Evaluated Nuclear Data File (ENDF). The ENDF library is currently in its seventh generation, with ENDF/B-VII.0 being released in 2006. This revision expands upon that library, including the addition of new evaluated files (was 393 neutron files previously, now 423 including replacement of elemental vanadium and zinc evaluations with isotopic evaluations) and extension or updating of many existing neutron data files. Complete details are provided in the companion paper [M. B. Chadwick et al., "ENDF/B-VII.1 Nuclear Data for Science and Technology: Cross Sections, Covariances, Fission Product Yields and Decay Data," Nuclear Data Sheets, 112, 2887 (2011)]. This paper focuses on how accurately application libraries may be expected to perform in criticality calculations with these data. Continuous energy cross section libraries, suitable for use with the MCNP Monte Carlo transport code, have been generated and applied to a suite of nearly one thousand critical benchmark assemblies defined in the International Criticality Safety Benchmark Evaluation Project's International Handbook of Evaluated Criticality Safety Benchmark Experiments. This suite covers uranium and plutonium fuel systems in a variety of forms such as metallic, oxide or solution, and under a variety of spectral conditions, including unmoderated (i.e., bare), metal reflected and water or other light element reflected. Assembly eigenvalues that were accurately predicted with ENDF/B-VII.0 cross sections such as unmoderated and uranium reflected 235U and 239Pu assemblies, HEU solution systems and LEU oxide lattice systems that mimic commercial PWR configurations continue to be accurately calculated with ENDF/B-VII.1 cross sections, and deficiencies in predicted eigenvalues for assemblies containing selected materials, including titanium, manganese, cadmium and tungsten are greatly reduced. Improvements are also confirmed for selected

SU-E-T-148: Benchmarks and Pre-Treatment Reviews: A Study of Quality Assurance Effectiveness

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lowenstein, J; Nguyen, H; Roll, J

Purpose: To determine the impact benchmarks and pre-treatment reviews have on improving the quality of submitted clinical trial data. Methods: Benchmarks are used to evaluate a site’s ability to develop a treatment that meets a specific protocol’s treatment guidelines prior to placing their first patient on the protocol. A pre-treatment review is an actual patient placed on the protocol in which the dosimetry and contour volumes are evaluated to be per protocol guidelines prior to allowing the beginning of the treatment. A key component of these QA mechanisms is that sites are provided timely feedback to educate them on howmore » to plan per the protocol and prevent protocol deviations on patients accrued to a protocol. For both benchmarks and pre-treatment reviews a dose volume analysis (DVA) was performed using MIM softwareTM. For pre-treatment reviews a volume contour evaluation was also performed. Results: IROC Houston performed a QA effectiveness analysis of a protocol which required both benchmarks and pre-treatment reviews. In 70 percent of the patient cases submitted, the benchmark played an effective role in assuring that the pre-treatment review of the cases met protocol requirements. The 35 percent of sites failing the benchmark subsequently modified there planning technique to pass the benchmark before being allowed to submit a patient for pre-treatment review. However, in 30 percent of the submitted cases the pre-treatment review failed where the majority (71 percent) failed the DVA. 20 percent of sites submitting patients failed to correct their dose volume discrepancies indicated by the benchmark case. Conclusion: Benchmark cases and pre-treatment reviews can be an effective QA tool to educate sites on protocol guidelines and to minimize deviations. Without the benchmark cases it is possible that 65 percent of the cases undergoing a pre-treatment review would have failed to meet the protocols requirements.Support: U24-CA-180803.« less
Benchmarking Tool Kit.

ERIC Educational Resources Information Center

Canadian Health Libraries Association.

Nine Canadian health libraries participated in a pilot test of the Benchmarking Tool Kit between January and April, 1998. Although the Tool Kit was designed specifically for health libraries, the content and approach are useful to other types of libraries as well. Used to its full potential, benchmarking can provide a common measuring stick to…
Implementing a benchmarking and feedback concept decreases postoperative pain after total knee arthroplasty: A prospective study including 256 patients.

PubMed

Benditz, A; Drescher, J; Greimel, F; Zeman, F; Grifka, J; Meißner, W; Völlner, F

2016-12-05

Perioperative pain reduction, particularly during the first two days, is highly important for patients after total knee arthroplasty (TKA). Problems are not only caused by medical issues but by organization and hospital structure. The present study shows how the quality of pain management can be increased by implementing a standardized pain concept and simple, consistent benchmarking. All patients included into the study had undergone total knee arthroplasty. Outcome parameters were analyzed by means of a questionnaire on the first postoperative day. A multidisciplinary team implemented a regular procedure of data analyzes and external benchmarking by participating in a nationwide quality improvement project. At the beginning of the study, our hospital ranked 16 th in terms of activity-related pain and 9 th in patient satisfaction among 47 anonymized hospitals participating in the benchmarking project. At the end of the study, we had improved to 1 st activity-related pain and to 2 nd in patient satisfaction. Although benchmarking started and finished with the same standardized pain management concept, results were initially pure. Beside pharmacological treatment, interdisciplinary teamwork and benchmarking with direct feedback mechanisms are also very important for decreasing postoperative pain and for increasing patient satisfaction after TKA.
Simulation of Benchmark Cases with the Terminal Area Simulation System (TASS)

NASA Technical Reports Server (NTRS)

Ahmad, Nashat N.; Proctor, Fred H.

2011-01-01

The hydrodynamic core of the Terminal Area Simulation System (TASS) is evaluated against different benchmark cases. In the absence of closed form solutions for the equations governing atmospheric flows, the models are usually evaluated against idealized test cases. Over the years, various authors have suggested a suite of these idealized cases which have become standards for testing and evaluating the dynamics and thermodynamics of atmospheric flow models. In this paper, simulations of three such cases are described. In addition, the TASS model is evaluated against a test case that uses an exact solution of the Navier-Stokes equations. The TASS results are compared against previously reported simulations of these benchmark cases in the literature. It is demonstrated that the TASS model is highly accurate, stable and robust.
Restaurant Energy Use Benchmarking Guideline

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hedrick, R.; Smith, V.; Field, K.

2011-07-01

A significant operational challenge for food service operators is defining energy use benchmark metrics to compare against the performance of individual stores. Without metrics, multiunit operators and managers have difficulty identifying which stores in their portfolios require extra attention to bring their energy performance in line with expectations. This report presents a method whereby multiunit operators may use their own utility data to create suitable metrics for evaluating their operations.
Benchmarking and Self-Assessment in the Wine Industry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Galitsky, Christina; Radspieler, Anthony; Worrell, Ernst

2005-12-01

Not all industrial facilities have the staff or theopportunity to perform a detailed audit of their operations. The lack ofknowledge of energy efficiency opportunities provides an importantbarrier to improving efficiency. Benchmarking programs in the U.S. andabroad have shown to improve knowledge of the energy performance ofindustrial facilities and buildings and to fuel energy managementpractices. Benchmarking provides a fair way to compare the energyintensity of plants, while accounting for structural differences (e.g.,the mix of products produced, climate conditions) between differentfacilities. In California, the winemaking industry is not only one of theeconomic pillars of the economy; it is also a large energymore » consumer, witha considerable potential for energy-efficiency improvement. LawrenceBerkeley National Laboratory and Fetzer Vineyards developed the firstbenchmarking tool for the California wine industry called "BEST(Benchmarking and Energy and water Savings Tool) Winery". BEST Wineryenables a winery to compare its energy efficiency to a best practicereference winery. Besides overall performance, the tool enables the userto evaluate the impact of implementing efficiency measures. The toolfacilitates strategic planning of efficiency measures, based on theestimated impact of the measures, their costs and savings. The tool willraise awareness of current energy intensities and offer an efficient wayto evaluate the impact of future efficiency measures.« less
Benchmarking Gas Path Diagnostic Methods: A Public Approach

NASA Technical Reports Server (NTRS)

Simon, Donald L.; Bird, Jeff; Davison, Craig; Volponi, Al; Iverson, R. Eugene

2008-01-01

Recent technology reviews have identified the need for objective assessments of engine health management (EHM) technology. The need is two-fold: technology developers require relevant data and problems to design and validate new algorithms and techniques while engine system integrators and operators need practical tools to direct development and then evaluate the effectiveness of proposed solutions. This paper presents a publicly available gas path diagnostic benchmark problem that has been developed by the Propulsion and Power Systems Panel of The Technical Cooperation Program (TTCP) to help address these needs. The problem is coded in MATLAB (The MathWorks, Inc.) and coupled with a non-linear turbofan engine simulation to produce "snap-shot" measurements, with relevant noise levels, as if collected from a fleet of engines over their lifetime of use. Each engine within the fleet will experience unique operating and deterioration profiles, and may encounter randomly occurring relevant gas path faults including sensor, actuator and component faults. The challenge to the EHM community is to develop gas path diagnostic algorithms to reliably perform fault detection and isolation. An example solution to the benchmark problem is provided along with associated evaluation metrics. A plan is presented to disseminate this benchmark problem to the engine health management technical community and invite technology solutions.
Benchmarking multimedia performance

NASA Astrophysics Data System (ADS)

Zandi, Ahmad; Sudharsanan, Subramania I.

1998-03-01

With the introduction of faster processors and special instruction sets tailored to multimedia, a number of exciting applications are now feasible on the desktops. Among these is the DVD playback consisting, among other things, of MPEG-2 video and Dolby digital audio or MPEG-2 audio. Other multimedia applications such as video conferencing and speech recognition are also becoming popular on computer systems. In view of this tremendous interest in multimedia, a group of major computer companies have formed, Multimedia Benchmarks Committee as part of Standard Performance Evaluation Corp. to address the performance issues of multimedia applications. The approach is multi-tiered with three tiers of fidelity from minimal to full compliant. In each case the fidelity of the bitstream reconstruction as well as quality of the video or audio output are measured and the system is classified accordingly. At the next step the performance of the system is measured. In many multimedia applications such as the DVD playback the application needs to be run at a specific rate. In this case the measurement of the excess processing power, makes all the difference. All these make a system level, application based, multimedia benchmark very challenging. Several ideas and methodologies for each aspect of the problems will be presented and analyzed.
A Quantitative Methodology for Determining the Critical Benchmarks for Project 2061 Strand Maps

ERIC Educational Resources Information Center

Kuhn, G.

2008-01-01

The American Association for the Advancement of Science (AAAS) was tasked with identifying the key science concepts for science literacy in K-12 students in America (AAAS, 1990, 1993). The AAAS Atlas of Science Literacy (2001) has organized roughly half of these science concepts or benchmarks into fifty flow charts. Each flow chart or strand map…
Evaluation of the influence of the definition of an isolated hip fracture as an exclusion criterion for trauma system benchmarking: a multicenter cohort study.

PubMed

Tiao, J; Moore, L; Porgo, T V; Belcaid, A

2016-06-01

To assess whether the definition of an IHF used as an exclusion criterion influences the results of trauma center benchmarking. We conducted a multicenter retrospective cohort study with data from an integrated Canadian trauma system. The study population included all patients admitted between 1999 and 2010 to any of the 57 adult trauma centers. Seven definitions of IHF based on diagnostic codes, age, mechanism of injury, and secondary injuries, identified in a systematic review, were used. Trauma centers were benchmarked using risk-adjusted mortality estimates generated using the Trauma Risk Adjustment Model. The agreement between benchmarking results generated under different IHF definitions was evaluated with correlation coefficients on adjusted mortality estimates. Correlation coefficients >0.95 were considered to convey acceptable agreement. The study population consisted of 172,872 patients before exclusion of IHF and between 128,094 and 139,588 patients after exclusion. Correlation coefficients between risk-adjusted mortality estimates generated in populations including and excluding IHF varied between 0.86 and 0.90. Correlation coefficients of estimates generated under different definitions of IHF varied between 0.97 and 0.99, even when analyses were restricted to patients aged ≥65 years. Although the exclusion of patients with IHF has an influence on the results of trauma center benchmarking based on mortality, the definition of IHF in terms of diagnostic codes, age, mechanism of injury and secondary injury has no significant impact on benchmarking results. Results suggest that there is no need to obtain formal consensus on the definition of IHF for benchmarking activities.
MHEC Survey Establishes Midwest Property Insurance Benchmarks.

ERIC Educational Resources Information Center

Midwestern Higher Education Commission Risk Management Institute Research Bulletin, 1994

1994-01-01

This publication presents the results of a survey of over 200 midwestern colleges and universities on their property insurance programs and establishes benchmarks to help these institutions evaluate their insurance programs. Findings included the following: (1) 51 percent of respondents currently purchase their property insurance as part of a…
Analysis of Students' Assessments in Middle School Curriculum Materials: Aiming Precisely at Benchmarks and Standards.

ERIC Educational Resources Information Center

Stern, Luli; Ahlgren, Andrew

2002-01-01

Project 2061 of the American Association for the Advancement of Science (AAAS) developed and field-tested a procedure for analyzing curriculum materials, including assessments, in terms of contribution to the attainment of benchmarks and standards. Using this procedure, Project 2061 produced a database of reports on nine science middle school…
Validation of mechanical models for reinforced concrete structures: Presentation of the French project ``Benchmark des Poutres de la Rance''

NASA Astrophysics Data System (ADS)

L'Hostis, V.; Brunet, C.; Poupard, O.; Petre-Lazar, I.

2006-11-01

Several ageing models are available for the prediction of the mechanical consequences of rebar corrosion. They are used for service life prediction of reinforced concrete structures. Concerning corrosion diagnosis of reinforced concrete, some Non Destructive Testing (NDT) tools have been developed, and have been in use for some years. However, these developments require validation on existing concrete structures. The French project “Benchmark des Poutres de la Rance” contributes to this aspect. It has two main objectives: (i) validation of mechanical models to estimate the influence of rebar corrosion on the load bearing capacity of a structure, (ii) qualification of the use of the NDT results to collect information on steel corrosion within reinforced-concrete structures. Ten French and European institutions from both academic research laboratories and industrial companies contributed during the years 2004 and 2005. This paper presents the project that was divided into several work packages: (i) the reinforced concrete beams were characterized from non-destructive testing tools, (ii) the mechanical behaviour of the beams was experimentally tested, (iii) complementary laboratory analysis were performed and (iv) finally numerical simulations results were compared to the experimental results obtained with the mechanical tests.
Interactive visual optimization and analysis for RFID benchmarking.

PubMed

Wu, Yingcai; Chung, Ka-Kei; Qu, Huamin; Yuan, Xiaoru; Cheung, S C

2009-01-01

Radio frequency identification (RFID) is a powerful automatic remote identification technique that has wide applications. To facilitate RFID deployment, an RFID benchmarking instrument called aGate has been invented to identify the strengths and weaknesses of different RFID technologies in various environments. However, the data acquired by aGate are usually complex time varying multidimensional 3D volumetric data, which are extremely challenging for engineers to analyze. In this paper, we introduce a set of visualization techniques, namely, parallel coordinate plots, orientation plots, a visual history mechanism, and a 3D spatial viewer, to help RFID engineers analyze benchmark data visually and intuitively. With the techniques, we further introduce two workflow procedures (a visual optimization procedure for finding the optimum reader antenna configuration and a visual analysis procedure for comparing the performance and identifying the flaws of RFID devices) for the RFID benchmarking, with focus on the performance analysis of the aGate system. The usefulness and usability of the system are demonstrated in the user evaluation.
Object-Oriented Implementation of the NAS Parallel Benchmarks using Charm++

NASA Technical Reports Server (NTRS)

Krishnan, Sanjeev; Bhandarkar, Milind; Kale, Laxmikant V.

1996-01-01

This report describes experiences with implementing the NAS Computational Fluid Dynamics benchmarks using a parallel object-oriented language, Charm++. Our main objective in implementing the NAS CFD kernel benchmarks was to develop a code that could be used to easily experiment with different domain decomposition strategies and dynamic load balancing. We also wished to leverage the object-orientation provided by the Charm++ parallel object-oriented language, to develop reusable abstractions that would simplify the process of developing parallel applications. We first describe the Charm++ parallel programming model and the parallel object array abstraction, then go into detail about each of the Scalar Pentadiagonal (SP) and Lower/Upper Triangular (LU) benchmarks, along with performance results. Finally we conclude with an evaluation of the methodology used.
NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

PubMed

Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

2015-09-29

In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow to evaluate accurately and reproducibly those methods. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework that uses various datasets highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performances.
Developing Benchmarks for Solar Radio Bursts

NASA Astrophysics Data System (ADS)

Biesecker, D. A.; White, S. M.; Gopalswamy, N.; Black, C.; Domm, P.; Love, J. J.; Pierson, J.

2016-12-01

Solar radio bursts can interfere with radar, communication, and tracking signals. In severe cases, radio bursts can inhibit the successful use of radio communications and disrupt a wide range of systems that are reliant on Position, Navigation, and Timing services on timescales ranging from minutes to hours across wide areas on the dayside of Earth. The White House's Space Weather Action Plan has asked for solar radio burst intensity benchmarks for an event occurrence frequency of 1 in 100 years and also a theoretical maximum intensity benchmark. The solar radio benchmark team was also asked to define the wavelength/frequency bands of interest. The benchmark team developed preliminary (phase 1) benchmarks for the VHF (30-300 MHz), UHF (300-3000 MHz), GPS (1176-1602 MHz), F10.7 (2800 MHz), and Microwave (4000-20000) bands. The preliminary benchmarks were derived based on previously published work. Limitations in the published work will be addressed in phase 2 of the benchmark process. In addition, deriving theoretical maxima requires additional work, where it is even possible to, in order to meet the Action Plan objectives. In this presentation, we will present the phase 1 benchmarks and the basis used to derive them. We will also present the work that needs to be done in order to complete the final, or phase 2 benchmarks.
The PIE Institute Project: Final Evaluation Report

ERIC Educational Resources Information Center

St. John, Mark; Carroll, Becky; Helms, Jen; Smith, Anita

2008-01-01

The Playful Invention and Exploration (PIE) Institute project was funded in 2005 by the National Science Foundation (NSF). For the past three years, Inverness Research has served as the external evaluator for the PIE project. The authors' evaluation efforts have included extensive observation and documentation of PIE project activities; ongoing…
NASA Indexing Benchmarks: Evaluating Text Search Engines

NASA Technical Reports Server (NTRS)

Esler, Sandra L.; Nelson, Michael L.

1997-01-01

The current proliferation of on-line information resources underscores the requirement for the ability to index collections of information and search and retrieve them in a convenient manner. This study develops criteria for analytically comparing the index and search engines and presents results for a number of freely available search engines. A product of this research is a toolkit capable of automatically indexing, searching, and extracting performance statistics from each of the focused search engines. This toolkit is highly configurable and has the ability to run these benchmark tests against other engines as well. Results demonstrate that the tested search engines can be grouped into two levels. Level one engines are efficient on small to medium sized data collections, but show weaknesses when used for collections 100MB or larger. Level two search engines are recommended for data collections up to and beyond 100MB.
Benchmarking in European Higher Education: A Step beyond Current Quality Models

ERIC Educational Resources Information Center

Burquel, Nadine; van Vught, Frans

2010-01-01

This paper presents the findings of a two-year EU-funded project (DG Education and Culture) "Benchmarking in European Higher Education", carried out from 2006 to 2008 by a consortium led by the European Centre for Strategic Management of Universities (ESMU), with the Centre for Higher Education Development, UNESCO-CEPES, and the…

Preliminary Results for the OECD/NEA Time Dependent Benchmark using Rattlesnake, Rattlesnake-IQS and TDKENO

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeHart, Mark D.; Mausolff, Zander; Weems, Zach

2016-08-01

One goal of the MAMMOTH M&S project is to validate the analysis capabilities within MAMMOTH. Historical data has shown limited value for validation of full three-dimensional (3D) multi-physics methods. Initial analysis considered the TREAT startup minimum critical core and one of the startup transient tests. At present, validation is focusing on measurements taken during the M8CAL test calibration series. These exercises will valuable in preliminary assessment of the ability of MAMMOTH to perform coupled multi-physics calculations; calculations performed to date are being used to validate the neutron transport solver Rattlesnake\\cite{Rattlesnake} and the fuels performance code BISON. Other validation projects outsidemore » of TREAT are available for single-physics benchmarking. Because the transient solution capability of Rattlesnake is one of the key attributes that makes it unique for TREAT transient simulations, validation of the transient solution of Rattlesnake using other time dependent kinetics benchmarks has considerable value. The Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) has recently developed a computational benchmark for transient simulations. This benchmark considered both two-dimensional (2D) and 3D configurations for a total number of 26 different transients. All are negative reactivity insertions, typically returning to the critical state after some time.« less
WWTP dynamic disturbance modelling--an essential module for long-term benchmarking development.

PubMed

Gernaey, K V; Rosen, C; Jeppsson, U

2006-01-01

Intensive use of the benchmark simulation model No. 1 (BSM1), a protocol for objective comparison of the effectiveness of control strategies in biological nitrogen removal activated sludge plants, has also revealed a number of limitations. Preliminary definitions of the long-term benchmark simulation model No. 1 (BSM1_LT) and the benchmark simulation model No. 2 (BSM2) have been made to extend BSM1 for evaluation of process monitoring methods and plant-wide control strategies, respectively. Influent-related disturbances for BSM1_LT/BSM2 are to be generated with a model, and this paper provides a general overview of the modelling methods used. Typical influent dynamic phenomena generated with the BSM1_LT/BSM2 influent disturbance model, including diurnal, weekend, seasonal and holiday effects, as well as rainfall, are illustrated with simulation results. As a result of the work described in this paper, a proposed influent model/file has been released to the benchmark developers for evaluation purposes. Pending this evaluation, a final BSM1_LT/BSM2 influent disturbance model definition is foreseen. Preliminary simulations with dynamic influent data generated by the influent disturbance model indicate that default BSM1 activated sludge plant control strategies will need extensions for BSM1_LT/BSM2 to efficiently handle 1 year of influent dynamics.
Implementing a benchmarking and feedback concept decreases postoperative pain after total knee arthroplasty: A prospective study including 256 patients

PubMed Central

Benditz, A.; Drescher, J.; Greimel, F.; Zeman, F.; Grifka, J.; Meißner, W.; Völlner, F.

2016-01-01

Perioperative pain reduction, particularly during the first two days, is highly important for patients after total knee arthroplasty (TKA). Problems are not only caused by medical issues but by organization and hospital structure. The present study shows how the quality of pain management can be increased by implementing a standardized pain concept and simple, consistent benchmarking. All patients included into the study had undergone total knee arthroplasty. Outcome parameters were analyzed by means of a questionnaire on the first postoperative day. A multidisciplinary team implemented a regular procedure of data analyzes and external benchmarking by participating in a nationwide quality improvement project. At the beginning of the study, our hospital ranked 16th in terms of activity-related pain and 9th in patient satisfaction among 47 anonymized hospitals participating in the benchmarking project. At the end of the study, we had improved to 1st activity-related pain and to 2nd in patient satisfaction. Although benchmarking started and finished with the same standardized pain management concept, results were initially pure. Beside pharmacological treatment, interdisciplinary teamwork and benchmarking with direct feedback mechanisms are also very important for decreasing postoperative pain and for increasing patient satisfaction after TKA. PMID:27917911
Requirements for benchmarking personal image retrieval systems

NASA Astrophysics Data System (ADS)

Bouguet, Jean-Yves; Dulong, Carole; Kozintsev, Igor; Wu, Yi

2006-01-01

It is now common to have accumulated tens of thousands of personal ictures. Efficient access to that many pictures can only be done with a robust image retrieval system. This application is of high interest to Intel processor architects. It is highly compute intensive, and could motivate end users to upgrade their personal computers to the next generations of processors. A key question is how to assess the robustness of a personal image retrieval system. Personal image databases are very different from digital libraries that have been used by many Content Based Image Retrieval Systems.1 For example a personal image database has a lot of pictures of people, but a small set of different people typically family, relatives, and friends. Pictures are taken in a limited set of places like home, work, school, and vacation destination. The most frequent queries are searched for people, and for places. These attributes, and many others affect how a personal image retrieval system should be benchmarked, and benchmarks need to be different from existing ones based on art images, or medical images for examples. The attributes of the data set do not change the list of components needed for the benchmarking of such systems as specified in2: - data sets - query tasks - ground truth - evaluation measures - benchmarking events. This paper proposed a way to build these components to be representative of personal image databases, and of the corresponding usage models.
Project CAREER/CAN. Final Evaluation Report.

ERIC Educational Resources Information Center

National Educational Evaluation Services, Inc., Chestnut Hill, MA.

A description and evaluation of (1) the development of the 4-column process which completes the behavioral objective data base, (2) the development of the computer retrieval capability, and (3) the pilot testing of the product in high school classrooms are included in this summative evaluation of Project CAREER/CAN. (Goals of Project CAREER/CAN,…
Evaluation of Title I ESEA Projects: 1975-76.

ERIC Educational Resources Information Center

Philadelphia School District, PA. Office of Research and Evaluation.

Evaluation services to be provided during 1975-76 to projects funded under the Elementary and Secondary Education Act Title I are listed in this annual booklet. For each project, the following information is provided: goals to be assessed, evaluation techniques (design), and evaluation milestones. Regular term and summer term projects reported on…
Evaluating a collaborative IT based research and development project.

PubMed

Khan, Zaheer; Ludlow, David; Caceres, Santiago

2013-10-01

In common with all projects, evaluating an Information Technology (IT) based research and development project is necessary in order to discover whether or not the outcomes of the project are successful. However, evaluating large-scale collaborative projects is especially difficult as: (i) stakeholders from different countries are involved who, almost inevitably, have diverse technological and/or application domain backgrounds and objectives; (ii) multiple and sometimes conflicting application specific and user-defined requirements exist; and (iii) multiple and often conflicting technological research and development objectives are apparent. In this paper, we share our experiences based on the large-scale integrated research project - The HUMBOLDT project - with project duration of 54 months, involving contributions from 27 partner organisations, plus 4 sub-contractors from 14 different European countries. In the HUMBOLDT project, a specific evaluation methodology was defined and utilised for the user evaluation of the project outcomes. The user evaluation performed on the HUMBOLDT Framework and its associated nine application scenarios from various application domains, resulted in not only an evaluation of the integrated project, but also revealed the benefits and disadvantages of the evaluation methodology. This paper presents the evaluation methodology, discusses in detail the process of applying it to the HUMBOLDT project and provides an in-depth analysis of the results, which can be usefully applied to other collaborative research projects in a variety of domains. Copyright © 2013 Elsevier Ltd. All rights reserved.
Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation

PubMed Central

Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B.

2016-01-01

Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing are now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarksand that uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware
Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation.

PubMed

Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B

2016-01-01

Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing are now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarksand that uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware
Research on computer systems benchmarking

NASA Technical Reports Server (NTRS)

Smith, Alan Jay (Principal Investigator)

1996-01-01

This grant addresses the topic of research on computer systems benchmarking and is more generally concerned with performance issues in computer systems. This report reviews work in those areas during the period of NASA support under this grant. The bulk of the work performed concerned benchmarking and analysis of CPUs, compilers, caches, and benchmark programs. The first part of this work concerned the issue of benchmark performance prediction. A new approach to benchmarking and machine characterization was reported, using a machine characterizer that measures the performance of a given system in terms of a Fortran abstract machine. Another report focused on analyzing compiler performance. The performance impact of optimization in the context of our methodology for CPU performance characterization was based on the abstract machine model. Benchmark programs are analyzed in another paper. A machine-independent model of program execution was developed to characterize both machine performance and program execution. By merging these machine and program characterizations, execution time can be estimated for arbitrary machine/program combinations. The work was continued into the domain of parallel and vector machines, including the issue of caches in vector processors and multiprocessors. All of the afore-mentioned accomplishments are more specifically summarized in this report, as well as those smaller in magnitude supported by this grant.
Project evaluation process manual

DOT National Transportation Integrated Search

1997-07-01

Describes the process for evaluating airport environments, safety standards, airport infrastructure, licensing standards and multitransportational systems. The project rating system is intended to be used for determining state and federal funding.
Medico-economic evaluation of healthcare products. Methodology for defining a significant impact on French health insurance costs and selection of benchmarks for interpreting results.

PubMed

Dervaux, Benoît; Baseilhac, Eric; Fagon, Jean-Yves; Biot, Claire; Blachier, Corinne; Braun, Eric; Debroucker, Frédérique; Detournay, Bruno; Ferretti, Carine; Granger, Muriel; Jouan-Flahault, Chrystel; Lussier, Marie-Dominique; Meyer, Arlette; Muller, Sophie; Pigeon, Martine; De Sahb, Rima; Sannié, Thomas; Sapède, Claudine; Vray, Muriel

2014-01-01

Decree No. 2012-1116 of 2 October 2012 on medico-economic assignments of the French National Authority for Health (Haute autorité de santé, HAS) significantly alters the conditions for accessing the health products market in France. This paper presents a theoretical framework for interpreting the results of the economic evaluation of health technologies and summarises the facts available in France for developing benchmarks that will be used to interpret incremental cost-effectiveness ratios. This literature review shows that it is difficult to determine a threshold value but it is also difficult to interpret then incremental cost effectiveness ratio (ICER) results without a threshold value. In this context, round table participants favour a pragmatic approach based on "benchmarks" as opposed to a threshold value, based on an interpretative and normative perspective, i.e. benchmarks that can change over time based on feedback. © 2014 Société Française de Pharmacologie et de Thérapeutique.
Evaluating the Information Power Grid using the NAS Grid Benchmarks

NASA Technical Reports Server (NTRS)

VanderWijngaartm Rob F.; Frumkin, Michael A.

2004-01-01

The NAS Grid Benchmarks (NGB) are a collection of synthetic distributed applications designed to rate the performance and functionality of computational grids. We compare several implementations of the NGB to determine programmability and efficiency of NASA's Information Power Grid (IPG), whose services are mostly based on the Globus Toolkit. We report on the overheads involved in porting existing NGB reference implementations to the IPG. No changes were made to the component tasks of the NGB can still be improved.
International E-Benchmarking: Flexible Peer Development of Authentic Learning Principles in Higher Education

ERIC Educational Resources Information Center

Leppisaari, Irja; Vainio, Leena; Herrington, Jan; Im, Yeonwook

2011-01-01

More and more, social technologies and virtual work methods are facilitating new ways of crossing boundaries in professional development and international collaborations. This paper examines the peer development of higher education teachers through the experiences of the IVBM project (International Virtual Benchmarking, 2009-2010). The…
Making Benchmark Testing Work

ERIC Educational Resources Information Center

Herman, Joan L.; Baker, Eva L.

2005-01-01

Many schools are moving to develop benchmark tests to monitor their students' progress toward state standards throughout the academic year. Benchmark tests can provide the ongoing information that schools need to guide instructional programs and to address student learning problems. The authors discuss six criteria that educators can use to…
Oneida Language Project: Evaluative Report.

ERIC Educational Resources Information Center

Grittner, Frank M.

Based on data collected during site visits in March 1978 to four Wisconsin school districts (Freedom, Pulaski, Seymour, and West DePere), this document evaluates a state project on Oneida language instruction. The project evolved from an expressed interest by Oneida Tribal members, elementary school children, and university students in learning…
Developing and Trialling an independent, scalable and repeatable IT-benchmarking procedure for healthcare organisations.

PubMed

Liebe, J D; Hübner, U

2013-01-01

Continuous improvements of IT-performance in healthcare organisations require actionable performance indicators, regularly conducted, independent measurements and meaningful and scalable reference groups. Existing IT-benchmarking initiatives have focussed on the development of reliable and valid indicators, but less on the questions about how to implement an environment for conducting easily repeatable and scalable IT-benchmarks. This study aims at developing and trialling a procedure that meets the afore-mentioned requirements. We chose a well established, regularly conducted (inter-) national IT-survey of healthcare organisations (IT-Report Healthcare) as the environment and offered the participants of the 2011 survey (CIOs of hospitals) to enter a benchmark. The 61 structural and functional performance indicators covered among others the implementation status and integration of IT-systems and functions, global user satisfaction and the resources of the IT-department. Healthcare organisations were grouped by size and ownership. The benchmark results were made available electronically and feedback on the use of these results was requested after several months. Fifty-ninehospitals participated in the benchmarking. Reference groups consisted of up to 141 members depending on the number of beds (size) and the ownership (public vs. private). A total of 122 charts showing single indicator frequency views were sent to each participant. The evaluation showed that 94.1% of the CIOs who participated in the evaluation considered this benchmarking beneficial and reported that they would enter again. Based on the feedback of the participants we developed two additional views that provide a more consolidated picture. The results demonstrate that establishing an independent, easily repeatable and scalable IT-benchmarking procedure is possible and was deemed desirable. Based on these encouraging results a new benchmarking round which includes process indicators is currently
HS06 Benchmark for an ARM Server

NASA Astrophysics Data System (ADS)

Kluth, Stefan

2014-06-01

We benchmarked an ARM cortex-A9 based server system with a four-core CPU running at 1.1 GHz. The system used Ubuntu 12.04 as operating system and the HEPSPEC 2006 (HS06) benchmarking suite was compiled natively with gcc-4.4 on the system. The benchmark was run for various settings of the relevant gcc compiler options. We did not find significant influence from the compiler options on the benchmark result. The final HS06 benchmark result is 10.4.
How is success or failure in river restoration projects evaluated? Feedback from French restoration projects.

PubMed

Morandi, Bertrand; Piégay, Hervé; Lamouroux, Nicolas; Vaudor, Lise

2014-05-01

Since the 1990s, French operational managers and scientists have been involved in the environmental restoration of rivers. The European Water Framework Directive (2000) highlights the need for feedback from restoration projects and for evidence-based evaluation of success. Based on 44 French pilot projects that included such an evaluation, the present study includes: 1) an introduction to restoration projects based on their general characteristics 2) a description of evaluation strategies and authorities in charge of their implementation, and 3) a focus on the evaluation of results and the links between these results and evaluation strategies. The results show that: 1) the quality of an evaluation strategy often remains too poor to understand well the link between a restoration project and ecological changes; 2) in many cases, the conclusions drawn are contradictory, making it difficult to determine the success or failure of a restoration project; and 3) the projects with the poorest evaluation strategies generally have the most positive conclusions about the effects of restoration. Recommendations are that evaluation strategies should be designed early in the project planning process and be based on clearly-defined objectives. Copyright © 2014 Elsevier Ltd. All rights reserved.
NAS Grid Benchmarks. 1.0

NASA Technical Reports Server (NTRS)

VanderWijngaart, Rob; Frumkin, Michael; Biegel, Bryan A. (Technical Monitor)

2002-01-01

We provide a paper-and-pencil specification of a benchmark suite for computational grids. It is based on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks (NPB) and is called the NAS Grid Benchmarks (NGB). NGB problems are presented as data flow graphs encapsulating an instance of a slightly modified NPB task in each graph node, which communicates with other nodes by sending/receiving initialization data. Like NPB, NGB specifies several different classes (problem sizes). In this report we describe classes S, W, and A, and provide verification values for each. The implementor has the freedom to choose any language, grid environment, security model, fault tolerance/error correction mechanism, etc., as long as the resulting implementation passes the verification test and reports the turnaround time of the benchmark.

Evaluating Productivity Predictions Under Elevated CO2 Conditions: Multi-Model Benchmarking Across FACE Experiments

NASA Astrophysics Data System (ADS)

Cowdery, E.; Dietze, M.

2016-12-01

As atmospheric levels of carbon dioxide levels continue to increase, it is critical that terrestrial ecosystem models can accurately predict ecological responses to the changing environment. Current predictions of net primary productivity (NPP) in response to elevated atmospheric CO2 concentration are highly variable and contain a considerable amount of uncertainty.The Predictive Ecosystem Analyzer (PEcAn) is an informatics toolbox that wraps around an ecosystem model and can be used to help identify which factors drive uncertainty. We tested a suite of models (LPJ-GUESS, MAESPA, GDAY, CLM5, DALEC, ED2), which represent a range from low to high structural complexity, across a range of Free-Air CO2 Enrichment (FACE) experiments: the Kennedy Space Center Open Top Chamber Experiment, the Rhinelander FACE experiment, the Duke Forest FACE experiment and the Oak Ridge Experiment on CO2 Enrichment. These tests were implemented in a novel benchmarking workflow that is automated, repeatable, and generalized to incorporate different sites and ecological models. Observational data from the FACE experiments represent a first test of this flexible, extensible approach aimed at providing repeatable tests of model process representation.To identify and evaluate the assumptions causing inter-model differences we used PEcAn to perform model sensitivity and uncertainty analysis, not only to assess the components of NPP, but also to examine system processes such nutrient uptake and and water use. Combining the observed patterns of uncertainty between multiple models with results of the recent FACE-model data synthesis project (FACE-MDS) can help identify which processes need further study and additional data constraints. These findings can be used to inform future experimental design and in turn can provide informative starting point for data assimilation.
Energy benchmarking of commercial buildings: a low-cost pathway toward urban sustainability

NASA Astrophysics Data System (ADS)

Cox, Matt; Brown, Marilyn A.; Sun, Xiaojing

2013-09-01

US cities are beginning to experiment with a regulatory approach to address information failures in the real estate market by mandating the energy benchmarking of commercial buildings. Understanding how a commercial building uses energy has many benefits; for example, it helps building owners and tenants identify poor-performing buildings and subsystems and it enables high-performing buildings to achieve greater occupancy rates, rents, and property values. This paper estimates the possible impacts of a national energy benchmarking mandate through analysis chiefly utilizing the Georgia Tech version of the National Energy Modeling System (GT-NEMS). Correcting input discount rates results in a 4.0% reduction in projected energy consumption for seven major classes of equipment relative to the reference case forecast in 2020, rising to 8.7% in 2035. Thus, the official US energy forecasts appear to overestimate future energy consumption by underestimating investments in energy-efficient equipment. Further discount rate reductions spurred by benchmarking policies yield another 1.3-1.4% in energy savings in 2020, increasing to 2.2-2.4% in 2035. Benchmarking would increase the purchase of energy-efficient equipment, reducing energy bills, CO2 emissions, and conventional air pollution. Achieving comparable CO2 savings would require more than tripling existing US solar capacity. Our analysis suggests that nearly 90% of the energy saved by a national benchmarking policy would benefit metropolitan areas, and the policy’s benefits would outweigh its costs, both to the private sector and society broadly.
Analysis of students' assessments in middle school curriculum materials: Aiming precisely at benchmarks and standards

NASA Astrophysics Data System (ADS)

Stern, Luli

2002-11-01

Assessment influences every level of the education system and is one of the most crucial catalysts for reform in science curriculum and instruction. Teachers, administrators, and others who choose, assemble, or develop assessments face the difficulty of judging whether tasks are truly aligned with national or state standards and whether they are effective in revealing what students actually know. Project 2061 of the American Association for the Advancement of Science has developed and field-tested a procedure for analyzing curriculum materials, including their assessments, in terms of how well they are likely to contribute to the attainment of benchmarks and standards. With respect to assessment in curriculum materials, this procedure evaluates whether this assessment has the potential to reveal whether students have attained specific ideas in benchmarks and standards and whether information gained from students' responses can be used to inform subsequent instruction. Using this procedure, Project 2061 had produced a database of analytical reports on nine widely used science middle school curriculum materials. The analysis of assessments included in these materials shows that whereas currently available materials devote significant sections in their instruction to ideas included in national standards documents, students are typically not assessed on these ideas. The analysis results described in the report point to strengths and limitations of these widely used assessments and identify a range of good and poor assessment tasks that can shed light on important characteristics of good assessment.
The Zoo, Benchmarks & You: How To Reach the Oregon State Benchmarks with Zoo Resources.

ERIC Educational Resources Information Center

2002

This document aligns Oregon state educational benchmarks and standards with Oregon Zoo resources. Benchmark areas examined include English, mathematics, science, social studies, and career and life roles. Brief descriptions of the programs offered by the zoo are presented. (SOE)
Analyzing the BBOB results by means of benchmarking concepts.

PubMed

Mersmann, O; Preuss, M; Trautmann, H; Bischl, B; Weihs, C

2015-01-01

We present methods to answer two basic questions that arise when benchmarking optimization algorithms. The first one is: which algorithm is the "best" one? and the second one is: which algorithm should I use for my real-world problem? Both are connected and neither is easy to answer. We present a theoretical framework for designing and analyzing the raw data of such benchmark experiments. This represents a first step in answering the aforementioned questions. The 2009 and 2010 BBOB benchmark results are analyzed by means of this framework and we derive insight regarding the answers to the two questions. Furthermore, we discuss how to properly aggregate rankings from algorithm evaluations on individual problems into a consensus, its theoretical background and which common pitfalls should be avoided. Finally, we address the grouping of test problems into sets with similar optimizer rankings and investigate whether these are reflected by already proposed test problem characteristics, finding that this is not always the case.
Benchmarking transportation logistics practices for effective system planning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thrower, A.W.; Dravo, A.N.; Keister, M.

2007-07-01

This paper presents preliminary findings of an Office of Civilian Radioactive Waste Management (OCRWM) benchmarking project to identify best practices for logistics enterprises. The results will help OCRWM's Office of Logistics Management (OLM) design and implement a system to move spent nuclear fuel (SNF) and high-level radioactive waste (HLW) to the Yucca Mountain repository for disposal when that facility is licensed and built. This report suggests topics for additional study. The project team looked at three Federal radioactive material logistics operations that are widely viewed to be successful: (1) the Waste Isolation Pilot Plant (WIPP) in Carlsbad, New Mexico; (2)more » the Naval Nuclear Propulsion Program (NNPP); and (3) domestic and foreign research reactor (FRR) SNF acceptance programs. (authors)« less
A Seafloor Benchmark for 3-dimensional Geodesy

NASA Astrophysics Data System (ADS)

Chadwell, C. D.; Webb, S. C.; Nooner, S. L.

2014-12-01

We have developed an inexpensive, permanent seafloor benchmark to increase the longevity of seafloor geodetic measurements. The benchmark provides a physical tie to the sea floor lasting for decades (perhaps longer) on which geodetic sensors can be repeatedly placed and removed with millimeter resolution. Global coordinates estimated with seafloor geodetic techniques will remain attached to the benchmark allowing for the interchange of sensors as they fail or become obsolete, or for the sensors to be removed and used elsewhere, all the while maintaining a coherent series of positions referenced to the benchmark. The benchmark has been designed to free fall from the sea surface with transponders attached. The transponder can be recalled via an acoustic command sent from the surface to release from the benchmark and freely float to the sea surface for recovery. The duration of the sensor attachment to the benchmark will last from a few days to a few years depending on the specific needs of the experiment. The recovered sensors are then available to be reused at other locations, or again at the same site in the future. Three pins on the sensor frame mate precisely and unambiguously with three grooves on the benchmark. To reoccupy a benchmark a Remotely Operated Vehicle (ROV) uses its manipulator arm to place the sensor pins into the benchmark grooves. In June 2014 we deployed four benchmarks offshore central Oregon. We used the ROV Jason to successfully demonstrate the removal and replacement of packages onto the benchmark. We will show the benchmark design and its operational capabilities. Presently models of megathrust slip within the Cascadia Subduction Zone (CSZ) are mostly constrained by the sub-aerial GPS vectors from the Plate Boundary Observatory, a part of Earthscope. More long-lived seafloor geodetic measures are needed to better understand the earthquake and tsunami risk associated with a large rupture of the thrust fault within the Cascadia subduction zone
The Concepts "Benchmarks and Benchmarking" Used in Education Planning: Teacher Education as Example

ERIC Educational Resources Information Center

Steyn, H. J.

2015-01-01

Planning in education is a structured activity that includes several phases and steps that take into account several kinds of information (Steyn, Steyn, De Waal & Wolhuter, 2002: 146). One of the sets of information that are usually considered is the (so-called) "benchmarks" and "benchmarking" regarding the focus of a…
Benchmarking: A strategic overview of a key management tool

Treesearch

Chris Leclair

1999-01-01

Benchmarking is a continuous, systematic process for evaluating the products, services, and work processes of organizations in an effort to identifY best practices for possible adoption in support of the objectives of enhanced activity service delivery and organizational effectiveness.
Chemotherapy Extravasation: Establishing a National Benchmark for Incidence Among Cancer Centers.

PubMed

Jackson-Rose, Jeannette; Del Monte, Judith; Groman, Adrienne; Dial, Linda S; Atwell, Leah; Graham, Judy; O'Neil Semler, Rosemary; O'Sullivan, Maryellen; Truini-Pittman, Lisa; Cunningham, Terri A; Roman-Fischetti, Lisa; Costantinou, Eileen; Rimkus, Chris; Banavage, Adrienne J; Dietz, Barbara; Colussi, Carol J; Catania, Kimberly; Wasko, Michelle; Schreffler, Kevin A; West, Colleen; Siefert, Mary Lou; Rice, Robert David

2017-08-01

Given the high-risk nature and nurse sensitivity of chemotherapy infusion and extravasation prevention, as well as the absence of an industry benchmark, a group of nurses studied oncology-specific nursing-sensitive indicators.  . The purpose was to establish a benchmark for the incidence of chemotherapy extravasation with vesicants, irritants, and irritants with vesicant potential. . Infusions with actual or suspected extravasations of vesicant and irritant chemotherapies were evaluated. Extravasation events were reviewed by type of agent, occurrence by drug category, route of administration, level of harm, follow-up, and patient referrals to surgical consultation. . A total of 739,812 infusions were evaluated, with 673 extravasation events identified. Incidence for all extravasation events was 0.09%.
GEAR UP Aspirations Project Evaluation

ERIC Educational Resources Information Center

Trimble, Brad A.

2013-01-01

The purpose of this study was to conduct a formative evaluation of the first two years of the Gaining Early Awareness and Readiness for Undergraduate Programs (GEAR UP) Aspirations Project (Aspirations) using a Context, Input, Process, and Product (CIPP) model so as to gain an in-depth understanding of the project during the middle school…
Translational benchmark risk analysis

PubMed Central

Piegorsch, Walter W.

2010-01-01

Translational development – in the sense of translating a mature methodology from one area of application to another, evolving area – is discussed for the use of benchmark doses in quantitative risk assessment. Illustrations are presented with traditional applications of the benchmark paradigm in biology and toxicology, and also with risk endpoints that differ from traditional toxicological archetypes. It is seen that the benchmark approach can apply to a diverse spectrum of risk management settings. This suggests a promising future for this important risk-analytic tool. Extensions of the method to a wider variety of applications represent a significant opportunity for enhancing environmental, biomedical, industrial, and socio-economic risk assessments. PMID:20953283
Encoding color information for visual tracking: Algorithms and benchmark.

PubMed

Liang, Pengpeng; Blasch, Erik; Ling, Haibin

2015-12-01

While color information is known to provide rich discriminative clues for visual inference, most modern visual trackers limit themselves to the grayscale realm. Despite recent efforts to integrate color in tracking, there is a lack of comprehensive understanding of the role color information can play. In this paper, we attack this problem by conducting a systematic study from both the algorithm and benchmark perspectives. On the algorithm side, we comprehensively encode 10 chromatic models into 16 carefully selected state-of-the-art visual trackers. On the benchmark side, we compile a large set of 128 color sequences with ground truth and challenge factor annotations (e.g., occlusion). A thorough evaluation is conducted by running all the color-encoded trackers, together with two recently proposed color trackers. A further validation is conducted on an RGBD tracking benchmark. The results clearly show the benefit of encoding color information for tracking. We also perform detailed analysis on several issues, including the behavior of various combinations between color model and visual tracker, the degree of difficulty of each sequence for tracking, and how different challenge factors affect the tracking performance. We expect the study to provide the guidance, motivation, and benchmark for future work on encoding color in visual tracking.
ARL Physics Web Pages: An Evaluation by Established, Transitional and Emerging Benchmarks.

ERIC Educational Resources Information Center

Duffy, Jane C.

2002-01-01

Provides an overview of characteristics among Association of Research Libraries (ARL) physics Web pages. Examines current academic Web literature and from that develops six benchmarks to measure physics Web pages: ease of navigation; logic of presentation; representation of all forms of information; engagement of the discipline; interactivity of…
A Privacy-Preserving Platform for User-Centric Quantitative Benchmarking

NASA Astrophysics Data System (ADS)

Herrmann, Dominik; Scheuer, Florian; Feustel, Philipp; Nowey, Thomas; Federrath, Hannes

We propose a centralised platform for quantitative benchmarking of key performance indicators (KPI) among mutually distrustful organisations. Our platform offers users the opportunity to request an ad-hoc benchmarking for a specific KPI within a peer group of their choice. Architecture and protocol are designed to provide anonymity to its users and to hide the sensitive KPI values from other clients and the central server. To this end, we integrate user-centric peer group formation, exchangeable secure multi-party computation protocols, short-lived ephemeral key pairs as pseudonyms, and attribute certificates. We show by empirical evaluation of a prototype that the performance is acceptable for reasonably sized peer groups.
Stakeholder approach for evaluating organizational change projects.

PubMed

Peltokorpi, Antti; Alho, Antti; Kujala, Jaakko; Aitamurto, Johanna; Parvinen, Petri

2008-01-01

This paper aims to create a model for evaluating organizational change initiatives from a stakeholder resistance viewpoint. The paper presents a model to evaluate change projects and their expected benefits. Factors affecting the challenge to implement change were defined based on stakeholder theory literature. The authors test the model's practical validity for screening change initiatives to improve operating room productivity. Change initiatives can be evaluated using six factors: the effect of the planned intervention on stakeholders' actions and position; stakeholders' capability to influence the project's implementation; motivation to participate; capability to change; change complexity; and management capability. The presented model's generalizability should be explored by filtering presented factors through a larger number of historical cases operating in different healthcare contexts. The link between stakeholders, the change challenge and the outcomes of change projects needs to be empirically tested. The proposed model can be used to prioritize change projects, manage stakeholder resistance and establish a better organizational and professional competence for managing healthcare organization change projects. New insights into existing stakeholder-related understanding of change project successes are provided.
Fingerprinting sea-level variations in response to continental ice loss: a benchmark exercise

NASA Astrophysics Data System (ADS)

Barletta, Valentina R.; Spada, Giorgio; Riva, Riccardo E. M.; James, Thomas S.; Simon, Karen M.; van der Wal, Wouter; Martinec, Zdenek; Klemann, Volker; Olsson, Per-Anders; Hagedoorn, Jan; Stocchi, Paolo; Vermeersen, Bert

2013-04-01

Understanding the response of the Earth to the waxing and waning ice sheets is crucial in various contexts, ranging from the interpretation of modern satellite geodetic measurements to the projections of future sea level trends in response to climate change. All the processes accompanying Glacial Isostatic Adjustment (GIA) can be described solving the so-called Sea Level Equation (SLE), an integral equation that accounts for the interactions between the ice sheets, the solid Earth, and the oceans. Modern approaches to the SLE are based on various techniques that range from purely analytical formulations to fully numerical methods. Here we present the results of a benchmark exercise of independently developed codes designed to solve the SLE. The study involves predictions of current sea level changes due to present-day ice mass loss. In spite of the differences in the methods employed, the comparison shows that a significant number of GIA modellers can reproduce their sea-level computations within 2% for well defined, large-scale present-day ice mass changes. Smaller and more detailed loads need further and dedicated benchmarking and high resolution computation. This study shows how the details of the implementation and the inputs specifications are an important, and often underappreciated, aspect. Hence this represents a step toward the assessment of reliability of sea level projections obtained with benchmarked SLE codes.
Accelerating progress in Artificial General Intelligence: Choosing a benchmark for natural world interaction

NASA Astrophysics Data System (ADS)

Rohrer, Brandon

2010-12-01

Measuring progress in the field of Artificial General Intelligence (AGI) can be difficult without commonly accepted methods of evaluation. An AGI benchmark would allow evaluation and comparison of the many computational intelligence algorithms that have been developed. In this paper I propose that a benchmark for natural world interaction would possess seven key characteristics: fitness, breadth, specificity, low cost, simplicity, range, and task focus. I also outline two benchmark examples that meet most of these criteria. In the first, the direction task, a human coach directs a machine to perform a novel task in an unfamiliar environment. The direction task is extremely broad, but may be idealistic. In the second, the AGI battery, AGI candidates are evaluated based on their performance on a collection of more specific tasks. The AGI battery is designed to be appropriate to the capabilities of currently existing systems. Both the direction task and the AGI battery would require further definition before implementing. The paper concludes with a description of a task that might be included in the AGI battery: the search and retrieve task.
Project SEARCH UK - Evaluating Its Employment Outcomes.

PubMed

Kaehne, Axel

2016-11-01

The study reports the findings of an evaluation of Project SEARCH UK. The programme develops internships for young people with intellectual disabilities who are about to leave school or college. The aim of the evaluation was to investigate at what rate Project SEARCH provided employment opportunities to participants. The evaluation obtained data from all sites operational in the UK at the time of evaluation (n = 17) and analysed employment outcomes. Data were available for 315 young people (n = 315) in the programme and pay and other employment related data were available for a subsample. The results of the analysis suggest that Project SEARCH achieves on average employment rates of around 50 per cent. Project SEARCH UK represents a valuable addition to the supported employment provision in the UK. Its unique model should inform discussions around best practice in supported employment. Implications for other supported employment programmes are discussed. © 2015 John Wiley & Sons Ltd.
BENCHMARK DOSE TECHNICAL GUIDANCE DOCUMENT ...

EPA Pesticide Factsheets

The purpose of this document is to provide guidance for the Agency on the application of the benchmark dose approach in determining the point of departure (POD) for health effects data, whether a linear or nonlinear low dose extrapolation is used. The guidance includes discussion on computation of benchmark doses and benchmark concentrations (BMDs and BMCs) and their lower confidence limits, data requirements, dose-response analysis, and reporting requirements. This guidance is based on today's knowledge and understanding, and on experience gained in using this approach.

Issues in Benchmark Metric Selection

NASA Astrophysics Data System (ADS)

Crolotte, Alain

It is true that a metric can influence a benchmark but will esoteric metrics create more problems than they will solve? We answer this question affirmatively by examining the case of the TPC-D metric which used the much debated geometric mean for the single-stream test. We will show how a simple choice influenced the benchmark and its conduct and, to some extent, DBMS development. After examining other alternatives our conclusion is that the “real” measure for a decision-support benchmark is the arithmetic mean.
Benchmarking clinical photography services in the NHS.

PubMed

Arbon, Giles

2015-01-01

Benchmarking is used in services across the National Health Service (NHS) using various benchmarking programs. Clinical photography services do not have a program in place and services have to rely on ad hoc surveys of other services. A trial benchmarking exercise was undertaken with 13 services in NHS Trusts. This highlights valuable data and comparisons that can be used to benchmark and improve services throughout the profession.
Method and system for benchmarking computers

DOEpatents

Gustafson, John L.

1993-09-14

A testing system and method for benchmarking computer systems. The system includes a store containing a scalable set of tasks to be performed to produce a solution in ever-increasing degrees of resolution as a larger number of the tasks are performed. A timing and control module allots to each computer a fixed benchmarking interval in which to perform the stored tasks. Means are provided for determining, after completion of the benchmarking interval, the degree of progress through the scalable set of tasks and for producing a benchmarking rating relating to the degree of progress for each computer.
International health IT benchmarking: learning from cross-country comparisons.

PubMed

Zelmer, Jennifer; Ronchi, Elettra; Hyppönen, Hannele; Lupiáñez-Villanueva, Francisco; Codagnone, Cristiano; Nøhr, Christian; Huebner, Ursula; Fazzalari, Anne; Adler-Milstein, Julia

2017-03-01

To pilot benchmark measures of health information and communication technology (ICT) availability and use to facilitate cross-country learning. A prior Organization for Economic Cooperation and Development-led effort involving 30 countries selected and defined functionality-based measures for availability and use of electronic health records, health information exchange, personal health records, and telehealth. In this pilot, an Organization for Economic Cooperation and Development Working Group compiled results for 38 countries for a subset of measures with broad coverage using new and/or adapted country-specific or multinational surveys and other sources from 2012 to 2015. We also synthesized country learnings to inform future benchmarking. While electronic records are widely used to store and manage patient information at the point of care-all but 2 pilot countries reported use by at least half of primary care physicians; many had rates above 75%-patient information exchange across organizations/settings is less common. Large variations in the availability and use of telehealth and personal health records also exist. Pilot participation demonstrated interest in cross-national benchmarking. Using the most comparable measures available to date, it showed substantial diversity in health ICT availability and use in all domains. The project also identified methodological considerations (e.g., structural and health systems issues that can affect measurement) important for future comparisons. While health policies and priorities differ, many nations aim to increase access, quality, and/or efficiency of care through effective ICT use. By identifying variations and describing key contextual factors, benchmarking offers the potential to facilitate cross-national learning and accelerate the progress of individual countries. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Present Status and Extensions of the Monte Carlo Performance Benchmark

NASA Astrophysics Data System (ADS)

Hoogenboom, J. Eduard; Petrovic, Bojan; Martin, William R.

2014-06-01

The NEA Monte Carlo Performance benchmark started in 2011 aiming to monitor over the years the abilities to perform a full-size Monte Carlo reactor core calculation with a detailed power production for each fuel pin with axial distribution. This paper gives an overview of the contributed results thus far. It shows that reaching a statistical accuracy of 1 % for most of the small fuel zones requires about 100 billion neutron histories. The efficiency of parallel execution of Monte Carlo codes on a large number of processor cores shows clear limitations for computer clusters with common type computer nodes. However, using true supercomputers the speedup of parallel calculations is increasing up to large numbers of processor cores. More experience is needed from calculations on true supercomputers using large numbers of processors in order to predict if the requested calculations can be done in a short time. As the specifications of the reactor geometry for this benchmark test are well suited for further investigations of full-core Monte Carlo calculations and a need is felt for testing other issues than its computational performance, proposals are presented for extending the benchmark to a suite of benchmark problems for evaluating fission source convergence for a system with a high dominance ratio, for coupling with thermal-hydraulics calculations to evaluate the use of different temperatures and coolant densities and to study the correctness and effectiveness of burnup calculations. Moreover, other contemporary proposals for a full-core calculation with realistic geometry and material composition will be discussed.
Evaluating Performance of Highway Safety Projects

DOT National Transportation Integrated Search

2016-12-01

The purpose of this project was to investigate and document methods that the Idaho Transportation Department (ITD) and Local Highway Technical Assistance Council (LHTAC) can use to evaluate the performance of safety projects that have been implemente...
A study on operation efficiency evaluation based on firm's financial index and benchmark selection: take China Unicom as an example

NASA Astrophysics Data System (ADS)

Wu, Zu-guang; Tian, Zhan-jun; Liu, Hui; Huang, Rui; Zhu, Guo-hua

2009-07-01

Being the only listed telecom operators of A share market, China Unicom has always been attracted many institutional investors under the concept of 3G recent years,which itself is a great technical progress expectation.Do the institutional investors or the concept of technical progress have signficant effect on the improving of firm's operating efficiency?Though reviewing the documentary about operating efficiency we find that schoolars study this problem useing the regress analyzing based on traditional production function and data envelopment analysis(DEA) and financial index anayzing and marginal function and capital labor ratio coefficient etc. All the methods mainly based on macrodata. This paper we use the micro-data of company to evaluate the operating efficiency.Using factor analyzing based on financial index and comparing the factor score of three years from 2005 to 2007, we find that China Unicom's operating efficiency is under the averge level of benchmark corporates and has't improved under the concept of 3G from 2005 to 2007.In other words,institutional investor or the conception of technical progress expectation have faint effect on the changes of China Unicom's operating efficiency. Selecting benchmark corporates as post to evaluate the operating efficiency is a characteristic of this method ,which is basicallly sipmly and direct.This method is suit for the operation efficiency evaluation of agriculture listed companies because agriculture listed also face technical progress and marketing concept such as tax-free etc.
Real-case benchmark for flow and tracer transport in the fractured rock

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hokr, M.; Shao, H.; Gardner, W. P.

The paper is intended to define a benchmark problem related to groundwater flow and natural tracer transport using observations of discharge and isotopic tracers in fractured, crystalline rock. Three numerical simulators: Flow123d, OpenGeoSys, and PFLOTRAN are compared. The data utilized in the project were collected in a water-supply tunnel in granite of the Jizera Mountains, Bedrichov, Czech Republic. The problem configuration combines subdomains of different dimensions, 3D continuum for hard-rock blocks or matrix and 2D features for fractures or fault zones, together with realistic boundary conditions for tunnel-controlled drainage. Steady-state and transient flow and a pulse injection tracer transport problemmore » are solved. The results confirm mostly consistent behavior of the codes. Both the codes Flow123d and OpenGeoSys with 3D–2D coupling implemented differ by several percent in most cases, which is appropriate to, e.g., effects of discrete unknown placing in the mesh. Some of the PFLOTRAN results differ more, which can be explained by effects of the dispersion tensor evaluation scheme and of the numerical diffusion. Here, the phenomenon can get stronger with fracture/matrix coupling and with parameter magnitude contrasts. Although the study was not aimed on inverse solution, the models were fit to the measured data approximately, demonstrating the intended real-case relevance of the benchmark.« less
Real-case benchmark for flow and tracer transport in the fractured rock

DOE PAGES

Hokr, M.; Shao, H.; Gardner, W. P.; ...

2016-09-19

The paper is intended to define a benchmark problem related to groundwater flow and natural tracer transport using observations of discharge and isotopic tracers in fractured, crystalline rock. Three numerical simulators: Flow123d, OpenGeoSys, and PFLOTRAN are compared. The data utilized in the project were collected in a water-supply tunnel in granite of the Jizera Mountains, Bedrichov, Czech Republic. The problem configuration combines subdomains of different dimensions, 3D continuum for hard-rock blocks or matrix and 2D features for fractures or fault zones, together with realistic boundary conditions for tunnel-controlled drainage. Steady-state and transient flow and a pulse injection tracer transport problemmore » are solved. The results confirm mostly consistent behavior of the codes. Both the codes Flow123d and OpenGeoSys with 3D–2D coupling implemented differ by several percent in most cases, which is appropriate to, e.g., effects of discrete unknown placing in the mesh. Some of the PFLOTRAN results differ more, which can be explained by effects of the dispersion tensor evaluation scheme and of the numerical diffusion. Here, the phenomenon can get stronger with fracture/matrix coupling and with parameter magnitude contrasts. Although the study was not aimed on inverse solution, the models were fit to the measured data approximately, demonstrating the intended real-case relevance of the benchmark.« less
Project SEARCH UK--Evaluating Its Employment Outcomes

ERIC Educational Resources Information Center

Kaehne, Axel

2016-01-01

Background: The study reports the findings of an evaluation of Project SEARCH UK. The programme develops internships for young people with intellectual disabilities who are about to leave school or college. The aim of the evaluation was to investigate at what rate Project SEARCH provided employment opportunities to participants. Methods: The…
School Safety Project: Product Evaluation, 1990-1991.

ERIC Educational Resources Information Center

Saginaw Public Schools, MI. Dept. of Evaluation Services.

A districtwide school safety project implemented in Saginaw, Michigan, in 1990-91, the third year of its operation, is evaluated in this report. The project is evaluated on the basis of the following objectives: employment and training of home-school liaison officers; establishment of an advisory council; development and implementation of…
Paediatric International Nursing Study: using person-centred key performance indicators to benchmark children's services.

PubMed

McCance, Tanya; Wilson, Val; Kornman, Kelly

2016-07-01

The aim of the Paediatric International Nursing Study was to explore the utility of key performance indicators in developing person-centred practice across a range of services provided to sick children. The objective addressed in this paper was evaluating the use of these indicators to benchmark services internationally. This study builds on primary research, which produced indicators that were considered novel both in terms of their positive orientation and use in generating data that privileges the patient voice. This study extends this research through wider testing on an international platform within paediatrics. The overall methodological approach was a realistic evaluation used to evaluate the implementation of the key performance indicators, which combined an integrated development and evaluation methodology. The study involved children's wards/hospitals in Australia (six sites across three states) and Europe (seven sites across four countries). Qualitative and quantitative methods were used during the implementation process, however, this paper reports the quantitative data only, which used survey, observations and documentary review. The findings demonstrate the quality of care being delivered to children and their families across different international sites. The benchmarking does, however, highlight some differences between paediatric and general hospitals, and between the different key performance indicators across all the sites. The findings support the use of the key performance indicators as a novel method to benchmark services internationally. Whilst the data collected across 20 paediatric sites suggest services are more similar than different, benchmarking illuminates variations that encourage a critical dialogue about what works and why. The transferability of the key performance indicators and measurement framework across different settings has significant implications for practice. The findings offer an approach to benchmarking and celebrating
5 CFR 470.317 - Project evaluation.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 5 Administrative Personnel 1 2010-01-01 2010-01-01 false Project evaluation. 470.317 Section 470.317 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT CIVIL SERVICE REGULATIONS PERSONNEL MANAGEMENT RESEARCH PROGRAMS AND DEMONSTRATIONS PROJECTS Regulatory Requirements Pertaining to Demonstration...
Evaluation of the Pool Critical Assembly Benchmark with Explicitly-Modeled Geometry using MCNP6

DOE PAGES

Kulesza, Joel A.; Martz, Roger Lee

2017-03-01

Despite being one of the most widely used benchmarks for qualifying light water reactor (LWR) radiation transport methods and data, no benchmark calculation of the Oak Ridge National Laboratory (ORNL) Pool Critical Assembly (PCA) pressure vessel wall benchmark facility (PVWBF) using MCNP6 with explicitly modeled core geometry exists. As such, this paper provides results for such an analysis. First, a criticality calculation is used to construct the fixed source term. Next, ADVANTG-generated variance reduction parameters are used within the final MCNP6 fixed source calculations. These calculations provide unadjusted dosimetry results using three sets of dosimetry reaction cross sections of varyingmore » ages (those packaged with MCNP6, from the IRDF-2002 multi-group library, and from the ACE-formatted IRDFF v1.05 library). These results are then compared to two different sets of measured reaction rates. The comparison agrees in an overall sense within 2% and on a specific reaction- and dosimetry location-basis within 5%. Except for the neptunium dosimetry, the individual foil raw calculation-to-experiment comparisons usually agree within 10% but is typically greater than unity. Finally, in the course of developing these calculations, geometry that has previously not been completely specified is provided herein for the convenience of future analysts.« less
Benchmarking initiatives in the water industry.

PubMed

Parena, R; Smeets, E

2001-01-01

Customer satisfaction and service care are every day pushing professionals in the water industry to seek to improve their performance, lowering costs and increasing the provided service level. Process Benchmarking is generally recognised as a systematic mechanism of comparing one's own utility with other utilities or businesses with the intent of self-improvement by adopting structures or methods used elsewhere. The IWA Task Force on Benchmarking, operating inside the Statistics and Economics Committee, has been committed to developing a general accepted concept of Process Benchmarking to support water decision-makers in addressing issues of efficiency. In a first step the Task Force disseminated among the Committee members a questionnaire focused on providing suggestions about the kind, the evolution degree and the main concepts of Benchmarking adopted in the represented Countries. A comparison among the guidelines adopted in The Netherlands and Scandinavia has recently challenged the Task Force in drafting a methodology for a worldwide process benchmarking in water industry. The paper provides a framework of the most interesting benchmarking experiences in the water sector and describes in detail both the final results of the survey and the methodology focused on identification of possible improvement areas.
Project Sell, Title VII: Final Evaluation 1970-1971.

ERIC Educational Resources Information Center

Condon, Elaine C.; And Others

This evaluative report consists of two parts. The first is a narrative report which represents a summary by the evaluation team and recommendations regarding project activities; the second part provides a statistical analysis of project achievements. Details are provided on evaluation techniques, staff, management, instructional materials,…
Harvard Project Physics Research and Evaluation Bibliography.

ERIC Educational Resources Information Center

Welch, Wayne W.

The articles listed were written as part of the research and evaluation program of Project Physics. Many articles are concerned with problems of educational research while others are devoted directly to the evaluation of Project Physics. The articles are listed in four categories according to the availability of reprints: (1) published articles…
Benchmarking--Measuring and Comparing for Continuous Improvement.

ERIC Educational Resources Information Center

Henczel, Sue

2002-01-01

Discussion of benchmarking focuses on the use of internal and external benchmarking by special librarians. Highlights include defining types of benchmarking; historical development; benefits, including efficiency, improved performance, increased competitiveness, and better decision making; problems, including inappropriate adaptation; developing a…
EPA's Benchmark Dose Modeling Software

EPA Science Inventory

The EPA developed the Benchmark Dose Software (BMDS) as a tool to help Agency risk assessors facilitate applying benchmark dose (BMD) method’s to EPA’s human health risk assessment (HHRA) documents. The application of BMD methods overcomes many well know limitations ...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 42 Public Health 4 2012-10-01 2012-10-01 false Benchmark health benefits coverage. 440.330 Section 440.330 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND HUMAN... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...

How to benchmark methods for structure-based virtual screening of large compound libraries.

PubMed

Christofferson, Andrew J; Huang, Niu

2012-01-01

Structure-based virtual screening is a useful computational technique for ligand discovery. To systematically evaluate different docking approaches, it is important to have a consistent benchmarking protocol that is both relevant and unbiased. Here, we describe the designing of a benchmarking data set for docking screen assessment, a standard docking screening process, and the analysis and presentation of the enrichment of annotated ligands among a background decoy database.
Workforce development and effective evaluation of projects.

PubMed

Dickerson, Claire; Green, Tess; Blass, Eddie

The success of a project or programme is typically determined in relation to outputs. However, there is a commitment among UK public services to spending public funds efficiently and on activities that provide the greatest benefit to society. Skills for Health recognised the need for a tool to manage the complex process of evaluating project benefits. An integrated evaluation framework was developed to help practitioners identify, describe, measure and evaluate the benefits of workforce development projects. Practitioners tested the framework on projects within three NHS trusts and provided valuable feedback to support its development. The prospective approach taken to identify benefits and collect baseline data to support evaluation was positively received and the clarity and completeness of the framework, as well as the relevance of the questions, were commended. Users reported that the framework was difficult to complete; an online version could be developed, which might help to improve usability. Effective implementation of this approach will depend on the quality and usability of the framework, the willingness of organisations to implement it, and the presence or establishment of an effective change management culture.
The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS).

PubMed

Menze, Bjoern H; Jakab, Andras; Bauer, Stefan; Kalpathy-Cramer, Jayashree; Farahani, Keyvan; Kirby, Justin; Burren, Yuliya; Porz, Nicole; Slotboom, Johannes; Wiest, Roland; Lanczi, Levente; Gerstner, Elizabeth; Weber, Marc-André; Arbel, Tal; Avants, Brian B; Ayache, Nicholas; Buendia, Patricia; Collins, D Louis; Cordier, Nicolas; Corso, Jason J; Criminisi, Antonio; Das, Tilak; Delingette, Hervé; Demiralp, Çağatay; Durst, Christopher R; Dojat, Michel; Doyle, Senan; Festa, Joana; Forbes, Florence; Geremia, Ezequiel; Glocker, Ben; Golland, Polina; Guo, Xiaotao; Hamamci, Andac; Iftekharuddin, Khan M; Jena, Raj; John, Nigel M; Konukoglu, Ender; Lashkari, Danial; Mariz, José Antonió; Meier, Raphael; Pereira, Sérgio; Precup, Doina; Price, Stephen J; Raviv, Tammy Riklin; Reza, Syed M S; Ryan, Michael; Sarikaya, Duygu; Schwartz, Lawrence; Shin, Hoo-Chang; Shotton, Jamie; Silva, Carlos A; Sousa, Nuno; Subbanna, Nagesh K; Szekely, Gabor; Taylor, Thomas J; Thomas, Owen M; Tustison, Nicholas J; Unal, Gozde; Vasseur, Flor; Wintermark, Max; Ye, Dong Hye; Zhao, Liang; Zhao, Binsheng; Zikic, Darko; Prastawa, Marcel; Reyes, Mauricio; Van Leemput, Koen

2015-10-01

In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients-manually annotated by up to four raters-and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%-85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.
Benchmarking gate-based quantum computers

NASA Astrophysics Data System (ADS)

Michielsen, Kristel; Nocon, Madita; Willsch, Dennis; Jin, Fengping; Lippert, Thomas; De Raedt, Hans

2017-11-01

With the advent of public access to small gate-based quantum processors, it becomes necessary to develop a benchmarking methodology such that independent researchers can validate the operation of these processors. We explore the usefulness of a number of simple quantum circuits as benchmarks for gate-based quantum computing devices and show that circuits performing identity operations are very simple, scalable and sensitive to gate errors and are therefore very well suited for this task. We illustrate the procedure by presenting benchmark results for the IBM Quantum Experience, a cloud-based platform for gate-based quantum computing.
A high-fidelity airbus benchmark for system fault detection and isolation and flight control law clearance

NASA Astrophysics Data System (ADS)

Goupil, Ph.; Puyou, G.

2013-12-01

This paper presents a high-fidelity generic twin engine civil aircraft model developed by Airbus for advanced flight control system research. The main features of this benchmark are described to make the reader aware of the model complexity and representativeness. It is a complete representation including the nonlinear rigid-body aircraft model with a full set of control surfaces, actuator models, sensor models, flight control laws (FCL), and pilot inputs. Two applications of this benchmark in the framework of European projects are presented: FCL clearance using optimization and advanced fault detection and diagnosis (FDD).
Overview of Experiments for Physics of Fast Reactors from the International Handbooks of Evaluated Criticality Safety Benchmark Experiments and Evaluated Reactor Physics Benchmark Experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, J. D.; Briggs, J. B.; Gulliford, J.

Overview of Experiments to Study the Physics of Fast Reactors Represented in the International Directories of Critical and Reactor Experiments John D. Bess Idaho National Laboratory Jim Gulliford, Tatiana Ivanova Nuclear Energy Agency of the Organisation for Economic Cooperation and Development E.V.Rozhikhin, M.Yu.Sem?nov, A.M.Tsibulya Institute of Physics and Power Engineering The study the physics of fast reactors traditionally used the experiments presented in the manual labor of the Working Group on Evaluation of sections CSEWG (ENDF-202) issued by the Brookhaven National Laboratory in 1974. This handbook presents simplified homogeneous model experiments with relevant experimental data, as amended. The Nuclear Energymore » Agency of the Organization for Economic Cooperation and Development coordinates the activities of two international projects on the collection, evaluation and documentation of experimental data - the International Project on the assessment of critical experiments (1994) and the International Project on the assessment of reactor experiments (since 2005). The result of the activities of these projects are replenished every year, an international directory of critical (ICSBEP Handbook) and reactor (IRPhEP Handbook) experiments. The handbooks present detailed models of experiments with minimal amendments. Such models are of particular interest in terms of the settlements modern programs. The directories contain a large number of experiments which are suitable for the study of physics of fast reactors. Many of these experiments were performed at specialized critical stands, such as BFS (Russia), ZPR and ZPPR (USA), the ZEBRA (UK) and the experimental reactor JOYO (Japan), FFTF (USA). Other experiments, such as compact metal assembly, is also of interest in terms of the physics of fast reactors, they have been carried out on the universal critical stands in Russian institutes (VNIITF and VNIIEF) and the US (LANL, LLNL, and others.). Also worth
First benchmark of the Unstructured Grid Adaptation Working Group

NASA Technical Reports Server (NTRS)

Ibanez, Daniel; Barral, Nicolas; Krakos, Joshua; Loseille, Adrien; Michal, Todd; Park, Mike

2017-01-01

Unstructured grid adaptation is a technology that holds the potential to improve the automation and accuracy of computational fluid dynamics and other computational disciplines. Difficulty producing the highly anisotropic elements necessary for simulation on complex curved geometries that satisfies a resolution request has limited this technology's widespread adoption. The Unstructured Grid Adaptation Working Group is an open gathering of researchers working on adapting simplicial meshes to conform to a metric field. Current members span a wide range of institutions including academia, industry, and national laboratories. The purpose of this group is to create a common basis for understanding and improving mesh adaptation. We present our first major contribution: a common set of benchmark cases, including input meshes and analytic metric specifications, that are publicly available to be used for evaluating any mesh adaptation code. We also present the results of several existing codes on these benchmark cases, to illustrate their utility in identifying key challenges common to all codes and important differences between available codes. Future directions are defined to expand this benchmark to mature the technology necessary to impact practical simulation workflows.
Machine characterization and benchmark performance prediction

NASA Technical Reports Server (NTRS)

Saavedra-Barrera, Rafael H.

1988-01-01

From runs of standard benchmarks or benchmark suites, it is not possible to characterize the machine nor to predict the run time of other benchmarks which have not been run. A new approach to benchmarking and machine characterization is reported. The creation and use of a machine analyzer is described, which measures the performance of a given machine on FORTRAN source language constructs. The machine analyzer yields a set of parameters which characterize the machine and spotlight its strong and weak points. Also described is a program analyzer, which analyzes FORTRAN programs and determines the frequency of execution of each of the same set of source language operations. It is then shown that by combining a machine characterization and a program characterization, we are able to predict with good accuracy the run time of a given benchmark on a given machine. Characterizations are provided for the Cray-X-MP/48, Cyber 205, IBM 3090/200, Amdahl 5840, Convex C-1, VAX 8600, VAX 11/785, VAX 11/780, SUN 3/50, and IBM RT-PC/125, and for the following benchmark programs or suites: Los Alamos (BMK8A1), Baskett, Linpack, Livermore Loops, Madelbrot Set, NAS Kernels, Shell Sort, Smith, Whetstone and Sieve of Erathostenes.
Proposed re-evaluation of the 154Eu thermal ( n, γ) capture cross-section based on spent fuel benchmarking studies

DOE PAGES

Skutnik, Steven E.

2016-09-22

154Eu is a nuclide of considerable importance to both non-destructive measurements of used nuclear fuel assembly burnup as well as for calculating the radiation source term for used fuel storage and transportation. But, recent evidence from code validation studies of spent fuel benchmarks have revealed evidence of a systemic bias in predicted 154Eu inventories when using ENDF/B-VII.0 and ENDF/B-VII.1 nuclear data libraries, wherein Eu-154 is consistently over-predicted on the order of 10% or more. Further, this bias is found to correlate with sample burnup, resulting in a larger departure from experimental measurements for higher sample burnups. Here, the bias in Eu-154 is characterized across eleven spent fuel destructive assay benchmarks from five different assemblies. Based on these studies, possible amendments to the ENDF/B-VII.0 and VII.1 evaluations of the 154Eu (n,γ) 155Eu are explored. By amending the location of the first resolved resonance for the 154Eu radiative capture cross-section (centered at 0.195 eV in ENDF/B-VII.0 and VII.1) to 0.188 eV and adjusting the neutron capture width proportional tomore » $$\\sqrt1/E$$, the amended cross-section evaluation was found to reduce the bias in predicted 154Eu inventories by approximately 5–7%. And while the amended capture cross-section still results in a residual over-prediction of 154Eu (ranging from 2% to 9%), the effect is substantially attenuated compared with the nominal ENDF/B-VII.0 and VII.1 evaluations.« less
Proposed re-evaluation of the 154Eu thermal ( n, γ) capture cross-section based on spent fuel benchmarking studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Skutnik, Steven E.

154Eu is a nuclide of considerable importance to both non-destructive measurements of used nuclear fuel assembly burnup as well as for calculating the radiation source term for used fuel storage and transportation. But, recent evidence from code validation studies of spent fuel benchmarks have revealed evidence of a systemic bias in predicted 154Eu inventories when using ENDF/B-VII.0 and ENDF/B-VII.1 nuclear data libraries, wherein Eu-154 is consistently over-predicted on the order of 10% or more. Further, this bias is found to correlate with sample burnup, resulting in a larger departure from experimental measurements for higher sample burnups. Here, the bias in Eu-154 is characterized across eleven spent fuel destructive assay benchmarks from five different assemblies. Based on these studies, possible amendments to the ENDF/B-VII.0 and VII.1 evaluations of the 154Eu (n,γ) 155Eu are explored. By amending the location of the first resolved resonance for the 154Eu radiative capture cross-section (centered at 0.195 eV in ENDF/B-VII.0 and VII.1) to 0.188 eV and adjusting the neutron capture width proportional tomore » $$\\sqrt1/E$$, the amended cross-section evaluation was found to reduce the bias in predicted 154Eu inventories by approximately 5–7%. And while the amended capture cross-section still results in a residual over-prediction of 154Eu (ranging from 2% to 9%), the effect is substantially attenuated compared with the nominal ENDF/B-VII.0 and VII.1 evaluations.« less
Internal Benchmarking for Institutional Effectiveness

ERIC Educational Resources Information Center

Ronco, Sharron L.

2012-01-01

Internal benchmarking is an established practice in business and industry for identifying best in-house practices and disseminating the knowledge about those practices to other groups in the organization. Internal benchmarking can be done with structures, processes, outcomes, or even individuals. In colleges or universities with multicampuses or a…
A benchmark for vehicle detection on wide area motion imagery

NASA Astrophysics Data System (ADS)

Catrambone, Joseph; Amzovski, Ismail; Liang, Pengpeng; Blasch, Erik; Sheaff, Carolyn; Wang, Zhonghai; Chen, Genshe; Ling, Haibin

2015-05-01

Wide area motion imagery (WAMI) has been attracting an increased amount of research attention due to its large spatial and temporal coverage. An important application includes moving target analysis, where vehicle detection is often one of the first steps before advanced activity analysis. While there exist many vehicle detection algorithms, a thorough evaluation of them on WAMI data still remains a challenge mainly due to the lack of an appropriate benchmark data set. In this paper, we address a research need by presenting a new benchmark for wide area motion imagery vehicle detection data. The WAMI benchmark is based on the recently available Wright-Patterson Air Force Base (WPAFB09) dataset and the Temple Resolved Uncertainty Target History (TRUTH) associated target annotation. Trajectory annotations were provided in the original release of the WPAFB09 dataset, but detailed vehicle annotations were not available with the dataset. In addition, annotations of static vehicles, e.g., in parking lots, are also not identified in the original release. Addressing these issues, we re-annotated the whole dataset with detailed information for each vehicle, including not only a target's location, but also its pose and size. The annotated WAMI data set should be useful to community for a common benchmark to compare WAMI detection, tracking, and identification methods.
Nanomagnet Logic: Architectures, design, and benchmarking

NASA Astrophysics Data System (ADS)

Kurtz, Steven J.

Nanomagnet Logic (NML) is an emerging technology being studied as a possible replacement or supplementary device for Complimentary Metal-Oxide-Semiconductor (CMOS) Field-Effect Transistors (FET) by the year 2020. NML devices offer numerous potential advantages including: low energy operation, steady state non-volatility, radiation hardness and a clear path to fabrication and integration with CMOS. However, maintaining both low-energy operation and non-volatility while scaling from the device to the architectural level is non-trivial as (i) nearest neighbor interactions within NML circuits complicate the modeling of ensemble nanomagnet behavior and (ii) the energy intensive clock structures required for re-evaluation and NML's relatively high latency challenge its ability to offer system-level performance wins against other emerging nanotechnologies. Thus, further research efforts are required to model more complex circuits while also identifying circuit design techniques that balance low-energy operation with steady state non-volatility. In addition, further work is needed to design and model low-power on-chip clocks while simultaneously identifying application spaces where NML systems (including clock overhead) offer sufficient energy savings to merit their inclusion in future processors. This dissertation presents research advancing the understanding and modeling of NML at all levels including devices, circuits, and line clock structures while also benchmarking NML against both scaled CMOS and tunneling FETs (TFET) devices. This is accomplished through the development of design tools and methodologies for (i) quantifying both energy and stability in NML circuits and (ii) evaluating line-clocked NML system performance. The application of these newly developed tools improves the understanding of ideal design criteria (i.e., magnet size, clock wire geometry, etc.) for NML architectures. Finally, the system-level performance evaluation tool offers the ability to
Utilizing Benchmarking to Study the Effectiveness of Parent-Child Interaction Therapy Implemented in a Community Setting

ERIC Educational Resources Information Center

Self-Brown, Shannon; Valente, Jessica R.; Wild, Robert C.; Whitaker, Daniel J.; Galanter, Rachel; Dorsey, Shannon; Stanley, Jenelle

2012-01-01

Benchmarking is a program evaluation approach that can be used to study whether the outcomes of parents/children who participate in an evidence-based program in the community approximate the outcomes found in randomized trials. This paper presents a case illustration using benchmarking methodology to examine a community implementation of…
HANFORD DST THERMAL & SEISMIC PROJECT ANSYS BENCHMARK ANALYSIS OF SEISMIC INDUCED FLUID STRUCTURE INTERACTION IN A HANFORD DOUBLE SHELL PRIMARY TANK

DOE Office of Scientific and Technical Information (OSTI.GOV)

MACKEY, T.C.

M&D Professional Services, Inc. (M&D) is under subcontract to Pacific Northwest National Laboratories (PNNL) to perform seismic analysis of the Hanford Site Double-Shell Tanks (DSTs) in support of a project entitled ''Double-Shell Tank (DSV Integrity Project-DST Thermal and Seismic Analyses)''. The overall scope of the project is to complete an up-to-date comprehensive analysis of record of the DST System at Hanford in support of Tri-Party Agreement Milestone M-48-14. The work described herein was performed in support of the seismic analysis of the DSTs. The thermal and operating loads analysis of the DSTs is documented in Rinker et al. (2004). Themore » overall seismic analysis of the DSTs is being performed with the general-purpose finite element code ANSYS. The overall model used for the seismic analysis of the DSTs includes the DST structure, the contained waste, and the surrounding soil. The seismic analysis of the DSTs must address the fluid-structure interaction behavior and sloshing response of the primary tank and contained liquid. ANSYS has demonstrated capabilities for structural analysis, but the capabilities and limitations of ANSYS to perform fluid-structure interaction are less well understood. The purpose of this study is to demonstrate the capabilities and investigate the limitations of ANSYS for performing a fluid-structure interaction analysis of the primary tank and contained waste. To this end, the ANSYS solutions are benchmarked against theoretical solutions appearing in BNL 1995, when such theoretical solutions exist. When theoretical solutions were not available, comparisons were made to theoretical solutions of similar problems and to the results from Dytran simulations. The capabilities and limitations of the finite element code Dytran for performing a fluid-structure interaction analysis of the primary tank and contained waste were explored in a parallel investigation (Abatt 2006). In conjunction with the results of the global ANSYS
Validation and Comparison of 2D and 3D Codes for Nearshore Motion of Long Waves Using Benchmark Problems

NASA Astrophysics Data System (ADS)

Velioǧlu, Deniz; Cevdet Yalçıner, Ahmet; Zaytsev, Andrey

2016-04-01

Tsunamis are huge waves with long wave periods and wave lengths that can cause great devastation and loss of life when they strike a coast. The interest in experimental and numerical modeling of tsunami propagation and inundation increased considerably after the 2011 Great East Japan earthquake. In this study, two numerical codes, FLOW 3D and NAMI DANCE, that analyze tsunami propagation and inundation patterns are considered. Flow 3D simulates linear and nonlinear propagating surface waves as well as long waves by solving three-dimensional Navier-Stokes (3D-NS) equations. NAMI DANCE uses finite difference computational method to solve 2D depth-averaged linear and nonlinear forms of shallow water equations (NSWE) in long wave problems, specifically tsunamis. In order to validate these two codes and analyze the differences between 3D-NS and 2D depth-averaged NSWE equations, two benchmark problems are applied. One benchmark problem investigates the runup of long waves over a complex 3D beach. The experimental setup is a 1:400 scale model of Monai Valley located on the west coast of Okushiri Island, Japan. Other benchmark problem is discussed in 2015 National Tsunami Hazard Mitigation Program (NTHMP) Annual meeting in Portland, USA. It is a field dataset, recording the Japan 2011 tsunami in Hilo Harbor, Hawaii. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and benchmark data. The differences between 3D-NS and 2D depth-averaged NSWE equations are highlighted. All results are presented with discussions and comparisons. Acknowledgements: Partial support by Japan-Turkey Joint Research Project by JICA on earthquakes and tsunamis in Marmara Region (JICA SATREPS - MarDiM Project), 603839 ASTARTE Project of EU, UDAP-C-12-14 project of AFAD Turkey, 108Y227, 113M556 and 213M534 projects of TUBITAK Turkey, RAPSODI (CONCERT_Dis-021) of CONCERT
Evaluation on Collaborative Satisfaction for Project Management Team in Integrated Project Delivery Mode

NASA Astrophysics Data System (ADS)

Zhang, L.; Li, Y.; Wu, Q.

2013-05-01

Integrated Project Delivery (IPD) is a newly-developed project delivery approach for construction projects, and the level of collaboration of project management team is crucial to the success of its implementation. Existing research has shown that collaborative satisfaction is one of the key indicators of team collaboration. By reviewing the literature on team collaborative satisfaction and taking into consideration the characteristics of IPD projects, this paper summarizes the factors that influence collaborative satisfaction of IPD project management team. Based on these factors, this research develops a fuzzy linguistic method to effectively evaluate the level of team collaborative satisfaction, in which the authors adopted the 2-tuple linguistic variables and 2-tuple linguistic hybrid average operators to enhance the objectivity and accuracy of the evaluation. The paper demonstrates the practicality and effectiveness of the method through carrying out a case study with the method.
Regional restoration benchmarks for Acropora cervicornis

NASA Astrophysics Data System (ADS)

Schopmeyer, Stephanie A.; Lirman, Diego; Bartels, Erich; Gilliam, David S.; Goergen, Elizabeth A.; Griffin, Sean P.; Johnson, Meaghan E.; Lustic, Caitlin; Maxwell, Kerry; Walter, Cory S.

2017-12-01

Coral gardening plays an important role in the recovery of depleted populations of threatened Acropora cervicornis in the Caribbean. Over the past decade, high survival coupled with fast growth of in situ nursery corals have allowed practitioners to create healthy and genotypically diverse nursery stocks. Currently, thousands of corals are propagated and outplanted onto degraded reefs on a yearly basis, representing a substantial increase in the abundance, biomass, and overall footprint of A. cervicornis. Here, we combined an extensive dataset collected by restoration practitioners to document early (1-2 yr) restoration success metrics in Florida and Puerto Rico, USA. By reporting region-specific data on the impacts of fragment collection on donor colonies, survivorship and productivity of nursery corals, and survivorship and productivity of outplanted corals during normal conditions, we provide the basis for a stop-light indicator framework for new or existing restoration programs to evaluate their performance. We show that current restoration methods are very effective, that no excess damage is caused to donor colonies, and that once outplanted, corals behave just as wild colonies. We also provide science-based benchmarks that can be used by programs to evaluate successes and challenges of their efforts, and to make modifications where needed. We propose that up to 10% of the biomass can be collected from healthy, large A. cervicornis donor colonies for nursery propagation. We also propose the following benchmarks for the first year of activities for A. cervicornis restoration: (1) >75% live tissue cover on donor colonies; (2) >80% survivorship of nursery corals; and (3) >70% survivorship of outplanted corals. Finally, we report productivity means of 4.4 cm yr-1 for nursery corals and 4.8 cm yr-1 for outplants as a frame of reference for ranking performance within programs. Such benchmarks, and potential subsequent adaptive actions, are needed to fully assess the
Benchmark for Strategic Performance Improvement.

ERIC Educational Resources Information Center

Gohlke, Annette

1997-01-01

Explains benchmarking, a total quality management tool used to measure and compare the work processes in a library with those in other libraries to increase library performance. Topics include the main groups of upper management, clients, and staff; critical success factors for each group; and benefits of benchmarking. (Author/LRW)
Beyond Benchmarking: Value-Adding Metrics

ERIC Educational Resources Information Center

Fitz-enz, Jac

2007-01-01

HR metrics has grown up a bit over the past two decades, moving away from simple benchmarking practices and toward a more inclusive approach to measuring institutional performance and progress. In this article, the acknowledged "father" of human capital performance benchmarking provides an overview of several aspects of today's HR metrics…

The NAS kernel benchmark program

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barton, J. T.

1985-01-01

A collection of benchmark test kernels that measure supercomputer performance has been developed for the use of the NAS (Numerical Aerodynamic Simulation) program at the NASA Ames Research Center. This benchmark program is described in detail and the specific ground rules are given for running the program as a performance test.
How Benchmarking and Higher Education Came Together

ERIC Educational Resources Information Center

Levy, Gary D.; Ronco, Sharron L.

2012-01-01

This chapter introduces the concept of benchmarking and how higher education institutions began to use benchmarking for a variety of purposes. Here, benchmarking is defined as a strategic and structured approach whereby an organization compares aspects of its processes and/or outcomes to those of another organization or set of organizations to…
Benchmarks--Standards Comparisons. Math Competencies: EFF Benchmarks Comparison [and] Reading Competencies: EFF Benchmarks Comparison [and] Writing Competencies: EFF Benchmarks Comparison.

ERIC Educational Resources Information Center

Kent State Univ., OH. Ohio Literacy Resource Center.

This document is intended to show the relationship between Ohio's Standards and Competencies, Equipped for the Future's (EFF's) Standards and Components of Performance, and Ohio's Revised Benchmarks. The document is divided into three parts, with Part 1 covering mathematics instruction, Part 2 covering reading instruction, and Part 3 covering…
Benchmarking the Integration of WAVEWATCH III Results into HAZUS-MH: Preliminary Results

NASA Technical Reports Server (NTRS)

Berglund, Judith; Holland, Donald; McKellip, Rodney; Sciaudone, Jeff; Vickery, Peter; Wang, Zhanxian; Ying, Ken

2005-01-01

The report summarizes the results from the preliminary benchmarking activities associated with the use of WAVEWATCH III (WW3) results in the HAZUS-MH MR1 flood module. Project partner Applied Research Associates (ARA) is integrating the WW3 model into HAZUS. The current version of HAZUS-MH predicts loss estimates from hurricane-related coastal flooding by using values of surge only. Using WW3, wave setup can be included with surge. Loss estimates resulting from the use of surge-only and surge-plus-wave-setup were compared. This benchmarking study is preliminary because the HAZUS-MH MR1 flood module was under development at the time of the study. In addition, WW3 is not scheduled to be fully integrated with HAZUS-MH and available for public release until 2008.
Benchmarking: A Process for Improvement.

ERIC Educational Resources Information Center

Peischl, Thomas M.

One problem with the outcome-based measures used in higher education is that they measure quantity but not quality. Benchmarking, or the use of some external standard of quality to measure tasks, processes, and outputs, is partially solving that difficulty. Benchmarking allows for the establishment of a systematic process to indicate if outputs…
Developing a Benchmarking Process in Perfusion: A Report of the Perfusion Downunder Collaboration

PubMed Central

Baker, Robert A.; Newland, Richard F.; Fenton, Carmel; McDonald, Michael; Willcox, Timothy W.; Merry, Alan F.

2012-01-01

Abstract: Improving and understanding clinical practice is an appropriate goal for the perfusion community. The Perfusion Downunder Collaboration has established a multi-center perfusion focused database aimed at achieving these goals through the development of quantitative quality indicators for clinical improvement through benchmarking. Data were collected using the Perfusion Downunder Collaboration database from procedures performed in eight Australian and New Zealand cardiac centers between March 2007 and February 2011. At the Perfusion Downunder Meeting in 2010, it was agreed by consensus, to report quality indicators (QI) for glucose level, arterial outlet temperature, and pCO2 management during cardiopulmonary bypass. The values chosen for each QI were: blood glucose ≥4 mmol/L and ≤10 mmol/L; arterial outlet temperature ≤37°C; and arterial blood gas pCO2 ≥ 35 and ≤45 mmHg. The QI data were used to derive benchmarks using the Achievable Benchmark of Care (ABC™) methodology to identify the incidence of QIs at the best performing centers. Five thousand four hundred and sixty-five procedures were evaluated to derive QI and benchmark data. The incidence of the blood glucose QI ranged from 37–96% of procedures, with a benchmark value of 90%. The arterial outlet temperature QI occurred in 16–98% of procedures with the benchmark of 94%; while the arterial pCO2 QI occurred in 21–91%, with the benchmark value of 80%. We have derived QIs and benchmark calculations for the management of several key aspects of cardiopulmonary bypass to provide a platform for improving the quality of perfusion practice. PMID:22730861
Space network scheduling benchmark: A proof-of-concept process for technology transfer

NASA Technical Reports Server (NTRS)

Moe, Karen; Happell, Nadine; Hayden, B. J.; Barclay, Cathy

1993-01-01

This paper describes a detailed proof-of-concept activity to evaluate flexible scheduling technology as implemented in the Request Oriented Scheduling Engine (ROSE) and applied to Space Network (SN) scheduling. The criteria developed for an operational evaluation of a reusable scheduling system is addressed including a methodology to prove that the proposed system performs at least as well as the current system in function and performance. The improvement of the new technology must be demonstrated and evaluated against the cost of making changes. Finally, there is a need to show significant improvement in SN operational procedures. Successful completion of a proof-of-concept would eventually lead to an operational concept and implementation transition plan, which is outside the scope of this paper. However, a high-fidelity benchmark using actual SN scheduling requests has been designed to test the ROSE scheduling tool. The benchmark evaluation methodology, scheduling data, and preliminary results are described.
Benchmarking Is Associated With Improved Quality of Care in Type 2 Diabetes

PubMed Central

Hermans, Michel P.; Elisaf, Moses; Michel, Georges; Muls, Erik; Nobels, Frank; Vandenberghe, Hans; Brotons, Carlos

2013-01-01

OBJECTIVE To assess prospectively the effect of benchmarking on quality of primary care for patients with type 2 diabetes by using three major modifiable cardiovascular risk factors as critical quality indicators. RESEARCH DESIGN AND METHODS Primary care physicians treating patients with type 2 diabetes in six European countries were randomized to give standard care (control group) or standard care with feedback benchmarked against other centers in each country (benchmarking group). In both groups, laboratory tests were performed every 4 months. The primary end point was the percentage of patients achieving preset targets of the critical quality indicators HbA1c, LDL cholesterol, and systolic blood pressure (SBP) after 12 months of follow-up. RESULTS Of 4,027 patients enrolled, 3,996 patients were evaluable and 3,487 completed 12 months of follow-up. Primary end point of HbA1c target was achieved in the benchmarking group by 58.9 vs. 62.1% in the control group (P = 0.398) after 12 months; 40.0 vs. 30.1% patients met the SBP target (P < 0.001); 54.3 vs. 49.7% met the LDL cholesterol target (P = 0.006). Percentages of patients meeting all three targets increased during the study in both groups, with a statistically significant increase observed in the benchmarking group. The percentage of patients achieving all three targets at month 12 was significantly larger in the benchmarking group than in the control group (12.5 vs. 8.1%; P < 0.001). CONCLUSIONS In this prospective, randomized, controlled study, benchmarking was shown to be an effective tool for increasing achievement of critical quality indicators and potentially reducing patient cardiovascular residual risk profile. PMID:23846810
Benchmarking and testing the "Sea Level Equation

NASA Astrophysics Data System (ADS)

Spada, G.; Barletta, V. R.; Klemann, V.; van der Wal, W.; James, T. S.; Simon, K.; Riva, R. E. M.; Martinec, Z.; Gasperini, P.; Lund, B.; Wolf, D.; Vermeersen, L. L. A.; King, M. A.

2012-04-01

The study of the process of Glacial Isostatic Adjustment (GIA) and of the consequent sea level variations is gaining an increasingly important role within the geophysical community. Understanding the response of the Earth to the waxing and waning ice sheets is crucial in various contexts, ranging from the interpretation of modern satellite geodetic measurements to the projections of future sea level trends in response to climate change. All the processes accompanying GIA can be described solving the so-called Sea Level Equation (SLE), an integral equation that accounts for the interactions between the ice sheets, the solid Earth, and the oceans. Modern approaches to the SLE are based on various techniques that range from purely analytical formulations to fully numerical methods. Despite various teams independently investigating GIA, we do not have a suitably large set of agreed numerical results through which the methods may be validated. Following the example of the mantle convection community and our recent successful Benchmark for Post Glacial Rebound codes (Spada et al., 2011, doi: 10.1111/j.1365-246X.2011.04952.x), here we present the results of a benchmark study of independently developed codes designed to solve the SLE. This study has taken place within a collaboration facilitated through the European Cooperation in Science and Technology (COST) Action ES0701. The tests involve predictions of past and current sea level variations, and 3D deformations of the Earth surface. In spite of the signi?cant differences in the numerical methods employed, the test computations performed so far show a satisfactory agreement between the results provided by the participants. The differences found, which can be often attributed to the different numerical algorithms employed within the community, help to constrain the intrinsic errors in model predictions. These are of fundamental importance for a correct interpretation of the geodetic variations observed today, and
The Medical Library Association Benchmarking Network: results.

PubMed

Dudden, Rosalind Farnam; Corcoran, Kate; Kaplan, Janice; Magouirk, Jeff; Rand, Debra C; Smith, Bernie Todd

2006-04-01

This article presents some limited results from the Medical Library Association (MLA) Benchmarking Network survey conducted in 2002. Other uses of the data are also presented. After several years of development and testing, a Web-based survey opened for data input in December 2001. Three hundred eighty-five MLA members entered data on the size of their institutions and the activities of their libraries. The data from 344 hospital libraries were edited and selected for reporting in aggregate tables and on an interactive site in the Members-Only area of MLANET. The data represent a 16% to 23% return rate and have a 95% confidence level. Specific questions can be answered using the reports. The data can be used to review internal processes, perform outcomes benchmarking, retest a hypothesis, refute a previous survey findings, or develop library standards. The data can be used to compare to current surveys or look for trends by comparing the data to past surveys. The impact of this project on MLA will reach into areas of research and advocacy. The data will be useful in the everyday working of small health sciences libraries as well as provide concrete data on the current practices of health sciences libraries.
The Medical Library Association Benchmarking Network: results*

PubMed Central

Dudden, Rosalind Farnam; Corcoran, Kate; Kaplan, Janice; Magouirk, Jeff; Rand, Debra C.; Smith, Bernie Todd

2006-01-01

Objective: This article presents some limited results from the Medical Library Association (MLA) Benchmarking Network survey conducted in 2002. Other uses of the data are also presented. Methods: After several years of development and testing, a Web-based survey opened for data input in December 2001. Three hundred eighty-five MLA members entered data on the size of their institutions and the activities of their libraries. The data from 344 hospital libraries were edited and selected for reporting in aggregate tables and on an interactive site in the Members-Only area of MLANET. The data represent a 16% to 23% return rate and have a 95% confidence level. Results: Specific questions can be answered using the reports. The data can be used to review internal processes, perform outcomes benchmarking, retest a hypothesis, refute a previous survey findings, or develop library standards. The data can be used to compare to current surveys or look for trends by comparing the data to past surveys. Conclusions: The impact of this project on MLA will reach into areas of research and advocacy. The data will be useful in the everyday working of small health sciences libraries as well as provide concrete data on the current practices of health sciences libraries. PMID:16636703
An international land-biosphere model benchmarking activity for the IPCC Fifth Assessment Report (AR5)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoffman, Forrest M; Randerson, James T; Thornton, Peter E

2009-12-01

the carbon-nitrogen (CN) model of Thornton. Comparisons of the CLM3 offline results against observational datasets have been performed and are described in Randerson et al. (2009). CLM version 4 has been evaluated using C-LAMP, showing improvement in many of the metrics. Efforts are now underway to initiate a Nitrogen-Land Model Intercomparison Project (N-LAMP) to better constrain the effects of the nitrogen cycle in biosphere models. Presented will be new results from C-LAMP for CLM4, initial N-LAMP developments, and the proposed land-biosphere model benchmarking activity.« less
Benchmarking in national health service procurement in Scotland.

PubMed

Walker, Scott; Masson, Ron; Telford, Ronnie; White, David

2007-11-01

The paper reports the results of a study on benchmarking activities undertaken by the procurement organization within the National Health Service (NHS) in Scotland, namely National Procurement (previously Scottish Healthcare Supplies Contracts Branch). NHS performance is of course politically important, and benchmarking is increasingly seen as a means to improve performance, so the study was carried out to determine if the current benchmarking approaches could be enhanced. A review of the benchmarking activities used by the private sector, local government and NHS organizations was carried out to establish a framework of the motivations, benefits, problems and costs associated with benchmarking. This framework was used to carry out the research through case studies and a questionnaire survey of NHS procurement organizations both in Scotland and other parts of the UK. Nine of the 16 Scottish Health Boards surveyed reported carrying out benchmarking during the last three years. The findings of the research were that there were similarities in approaches between local government and NHS Scotland Health, but differences between NHS Scotland and other UK NHS procurement organizations. Benefits were seen as significant and it was recommended that National Procurement should pursue the formation of a benchmarking group with members drawn from NHS Scotland and external benchmarking bodies to establish measures to be used in benchmarking across the whole of NHS Scotland.
Benchmark Factors in Student Retention.

ERIC Educational Resources Information Center

Waggener, Anna T.; Smith, Constance K.

The first purpose of this study was to identify significant factors affecting the first benchmark in retaining students in college--the decision to enroll in the first fall semester after orientation. The second purpose was to examine enrollment decisions at the second benchmark--the decision to re-enroll in the second fall semester after freshman…
Benchmarking of Decision-Support Tools Used for Tiered Sustainable Remediation Appraisal.

PubMed

Smith, Jonathan W N; Kerrison, Gavin

2013-01-01

Sustainable remediation comprises soil and groundwater risk-management actions that are selected, designed, and operated to maximize net environmental, social, and economic benefit (while assuring protection of human health and safety). This paper describes a benchmarking exercise to comparatively assess potential differences in environmental management decision making resulting from application of different sustainability appraisal tools ranging from simple (qualitative) to more quantitative (multi-criteria and fully monetized cost-benefit analysis), as outlined in the SuRF-UK framework. The appraisal tools were used to rank remedial options for risk management of a subsurface petroleum release that occurred at a petrol filling station in central England. The remediation options were benchmarked using a consistent set of soil and groundwater data for each tier of sustainability appraisal. The ranking of remedial options was very similar in all three tiers, and an environmental management decision to select the most sustainable options at tier 1 would have been the same decision at tiers 2 and 3. The exercise showed that, for relatively simple remediation projects, a simple sustainability appraisal led to the same remediation option selection as more complex appraisal, and can be used to reliably inform environmental management decisions on other relatively simple land contamination projects.
Reflections on Evaluation Costs: Direct and Indirect. Evaluation Productivity Project.

ERIC Educational Resources Information Center

Alkin, Marvin; Ruskus, Joan A.

This paper summarizes views on the costs of evaluation developed as part of CSE's Evaluation Productivity Project. In particular, it focuses on ideas about the kinds of costs associated with factors known to effect evaluation utilization. The first section deals with general issues involved in identifying and valuing cost components, particularly…
An automated benchmarking platform for MHC class II binding prediction methods.

PubMed

Andreatta, Massimo; Trolle, Thomas; Yan, Zhen; Greenbaum, Jason A; Peters, Bjoern; Nielsen, Morten

2018-05-01

Computational methods for the prediction of peptide-MHC binding have become an integral and essential component for candidate selection in experimental T cell epitope discovery studies. The sheer amount of published prediction methods-and often discordant reports on their performance-poses a considerable quandary to the experimentalist who needs to choose the best tool for their research. With the goal to provide an unbiased, transparent evaluation of the state-of-the-art in the field, we created an automated platform to benchmark peptide-MHC class II binding prediction tools. The platform evaluates the absolute and relative predictive performance of all participating tools on data newly entered into the Immune Epitope Database (IEDB) before they are made public, thereby providing a frequent, unbiased assessment of available prediction tools. The benchmark runs on a weekly basis, is fully automated, and displays up-to-date results on a publicly accessible website. The initial benchmark described here included six commonly used prediction servers, but other tools are encouraged to join with a simple sign-up procedure. Performance evaluation on 59 data sets composed of over 10 000 binding affinity measurements suggested that NetMHCIIpan is currently the most accurate tool, followed by NN-align and the IEDB consensus method. Weekly reports on the participating methods can be found online at: http://tools.iedb.org/auto_bench/mhcii/weekly/. mniel@bioinformatics.dtu.dk. Supplementary data are available at Bioinformatics online.
Benchmark study on glyphosate-resistant crop systems in the United States. Part 2: Perspectives.

PubMed

Owen, Micheal D K; Young, Bryan G; Shaw, David R; Wilson, Robert G; Jordan, David L; Dixon, Philip M; Weller, Stephen C

2011-07-01

A six-state, 5 year field project was initiated in 2006 to study weed management methods that foster the sustainability of genetically engineered (GE) glyphosate-resistant (GR) crop systems. The benchmark study field-scale experiments were initiated following a survey, conducted in the winter of 2005-2006, of farmer opinions on weed management practices and their views on GR weeds and management tactics. The main survey findings supported the premise that growers were generally less aware of the significance of evolved herbicide resistance and did not have a high recognition of the strong selection pressure from herbicides on the evolution of herbicide-resistant (HR) weeds. The results of the benchmark study survey indicated that there are educational challenges to implement sustainable GR-based crop systems and helped guide the development of the field-scale benchmark study. Paramount is the need to develop consistent and clearly articulated science-based management recommendations that enable farmers to reduce the potential for HR weeds. This paper provides background perspectives about the use of GR crops, the impact of these crops and an overview of different opinions about the use of GR crops on agriculture and society, as well as defining how the benchmark study will address these issues. Copyright © 2011 Society of Chemical Industry.
ELAPSE - NASA AMES LISP AND ADA BENCHMARK SUITE: EFFICIENCY OF LISP AND ADA PROCESSING - A SYSTEM EVALUATION

NASA Technical Reports Server (NTRS)

Davis, G. J.

1994-01-01

One area of research of the Information Sciences Division at NASA Ames Research Center is devoted to the analysis and enhancement of processors and advanced computer architectures, specifically in support of automation and robotic systems. To compare systems' abilities to efficiently process Lisp and Ada, scientists at Ames Research Center have developed a suite of non-parallel benchmarks called ELAPSE. The benchmark suite was designed to test a single computer's efficiency as well as alternate machine comparisons on Lisp, and/or Ada languages. ELAPSE tests the efficiency with which a machine can execute the various routines in each environment. The sample routines are based on numeric and symbolic manipulations and include two-dimensional fast Fourier transformations, Cholesky decomposition and substitution, Gaussian elimination, high-level data processing, and symbol-list references. Also included is a routine based on a Bayesian classification program sorting data into optimized groups. The ELAPSE benchmarks are available for any computer with a validated Ada compiler and/or Common Lisp system. Of the 18 routines that comprise ELAPSE, provided within this package are 14 developed or translated at Ames. The others are readily available through literature. The benchmark that requires the most memory is CHOLESKY.ADA. Under VAX/VMS, CHOLESKY.ADA requires 760K of main memory. ELAPSE is available on either two 5.25 inch 360K MS-DOS format diskettes (standard distribution) or a 9-track 1600 BPI ASCII CARD IMAGE format magnetic tape. The contents of the diskettes are compressed using the PKWARE archiving tools. The utility to unarchive the files, PKUNZIP.EXE, is included. The ELAPSE benchmarks were written in 1990. VAX and VMS are trademarks of Digital Equipment Corporation. MS-DOS is a registered trademark of Microsoft Corporation.
Benchmarks of fairness for health care reform: a policy tool for developing countries.

PubMed Central

Daniels, N.; Bryant, J.; Castano, R. A.; Dantes, O. G.; Khan, K. S.; Pannarunothai, S.

2000-01-01

Teams of collaborators from Colombia, Mexico, Pakistan, and Thailand have adapted a policy tool originally developed for evaluating health insurance reforms in the United States into "benchmarks of fairness" for assessing health system reform in developing countries. We describe briefly the history of the benchmark approach, the tool itself, and the uses to which it may be put. Fairness is a wide term that includes exposure to risk factors, access to all forms of care, and to financing. It also includes efficiency of management and resource allocation, accountability, and patient and provider autonomy. The benchmarks standardize the criteria for fairness. Reforms are then evaluated by scoring according to the degree to which they improve the situation, i.e. on a scale of -5 to 5, with zero representing the status quo. The object is to promote discussion about fairness across the disciplinary divisions that keep policy analysts and the public from understanding how trade-offs between different effects of reforms can affect the overall fairness of the reform. The benchmarks can be used at both national and provincial or district levels, and we describe plans for such uses in the collaborating sites. A striking feature of the adaptation process is that there was wide agreement on this ethical framework among the collaborating sites despite their large historical, political and cultural differences. PMID:10916911

Benchmark problems and solutions

NASA Technical Reports Server (NTRS)

Tam, Christopher K. W.

1995-01-01

The scientific committee, after careful consideration, adopted six categories of benchmark problems for the workshop. These problems do not cover all the important computational issues relevant to Computational Aeroacoustics (CAA). The deciding factor to limit the number of categories to six was the amount of effort needed to solve these problems. For reference purpose, the benchmark problems are provided here. They are followed by the exact or approximate analytical solutions. At present, an exact solution for the Category 6 problem is not available.
An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs.

PubMed

Xia, Jie; Jin, Hongwei; Liu, Zhenming; Zhang, Liangren; Wang, Xiang Simon

2014-05-27

Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the "artificial enrichment" and "analogue bias" of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD.
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 42 Public Health 4 2013-10-01 2013-10-01 false Benchmark-equivalent health benefits coverage. 440... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has an aggregate...
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 42 Public Health 4 2011-10-01 2011-10-01 false Benchmark-equivalent health benefits coverage. 440... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has an aggregate...
The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)

PubMed Central

Jakab, Andras; Bauer, Stefan; Kalpathy-Cramer, Jayashree; Farahani, Keyvan; Kirby, Justin; Burren, Yuliya; Porz, Nicole; Slotboom, Johannes; Wiest, Roland; Lanczi, Levente; Gerstner, Elizabeth; Weber, Marc-André; Arbel, Tal; Avants, Brian B.; Ayache, Nicholas; Buendia, Patricia; Collins, D. Louis; Cordier, Nicolas; Corso, Jason J.; Criminisi, Antonio; Das, Tilak; Delingette, Hervé; Demiralp, Çağatay; Durst, Christopher R.; Dojat, Michel; Doyle, Senan; Festa, Joana; Forbes, Florence; Geremia, Ezequiel; Glocker, Ben; Golland, Polina; Guo, Xiaotao; Hamamci, Andac; Iftekharuddin, Khan M.; Jena, Raj; John, Nigel M.; Konukoglu, Ender; Lashkari, Danial; Mariz, José António; Meier, Raphael; Pereira, Sérgio; Precup, Doina; Price, Stephen J.; Raviv, Tammy Riklin; Reza, Syed M. S.; Ryan, Michael; Sarikaya, Duygu; Schwartz, Lawrence; Shin, Hoo-Chang; Shotton, Jamie; Silva, Carlos A.; Sousa, Nuno; Subbanna, Nagesh K.; Szekely, Gabor; Taylor, Thomas J.; Thomas, Owen M.; Tustison, Nicholas J.; Unal, Gozde; Vasseur, Flor; Wintermark, Max; Ye, Dong Hye; Zhao, Liang; Zhao, Binsheng; Zikic, Darko; Prastawa, Marcel; Reyes, Mauricio; Van Leemput, Koen

2016-01-01

In this paper we report the set-up and results of the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) organized in conjunction with the MICCAI 2012 and 2013 conferences. Twenty state-of-the-art tumor segmentation algorithms were applied to a set of 65 multi-contrast MR scans of low- and high-grade glioma patients—manually annotated by up to four raters—and to 65 comparable scans generated using tumor image simulation software. Quantitative evaluations revealed considerable disagreement between the human raters in segmenting various tumor sub-regions (Dice scores in the range 74%–85%), illustrating the difficulty of this task. We found that different algorithms worked best for different sub-regions (reaching performance comparable to human inter-rater variability), but that no single algorithm ranked in the top for all sub-regions simultaneously. Fusing several good algorithms using a hierarchical majority vote yielded segmentations that consistently ranked above all individual algorithms, indicating remaining opportunities for further methodological improvements. The BRATS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource. PMID:25494501
Evaluating soil carbon in global climate models: benchmarking, future projections, and model drivers

NASA Astrophysics Data System (ADS)

Todd-Brown, K. E.; Randerson, J. T.; Post, W. M.; Allison, S. D.

2012-12-01

The carbon cycle plays a critical role in how the climate responds to anthropogenic carbon dioxide. To evaluate how well Earth system models (ESMs) from the Climate Model Intercomparison Project (CMIP5) represent the carbon cycle, we examined predictions of current soil carbon stocks from the historical simulation. We compared the soil and litter carbon pools from 17 ESMs with data on soil carbon stocks from the Harmonized World Soil Database (HWSD). We also examined soil carbon predictions for 2100 from 16 ESMs from the rcp85 (highest radiative forcing) simulation to investigate the effects of climate change on soil carbon stocks. In both analyses, we used a reduced complexity model to separate the effects of variation in model drivers from the effects of model parameters on soil carbon predictions. Drivers included NPP, soil temperature, and soil moisture, and the reduced complexity model represented one pool of soil carbon as a function of these drivers. The ESMs predicted global soil carbon totals of 500 to 2980 Pg-C, compared to 1260 Pg-C in the HWSD. This 5-fold variation in predicted soil stocks was a consequence of a 3.4-fold variation in NPP inputs and 3.8-fold variability in mean global turnover times. None of the ESMs correlated well with the global distribution of soil carbon in the HWSD (Pearson's correlation <0.40, RMSE 9-22 kg m-2). On a biome level there was a broad range of agreement between the ESMs and the HWSD. Some models predicted HWSD biome totals well (R2=0.91) while others did not (R2=0.23). All of the ESM terrestrial decomposition models are structurally similar with outputs that were well described by a reduced complexity model that included NPP and soil temperature (R2 of 0.73-0.93). However, MPI-ESM-LR outputs showed only a moderate fit to this model (R2=0.51), and CanESM2 outputs were better described by a reduced model that included soil moisture (R2=0.74), We also found a broad range in soil carbon responses to climate change
XWeB: The XML Warehouse Benchmark

NASA Astrophysics Data System (ADS)

Mahboubi, Hadj; Darmont, Jérôme

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.
Building Bridges Between Geoscience and Data Science through Benchmark Data Sets

NASA Astrophysics Data System (ADS)

Thompson, D. R.; Ebert-Uphoff, I.; Demir, I.; Gel, Y.; Hill, M. C.; Karpatne, A.; Güereque, M.; Kumar, V.; Cabral, E.; Smyth, P.

2017-12-01

The changing nature of observational field data demands richer and more meaningful collaboration between data scientists and geoscientists. Thus, among other efforts, the Working Group on Case Studies of the NSF-funded RCN on Intelligent Systems Research To Support Geosciences (IS-GEO) is developing a framework to strengthen such collaborations through the creation of benchmark datasets. Benchmark datasets provide an interface between disciplines without requiring extensive background knowledge. The goals are to create (1) a means for two-way communication between geoscience and data science researchers; (2) new collaborations, which may lead to new approaches for data analysis in the geosciences; and (3) a public, permanent repository of complex data sets, representative of geoscience problems, useful to coordinate efforts in research and education. The group identified 10 key elements and characteristics for ideal benchmarks. High impact: A problem with high potential impact. Active research area: A group of geoscientists should be eager to continue working on the topic. Challenge: The problem should be challenging for data scientists. Data science generality and versatility: It should stimulate development of new general and versatile data science methods. Rich information content: Ideally the data set provides stimulus for analysis at many different levels. Hierarchical problem statement: A hierarchy of suggested analysis tasks, from relatively straightforward to open-ended tasks. Means for evaluating success: Data scientists and geoscientists need means to evaluate whether the algorithms are successful and achieve intended purpose. Quick start guide: Introduction for data scientists on how to easily read the data to enable rapid initial data exploration. Geoscience context: Summary for data scientists of the specific data collection process, instruments used, any pre-processing and the science questions to be answered. Citability: A suitable identifier to
Design Alternatives for Evaluating the Impact of Conservation Projects

ERIC Educational Resources Information Center

Margoluis, Richard; Stem, Caroline; Salafsky, Nick; Brown, Marcia

2009-01-01

Historically, examples of project evaluation in conservation were rare. In recent years, however, conservation professionals have begun to recognize the importance of evaluation both for accountability and for improving project interventions. Even with this growing interest in evaluation, the conservation community has paid little attention to…
A Web Resource for Standardized Benchmark Datasets, Metrics, and Rosetta Protocols for Macromolecular Modeling and Design.

PubMed

Ó Conchúir, Shane; Barlow, Kyle A; Pache, Roland A; Ollikainen, Noah; Kundert, Kale; O'Meara, Matthew J; Smith, Colin A; Kortemme, Tanja

2015-01-01

The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a "best practice" set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.
Benchmarking: your performance measurement and improvement tool.

PubMed

Senn, G F

2000-01-01

Many respected professional healthcare organizations and societies today are seeking to establish data-driven performance measurement strategies such as benchmarking. Clinicians are, however, resistant to "benchmarking" that is based on financial data alone, concerned that it may be adverse to the patients' best interests. Benchmarking of clinical procedures that uses physician's codes such as Current Procedural Terminology (CPTs) has greater credibility with practitioners. Better Performers, organizations that can perform procedures successfully at lower cost and in less time, become the "benchmark" against which other organizations can measure themselves. The Better Performers' strategies can be adopted by other facilities to save time or money while maintaining quality patient care.
Evaluation of Project Trend.

ERIC Educational Resources Information Center

Unco, Inc., Washington, DC.

This report is a descriptive evaluation of the five pilot sites of Project TREND (Targeting Resources on the Educational Needs of the Disadvantaged). The five Local Education Agency (LEA) pilot sites are the educational systems of: (1) Akron, Ohio; (2) El Paso, Texas; (3) Newark, New Jersey; (4) Portland, Oregon; and, (5) San Jose (Unified),…
Reviews and syntheses: Field data to benchmark the carbon cycle models for tropical forests

USGS Publications Warehouse

Clark, Deborah A.; Asao, Shinichi; Fisher, Rosie A.; Reed, Sasha C.; Reich, Peter B.; Ryan, Michael G.; Wood, Tana E.; Yang, Xiaojuan

2017-01-01

For more accurate projections of both the global carbon (C) cycle and the changing climate, a critical current need is to improve the representation of tropical forests in Earth system models. Tropical forests exchange more C, energy, and water with the atmosphere than any other class of land ecosystems. Further, tropical-forest C cycling is likely responding to the rapid global warming, intensifying water stress, and increasing atmospheric CO2 levels. Projections of the future C balance of the tropics vary widely among global models. A current effort of the modeling community, the ILAMB (International Land Model Benchmarking) project, is to compile robust observations that can be used to improve the accuracy and realism of the land models for all major biomes. Our goal with this paper is to identify field observations of tropical-forest ecosystem C stocks and fluxes, and of their long-term trends and climatic and CO2 sensitivities, that can serve this effort. We propose criteria for reference-level field data from this biome and present a set of documented examples from old-growth lowland tropical forests. We offer these as a starting point towards the goal of a regularly updated consensus set of benchmark field observations of C cycling in tropical forests.
Reviews and syntheses: Field data to benchmark the carbon cycle models for tropical forests

NASA Astrophysics Data System (ADS)

Clark, Deborah A.; Asao, Shinichi; Fisher, Rosie; Reed, Sasha; Reich, Peter B.; Ryan, Michael G.; Wood, Tana E.; Yang, Xiaojuan

2017-10-01

For more accurate projections of both the global carbon (C) cycle and the changing climate, a critical current need is to improve the representation of tropical forests in Earth system models. Tropical forests exchange more C, energy, and water with the atmosphere than any other class of land ecosystems. Further, tropical-forest C cycling is likely responding to the rapid global warming, intensifying water stress, and increasing atmospheric CO2 levels. Projections of the future C balance of the tropics vary widely among global models. A current effort of the modeling community, the ILAMB (International Land Model Benchmarking) project, is to compile robust observations that can be used to improve the accuracy and realism of the land models for all major biomes. Our goal with this paper is to identify field observations of tropical-forest ecosystem C stocks and fluxes, and of their long-term trends and climatic and CO2 sensitivities, that can serve this effort. We propose criteria for reference-level field data from this biome and present a set of documented examples from old-growth lowland tropical forests. We offer these as a starting point towards the goal of a regularly updated consensus set of benchmark field observations of C cycling in tropical forests.
Reviews and syntheses: Field data to benchmark the carbon cycle models for tropical forests

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clark, Deborah A.; Asao, Shinichi; Fisher, Rosie

For more accurate projections of both the global carbon (C) cycle and the changing climate, a critical current need is to improve the representation of tropical forests in Earth system models. Tropical forests exchange more C, energy, and water with the atmosphere than any other class of land ecosystems. Further, tropical-forest C cycling is likely responding to the rapid global warming, intensifying water stress, and increasing atmospheric CO 2 levels. Projections of the future C balance of the tropics vary widely among global models. A current effort of the modeling community, the ILAMB (International Land Model Benchmarking) project, is tomore » compile robust observations that can be used to improve the accuracy and realism of the land models for all major biomes. Our goal with this paper is to identify field observations of tropical-forest ecosystem C stocks and fluxes, and of their long-term trends and climatic and CO 2 sensitivities, that can serve this effort. We propose criteria for reference-level field data from this biome and present a set of documented examples from old-growth lowland tropical forests. We offer these as a starting point towards the goal of a regularly updated consensus set of benchmark field observations of C cycling in tropical forests.« less
Reviews and syntheses: Field data to benchmark the carbon cycle models for tropical forests

DOE PAGES

Clark, Deborah A.; Asao, Shinichi; Fisher, Rosie; ...

2017-10-23

For more accurate projections of both the global carbon (C) cycle and the changing climate, a critical current need is to improve the representation of tropical forests in Earth system models. Tropical forests exchange more C, energy, and water with the atmosphere than any other class of land ecosystems. Further, tropical-forest C cycling is likely responding to the rapid global warming, intensifying water stress, and increasing atmospheric CO 2 levels. Projections of the future C balance of the tropics vary widely among global models. A current effort of the modeling community, the ILAMB (International Land Model Benchmarking) project, is tomore » compile robust observations that can be used to improve the accuracy and realism of the land models for all major biomes. Our goal with this paper is to identify field observations of tropical-forest ecosystem C stocks and fluxes, and of their long-term trends and climatic and CO 2 sensitivities, that can serve this effort. We propose criteria for reference-level field data from this biome and present a set of documented examples from old-growth lowland tropical forests. We offer these as a starting point towards the goal of a regularly updated consensus set of benchmark field observations of C cycling in tropical forests.« less
Benchmark Airport Charges

NASA Technical Reports Server (NTRS)

deWit, A.; Cohn, N.

1999-01-01

The Netherlands Directorate General of Civil Aviation (DGCA) commissioned Hague Consulting Group (HCG) to complete a benchmark study of airport charges at twenty eight airports in Europe and around the world, based on 1996 charges. This study followed previous DGCA research on the topic but included more airports in much more detail. The main purpose of this new benchmark study was to provide insight into the levels and types of airport charges worldwide and into recent changes in airport charge policy and structure, This paper describes the 1996 analysis. It is intended that this work be repeated every year in order to follow developing trends and provide the most up-to-date information possible.
Benchmark Airport Charges

NASA Technical Reports Server (NTRS)

de Wit, A.; Cohn, N.

1999-01-01

The Netherlands Directorate General of Civil Aviation (DGCA) commissioned Hague Consulting Group (HCG) to complete a benchmark study of airport charges at twenty eight airports in Europe and around the world, based on 1996 charges. This study followed previous DGCA research on the topic but included more airports in much more detail. The main purpose of this new benchmark study was to provide insight into the levels and types of airport charges worldwide and into recent changes in airport charge policy and structure. This paper describes the 1996 analysis. It is intended that this work be repeated every year in order to follow developing trends and provide the most up-to-date information possible.
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2011 CFR

2011-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2014 CFR

2014-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...

42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2013 CFR

2013-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2010 CFR

2010-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
The Program Evaluator's Role in Cross-Project Pollination.

ERIC Educational Resources Information Center

Yasgur, Bruce J.

An expanded duties role of the multiple-program evaluator as an integral part of the ongoing decision-making process in all projects served is defended. Assumptions discussed included that need for projects with related objectives to pool resources and avoid duplication of effort and the evaluator's unique ability to provide an objective…
Regional Interstate Planning Project Program . . . Vol. IX. California Program Evaluation Improvement Project. Seminar Report.

ERIC Educational Resources Information Center

Dearmin, Evalyn, Ed.; And Others

Program evaluation strategies and techniques based on materials developed by the California Evaluation Improvement Project were discussed at this meeting of the Regional Interstate Planning Project (RIPP). RIPP members represent the State Departments of Education of ten western states, and have met periodically over the past nine years to discuss…
Implications of the Trauma Quality Improvement Project inclusion of nonsurvivable injuries in performance benchmarking.

PubMed

Heaney, Jiselle Bock; Schroll, Rebecca; Turney, Jennifer; Stuke, Lance; Marr, Alan B; Greiffenstein, Patrick; Robledo, Rosemarie; Theriot, Amanda; Duchesne, Juan; Hunt, John

2017-10-01

The Trauma Quality Improvement Project (TQIP) uses an injury prediction model for performance benchmarking. We hypothesize that at a Level I high-volume penetrating trauma center, performance outcomes will be biased due to inclusion of patients with nonsurvivable injuries. Retrospective chart review was conducted for all patients included in the institutional TQIP analysis from 2013 to 2014 with length of stay (LOS) less than 1 day to determine survivability of the injuries. Observed (O)/expected (E) mortality ratios were calculated before and after exclusion of these patients. Completeness of data reported to TQIP was examined. Eight hundred twenty-six patients were reported to TQIP including 119 deaths. Nonsurvivable injuries accounted 90.9% of the deaths in patients with an LOS of 1 day or less. The O/E mortality ratio for all patients was 1.061, and the O/E ratio after excluding all patients with LOS less than 1 day found to have nonsurvivable injuries was 0.895. Data for key variables were missing in 63.3% of patients who died in the emergency department, 50% of those taken to the operating room and 0% of those admitted to the intensive care unit. Charts for patients who died with LOS less than 1 day were significantly more likely than those who lived to be missing crucial. This study shows TQIP inclusion of patients with nonsurvivable injuries biases outcomes at an urban trauma center. Missing data results in imputation of values, increasing inaccuracy. Further investigation is needed to determine if these findings exist at other institutions, and whether the current TQIP model needs revision to accurately identify and exclude patients with nonsurvivable injuries. Prognostic and epidemiological, level III.
Multirate flutter suppression system design for the Benchmark Active Controls Technology Wing

NASA Technical Reports Server (NTRS)

Berg, Martin C.; Mason, Gregory S.

1994-01-01

To study the effectiveness of various control system design methodologies, the NASA Langley Research Center initiated the Benchmark Active Controls Project. In this project, the various methodologies will be applied to design a flutter suppression system for the Benchmark Active Controls Technology (BACT) Wing (also called the PAPA wing). Eventually, the designs will be implemented in hardware and tested on the BACT wing in a wind tunnel. This report describes a project at the University of Washington to design a multirate flutter suppression system for the BACT wing. The objective of the project was two fold. First, to develop a methodology for designing robust multirate compensators, and second, to demonstrate the methodology by applying it to the design of a multirate flutter suppression system for the BACT wing. The contributions of this project are (1) development of an algorithm for synthesizing robust low order multirate control laws (the algorithm is capable of synthesizing a single compensator which stabilizes both the nominal plant and multiple plant perturbations; (2) development of a multirate design methodology, and supporting software, for modeling, analyzing and synthesizing multirate compensators; and (3) design of a multirate flutter suppression system for NASA's BACT wing which satisfies the specified design criteria. This report describes each of these contributions in detail. Section 2.0 discusses our design methodology. Section 3.0 details the results of our multirate flutter suppression system design for the BACT wing. Finally, Section 4.0 presents our conclusions and suggestions for future research. The body of the report focuses primarily on the results. The associated theoretical background appears in the three technical papers that are included as Attachments 1-3. Attachment 4 is a user's manual for the software that is key to our design methodology.
A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database.

PubMed

Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin

2015-12-01

Face recognition with still face images has been widely studied, while the research on video-based face recognition is inadequate relatively, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Videoto-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively, taking video or still image as query or target. To the best of our knowledge, few datasets and evaluation protocols have benchmarked for all the three scenarios. In order to facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX(1) Face DB. Specifically, we make three contributions. First, we collect and release a largescale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more efforts, and our COX Face DB is a good benchmark database for evaluation.
Annual Report. Technical Reports. Evaluation Productivity Project.

ERIC Educational Resources Information Center

Alkin, Marvin; And Others

After outlining the 1984 activities and results of the Center for the Study of Evaluation's (CSE's) Evaluation Productivity Project, this monograph presents three reports. The first, "The Administrator's Role in Evaluation Use," by James Burry, Marvin C. Alkin, and Joan A. Ruskus, describes the factors influencing an evaluation's use…
40 CFR 141.172 - Disinfection profiling and benchmarking.

Code of Federal Regulations, 2011 CFR

2011-07-01

... benchmarking. 141.172 Section 141.172 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED... Disinfection-Systems Serving 10,000 or More People § 141.172 Disinfection profiling and benchmarking. (a... sanitary surveys conducted by the State. (c) Disinfection benchmarking. (1) Any system required to develop...
Raising Quality and Achievement. A College Guide to Benchmarking.

ERIC Educational Resources Information Center

Owen, Jane

This booklet introduces the principles and practices of benchmarking as a way of raising quality and achievement at further education colleges in Britain. Section 1 defines the concept of benchmarking. Section 2 explains what benchmarking is not and the steps that should be taken before benchmarking is initiated. The following aspects and…
Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

NASA Astrophysics Data System (ADS)

Kaskhedikar, Apoorva Prakash

According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United State's energy consumption of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy Benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, correlations which were significant were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers namely, the ENERGY STAR's Portfolio Manager. This tool relies on the standard Linear Regression methods which is only able to handle continuous variables. The model proposed uses data mining technique and was found to perform slightly better than the Portfolio Manager. The broader impacts of the new benchmarking methodology proposed is that it allows for identifying important categorical variables, and then incorporating them in a local, as against a global, model framework for EUI
A benchmark for comparison of dental radiography analysis algorithms.

PubMed

Wang, Ching-Wei; Huang, Cheng-Ta; Lee, Jia-Hong; Li, Chung-Hsing; Chang, Sheng-Wei; Siao, Ming-Jhih; Lai, Tat-Ming; Ibragimov, Bulat; Vrtovec, Tomaž; Ronneberger, Olaf; Fischer, Philipp; Cootes, Tim F; Lindner, Claudia

2016-07-01

Dental radiography plays an important role in clinical diagnosis, treatment and surgery. In recent years, efforts have been made on developing computerized dental X-ray image analysis systems for clinical usages. A novel framework for objective evaluation of automatic dental radiography analysis algorithms has been established under the auspices of the IEEE International Symposium on Biomedical Imaging 2015 Bitewing Radiography Caries Detection Challenge and Cephalometric X-ray Image Analysis Challenge. In this article, we present the datasets, methods and results of the challenge and lay down the principles for future uses of this benchmark. The main contributions of the challenge include the creation of the dental anatomy data repository of bitewing radiographs, the creation of the anatomical abnormality classification data repository of cephalometric radiographs, and the definition of objective quantitative evaluation for comparison and ranking of the algorithms. With this benchmark, seven automatic methods for analysing cephalometric X-ray image and two automatic methods for detecting bitewing radiography caries have been compared, and detailed quantitative evaluation results are presented in this paper. Based on the quantitative evaluation results, we believe automatic dental radiography analysis is still a challenging and unsolved problem. The datasets and the evaluation software will be made available to the research community, further encouraging future developments in this field. (http://www-o.ntust.edu.tw/~cweiwang/ISBI2015/). Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
How to Advance TPC Benchmarks with Dependability Aspects

NASA Astrophysics Data System (ADS)

Almeida, Raquel; Poess, Meikel; Nambiar, Raghunath; Patil, Indira; Vieira, Marco

Transactional systems are the core of the information systems of most organizations. Although there is general acknowledgement that failures in these systems often entail significant impact both on the proceeds and reputation of companies, the benchmarks developed and managed by the Transaction Processing Performance Council (TPC) still maintain their focus on reporting bare performance. Each TPC benchmark has to pass a list of dependability-related tests (to verify ACID properties), but not all benchmarks require measuring their performances. While TPC-E measures the recovery time of some system failures, TPC-H and TPC-C only require functional correctness of such recovery. Consequently, systems used in TPC benchmarks are tuned mostly for performance. In this paper we argue that nowadays systems should be tuned for a more comprehensive suite of dependability tests, and that a dependability metric should be part of TPC benchmark publications. The paper discusses WHY and HOW this can be achieved. Two approaches are introduced and discussed: augmenting each TPC benchmark in a customized way, by extending each specification individually; and pursuing a more unified approach, defining a generic specification that could be adjoined to any TPC benchmark.
A Methodology for Benchmarking Relational Database Machines,

DTIC Science & Technology

1984-01-01

user benchmarks is to compare the multiple users to the best-case performance The data for each query classification coll and the performance...called a benchmark. The term benchmark originates from the markers used by sur - veyors in establishing common reference points for their measure...formatted databases. In order to further simplify the problem, we restrict our study to those DBMs which support the relational model. A sur - vey
The MCNP6 Analytic Criticality Benchmark Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B.

2016-06-16

Analytical benchmarks provide an invaluable tool for verifying computer codes used to simulate neutron transport. Several collections of analytical benchmark problems [1-4] are used routinely in the verification of production Monte Carlo codes such as MCNP® [5,6]. Verification of a computer code is a necessary prerequisite to the more complex validation process. The verification process confirms that a code performs its intended functions correctly. The validation process involves determining the absolute accuracy of code results vs. nature. In typical validations, results are computed for a set of benchmark experiments using a particular methodology (code, cross-section data with uncertainties, and modeling)more » and compared to the measured results from the set of benchmark experiments. The validation process determines bias, bias uncertainty, and possibly additional margins. Verification is generally performed by the code developers, while validation is generally performed by code users for a particular application space. The VERIFICATION_KEFF suite of criticality problems [1,2] was originally a set of 75 criticality problems found in the literature for which exact analytical solutions are available. Even though the spatial and energy detail is necessarily limited in analytical benchmarks, typically to a few regions or energy groups, the exact solutions obtained can be used to verify that the basic algorithms, mathematics, and methods used in complex production codes perform correctly. The present work has focused on revisiting this benchmark suite. A thorough review of the problems resulted in discarding some of them as not suitable for MCNP benchmarking. For the remaining problems, many of them were reformulated to permit execution in either multigroup mode or in the normal continuous-energy mode for MCNP. Execution of the benchmarks in continuous-energy mode provides a significant advance to MCNP verification methods.« less
Benchmarking Discount Rate in Natural Resource Damage Assessment with Risk Aversion.

PubMed

Wu, Desheng; Chen, Shuzhen

2017-08-01

Benchmarking a credible discount rate is of crucial importance in natural resource damage assessment (NRDA) and restoration evaluation. This article integrates a holistic framework of NRDA with prevailing low discount rate theory, and proposes a discount rate benchmarking decision support system based on service-specific risk aversion. The proposed approach has the flexibility of choosing appropriate discount rates for gauging long-term services, as opposed to decisions based simply on duration. It improves injury identification in NRDA since potential damages and side-effects to ecosystem services are revealed within the service-specific framework. A real embankment case study demonstrates valid implementation of the method. © 2017 Society for Risk Analysis.
Benchmarking government action for obesity prevention--an innovative advocacy strategy.

PubMed

Martin, J; Peeters, A; Honisett, S; Mavoa, H; Swinburn, B; de Silva-Sanigorski, A

2014-01-01

Successful obesity prevention will require a leading role for governments, but internationally they have been slow to act. League tables of benchmark indicators of action can be a valuable advocacy and evaluation tool. To develop a benchmarking tool for government action on obesity prevention, implement it across Australian jurisdictions and to publicly award the best and worst performers. A framework was developed which encompassed nine domains, reflecting best practice government action on obesity prevention: whole-of-government approaches; marketing restrictions; access to affordable, healthy food; school food and physical activity; food in public facilities; urban design and transport; leisure and local environments; health services, and; social marketing. A scoring system was used by non-government key informants to rate the performance of their government. National rankings were generated and the results were communicated to all Premiers/Chief Ministers, the media and the national obesity research and practice community. Evaluation of the initial tool in 2010 showed it to be feasible to implement and able to discriminate the better and worse performing governments. Evaluation of the rubric in 2011 confirmed this to be a robust and useful method. In relation to government action, the best performing governments were those with whole-of-government approaches, had extended common initiatives and demonstrated innovation and strong political will. This new benchmarking tool, the Obesity Action Award, has enabled identification of leading government action on obesity prevention and the key characteristics associated with their success. We recommend this tool for other multi-state/country comparisons. Copyright © 2013 Asian Oceanian Association for the Study of Obesity. Published by Elsevier Ltd. All rights reserved.
Protein Models Docking Benchmark 2

PubMed Central

Anishchenko, Ivan; Kundrotas, Petras J.; Tuzikov, Alexander V.; Vakser, Ilya A.

2015-01-01

Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have pre-defined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the “real case scenario,” as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu. PMID:25712716
Professional Learning: Trends in State Efforts. Benchmarking State Implementation of College- and Career-Readiness Standards

ERIC Educational Resources Information Center

Anderson, Kimberly; Mire, Mary Elizabeth

2016-01-01

This report presents a multi-year study of how states are implementing their state college- and career-readiness standards. In this report, the Southern Regional Education Board's (SREB's) Benchmarking State Implementation of College- and Career-Readiness Standards project studied state efforts in 2014-15 and 2015-16 to foster effective…
Gatemon Benchmarking and Two-Qubit Operation

NASA Astrophysics Data System (ADS)

Casparis, Lucas; Larsen, Thorvald; Olsen, Michael; Petersson, Karl; Kuemmeth, Ferdinand; Krogstrup, Peter; Nygard, Jesper; Marcus, Charles

Recent experiments have demonstrated superconducting transmon qubits with semiconductor nanowire Josephson junctions. These hybrid gatemon qubits utilize field effect tunability singular to semiconductors to allow complete qubit control using gate voltages, potentially a technological advantage over conventional flux-controlled transmons. Here, we present experiments with a two-qubit gatemon circuit. We characterize qubit coherence and stability and use randomized benchmarking to demonstrate single-qubit gate errors of ~0.5 % for all gates, including voltage-controlled Z rotations. We show coherent capacitive coupling between two gatemons and coherent SWAP operations. Finally, we perform a two-qubit controlled-phase gate with an estimated fidelity of ~91 %, demonstrating the potential of gatemon qubits for building scalable quantum processors. We acknowledge financial support from Microsoft Project Q and the Danish National Research Foundation.

[Do you mean benchmarking?].

PubMed

Bonnet, F; Solignac, S; Marty, J

2008-03-01

The purpose of benchmarking is to settle improvement processes by comparing the activities to quality standards. The proposed methodology is illustrated by benchmark business cases performed inside medical plants on some items like nosocomial diseases or organization of surgery facilities. Moreover, the authors have built a specific graphic tool, enhanced with balance score numbers and mappings, so that the comparison between different anesthesia-reanimation services, which are willing to start an improvement program, is easy and relevant. This ready-made application is even more accurate as far as detailed tariffs of activities are implemented.
Evaluation Project of a Postvention Program.

ERIC Educational Resources Information Center

Simon, Robert; And Others

A student suicide or parasuicide increases the risk that potentially suicidal teenagers see suicide as an enviable option. The "copycat effect" can be reduced by a postvention program. This proposed evaluative research project will provide an implementation and impact evaluation of a school's postvention program following a suicide or…
Benchmarking, Total Quality Management, and Libraries.

ERIC Educational Resources Information Center

Shaughnessy, Thomas W.

1993-01-01

Discussion of the use of Total Quality Management (TQM) in higher education and academic libraries focuses on the identification, collection, and use of reliable data. Methods for measuring quality, including benchmarking, are described; performance measures are considered; and benchmarking techniques are examined. (11 references) (MES)
Evaluation of the Bangalore Project.

ERIC Educational Resources Information Center

Beretta, Alan; Davies, Alan

1985-01-01

Follows up an article by Brumfit on the Bangalore/Madras Communicational Teaching Project (CTP). Discusses the framework, tests, and results of a 1984 evaluation supporting the claim that grammar construction can take place through a focus on meaning alone. (SED)
StirMark Benchmark: audio watermarking attacks based on lossy compression

NASA Astrophysics Data System (ADS)

Steinebach, Martin; Lang, Andreas; Dittmann, Jana

2002-04-01

StirMark Benchmark is a well-known evaluation tool for watermarking robustness. Additional attacks are added to it continuously. To enable application based evaluation, in our paper we address attacks against audio watermarks based on lossy audio compression algorithms to be included in the test environment. We discuss the effect of different lossy compression algorithms like MPEG-2 audio Layer 3, Ogg or VQF on a selection of audio test data. Our focus is on changes regarding the basic characteristics of the audio data like spectrum or average power and on removal of embedded watermarks. Furthermore we compare results of different watermarking algorithms and show that lossy compression is still a challenge for most of them. There are two strategies for adding evaluation of robustness against lossy compression to StirMark Benchmark: (a) use of existing free compression algorithms (b) implementation of a generic lossy compression simulation. We discuss how such a model can be implemented based on the results of our tests. This method is less complex, as no real psycho acoustic model has to be applied. Our model can be used for audio watermarking evaluation of numerous application fields. As an example, we describe its importance for e-commerce applications with watermarking security.
An Unbiased Method To Build Benchmarking Sets for Ligand-Based Virtual Screening and its Application To GPCRs

PubMed Central

2015-01-01

Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the “artificial enrichment” and “analogue bias” of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD. PMID:24749745
Public Interest Energy Research (PIER) Program Development of a Computer-based Benchmarking and Analytical Tool. Benchmarking and Energy & Water Savings Tool in Dairy Plants (BEST-Dairy)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Tengfang; Flapper, Joris; Ke, Jing

The overall goal of the project is to develop a computer-based benchmarking and energy and water savings tool (BEST-Dairy) for use in the California dairy industry - including four dairy processes - cheese, fluid milk, butter, and milk powder. BEST-Dairy tool developed in this project provides three options for the user to benchmark each of the dairy product included in the tool, with each option differentiated based on specific detail level of process or plant, i.e., 1) plant level; 2) process-group level, and 3) process-step level. For each detail level, the tool accounts for differences in production and other variablesmore » affecting energy use in dairy processes. The dairy products include cheese, fluid milk, butter, milk powder, etc. The BEST-Dairy tool can be applied to a wide range of dairy facilities to provide energy and water savings estimates, which are based upon the comparisons with the best available reference cases that were established through reviewing information from international and national samples. We have performed and completed alpha- and beta-testing (field testing) of the BEST-Dairy tool, through which feedback from voluntary users in the U.S. dairy industry was gathered to validate and improve the tool's functionality. BEST-Dairy v1.2 was formally published in May 2011, and has been made available for free downloads from the internet (i.e., http://best-dairy.lbl.gov). A user's manual has been developed and published as the companion documentation for use with the BEST-Dairy tool. In addition, we also carried out technology transfer activities by engaging the dairy industry in the process of tool development and testing, including field testing, technical presentations, and technical assistance throughout the project. To date, users from more than ten countries in addition to those in the U.S. have downloaded the BEST-Dairy from the LBNL website. It is expected that the use of BEST-Dairy tool will advance understanding of energy and
Benchmarking Helps Measure Union Programs, Operations.

ERIC Educational Resources Information Center

Mann, Jerry

2001-01-01

Explores three examples of benchmarking by college student unions. Focuses on how a union can collect information from other unions for use as benchmarking standards for the purposes of selling a concept or justifying program increases, or for comparing a union's financial performance to other unions. (EV)
Constructing Benchmark Databases and Protocols for Medical Image Analysis: Diabetic Retinopathy

PubMed Central

Kauppi, Tomi; Kämäräinen, Joni-Kristian; Kalesnykiene, Valentina; Sorri, Iiris; Uusitalo, Hannu; Kälviäinen, Heikki

2013-01-01

We address the performance evaluation practices for developing medical image analysis methods, in particular, how to establish and share databases of medical images with verified ground truth and solid evaluation protocols. Such databases support the development of better algorithms, execution of profound method comparisons, and, consequently, technology transfer from research laboratories to clinical practice. For this purpose, we propose a framework consisting of reusable methods and tools for the laborious task of constructing a benchmark database. We provide a software tool for medical image annotation helping to collect class label, spatial span, and expert's confidence on lesions and a method to appropriately combine the manual segmentations from multiple experts. The tool and all necessary functionality for method evaluation are provided as public software packages. As a case study, we utilized the framework and tools to establish the DiaRetDB1 V2.1 database for benchmarking diabetic retinopathy detection algorithms. The database contains a set of retinal images, ground truth based on information from multiple experts, and a baseline algorithm for the detection of retinopathy lesions. PMID:23956787
Evaluating Soil Health Using Remotely Sensed Evapotranspiration on the Benchmark Barnes Soils of North Dakota

NASA Astrophysics Data System (ADS)

Bohn, Meyer; Hopkins, David; Steele, Dean; Tuscherer, Sheldon

2017-04-01

The benchmark Barnes soil series is an extensive upland Hapludoll of the northern Great Plains that is both economically and ecologically vital to the region. Effects of tillage erosion coupled with wind and water erosion have degraded Barnes soil quality, but with unknown extent, distribution, or severity. Evidence of soil degradation documented for a half century warrants that the assumption of productivity be tested. Soil resilience is linked to several dynamic soil properties and National Cooperative Soil Survey initiatives are now focused on identifying those properties for benchmark soils. Quantification of soil degradation is dependent on a reliable method for broad-scale evaluation. The soil survey community is currently developing rapid and widespread soil property assessment technologies. Improvements in satellite based remote-sensing and image analysis software have stimulated the application of broad-scale resource assessment. Furthermore, these technologies have fostered refinement of land-based surface energy balance algorithms, i.e. Mapping Evapotranspiration at High Resolution with Internalized Calibration (METRIC) algorithm for evapotranspiration (ET) mapping. The hypothesis of this study is that ET mapping technology can differentiate soil function on extensive landscapes and identify degraded areas. A recent soil change study in eastern North Dakota resampled legacy Barnes pedons sampled prior to 1960 and found significant decreases in organic carbon. An ancillary study showed that evapotranspiration (ET) estimates from METRIC decreased with Barnes erosion class severity. An ET raster map has been developed for three eastern North Dakota counties using METRIC and Landsat 5 imagery. ET pixel candidates on major Barnes soil map units were stratified into tertiles and classified as ranked ET subdivisions. A sampling population of randomly selected points stratified by ET class and county proportion was established. Morphologic and chemical data will
Healthy city projects in developing countries: the first evaluation.

PubMed

Harpham, T; Burton, S; Blue, I

2001-06-01

The 'healthy city' concept has only recently been adopted in developing countries. From 1995 to 1999, the World Health Organization (WHO), Geneva, supported healthy city projects (HCPs) in Cox's Bazar (Bangladesh), Dar es Salaam (Tanzania), Fayoum (Egypt), Managua (Nicaragua) and Quetta (Pakistan). The authors evaluated four of these projects, representing the first major evaluation of HCPs in developing countries. Methods used were stakeholder analysis, workshops, document analysis and interviews with 102 managers/implementers and 103 intended beneficiaries. Municipal health plan development (one of the main components of the healthy city strategy) in these cities was limited, which is a similar finding to evaluations of HCPs in Europe. The main activities selected by the projects were awareness raising and environmental improvements, particularly solid waste disposal. Two of the cities effectively used the 'settings' approach of the healthy city concept, whereby places such as markets and schools are targeted. The evaluation found that stakeholder involvement varied in relation to: (i) the level of knowledge of the project; (ii) the project office location; (iii) the project management structure; and (iv) type of activities (ranging from low stakeholder involvement in capital-intensive infrastructure projects, to high in some settings-type activities). There was evidence to suggest that understanding of environment-health links was increased across stakeholders. There was limited political commitment to the healthy city projects, perhaps due to the fact that most of the municipalities had not requested the projects. Consequently, the projects had little influence on written/expressed municipal policies. Some of the projects mobilized considerable resources, and most projects achieved effective intersectoral collaboration. WHO support enabled the project coordinators to network at national and international levels, and the capacity of these individuals (although
Benchmark Study of Global Clean Energy Manufacturing | Advanced

Science.gov Websites

Manufacturing Research | NREL Benchmark Study of Global Clean Energy Manufacturing Benchmark Study of Global Clean Energy Manufacturing Through a first-of-its-kind benchmark study, the Clean Energy Technology End Product.' The study examined four clean energy technologies: wind turbine components
Benchmarking: contexts and details matter.

PubMed

Zheng, Siyuan

2017-07-05

Benchmarking is an essential step in the development of computational tools. We take this opportunity to pitch in our opinions on tool benchmarking, in light of two correspondence articles published in Genome Biology.Please see related Li et al. and Newman et al. correspondence articles: www.dx.doi.org/10.1186/s13059-017-1256-5 and www.dx.doi.org/10.1186/s13059-017-1257-4.
A benchmark study of scoring methods for non-coding mutations.

PubMed

Drubay, Damien; Gautheret, Daniel; Michiels, Stefan

2018-05-15

Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison has been realized to date to assess their performance. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 genomes project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.
Aeroelasticity Benchmark Assessment: Subsonic Fixed Wing Program

NASA Technical Reports Server (NTRS)

Florance, Jennifer P.; Chwalowski, Pawel; Wieseman, Carol D.

2010-01-01

The fundamental technical challenge in computational aeroelasticity is the accurate prediction of unsteady aerodynamic phenomena and the effect on the aeroelastic response of a vehicle. Currently, a benchmarking standard for use in validating the accuracy of computational aeroelasticity codes does not exist. Many aeroelastic data sets have been obtained in wind-tunnel and flight testing throughout the world; however, none have been globally presented or accepted as an ideal data set. There are numerous reasons for this. One reason is that often, such aeroelastic data sets focus on the aeroelastic phenomena alone (flutter, for example) and do not contain associated information such as unsteady pressures and time-correlated structural dynamic deflections. Other available data sets focus solely on the unsteady pressures and do not address the aeroelastic phenomena. Other discrepancies can include omission of relevant data, such as flutter frequency and / or the acquisition of only qualitative deflection data. In addition to these content deficiencies, all of the available data sets present both experimental and computational technical challenges. Experimental issues include facility influences, nonlinearities beyond those being modeled, and data processing. From the computational perspective, technical challenges include modeling geometric complexities, coupling between the flow and the structure, grid issues, and boundary conditions. The Aeroelasticity Benchmark Assessment task seeks to examine the existing potential experimental data sets and ultimately choose the one that is viewed as the most suitable for computational benchmarking. An initial computational evaluation of that configuration will then be performed using the Langley-developed computational fluid dynamics (CFD) software FUN3D1 as part of its code validation process. In addition to the benchmarking activity, this task also includes an examination of future research directions. Researchers within the
GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods.

PubMed

Schaffter, Thomas; Marbach, Daniel; Floreano, Dario

2011-08-15

Over the last decade, numerous methods have been developed for inference of regulatory networks from gene expression data. However, accurate and systematic evaluation of these methods is hampered by the difficulty of constructing adequate benchmarks and the lack of tools for a differentiated analysis of network predictions on such benchmarks. Here, we describe a novel and comprehensive method for in silico benchmark generation and performance profiling of network inference methods available to the community as an open-source software called GeneNetWeaver (GNW). In addition to the generation of detailed dynamical models of gene regulatory networks to be used as benchmarks, GNW provides a network motif analysis that reveals systematic prediction errors, thereby indicating potential ways of improving inference methods. The accuracy of network inference methods is evaluated using standard metrics such as precision-recall and receiver operating characteristic curves. We show how GNW can be used to assess the performance and identify the strengths and weaknesses of six inference methods. Furthermore, we used GNW to provide the international Dialogue for Reverse Engineering Assessments and Methods (DREAM) competition with three network inference challenges (DREAM3, DREAM4 and DREAM5). GNW is available at http://gnw.sourceforge.net along with its Java source code, user manual and supporting data. Supplementary data are available at Bioinformatics online. dario.floreano@epfl.ch.
Diagnostic Algorithm Benchmarking

NASA Technical Reports Server (NTRS)

Poll, Scott

2011-01-01

A poster for the NASA Aviation Safety Program Annual Technical Meeting. It describes empirical benchmarking on diagnostic algorithms using data from the ADAPT Electrical Power System testbed and a diagnostic software framework.
Project Aprendizaje. Final Evaluation Report 1992-93.

ERIC Educational Resources Information Center

Clark, Andrew

This report provides evaluative information regarding the effectiveness of Project Aprendizaje, a New York City program that served 269 Spanish-speaking students of limited English proficiency (LEP). The project promoted parent and community involvement by sponsoring cultural events, such as a large Latin American festival. Students developed…
Validating Cellular Automata Lava Flow Emplacement Algorithms with Standard Benchmarks

NASA Astrophysics Data System (ADS)

Richardson, J. A.; Connor, L.; Charbonnier, S. J.; Connor, C.; Gallant, E.

2015-12-01

A major existing need in assessing lava flow simulators is a common set of validation benchmark tests. We propose three levels of benchmarks which test model output against increasingly complex standards. First, imulated lava flows should be morphologically identical, given changes in parameter space that should be inconsequential, such as slope direction. Second, lava flows simulated in simple parameter spaces can be tested against analytical solutions or empirical relationships seen in Bingham fluids. For instance, a lava flow simulated on a flat surface should produce a circular outline. Third, lava flows simulated over real world topography can be compared to recent real world lava flows, such as those at Tolbachik, Russia, and Fogo, Cape Verde. Success or failure of emplacement algorithms in these validation benchmarks can be determined using a Bayesian approach, which directly tests the ability of an emplacement algorithm to correctly forecast lava inundation. Here we focus on two posterior metrics, P(A|B) and P(¬A|¬B), which describe the positive and negative predictive value of flow algorithms. This is an improvement on less direct statistics such as model sensitivity and the Jaccard fitness coefficient. We have performed these validation benchmarks on a new, modular lava flow emplacement simulator that we have developed. This simulator, which we call MOLASSES, follows a Cellular Automata (CA) method. The code is developed in several interchangeable modules, which enables quick modification of the distribution algorithm from cell locations to their neighbors. By assessing several different distribution schemes with the benchmark tests, we have improved the performance of MOLASSES to correctly match early stages of the 2012-3 Tolbachik Flow, Kamchakta Russia, to 80%. We also can evaluate model performance given uncertain input parameters using a Monte Carlo setup. This illuminates sensitivity to model uncertainty.
An Evaluation of the Community Education Demonstration Projects.

ERIC Educational Resources Information Center

Wynne, Ronald D.

The report summarizes a six-month evaluation (July-December 1967) of 11 community education projects funded by the Office of Economic Opportunity. Major focus is on eight projects which comprised a special series of Section 207 Community Education Demonstration Projects--Appalachian Volunteers, Inc., the Columbia College Citizenship Council…

Academic Productivity in Psychiatry: Benchmarks for the H-Index.

PubMed

MacMaster, Frank P; Swansburg, Rose; Rittenbach, Katherine

2017-08-01

Bibliometrics play an increasingly critical role in the assessment of faculty for promotion and merit increases. Bibliometrics is the statistical analysis of publications, aimed at evaluating their impact. The objective of this study is to describe h-index and citation benchmarks in academic psychiatry. Faculty lists were acquired from online resources for all academic departments of psychiatry listed as having residency training programs in Canada (as of June 2016). Potential authors were then searched on Web of Science (Thomson Reuters) for their corresponding h-index and total number of citations. The sample included 1683 faculty members in academic psychiatry departments. Restricted to those with a rank of assistant, associate, or full professor resulted in 1601 faculty members (assistant = 911, associate = 387, full = 303). h-index and total citations differed significantly by academic rank. Both were highest in the full professor rank, followed by associate, then assistant. The range in each, however, was large. This study provides the initial benchmarks for the h-index and total citations in academic psychiatry. Regardless of any controversies or criticisms of bibliometrics, they are increasingly influencing promotion, merit increases, and grant support. As such, benchmarking by specialties is needed in order to provide needed context.
Economic evaluation of environmental epidemiological projects in national industrial complexes.

PubMed

Shin, Youngchul

2017-01-01

In this economic evaluation of environmental epidemiological monitoring projects, we analyzed the economic feasibility of these projects by determining the social cost and benefit of these projects and conducting a cost/benefit analysis. Here, the social cost was evaluated by converting annual budgets for these research and survey projects into present values. Meanwhile, the societal benefit of these projects was evaluated by using the contingent valuation method to estimate the willingness-to-pay of residents living in or near industrial complexes. In addition, the extent to which these projects reduced negative health effects (i.e., excess disease and premature death) was evaluated through expert surveys, and the analysis was conducted to reflect the unit of economic value, based on the cost of illness and benefit transfer method. The results were then used to calculate the benefit of these projects in terms of the decrease in negative health effects. For residents living near industrial complexes, the benefit/cost ratio was 1.44 in the analysis based on resident surveys and 5.17 in the analysis based on expert surveys. Thus, whichever method was used for the economic analysis, the economic feasibility of these projects was confirmed.
Preview: Evaluation of the 1973-1974 Bilingual/Bicultural Project. Formative Evaluation Report.

ERIC Educational Resources Information Center

Ligon, Glynn; And Others

The formative report provided the Austin Independent School District personnel with information useful for planning the remaining activities for the 1973-74 Bilingual/Bicultural Project and the activities for the 1974-75 Project. Emphasis was on what had been done to evaluate the 1973-74 Project, the data which was or would be available for the…
Benchmarking hypercube hardware and software

NASA Technical Reports Server (NTRS)

Grunwald, Dirk C.; Reed, Daniel A.

1986-01-01

It was long a truism in computer systems design that balanced systems achieve the best performance. Message passing parallel processors are no different. To quantify the balance of a hypercube design, an experimental methodology was developed and the associated suite of benchmarks was applied to several existing hypercubes. The benchmark suite includes tests of both processor speed in the absence of internode communication and message transmission speed as a function of communication patterns.
MicroRNA array normalization: an evaluation using a randomized dataset as the benchmark.

PubMed

Qin, Li-Xuan; Zhou, Qin

2014-01-01

MicroRNA arrays possess a number of unique data features that challenge the assumption key to many normalization methods. We assessed the performance of existing normalization methods using two microRNA array datasets derived from the same set of tumor samples: one dataset was generated using a blocked randomization design when assigning arrays to samples and hence was free of confounding array effects; the second dataset was generated without blocking or randomization and exhibited array effects. The randomized dataset was assessed for differential expression between two tumor groups and treated as the benchmark. The non-randomized dataset was assessed for differential expression after normalization and compared against the benchmark. Normalization improved the true positive rate significantly in the non-randomized data but still possessed a false discovery rate as high as 50%. Adding a batch adjustment step before normalization further reduced the number of false positive markers while maintaining a similar number of true positive markers, which resulted in a false discovery rate of 32% to 48%, depending on the specific normalization method. We concluded the paper with some insights on possible causes of false discoveries to shed light on how to improve normalization for microRNA arrays.
Benchmarking for Excellence and the Nursing Process

NASA Technical Reports Server (NTRS)

Sleboda, Claire

1999-01-01

Nursing is a service profession. The services provided are essential to life and welfare. Therefore, setting the benchmark for high quality care is fundamental. Exploring the definition of a benchmark value will help to determine a best practice approach. A benchmark is the descriptive statement of a desired level of performance against which quality can be judged. It must be sufficiently well understood by managers and personnel in order that it may serve as a standard against which to measure value.
Benchmark dynamics in the environmental performance of ports.

PubMed

Puig, Martí; Michail, Antonis; Wooldridge, Chris; Darbra, Rosa Mari

2017-08-15

This paper analyses the 2016 environmental benchmark performance of the port sector, based on a wide representation of EcoPorts members. This is the fifth time that this study has been conducted as an initiative of the European Sea Ports Organisation (ESPO). The data and results are derived from the Self-Diagnosis Method (SDM), a concise checklist against which port managers can self-assess the environmental management of their port in relation to the performance of the EcoPorts membership. The SDM tool was developed in the framework of the ECOPORTS project (2002-2005) and it is managed by ESPO. A total number of 91 ports from 20 different European Maritime States contributed to this evaluation. The main results are that air quality remains as the top environmental priority of the respondent ports, followed by energy consumption and noise. In terms of environmental management, the study confirms that key components are commonly implemented in the majority of European ports. 94% of contributing ports have a designated environmental manager, 92% own an environmental policy and 82% implement an environmental monitoring program. Waste is identified as the most monitored issue in ports (80%), followed by energy consumption (73%) and water quality (70%). Copyright © 2017 Elsevier Ltd. All rights reserved.
Methodology for evaluation of railroad technology research projects

DOT National Transportation Integrated Search

1981-04-01

This Project memorandum presents a methodology for evaluating railroad research projects. The methodology includes consideration of industry and societal benefits, with special attention given to technical risks, implementation considerations, and po...
School-Based Cognitive-Behavioral Therapy for Adolescent Depression: A Benchmarking Study

ERIC Educational Resources Information Center

Shirk, Stephen R.; Kaplinski, Heather; Gudmundsen, Gretchen

2009-01-01

The current study evaluated cognitive-behavioral therapy (CBT) for adolescent depression delivered in health clinics and counseling centers in four high schools. Outcomes were benchmarked to results from prior efficacy trials. Fifty adolescents diagnosed with depressive disorders were treated by eight doctoral-level psychologists who followed a…
Toward Scalable Benchmarks for Mass Storage Systems

NASA Technical Reports Server (NTRS)

Miller, Ethan L.

1996-01-01

This paper presents guidelines for the design of a mass storage system benchmark suite, along with preliminary suggestions for programs to be included. The benchmarks will measure both peak and sustained performance of the system as well as predicting both short- and long-term behavior. These benchmarks should be both portable and scalable so they may be used on storage systems from tens of gigabytes to petabytes or more. By developing a standard set of benchmarks that reflect real user workload, we hope to encourage system designers and users to publish performance figures that can be compared with those of other systems. This will allow users to choose the system that best meets their needs and give designers a tool with which they can measure the performance effects of improvements to their systems.
An Evaluation of Project Gifted 1971-1972.

ERIC Educational Resources Information Center

Renzulli, Joseph S.

Evaluated was Project Gifted, a tri-city (Cranston, East Providence, and Warwick, Rhode Island) program which focused on the training of gifted children in grades 4-6 in the creative thinking process. Project goals were identification of gifted students, development of differential experiences, and development of innovative programs. Cranston's…
Risk evaluation of highway engineering project based on the fuzzy-AHP

NASA Astrophysics Data System (ADS)

Yang, Qian; Wei, Yajun

2011-10-01

Engineering projects are social activities, which integrate with technology, economy, management and organization. There are uncertainties in each respect of engineering projects, and it needs to strengthen risk management urgently. Based on the analysis of the characteristics of highway engineering, and the study of the basic theory on risk evaluation, the paper built an index system of highway project risk evaluation. Besides based on fuzzy mathematics principle, analytical hierarchy process was used and as a result, the model of the comprehensive appraisal method of fuzzy and AHP was set up for the risk evaluation of express way concessionary project. The validity and the practicability of the risk evaluation of expressway concessionary project were verified after the model was applied to the practice of a project.
A portfolio evaluation framework for air transportation improvement projects

NASA Astrophysics Data System (ADS)

Baik, Hyeoncheol

This thesis explores the application of portfolio theory to the Air Transportation System (ATS) improvement. The ATS relies on complexly related resources and different stakeholder groups. Moreover, demand for air travel is significantly increasing relative to capacity of air transportation. In this environment, improving the ATS is challenging. Many projects, which are defined as technologies or initiatives, for improvement have been proposed and some have been demonstrated in practice. However, there is no clear understanding of how well these projects work in different conditions nor of how they interact with each other or with existing systems. These limitations make it difficult to develop good project combinations, or portfolios that maximize improvement. To help address this gap, a framework for identifying good portfolios is proposed. The framework can be applied to individual projects or portfolios of projects. Projects or portfolios are evaluated using four different groups of factors (effectiveness, time-to-implement, scope of applicability, and stakeholder impacts). Portfolios are also evaluated in terms of interaction-determining factors (prerequisites, co-requisites, limiting factors, and amplifying factors) because, while a given project might work well in isolation, interdependencies between projects or with existing systems could result in lower overall performance in combination. Ways to communicate a portfolio to decision makers are also introduced. The framework is unique because (1) it allows using a variety of available data, and (2) it covers diverse benefit metrics. For demonstrating the framework, an application to ground delay management projects serves as a case study. The portfolio evaluation approach introduced in this thesis can aid decision makers and researchers at universities and aviation agencies such as Federal Aviation Administration (FAA), National Aeronautics and Space Administration (NASA), and Department of Defense (DoD), in
Small Business Learning through Mentoring: Evaluating a Project

ERIC Educational Resources Information Center

Barrett, Rowena

2006-01-01

Purpose: The purpose of this paper is to evaluate a small business-mentoring project, which was delivered in regional Australia. Design/methodology/approach: This paper contains a case study of the mentoring project and focuses on the process and the outcomes of that project from different perspectives. Data collected in semi structured telephone…
34 CFR 636.21 - What selection criteria does the Secretary use to evaluate an application?

Code of Federal Regulations, 2011 CFR

2011-07-01

...) Agencies of local government. (ii) Public and private elementary and secondary schools. (iii) Business... implementation strategy for each key project component activity is— (i) Comprehensive; (ii) Based on a sound... operation; (5) Describe a time-line chart that relates key evaluation processes and benchmarks to other...
34 CFR 636.21 - What selection criteria does the Secretary use to evaluate an application?

Code of Federal Regulations, 2014 CFR

2014-07-01

...) Agencies of local government. (ii) Public and private elementary and secondary schools. (iii) Business... implementation strategy for each key project component activity is— (i) Comprehensive; (ii) Based on a sound... operation; (5) Describe a time-line chart that relates key evaluation processes and benchmarks to other...
34 CFR 636.21 - What selection criteria does the Secretary use to evaluate an application?

Code of Federal Regulations, 2013 CFR

2013-07-01

...) Agencies of local government. (ii) Public and private elementary and secondary schools. (iii) Business... implementation strategy for each key project component activity is— (i) Comprehensive; (ii) Based on a sound... operation; (5) Describe a time-line chart that relates key evaluation processes and benchmarks to other...
34 CFR 636.21 - What selection criteria does the Secretary use to evaluate an application?

Code of Federal Regulations, 2012 CFR

2012-07-01

...) Agencies of local government. (ii) Public and private elementary and secondary schools. (iii) Business... implementation strategy for each key project component activity is— (i) Comprehensive; (ii) Based on a sound... operation; (5) Describe a time-line chart that relates key evaluation processes and benchmarks to other...
Evaluation of anode (electro)catalytic materials for the direct borohydride fuel cell: Methods and benchmarks

NASA Astrophysics Data System (ADS)

Olu, Pierre-Yves; Job, Nathalie; Chatenet, Marian

2016-09-01

In this paper, different methods are discussed for the evaluation of the potential of a given catalyst, in view of an application as a direct borohydride fuel cell DBFC anode material. Characterizations results in DBFC configuration are notably analyzed at the light of important experimental variables which influence the performances of the DBFC. However, in many practical DBFC-oriented studies, these various experimental variables prevent one to isolate the influence of the anode catalyst on the cell performances. Thus, the electrochemical three-electrode cell is a widely-employed and useful tool to isolate the DBFC anode catalyst and to investigate its electrocatalytic activity towards the borohydride oxidation reaction (BOR) in the absence of other limitations. This article reviews selected results for different types of catalysts in electrochemical cell containing a sodium borohydride alkaline electrolyte. In particular, propositions of common experimental conditions and benchmarks are given for practical evaluation of the electrocatalytic activity towards the BOR in three-electrode cell configuration. The major issue of gaseous hydrogen generation and escape upon DBFC operation is also addressed through a comprehensive review of various results depending on the anode composition. At last, preliminary concerns are raised about the stability of potential anode catalysts upon DBFC operation.
ASPER Research and Evaluation Projects 1970-79.

ERIC Educational Resources Information Center

1980

This inventory of research and evaluation projects completed during calendar years 1970-1979 for the Office of the Assistant Secretary for Policy, Education, and Research (ASPER) of the Department of Labor contains summaries of projects on economic, social, and policy background; the labor market; the nature and impact of Department of Labor…

Childhood Obesity Research Demonstration Project: Cross-Site Evaluation Methods

PubMed Central

Lee, Rebecca E.; Mehta, Paras; Thompson, Debbe; Bhargava, Alok; Carlson, Coleen; Kao, Dennis; Layne, Charles S.; Ledoux, Tracey; O'Connor, Teresia; Rifai, Hanadi; Gulley, Lauren; Hallett, Allen M.; Kudia, Ousswa; Joseph, Sitara; Modelska, Maria; Ortega, Dana; Parker, Nathan; Stevens, Andria

2015-01-01

Abstract Introduction: The Childhood Obesity Research Demonstration (CORD) project links public health and primary care interventions in three projects described in detail in accompanying articles in this issue of Childhood Obesity. This article describes a comprehensive evaluation plan to determine the extent to which the CORD model is associated with changes in behavior, body weight, BMI, quality of life, and healthcare satisfaction in children 2–12 years of age. Design/Methods: The CORD Evaluation Center (EC-CORD) will analyze the pooled data from three independent demonstration projects that each integrate public health and primary care childhood obesity interventions. An extensive set of common measures at the family, facility, and community levels were defined by consensus among the CORD projects and EC-CORD. Process evaluation will assess reach, dose delivered, and fidelity of intervention components. Impact evaluation will use a mixed linear models approach to account for heterogeneity among project-site populations and interventions. Sustainability evaluation will assess the potential for replicability, continuation of benefits beyond the funding period, institutionalization of the intervention activities, and community capacity to support ongoing program delivery. Finally, cost analyses will assess how much benefit can potentially be gained per dollar invested in programs based on the CORD model. Conclusions: The keys to combining and analyzing data across multiple projects include the CORD model framework and common measures for the behavioral and health outcomes along with important covariates at the individual, setting, and community levels. The overall objective of the comprehensive evaluation will develop evidence-based recommendations for replicating and disseminating community-wide, integrated public health and primary care programs based on the CORD model. PMID:25679060
Evaluation of 1983 selective speed enforcement projects in Virginia.

DOT National Transportation Integrated Search

1985-01-01

This report describes and evaluates Virginia's 1983 selective speed enforcement projects. These projects are one of the various types of highway safety programs, classified as selective traffic enforcement projects (STEPs) partially funded by the fed...
The NIEHS Predictive-Toxicology Evaluation Project.

PubMed Central

Bristol, D W; Wachsman, J T; Greenwell, A

1996-01-01

The Predictive-Toxicology Evaluation (PTE) project conducts collaborative experiments that subject the performance of predictive-toxicology (PT) methods to rigorous, objective evaluation in a uniquely informative manner. Sponsored by the National Institute of Environmental Health Sciences, it takes advantage of the ongoing testing conducted by the U.S. National Toxicology Program (NTP) to estimate the true error of models that have been applied to make prospective predictions on previously untested, noncongeneric-chemical substances. The PTE project first identifies a group of standardized NTP chemical bioassays either scheduled to be conducted or are ongoing, but not yet complete. The project then announces and advertises the evaluation experiment, disseminates information about the chemical bioassays, and encourages researchers from a wide variety of disciplines to publish their predictions in peer-reviewed journals, using whatever approaches and methods they feel are best. A collection of such papers is published in this Environmental Health Perspectives Supplement, providing readers the opportunity to compare and contrast PT approaches and models, within the context of their prospective application to an actual-use situation. This introduction to this collection of papers on predictive toxicology summarizes the predictions made and the final results obtained for the 44 chemical carcinogenesis bioassays of the first PTE experiment (PTE-1) and presents information that identifies the 30 chemical carcinogenesis bioassays of PTE-2, along with a table of prediction sets that have been published to date. It also provides background about the origin and goals of the PTE project, outlines the special challenge associated with estimating the true error of models that aspire to predict open-system behavior, and summarizes what has been learned to date. PMID:8933048
Modelling in Evaluating a Working Life Project in Higher Education

ERIC Educational Resources Information Center

Sarja, Anneli; Janhonen, Sirpa; Havukainen, Pirjo; Vesterinen, Anne

2012-01-01

This article describes an evaluation method based on collaboration between the higher education, a care home and university, in a R&D project. The aim of the project was to elaborate modelling as a tool of developmental evaluation for innovation and competence in project cooperation. The approach was based on activity theory. Modelling enabled a…
Reliable B Cell Epitope Predictions: Impacts of Method Development and Improved Benchmarking

PubMed Central

Kringelum, Jens Vindahl; Lundegaard, Claus; Lund, Ole; Nielsen, Morten

2012-01-01

The interaction between antibodies and antigens is one of the most important immune system mechanisms for clearing infectious organisms from the host. Antibodies bind to antigens at sites referred to as B-cell epitopes. Identification of the exact location of B-cell epitopes is essential in several biomedical applications such as; rational vaccine design, development of disease diagnostics and immunotherapeutics. However, experimental mapping of epitopes is resource intensive making in silico methods an appealing complementary approach. To date, the reported performance of methods for in silico mapping of B-cell epitopes has been moderate. Several issues regarding the evaluation data sets may however have led to the performance values being underestimated: Rarely, all potential epitopes have been mapped on an antigen, and antibodies are generally raised against the antigen in a given biological context not against the antigen monomer. Improper dealing with these aspects leads to many artificial false positive predictions and hence to incorrect low performance values. To demonstrate the impact of proper benchmark definitions, we here present an updated version of the DiscoTope method incorporating a novel spatial neighborhood definition and half-sphere exposure as surface measure. Compared to other state-of-the-art prediction methods, Discotope-2.0 displayed improved performance both in cross-validation and in independent evaluations. Using DiscoTope-2.0, we assessed the impact on performance when using proper benchmark definitions. For 13 proteins in the training data set where sufficient biological information was available to make a proper benchmark redefinition, the average AUC performance was improved from 0.791 to 0.824. Similarly, the average AUC performance on an independent evaluation data set improved from 0.712 to 0.727. Our results thus demonstrate that given proper benchmark definitions, B-cell epitope prediction methods achieve highly significant
78 FR 27957 - Fisheries of the South Atlantic, Southeast Data, Assessment, and Review (SEDAR); Public Meetings

Federal Register 2010, 2011, 2012, 2013, 2014

2013-05-13

..., describes the fisheries, evaluates the status of the stock, estimates biological benchmarks, projects future.... Participants will evaluate and recommend datasets appropriate for assessment analysis, employ assessment models to evaluate stock status, estimate population benchmarks and management criteria, and project future...
The National Evaluation of Project Follow Through: 1967-1968.

ERIC Educational Resources Information Center

Kurfeerst, Marvin; Stephens, Thomas M.

Twenty-nine projects evaluated during the 1967-68 school year in the national evaluation of Project Follow Through are discussed in this final report. The report provides, in a narrative format, only the "hard" data gathered during the pre-test period. Specifically, this report reviews relevant literature, describes the procedures and…
Designing an External Evaluation of a Large-Scale Software Development Project.

ERIC Educational Resources Information Center

Collis, Betty; Moonen, Jef

This paper describes the design and implementation of the evaluation of the POCO Project, a large-scale national software project in the Netherlands which incorporates the perspective of an evaluator throughout the entire span of the project, and uses the experiences gained from it to suggest an evaluation procedure that could be applied to other…
42 CFR 457.430 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 42 Public Health 4 2011-10-01 2011-10-01 false Benchmark-equivalent health benefits coverage. 457... STATES State Plan Requirements: Coverage and Benefits § 457.430 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has...
42 CFR 457.430 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 42 Public Health 4 2013-10-01 2013-10-01 false Benchmark-equivalent health benefits coverage. 457... STATES State Plan Requirements: Coverage and Benefits § 457.430 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has...
42 CFR 457.430 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 42 Public Health 4 2010-10-01 2010-10-01 false Benchmark-equivalent health benefits coverage. 457... STATES State Plan Requirements: Coverage and Benefits § 457.430 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has...
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 42 Public Health 4 2012-10-01 2012-10-01 false Benchmark-equivalent health benefits coverage. 440.335 Section 440.335 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a...
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 42 Public Health 4 2014-10-01 2014-10-01 false Benchmark-equivalent health benefits coverage. 440.335 Section 440.335 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a...
Evaluation of observation-driven evaporation algorithms: results of the WACMOS-ET project

NASA Astrophysics Data System (ADS)

Miralles, Diego G.; Jimenez, Carlos; Ershadi, Ali; McCabe, Matthew F.; Michel, Dominik; Hirschi, Martin; Seneviratne, Sonia I.; Jung, Martin; Wood, Eric F.; (Bob) Su, Z.; Timmermans, Joris; Chen, Xuelong; Fisher, Joshua B.; Mu, Quiaozen; Fernandez, Diego

2015-04-01

scales for the 2005-2007 reference period will be disclosed. The skill of these algorithms to close the water balance over the continents will be assessed by comparisons to runoff data. The consistency in forcing data will allow to (a) evaluate the skill of these five algorithms in producing ET over particular ecosystems, (b) facilitate the attribution of the observed differences to either algorithms or driving data, and (c) set up a solid scientific basis for the development of global long-term benchmark ET products. Project progress can be followed on our website http://wacmoset.estellus.eu. REFERENCES Fisher, J. B., Tu, K.P., and Baldocchi, D.D. Global estimates of the land-atmosphere water flux based on monthly AVHRR and ISLSCP-II data, validated at 16 FLUXNET sites. Remote Sens. Environ. 112, 901-919, 2008. Jiménez, C. et al. Global intercomparison of 12 land surface heat flux estimates. J. Geophys. Res. 116, D02102, 2011. Jung, M. et al. Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature 467, 951-954, 2010. Miralles, D.G. et al. Global land-surface evaporation estimated from satellite-based observations. Hydrol. Earth Syst. Sci. 15, 453-469, 2011. Mu, Q., Zhao, M. & Running, S.W. Improvements to a MODIS global terrestrial evapotranspiration algorithm. Remote Sens. Environ. 115, 1781-1800, 2011. Mueller, B. et al. Benchmark products for land evapotranspiration: LandFlux-EVAL multi- dataset synthesis. Hydrol. Earth Syst. Sci. 17, 3707-3720, 2013. Su, Z. The Surface Energy Balance System (SEBS) for estimation of turbulent heat fluxes. Hydrol. Earth Syst. Sci. 6, 85-99, 2002.
Chattanooga SmartBus Project : phase 2 evaluation report

DOT National Transportation Integrated Search

2008-06-10

This report presents the results of Phase II of the national evaluation of the Chattanooga Area Regional Transportation Authoritys (CARTAs) SmartBus Project. The Smartbus Project is a comprehensive transit ITS program for the city of Chattanoog...
Taking the Battle Upstream: Towards a Benchmarking Role for NATO

DTIC Science & Technology

2012-09-01

Benchmark.........................................................................................14 Figure 8. World Bank Benchmarking Work on Quality...Search of a Benchmarking Theory for the Public Sector.” 16 Figure 8. World Bank Benchmarking Work on Quality of Governance One of the most...the Ministries of Defense in the countries in which it works ). Another interesting innovation is that for comparison purposes, McKinsey categorized
Kentucky Migrant Technology Project: External Evaluation Report, 1997-98.

ERIC Educational Resources Information Center

Popp, Robert J.

During its first year of operation (1997-98), the Kentucky Migrant Technology Project successfully implemented its model, used internal and external evaluations to inform improvement of the model, and began plans for expansion into new service areas. This evaluation report is organized around five questions that focus on the project model and its…
Human Health Benchmarks for Pesticides

EPA Pesticide Factsheets

Advanced testing methods now allow pesticides to be detected in water at very low levels. These small amounts of pesticides detected in drinking water or source water for drinking water do not necessarily indicate a health risk. The EPA has developed human health benchmarks for 363 pesticides to enable our partners to better determine whether the detection of a pesticide in drinking water or source waters for drinking water may indicate a potential health risk and to help them prioritize monitoring efforts.The table below includes benchmarks for acute (one-day) and chronic (lifetime) exposures for the most sensitive populations from exposure to pesticides that may be found in surface or ground water sources of drinking water. The table also includes benchmarks for 40 pesticides in drinking water that have the potential for cancer risk. The HHBP table includes pesticide active ingredients for which Health Advisories or enforceable National Primary Drinking Water Regulations (e.g., maximum contaminant levels) have not been developed.
A new numerical benchmark of a freshwater lens

NASA Astrophysics Data System (ADS)

Stoeckl, L.; Walther, M.; Graf, T.

2016-04-01

A numerical benchmark for 2-D variable-density flow and solute transport in a freshwater lens is presented. The benchmark is based on results of laboratory experiments conducted by Stoeckl and Houben (2012) using a sand tank on the meter scale. This benchmark describes the formation and degradation of a freshwater lens over time as it can be found under real-world islands. An error analysis gave the appropriate spatial and temporal discretization of 1 mm and 8.64 s, respectively. The calibrated parameter set was obtained using the parameter estimation tool PEST. Comparing density-coupled and density-uncoupled results showed that the freshwater-saltwater interface position is strongly dependent on density differences. A benchmark that adequately represents saltwater intrusion and that includes realistic features of coastal aquifers or freshwater lenses was lacking. This new benchmark was thus developed and is demonstrated to be suitable to test variable-density groundwater models applied to saltwater intrusion investigations.
Benchmarking to improve the quality of cystic fibrosis care.

PubMed

Schechter, Michael S

2012-11-01

Benchmarking involves the ascertainment of healthcare programs with most favorable outcomes as a means to identify and spread effective strategies for delivery of care. The recent interest in the development of patient registries for patients with cystic fibrosis (CF) has been fueled in part by an interest in using them to facilitate benchmarking. This review summarizes reports of how benchmarking has been operationalized in attempts to improve CF care. Although certain goals of benchmarking can be accomplished with an exclusive focus on registry data analysis, benchmarking programs in Germany and the United States have supplemented these data analyses with exploratory interactions and discussions to better understand successful approaches to care and encourage their spread throughout the care network. Benchmarking allows the discovery and facilitates the spread of effective approaches to care. It provides a pragmatic alternative to traditional research methods such as randomized controlled trials, providing insights into methods that optimize delivery of care and allowing judgments about the relative effectiveness of different therapeutic approaches.

Science Base and Tools for Evaluating Stream Restoration Project Proposals.

NASA Astrophysics Data System (ADS)

Cluer, B.; Thorne, C.; Skidmore, P.; Castro, J.; Pess, G.; Beechie, T.; Shea, C.

2008-12-01

Stream restoration, stabilization, or enhancement projects typically employ site-specific designs and site- scale habitat improvement projects have become the default solution to many habitat problems and constraints. Such projects are often planned and implemented without thorough consideration of the broader scale problems that may be contributing to habitat degradation, attention to project resiliency to flood events, accounting for possible changes in climate or watershed land use, or ensuring the long term sustainability of the project. To address these issues, NOAA Fisheries and USFWS have collaboratively commissioned research to develop a science document and accompanying tools to support more consistent and comprehensive review of stream management and restoration projects proposals by Service staff responsible for permitting. The science document synthesizes the body of knowledge in fluvial geomorphology and presents it in a way that is accessible to the Services staff biologists, who are not trained experts in this field. Accompanying the science document are two electronic tools: a Project Information Checklist to assist in evaluating whether a proposal includes all the information necessary to allow critical and thorough project evaluation; and a Project Evaluation Tool (in flow chart format) that guides reviewers through the steps necessary to critically evaluate the quality of the information submitted, the goals and objectives of the project, project planning and development, project design, geomorphic-habitat-species relevance, and risks to listed species. Materials for training Services staff and others in the efficient use of the science document and tools have also been developed. The longer term goals of this effort include: enabling consistent and comprehensive reviews that are completed in a timely fashion by regulators; facilitating improved project planning and design by proponents; encouraging projects that are attuned to their watershed
Maximal Unbiased Benchmarking Data Sets for Human Chemokine Receptors and Comparative Analysis.

PubMed

Xia, Jie; Reid, Terry-Elinor; Wu, Song; Zhang, Liangren; Wang, Xiang Simon

2018-05-29

Chemokine receptors (CRs) have long been druggable targets for the treatment of inflammatory diseases and HIV-1 infection. As a powerful technique, virtual screening (VS) has been widely applied to identifying small molecule leads for modern drug targets including CRs. For rational selection of a wide variety of VS approaches, ligand enrichment assessment based on a benchmarking data set has become an indispensable practice. However, the lack of versatile benchmarking sets for the whole CRs family that are able to unbiasedly evaluate every single approach including both structure- and ligand-based VS somewhat hinders modern drug discovery efforts. To address this issue, we constructed Maximal Unbiased Benchmarking Data sets for human Chemokine Receptors (MUBD-hCRs) using our recently developed tools of MUBD-DecoyMaker. The MUBD-hCRs encompasses 13 subtypes out of 20 chemokine receptors, composed of 404 ligands and 15756 decoys so far and is readily expandable in the future. It had been thoroughly validated that MUBD-hCRs ligands are chemically diverse while its decoys are maximal unbiased in terms of "artificial enrichment", "analogue bias". In addition, we studied the performance of MUBD-hCRs, in particular CXCR4 and CCR5 data sets, in ligand enrichment assessments of both structure- and ligand-based VS approaches in comparison with other benchmarking data sets available in the public domain and demonstrated that MUBD-hCRs is very capable of designating the optimal VS approach. MUBD-hCRs is a unique and maximal unbiased benchmarking set that covers major CRs subtypes so far.
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2014 CFR

2014-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2012 CFR

2012-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2011 CFR

2011-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2010 CFR

2010-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2013 CFR

2013-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
CSC Tip Sheets: Conducting and Evaluating Pilot Projects

EPA Pesticide Factsheets

Learn how to conduct and evaluate pilot projects, which are opportunities to “test the waters” for your project on a small scale, provide insight and data on what works, and adjust your strategy for full-scale implementation.
Chattanooga SmartBus Project : phase III evaluation report

DOT National Transportation Integrated Search

2009-12-01

This report presents the results of Phase III of the national evaluation of the Chattanooga Area Regional Transportation Authoritys (CARTA) SmartBus Project. The SmartBus Project is a comprehensive transit ITS program for the city of Chattanooga, ...
Python/Lua Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Busby, L.

This is an adaptation of the pre-existing Scimark benchmark code to a variety of Python and Lua implementations. It also measures performance of the Fparser expression parser and C and C++ code on a variety of simple scientific expressions.
40 CFR 57.604 - Evaluation of projects.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 40 Protection of Environment 5 2010-07-01 2010-07-01 false Evaluation of projects. 57.604 Section 57.604 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR PROGRAMS (CONTINUED... independent engineering firm to prepare written reports at least annually which evaluate each completed...
MicroRNA Array Normalization: An Evaluation Using a Randomized Dataset as the Benchmark

PubMed Central

Qin, Li-Xuan; Zhou, Qin

2014-01-01

MicroRNA arrays possess a number of unique data features that challenge the assumption key to many normalization methods. We assessed the performance of existing normalization methods using two microRNA array datasets derived from the same set of tumor samples: one dataset was generated using a blocked randomization design when assigning arrays to samples and hence was free of confounding array effects; the second dataset was generated without blocking or randomization and exhibited array effects. The randomized dataset was assessed for differential expression between two tumor groups and treated as the benchmark. The non-randomized dataset was assessed for differential expression after normalization and compared against the benchmark. Normalization improved the true positive rate significantly in the non-randomized data but still possessed a false discovery rate as high as 50%. Adding a batch adjustment step before normalization further reduced the number of false positive markers while maintaining a similar number of true positive markers, which resulted in a false discovery rate of 32% to 48%, depending on the specific normalization method. We concluded the paper with some insights on possible causes of false discoveries to shed light on how to improve normalization for microRNA arrays. PMID:24905456
Projection pursuit water quality evaluation model based on chicken swam algorithm

NASA Astrophysics Data System (ADS)

Hu, Zhe

2018-03-01

In view of the uncertainty and ambiguity of each index in water quality evaluation, in order to solve the incompatibility of evaluation results of individual water quality indexes, a projection pursuit model based on chicken swam algorithm is proposed. The projection index function which can reflect the water quality condition is constructed, the chicken group algorithm (CSA) is introduced, the projection index function is optimized, the best projection direction of the projection index function is sought, and the best projection value is obtained to realize the water quality evaluation. The comparison between this method and other methods shows that it is reasonable and feasible to provide decision-making basis for water pollution control in the basin.
Benchmarking image fusion system design parameters

NASA Astrophysics Data System (ADS)

Howell, Christopher L.

2013-06-01

A clear and absolute method for discriminating between image fusion algorithm performances is presented. This method can effectively be used to assist in the design and modeling of image fusion systems. Specifically, it is postulated that quantifying human task performance using image fusion should be benchmarked to whether the fusion algorithm, at a minimum, retained the performance benefit achievable by each independent spectral band being fused. The established benchmark would then clearly represent the threshold that a fusion system should surpass to be considered beneficial to a particular task. A genetic algorithm is employed to characterize the fused system parameters using a Matlab® implementation of NVThermIP as the objective function. By setting the problem up as a mixed-integer constraint optimization problem, one can effectively look backwards through the image acquisition process: optimizing fused system parameters by minimizing the difference between modeled task difficulty measure and the benchmark task difficulty measure. The results of an identification perception experiment are presented, where human observers were asked to identify a standard set of military targets, and used to demonstrate the effectiveness of the benchmarking process.
MITSI project : final local evaluation report

DOT National Transportation Integrated Search

2003-01-01

The mission statement for the MITSI project was facilitating National Standards Compliance migration for NaviGAtor, conducting National Architecture mapping for MARTA and E911, and evaluating CORBA as a methodology for exchanging data. This involved ...
Evaluation of various LandFlux evapotranspiration algorithms using the LandFlux-EVAL synthesis benchmark products and observational data

NASA Astrophysics Data System (ADS)

Michel, Dominik; Hirschi, Martin; Jimenez, Carlos; McCabe, Mathew; Miralles, Diego; Wood, Eric; Seneviratne, Sonia

2014-05-01

Research on climate variations and the development of predictive capabilities largely rely on globally available reference data series of the different components of the energy and water cycles. Several efforts aimed at producing large-scale and long-term reference data sets of these components, e.g. based on in situ observations and remote sensing, in order to allow for diagnostic analyses of the drivers of temporal variations in the climate system. Evapotranspiration (ET) is an essential component of the energy and water cycle, which can not be monitored directly on a global scale by remote sensing techniques. In recent years, several global multi-year ET data sets have been derived from remote sensing-based estimates, observation-driven land surface model simulations or atmospheric reanalyses. The LandFlux-EVAL initiative presented an ensemble-evaluation of these data sets over the time periods 1989-1995 and 1989-2005 (Mueller et al. 2013). Currently, a multi-decadal global reference heat flux data set for ET at the land surface is being developed within the LandFlux initiative of the Global Energy and Water Cycle Experiment (GEWEX). This LandFlux v0 ET data set comprises four ET algorithms forced with a common radiation and surface meteorology. In order to estimate the agreement of this LandFlux v0 ET data with existing data sets, it is compared to the recently available LandFlux-EVAL synthesis benchmark product. Additional evaluation of the LandFlux v0 ET data set is based on a comparison to in situ observations of a weighing lysimeter from the hydrological research site Rietholzbach in Switzerland. These analyses serve as a test bed for similar evaluation procedures that are envisaged for ESA's WACMOS-ET initiative (http://wacmoset.estellus.eu). Reference: Mueller, B., Hirschi, M., Jimenez, C., Ciais, P., Dirmeyer, P. A., Dolman, A. J., Fisher, J. B., Jung, M., Ludwig, F., Maignan, F., Miralles, D. G., McCabe, M. F., Reichstein, M., Sheffield, J., Wang, K
OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE - A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alan Black; Arnis Judzis

2003-01-01

Progress during current reporting year 2002 by quarter--Progress during Q1 2002: (1) In accordance to Task 7.0 (D. No.2 Technical Publications) TerraTek, NETL, and the Industry Contributors successfully presented a paper detailing Phase 1 testing results at the February 2002 IADC/SPE Drilling Conference, a prestigious venue for presenting DOE and private sector drilling technology advances. The full reference is as follows: IADC/SPE 74540 ''World's First Benchmarking of Drilling Mud Hammer Performance at Depth Conditions'' authored by Gordon A. Tibbitts, TerraTek; Roy C. Long, US Department of Energy, Brian E. Miller, BP America, Inc.; Arnis Judzis, TerraTek; and Alan D. Black,more » TerraTek. Gordon Tibbitts, TerraTek, will presented the well-attended paper in February of 2002. The full text of the Mud Hammer paper was included in the last quarterly report. (2) The Phase 2 project planning meeting (Task 6) was held at ExxonMobil's Houston Greenspoint offices on February 22, 2002. In attendance were representatives from TerraTek, DOE, BP, ExxonMobil, PDVSA, Novatek, and SDS Digger Tools. (3) PDVSA has joined the advisory board to this DOE mud hammer project. PDVSA's commitment of cash and in-kind contributions were reported during the last quarter. (4) Strong Industry support remains for the DOE project. Both Andergauge and Smith Tools have expressed an interest in participating in the ''optimization'' phase of the program. The potential for increased testing with additional Industry cash support was discussed at the planning meeting in February 2002. Progress during Q2 2002: (1) Presentation material was provided to the DOE/NETL project manager (Dr. John Rogers) for the DOE exhibit at the 2002 Offshore Technology Conference. (2) Two meeting at Smith International and one at Andergauge in Houston were held to investigate their interest in joining the Mud Hammer Performance study. (3) SDS Digger Tools (Task 3 Benchmarking participant) apparently has not negotiated
A benchmark for comparison of cell tracking algorithms

PubMed Central

Maška, Martin; Ulman, Vladimír; Svoboda, David; Matula, Pavel; Matula, Petr; Ederra, Cristina; Urbiola, Ainhoa; España, Tomás; Venkatesan, Subramanian; Balak, Deepak M.W.; Karas, Pavel; Bolcková, Tereza; Štreitová, Markéta; Carthel, Craig; Coraluppi, Stefano; Harder, Nathalie; Rohr, Karl; Magnusson, Klas E. G.; Jaldén, Joakim; Blau, Helen M.; Dzyubachyk, Oleh; Křížek, Pavel; Hagen, Guy M.; Pastor-Escuredo, David; Jimenez-Carretero, Daniel; Ledesma-Carbayo, Maria J.; Muñoz-Barrutia, Arrate; Meijering, Erik; Kozubek, Michal; Ortiz-de-Solorzano, Carlos

2014-01-01

Motivation: Automatic tracking of cells in multidimensional time-lapse fluorescence microscopy is an important task in many biomedical applications. A novel framework for objective evaluation of cell tracking algorithms has been established under the auspices of the IEEE International Symposium on Biomedical Imaging 2013 Cell Tracking Challenge. In this article, we present the logistics, datasets, methods and results of the challenge and lay down the principles for future uses of this benchmark. Results: The main contributions of the challenge include the creation of a comprehensive video dataset repository and the definition of objective measures for comparison and ranking of the algorithms. With this benchmark, six algorithms covering a variety of segmentation and tracking paradigms have been compared and ranked based on their performance on both synthetic and real datasets. Given the diversity of the datasets, we do not declare a single winner of the challenge. Instead, we present and discuss the results for each individual dataset separately. Availability and implementation: The challenge Web site (http://www.codesolorzano.com/celltrackingchallenge) provides access to the training and competition datasets, along with the ground truth of the training videos. It also provides access to Windows and Linux executable files of the evaluation software and most of the algorithms that competed in the challenge. Contact: codesolorzano@unav.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24526711
[Benchmarking in ambulatory care practices--The European Practice Assessment (EPA)].

PubMed

Szecsenyi, Joachim; Broge, Björn; Willms, Sara; Brodowski, Marc; Götz, Katja

2011-01-01

The European Practice Assessment (EPA) is a comprehensive quality management which consists of 220 indicators covering 5 domains (infrastructure, people, information, finance, and quality and safety). The aim of the project presented was to evaluate EPA as an instrument for benchmarking in ambulatory care practices. A before-and-after design with a comparison group was chosen. One hundred and two practices conducted EPA at baseline (t1) and at the 3-year follow-up (t2). A further 209 practices began EPA at t2 (comparison group). Since both practice groups differed in several variables (age of GP, location and size of practice), a matched-pair design based on propensity scores was applied leading to a subgroup of 102 comparable practices (out of the 209 practices). Data analysis was carried out using Z scores of the EPA domains. The results showed significant improvements in all domains between t1 and t2 as well as between the comparison group and t2. Furthermore, the results demonstrate that the implementation of total quality management and the re-assessment of the EPA procedure can lead to significant improvements in almost all domains. Copyright © 2011. Published by Elsevier GmbH.
Using Benchmarking To Influence Tuition and Fee Decisions.

ERIC Educational Resources Information Center

Hubbell, Loren W. Loomis; Massa, Robert J.; Lapovsky, Lucie

2002-01-01

Discusses the use of benchmarking in managing enrollment. Using a case study, illustrates how benchmarking can help administrators develop strategies for planning and implementing admissions and pricing practices. (EV)

The Employment Impact of the Des Moines Occupational Upgrading Project and Model Cities High School Equivalency Project: Project Year One Evaluation.

ERIC Educational Resources Information Center

Palomba, Neil A.; And Others

This study was conducted to: (1) evaluate the Occupational Upgrading Project (OUP) and the Model Neighborhood High School Equivalency (HSE) Project's first year of operation, and (2) create baseline data from which future and more conclusive evaluation can be undertaken. Data were gathered by conducting open-ended interviews with the…
SparseBeads data: benchmarking sparsity-regularized computed tomography

NASA Astrophysics Data System (ADS)

Jørgensen, Jakob S.; Coban, Sophia B.; Lionheart, William R. B.; McDonald, Samuel A.; Withers, Philip J.

2017-12-01

Sparsity regularization (SR) such as total variation (TV) minimization allows accurate image reconstruction in x-ray computed tomography (CT) from fewer projections than analytical methods. Exactly how few projections suffice and how this number may depend on the image remain poorly understood. Compressive sensing connects the critical number of projections to the image sparsity, but does not cover CT, however empirical results suggest a similar connection. The present work establishes for real CT data a connection between gradient sparsity and the sufficient number of projections for accurate TV-regularized reconstruction. A collection of 48 x-ray CT datasets called SparseBeads was designed for benchmarking SR reconstruction algorithms. Beadpacks comprising glass beads of five different sizes as well as mixtures were scanned in a micro-CT scanner to provide structured datasets with variable image sparsity levels, number of projections and noise levels to allow the systematic assessment of parameters affecting performance of SR reconstruction algorithms6. Using the SparseBeads data, TV-regularized reconstruction quality was assessed as a function of numbers of projections and gradient sparsity. The critical number of projections for satisfactory TV-regularized reconstruction increased almost linearly with the gradient sparsity. This establishes a quantitative guideline from which one may predict how few projections to acquire based on expected sample sparsity level as an aid in planning of dose- or time-critical experiments. The results are expected to hold for samples of similar characteristics, i.e. consisting of few, distinct phases with relatively simple structure. Such cases are plentiful in porous media, composite materials, foams, as well as non-destructive testing and metrology. For samples of other characteristics the proposed methodology may be used to investigate similar relations.
Software Architecture Evaluation in Global Software Development Projects

NASA Astrophysics Data System (ADS)

Salger, Frank

Due to ever increasing system complexity, comprehensive methods for software architecture evaluation become more and more important. This is further stressed in global software development (GSD), where the software architecture acts as a central knowledge and coordination mechanism. However, existing methods for architecture evaluation do not take characteristics of GSD into account. In this paper we discuss what aspects are specific for architecture evaluations in GSD. Our experiences from GSD projects at Capgemini sd&m indicate, that architecture evaluations differ in how rigorously one has to assess modularization, architecturally relevant processes, knowledge transfer and process alignment. From our project experiences, we derive nine good practices, the compliance to which should be checked in architecture evaluations in GSD. As an example, we discuss how far the standard architecture evaluation method used at Capgemini sd&m already considers the GSD-specific good practices, and outline what extensions are necessary to achieve a comprehensive architecture evaluation framework for GSD.
Evaluation of Embedded System Component Utilized in Delivery Integrated Design Project Course

NASA Astrophysics Data System (ADS)

Junid, Syed Abdul Mutalib Al; Hussaini, Yusnira; Nazmie Osman, Fairul; Razak, Abdul Hadi Abdul; Idros, Mohd Faizul Md; Karimi Halim, Abdul

2018-03-01

This paper reports the evaluation of the embedded system component utilized in delivering the integrated electronic engineering design project course. The evaluation is conducted based on the report project submitted as to fulfil the assessment criteria for the integrated electronic engineering design project course named; engineering system design. Six projects were assessed in this evaluation. The evaluation covers the type of controller, programming language and the number of embedded component utilization as well. From the evaluation, the C-programming based language is the best solution preferred by the students which provide them flexibility in the programming. Moreover, the Analog to Digital converter is intensively used in the projects which include sensors in their proposed design. As a conclusion, in delivering the integrated design project course, the knowledge over the embedded system solution is very important since the high density of the knowledge acquired in accomplishing the project assigned.
Project Developmental Continuity Evaluation: Site Visitors' Manual.

ERIC Educational Resources Information Center

Morris, Mary; Smith, Allen

This site visitors' manual is part of a series of documents on the evaluation of Project Developmental Continuity (PDC), a Head Start demonstration program aimed at providing educational and developmental continuity between children's Head Start and primary school experiences. The PDC evaluation documents and analyzes the process of program…
CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction

PubMed Central

Puton, Tomasz; Kozlowski, Lukasz P.; Rother, Kristian M.; Bujnicki, Janusz M.

2013-01-01

We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks. PMID:23435231
Benchmarking and Hardware-In-The-Loop Operation of a ...

EPA Pesticide Factsheets

Engine Performance evaluation in support of LD MTE. EPA used elements of its ALPHA model to apply hardware-in-the-loop (HIL) controls to the SKYACTIV engine test setup to better understand how the engine would operate in a chassis test after combined with future leading edge technologies, advanced high-efficiency transmission, reduced mass, and reduced roadload. Predict future vehicle performance with Atkinson engine. As part of its technology assessment for the upcoming midterm evaluation of the 2017-2025 LD vehicle GHG emissions regulation, EPA has been benchmarking engines and transmissions to generate inputs for use in its ALPHA model
A comparative evaluation of risk-adjustment models for benchmarking amputation-free survival after lower extremity bypass.

PubMed

Simons, Jessica P; Goodney, Philip P; Flahive, Julie; Hoel, Andrew W; Hallett, John W; Kraiss, Larry W; Schanzer, Andres

2016-04-01

Providing patients and payers with publicly reported risk-adjusted quality metrics for the purpose of benchmarking physicians and institutions has become a national priority. Several prediction models have been developed to estimate outcomes after lower extremity revascularization for critical limb ischemia, but the optimal model to use in contemporary practice has not been defined. We sought to identify the highest-performing risk-adjustment model for amputation-free survival (AFS) at 1 year after lower extremity bypass (LEB). We used the national Society for Vascular Surgery Vascular Quality Initiative (VQI) database (2003-2012) to assess the performance of three previously validated risk-adjustment models for AFS. The Bypass versus Angioplasty in Severe Ischaemia of the Leg (BASIL), Finland National Vascular (FINNVASC) registry, and the modified Project of Ex-vivo vein graft Engineering via Transfection III (PREVENT III [mPIII]) risk scores were applied to the VQI cohort. A novel model for 1-year AFS was also derived using the VQI data set and externally validated using the PIII data set. The relative discrimination (Harrell c-index) and calibration (Hosmer-May goodness-of-fit test) of each model were compared. Among 7754 patients in the VQI who underwent LEB for critical limb ischemia, the AFS was 74% at 1 year. Each of the previously published models for AFS demonstrated similar discriminative performance: c-indices for BASIL, FINNVASC, mPIII were 0.66, 0.60, and 0.64, respectively. The novel VQI-derived model had improved discriminative ability with a c-index of 0.71 and appropriate generalizability on external validation with a c-index of 0.68. The model was well calibrated in both the VQI and PIII data sets (goodness of fit P = not significant). Currently available prediction models for AFS after LEB perform modestly when applied to national contemporary VQI data. Moreover, the performance of each model was inferior to that of the novel VQI-derived model
47 CFR 69.108 - Transport rate benchmark.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 47 Telecommunication 3 2010-10-01 2010-10-01 false Transport rate benchmark. 69.108 Section 69.108... Computation of Charges § 69.108 Transport rate benchmark. (a) For transport charges computed in accordance... interoffice transmission using the telephone company's DS1 special access rates. (b) Initial transport rates...
47 CFR 69.108 - Transport rate benchmark.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 47 Telecommunication 3 2011-10-01 2011-10-01 false Transport rate benchmark. 69.108 Section 69.108... Computation of Charges § 69.108 Transport rate benchmark. (a) For transport charges computed in accordance... interoffice transmission using the telephone company's DS1 special access rates. (b) Initial transport rates...
Electric-Drive Vehicle Thermal Performance Benchmarking | Transportation

Science.gov Websites

studies are as follows: Characterize the thermal resistance and conductivity of various layers in the Research | NREL Electric-Drive Vehicle Thermal Performance Benchmarking Electric-Drive Vehicle Thermal Performance Benchmarking A photo of the internal components of an automotive inverter. NREL
Bio-inspired benchmark generator for extracellular multi-unit recordings

PubMed Central

Mondragón-González, Sirenia Lizbeth; Burguière, Eric

2017-01-01

The analysis of multi-unit extracellular recordings of brain activity has led to the development of numerous tools, ranging from signal processing algorithms to electronic devices and applications. Currently, the evaluation and optimisation of these tools are hampered by the lack of ground-truth databases of neural signals. These databases must be parameterisable, easy to generate and bio-inspired, i.e. containing features encountered in real electrophysiological recording sessions. Towards that end, this article introduces an original computational approach to create fully annotated and parameterised benchmark datasets, generated from the summation of three components: neural signals from compartmental models and recorded extracellular spikes, non-stationary slow oscillations, and a variety of different types of artefacts. We present three application examples. (1) We reproduced in-vivo extracellular hippocampal multi-unit recordings from either tetrode or polytrode designs. (2) We simulated recordings in two different experimental conditions: anaesthetised and awake subjects. (3) Last, we also conducted a series of simulations to study the impact of different level of artefacts on extracellular recordings and their influence in the frequency domain. Beyond the results presented here, such a benchmark dataset generator has many applications such as calibration, evaluation and development of both hardware and software architectures. PMID:28233819
Towards unbiased benchmarking of evolutionary and hybrid algorithms for real-valued optimisation

NASA Astrophysics Data System (ADS)

MacNish, Cara

2007-12-01

Randomised population-based algorithms, such as evolutionary, genetic and swarm-based algorithms, and their hybrids with traditional search techniques, have proven successful and robust on many difficult real-valued optimisation problems. This success, along with the readily applicable nature of these techniques, has led to an explosion in the number of algorithms and variants proposed. In order for the field to advance it is necessary to carry out effective comparative evaluations of these algorithms, and thereby better identify and understand those properties that lead to better performance. This paper discusses the difficulties of providing benchmarking of evolutionary and allied algorithms that is both meaningful and logistically viable. To be meaningful the benchmarking test must give a fair comparison that is free, as far as possible, from biases that favour one style of algorithm over another. To be logistically viable it must overcome the need for pairwise comparison between all the proposed algorithms. To address the first problem, we begin by attempting to identify the biases that are inherent in commonly used benchmarking functions. We then describe a suite of test problems, generated recursively as self-similar or fractal landscapes, designed to overcome these biases. For the second, we describe a server that uses web services to allow researchers to 'plug in' their algorithms, running on their local machines, to a central benchmarking repository.
Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

PubMed Central

Sczyrba, Alexander; Hofmann, Peter; Belmann, Peter; Koslicki, David; Janssen, Stefan; Dröge, Johannes; Gregor, Ivan; Majda, Stephan; Fiedler, Jessika; Dahms, Eik; Bremges, Andreas; Fritz, Adrian; Garrido-Oter, Ruben; Jørgensen, Tue Sparholt; Shapiro, Nicole; Blood, Philip D.; Gurevich, Alexey; Bai, Yang; Turaev, Dmitrij; DeMaere, Matthew Z.; Chikhi, Rayan; Nagarajan, Niranjan; Quince, Christopher; Meyer, Fernando; Balvočiūtė, Monika; Hansen, Lars Hestbjerg; Sørensen, Søren J.; Chia, Burton K. H.; Denis, Bertrand; Froula, Jeff L.; Wang, Zhong; Egan, Robert; Kang, Dongwan Don; Cook, Jeffrey J.; Deltel, Charles; Beckstette, Michael; Lemaitre, Claire; Peterlongo, Pierre; Rizk, Guillaume; Lavenier, Dominique; Wu, Yu-Wei; Singer, Steven W.; Jain, Chirag; Strous, Marc; Klingenberg, Heiner; Meinicke, Peter; Barton, Michael; Lingner, Thomas; Lin, Hsin-Hung; Liao, Yu-Chieh; Silva, Genivaldo Gueiros Z.; Cuevas, Daniel A.; Edwards, Robert A.; Saha, Surya; Piro, Vitor C.; Renard, Bernhard Y.; Pop, Mihai; Klenk, Hans-Peter; Göker, Markus; Kyrpides, Nikos C.; Woyke, Tanja; Vorholt, Julia A.; Schulze-Lefert, Paul; Rubin, Edward M.; Darling, Aaron E.; Rattei, Thomas; McHardy, Alice C.

2018-01-01

In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions. PMID:28967888
Unstructured Adaptive (UA) NAS Parallel Benchmark. Version 1.0

NASA Technical Reports Server (NTRS)

Feng, Huiyu; VanderWijngaart, Rob; Biswas, Rupak; Mavriplis, Catherine

2004-01-01

We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.
Major Factors Influencing HIV/AIDS Project Evaluation

ERIC Educational Resources Information Center

Niba, Mercy Bi; Green, J. Maryann

2005-01-01

This article aimed at finding out if participatory processes (group discussions, enactments, and others) do make a valuable contribution in communication-based project implementation/evaluation and the fight against HIV/AIDS. A case study backed by documentary analysis of evaluation reports and occasional insights from interviews stood as the main…
NASA Countermeasures Evaluation and Validation Project

NASA Technical Reports Server (NTRS)

Lundquist, Charlie M.; Paloski, William H. (Technical Monitor)

2000-01-01

To support its ISS and exploration class mission objectives, NASA has developed a Countermeasure Evaluation and Validation Project (CEVP). The goal of this project is to evaluate and validate the optimal complement of countermeasures required to maintain astronaut health, safety, and functional ability during and after short- and long-duration space flight missions. The CEVP is the final element of the process in which ideas and concepts emerging from basic research evolve into operational countermeasures. The CEVP is accomplishing these objectives by conducting operational/clinical research to evaluate and validate countermeasures to mitigate these maladaptive responses. Evaluation is accomplished by testing in space flight analog facilities, and validation is accomplished by space flight testing. Both will utilize a standardized complement of integrated physiological and psychological tests, termed the Integrated Testing Regimen (ITR) to examine candidate countermeasure efficacy and intersystem effects. The CEVP emphasis is currently placed on validating the initial complement of ISS countermeasures targeting bone, muscle, and aerobic fitness; followed by countermeasures for neurological, psychological, immunological, nutrition and metabolism, and radiation risks associated with space flight. This presentation will review the processes, plans, and procedures that will enable CEVP to play a vital role in transitioning promising research results into operational countermeasures necessary to maintain crew health and performance during long duration space flight.
Evaluation Guidelines for Service and Methods Demonstration Projects

DOT National Transportation Integrated Search

1976-02-01

The document consists of evaluation guidelines for planning, implementing, and reporting the findings of the evaluation of Service and Methods Demonstration (SMD) projects sponsored by the Urban Mass Transportation Administration (UMTA). The objectiv...
Tennessee long-range transportation plan : project evaluation system

DOT National Transportation Integrated Search

2005-12-01

The Project Evaluation System (PES) Report is an analytical methodology to aid programming efforts and prioritize multimodal investments. The methodology consists of both quantitative and qualitative evaluation criteria built upon the Guiding Princip...
Evaluation in Cross-Cultural Contexts: Proposing a Framework for International Education and Training Project Evaluations.

ERIC Educational Resources Information Center

bin Yahya, Ismail; And Others

This paper focuses on the need for increased sensitivity and responsiveness in international education and training project evaluations, particularly those in Third World countries. A conceptual-theoretical framework for designing and developing models appropriate for evaluating education and training projects in non-Western cultures is presented.…

Information-Theoretic Benchmarking of Land Surface Models

NASA Astrophysics Data System (ADS)

Nearing, Grey; Mocko, David; Kumar, Sujay; Peters-Lidard, Christa; Xia, Youlong

2016-04-01

Benchmarking is a type of model evaluation that compares model performance against a baseline metric that is derived, typically, from a different existing model. Statistical benchmarking was used to qualitatively show that land surface models do not fully utilize information in boundary conditions [1] several years before Gong et al [2] discovered the particular type of benchmark that makes it possible to *quantify* the amount of information lost by an incorrect or imperfect model structure. This theoretical development laid the foundation for a formal theory of model benchmarking [3]. We here extend that theory to separate uncertainty contributions from the three major components of dynamical systems models [4]: model structures, model parameters, and boundary conditions describe time-dependent details of each prediction scenario. The key to this new development is the use of large-sample [5] data sets that span multiple soil types, climates, and biomes, which allows us to segregate uncertainty due to parameters from the two other sources. The benefit of this approach for uncertainty quantification and segregation is that it does not rely on Bayesian priors (although it is strictly coherent with Bayes' theorem and with probability theory), and therefore the partitioning of uncertainty into different components is *not* dependent on any a priori assumptions. We apply this methodology to assess the information use efficiency of the four land surface models that comprise the North American Land Data Assimilation System (Noah, Mosaic, SAC-SMA, and VIC). Specifically, we looked at the ability of these models to estimate soil moisture and latent heat fluxes. We found that in the case of soil moisture, about 25% of net information loss was from boundary conditions, around 45% was from model parameters, and 30-40% was from the model structures. In the case of latent heat flux, boundary conditions contributed about 50% of net uncertainty, and model structures contributed
Looking Past Primary Productivity: Benchmarking System Processes that Drive Ecosystem Level Responses in Models

NASA Astrophysics Data System (ADS)

Cowdery, E.; Dietze, M.

2017-12-01

As atmospheric levels of carbon dioxide levels continue to increase, it is critical that terrestrial ecosystem models can accurately predict ecological responses to the changing environment. Current predictions of net primary productivity (NPP) in response to elevated atmospheric CO2 concentration are highly variable and contain a considerable amount of uncertainty. Benchmarking model predictions against data are necessary to assess their ability to replicate observed patterns, but also to identify and evaluate the assumptions causing inter-model differences. We have implemented a novel benchmarking workflow as part of the Predictive Ecosystem Analyzer (PEcAn) that is automated, repeatable, and generalized to incorporate different sites and ecological models. Building on the recent Free-Air CO2 Enrichment Model Data Synthesis (FACE-MDS) project, we used observational data from the FACE experiments to test this flexible, extensible benchmarking approach aimed at providing repeatable tests of model process representation that can be performed quickly and frequently. Model performance assessments are often limited to traditional residual error analysis; however, this can result in a loss of critical information. Models that fail tests of relative measures of fit may still perform well under measures of absolute fit and mathematical similarity. This implies that models that are discounted as poor predictors of ecological productivity may still be capturing important patterns. Conversely, models that have been found to be good predictors of productivity may be hiding error in their sub-process that result in the right answers for the wrong reasons. Our suite of tests have not only highlighted process based sources of uncertainty in model productivity calculations, they have also quantified the patterns and scale of this error. Combining these findings with PEcAn's model sensitivity analysis and variance decomposition strengthen our ability to identify which processes
Grey Comprehensive Evaluation of Biomass Power Generation Project Based on Group Judgement

NASA Astrophysics Data System (ADS)

Xia, Huicong; Niu, Dongxiao

2017-06-01

The comprehensive evaluation of benefit is an important task needed to be carried out at all stages of biomass power generation projects. This paper proposed an improved grey comprehensive evaluation method based on triangle whiten function. To improve the objectivity of weight calculation result of only reference comparison judgment method, this paper introduced group judgment to the weighting process. In the process of grey comprehensive evaluation, this paper invited a number of experts to estimate the benefit level of projects, and optimized the basic estimations based on the minimum variance principle to improve the accuracy of evaluation result. Taking a biomass power generation project as an example, the grey comprehensive evaluation result showed that the benefit level of this project was good. This example demonstrates the feasibility of grey comprehensive evaluation method based on group judgment for benefit evaluation of biomass power generation project.
Sequoia Messaging Rate Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Friedley, Andrew

2008-01-22

The purpose of this benchmark is to measure the maximal message rate of a single compute node. The first num_cores ranks are expected to reside on the 'core' compute node for which message rate is being tested. After that, the next num_nbors ranks are neighbors for the first core rank, the next set of num_nbors ranks are neighbors for the second core rank, and so on. For example, testing an 8-core node (num_cores = 8) with 4 neighbors (num_nbors = 4) requires 8 + 8 * 4 - 40 ranks. The first 8 of those 40 ranks are expected tomore » be on the 'core' node being benchmarked, while the rest of the ranks are on separate nodes.« less
Using a health promotion model to promote benchmarking.

PubMed

Welby, Jane

2006-07-01

The North East (England) Neonatal Benchmarking Group has been established for almost a decade and has researched and developed a substantial number of evidence-based benchmarks. With no firm evidence that these were being used or that there was any standardisation of neonatal care throughout the region, the group embarked on a programme to review the benchmarks and determine what evidence-based guidelines were needed to support standardisation. A health promotion planning model was used by one subgroup to structure the programme; it enabled all members of the sub group to engage in the review process and provided the motivation and supporting documentation for implementation of changes in practice. The need for a regional guideline development group to complement the activity of the benchmarking group is being addressed.
Evaluation of construction strategies for PCC pavement rehabilitation projects.

DOT National Transportation Integrated Search

2010-09-30

This study investigated project management level solutions to optimizing resources, minimizing costs : (including user costs) and time for PCC pavement rehabilitation projects. This study extensively : evaluated the applicability of the Construction ...
A model for evaluating the environmental benefits of elementary school facilities.

PubMed

Ji, Changyoon; Hong, Taehoon; Jeong, Kwangbok; Leigh, Seung-Bok

2014-01-01

In this study, a model that is capable of evaluating the environmental benefits of a new elementary school facility was developed. The model is composed of three steps: (i) retrieval of elementary school facilities having similar characteristics as the new elementary school facility using case-based reasoning; (ii) creation of energy consumption and material data for the benchmark elementary school facility using the retrieved similar elementary school facilities; and (iii) evaluation of the environmental benefits of the new elementary school facility by assessing and comparing the environmental impact of the new and created benchmark elementary school facility using life cycle assessment. The developed model can present the environmental benefits of a new elementary school facility in terms of monetary values using Environmental Priority Strategy 2000, a damage-oriented life cycle impact assessment method. The developed model can be used for the following: (i) as criteria for a green-building rating system; (ii) as criteria for setting the support plan and size, such as the government's incentives for promoting green-building projects; and (iii) as criteria for determining the feasibility of green building projects in key business sectors. Copyright © 2013 Elsevier Ltd. All rights reserved.
Project Developmental Continuity Evaluation: Final Report. Executive Summary.

ERIC Educational Resources Information Center

Bond, James T.; Rosario, Jose

This executive summary presents the major results of the longitudinal evaluation of Project Developmental Continuity (PDC). A Head Start demonstration project initiated by the Administration for Children, Youth and Families (ACYF) in 1974, the PDC aimed to stimulate the development and implementation of comprehensive programs linking Head Start…
Criticality experiments and benchmarks for cross section evaluation: the neptunium case

NASA Astrophysics Data System (ADS)

Leong, L. S.; Tassan-Got, L.; Audouin, L.; Paradela, C.; Wilson, J. N.; Tarrio, D.; Berthier, B.; Duran, I.; Le Naour, C.; Stéphan, C.

2013-03-01

The 237Np neutron-induced fission cross section has been recently measured in a large energy range (from eV to GeV) at the n_TOF facility at CERN. When compared to previous measurement the n_TOF fission cross section appears to be higher by 5-7% beyond the fission threshold. To check the relevance of n_TOF data, we apply a criticality experiment performed at Los Alamos with a 6 kg sphere of 237Np, surrounded by enriched uranium 235U so as to approach criticality with fast neutrons. The multiplication factor ke f f of the calculation is in better agreement with the experiment (the deviation of 750 pcm is reduced to 250 pcm) when we replace the ENDF/B-VII.0 evaluation of the 237Np fission cross section by the n_TOF data. We also explore the hypothesis of deficiencies of the inelastic cross section in 235U which has been invoked by some authors to explain the deviation of 750 pcm. With compare to inelastic large distortion calculation, it is incompatible with existing measurements. Also we show that the v of 237Np can hardly be incriminated because of the high accuracy of the existing data. Fission rate ratios or averaged fission cross sections measured in several fast neutron fields seem to give contradictory results on the validation of the 237Np cross section but at least one of the benchmark experiments, where the active deposits have been well calibrated for the number of atoms, favors the n_TOF data set. These outcomes support the hypothesis of a higher fission cross section of 237Np.
A Field-Based Aquatic Life Benchmark for Conductivity in ...

EPA Pesticide Factsheets

EPA announced the availability of the final report, A Field-Based Aquatic Life Benchmark for Conductivity in Central Appalachian Streams. This report describes a method to characterize the relationship between the extirpation (the effective extinction) of invertebrate genera and salinity (measured as conductivity) and from that relationship derives a freshwater aquatic life benchmark. This benchmark of 300 µS/cm may be applied to waters in Appalachian streams that are dominated by calcium and magnesium salts of sulfate and bicarbonate at circum-neutral to mildly alkaline pH. This report provides scientific evidence for a conductivity benchmark in a specific region rather than for the entire United States.
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather; Robinson, William H.; Rech, Paolo

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for themore » hardware and software benchmarks.« less
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE PAGES

Quinn, Heather; Robinson, William H.; Rech, Paolo; ...

2015-12-17

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for themore » hardware and software benchmarks.« less
Standardised Benchmarking in the Quest for Orthologs

PubMed Central

Altenhoff, Adrian M.; Boeckmann, Brigitte; Capella-Gutierrez, Salvador; Dalquen, Daniel A.; DeLuca, Todd; Forslund, Kristoffer; Huerta-Cepas, Jaime; Linard, Benjamin; Pereira, Cécile; Pryszcz, Leszek P.; Schreiber, Fabian; Sousa da Silva, Alan; Szklarczyk, Damian; Train, Clément-Marie; Bork, Peer; Lecompte, Odile; von Mering, Christian; Xenarios, Ioannis; Sjölander, Kimmen; Juhl Jensen, Lars; Martin, Maria J.; Muffato, Matthieu; Gabaldón, Toni; Lewis, Suzanna E.; Thomas, Paul D.; Sonnhammer, Erik; Dessimoz, Christophe

2016-01-01

The identification of evolutionarily related genes across different species—orthologs in particular—forms the backbone of many comparative, evolutionary, and functional genomic analyses. Achieving high accuracy in orthology inference is thus essential. Yet the true evolutionary history of genes, required to ascertain orthology, is generally unknown. Furthermore, orthologs are used for very different applications across different phyla, with different requirements in terms of the precision-recall trade-off. As a result, assessing the performance of orthology inference methods remains difficult for both users and method developers. Here, we present a community effort to establish standards in orthology benchmarking and facilitate orthology benchmarking through an automated web-based service (http://orthology.benchmarkservice.org). Using this new service, we characterise the performance of 15 well-established orthology inference methods and resources on a battery of 20 different benchmarks. Standardised benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimal requirement for new tools and resources, and guides the development of more accurate orthology inference methods. PMID:27043882
Benchmarking CRISPR on-target sgRNA design.

PubMed

Yan, Jifang; Chuai, Guohui; Zhou, Chi; Zhu, Chenyu; Yang, Jing; Zhang, Chao; Gu, Feng; Xu, Han; Wei, Jia; Liu, Qi

2017-02-15

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-based gene editing has been widely implemented in various cell types and organisms. A major challenge in the effective application of the CRISPR system is the need to design highly efficient single-guide RNA (sgRNA) with minimal off-target cleavage. Several tools are available for sgRNA design, while limited tools were compared. In our opinion, benchmarking the performance of the available tools and indicating their applicable scenarios are important issues. Moreover, whether the reported sgRNA design rules are reproducible across different sgRNA libraries, cell types and organisms remains unclear. In our study, a systematic and unbiased benchmark of the sgRNA predicting efficacy was performed on nine representative on-target design tools, based on six benchmark data sets covering five different cell types. The benchmark study presented here provides novel quantitative insights into the available CRISPR tools. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Thought Experiment to Examine Benchmark Performance for Fusion Nuclear Data

NASA Astrophysics Data System (ADS)

Murata, Isao; Ohta, Masayuki; Kusaka, Sachie; Sato, Fuminobu; Miyamaru, Hiroyuki

2017-09-01

There are many benchmark experiments carried out so far with DT neutrons especially aiming at fusion reactor development. These integral experiments seemed vaguely to validate the nuclear data below 14 MeV. However, no precise studies exist now. The author's group thus started to examine how well benchmark experiments with DT neutrons can play a benchmarking role for energies below 14 MeV. Recently, as a next phase, to generalize the above discussion, the energy range was expanded to the entire region. In this study, thought experiments with finer energy bins have thus been conducted to discuss how to generally estimate performance of benchmark experiments. As a result of thought experiments with a point detector, the sensitivity for a discrepancy appearing in the benchmark analysis is "equally" due not only to contribution directly conveyed to the deterctor, but also due to indirect contribution of neutrons (named (A)) making neutrons conveying the contribution, indirect controbution of neutrons (B) making the neutrons (A) and so on. From this concept, it would become clear from a sensitivity analysis in advance how well and which energy nuclear data could be benchmarked with a benchmark experiment.
Cross-industry benchmarking: is it applicable to the operating room?

PubMed

Marco, A P; Hart, S

2001-01-01

The use of benchmarking has been growing in nonmedical industries. This concept is being increasingly applied to medicine as the industry strives to improve quality and improve financial performance. Benchmarks can be either internal (set by the institution) or external (use other's performance as a goal). In some industries, benchmarking has crossed industry lines to identify breakthroughs in thinking. In this article, we examine whether the airline industry can be used as a source of external process benchmarking for the operating room.
RESULTS OF QA/QC TESTING OF EPA BENCHMARK DOSE SOFTWARE VERSION 1.2

EPA Science Inventory

EPA is developing benchmark dose software (BMDS) to support cancer and non-cancer dose-response assessments. Following the recent public review of BMDS version 1.1b, EPA developed a Hill model for evaluating continuous data, and improved the user interface and Multistage, Polyno...
Evaluation framework for 16 earmarked projects in Washington State

DOT National Transportation Integrated Search

2007-05-01

This report documents the results of applying a previously developed, standardized approach for evaluating advanced traveler information systems (ATIS) projects to a much more diverse group of 16 intelligent transportation systems (ITS) projects. The...
Hawaii English Program: Project End Evaluation Report 1970-1971.

ERIC Educational Resources Information Center

Hawaii State Dept. of Education, Honolulu.

This report is comprised of two reports: the Final Audit Report of the Hawaii English Project, submitted by the Northwest Regional Educational Laboratory, and the main report, the Hawaii English Program Project End Evaluation Report by the Hawaii English Project Staff. The Audit Report is limited to a review of data reduction, analysis, and…
Project Aprendizaje. 1990-91 Final Evaluation Profile. OREA Report.

ERIC Educational Resources Information Center

New York City Board of Education, Brooklyn, NY. Office of Research, Evaluation, and Assessment.

An evaluation was done of New York City Public Schools' Project Aprendizaje, which served disadvantaged, immigrant, Spanish-speaking high school students at Seward Park High School in Manhattan. The Project enrolled 290 students in grades 9 through 12, 93.1 percent of whom were eligible for the Free Lunch Program. The Project provided students of…

Some links on this page may take you to non-federal websites. Their policies may differ from this site.