takahama-3 benchmark experiment: Topics by Science.gov

Sample records for takahama-3 benchmark experiment

Prediction of the Reactor Antineutrino Flux for the Double Chooz Experiment

NASA Astrophysics Data System (ADS)

Jones, Christopher LaDon

This thesis benchmarks the deterministic lattice code DRAGON against data, and then applies the code to predict the antineutrino flux from the Chooz B1 and B2 reactors. Data from the destructive assay of rods from the Takahama-3 reactor and from the SONGS antineutrino detector are used for comparisons. The resulting prediction from the tuned DRAGON code is then compared to the first antineutrino event spectra from Double Chooz. Use of this simulation in nuclear nonproliferation studies is discussed. (Copies available exclusively from MIT Libraries, libraries.mit.edu/docs - docs@mit.edu)
Thought Experiment to Examine Benchmark Performance for Fusion Nuclear Data

NASA Astrophysics Data System (ADS)

Murata, Isao; Ohta, Masayuki; Kusaka, Sachie; Sato, Fuminobu; Miyamaru, Hiroyuki

2017-09-01

Many benchmark experiments have been carried out with DT neutrons, especially for fusion reactor development. These integral experiments have seemed, vaguely, to validate the nuclear data below 14 MeV; however, no precise studies exist to date. The authors' group thus started to examine how well benchmark experiments with DT neutrons can play a benchmarking role for energies below 14 MeV. Recently, as a next phase, the energy range was expanded to the entire region to generalize that discussion. In this study, thought experiments with finer energy bins have been conducted to discuss how to estimate the performance of benchmark experiments in general. The thought experiments with a point detector show that the sensitivity to a discrepancy appearing in the benchmark analysis is "equally" due not only to the contribution conveyed directly to the detector, but also to the indirect contribution of neutrons (named (A)) that produce the neutrons conveying the contribution, the indirect contribution of neutrons (B) that produce the neutrons (A), and so on. From this concept, a sensitivity analysis carried out in advance would make clear how well, and at which energies, nuclear data could be benchmarked with a given benchmark experiment.
A Seafloor Benchmark for 3-dimensional Geodesy

NASA Astrophysics Data System (ADS)

Chadwell, C. D.; Webb, S. C.; Nooner, S. L.

2014-12-01

We have developed an inexpensive, permanent seafloor benchmark to increase the longevity of seafloor geodetic measurements. The benchmark provides a physical tie to the sea floor lasting for decades (perhaps longer) on which geodetic sensors can be repeatedly placed and removed with millimeter resolution. Global coordinates estimated with seafloor geodetic techniques will remain attached to the benchmark, allowing for the interchange of sensors as they fail or become obsolete, or for the sensors to be removed and used elsewhere, all the while maintaining a coherent series of positions referenced to the benchmark. The benchmark has been designed to free fall from the sea surface with transponders attached. The transponder can be recalled via an acoustic command sent from the surface to release from the benchmark and freely float to the sea surface for recovery. The sensor attachment to the benchmark will last from a few days to a few years, depending on the specific needs of the experiment. The recovered sensors are then available to be reused at other locations, or again at the same site in the future. Three pins on the sensor frame mate precisely and unambiguously with three grooves on the benchmark. To reoccupy a benchmark, a Remotely Operated Vehicle (ROV) uses its manipulator arm to place the sensor pins into the benchmark grooves. In June 2014 we deployed four benchmarks offshore central Oregon. We used the ROV Jason to successfully demonstrate the removal and replacement of packages onto the benchmark. We will show the benchmark design and its operational capabilities. Presently, models of megathrust slip within the Cascadia Subduction Zone (CSZ) are mostly constrained by the sub-aerial GPS vectors from the Plate Boundary Observatory, a part of Earthscope. More long-lived seafloor geodetic measures are needed to better understand the earthquake and tsunami risk associated with a large rupture of the thrust fault within the Cascadia subduction zone.
Copper benchmark experiment for the testing of JEFF-3.2 nuclear data for fusion applications

NASA Astrophysics Data System (ADS)

Angelone, M.; Flammini, D.; Loreti, S.; Moro, F.; Pillon, M.; Villar, R.; Klix, A.; Fischer, U.; Kodeli, I.; Perel, R. L.; Pohorecky, W.

2017-09-01

A neutronics benchmark experiment on a pure copper block (dimensions 60 × 70 × 70 cm³), aimed at testing and validating the recent nuclear data libraries for fusion applications, was performed in the frame of the European Fusion Program at the 14 MeV ENEA Frascati Neutron Generator (FNG). Reaction rates, neutron flux spectra and doses were measured using different experimental techniques (e.g. activation foil techniques, NE213 scintillator and thermoluminescent detectors). This paper first summarizes the analyses of the experiment carried out using the MCNP5 Monte Carlo code and the European JEFF-3.2 library. Large discrepancies between calculation (C) and experiment (E) were found for the reaction rates in both the high and low neutron energy ranges. The analysis was complemented by sensitivity/uncertainty (S/U) analyses using the deterministic and Monte Carlo codes SUSD3D and MCSEN, respectively. The S/U analyses made it possible to identify the cross sections and energy ranges that most affect the calculated responses. The largest discrepancy among the C/E values was observed for the thermal (capture) reactions, indicating severe deficiencies in the ⁶³,⁶⁵Cu capture and elastic cross sections at lower rather than at high energy. Deterministic and MC codes produced similar results. The 14 MeV copper experiment and its analysis thus call for a revision of the JEFF-3.2 copper cross section and covariance data evaluation. A new analysis of the experiment was performed with the MCNP5 code using the revised JEFF-3.3-T2 library released by NEA and a new, not yet distributed, revised JEFF-3.2 Cu evaluation produced by KIT. A noticeable improvement of the C/E results was obtained with both new libraries.
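As a hedged illustration of the C/E comparison described in this abstract, the following Python sketch computes calculation-to-experiment ratios and flags responses that fall outside an acceptance band. The response names, the numerical values, and the 10% tolerance are placeholder assumptions chosen for illustration only; they are not data or criteria from the experiment.

```python
def c_over_e(calculated, measured):
    """Return the C/E ratio for every response present in both mappings."""
    return {name: calculated[name] / measured[name]
            for name in calculated if name in measured}


def flag_discrepancies(ratios, tolerance=0.10):
    """Return the sorted names whose C/E deviates from 1 by more than tolerance."""
    return sorted(name for name, ratio in ratios.items()
                  if abs(ratio - 1.0) > tolerance)


# Placeholder values (hypothetical, NOT measured or calculated data).
calculated = {"capture_response": 0.60, "threshold_response": 0.95}
measured = {"capture_response": 1.00, "threshold_response": 1.00}

ratios = c_over_e(calculated, measured)
flagged = flag_discrepancies(ratios)
print(ratios, flagged)
```

A real analysis would also carry the experimental and calculational uncertainties through to each ratio; this sketch leaves them out for brevity.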
Benchmarking study of the MCNP code against cold critical experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sitaraman, S.

1991-01-01

The purpose of this study was to benchmark the widely used Monte Carlo code MCNP against a set of cold critical experiments with a view to using the code as a means of independently verifying the performance of faster but less accurate Monte Carlo and deterministic codes. The experiments simulated consisted of both fast and thermal criticals as well as fuel in a variety of chemical forms. A standard set of benchmark cold critical experiments was modeled. These included the two fast experiments, GODIVA and JEZEBEL, the TRX metallic uranium thermal experiments, the Babcock and Wilcox oxide and mixed oxide experiments, and the Oak Ridge National Laboratory (ORNL) and Pacific Northwest Laboratory (PNL) nitrate solution experiments. The principal case studied was a small critical experiment that was performed with boiling water reactor bundles.
Benchmark gamma-ray skyshine experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nason, R.R.; Shultis, J.K.; Faw, R.E.

1982-01-01

A benchmark gamma-ray skyshine experiment is described in which ⁶⁰Co sources were either collimated into an upward 150-deg conical beam or shielded vertically by two different thicknesses of concrete. A NaI(Tl) spectrometer and a high-pressure ion chamber were used to measure, respectively, the energy spectrum and the 4π exposure rate of the air-reflected gamma photons up to 700 m from the source. Analyses of the data and comparison to DOT discrete ordinates calculations are presented.
[Benchmark experiment to verify radiation transport calculations for dosimetry in radiation therapy].

PubMed

Renner, Franziska

2016-09-01

Monte Carlo simulations are regarded as the most accurate method of solving complex problems in the field of dosimetry and radiation transport. In (external) radiation therapy they are increasingly used for the calculation of dose distributions during treatment planning. In comparison to other algorithms for the calculation of dose distributions, Monte Carlo methods have the capability of improving the accuracy of dose calculations - especially under complex circumstances (e.g. consideration of inhomogeneities). However, there is a lack of knowledge of how accurate the results of Monte Carlo calculations are on an absolute basis. A practical verification of the calculations can be performed by direct comparison with the results of a benchmark experiment. This work presents such a benchmark experiment and compares its results (with detailed consideration of measurement uncertainty) with the results of Monte Carlo calculations using the well-established Monte Carlo code EGSnrc. The experiment was designed to have parallels to external beam radiation therapy with respect to the type and energy of the radiation, the materials used and the kind of dose measurement. Because the properties of the beam have to be well known in order to compare the results of the experiment and the simulation on an absolute basis, the benchmark experiment was performed using the research electron accelerator of the Physikalisch-Technische Bundesanstalt (PTB), whose beam was accurately characterized in advance. The benchmark experiment and the corresponding Monte Carlo simulations were carried out for two different types of ionization chambers and the results were compared. Considering the uncertainty, which is about 0.7 % for the experimental values and about 1.0 % for the Monte Carlo simulation, the results of the simulation and the experiment coincide. Copyright © 2015. Published by Elsevier GmbH.
Providing Nuclear Criticality Safety Analysis Education through Benchmark Experiment Evaluation

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; David W. Nigg

2009-11-01

One of the challenges that today's new workforce of nuclear criticality safety engineers face is the opportunity to provide assessment of nuclear systems and establish safety guidelines without having received significant experience or hands-on training prior to graduation. Participation in the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and/or the International Reactor Physics Experiment Evaluation Project (IRPhEP) provides students and young professionals the opportunity to gain experience and enhance critical engineering skills.
Adaptive unified continuum FEM modeling of a 3D FSI benchmark problem.

PubMed

Jansson, Johan; Degirmenci, Niyazi Cem; Hoffman, Johan

2017-09-01

In this paper, we address a 3D fluid-structure interaction benchmark problem that represents important characteristics of biomedical modeling. We present a goal-oriented adaptive finite element methodology for incompressible fluid-structure interaction based on a streamline diffusion-type stabilization of the balance equations for mass and momentum for the entire continuum in the domain, which is implemented in the Unicorn/FEniCS software framework. A phase marker function and its corresponding transport equation are introduced to select the constitutive law, where the mesh tracks the discontinuous fluid-structure interface. This results in a unified simulation method for fluids and structures. We present detailed results for the benchmark problem compared with experiments, together with a mesh convergence study. Copyright © 2016 John Wiley & Sons, Ltd.
Benchmark Evaluation of HTR-PROTEUS Pebble Bed Experimental Program

DOE PAGES

Bess, John D.; Montierth, Leland; Köberl, Oliver; ...

2014-10-09

Benchmark models were developed to evaluate 11 critical core configurations of the HTR-PROTEUS pebble bed experimental program. Various additional reactor physics measurements were performed as part of this program; currently only a total of 37 absorber rod worth measurements have been evaluated as acceptable benchmark experiments for Cores 4, 9, and 10. Dominant uncertainties in the experimental keff for all core configurations come from uncertainties in the ²³⁵U enrichment of the fuel, impurities in the moderator pebbles, and the density and impurity content of the radial reflector. Calculations of keff with MCNP5 and ENDF/B-VII.0 neutron nuclear data are greater than the benchmark values but within 1% and also within the 3σ uncertainty, except for Core 4, which is the only randomly packed pebble configuration. Repeated calculations of keff with MCNP6.1 and ENDF/B-VII.1 are lower than the benchmark values and within 1% (~3σ) except for Cores 5 and 9, which calculate lower than the benchmark eigenvalues within 4σ. The primary difference between the two nuclear data libraries is the adjustment of the absorption cross section of graphite. Simulations of the absorber rod worth measurements are within 3σ of the benchmark experiment values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.
Benchmark tests of JENDL-3.2 for thermal and fast reactors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Takano, Hideki; Akie, Hiroshi; Kikuchi, Yasuyuki

1994-12-31

Benchmark calculations for a variety of thermal and fast reactors have been performed using the newly evaluated JENDL-3 Version-2 (JENDL-3.2) file. In the thermal reactor calculations for the uranium and plutonium fueled cores of TRX and TCA, the keff and lattice parameters were well predicted. The fast reactor calculations for ZPPR-9 and FCA assemblies showed that the keff, the reactivity worths of Doppler, sodium void and control rod, and the reaction rate distribution were in very good agreement with the experiments.
Material Activation Benchmark Experiments at the NuMI Hadron Absorber Hall in Fermilab

NASA Astrophysics Data System (ADS)

Matsumura, H.; Matsuda, N.; Kasugai, Y.; Toyoda, A.; Yashima, H.; Sekimoto, S.; Iwase, H.; Oishi, K.; Sakamoto, Y.; Nakashima, H.; Leveling, A.; Boehnlein, D.; Lauten, G.; Mokhov, N.; Vaziri, K.

2014-06-01

In our previous study, double and mirror-symmetric activation peaks found for Al and Au arranged spatially on the back of the hadron absorber of the NuMI beamline at Fermilab were considerably higher than those expected purely from muon-induced reactions. From material activation benchmark experiments, we conclude that this activation is due to hadrons with energy greater than 3 GeV that had passed downstream through small gaps in the hadron absorber.
All inclusive benchmarking.

PubMed

Ellis, Judith

2006-07-01

benchmarking activity and although this does not seem to have restricted its popularity in quantitative activity, reticence about the value of the more qualitative approaches, for example Essence of Care, needs to be overcome in order to improve the quality of patient care and experiences. The perceived immeasurability and subjectivity of Essence of Care and clinical practice benchmarks means that these benchmarking approaches are not always accepted or supported by health service organizations as valid benchmarking activity. In conclusion, Essence of Care benchmarking is a sophisticated clinical practice benchmarking approach which needs to be accepted as an integral part of health service benchmarking activity to support improvement in the quality of patient care and experiences.
Benchmark experiments at ASTRA facility on definition of space distribution of ²³⁵U fission reaction rate

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobrov, A. A.; Boyarinov, V. F.; Glushkov, A. E.

2012-07-01

Results of critical experiments performed at five ASTRA facility configurations modeling high-temperature helium-cooled graphite-moderated reactors are presented. Results of experiments on definition of the space distribution of the ²³⁵U fission reaction rate, performed at four of these five configurations, are presented in more detail. Analysis of the available information showed that all criticality experiments at these five configurations are acceptable for use as critical benchmark experiments. All experiments on definition of the space distribution of the ²³⁵U fission reaction rate are acceptable for use as physical benchmark experiments. (authors)
Benchmarking the Physical Therapist Academic Environment to Understand the Student Experience.

PubMed

Shields, Richard K; Dudley-Javoroski, Shauna; Sass, Kelly J; Becker, Marcie

2018-04-19

Identifying excellence in physical therapist academic environments is complicated by the lack of nationally available benchmarking data. The objective of this study was to compare a physical therapist academic environment to another health care profession (medicine) academic environment using the Association of American Medical Colleges Graduation Questionnaire (GQ) survey. The design consisted of longitudinal benchmarking. Between 2009 and 2017, the GQ was administered to graduates of a physical therapist education program (Department of Physical Therapy and Rehabilitation Science, Carver College of Medicine, The University of Iowa [PTRS]). Their ratings of the educational environment were compared to nationwide data for a peer health care profession (medicine) educational environment. Benchmarking to the GQ capitalizes on a large, psychometrically validated database of academic domains that may be broadly applicable to health care education. The GQ captures critical information about the student experience (eg, faculty professionalism, burnout, student mistreatment) that can be used to characterize the educational environment. This study hypothesized that the ratings provided by 9 consecutive cohorts of PTRS students (n = 316) would reveal educational environment differences from academic medical education. PTRS students reported significantly higher ratings of the educational emotional climate and student-faculty interactions than medical students. PTRS and medical students did not differ on ratings of empathy and tolerance for ambiguity. PTRS students reported significantly lower ratings of burnout than medical students. PTRS students descriptively reported observing greater faculty professionalism and experiencing less mistreatment than medical students. The generalizability of these findings to other physical therapist education environments has not been established. Selected elements of the GQ survey revealed differences in the educational environments
Categorical Regression and Benchmark Dose Software 3.0

EPA Science Inventory

The objective of this full-day course is to provide participants with interactive training on the use of the U.S. Environmental Protection Agency’s (EPA) Benchmark Dose software (BMDS, version 3.0, released fall 2018) and Categorical Regression software (CatReg, version 3.1...
Benchmarking, benchmarks, or best practices? Applying quality improvement principles to decrease surgical turnaround time.

PubMed

Mitchell, L

1996-01-01

The processes of benchmarking, benchmark data comparative analysis, and study of best practices are distinctly different. The study of best practices is explained with an example based on the Arthur Andersen & Co. 1992 "Study of Best Practices in Ambulatory Surgery". The results of a national best practices study in ambulatory surgery were used to provide our quality improvement team with the goal of improving the turnaround time between surgical cases. The team used a seven-step quality improvement problem-solving process to improve the surgical turnaround time. The national benchmark for turnaround times between surgical cases in 1992 was 13.5 minutes. The initial turnaround time at St. Joseph's Medical Center was 19.9 minutes. After the team implemented solutions, the time was reduced to an average of 16.3 minutes, an 18% improvement. Cost-benefit analysis showed a potential enhanced revenue of approximately $300,000, or a potential savings of $10,119. Applying quality improvement principles to benchmarking, benchmarks, or best practices can improve process performance. Understanding which form of benchmarking the institution wishes to embark on will help focus a team and use appropriate resources. Communicating with professional organizations that have experience in benchmarking will save time and money and help achieve the desired results.
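The improvement figure quoted in this abstract can be checked with a few lines of arithmetic. The Python sketch below uses only the three turnaround times stated above; the function and variable names are illustrative.

```python
def percent_improvement(before, after):
    """Relative reduction from `before` to `after`, expressed in percent."""
    return 100.0 * (before - after) / before


initial_minutes = 19.9    # initial turnaround time reported in the abstract
improved_minutes = 16.3   # turnaround time after the team's solutions
benchmark_minutes = 13.5  # 1992 national benchmark reported in the abstract

improvement = percent_improvement(initial_minutes, improved_minutes)
remaining_gap = improved_minutes - benchmark_minutes

print(f"improvement: {improvement:.1f}%")
print(f"remaining gap to benchmark: {remaining_gap:.1f} minutes")
```

The reduction of 3.6 minutes on a 19.9-minute baseline rounds to the 18% stated in the abstract, and the improved time still sits about 2.8 minutes above the national benchmark.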
JEFF-3.1, ENDF/B-VII and JENDL-3.3 Critical Assemblies Benchmarking With the Monte Carlo Code TRIPOLI

NASA Astrophysics Data System (ADS)

Sublet, Jean-Christophe

2008-02-01

ENDF/B-VII.0, the first release of the ENDF/B-VII nuclear data library, was formally released in December 2006. Prior to this event the European JEFF-3.1 nuclear data library was distributed in April 2005, while the Japanese JENDL-3.3 library has been available since 2002. The recent releases of these neutron transport libraries and special purpose files, the updates of the processing tools, and the significant progress in computer power now allow far better, leaner integration of Monte Carlo codes and pointwise libraries, leading to enhanced benchmarking studies. A TRIPOLI-4.4 critical assembly suite has been set up as a collection of 86 benchmarks taken principally from the International Handbook of Evaluated Criticality Benchmarks Experiments (2006 Edition). It contains cases for a variety of U and Pu fuels and systems, ranging from fast to deep thermal solutions and assemblies. It covers cases with a variety of moderators, reflectors, absorbers, spectra and geometries. The results presented show that while the most recent library, ENDF/B-VII.0, which benefited from the timely development of JENDL-3.3 and JEFF-3.1, produces better overall results, they also clearly suggest that improvements are still needed. This is true in particular in Light Water Reactor applications for thermal and epithermal plutonium data for all libraries and fast uranium data for JEFF-3.1 and JENDL-3.3. It is also true that other domains in which Monte Carlo codes are being used, such as astrophysics, fusion, high-energy or medical physics, and radiation transport in general, benefit notably from such enhanced libraries. This is particularly noticeable in terms of the number of isotopes and materials available, the overall quality of the data, and the much broader energy range for which evaluated (as opposed to modeled) data are available, spanning from meV to hundreds of MeV. In pointing out the impact of the different nuclear data at the library but also the isotopic levels one could not help
TRUST. I. A 3D externally illuminated slab benchmark for dust radiative transfer

NASA Astrophysics Data System (ADS)

Gordon, K. D.; Baes, M.; Bianchi, S.; Camps, P.; Juvela, M.; Kuiper, R.; Lunttila, T.; Misselt, K. A.; Natale, G.; Robitaille, T.; Steinacker, J.

2017-07-01

Context. The radiative transport of photons through arbitrary three-dimensional (3D) structures of dust is a challenging problem due to the anisotropic scattering of dust grains and strong coupling between different spatial regions. The radiative transfer problem in 3D is solved using Monte Carlo or Ray Tracing techniques as no full analytic solution exists for the true 3D structures. Aims: We provide the first 3D dust radiative transfer benchmark composed of a slab of dust with uniform density externally illuminated by a star. This simple 3D benchmark is explicitly formulated to provide tests of the different components of the radiative transfer problem including dust absorption, scattering, and emission. Methods: The details of the external star, the slab itself, and the dust properties are provided. This benchmark includes models with a range of dust optical depths fully probing cases that are optically thin at all wavelengths to optically thick at most wavelengths. The dust properties adopted are characteristic of the diffuse Milky Way interstellar medium. This benchmark includes solutions for the full dust emission including single photon (stochastic) heating as well as two simplifying approximations: One where all grains are considered in equilibrium with the radiation field and one where the emission is from a single effective grain with size-distribution-averaged properties. A total of six Monte Carlo codes and one Ray Tracing code provide solutions to this benchmark. Results: The solution to this benchmark is given as global spectral energy distributions (SEDs) and images at select diagnostic wavelengths from the ultraviolet through the infrared. Comparison of the results revealed that the global SEDs are consistent on average to a few percent for all but the scattered stellar flux at very high optical depths. The image results are consistent within 10%, again except for the stellar scattered flux at very high optical depths. The lack of agreement between
Implementation of Benchmarking Transportation Logistics Practices and Future Benchmarking Organizations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thrower, A.W.; Patric, J.; Keister, M.

2008-07-01

The purpose of the Office of Civilian Radioactive Waste Management's (OCRWM) Logistics Benchmarking Project is to identify established government and industry practices for the safe transportation of hazardous materials which can serve as a yardstick for design and operation of OCRWM's national transportation system for shipping spent nuclear fuel and high-level radioactive waste to the proposed repository at Yucca Mountain, Nevada. The project will present logistics and transportation practices and develop implementation recommendations for adaptation by the national transportation system. This paper will describe the process used to perform the initial benchmarking study, highlight interim findings, and explain how these findings are being implemented. It will also provide an overview of the next phase of benchmarking studies. The benchmarking effort will remain a high-priority activity throughout the planning and operational phases of the transportation system. The initial phase of the project focused on government transportation programs to identify those practices which are most clearly applicable to OCRWM. These Federal programs have decades of safe transportation experience, strive for excellence in operations, and implement effective stakeholder involvement, all of which parallel OCRWM's transportation mission and vision. The initial benchmarking project focused on four business processes that are critical to OCRWM's mission success, and can be incorporated into OCRWM planning and preparation in the near term. The processes examined were: transportation business model, contract management/out-sourcing, stakeholder relations, and contingency planning. More recently, OCRWM examined logistics operations of AREVA NC's Business Unit Logistics in France. The next phase of benchmarking will focus on integrated domestic and international commercial radioactive logistic operations. The prospective companies represent large scale shippers and have vast

Benchmarking infrastructure for mutation text mining

PubMed Central

2014-01-01

Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600
Benchmarking infrastructure for mutation text mining.

PubMed

Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

2014-02-25

Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
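The abstract states that performance metrics are computed with SPARQL queries over RDF annotations but does not list the metrics. As a rough illustration of what such an evaluation typically measures, the Python sketch below computes precision, recall, and F1 by comparing gold-standard annotations with system output. This is a plain-Python stand-in, not the authors' SPARQL queries; the choice of metrics, the (document, mutation) tuple representation, and the sample annotations are all assumptions made for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of annotations by exact match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (document id, mutation mention) pairs, for illustration only.
gold = {("doc1", "V600E"), ("doc1", "G12D"), ("doc2", "R175H")}
predicted = {("doc1", "V600E"), ("doc2", "R175H"), ("doc2", "A123T")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact matching on tuples is the simplest possible scoring rule; a real corpus evaluation may also need to handle partial span overlap and normalization of mutation names.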
Benchmark measurements and calculations of a 3-dimensional neutron streaming experiment

NASA Astrophysics Data System (ADS)

Barnett, D. A., Jr.

1991-02-01

An experimental assembly known as the Dog-Legged Void assembly was constructed to measure the effect of neutron streaming in iron and void regions. The primary purpose of the measurements was to provide benchmark data against which various neutron transport calculation tools could be compared. The measurements included neutron flux spectra at four places and integral measurements at two places in the iron streaming path, as well as integral measurements along several axial traverses. These data have been used in the verification of Oak Ridge National Laboratory's three-dimensional discrete ordinates code, TORT. For a base case calculation using one-half inch mesh spacing, finite difference spatial differencing, an S₁₆ quadrature and P₁ cross sections in the MUFT multigroup structure, the calculated solution agreed to within 18 percent with the spectral measurements and to within 24 percent with the integral measurements. Variations on the base case using a few-group energy structure and P₁ and P₃ cross sections showed similar agreement. Calculations using a linear nodal spatial differencing scheme and few-group cross sections also showed similar agreement. For the same mesh size, the nodal method was seen to require 2.2 times as much CPU time as the finite difference method. A nodal calculation using a typical mesh spacing of 2 inches, which had approximately 32 times fewer mesh cells than the base case, agreed with the measurements to within 34 percent and yet required only 8 percent of the CPU time.
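The CPU-time figures in this abstract are roughly self-consistent, which a short calculation makes visible. The Python sketch below assumes that CPU time scales linearly with the number of mesh cells; that scaling is an assumption introduced here, not a statement from the abstract. The cost ratio of 2.2, the factor of about 32 fewer cells, and the reported 8 percent are taken from the text above.

```python
def estimated_relative_cpu(cost_ratio_per_cell, cell_reduction_factor):
    """Estimate coarse-mesh nodal CPU time relative to the fine-mesh
    finite difference base case, assuming time is proportional to cell count."""
    return cost_ratio_per_cell / cell_reduction_factor


nodal_cost_ratio = 2.2     # nodal vs finite difference at equal mesh size
cell_reduction = 32.0      # approximate reduction in mesh cells at 2-inch spacing
reported_fraction = 0.08   # fraction of base case CPU time reported

estimate = estimated_relative_cpu(nodal_cost_ratio, cell_reduction)
print(f"estimated fraction: {estimate:.4f}, reported fraction: {reported_fraction:.2f}")
```

Under the linear-scaling assumption the estimate comes out near 7 percent of the base case CPU time, close to the 8 percent reported; the small difference is plausible given per-run overhead that does not scale with cell count.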
BENCHMARK DOSE TECHNICAL GUIDANCE DOCUMENT ...

EPA Pesticide Factsheets

The purpose of this document is to provide guidance for the Agency on the application of the benchmark dose approach in determining the point of departure (POD) for health effects data, whether a linear or nonlinear low dose extrapolation is used. The guidance includes discussion on computation of benchmark doses and benchmark concentrations (BMDs and BMCs) and their lower confidence limits, data requirements, dose-response analysis, and reporting requirements. This guidance is based on today's knowledge and understanding, and on experience gained in using this approach.
Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results

NASA Technical Reports Server (NTRS)

Meuer, Hans-Werner; Simon, Horst D.; Strohmeier, Erich; Lasinski, T. A. (Technical Monitor)

1994-01-01

In the last three years extensive performance data have been reported for parallel machines based both on the NAS Parallel Benchmarks and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS parallel benchmarks, we have also included peak performance of the machine, and the LINPACK n and n_1/2 values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP each have a unique signature. 3) The remaining NPB can be grouped into three groups as follows: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize the overall NPB performance. Our poster presentation will follow a standard poster format, and will present the data of our statistical analysis in detail.
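The statement that benchmark results are strongly correlated with peak performance can be made concrete with a correlation coefficient. The following is a minimal sketch of a Pearson correlation in plain Python; the function name and the sample numbers are illustrative assumptions, not values from the study, which also used factor, cluster, and regression analyses not shown here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical machines: peak performance and one benchmark result each.
peak = [1.0, 2.0, 4.0, 8.0]
benchmark = [0.4, 0.9, 1.7, 3.1]
r = pearson(peak, benchmark)
```

A value of `r` close to 1 indicates that the benchmark ranks machines in much the same way as peak performance does.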
Revisiting Yasinsky and Henry's benchmark using modern nodal codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feltus, M.A.; Becker, M.W.

1995-12-31

The numerical experiments analyzed by Yasinsky and Henry are quite trivial by comparison with today's standards because they used the finite difference code WIGLE for their benchmark. Also, this problem is a simple slab (one-dimensional) case with no feedback mechanisms. This research attempts to obtain STAR (Ref. 2) and NEM (Ref. 3) code results in order to produce a more modern kinetics benchmark with results comparable to WIGLE.
Use of integral experiments in support to the validation of JEFF-3.2 nuclear data evaluation

NASA Astrophysics Data System (ADS)

Leclaire, Nicolas; Cochet, Bertrand; Jinaphanh, Alexis; Haeck, Wim

2017-09-01

For many years now, IRSN has developed its own Monte Carlo continuous energy capability, which allows testing various nuclear data libraries. To that end, a validation database of 1136 experiments was built from cases used for the validation of the APOLLO2-MORET 5 multigroup route of the CRISTAL V2.0 package. In this paper, the keff values obtained for more than 200 benchmarks using the JEFF-3.1.1 and JEFF-3.2 libraries are compared to benchmark keff values, and the main discrepancies are analyzed with regard to the neutron spectrum. Special attention is paid to benchmarks for which the results changed substantially between the two JEFF-3 versions.
The skyshine benchmark experiment revisited.

PubMed

Terry, Ian R

2005-01-01

With the coming renaissance of nuclear power, heralded by new nuclear power plant construction in Finland, the issue of qualifying modern tools for calculation becomes prominent. Among the calculations required may be the determination of radiation levels outside the plant owing to skyshine. For example, knowledge of the degree of accuracy in the calculation of gamma skyshine through the turbine hall roof of a BWR plant is important. Modern survey programs which can calculate skyshine dose rates tend to be qualified only by verification with the results of Monte Carlo calculations. However, in the past, exacting experimental work has been performed in the field for gamma skyshine, notably the benchmark work in 1981 by Shultis and co-workers, which considered not just the open source case but also the effects of placing a concrete roof above the source enclosure. The latter case is a better reflection of reality as safety considerations nearly always require the source to be shielded in some way, usually by substantial walls but by a thinner roof. One of the tools developed since that time, which can both calculate skyshine radiation and accurately model the geometrical set-up of an experiment, is the code RANKERN, which is used by Framatome ANP and other organisations for general shielding design work. The following description concerns the use of this code to re-address the experimental results from 1981. This then provides a realistic gauge to validate, but also to set limits on, the program for future gamma skyshine applications within the applicable licensing procedures for all users of the code.
Seismo-acoustic ray model benchmarking against experimental tank data.

PubMed

Camargo Rodríguez, Orlando; Collis, Jon M; Simpson, Harry J; Ey, Emanuel; Schneiderwind, Joseph; Felisberto, Paulo

2012-08-01

Acoustic predictions of the recently developed traceo ray model, which accounts for bottom shear properties, are benchmarked against tank experimental data from the EPEE-1 and EPEE-2 (Elastic Parabolic Equation Experiment) experiments. Both experiments are representative of signal propagation in a Pekeris-like shallow-water waveguide over a non-flat isotropic elastic bottom, where significant interaction of the signal with the bottom can be expected. The benchmarks show, in particular, that the ray model can be as accurate as a parabolic approximation model benchmarked in similar conditions. The results of benchmarking are important, on the one hand, as a preliminary experimental validation of the model and, on the other hand, as a demonstration of the reliability of the ray approach for seismo-acoustic applications.
Benchmarking gate-based quantum computers

NASA Astrophysics Data System (ADS)

Michielsen, Kristel; Nocon, Madita; Willsch, Dennis; Jin, Fengping; Lippert, Thomas; De Raedt, Hans

2017-11-01

With the advent of public access to small gate-based quantum processors, it becomes necessary to develop a benchmarking methodology such that independent researchers can validate the operation of these processors. We explore the usefulness of a number of simple quantum circuits as benchmarks for gate-based quantum computing devices and show that circuits performing identity operations are very simple, scalable and sensitive to gate errors and are therefore very well suited for this task. We illustrate the procedure by presenting benchmark results for the IBM Quantum Experience, a cloud-based platform for gate-based quantum computing.
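Why identity circuits are sensitive to gate errors can be seen in a single-qubit toy model. The sketch below assumes a purely coherent over-rotation error, which is an illustrative assumption and not the error model or circuit set used by the authors: each nominal X gate rotates by `pi + epsilon` about the x axis, so a sequence of `2 * n_pairs` gates that should act as the identity accumulates a total rotation error of `2 * n_pairs * epsilon`.

```python
import math


def identity_circuit_survival(n_pairs, epsilon):
    """Probability of measuring |0> after 2 * n_pairs imperfect X gates.

    Assumed error model: every gate is a rotation about the x axis by
    (pi + epsilon) instead of pi. Rotations about the same axis add, and a
    rotation by angle theta leaves |0> in |0> with probability cos^2(theta / 2).
    """
    n_gates = 2 * n_pairs
    total_angle = n_gates * (math.pi + epsilon)
    return math.cos(total_angle / 2) ** 2


ideal = identity_circuit_survival(5, 0.0)      # perfect gates
short = identity_circuit_survival(5, 0.01)     # 10 slightly over-rotated gates
longer = identity_circuit_survival(10, 0.01)   # 20 slightly over-rotated gates
```

In this model the ideal circuit always returns the qubit to |0>, while the deviation from 1 grows with circuit length, which is what makes longer identity circuits a more sensitive probe of small gate errors.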
Limitations of Community College Benchmarking and Benchmarks

ERIC Educational Resources Information Center

Bers, Trudy H.

2006-01-01

This chapter distinguishes between benchmarks and benchmarking, describes a number of data and cultural limitations to benchmarking projects, and suggests that external demands for accountability are the dominant reason for growing interest in benchmarking among community colleges.
Analogue experiments as benchmarks for models of lava flow emplacement

NASA Astrophysics Data System (ADS)

Garel, F.; Kaminski, E. C.; Tait, S.; Limare, A.

2013-12-01

During an effusive volcanic eruption, the crisis management is mainly based on the prediction of lava flow advance and its velocity. The spreading of a lava flow, seen as a gravity current, depends on its "effective rheology" and on the effusion rate. Fast-computing models have arisen in the past decade in order to predict in near real time lava flow path and rate of advance. This type of model, crucial to mitigate volcanic hazards and organize potential evacuation, has been mainly compared a posteriori to real cases of emplaced lava flows. The input parameters of such simulations applied to natural eruptions, especially effusion rate and topography, are often not known precisely, and are difficult to evaluate after the eruption. It is therefore not straightforward to identify the causes of discrepancies between model outputs and observed lava emplacement, whereas the comparison of models with controlled laboratory experiments appears easier. The challenge for numerical simulations of lava flow emplacement is to model the simultaneous advance and thermal structure of viscous lava flows. To provide original constraints later to be used in benchmark numerical simulations, we have performed lab-scale experiments investigating the cooling of isoviscous gravity currents. The simplest experimental set-up is as follows: silicone oil, whose viscosity, around 5 Pa.s, varies less than a factor of 2 in the temperature range studied, is injected from a point source onto a horizontal plate and spreads axisymmetrically. The oil is injected hot, and progressively cools down to ambient temperature away from the source. Once the flow is developed, it presents a stationary radial thermal structure whose characteristics depend on the input flow rate. In addition to the experimental observations, we have developed in Garel et al., JGR, 2012 a theoretical model confirming the relationship between supply rate, flow advance and stationary surface thermal structure. We also provide
Community-based benchmarking of the CMIP DECK experiments

NASA Astrophysics Data System (ADS)

Gleckler, P. J.

2015-12-01

A diversity of community-based efforts are independently developing "diagnostic packages" with little or no coordination between them. A short list of examples includes NCAR's Climate Variability Diagnostics Package (CVDP), ORNL's International Land Model Benchmarking (ILAMB), LBNL's Toolkit for Extreme Climate Analysis (TECA), PCMDI's Metrics Package (PMP), the EU EMBRACE ESMValTool, the WGNE MJO diagnostics package, and CFMIP diagnostics. The full value of these efforts cannot be realized without some coordination. As a first step, a WCRP effort has initiated a catalog to document candidate packages that could potentially be applied in a "repeat-use" fashion to all simulations contributed to the CMIP DECK (Diagnostic, Evaluation and Characterization of Klima) experiments. Some coordination of community-based diagnostics has the additional potential to improve how CMIP modeling groups analyze their simulations during model development. The fact that most modeling groups now maintain a "CMIP compliant" data stream means that in principle, without much effort, they could readily adopt a set of well organized diagnostic capabilities specifically designed to operate on CMIP DECK experiments. Ultimately, a detailed listing of and access to analysis codes that are demonstrated to work "out of the box" with CMIP data could enable model developers (and others) to select those codes they wish to implement in-house, potentially enabling more systematic evaluation during the model development process.
Developing Evidence for Action on the Postgraduate Experience: An Effective Local Instrument to Move beyond Benchmarking

ERIC Educational Resources Information Center

Sampson, K. A.; Johnston, L.; Comer, K.; Brogt, E.

2016-01-01

Summative and benchmarking surveys to measure the postgraduate student research experience are well reported in the literature. While useful, we argue that local instruments that provide formative resources with an academic development focus are also required. If higher education institutions are to move beyond the identification of issues and…
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are used to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE PAGES

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.; ...

2017-06-13

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are used to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.
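The abstract does not define the cycle-integrated mean absolute error. One plausible reading, used here purely as an assumption, is the absolute model-minus-observation error integrated over the diurnal cycle with the trapezoidal rule and divided by the cycle length. The function name and sample values below are hypothetical.

```python
def cycle_integrated_mae(times, model, observed):
    """Time-averaged absolute error over one cycle (trapezoidal rule).

    times: sample times in hours, strictly increasing.
    model, observed: values of the quantity of interest at those times.
    The exact definition used in the benchmark may differ from this sketch.
    """
    abs_err = [abs(m - o) for m, o in zip(model, observed)]
    integral = sum(
        0.5 * (abs_err[i] + abs_err[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )
    return integral / (times[-1] - times[0])


# Made-up hub-height wind speeds (m/s) sampled every 6 hours.
times = [0, 6, 12, 18, 24]
model = [5.0, 7.0, 9.0, 7.0, 5.0]
observed = [6.0, 7.0, 8.0, 7.0, 5.0]
mae = cycle_integrated_mae(times, model, observed)
```

Normalizing by the cycle length keeps the result in the units of the quantity itself, so errors from simulations with different output intervals remain comparable.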
Benchmarking atomic physics models for magnetically confined fusion plasma physics experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

May, M.J.; Finkenthal, M.; Soukhanovskii, V.

In present magnetically confined fusion devices, high and intermediate Z impurities are either puffed into the plasma for divertor radiative cooling experiments or are sputtered from the high Z plasma facing armor. The beneficial cooling of the edge as well as the detrimental radiative losses from the core of these impurities can be properly understood only if the atomic physics used in the modeling of the cooling curves is very accurate. To this end, a comprehensive experimental and theoretical analysis of some relevant impurities is undertaken. Gases (Ne, Ar, Kr, and Xe) are puffed and nongases are introduced through laser ablation into the FTU tokamak plasma. The charge state distributions and total density of these impurities are determined from spatial scans of several photometrically calibrated vacuum ultraviolet and x-ray spectrographs (3–1600 Å), the multiple ionization state transport code (MIST) and a collisional radiative model. The radiative power losses are measured with bolometry, and the emissivity profiles were measured by a visible bremsstrahlung array. The ionization balance, excitation physics, and the radiative cooling curves are computed from the Hebrew University Lawrence Livermore atomic code (HULLAC) and are benchmarked by these experiments. (Supported by U.S. DOE Grant No. DE-FG02-86ER53214 at JHU and Contract No. W-7405-ENG-48 at LLNL.) © 1999 American Institute of Physics.
Benchmark Shock Tube Experiments for Radiative Heating Relevant to Earth Re-Entry

NASA Technical Reports Server (NTRS)

Brandis, A. M.; Cruden, B. A.

2017-01-01

Detailed spectrally and spatially resolved radiance has been measured in the Electric Arc Shock Tube (EAST) facility for conditions relevant to high speed entry into a variety of atmospheres, including Earth, Venus, Titan, Mars and the Outer Planets. The tests that measured radiation relevant for Earth re-entry are the focus of this work and are taken from campaigns 47, 50, 52 and 57. These tests covered conditions from 8 km/s to 15.5 km/s at initial pressures ranging from 0.05 Torr to 1 Torr, of which shots at 0.1 and 0.2 Torr are analyzed in this paper. These conditions cover a range of points of interest for potential flight missions, including return from Low Earth Orbit, the Moon and Mars. The large volume of testing available from EAST is useful for statistical analysis of radiation data, but is problematic for identifying representative experiments for performing detailed analysis. Therefore, the intent of this paper is to select a subset of benchmark test data that can be considered for further detailed study. These benchmark shots are intended to provide more accessible data sets for future code validation studies and facility-to-facility comparisons. The shots that have been selected as benchmark data are the ones in closest agreement to a line of best fit through all of the EAST results, whilst also showing the best experimental characteristics, such as test time and convergence to equilibrium. The EAST data are presented in different formats for analysis. These data include the spectral radiance at equilibrium, the spatial dependence of radiance over defined wavelength ranges and the mean non-equilibrium spectral radiance (so-called 'spectral non-equilibrium metric'). All the information needed to simulate each experimental trace, including free-stream conditions, shock time of arrival (i.e. x-t) relation, and the spectral and spatial resolution functions, is provided.
The MCNP6 Analytic Criticality Benchmark Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B.

2016-06-16

Analytical benchmarks provide an invaluable tool for verifying computer codes used to simulate neutron transport. Several collections of analytical benchmark problems [1-4] are used routinely in the verification of production Monte Carlo codes such as MCNP® [5,6]. Verification of a computer code is a necessary prerequisite to the more complex validation process. The verification process confirms that a code performs its intended functions correctly. The validation process involves determining the absolute accuracy of code results vs. nature. In typical validations, results are computed for a set of benchmark experiments using a particular methodology (code, cross-section data with uncertainties, and modeling) and compared to the measured results from the set of benchmark experiments. The validation process determines bias, bias uncertainty, and possibly additional margins. Verification is generally performed by the code developers, while validation is generally performed by code users for a particular application space. The VERIFICATION_KEFF suite of criticality problems [1,2] was originally a set of 75 criticality problems found in the literature for which exact analytical solutions are available. Even though the spatial and energy detail is necessarily limited in analytical benchmarks, typically to a few regions or energy groups, the exact solutions obtained can be used to verify that the basic algorithms, mathematics, and methods used in complex production codes perform correctly. The present work has focused on revisiting this benchmark suite. A thorough review of the problems resulted in discarding some of them as not suitable for MCNP benchmarking. For the remaining problems, many of them were reformulated to permit execution in either multigroup mode or in the normal continuous-energy mode for MCNP. Execution of the benchmarks in continuous-energy mode provides a significant advance to MCNP verification methods.
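The abstract states that validation determines a bias and a bias uncertainty from calculated and measured benchmark results, without giving formulas. As a deliberately simplified sketch, the bias can be taken as the unweighted mean of calculated minus benchmark keff and its uncertainty as the standard error of that mean. Both choices are assumptions made for illustration; production validation methods typically weight cases by their uncertainties and add further margins.

```python
import math
import statistics


def keff_bias(calculated, benchmark):
    """Unweighted bias and standard error of the mean of (calculated - benchmark).

    This simple estimator is an illustrative assumption, not the procedure
    of any particular validation methodology.
    """
    differences = [c - b for c, b in zip(calculated, benchmark)]
    bias = statistics.mean(differences)
    bias_uncertainty = statistics.stdev(differences) / math.sqrt(len(differences))
    return bias, bias_uncertainty


# Made-up keff values for three benchmark cases.
calculated = [1.002, 0.998, 1.001]
benchmark = [1.000, 1.000, 1.000]
bias, bias_uncertainty = keff_bias(calculated, benchmark)
```

For an analytical benchmark used in verification, the same comparison applies with the exact solution in place of a measurement, and any difference beyond statistical noise points to a coding or method error rather than to nuclear data.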
RETRAN03 benchmarks for Beaver Valley plant transients and FSAR analyses

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beaumont, E.T.; Feltus, M.A.

1993-01-01

Any best-estimate code (e.g., RETRAN03) results must be validated against plant data and final safety analysis report (FSAR) predictions. Two independent means of benchmarking are necessary to ensure that the results are not biased toward a particular data set and have a certain degree of accuracy. The code results need to be compared with previous results and show improvements over previous code results. Ideally, the two best means of benchmarking a thermal hydraulics code are comparing results from previous versions of the same code along with actual plant data. This paper describes RETRAN03 benchmarks against RETRAN02 results, actual plant data, and FSAR predictions. RETRAN03, the Electric Power Research Institute's latest version of the RETRAN thermal-hydraulic analysis codes, offers several upgrades over its predecessor, RETRAN02 Mod5. RETRAN03 can use either implicit or semi-implicit numerics, whereas RETRAN02 Mod5 uses only semi-implicit numerics. Another major upgrade deals with slip model options. RETRAN03 added several new models, including a five-equation model for more accurate modeling of two-phase flow. RETRAN02 Mod5 should give similar but slightly more conservative results than RETRAN03 when executed with RETRAN02 Mod5 options.

Benchmarking initiatives in the water industry.

PubMed

Parena, R; Smeets, E

2001-01-01

Customer satisfaction and service care are every day pushing professionals in the water industry to seek to improve their performance, lowering costs and increasing the provided service level. Process Benchmarking is generally recognised as a systematic mechanism of comparing one's own utility with other utilities or businesses with the intent of self-improvement by adopting structures or methods used elsewhere. The IWA Task Force on Benchmarking, operating inside the Statistics and Economics Committee, has been committed to developing a generally accepted concept of Process Benchmarking to support water decision-makers in addressing issues of efficiency. In a first step the Task Force disseminated among the Committee members a questionnaire focused on providing suggestions about the kind, the degree of evolution and the main concepts of Benchmarking adopted in the represented countries. A comparison among the guidelines adopted in The Netherlands and Scandinavia has recently challenged the Task Force in drafting a methodology for worldwide process benchmarking in the water industry. The paper provides a framework of the most interesting benchmarking experiences in the water sector and describes in detail both the final results of the survey and the methodology focused on identification of possible improvement areas.
How to achieve and prove performance improvement - 15 years of experience in German wastewater benchmarking.

PubMed

Bertzbach, F; Franz, T; Möller, K

2012-01-01

This paper shows the results of performance improvement, which have been achieved in benchmarking projects in the wastewater industry in Germany over the last 15 years. A huge number of changes in operational practice and also in achieved annual savings can be shown, induced in particular by benchmarking at process level. Investigation of this question produces some general findings for the inclusion of performance improvement in a benchmarking project and for the communication of its results. Thus, we elaborate on the concept of benchmarking at both utility and process level, which is still a necessary distinction for the integration of performance improvement into our benchmarking approach. To achieve performance improvement via benchmarking it should be made quite clear that this outcome depends, on one hand, on a well conducted benchmarking programme and, on the other, on the individual situation within each participating utility.
Validating the performance of correlated fission multiplicity implementation in radiation transport codes with subcritical neutron multiplication benchmark experiments

DOE PAGES

Arthur, Jennifer; Bahran, Rian; Hutchinson, Jesson; ...

2018-06-14

Historically, radiation transport codes have uncorrelated fission emissions. In reality, the particles emitted by both spontaneous and induced fissions are correlated in time, energy, angle, and multiplicity. This work validates the performance of various current Monte Carlo codes that take into account the underlying correlated physics of fission neutrons, specifically neutron multiplicity distributions. The performance of 4 Monte Carlo codes - MCNP®6.2, MCNP®6.2/FREYA, MCNP®6.2/CGMF, and PoliMi - was assessed using neutron multiplicity benchmark experiments. In addition, MCNP®6.2 simulations were run using JEFF-3.2 and JENDL-4.0, rather than ENDF/B-VII.1, data for 239Pu and 240Pu. The sensitive benchmark parameters that in this work represent the performance of each correlated fission multiplicity Monte Carlo code include the singles rate, the doubles rate, leakage multiplication, and Feynman histograms. Although it is difficult to determine which radiation transport code shows the best overall performance in simulating subcritical neutron multiplication inference benchmark measurements, it is clear that correlations exist between the underlying nuclear data utilized by (or generated by) the various codes, and the correlated neutron observables of interest. This could prove useful in nuclear data validation and evaluation applications, in which a particular moment of the neutron multiplicity distribution is of more interest than the other moments. It is also quite clear that, because transport is handled by MCNP®6.2 in 3 of the 4 codes, with the 4th code (PoliMi) being based on an older version of MCNP®, the differences in correlated neutron observables of interest are most likely due to the treatment of fission event generation in each of the different codes, as opposed to the radiation transport.
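A Feynman histogram records how many time gates of fixed width contained 0, 1, 2, ... detected neutrons. The sketch below computes the singles rate and the Feynman-Y statistic (variance-to-mean ratio minus one, which is zero for an uncorrelated Poisson source) from such a histogram. The histogram layout, function name, and numbers are illustrative assumptions; the doubles rate and leakage multiplication used in the paper require further detector and fission parameters and are not shown.

```python
def feynman_statistics(histogram, gate_width):
    """Singles rate and Feynman-Y from a Feynman histogram.

    histogram: dict mapping counts-per-gate n to the number of gates with n counts.
    gate_width: gate width in seconds.
    """
    n_gates = sum(histogram.values())
    mean = sum(n * freq for n, freq in histogram.items()) / n_gates
    second_moment = sum(n * n * freq for n, freq in histogram.items()) / n_gates
    variance = second_moment - mean ** 2
    singles_rate = mean / gate_width          # counts per second
    feynman_y = variance / mean - 1.0         # excess over Poisson variance
    return singles_rate, feynman_y


# Made-up histogram: 4 gates of 1 ms with 0, 1, 1 and 2 counts.
histogram = {0: 1, 1: 2, 2: 1}
singles_rate, feynman_y = feynman_statistics(histogram, gate_width=1e-3)
```

A positive Feynman-Y indicates an excess of correlated counts, as expected from fission chains, so differences in this statistic between codes reflect how each code generates correlated fission events.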
Validating the performance of correlated fission multiplicity implementation in radiation transport codes with subcritical neutron multiplication benchmark experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arthur, Jennifer; Bahran, Rian; Hutchinson, Jesson

Historically, radiation transport codes have uncorrelated fission emissions. In reality, the particles emitted by both spontaneous and induced fissions are correlated in time, energy, angle, and multiplicity. This work validates the performance of various current Monte Carlo codes that take into account the underlying correlated physics of fission neutrons, specifically neutron multiplicity distributions. The performance of 4 Monte Carlo codes - MCNP®6.2, MCNP®6.2/FREYA, MCNP®6.2/CGMF, and PoliMi - was assessed using neutron multiplicity benchmark experiments. In addition, MCNP®6.2 simulations were run using JEFF-3.2 and JENDL-4.0, rather than ENDF/B-VII.1, data for 239Pu and 240Pu. The sensitive benchmark parameters that in this work represent the performance of each correlated fission multiplicity Monte Carlo code include the singles rate, the doubles rate, leakage multiplication, and Feynman histograms. Although it is difficult to determine which radiation transport code shows the best overall performance in simulating subcritical neutron multiplication inference benchmark measurements, it is clear that correlations exist between the underlying nuclear data utilized by (or generated by) the various codes, and the correlated neutron observables of interest. This could prove useful in nuclear data validation and evaluation applications, in which a particular moment of the neutron multiplicity distribution is of more interest than the other moments. It is also quite clear that, because transport is handled by MCNP®6.2 in 3 of the 4 codes, with the 4th code (PoliMi) being based on an older version of MCNP®, the differences in correlated neutron observables of interest are most likely due to the treatment of fission event generation in each of the different codes, as opposed to the radiation transport.
XWeB: The XML Warehouse Benchmark

NASA Astrophysics Data System (ADS)

Mahboubi, Hadj; Darmont, Jérôme

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.
Benchmark datasets for 3D MALDI- and DESI-imaging mass spectrometry.

PubMed

Oetjen, Janina; Veselkov, Kirill; Watrous, Jeramie; McKenzie, James S; Becker, Michael; Hauberg-Lotte, Lena; Kobarg, Jan Hendrik; Strittmatter, Nicole; Mróz, Anna K; Hoffmann, Franziska; Trede, Dennis; Palmer, Andrew; Schiffler, Stefan; Steinhorst, Klaus; Aichler, Michaela; Goldin, Robert; Guntinas-Lichius, Orlando; von Eggeling, Ferdinand; Thiele, Herbert; Maedler, Kathrin; Walch, Axel; Maass, Peter; Dorrestein, Pieter C; Takats, Zoltan; Alexandrov, Theodore

2015-01-01

Three-dimensional (3D) imaging mass spectrometry (MS) is an analytical chemistry technique for the 3D molecular analysis of a tissue specimen, entire organ, or microbial colonies on an agar plate. 3D-imaging MS has unique advantages over existing 3D imaging techniques, offers novel perspectives for understanding the spatial organization of biological processes, and has growing potential to be introduced into routine use in both biology and medicine. Owing to the sheer quantity of data generated, the visualization, analysis, and interpretation of 3D imaging MS data remain a significant challenge. Bioinformatics research in this field is hampered by the lack of publicly available benchmark datasets needed to evaluate and compare algorithms. High-quality 3D imaging MS datasets from different biological systems at several labs were acquired, supplied with overview images and scripts demonstrating how to read them, and deposited into MetaboLights, an open repository for metabolomics data. 3D imaging MS data were collected from five samples using two types of 3D imaging MS. 3D matrix-assisted laser desorption/ionization imaging (MALDI) MS data were collected from murine pancreas, murine kidney, human oral squamous cell carcinoma, and interacting microbial colonies cultured in Petri dishes. 3D desorption electrospray ionization (DESI) imaging MS data were collected from a human colorectal adenocarcinoma. With the aim to stimulate computational research in the field of computational 3D imaging MS, selected high-quality 3D imaging MS datasets are provided that could be used by algorithm developers as benchmark datasets.
Performance analysis of fusion nuclear-data benchmark experiments for light to heavy materials in MeV energy region with a neutron spectrum shifter

NASA Astrophysics Data System (ADS)

Murata, Isao; Ohta, Masayuki; Miyamaru, Hiroyuki; Kondo, Keitaro; Yoshida, Shigeo; Iida, Toshiyuki; Ochiai, Kentaro; Konno, Chikara

2011-10-01

Nuclear data are indispensable for development of fusion reactor candidate materials. However, benchmarking of the nuclear data in the MeV energy region is not yet adequate. In the present study, benchmark performance in the MeV energy region was investigated theoretically for experiments using a 14 MeV neutron source. We carried out a systematic analysis for light to heavy materials. As a result, the benchmark performance for the neutron spectrum was confirmed to be acceptable, while for gamma-rays it was not sufficiently accurate. Consequently, a spectrum shifter has to be applied. Beryllium had the best performance as a shifter. Moreover, a preliminary examination was made of whether it is really acceptable that only the spectrum before the last collision is considered in the benchmark performance analysis. It was pointed out that not only the last collision but also earlier collisions should be considered equally in the benchmark performance analysis.
Benchmark Evaluation of the HTR-PROTEUS Absorber Rod Worths (Core 4)

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; Leland M. Montierth

2014-06-01

PROTEUS was a zero-power research reactor at the Paul Scherrer Institute (PSI) in Switzerland. The critical assembly was constructed from a large graphite annulus surrounding a central cylindrical cavity. Various experimental programs were investigated in PROTEUS; during the years 1992 through 1996, it was configured as a pebble-bed reactor and designated HTR-PROTEUS. Various critical configurations were assembled with each accompanied by an assortment of reactor physics experiments including differential and integral absorber rod measurements, kinetics, reaction rate distributions, water ingress effects, and small sample reactivity effects [1]. Four benchmark reports were previously prepared and included in the March 2013 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook) [2] evaluating eleven critical configurations. A summary of that effort was previously provided [3] and an analysis of absorber rod worth measurements for Cores 9 and 10 has been performed prior to this analysis and included in PROTEUS-GCR-EXP-004 [4]. In the current benchmark effort, absorber rod worths measured for Core Configuration 4, which was the only core with a randomly-packed pebble loading, have been evaluated for inclusion as a revision to the HTR-PROTEUS benchmark report PROTEUS-GCR-EXP-002.
Surgeon's experiences of receiving peer benchmarked feedback using patient-reported outcome measures: a qualitative study.

PubMed

Boyce, Maria B; Browne, John P; Greenhalgh, Joanne

2014-06-27

The use of patient-reported outcome measures (PROMs) to provide healthcare professionals with peer benchmarked feedback is growing. However, there is little evidence on the opinions of professionals on the value of this information in practice. The purpose of this research is to explore surgeon's experiences of receiving peer benchmarked PROMs feedback and to examine whether this information led to changes in their practice. This qualitative research employed a Framework approach. Semi-structured interviews were undertaken with surgeons who received peer benchmarked PROMs feedback. The participants included eleven consultant orthopaedic surgeons in the Republic of Ireland. Five themes were identified: conceptual, methodological, practical, attitudinal, and impact. A typology was developed based on the attitudinal and impact themes from which three distinct groups emerged. 'Advocates' had positive attitudes towards PROMs and confirmed that the information promoted a self-reflective process. 'Converts' were uncertain about the value of PROMs, which reduced their inclination to use the data. 'Sceptics' had negative attitudes towards PROMs and claimed that the information had no impact on their behaviour. The conceptual, methodological and practical factors were linked to the typology. Surgeons had mixed opinions on the value of peer benchmarked PROMs data. Many appreciated the feedback as it reassured them that their practice was similar to their peers. However, PROMs information alone was considered insufficient to help identify opportunities for quality improvements. The reasons for the observed reluctance of participants to embrace PROMs can be categorised into conceptual, methodological, and practical factors. Policy makers and researchers need to increase professionals' awareness of the numerous purposes and benefits of using PROMs, challenge the current methods to measure performance using PROMs, and reduce the burden of data collection and information
The Schultz MIDI Benchmarking Toolbox for MIDI interfaces, percussion pads, and sound cards.

PubMed

Schultz, Benjamin G

2018-04-17

The Musical Instrument Digital Interface (MIDI) was readily adopted for auditory sensorimotor synchronization experiments. These experiments typically use MIDI percussion pads to collect responses, a MIDI-USB converter (or MIDI-PCI interface) to record responses on a PC and manipulate feedback, and an external MIDI sound module to generate auditory feedback. Previous studies have suggested that auditory feedback latencies can be introduced by these devices. The Schultz MIDI Benchmarking Toolbox (SMIDIBT) is an open-source, Arduino-based package designed to measure the point-to-point latencies incurred by several devices used in the generation of response-triggered auditory feedback. Experiment 1 showed that MIDI messages are sent and received within 1 ms (on average) in the absence of any external MIDI device. Latencies decreased when the baud rate increased above the MIDI protocol default (31,250 bps). Experiment 2 benchmarked the latencies introduced by different MIDI-USB and MIDI-PCI interfaces. MIDI-PCI was superior to MIDI-USB, primarily because MIDI-USB is subject to USB polling. Experiment 3 tested three MIDI percussion pads. Both the audio and MIDI message latencies were significantly greater than 1 ms for all devices, and there were significant differences between percussion pads and instrument patches. Experiment 4 benchmarked four MIDI sound modules. Audio latencies were significantly greater than 1 ms, and there were significant differences between sound modules and instrument patches. These experiments suggest that millisecond accuracy might not be achievable with MIDI devices. The SMIDIBT can be used to benchmark a range of MIDI devices, thus allowing researchers to make informed decisions when choosing testing materials and to arrive at an acceptable latency at their discretion.
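The link between baud rate and latency mentioned in this abstract can be illustrated with a short calculation. The sketch below is not part of the SMIDIBT; it assumes the standard MIDI serial framing of 10 bits per byte (1 start bit, 8 data bits, 1 stop bit) and estimates only the wire time of a message, ignoring device processing and USB polling. The function name and the 115,200 bps comparison rate are illustrative assumptions, not values from the study.

```python
# Assumption: standard MIDI serial framing of 1 start bit, 8 data bits, 1 stop bit.
BITS_PER_BYTE = 10
MIDI_DEFAULT_BAUD = 31_250  # MIDI protocol default cited in the abstract


def transmission_time_ms(num_bytes: int, baud: int = MIDI_DEFAULT_BAUD) -> float:
    """Return the time in milliseconds needed to clock num_bytes over a serial link."""
    if num_bytes < 0 or baud <= 0:
        raise ValueError("num_bytes must be non-negative and baud must be positive")
    return num_bytes * BITS_PER_BYTE / baud * 1000.0


if __name__ == "__main__":
    # A Note On message is three bytes: status, note number, velocity.
    # 115,200 bps is a hypothetical higher rate chosen only for comparison.
    for baud in (MIDI_DEFAULT_BAUD, 115_200):
        print(f"{baud} bps: {transmission_time_ms(3, baud):.3f} ms")
```

At the default rate a three-byte message occupies the wire for about 0.96 ms, which is consistent with the sub-millisecond average reported for Experiment 1 and shows why a higher baud rate lowers the floor on latency.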
Benchmarking: applications to transfusion medicine.

PubMed

Apelseth, Torunn Oveland; Molnar, Laura; Arnold, Emmy; Heddle, Nancy M

2012-10-01

Benchmarking is a structured continuous collaborative process in which comparisons for selected indicators are used to identify factors that, when implemented, will improve transfusion practices. This study aimed to identify transfusion medicine studies reporting on benchmarking, summarize the benchmarking approaches used, and identify important considerations to move the concept of benchmarking forward in the field of transfusion medicine. A systematic review of published literature was performed to identify transfusion medicine-related studies that compared at least 2 separate institutions or regions with the intention of benchmarking focusing on 4 areas: blood utilization, safety, operational aspects, and blood donation. Forty-five studies were included: blood utilization (n = 35), safety (n = 5), operational aspects of transfusion medicine (n = 5), and blood donation (n = 0). Based on predefined criteria, 7 publications were classified as benchmarking, 2 as trending, and 36 as single-event studies. Three models of benchmarking are described: (1) a regional benchmarking program that collects and links relevant data from existing electronic sources, (2) a sentinel site model where data from a limited number of sites are collected, and (3) an institutional-initiated model where a site identifies indicators of interest and approaches other institutions. Benchmarking approaches are needed in the field of transfusion medicine. Major challenges include defining best practices and developing cost-effective methods of data collection. For those interested in initiating a benchmarking program, the sentinel site model may be most effective and sustainable as a starting point, although the regional model would be the ideal goal. Copyright © 2012 Elsevier Inc. All rights reserved.
ICSBEP Benchmarks For Nuclear Data Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Briggs, J. Blair

2005-05-24

The International Criticality Safety Benchmark Evaluation Project (ICSBEP) was initiated in 1992 by the United States Department of Energy. The ICSBEP became an official activity of the Organization for Economic Cooperation and Development (OECD) Nuclear Energy Agency (NEA) in 1995. Representatives from the United States, United Kingdom, France, Japan, the Russian Federation, Hungary, Republic of Korea, Slovenia, Serbia and Montenegro (formerly Yugoslavia), Kazakhstan, Spain, Israel, Brazil, Poland, and the Czech Republic are now participating. South Africa, India, China, and Germany are considering participation. The purpose of the ICSBEP is to identify, evaluate, verify, and formally document a comprehensive and internationally peer-reviewed set of criticality safety benchmark data. The work of the ICSBEP is published as an OECD handbook entitled "International Handbook of Evaluated Criticality Safety Benchmark Experiments." The 2004 Edition of the Handbook contains benchmark specifications for 3331 critical or subcritical configurations that are intended for use in validation efforts and for testing basic nuclear data. New to the 2004 Edition of the Handbook is a draft criticality alarm / shielding type benchmark that should be finalized in 2005 along with two other similar benchmarks. The Handbook is being used extensively for nuclear data testing and is expected to be a valuable resource for code and data validation and improvement efforts for decades to come. Specific benchmarks that are useful for testing structural materials such as iron, chromium, nickel, and manganese; beryllium; lead; thorium; and 238U are highlighted.
The ISPRS Benchmark on Indoor Modelling

NASA Astrophysics Data System (ADS)

Khoshelham, K.; Díaz Vilariño, L.; Peter, M.; Kang, Z.; Acharya, D.

2017-09-01

Automated generation of 3D indoor models from point cloud data has been a topic of intensive research in recent years. While results on various datasets have been reported in literature, a comparison of the performance of different methods has not been possible due to the lack of benchmark datasets and a common evaluation framework. The ISPRS benchmark on indoor modelling aims to address this issue by providing a public benchmark dataset and an evaluation framework for performance comparison of indoor modelling methods. In this paper, we present the benchmark dataset comprising several point clouds of indoor environments captured by different sensors. We also discuss the evaluation and comparison of indoor modelling methods based on manually created reference models and appropriate quality evaluation criteria. The benchmark dataset is available for download at: http://www2.isprs.org/commissions/comm4/wg5/benchmark-on-indoor-modelling.html.
Quality in E-Learning--A Conceptual Framework Based on Experiences from Three International Benchmarking Projects

ERIC Educational Resources Information Center

Ossiannilsson, E.; Landgren, L.

2012-01-01

Between 2008 and 2010, Lund University took part in three international benchmarking projects, "E-xcellence+," the "eLearning Benchmarking Exercise 2009," and the "First Dual-Mode Distance Learning Benchmarking Club." A comparison of these models revealed a rather high level of correspondence. From this finding and…
JENDL-4.0/HE Benchmark Test with Concrete and Iron Shielding Experiments at JAEA/TIARA

NASA Astrophysics Data System (ADS)

Konno, Chikara; Matsuda, Norihiro; Kwon, Saerom; Ohta, Masayuki; Sato, Satoshi

2017-09-01

As a benchmark test of JENDL-4.0/HE released in 2015, we have analyzed the concrete and iron shielding experiments with the quasi mono-energetic 40 and 65 MeV neutron sources at TIARA in JAEA by using MCNP5 and ACE files processed from JENDL-4.0/HE with NJOY2012. The calculation results with JENDL-4.0/HE agreed well with the measured ones in the concrete experiment, while in the iron experiment with 65 MeV neutrons they underestimated the measured ones, increasingly so for the thicker assemblies. We examined the 56Fe data of JENDL-4.0/HE in detail and attributed the underestimation in the iron experiment with 65 MeV neutrons to the larger non-elastic scattering cross sections of 56Fe.
Benchmarks--Standards Comparisons. Math Competencies: EFF Benchmarks Comparison [and] Reading Competencies: EFF Benchmarks Comparison [and] Writing Competencies: EFF Benchmarks Comparison.

ERIC Educational Resources Information Center

Kent State Univ., OH. Ohio Literacy Resource Center.

This document is intended to show the relationship between Ohio's Standards and Competencies, Equipped for the Future's (EFF's) Standards and Components of Performance, and Ohio's Revised Benchmarks. The document is divided into three parts, with Part 1 covering mathematics instruction, Part 2 covering reading instruction, and Part 3 covering…
Benchmarking of HEU Metal Annuli Critical Assemblies with Internally Reflected Graphite Cylinder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiaobo, Liu; Bess, John D.; Marshall, Margaret A.

Three experimental configurations of critical assemblies, performed in 1963 at the Oak Ridge Critical Experiment Facility and assembled from HEU metal annuli of three different diameters (15-9 inches, 15-7 inches and 13-7 inches) with an internal graphite cylinder as reflector, are evaluated and benchmarked. The experimental uncertainties (0.00055 for each of the three configurations) and the biases to the detailed benchmark models (-0.00179, -0.00189 and -0.00114, respectively) were determined, and the experimental benchmark keff results were obtained for both the detailed and simplified models. The calculation results for both detailed and simplified models using MCNP6-1.0 and ENDF/B-VII.1 agree well with the benchmark experimental results, with a difference of less than 0.2%. These are acceptable benchmark experiments for inclusion in the ICSBEP Handbook.
Fusion neutron source blanket: requirements for calculation accuracy and benchmark experiment precision

NASA Astrophysics Data System (ADS)

Zhirkin, A. V.; Alekseev, P. N.; Batyaev, V. F.; Gurevich, M. I.; Dudnikov, A. A.; Kuteev, B. V.; Pavlov, K. V.; Titarenko, Yu. E.; Titarenko, A. Yu.

2017-06-01

In this report the calculation accuracy requirements for the main parameters of the fusion neutron source and of thermonuclear blankets with a DT fusion power of more than 10 MW are formulated. To conduct the benchmark experiments, technical documentation and calculation models were developed for two blanket micro-models: the molten salt and the heavy water solid-state blankets. The neutron spectra and 37 dosimetric reaction rates that are widely used for the registration of thermal, resonance and threshold (0.25-13.45 MeV) neutrons were calculated for each blanket micro-model. The MCNP code and the neutron data library ENDF/B-VII were used for the calculations. All the calculations were performed for two kinds of neutron source: source I is the fusion source, and source II is the source of neutrons generated by a 7Li target irradiated by protons with an energy of 24.6 MeV. Spectral index ratios were calculated to describe the spectrum variations between the different neutron sources. The obtained results demonstrate the advantage of using the fusion neutron source in future experiments.
A new numerical benchmark of a freshwater lens

NASA Astrophysics Data System (ADS)

Stoeckl, L.; Walther, M.; Graf, T.

2016-04-01

A numerical benchmark for 2-D variable-density flow and solute transport in a freshwater lens is presented. The benchmark is based on results of laboratory experiments conducted by Stoeckl and Houben (2012) using a sand tank on the meter scale. This benchmark describes the formation and degradation of a freshwater lens over time as it can be found under real-world islands. An error analysis gave the appropriate spatial and temporal discretization of 1 mm and 8.64 s, respectively. The calibrated parameter set was obtained using the parameter estimation tool PEST. Comparing density-coupled and density-uncoupled results showed that the freshwater-saltwater interface position is strongly dependent on density differences. A benchmark that adequately represents saltwater intrusion and that includes realistic features of coastal aquifers or freshwater lenses was lacking. This new benchmark was thus developed and is demonstrated to be suitable to test variable-density groundwater models applied to saltwater intrusion investigations.
Benchmarking the MCNP Monte Carlo code with a photon skyshine experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Olsher, R.H.; Hsu, Hsiao Hua; Harvey, W.F.

1993-07-01

The MCNP Monte Carlo transport code is used by the Los Alamos National Laboratory Health and Safety Division for a broad spectrum of radiation shielding calculations. One such application involves the determination of skyshine dose for a variety of photon sources. To verify the accuracy of the code, it was benchmarked with the Kansas State Univ. (KSU) photon skyshine experiment of 1977. The KSU experiment for the unshielded source geometry was simulated in great detail to include the contribution of groundshine, in-silo photon scatter, and the effect of spectral degradation in the source capsule. The standard deviation of the KSU experimental data was stated to be 7%, while the statistical uncertainty of the simulation was kept at or under 1%. The results of the simulation agreed closely with the experimental data, generally to within 6%. At distances of under 100 m from the silo, the modeling of the in-silo scatter was crucial to achieving close agreement with the experiment. Specifically, scatter off the top layer of the source cask accounted for approximately 12% of the dose at 50 m. At distances >300 m, using the 60Co line spectrum led to a dose overresponse as great as 19% at 700 m. It was necessary to use the actual source spectrum, which includes a Compton tail from photon collisions in the source capsule, to achieve close agreement with experimental data. These results highlight the importance of using Monte Carlo transport techniques to account for the nonideal features of even simple experiments.

Benchmarking of calculation schemes in APOLLO2 and COBAYA3 for VVER lattices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zheleva, N.; Ivanov, P.; Todorova, G.

This paper presents solutions of the NURISP VVER lattice benchmark using APOLLO2, TRIPOLI4 and COBAYA3 pin-by-pin. The main objective is to validate MOC based calculation schemes for pin-by-pin cross-section generation with APOLLO2 against TRIPOLI4 reference results. A specific objective is to test the APOLLO2 generated cross-sections and interface discontinuity factors in COBAYA3 pin-by-pin calculations with unstructured mesh. The VVER-1000 core consists of large hexagonal assemblies with 2 mm inter-assembly water gaps which require the use of unstructured meshes in the pin-by-pin core simulators. The considered 2D benchmark problems include 19-pin clusters, fuel assemblies and 7-assembly clusters. APOLLO2 calculation schemes with the step characteristic method (MOC) and the higher-order Linear Surface MOC have been tested. The comparison of APOLLO2 vs. TRIPOLI4 results shows a very close agreement. The 3D lattice solver in COBAYA3 uses transport corrected multi-group diffusion approximation with interface discontinuity factors of Generalized Equivalence Theory (GET) or Black Box Homogenization (BBH) type. The COBAYA3 pin-by-pin results in 2, 4 and 8 energy groups are close to the reference solutions when using side-dependent interface discontinuity factors. (authors)
A comparison of five benchmarks

NASA Technical Reports Server (NTRS)

Huss, Janice E.; Pennline, James A.

1987-01-01

Five benchmark programs were obtained and run on the NASA Lewis CRAY X-MP/24. A comparison was made between the program codes and between the methods for calculating performance figures. Several multitasking jobs were run to gain experience in how parallel performance is measured.
Benchmarking Multilayer-HySEA model for landslide generated tsunami. NTHMP validation process.

NASA Astrophysics Data System (ADS)

Macias, J.; Escalante, C.; Castro, M. J.

2017-12-01

Landslide tsunami hazard may be dominant along significant parts of the coastline around the world, in particular in the USA, as compared to hazards from other tsunamigenic sources. This fact motivated NTHMP to pursue the benchmarking of models for landslide generated tsunamis, following the same methodology already used for standard tsunami models when the source is seismic. To perform this validation process, a set of candidate benchmarks was proposed. These benchmarks are based on a subset of available laboratory data sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) closes the list of proposed benchmarks, for a total of 7 benchmarks. The Multilayer-HySEA model including non-hydrostatic effects has been used to perform all the benchmarking problems dealing with laboratory experiments proposed in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017 by NTHMP. The aim of this presentation is to show some of the latest numerical results obtained with the Multilayer-HySEA (non-hydrostatic) model in the framework of this validation effort. Acknowledgements. This research has been partially supported by the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and University of Malaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
Overview of the 2014 Edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook)

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; Jim Gulliford

2014-10-01

The International Reactor Physics Experiment Evaluation Project (IRPhEP) is a widely recognized world class program. The work of the IRPhEP is documented in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook). Integral data from the IRPhEP Handbook are used by reactor safety and design, nuclear data, criticality safety, and analytical methods development specialists, worldwide, to perform necessary validations of their calculational techniques. The IRPhEP Handbook is among the most frequently quoted references in the nuclear industry and is expected to be a valuable resource for future decades.
Simulation of underwater explosion benchmark experiments with ALE3D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Couch, R.; Faux, D.

1997-05-19

Some code improvements have been made during the course of this study. One immediately obvious need was for more flexibility in the constitutive representation for materials in shell elements. To remedy this situation, a model with a tabular representation of stress versus strain and rate dependent effects was implemented. This was required in order to obtain reasonable results in the IED cylinder simulation. Another deficiency was in the ability to extract and plot variables associated with shell elements. The pipe whip analysis required the development of a scheme to tally and plot time dependent shell quantities such as stresses and strains. This capability had previously existed only for solid elements. Work was initiated to provide the same range of plotting capability for structural elements that exists with the DYNA3D/TAURUS tools. One of the characteristics of these problems is the disparity in zoning required in the vicinity of the charge and bubble compared to that needed in the far field. This disparity can cause the equipotential relaxation logic to provide a less than optimal solution. Various approaches were utilized to bias the relaxation to obtain more optimal meshing during relaxation. Extensions of these techniques have been developed to provide more powerful options, but more work still needs to be done. The results presented here are representative of what can be produced with an ALE code structured like ALE3D. They are not necessarily the best results that could have been obtained. More experience in assessing sensitivities to meshing and boundary conditions would be very useful. A number of code deficiencies discovered in the course of this work have been corrected and are available for any future investigations.
Benchmarking and validation activities within JEFF project

NASA Astrophysics Data System (ADS)

Cabellos, O.; Alvarez-Velarde, F.; Angelone, M.; Diez, C. J.; Dyrda, J.; Fiorito, L.; Fischer, U.; Fleming, M.; Haeck, W.; Hill, I.; Ichou, R.; Kim, D. H.; Klix, A.; Kodeli, I.; Leconte, P.; Michel-Sendis, F.; Nunnenmann, E.; Pecchia, M.; Peneliau, Y.; Plompen, A.; Rochman, D.; Romojaro, P.; Stankovskiy, A.; Sublet, J. Ch.; Tamagno, P.; Marck, S. van der

2017-09-01

The challenge for any nuclear data evaluation project is to periodically release a revised, fully consistent and complete library, with all needed data and covariances, and ensure that it is robust and reliable for a variety of applications. Within an evaluation effort, benchmarking activities play an important role in validating proposed libraries. The Joint Evaluated Fission and Fusion (JEFF) Project aims to provide such a nuclear data library, and thus, requires a coherent and efficient benchmarking process. The aim of this paper is to present the activities carried out by the new JEFF Benchmarking and Validation Working Group, and to describe the role of the NEA Data Bank in this context. The paper will also review the status of preliminary benchmarking for the next JEFF-3.3 candidate cross-section files.
Object-Oriented Implementation of the NAS Parallel Benchmarks using Charm++

NASA Technical Reports Server (NTRS)

Krishnan, Sanjeev; Bhandarkar, Milind; Kale, Laxmikant V.

1996-01-01

This report describes experiences with implementing the NAS Computational Fluid Dynamics benchmarks using a parallel object-oriented language, Charm++. Our main objective in implementing the NAS CFD kernel benchmarks was to develop a code that could be used to easily experiment with different domain decomposition strategies and dynamic load balancing. We also wished to leverage the object-orientation provided by the Charm++ parallel object-oriented language, to develop reusable abstractions that would simplify the process of developing parallel applications. We first describe the Charm++ parallel programming model and the parallel object array abstraction, then go into detail about each of the Scalar Pentadiagonal (SP) and Lower/Upper Triangular (LU) benchmarks, along with performance results. Finally we conclude with an evaluation of the methodology used.
Toxicological Benchmarks for Screening Potential Contaminants of Concern for Effects on Soil and Litter Invertebrates and Heterotrophic Process

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.

1994-01-01

This report presents a standard method for deriving benchmarks for the purpose of "contaminant screening," performed by comparing measured ambient concentrations of chemicals. The work was performed under Work Breakdown Structure 1.4.12.2.3.04.07.02 (Activity Data Sheet 8304). In addition, this report presents sets of data concerning the effects of chemicals in soil on invertebrates and soil microbial processes, benchmarks for chemicals potentially associated with United States Department of Energy sites, and literature describing the experiments from which data were drawn for benchmark derivation.
A quasi two-dimensional benchmark experiment for the solidification of a tin lead binary alloy

NASA Astrophysics Data System (ADS)

Wang, Xiao Dong; Petitpas, Patrick; Garnier, Christian; Paulin, Jean-Pierre; Fautrelle, Yves

2007-05-01

A horizontal solidification benchmark experiment with pure tin and a binary alloy of Sn-10 wt.%Pb is proposed. The experiment consists in solidifying a rectangular sample using two lateral heat exchangers which allow the application of a controlled horizontal temperature difference. An array of fifty thermocouples placed on the lateral wall permits the determination of the instantaneous temperature distribution. The cases with the temperature gradient G=0, and the cooling rates equal to 0.02 and 0.04 K/s are studied. The time evolution of the interfacial total heat flux and the temperature field are recorded and analyzed. This allows us to evaluate heat transfer evolution due to natural convection, as well as its influence on the solidification macrostructure. To cite this article: X.D. Wang et al., C. R. Mecanique 335 (2007).
Characterization of addressability by simultaneous randomized benchmarking.

PubMed

Gambetta, Jay M; Córcoles, A D; Merkel, S T; Johnson, B R; Smolin, John A; Chow, Jerry M; Ryan, Colm A; Rigetti, Chad; Poletto, S; Ohki, Thomas A; Ketchen, Mark B; Steffen, M

2012-12-14

The control and handling of errors arising from cross talk and unwanted interactions in multiqubit systems is an important issue in quantum information processing architectures. We introduce a benchmarking protocol that provides information about the amount of addressability present in the system and implement it on coupled superconducting qubits. The protocol consists of randomized benchmarking experiments run both individually and simultaneously on pairs of qubits. A relevant figure of merit for the addressability is then related to the differences in the measured average gate fidelities in the two experiments. We present results from two similar samples with differing cross talk and unwanted qubit-qubit interactions. The results agree with predictions based on simple models of the classical cross talk and Stark shifts.
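The comparison described in this abstract can be sketched numerically. The example below is a minimal illustration, not the authors' analysis: it assumes each randomized benchmarking run has already been fitted to the usual decay model A·p^m + B, converts the decay parameter p to an average error per Clifford with the standard relation r = (d − 1)(1 − p)/d, and takes the difference between the simultaneous and individual runs as one plausible addressability metric. The function names and the sample decay parameters are hypothetical.

```python
def error_per_clifford(p: float, num_qubits: int = 1) -> float:
    """Average error per Clifford from an RB decay parameter p (standard relation)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("decay parameter p must lie in [0, 1]")
    d = 2 ** num_qubits  # Hilbert-space dimension of the benchmarked subsystem
    return (d - 1) * (1.0 - p) / d


def addressability_shift(p_individual: float, p_simultaneous: float) -> float:
    """Increase in single-qubit error when the neighbouring qubit is driven as well.

    Assumption: this difference is used only as an illustrative figure of merit;
    the published protocol may define its metric differently.
    """
    return error_per_clifford(p_simultaneous) - error_per_clifford(p_individual)


if __name__ == "__main__":
    # Hypothetical fitted decay parameters for one qubit of a coupled pair.
    p_alone, p_together = 0.990, 0.980
    print(f"error alone:    {error_per_clifford(p_alone):.4f}")
    print(f"error together: {error_per_clifford(p_together):.4f}")
    print(f"shift:          {addressability_shift(p_alone, p_together):.4f}")
```

A shift near zero would suggest that driving the neighbouring qubit does not degrade control, while a larger positive shift would point to cross talk or unwanted coupling of the kind the abstract discusses.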
Benchmarking image fusion system design parameters

NASA Astrophysics Data System (ADS)

Howell, Christopher L.

2013-06-01

A clear and absolute method for discriminating between image fusion algorithm performances is presented. This method can effectively be used to assist in the design and modeling of image fusion systems. Specifically, it is postulated that quantifying human task performance using image fusion should be benchmarked to whether the fusion algorithm, at a minimum, retained the performance benefit achievable by each independent spectral band being fused. The established benchmark would then clearly represent the threshold that a fusion system should surpass to be considered beneficial to a particular task. A genetic algorithm is employed to characterize the fused system parameters using a Matlab® implementation of NVThermIP as the objective function. By setting the problem up as a mixed-integer constraint optimization problem, one can effectively look backwards through the image acquisition process: optimizing fused system parameters by minimizing the difference between modeled task difficulty measure and the benchmark task difficulty measure. The results of an identification perception experiment are presented, where human observers were asked to identify a standard set of military targets, and used to demonstrate the effectiveness of the benchmarking process.
Benchmark Evaluation of Start-Up and Zero-Power Measurements at the High-Temperature Engineering Test Reactor

DOE PAGES

Bess, John D.; Fujimoto, Nozomu

2014-10-09

Benchmark models were developed to evaluate six cold-critical and two warm-critical, zero-power measurements of the HTTR. Additional measurements of a fully-loaded subcritical configuration, core excess reactivity, shutdown margins, six isothermal temperature coefficients, and axial reaction-rate distributions were also evaluated as acceptable benchmark experiments. Insufficient information is publicly available to develop finely-detailed models of the HTTR as much of the design information is still proprietary. However, the uncertainties in the benchmark models are judged to be of sufficient magnitude to encompass any biases and bias uncertainties incurred through the simplification process used to develop the benchmark models. Dominant uncertainties in the experimental keff for all core configurations come from uncertainties in the impurity content of the various graphite blocks that comprise the HTTR. Monte Carlo calculations of keff are between approximately 0.9% and 2.7% greater than the benchmark values. Reevaluation of the HTTR models as additional information becomes available could improve the quality of this benchmark and possibly reduce the computational biases. High-quality characterization of graphite impurities would significantly improve the quality of the HTTR benchmark assessment. Simulations of the other reactor physics measurements are in good agreement with the benchmark experiment values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.
SEMANTIC3D.NET: a New Large-Scale Point Cloud Classification Benchmark

NASA Astrophysics Data System (ADS)

Hackel, T.; Savinov, N.; Ladicky, L.; Wegner, J. D.; Schindler, K.; Pollefeys, M.

2017-05-01

This paper presents a new 3D point cloud classification benchmark data set with over four billion manually labelled points, meant as input for data-hungry (deep) learning methods. We also discuss first submissions to the benchmark that use deep convolutional neural networks (CNNs) as a workhorse, which already show remarkable performance improvements over state-of-the-art. CNNs have become the de-facto standard for many tasks in computer vision and machine learning like semantic segmentation or object detection in images, but have not yet led to a true breakthrough for 3D point cloud labelling tasks due to lack of training data. With the massive data set presented in this paper, we aim at closing this data gap to help unleash the full potential of deep learning methods for 3D labelling tasks. Our semantic3D.net data set consists of dense point clouds acquired with static terrestrial laser scanners. It contains 8 semantic classes and covers a wide range of urban outdoor scenes: churches, streets, railroad tracks, squares, villages, soccer fields and castles. We describe our labelling interface and show that our data set provides more dense and complete point clouds with a much higher overall number of labelled points compared to those already available to the research community. We further provide baseline method descriptions and comparison between methods submitted to our online system. We hope semantic3D.net will pave the way for deep learning methods in 3D point cloud labelling to learn richer, more general 3D representations, and first submissions after only a few months indicate that this might indeed be the case.
International Land Model Benchmarking (ILAMB) Package v002.00

DOE Data Explorer

Collier, Nathaniel [Oak Ridge National Laboratory]; Hoffman, Forrest M. [Oak Ridge National Laboratory]; Mu, Mingquan [University of California, Irvine]; Randerson, James T. [University of California, Irvine]; Riley, William J. [Lawrence Berkeley National Laboratory]

2016-05-09

As a contribution to the International Land Model Benchmarking (ILAMB) Project, we are providing new analysis approaches, benchmarking tools, and science leadership. The goal of ILAMB is to assess and improve the performance of land models through international cooperation and to inform the design of new measurement campaigns and field studies to reduce uncertainties associated with key biogeochemical processes and feedbacks. ILAMB is expected to be a primary analysis tool for CMIP6 and future model-data intercomparison experiments. This team has developed initial prototype benchmarking systems for ILAMB, which will be improved and extended to include ocean model metrics and diagnostics.
International Land Model Benchmarking (ILAMB) Package v001.00

DOE Data Explorer

Mu, Mingquan [University of California, Irvine]; Randerson, James T. [University of California, Irvine]; Riley, William J. [Lawrence Berkeley National Laboratory]; Hoffman, Forrest M. [Oak Ridge National Laboratory]

2016-05-02

As a contribution to the International Land Model Benchmarking (ILAMB) Project, we are providing new analysis approaches, benchmarking tools, and science leadership. The goal of ILAMB is to assess and improve the performance of land models through international cooperation and to inform the design of new measurement campaigns and field studies to reduce uncertainties associated with key biogeochemical processes and feedbacks. ILAMB is expected to be a primary analysis tool for CMIP6 and future model-data intercomparison experiments. This team has developed initial prototype benchmarking systems for ILAMB, which will be improved and extended to include ocean model metrics and diagnostics.
The art and science of using routine outcome measurement in mental health benchmarking.

PubMed

McKay, Roderick; Coombs, Tim; Duerden, David

2014-02-01

To report and critique the application of routine outcome measurement data when benchmarking Australian mental health services. The experience of the authors as participants and facilitators of benchmarking activities is augmented by a review of the literature regarding mental health benchmarking in Australia. Although the published literature is limited, in practice, routine outcome measures, in particular the Health of the Nation Outcome Scales (HoNOS) family of measures, are used in a variety of benchmarking activities. Their use in exploring similarities and differences in consumers between services and in the outcomes of care is illustrated. This requires the rigour of science in data management and interpretation, supplemented by the art that comes from clinical experience, a desire to reflect on clinical practice and the flexibility to use incomplete data to explore clinical practice. Routine outcome measurement data can be used in a variety of ways to support mental health benchmarking. With the increasing sophistication of information development in mental health, the opportunity to become involved in benchmarking will continue to increase. The techniques used during benchmarking and the insights gathered may prove useful to support reflection on practice by psychiatrists and other senior mental health clinicians.
Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Ye; Ma, Xiaosong; Liu, Qing Gary

2015-01-01

Parallel application benchmarks are indispensable for evaluating/optimizing HPC software and hardware. However, it is very challenging and costly to obtain high-fidelity benchmarks reflecting the scale and complexity of state-of-the-art parallel applications. Hand-extracted synthetic benchmarks are time- and labor-intensive to create. Real applications themselves, while offering the most accurate performance evaluation, are expensive to compile, port, reconfigure, and often plainly inaccessible due to security or ownership concerns. This work contributes APPRIME, a novel tool for trace-based automatic parallel benchmark generation. Taking as input standard communication-I/O traces of an application's execution, it couples accurate automatic phase identification with statistical regeneration of event parameters to create compact, portable, and to some degree reconfigurable parallel application benchmarks. Experiments with four NAS Parallel Benchmarks (NPB) and three real scientific simulation codes confirm the fidelity of APPRIME benchmarks. They retain the original applications' performance characteristics, in particular the relative performance across platforms.
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

1993-01-01

A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
Benchmarks: The Development of a New Approach to Student Evaluation.

ERIC Educational Resources Information Center

Larter, Sylvia

The Toronto Board of Education Benchmarks are libraries of reference materials that demonstrate student achievement at various levels. Each library contains video benchmarks, print benchmarks, a staff handbook, and summary and introductory documents. This book is about the development and the history of the benchmark program. It has taken over 3…
Benchmarking and Performance Measurement.

ERIC Educational Resources Information Center

Town, J. Stephen

This paper defines benchmarking and its relationship to quality management, describes a project which applied the technique in a library context, and explores the relationship between performance measurement and benchmarking. Numerous benchmarking methods contain similar elements: deciding what to benchmark; identifying partners; gathering…

The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

1991-01-01

A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification: all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Quality management benchmarking: FDA compliance in pharmaceutical industry.

PubMed

Jochem, Roland; Landgraf, Katja

2010-01-01

By analyzing and comparing industry and business best practice, processes can be optimized and become more successful mainly because efficiency and competitiveness increase. This paper aims to focus on some examples. Case studies are used to show knowledge exchange in the pharmaceutical industry. Best practice solutions were identified in two companies using a benchmarking method and five-stage model. Despite large administrations, there is much potential regarding business process organization. This project makes it possible for participants to fully understand their business processes. The benchmarking method gives an opportunity to critically analyze value chains (a string of companies or players working together to satisfy market demands for a special product). Knowledge exchange is interesting for companies that like to be global players. Benchmarking supports information exchange and improves competitive ability between different enterprises. Findings suggest that the five-stage model improves efficiency and effectiveness. Furthermore, the model increases the chances for reaching targets. The method gives security to partners that did not have benchmarking experience. The study identifies new quality management procedures. Process management and especially benchmarking is shown to support pharmaceutical industry improvements.
Benchmarking in Czech Higher Education: The Case of Schools of Economics

ERIC Educational Resources Information Center

Placek, Michal; Ochrana, František; Pucek, Milan

2015-01-01

This article describes the use of benchmarking in universities in the Czech Republic and academics' experiences with it. It is based on research conducted among academics from economics schools in Czech public and private universities. The results identified several issues regarding the utilisation and understanding of benchmarking in the Czech…
Machine characterization and benchmark performance prediction

NASA Technical Reports Server (NTRS)

Saavedra-Barrera, Rafael H.

1988-01-01

From runs of standard benchmarks or benchmark suites, it is not possible to characterize the machine nor to predict the run time of other benchmarks which have not been run. A new approach to benchmarking and machine characterization is reported. The creation and use of a machine analyzer is described, which measures the performance of a given machine on FORTRAN source language constructs. The machine analyzer yields a set of parameters which characterize the machine and spotlight its strong and weak points. Also described is a program analyzer, which analyzes FORTRAN programs and determines the frequency of execution of each of the same set of source language operations. It is then shown that by combining a machine characterization and a program characterization, we are able to predict with good accuracy the run time of a given benchmark on a given machine. Characterizations are provided for the Cray-X-MP/48, Cyber 205, IBM 3090/200, Amdahl 5840, Convex C-1, VAX 8600, VAX 11/785, VAX 11/780, SUN 3/50, and IBM RT-PC/125, and for the following benchmark programs or suites: Los Alamos (BMK8A1), Baskett, Linpack, Livermore Loops, Mandelbrot Set, NAS Kernels, Shell Sort, Smith, Whetstone and Sieve of Eratosthenes.
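The combination of a machine characterization with a program characterization can be pictured as a weighted sum: each operation's execution count multiplied by that machine's time per operation. The sketch below is only an illustration of that idea, not the report's actual analyzer. The operation names, timings, and counts are hypothetical, and it assumes operation costs add linearly, with no overlap or caching effects.

```python
# Illustrative sketch only: operation names, timings, and counts are hypothetical,
# and the linear (additive) cost model is an assumption, not the report's model.

# "Machine characterization": assumed seconds per operation on one machine.
machine_params = {"fadd": 2.0e-8, "fmul": 3.0e-8, "load": 1.0e-8, "branch": 1.5e-8}

# "Program characterization": assumed execution counts of the same operations.
program_counts = {"fadd": 4_000_000, "fmul": 2_000_000, "load": 10_000_000, "branch": 1_000_000}


def predict_run_time(machine, program):
    """Estimate run time as the sum over operations of count * time per operation."""
    missing = set(program) - set(machine)
    if missing:
        raise KeyError(f"machine characterization lacks operations: {sorted(missing)}")
    return sum(count * machine[op] for op, count in program.items())


if __name__ == "__main__":
    print(f"predicted run time: {predict_run_time(machine_params, program_counts):.3f} s")
```

Under this model, the same program characterization can be reused against a different machine's parameter set to compare predicted run times without running the program there.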
Evaluation of the concrete shield compositions from the 2010 criticality accident alarm system benchmark experiments at the CEA Valduc SILENE facility

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, Thomas Martin; Celik, Cihangir; Dunn, Michael E

In October 2010, a series of benchmark experiments was conducted at the French Commissariat à l'Énergie Atomique et aux Énergies Alternatives (CEA) Valduc SILENE facility. These experiments were a joint effort between the United States Department of Energy Nuclear Criticality Safety Program and the CEA. The purpose of these experiments was to create three benchmarks for the verification and validation of radiation transport codes and evaluated nuclear data used in the analysis of criticality accident alarm systems. This series of experiments consisted of three single-pulsed experiments with the SILENE reactor. For the first experiment, the reactor was bare (unshielded), whereas in the second and third experiments, it was shielded by lead and polyethylene, respectively. The polyethylene shield of the third experiment had a cadmium liner on its internal and external surfaces, which vertically was located near the fuel region of SILENE. During each experiment, several neutron activation foils and thermoluminescent dosimeters (TLDs) were placed around the reactor. Nearly half of the foils and TLDs had additional high-density magnetite concrete, high-density barite concrete, standard concrete, and/or BoroBond shields. CEA Saclay provided all the concrete, and the US Y-12 National Security Complex provided the BoroBond. Measurement data from the experiments were published at the 2011 International Conference on Nuclear Criticality (ICNC 2011) and the 2013 Nuclear Criticality Safety Division (NCSD 2013) topical meeting. Preliminary computational results for the first experiment were presented in the ICNC 2011 paper, which showed poor agreement between the computational results and the measured values of the foils shielded by concrete. Recently the hydrogen content, boron content, and density of these concrete shields were further investigated within the constraints of the previously available data. New computational results for the first experiment are now
47 CFR 69.108 - Transport rate benchmark.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 47 Telecommunication 3 2010-10-01 2010-10-01 false Transport rate benchmark. 69.108 Section 69.108... Computation of Charges § 69.108 Transport rate benchmark. (a) For transport charges computed in accordance... interoffice transmission using the telephone company's DS1 special access rates. (b) Initial transport rates...
47 CFR 69.108 - Transport rate benchmark.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 47 Telecommunication 3 2011-10-01 2011-10-01 false Transport rate benchmark. 69.108 Section 69.108... Computation of Charges § 69.108 Transport rate benchmark. (a) For transport charges computed in accordance... interoffice transmission using the telephone company's DS1 special access rates. (b) Initial transport rates...
Criticality experiments and benchmarks for cross section evaluation: the neptunium case

NASA Astrophysics Data System (ADS)

Leong, L. S.; Tassan-Got, L.; Audouin, L.; Paradela, C.; Wilson, J. N.; Tarrio, D.; Berthier, B.; Duran, I.; Le Naour, C.; Stéphan, C.

2013-03-01

The 237Np neutron-induced fission cross section has been recently measured in a large energy range (from eV to GeV) at the n_TOF facility at CERN. When compared to previous measurements, the n_TOF fission cross section appears to be higher by 5-7% beyond the fission threshold. To check the relevance of the n_TOF data, we apply a criticality experiment performed at Los Alamos with a 6 kg sphere of 237Np, surrounded by uranium enriched in 235U so as to approach criticality with fast neutrons. The calculated multiplication factor keff is in better agreement with the experiment (the deviation of 750 pcm is reduced to 250 pcm) when we replace the ENDF/B-VII.0 evaluation of the 237Np fission cross section by the n_TOF data. We also explore the hypothesis of deficiencies of the inelastic cross section in 235U, which has been invoked by some authors to explain the deviation of 750 pcm. The large distortion of the inelastic cross section that this hypothesis would require is incompatible with existing measurements. We also show that the ν of 237Np can hardly be incriminated because of the high accuracy of the existing data. Fission rate ratios or averaged fission cross sections measured in several fast neutron fields seem to give contradictory results on the validation of the 237Np cross section, but at least one of the benchmark experiments, where the active deposits have been well calibrated for the number of atoms, favors the n_TOF data set. These outcomes support the hypothesis of a higher fission cross section of 237Np.
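For readers unfamiliar with the unit, 1 pcm (per cent mille) is 1e-5, so a 750 pcm deviation corresponds to a difference of 0.0075 in keff. The sketch below shows only that arithmetic. The keff values are made up for illustration, the function name is hypothetical, and the sign of the deviation is an assumption, since the abstract gives only magnitudes; some authors quote such deviations in reactivity rather than in keff.

```python
# Illustrative arithmetic only: the keff values below are invented, and the
# negative sign of the deviation is an assumption (the abstract gives magnitudes).

PCM = 1.0e-5  # one pcm (per cent mille) expressed as a fraction


def keff_deviation_pcm(k_calculated, k_experiment):
    """Return the calculated-minus-experimental keff difference in pcm."""
    return (k_calculated - k_experiment) / PCM


if __name__ == "__main__":
    k_experiment = 1.0000  # hypothetical benchmark value
    for label, k_calc in [("reference evaluation", 0.9925), ("alternative data", 0.9975)]:
        dev = keff_deviation_pcm(k_calc, k_experiment)
        print(f"{label}: {dev:.0f} pcm")
```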
Benchmarking of Heavy Ion Transport Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in the design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required.
HMM-ModE: implementation, benchmarking and validation with HMMER3

PubMed Central

2014-01-01

Background HMM-ModE is a computational method that generates family specific profile HMMs using negative training sequences. The method optimizes the discrimination threshold using 10-fold cross validation and modifies the emission probabilities of profiles to reduce common fold based signals shared with other sub-families. The protocol depends on the program HMMER for HMM profile building and sequence database searching. The recent release of HMMER3 has improved database search speed by several orders of magnitude, allowing for the large scale deployment of the method in sequence annotation projects. We have rewritten our existing scripts both at the level of parsing the HMM profiles and modifying emission probabilities to upgrade HMM-ModE using HMMER3 that takes advantage of its probabilistic inference with high computational speed. The method is benchmarked and tested on a GPCR dataset as an accurate and fast method for functional annotation. Results The implementation of this method, which now works with HMMER3, is benchmarked with the earlier version of HMMER, to show that the effect of local-local alignments is marked only in the case of profiles containing a large number of discontinuous match states. The method is tested on a gold standard set of families and we have reported a significant reduction in the number of false positive hits over the default HMM profiles. When implemented on GPCR sequences, the results showed an improvement in the accuracy of classification compared with other methods used to classify the family at different levels of their classification hierarchy. Conclusions The present findings show that the new version of HMM-ModE is a highly specific method used to differentiate between fold (superfamily) and function (family) specific signals, which helps in the functional annotation of protein sequences. The use of modified profile HMMs of GPCR sequences provides a simple yet highly specific method for classification of the family, being
Comparison of Origin 2000 and Origin 3000 Using NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Turney, Raymond D.

2001-01-01

This report describes results of benchmark tests on the Origin 3000 system currently being installed at the NASA Ames National Advanced Supercomputing facility. This machine will ultimately contain 1024 R14K processors. The first part of the system, installed in November, 2000 and named mendel, is an Origin 3000 with 128 R12K processors. For comparison purposes, the tests were also run on lomax, an Origin 2000 with R12K processors. The BT, LU, and SP application benchmarks in the NAS Parallel Benchmark Suite and the kernel benchmark FT were chosen to determine system performance and measure the impact of changes on the machine as it evolves. Having been written to measure performance on Computational Fluid Dynamics applications, these benchmarks are assumed appropriate to represent the NAS workload. Since the NAS runs both message passing (MPI) and shared-memory, compiler directive type codes, both MPI and OpenMP versions of the benchmarks were used. The MPI versions used were the latest official release of the NAS Parallel Benchmarks, version 2.3. The OpenMP versions used were PBN 3b2, a beta version that is in the process of being released. NPB 2.3 and PBN 3b2 are technically different benchmarks, and NPB results are not directly comparable to PBN results.
Simplified Numerical Analysis of ECT Probe - Eddy Current Benchmark Problem 3

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sikora, R.; Chady, T.; Gratkowski, S.

2005-04-09

In this paper a third eddy current benchmark problem is considered. The objective of the benchmark is to determine the optimal operating frequency and size of the pancake coil designated for testing tubes made of Inconel. This can be achieved by maximizing the change in impedance of the coil due to a flaw. Approximation functions of the probe (coil) characteristic were developed and used in order to reduce the number of required calculations, which results in a significant speedup of the optimization process. An optimal testing frequency and size of the probe were obtained as the final result of the calculation.
Benchmarking in emergency health systems.

PubMed

Kennedy, Marcus P; Allen, Jacqueline; Allen, Greg

2002-12-01

This paper discusses the role of benchmarking as a component of quality management. It describes the historical background of benchmarking, its competitive origin and the requirement in today's health environment for a more collaborative approach. The classical 'functional and generic' types of benchmarking are discussed with a suggestion to adopt a different terminology that describes the purpose and practicalities of benchmarking. Benchmarking is not without risks. The consequence of inappropriate focus and the need for a balanced overview of process is explored. The competition that is intrinsic to benchmarking is questioned and the negative impact it may have on improvement strategies in poorly performing organizations is recognized. The difficulty in achieving cross-organizational validity in benchmarking is emphasized, as is the need to scrutinize benchmarking measures. The cost effectiveness of benchmarking projects is questioned and the concept of 'best value, best practice' in an environment of fixed resources is examined.
Benchmarking the QUAD4/TRIA3 element

NASA Technical Reports Server (NTRS)

Pitrof, Stephen M.; Venkayya, Vipperla B.

1993-01-01

The QUAD4 and TRIA3 elements are the primary plate/shell elements in NASTRAN. These elements enable the user to analyze thin plate/shell structures for membrane, bending and shear phenomena. They are also very new elements in the NASTRAN library. These elements are extremely versatile and constitute a substantially enhanced analysis capability in NASTRAN. However, with the versatility comes the burden of understanding a myriad of modeling implications and their effect on accuracy and analysis quality. The validity of many aspects of these elements was established through a series of benchmark problem results and comparison with those available in the literature and obtained from other programs like MSC/NASTRAN and CSAR/NASTRAN. Nevertheless, such a comparison is never complete because of the new and creative use of these elements in complex modeling situations. One of the important features of QUAD4 and TRIA3 elements is the offset capability, which allows the midsurface of the plate to be noncoincident with the surface of the grid points. None of the previous elements, with the exception of bar (beam), has this capability. The offset capability played a crucial role in the design of QUAD4 and TRIA3 elements. It allowed modeling layered composites, laminated plates and sandwich plates with the metal and composite face sheets. Even though the basic implementation of the offset capability is found to be sound in the previous applications, there is some uncertainty in relatively simple applications. The main purpose of this paper is to test the integrity of the offset capability and provide guidelines for its effective use. For the purpose of simplicity, references in this paper to the QUAD4 element will also include the TRIA3 element.
Availability of Neutronics Benchmarks in the ICSBEP and IRPhEP Handbooks for Computational Tools Testing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Briggs, J. Blair; Ivanova, Tatiana

2017-02-01

In the past several decades, numerous experiments have been performed worldwide to support reactor operations, measurements, design, and nuclear safety. Those experiments represent an extensive international investment in infrastructure, expertise, and cost, representing significantly valuable resources of data supporting past, current, and future research activities. Those valuable assets represent the basis for recording, development, and validation of our nuclear methods and integral nuclear data [1]. The loss of these experimental data, which has occurred all too often in recent years, is tragic. The high cost to repeat many of these measurements can be prohibitive, if not impossible, to surmount. Two international projects were developed, and are under the direction of the Organisation for Economic Co-operation and Development Nuclear Energy Agency (OECD NEA) to address the challenges of not just data preservation, but evaluation of the data to determine its merit for modern and future use. The International Criticality Safety Benchmark Evaluation Project (ICSBEP) was established to identify and verify comprehensive critical benchmark data sets; evaluate the data, including quantification of biases and uncertainties; compile the data and calculations in a standardized format; and formally document the effort into a single source of verified benchmark data [2]. Similarly, the International Reactor Physics Experiment Evaluation Project (IRPhEP) was established to preserve integral reactor physics experimental data, including separate or special effects data for nuclear energy and technology applications [3]. Annually, contributors from around the world continue to collaborate in the evaluation and review of select benchmark experiments for preservation and dissemination. The extensively peer-reviewed integral benchmark data can then be utilized to support nuclear design and safety analysts to validate the analytical tools, methods, and data needed for next
Benchmarking reference services: step by step.

PubMed

Buchanan, H S; Marshall, J G

1996-01-01

This article is a companion to an introductory article on benchmarking published in an earlier issue of Medical Reference Services Quarterly. Librarians interested in benchmarking often ask the following questions: How do I determine what to benchmark; how do I form a benchmarking team; how do I identify benchmarking partners; what's the best way to collect and analyze benchmarking information; and what will I do with the data? Careful planning is a critical success factor of any benchmarking project, and these questions must be answered before embarking on a benchmarking study. This article summarizes the steps necessary to conduct benchmarking research. Relevant examples of each benchmarking step are provided.
High-energy neutron depth-dose distribution experiment.

PubMed

Ferenci, M S; Hertel, N E

2003-01-01

A unique set of high-energy neutron depth-dose benchmark experiments were performed at the Los Alamos Neutron Science Center/Weapons Neutron Research (LANSCE/WNR) complex. The experiments consisted of filtered neutron beams with energies up to 800 MeV impinging on a 30 x 30 x 30 cm3 liquid, tissue-equivalent phantom. The absorbed dose was measured in the phantom at various depths with tissue-equivalent ion chambers. This experiment is intended to serve as a benchmark experiment for the testing of high-energy radiation transport codes for the international radiation protection community.
Issues in Benchmarking Human Reliability Analysis Methods: A Literature Review

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ronald L. Boring; Stacey M. L. Hendrickson; John A. Forester

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessments (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study comparing and evaluating HRA methods in assessing operator performance in simulator experiments is currently underway. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.
Discovering and Implementing Best Practices to Strengthen SEAs: Collaborative Benchmarking

ERIC Educational Resources Information Center

Building State Capacity and Productivity Center, 2013

2013-01-01

This paper is written for state educational agency (SEA) leaders who are considering the benefits of collaborative benchmarking, and it addresses the following questions: (1) What does benchmarking of best practices entail?; (2) How does "collaborative benchmarking" enhance the process?; (3) How do SEAs control the process so that "their" needs…

Test One to Test Many: A Unified Approach to Quantum Benchmarks

NASA Astrophysics Data System (ADS)

Bai, Ge; Chiribella, Giulio

2018-04-01

Quantum benchmarks are routinely used to validate the experimental demonstration of quantum information protocols. Many relevant protocols, however, involve an infinite set of input states, of which only a finite subset can be used to test the quality of the implementation. This is a problem, because the benchmark for the finitely many states used in the test can be higher than the original benchmark calculated for infinitely many states. This situation arises in the teleportation and storage of coherent states, for which the benchmark of 50% fidelity is commonly used in experiments, although finite sets of coherent states normally lead to higher benchmarks. Here, we show that the average fidelity over all coherent states can be indirectly probed with a single setup, requiring only two-mode squeezing, a 50-50 beam splitter, and homodyne detection. Our setup enables a rigorous experimental validation of quantum teleportation, storage, amplification, attenuation, and purification of noisy coherent states. More generally, we prove that every quantum benchmark can be tested by preparing a single entangled state and measuring a single observable.
Energy benchmarking of commercial buildings: a low-cost pathway toward urban sustainability

NASA Astrophysics Data System (ADS)

Cox, Matt; Brown, Marilyn A.; Sun, Xiaojing

2013-09-01

US cities are beginning to experiment with a regulatory approach to address information failures in the real estate market by mandating the energy benchmarking of commercial buildings. Understanding how a commercial building uses energy has many benefits; for example, it helps building owners and tenants identify poor-performing buildings and subsystems and it enables high-performing buildings to achieve greater occupancy rates, rents, and property values. This paper estimates the possible impacts of a national energy benchmarking mandate through analysis chiefly utilizing the Georgia Tech version of the National Energy Modeling System (GT-NEMS). Correcting input discount rates results in a 4.0% reduction in projected energy consumption for seven major classes of equipment relative to the reference case forecast in 2020, rising to 8.7% in 2035. Thus, the official US energy forecasts appear to overestimate future energy consumption by underestimating investments in energy-efficient equipment. Further discount rate reductions spurred by benchmarking policies yield another 1.3-1.4% in energy savings in 2020, increasing to 2.2-2.4% in 2035. Benchmarking would increase the purchase of energy-efficient equipment, reducing energy bills, CO2 emissions, and conventional air pollution. Achieving comparable CO2 savings would require more than tripling existing US solar capacity. Our analysis suggests that nearly 90% of the energy saved by a national benchmarking policy would benefit metropolitan areas, and the policy’s benefits would outweigh its costs, both to the private sector and society broadly.
The KMAT: Benchmarking Knowledge Management.

ERIC Educational Resources Information Center

de Jager, Martha

Provides an overview of knowledge management and benchmarking, including the benefits and methods of benchmarking (e.g., competitive, cooperative, collaborative, and internal benchmarking). Arthur Andersen's KMAT (Knowledge Management Assessment Tool) is described. The KMAT is a collaborative benchmarking tool, designed to help organizations make…
Benchmarks for target tracking

NASA Astrophysics Data System (ADS)

Dunham, Darin T.; West, Philip D.

2011-09-01

The term benchmark originates from the chiseled horizontal marks that surveyors made, into which an angle-iron could be placed to bracket ("bench") a leveling rod, thus ensuring that the leveling rod can be repositioned in exactly the same place in the future. A benchmark in computer terms is the result of running a computer program, or a set of programs, in order to assess the relative performance of an object by running a number of standard tests and trials against it. This paper will discuss the history of simulation benchmarks that are being used by multiple branches of the military and agencies of the US government. These benchmarks range from missile defense applications to chemical biological situations. Typically, a benchmark is used with Monte Carlo runs in order to tease out how algorithms deal with variability and the range of possible inputs. We will also describe problems that can be solved by a benchmark.
Benchmarks for Psychotherapy Efficacy in Adult Major Depression

ERIC Educational Resources Information Center

Minami, Takuya; Wampold, Bruce E.; Serlin, Ronald C.; Kircher, John C.; Brown, George S.

2007-01-01

This study estimates pretreatment-posttreatment effect size benchmarks for the treatment of major depression in adults that may be useful in evaluating psychotherapy effectiveness in clinical practice. Treatment efficacy benchmarks for major depression were derived for 3 different types of outcome measures: the Hamilton Rating Scale for Depression…
Benchmarking specialty hospitals, a scoping review on theory and practice.

PubMed

Wind, A; van Harten, W H

2017-04-04

Although benchmarking may improve hospital processes, research on this subject is limited. The aim of this study was to provide an overview of publications on benchmarking in specialty hospitals and a description of study characteristics. We searched PubMed and EMBASE for articles published in English in the last 10 years. Eligible articles described a project stating benchmarking as its objective and involving a specialty hospital or specific patient category; or those dealing with the methodology or evaluation of benchmarking. Of 1,817 articles identified in total, 24 were included in the study. Articles were categorized into: pathway benchmarking, institutional benchmarking, articles on benchmark methodology or evaluation, and benchmarking using a patient registry. There was a large degree of variability: (1) study designs were mostly descriptive and retrospective; (2) not all studies generated and showed data in sufficient detail; and (3) there was variety in whether a benchmarking model was just described or if quality improvement as a consequence of the benchmark was reported upon. Most of the studies that described a benchmark model described the use of benchmarking partners from the same industry category, sometimes from all over the world. Benchmarking seems to be more developed in eye hospitals, emergency departments and oncology specialty hospitals. Some studies showed promising improvement effects. However, the majority of the articles lacked a structured design, and did not report on benchmark outcomes. In order to evaluate the effectiveness of benchmarking to improve quality in specialty hospitals, robust and structured designs are needed including a follow up to check whether the benchmark study has led to improvements.
Spherical harmonic results for the 3D Kobayashi Benchmark suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, P N; Chang, B; Hanebutte, U R

1999-03-02

Spherical harmonic solutions are presented for the Kobayashi benchmark suite. The results were obtained with Ardra, a scalable, parallel neutron transport code developed at Lawrence Livermore National Laboratory (LLNL). The calculations were performed on the IBM ASCI Blue-Pacific computer at LLNL.
ENDF/B-VII.1 Neutron Cross Section Data Testing with Critical Assembly Benchmarks and Reactor Experiments

NASA Astrophysics Data System (ADS)

Kahler, A. C.; MacFarlane, R. E.; Mosteller, R. D.; Kiedrowski, B. C.; Frankle, S. C.; Chadwick, M. B.; McKnight, R. D.; Lell, R. M.; Palmiotti, G.; Hiruta, H.; Herman, M.; Arcilla, R.; Mughabghab, S. F.; Sublet, J. C.; Trkov, A.; Trumbull, T. H.; Dunn, M.

2011-12-01

The ENDF/B-VII.1 library is the latest revision to the United States' Evaluated Nuclear Data File (ENDF). The ENDF library is currently in its seventh generation, with ENDF/B-VII.0 being released in 2006. This revision expands upon that library, including the addition of new evaluated files (was 393 neutron files previously, now 423 including replacement of elemental vanadium and zinc evaluations with isotopic evaluations) and extension or updating of many existing neutron data files. Complete details are provided in the companion paper [M. B. Chadwick et al., "ENDF/B-VII.1 Nuclear Data for Science and Technology: Cross Sections, Covariances, Fission Product Yields and Decay Data," Nuclear Data Sheets, 112, 2887 (2011)]. This paper focuses on how accurately application libraries may be expected to perform in criticality calculations with these data. Continuous energy cross section libraries, suitable for use with the MCNP Monte Carlo transport code, have been generated and applied to a suite of nearly one thousand critical benchmark assemblies defined in the International Criticality Safety Benchmark Evaluation Project's International Handbook of Evaluated Criticality Safety Benchmark Experiments. This suite covers uranium and plutonium fuel systems in a variety of forms such as metallic, oxide or solution, and under a variety of spectral conditions, including unmoderated (i.e., bare), metal reflected and water or other light element reflected. Assembly eigenvalues that were accurately predicted with ENDF/B-VII.0 cross sections such as unmoderated and uranium reflected 235U and 239Pu assemblies, HEU solution systems and LEU oxide lattice systems that mimic commercial PWR configurations continue to be accurately calculated with ENDF/B-VII.1 cross sections, and deficiencies in predicted eigenvalues for assemblies containing selected materials, including titanium, manganese, cadmium and tungsten are greatly reduced. Improvements are also confirmed for selected
Benchmarking and the laboratory

PubMed Central

Galloway, M; Nadin, L

2001-01-01

This article describes how benchmarking can be used to assess laboratory performance. Two benchmarking schemes are reviewed, the Clinical Benchmarking Company's Pathology Report and the College of American Pathologists' Q-Probes scheme. The Clinical Benchmarking Company's Pathology Report is undertaken by staff based in the clinical management unit, Keele University with appropriate input from the professional organisations within pathology. Five annual reports have now been completed. Each report is a detailed analysis of 10 areas of laboratory performance. In this review, particular attention is focused on the areas of quality, productivity, variation in clinical practice, skill mix, and working hours. The Q-Probes scheme is part of the College of American Pathologists programme in studies of quality assurance. The Q-Probes scheme and its applicability to pathology in the UK is illustrated by reviewing two recent Q-Probe studies: routine outpatient test turnaround time and outpatient test order accuracy. The Q-Probes scheme is somewhat limited by the small number of UK laboratories that have participated. In conclusion, as a result of the government's policy in the UK, benchmarking is here to stay. Benchmarking schemes described in this article are one way in which pathologists can demonstrate that they are providing a cost effective and high quality service. Key Words: benchmarking • pathology PMID:11477112
Benchmarking Academic Libraries: An Australian Case Study.

ERIC Educational Resources Information Center

Robertson, Margaret; Trahn, Isabella

1997-01-01

Discusses experiences and outcomes of benchmarking at the Queensland University of Technology (Australia) library that compared acquisitions, cataloging, document delivery, and research support services with those of the University of New South Wales. Highlights include results as a catalyst for change, and the use of common output and performance…
Benchmarking for Higher Education.

ERIC Educational Resources Information Center

Jackson, Norman, Ed.; Lund, Helen, Ed.

The chapters in this collection explore the concept of benchmarking as it is being used and developed in higher education (HE). Case studies and reviews show how universities in the United Kingdom are using benchmarking to aid in self-regulation and self-improvement. The chapters are: (1) "Introduction to Benchmarking" (Norman Jackson…
Overview of TPC Benchmark E: The Next Generation of OLTP Benchmarks

NASA Astrophysics Data System (ADS)

Hogan, Trish

Set to replace the aging TPC-C, the TPC Benchmark E is the next generation OLTP benchmark, which more accurately models client database usage. TPC-E addresses the shortcomings of TPC-C. It has a much more complex workload, requires the use of RAID-protected storage, generates much less I/O, and is much cheaper and easier to set up, run, and audit. After a period of overlap, it is expected that TPC-E will become the de facto OLTP benchmark.
Validation and Comparison of 2D and 3D Codes for Nearshore Motion of Long Waves Using Benchmark Problems

NASA Astrophysics Data System (ADS)

Velioǧlu, Deniz; Cevdet Yalçıner, Ahmet; Zaytsev, Andrey

2016-04-01

Tsunamis are huge waves with long wave periods and wave lengths that can cause great devastation and loss of life when they strike a coast. The interest in experimental and numerical modeling of tsunami propagation and inundation increased considerably after the 2011 Great East Japan earthquake. In this study, two numerical codes, FLOW 3D and NAMI DANCE, that analyze tsunami propagation and inundation patterns are considered. FLOW 3D simulates linear and nonlinear propagating surface waves as well as long waves by solving three-dimensional Navier-Stokes (3D-NS) equations. NAMI DANCE uses a finite difference computational method to solve 2D depth-averaged linear and nonlinear forms of shallow water equations (NSWE) in long wave problems, specifically tsunamis. In order to validate these two codes and analyze the differences between 3D-NS and 2D depth-averaged NSWE equations, two benchmark problems are applied. One benchmark problem investigates the runup of long waves over a complex 3D beach. The experimental setup is a 1:400 scale model of Monai Valley located on the west coast of Okushiri Island, Japan. The other benchmark problem was discussed at the 2015 National Tsunami Hazard Mitigation Program (NTHMP) Annual meeting in Portland, USA. It is a field dataset, recording the Japan 2011 tsunami in Hilo Harbor, Hawaii. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and with the benchmark data. The differences between 3D-NS and 2D depth-averaged NSWE equations are highlighted. All results are presented with discussions and comparisons. Acknowledgements: Partial support by Japan-Turkey Joint Research Project by JICA on earthquakes and tsunamis in Marmara Region (JICA SATREPS - MarDiM Project), 603839 ASTARTE Project of EU, UDAP-C-12-14 project of AFAD Turkey, 108Y227, 113M556 and 213M534 projects of TUBITAK Turkey, RAPSODI (CONCERT_Dis-021) of CONCERT
Benchmarking reference services: an introduction.

PubMed

Marshall, J G; Buchanan, H S

1995-01-01

Benchmarking is based on the common sense idea that someone else, either inside or outside of libraries, has found a better way of doing certain things and that your own library's performance can be improved by finding out how others do things and adopting the best practices you find. Benchmarking is one of the tools used for achieving continuous improvement in Total Quality Management (TQM) programs. Although benchmarking can be done on an informal basis, TQM puts considerable emphasis on formal data collection and performance measurement. Used to its full potential, benchmarking can provide a common measuring stick to evaluate process performance. This article introduces the general concept of benchmarking, linking it whenever possible to reference services in health sciences libraries. Data collection instruments that have potential application in benchmarking studies are discussed and the need to develop common measurement tools to facilitate benchmarking is emphasized.
A Uranium Bioremediation Reactive Transport Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yabusaki, Steven B.; Sengor, Sevinc; Fang, Yilin

A reactive transport benchmark problem set has been developed based on in situ uranium bio-immobilization experiments that have been performed at a former uranium mill tailings site in Rifle, Colorado, USA. Acetate-amended groundwater stimulates indigenous microorganisms to catalyze the reduction of U(VI) to a sparingly soluble U(IV) mineral. The interplay between the flow, acetate loading periods and rates, microbially-mediated and geochemical reactions leads to dynamic behavior in metal- and sulfate-reducing bacteria, pH, alkalinity, and reactive mineral surfaces. The benchmark is based on an 8.5 m long one-dimensional model domain with constant saturated flow and uniform porosity. The 159-day simulation introduces acetate and bromide through the upgradient boundary in 14-day and 85-day pulses separated by a 10 day interruption. Acetate loading is tripled during the second pulse, which is followed by a 50 day recovery period. Terminal electron accepting processes for goethite, phyllosilicate Fe(III), U(VI), and sulfate are modeled using Monod-type rate laws. Major ion geochemistry modeled includes mineral reactions, as well as aqueous and surface complexation reactions for UO2++, Fe++, and H+. In addition to the dynamics imparted by the transport of the acetate pulses, U(VI) behavior involves the interplay between bioreduction, which is dependent on acetate availability, and speciation-controlled surface complexation, which is dependent on pH, alkalinity and available surface complexation sites. The general difficulty of this benchmark is the large number of reactions (74), multiple rate law formulations, a multisite uranium surface complexation model, and the strong interdependency and sensitivity of the reaction processes. Results are presented for three simulators: HYDROGEOCHEM, PHT3D, and PHREEQC.
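The loading schedule and rate-law form described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the benchmark specification: the function names, the relative loading unit `base`, and the dual-Monod form (electron donor times electron acceptor saturation terms) are assumptions chosen for illustration, since the abstract only says "Monod-type rate laws". The sketch also checks that the stated periods add up to the 159-day simulation.

```python
def acetate_schedule(day, base=1.0):
    """Relative acetate loading at the upgradient boundary on a given day.

    Periods follow the abstract: 14-day pulse, 10-day interruption,
    85-day pulse at tripled loading, then a 50-day recovery period.
    The unit `base` is a hypothetical placeholder.
    """
    if 0 <= day < 14:        # first pulse
        return base
    if 14 <= day < 24:       # interruption
        return 0.0
    if 24 <= day < 109:      # second pulse, loading tripled
        return 3.0 * base
    return 0.0               # recovery period (and anything outside the run)


def dual_monod_rate(mu_max, donor, acceptor, k_donor, k_acceptor):
    """Assumed dual-Monod form: mu_max * D/(K_D + D) * A/(K_A + A)."""
    return mu_max * donor / (k_donor + donor) * acceptor / (k_acceptor + acceptor)


# The four periods named in the abstract sum to the simulation length.
total_days = 14 + 10 + 85 + 50
```

With both concentrations equal to their half-saturation constants, each Monod term is 1/2, so the rate is one quarter of `mu_max`. The parameter values used in the benchmark itself are not given in the abstract.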
Benchmarking: A Method for Continuous Quality Improvement in Health

PubMed Central

Ettorchi-Tardy, Amina; Levif, Marie; Michel, Philippe

2012-01-01

Benchmarking, a management approach for implementing best practices at best cost, is a recent concept in the healthcare system. The objectives of this paper are to better understand the concept and its evolution in the healthcare sector, to propose an operational definition, and to describe some French and international experiences of benchmarking in the healthcare sector. To this end, we reviewed the literature on this approach's emergence in the industrial sector, its evolution, its fields of application and examples of how it has been used in the healthcare sector. Benchmarking is often thought to consist simply of comparing indicators and is not perceived in its entirety, that is, as a tool based on voluntary and active collaboration among several organizations to create a spirit of competition and to apply best practices. The key feature of benchmarking is its integration within a comprehensive and participatory policy of continuous quality improvement (CQI). Conditions for successful benchmarking focus essentially on careful preparation of the process, monitoring of the relevant indicators, staff involvement and inter-organizational visits. Compared to methods previously implemented in France (CQI and collaborative projects), benchmarking has specific features that set it apart as a healthcare innovation. This is especially true for healthcare or medical–social organizations, as the principle of inter-organizational visiting is not part of their culture. Thus, this approach will need to be assessed for feasibility and acceptability before it is more widely promoted. PMID:23634166
Benchmarking: a method for continuous quality improvement in health.

PubMed

Ettorchi-Tardy, Amina; Levif, Marie; Michel, Philippe

2012-05-01

Benchmarking, a management approach for implementing best practices at best cost, is a recent concept in the healthcare system. The objectives of this paper are to better understand the concept and its evolution in the healthcare sector, to propose an operational definition, and to describe some French and international experiences of benchmarking in the healthcare sector. To this end, we reviewed the literature on this approach's emergence in the industrial sector, its evolution, its fields of application and examples of how it has been used in the healthcare sector. Benchmarking is often thought to consist simply of comparing indicators and is not perceived in its entirety, that is, as a tool based on voluntary and active collaboration among several organizations to create a spirit of competition and to apply best practices. The key feature of benchmarking is its integration within a comprehensive and participatory policy of continuous quality improvement (CQI). Conditions for successful benchmarking focus essentially on careful preparation of the process, monitoring of the relevant indicators, staff involvement and inter-organizational visits. Compared to methods previously implemented in France (CQI and collaborative projects), benchmarking has specific features that set it apart as a healthcare innovation. This is especially true for healthcare or medical-social organizations, as the principle of inter-organizational visiting is not part of their culture. Thus, this approach will need to be assessed for feasibility and acceptability before it is more widely promoted.
Experimental power density distribution benchmark in the TRIGA Mark II reactor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Snoj, L.; Stancar, Z.; Radulovic, V.

2012-07-01

In order to improve the power calibration process and to benchmark the existing computational model of the TRIGA Mark II reactor at the Josef Stefan Inst. (JSI), a bilateral project was started as part of the agreement between the French Commissariat a l'energie atomique et aux energies alternatives (CEA) and the Ministry of higher education, science and technology of Slovenia. One of the objectives of the project was to analyze and improve the power calibration process of the JSI TRIGA reactor (procedural improvement and uncertainty reduction) by using absolutely calibrated CEA fission chambers (FCs). This is one of the few available power density distribution benchmarks for testing not only the fission rate distribution but also the absolute values of the fission rates. Our preliminary calculations indicate that the total experimental uncertainty of the measured reaction rate is sufficiently low that the experiments could be considered as benchmark experiments. (authors)
3D-MHD Simulations of the Madison Dynamo Experiment

NASA Astrophysics Data System (ADS)

Bayliss, R. A.; Forest, C. B.; Wright, J. C.; O'Connell, R.

2003-10-01

Growth, saturation and turbulent evolution of the Madison dynamo experiment are investigated numerically using a 3-D pseudo-spectral simulation of the MHD equations; results of the simulations are used to predict behavior of the experiment. The code solves the self-consistent full evolution of the magnetic and velocity fields. The code uses a spectral representation via spherical harmonic basis functions of the vector fields in longitude and latitude, and fourth order finite differences in the radial direction. The magnetic field evolution has been benchmarked against the laminar kinematic dynamo predicted by M.L. Dudley and R.W. James [Proc. R. Soc. Lond. A 425. 407-429 (1989)]. Initial results indicate that saturation of the magnetic field occurs so that the resulting perturbed backreaction of the induced magnetic field changes the velocity field such that it would no longer be linearly unstable, suggesting non-linear terms are necessary for explaining the resulting state. Saturation and self-excitation depend in detail upon the magnetic Prandtl number.
Toxicological Benchmarks for Screening Potential Contaminants of Concern for Effects on Terrestrial Plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Suter, G.W. II

1993-01-01

One of the initial stages in ecological risk assessment for hazardous waste sites is screening contaminants to determine which of them are worthy of further consideration as contaminants of potential concern. This process is termed contaminant screening. It is performed by comparing measured ambient concentrations of chemicals to benchmark concentrations. Currently, no standard benchmark concentrations exist for assessing contaminants in soil with respect to their toxicity to plants. This report presents a standard method for deriving benchmarks for this purpose (phytotoxicity benchmarks), a set of data concerning effects of chemicals in soil or soil solution on plants, and a set of phytotoxicity benchmarks for 38 chemicals potentially associated with United States Department of Energy (DOE) sites. In addition, background information on the phytotoxicity and occurrence of the chemicals in soils is presented, and literature describing the experiments from which data were drawn for benchmark derivation is reviewed. Chemicals that are found in soil at concentrations exceeding both the phytotoxicity benchmark and the background concentration for the soil type should be considered contaminants of potential concern.
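The screening rule stated in the last sentence of this abstract can be written as a short Python sketch. The function name, the dictionary layout, and all concentration values below are hypothetical placeholders for illustration; they are not benchmark values from the report.

```python
def is_contaminant_of_potential_concern(measured, benchmark, background):
    """Screening rule from the abstract: a chemical is retained only if its
    measured soil concentration exceeds BOTH the phytotoxicity benchmark
    and the background concentration for the soil type (same units)."""
    return measured > benchmark and measured > background


# Hypothetical concentrations (mg/kg), for illustration only.
samples = {
    "chemical_A": {"measured": 120.0, "benchmark": 50.0, "background": 30.0},
    "chemical_B": {"measured": 40.0, "benchmark": 50.0, "background": 10.0},
    "chemical_C": {"measured": 60.0, "benchmark": 50.0, "background": 80.0},
}

retained = sorted(
    name
    for name, c in samples.items()
    if is_contaminant_of_potential_concern(
        c["measured"], c["benchmark"], c["background"]
    )
)
```

In this made-up data set only `chemical_A` is retained: `chemical_B` is below the benchmark, and `chemical_C` exceeds the benchmark but not the local background.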

Benchmarking in Academic Pharmacy Departments

PubMed Central

Chisholm-Burns, Marie; Nappi, Jean; Gubbins, Paul O.; Ross, Leigh Ann

2010-01-01

Benchmarking in academic pharmacy, and recommendations for the potential uses of benchmarking in academic pharmacy departments are discussed in this paper. Benchmarking is the process by which practices, procedures, and performance metrics are compared to an established standard or best practice. Many businesses and industries use benchmarking to compare processes and outcomes, and ultimately plan for improvement. Institutions of higher learning have embraced benchmarking practices to facilitate measuring the quality of their educational and research programs. Benchmarking is used internally as well to justify the allocation of institutional resources or to mediate among competing demands for additional program staff or space. Surveying all chairs of academic pharmacy departments to explore benchmarking issues such as department size and composition, as well as faculty teaching, scholarly, and service productivity, could provide valuable information. To date, attempts to gather this data have had limited success. We believe this information is potentially important, urge that efforts to gather it should be continued, and offer suggestions to achieve full participation. PMID:21179251
Benchmarking in academic pharmacy departments.

PubMed

Bosso, John A; Chisholm-Burns, Marie; Nappi, Jean; Gubbins, Paul O; Ross, Leigh Ann

2010-10-11

Benchmarking in academic pharmacy, and recommendations for the potential uses of benchmarking in academic pharmacy departments are discussed in this paper. Benchmarking is the process by which practices, procedures, and performance metrics are compared to an established standard or best practice. Many businesses and industries use benchmarking to compare processes and outcomes, and ultimately plan for improvement. Institutions of higher learning have embraced benchmarking practices to facilitate measuring the quality of their educational and research programs. Benchmarking is used internally as well to justify the allocation of institutional resources or to mediate among competing demands for additional program staff or space. Surveying all chairs of academic pharmacy departments to explore benchmarking issues such as department size and composition, as well as faculty teaching, scholarly, and service productivity, could provide valuable information. To date, attempts to gather this data have had limited success. We believe this information is potentially important, urge that efforts to gather it should be continued, and offer suggestions to achieve full participation.
Performance of Landslide-HySEA tsunami model for NTHMP benchmarking validation process

NASA Astrophysics Data System (ADS)

Macias, Jorge

2017-04-01

In its FY2009 Strategic Plan, the NTHMP required that all numerical tsunami inundation models be verified as accurate and consistent through a model benchmarking process. This was completed in 2011, but only for seismic tsunami sources and in a limited manner for idealized solid underwater landslides. Recent work by various NTHMP states, however, has shown that landslide tsunami hazard may be dominant along significant parts of the US coastline, as compared to hazards from other tsunamigenic sources. To perform the above-mentioned validation process, a set of candidate benchmarks was proposed. These benchmarks are based on a subset of available laboratory data sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) closes the list of proposed benchmarks. The Landslide-HySEA model participated in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017. The aim of this presentation is to show some of the numerical results obtained for Landslide-HySEA in the framework of this benchmarking validation/verification effort. Acknowledgements. This research has been partially supported by the Junta de Andalucía research project TESELA (P11-RNM7069), the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
Nonparametric estimation of benchmark doses in environmental risk assessment

PubMed Central

Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

2013-01-01

Summary: An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133
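The general idea of an isotonic-regression benchmark dose can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' estimator: it uses a weighted pool-adjacent-violators fit to the observed response proportions, assumes the common extra-risk definition (R(d) - R(0)) / (1 - R(0)), and finds the BMD by linear interpolation between dose levels. The data, the function names, and the 10% benchmark response are hypothetical, and the bootstrap confidence limits mentioned in the abstract are not shown.

```python
def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the monotone order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def benchmark_dose(doses, responders, subjects, bmr=0.10):
    """Dose at which the interpolated extra risk first reaches `bmr`.

    Returns None if the benchmark response is never reached or if the
    estimated background risk is 1. Doses must be sorted ascending.
    """
    props = [y / n for y, n in zip(responders, subjects)]
    risk = pava(props, subjects)
    r0 = risk[0]
    if r0 >= 1.0:
        return None
    extra = [(r - r0) / (1.0 - r0) for r in risk]  # assumed extra-risk form
    for i in range(1, len(doses)):
        if extra[i] >= bmr:
            lo, hi = extra[i - 1], extra[i]
            frac = (bmr - lo) / (hi - lo)
            return doses[i - 1] + frac * (doses[i] - doses[i - 1])
    return None


# Hypothetical quantal-response data: 20 subjects per dose group.
doses = [0.0, 1.0, 2.0, 4.0]
responders = [1, 3, 2, 8]
subjects = [20, 20, 20, 20]

fit = pava([y / n for y, n in zip(responders, subjects)], subjects)
bmd = benchmark_dose(doses, responders, subjects, bmr=0.10)
```

In this made-up example the raw proportions at doses 1 and 2 (0.15 and 0.10) violate monotonicity, so they are pooled to 0.125, and the 10% extra-risk level is crossed between doses 2 and 4.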
Results Oriented Benchmarking: The Evolution of Benchmarking at NASA from Competitive Comparisons to World Class Space Partnerships

NASA Technical Reports Server (NTRS)

Bell, Michael A.

1999-01-01

Informal benchmarking using personal or professional networks has taken place for many years at the Kennedy Space Center (KSC). The National Aeronautics and Space Administration (NASA) recognized early on the need to formalize the benchmarking process for better utilization of resources and improved benchmarking performance. The need to compete in a faster, better, cheaper environment has been the catalyst for formalizing these efforts. A pioneering benchmarking consortium was chartered at KSC in January 1994. The consortium, known as the Kennedy Benchmarking Clearinghouse (KBC), is a collaborative effort of NASA and all major KSC contractors. The charter of this consortium is to facilitate effective benchmarking, and leverage the resulting quality improvements across KSC. The KBC acts as a resource with experienced facilitators and a proven process. One of the initial actions of the KBC was to develop a holistic methodology for Center-wide benchmarking. This approach to benchmarking integrates the best features of proven benchmarking models (i.e., Camp, Spendolini, Watson, and Balm). This cost-effective alternative to conventional benchmarking approaches has provided a foundation for consistent benchmarking at KSC through the development of common terminology, tools, and techniques. Through these efforts a foundation and infrastructure have been built that allow short-duration benchmarking studies yielding results, gleaned from world class partners, that can be readily implemented. The KBC has been recognized with the Silver Medal Award (in the applied research category) from the International Benchmarking Clearinghouse.
FireHose Streaming Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karl Anderson, Steve Plimpton

2015-01-27

The FireHose Streaming Benchmarks are a suite of stream-processing benchmarks defined to enable comparison of streaming software and hardware, both quantitatively vis-a-vis the rate at which they can process data, and qualitatively by judging the effort involved to implement and run the benchmarks. Each benchmark has two parts. The first is a generator which produces and outputs datums at a high rate in a specific format. The second is an analytic which reads the stream of datums and is required to perform a well-defined calculation on the collection of datums, typically to find anomalous datums that have been created in the stream by the generator. The FireHose suite provides code for the generators, sample code for the analytics (which users are free to re-implement in their own custom frameworks), and a precise definition of each benchmark calculation.
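The generator/analytic split described above can be sketched in a few lines. Everything specific here is hypothetical: the `(key, bit)` datum format, the bias probabilities, the window of 24 observations, and the flagging threshold are chosen for illustration and are not taken from the FireHose definitions.

```python
import random
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Iterator, Set, Tuple

Datum = Tuple[int, int]  # (key, bit): a hypothetical datum format


def generator(n: int, n_keys: int = 100,
              biased_keys: FrozenSet[int] = frozenset({7}),
              seed: int = 0) -> Iterator[Datum]:
    """Emit n datums; keys in `biased_keys` rarely carry a 1 bit."""
    rng = random.Random(seed)
    for _ in range(n):
        key = rng.randrange(n_keys)
        p_one = 0.1 if key in biased_keys else 0.5  # assumed probabilities
        yield key, int(rng.random() < p_one)


def analytic(stream: Iterable[Datum], window: int = 24,
             max_ones: int = 4) -> Set[int]:
    """Flag keys whose first `window` datums contain at most `max_ones` ones."""
    seen: Dict[int, int] = defaultdict(int)
    ones: Dict[int, int] = defaultdict(int)
    flagged: Set[int] = set()
    for key, bit in stream:
        if seen[key] >= window:
            continue  # a decision was already made for this key
        seen[key] += 1
        ones[key] += bit
        if seen[key] == window and ones[key] <= max_ones:
            flagged.add(key)
    return flagged


# The analytic consumes the generator lazily, one datum at a time.
suspects = analytic(generator(50_000))
```

Keeping the analytic as a function over any iterable mirrors the suite's stated intent that users may re-implement analytics in their own frameworks while the generator stays fixed.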
Effects of benchmarking on the quality of type 2 diabetes care: results of the OPTIMISE (Optimal Type 2 Diabetes Management Including Benchmarking and Standard Treatment) study in Greece

PubMed Central

Tsimihodimos, Vasilis; Kostapanos, Michael S.; Moulis, Alexandros; Nikas, Nikos; Elisaf, Moses S.

2015-01-01

Objectives: To investigate the effect of benchmarking on the quality of type 2 diabetes (T2DM) care in Greece. Methods: The OPTIMISE (Optimal Type 2 Diabetes Management Including Benchmarking and Standard Treatment) study [ClinicalTrials.gov identifier: NCT00681850] was an international multicenter, prospective cohort study. It included physicians randomized 3:1 to either receive benchmarking for glycated hemoglobin (HbA1c), systolic blood pressure (SBP) and low-density lipoprotein cholesterol (LDL-C) treatment targets (benchmarking group) or not (control group). The proportions of patients achieving the targets of the above-mentioned parameters were compared between groups after 12 months of treatment. Also, the proportions of patients achieving those targets at 12 months were compared with baseline in the benchmarking group. Results: In the Greek region, the OPTIMISE study included 797 adults with T2DM (570 in the benchmarking group). At month 12 the proportion of patients within the predefined targets for SBP and LDL-C was greater in the benchmarking compared with the control group (50.6 versus 35.8%, and 45.3 versus 36.1%, respectively). However, these differences were not statistically significant. No difference between groups was noted in the percentage of patients achieving the predefined target for HbA1c. At month 12 the increase in the percentage of patients achieving all three targets was greater in the benchmarking (5.9–15.0%) than in the control group (2.7–8.1%). In the benchmarking group more patients were on target regarding SBP (50.6% versus 29.8%), LDL-C (45.3% versus 31.3%) and HbA1c (63.8% versus 51.2%) at 12 months compared with baseline (p < 0.001 for all comparisons). Conclusion: Benchmarking may comprise a promising tool for improving the quality of T2DM care. Nevertheless, target achievement rates of each, and of all three, quality indicators were suboptimal, indicating there are still unmet needs in the management of T2DM. PMID:26445642
NAS Parallel Benchmark Results 11-96. 1.0

NASA Technical Reports Server (NTRS)

Bailey, David H.; Bailey, David; Chancellor, Marisa K. (Technical Monitor)

1997-01-01

The NAS Parallel Benchmarks have been developed at NASA Ames Research Center to study the performance of parallel supercomputers. The eight benchmark problems are specified in a "pencil and paper" fashion. In other words, the complete details of the problem to be solved are given in a technical document, and except for a few restrictions, benchmarkers are free to select the language constructs and implementation techniques best suited for a particular system. These results represent the best results that have been reported to us by the vendors for the specific 3 systems listed. In this report, we present new NPB (Version 1.0) performance results for the following systems: DEC Alpha Server 8400 5/440, Fujitsu VPP Series (VX, VPP300, and VPP700), HP/Convex Exemplar SPP2000, IBM RS/6000 SP P2SC node (120 MHz), NEC SX-4/32, SGI/CRAY T3E, SGI Origin200, and SGI Origin2000. We also report High Performance Fortran (HPF) based NPB results for IBM SP2 Wide Nodes, HP/Convex Exemplar SPP2000, and SGI/CRAY T3D. These results have been submitted by Applied Parallel Research (APR) and Portland Group Inc. (PGI). We also present sustained performance per dollar for Class B LU, SP and BT benchmarks.
Evaluating Productivity Predictions Under Elevated CO2 Conditions: Multi-Model Benchmarking Across FACE Experiments

NASA Astrophysics Data System (ADS)

Cowdery, E.; Dietze, M.

2016-12-01

As atmospheric levels of carbon dioxide continue to increase, it is critical that terrestrial ecosystem models can accurately predict ecological responses to the changing environment. Current predictions of net primary productivity (NPP) in response to elevated atmospheric CO2 concentration are highly variable and contain a considerable amount of uncertainty. The Predictive Ecosystem Analyzer (PEcAn) is an informatics toolbox that wraps around an ecosystem model and can be used to help identify which factors drive uncertainty. We tested a suite of models (LPJ-GUESS, MAESPA, GDAY, CLM5, DALEC, ED2), which represent a range from low to high structural complexity, across a range of Free-Air CO2 Enrichment (FACE) experiments: the Kennedy Space Center Open Top Chamber Experiment, the Rhinelander FACE experiment, the Duke Forest FACE experiment and the Oak Ridge Experiment on CO2 Enrichment. These tests were implemented in a novel benchmarking workflow that is automated, repeatable, and generalized to incorporate different sites and ecological models. Observational data from the FACE experiments represent a first test of this flexible, extensible approach aimed at providing repeatable tests of model process representation. To identify and evaluate the assumptions causing inter-model differences we used PEcAn to perform model sensitivity and uncertainty analysis, not only to assess the components of NPP, but also to examine system processes such as nutrient uptake and water use. Combining the observed patterns of uncertainty between multiple models with results of the recent FACE-model data synthesis project (FACE-MDS) can help identify which processes need further study and additional data constraints. These findings can be used to inform future experimental design and in turn can provide an informative starting point for data assimilation.
Spherical Harmonic Solutions to the 3D Kobayashi Benchmark Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, P.N.; Chang, B.; Hanebutte, U.R.

1999-12-29

Spherical harmonic solutions of order 5, 9 and 21 on spatial grids containing up to 3.3 million cells are presented for the Kobayashi benchmark suite. This suite of three problems with simple geometry of pure absorber with large void region was proposed by Professor Kobayashi at an OECD/NEA meeting in 1996. Each of the three problems contains a source, a void and a shield region. Problem 1 can best be described as a box-in-a-box problem, where a source region is surrounded by a square void region which itself is embedded in a square shield region. Problems 2 and 3 represent a shield with a void duct, Problem 2 having a straight duct and Problem 3 a dog-leg-shaped duct. A pure absorber and a 50% scattering case are considered for each of the three problems. The solutions have been obtained with Ardra, a scalable, parallel neutron transport code developed at Lawrence Livermore National Laboratory (LLNL). The Ardra code takes advantage of a two-level parallelization strategy, which combines message passing between processing nodes and thread-based parallelism amongst processors on each node. All calculations were performed on the IBM ASCI Blue-Pacific computer at LLNL.
Benchmarking Using Basic DBMS Operations

NASA Astrophysics Data System (ADS)

Crolotte, Alain; Ghazal, Ahmad

The TPC-H benchmark proved to be successful in the decision support area. Many commercial database vendors and their related hardware vendors used this benchmark to show the superiority and competitive edge of their products. However, over time, TPC-H became less representative of industry trends as vendors kept tuning their databases to this benchmark-specific workload. In this paper, we present XMarq, a simple benchmark framework that can be used to compare various software/hardware combinations. Our benchmark model is currently composed of 25 queries that measure the performance of basic operations such as scans, aggregations, joins and index access. This benchmark model is based on the TPC-H data model due to its maturity and well-understood data generation capability. We also propose metrics to evaluate single-system performance and compare two systems. Finally, we illustrate the effectiveness of this model by showing experimental results comparing two systems under different conditions.
Testing variations of the GW approximation on strongly correlated transition metal oxides: hematite (α-Fe2O3) as a benchmark.

PubMed

Liao, Peilin; Carter, Emily A

2011-09-07

Quantitative characterization of low-lying excited electronic states in materials is critical for the development of solar energy conversion materials. The many-body Green's function method known as the GW approximation (GWA) directly probes states corresponding to photoemission and inverse photoemission experiments, thereby determining the associated band structure. Several versions of the GW approximation with different levels of self-consistency exist in the field. While the GWA based on density functional theory (DFT) works well for conventional semiconductors, less is known about its reliability for strongly correlated semiconducting materials. Here we present a systematic study of the GWA using hematite (α-Fe2O3) as the benchmark material. We analyze its performance in terms of the calculated photoemission/inverse photoemission band gaps, densities of states, and dielectric functions. Overall, a non-self-consistent G0W0 using input from DFT+U theory produces physical observables in best agreement with experiments. This journal is © the Owner Societies 2011
Benchmarking Tool Kit.

ERIC Educational Resources Information Center

Canadian Health Libraries Association.

Nine Canadian health libraries participated in a pilot test of the Benchmarking Tool Kit between January and April, 1998. Although the Tool Kit was designed specifically for health libraries, the content and approach are useful to other types of libraries as well. Used to its full potential, benchmarking can provide a common measuring stick to…
Benchmarking of venous thromboembolism prophylaxis practice with ENT.UK guidelines.

PubMed

Al-Qahtani, Ali S

2017-05-01

The aim of this study was to benchmark our guidelines for the prevention of venous thromboembolism (VTE) in the ENT surgical population against ENT.UK guidelines, and also to encourage healthcare providers to utilize benchmarking as an effective method of improving performance. The study design is a prospective descriptive analysis. The setting of this study is a tertiary referral centre (Assir Central Hospital, Abha, Saudi Arabia). In this study, we benchmark our practice guidelines for the prevention of VTE in the ENT surgical population against the ENT.UK guidelines to mitigate any gaps. The ENT guidelines 2010 were downloaded from the ENT.UK website. Our guidelines were compared against them to determine whether our performance meets or falls short of the ENT.UK guidelines. Immediate corrective actions will take place if there is a quality chasm between the two guidelines. The ENT.UK guidelines are evidence-based and up to date, and may serve as a role model for adoption and benchmarking. Our guidelines were accordingly amended to contain all factors required in providing a quality service to ENT surgical patients. Although it is not given appropriate attention, benchmarking is a useful tool for improving the quality of health care. It allows learning from others' practices and experiences, and works towards closing any quality gaps. In addition, benchmarking clinical outcomes is critical for quality improvement and for informing decisions concerning service provision. It is recommended that benchmarking be included on the list of quality improvement methods for healthcare services.
Generation IV benchmarking of TRISO fuel performance models under accident conditions: Modeling input data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collin, Blaise P.

2014-09-01

This document presents the benchmark plan for the calculation of particle fuel performance on safety testing experiments that are representative of operational accidental transients. The benchmark is dedicated to the modeling of fission product release under accident conditions by fuel performance codes from around the world, and the subsequent comparison to post-irradiation experiment (PIE) data from the modeled heating tests. The accident condition benchmark is divided into three parts: the modeling of a simplified benchmark problem to assess potential numerical calculation issues at low fission product release; the modeling of the AGR-1 and HFR-EU1bis safety testing experiments; and the comparison of the AGR-1 and HFR-EU1bis modeling results with PIE data. The simplified benchmark case, hereafter named NCC (Numerical Calculation Case), is derived from "Case 5" of the International Atomic Energy Agency (IAEA) Coordinated Research Program (CRP) on coated particle fuel technology [IAEA 2012]. It is included so participants can evaluate their codes at low fission product release. "Case 5" of the IAEA CRP-6 showed large code-to-code discrepancies in the release of fission products, which were attributed to "effects of the numerical calculation method rather than the physical model" [IAEA 2012]. The NCC is therefore intended to check if these numerical effects subsist. The first two steps imply the involvement of the benchmark participants with a modeling effort following the guidelines and recommendations provided by this document. The third step involves the collection of the modeling results by Idaho National Laboratory (INL) and the comparison of these results with the available PIE data. The objective of this document is to provide all necessary input data to model the benchmark cases, and to give some methodology guidelines and recommendations in order to make all results suitable for comparison with each other. The participants should read this
Performance Comparison of NAMI DANCE and FLOW-3D® Models in Tsunami Propagation, Inundation and Currents using NTHMP Benchmark Problems

NASA Astrophysics Data System (ADS)

Velioglu Sogut, Deniz; Yalciner, Ahmet Cevdet

2018-06-01

Field observations provide valuable data regarding nearshore tsunami impact, yet only in inundation areas where tsunami waves have already flooded. Therefore, tsunami modeling is essential to understand tsunami behavior and prepare for tsunami inundation. It is necessary that all numerical models used in tsunami emergency planning be subject to benchmark tests for validation and verification. This study focuses on two numerical codes, NAMI DANCE and FLOW-3D®, for validation and performance comparison. NAMI DANCE is an in-house tsunami numerical model developed by the Ocean Engineering Research Center of Middle East Technical University, Turkey and Laboratory of Special Research Bureau for Automation of Marine Research, Russia. FLOW-3D® is a general purpose computational fluid dynamics software, which was developed by scientists who pioneered the design of the Volume-of-Fluid technique. The codes are validated and their performances are compared via analytical, experimental and field benchmark problems, which are documented in the "Proceedings and Results of the 2011 National Tsunami Hazard Mitigation Program (NTHMP) Model Benchmarking Workshop" and the "Proceedings and Results of the NTHMP 2015 Tsunami Current Modeling Workshop". The variations between the numerical solutions of these two models are evaluated through statistical error analysis.
Analyzing the BBOB results by means of benchmarking concepts.

PubMed

Mersmann, O; Preuss, M; Trautmann, H; Bischl, B; Weihs, C

2015-01-01

We present methods to answer two basic questions that arise when benchmarking optimization algorithms. The first one is: which algorithm is the "best" one? and the second one is: which algorithm should I use for my real-world problem? Both are connected and neither is easy to answer. We present a theoretical framework for designing and analyzing the raw data of such benchmark experiments. This represents a first step in answering the aforementioned questions. The 2009 and 2010 BBOB benchmark results are analyzed by means of this framework and we derive insight regarding the answers to the two questions. Furthermore, we discuss how to properly aggregate rankings from algorithm evaluations on individual problems into a consensus, its theoretical background and which common pitfalls should be avoided. Finally, we address the grouping of test problems into sets with similar optimizer rankings and investigate whether these are reflected by already proposed test problem characteristics, finding that this is not always the case.
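To make the rank-aggregation step concrete, the sketch below combines per-problem algorithm rankings into a consensus using a Borda count. Borda is used here only as one familiar aggregation rule; the abstract does not say which consensus methods the authors recommend, and the algorithm names are placeholders.

```python
from typing import Dict, List, Sequence


def borda_consensus(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Aggregate rankings (best first) into one consensus order.

    Each ranking awards n-1 points to its first entry, n-2 to the
    second, and so on; ties in total score are broken alphabetically.
    """
    scores: Dict[str, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for position, algorithm in enumerate(ranking):
            scores[algorithm] = scores.get(algorithm, 0) + (n - 1 - position)
    return sorted(scores, key=lambda a: (-scores[a], a))


# Placeholder rankings of three optimizers on three test problems.
per_problem = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
consensus = borda_consensus(per_problem)
```

A positional rule like this can change its ordering of two algorithms when a third is added to or removed from the comparison. That sensitivity is a well-known property of Borda counts, and it illustrates why the choice of aggregation rule needs care.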
Thermo-hydro-mechanical-chemical processes in fractured-porous media: Benchmarks and examples

NASA Astrophysics Data System (ADS)

Kolditz, O.; Shao, H.; Görke, U.; Kalbacher, T.; Bauer, S.; McDermott, C. I.; Wang, W.

2012-12-01

The book comprises an assembly of benchmarks and examples for porous media mechanics collected over the last twenty years. Analysis of thermo-hydro-mechanical-chemical (THMC) processes is essential to many applications in environmental engineering, such as geological waste deposition, geothermal energy utilisation, carbon capture and storage, water resources management, hydrology, and even climate change. In order to assess the feasibility as well as the safety of geotechnical applications, process-based modelling is the only tool for putting numbers on, i.e. quantifying, future scenarios. This places a huge responsibility on the reliability of computational tools. Benchmarking is an appropriate methodology to verify the quality of modelling tools based on best practices. Moreover, benchmarking and code comparison foster community efforts. The benchmark book is part of the OpenGeoSys initiative - an open source project to share knowledge and experience in environmental analysis and scientific computation.
Developing Benchmarks for Solar Radio Bursts

NASA Astrophysics Data System (ADS)

Biesecker, D. A.; White, S. M.; Gopalswamy, N.; Black, C.; Domm, P.; Love, J. J.; Pierson, J.

2016-12-01

Solar radio bursts can interfere with radar, communication, and tracking signals. In severe cases, radio bursts can inhibit the successful use of radio communications and disrupt a wide range of systems that are reliant on Position, Navigation, and Timing services on timescales ranging from minutes to hours across wide areas on the dayside of Earth. The White House's Space Weather Action Plan has asked for solar radio burst intensity benchmarks for an event occurrence frequency of 1 in 100 years and also a theoretical maximum intensity benchmark. The solar radio benchmark team was also asked to define the wavelength/frequency bands of interest. The benchmark team developed preliminary (phase 1) benchmarks for the VHF (30-300 MHz), UHF (300-3000 MHz), GPS (1176-1602 MHz), F10.7 (2800 MHz), and Microwave (4000-20000 MHz) bands. The preliminary benchmarks were derived based on previously published work. Limitations in the published work will be addressed in phase 2 of the benchmark process. In addition, deriving theoretical maxima, where that is even possible, requires additional work in order to meet the Action Plan objectives. In this presentation, we will present the phase 1 benchmarks and the basis used to derive them. We will also present the work that needs to be done in order to complete the final, or phase 2, benchmarks.
FDNS CFD Code Benchmark for RBCC Ejector Mode Operation

NASA Technical Reports Server (NTRS)

Holt, James B.; Ruf, Joe

1999-01-01

Computational Fluid Dynamics (CFD) analysis results are compared with benchmark quality test data from the Propulsion Engineering Research Center's (PERC) Rocket Based Combined Cycle (RBCC) experiments to verify fluid dynamic code and application procedures. RBCC engine flowpath development will rely on CFD applications to capture the multi-dimensional fluid dynamic interactions and to quantify their effect on the RBCC system performance. Therefore, the accuracy of these CFD codes must be determined through detailed comparisons with test data. The PERC experiments build upon the well-known 1968 rocket-ejector experiments of Odegaard and Stroup by employing advanced optical and laser based diagnostics to evaluate mixing and secondary combustion. The Finite Difference Navier Stokes (FDNS) code was used to model the fluid dynamics of the PERC RBCC ejector mode configuration. Analyses were performed for both Diffusion and Afterburning (DAB) and Simultaneous Mixing and Combustion (SMC) test conditions. Results from both the 2D and the 3D models are presented.

Interactive visual optimization and analysis for RFID benchmarking.

PubMed

Wu, Yingcai; Chung, Ka-Kei; Qu, Huamin; Yuan, Xiaoru; Cheung, S C

2009-01-01

Radio frequency identification (RFID) is a powerful automatic remote identification technique that has wide applications. To facilitate RFID deployment, an RFID benchmarking instrument called aGate has been invented to identify the strengths and weaknesses of different RFID technologies in various environments. However, the data acquired by aGate are usually complex time varying multidimensional 3D volumetric data, which are extremely challenging for engineers to analyze. In this paper, we introduce a set of visualization techniques, namely, parallel coordinate plots, orientation plots, a visual history mechanism, and a 3D spatial viewer, to help RFID engineers analyze benchmark data visually and intuitively. With the techniques, we further introduce two workflow procedures (a visual optimization procedure for finding the optimum reader antenna configuration and a visual analysis procedure for comparing the performance and identifying the flaws of RFID devices) for the RFID benchmarking, with focus on the performance analysis of the aGate system. The usefulness and usability of the system are demonstrated in the user evaluation.
Toxicological benchmarks for screening potential contaminants of concern for effects on terrestrial plants: 1994 revision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.; Suter, G.W. II

1994-09-01

One of the initial stages in ecological risk assessment for hazardous waste sites is screening contaminants to determine which of them are worthy of further consideration as contaminants of potential concern. This process is termed contaminant screening. It is performed by comparing measured ambient concentrations of chemicals to benchmark concentrations. Currently, no standard benchmark concentrations exist for assessing contaminants in soil with respect to their toxicity to plants. This report presents a standard method for deriving benchmarks for this purpose (phytotoxicity benchmarks), a set of data concerning effects of chemicals in soil or soil solution on plants, and a set of phytotoxicity benchmarks for 38 chemicals potentially associated with United States Department of Energy (DOE) sites. In addition, background information on the phytotoxicity and occurrence of the chemicals in soils is presented, and literature describing the experiments from which data were drawn for benchmark derivation is reviewed. Chemicals that are found in soil at concentrations exceeding both the phytotoxicity benchmark and the background concentration for the soil type should be considered contaminants of potential concern.
New Multi-group Transport Neutronics (PHISICS) Capabilities for RELAP5-3D and its Application to Phase I of the OECD/NEA MHTGR-350 MW Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom; Cristian Rabiti; Andrea Alfonsi

2012-10-01

PHISICS is a neutronics code system currently under development at the Idaho National Laboratory (INL). Its goal is to provide state-of-the-art simulation capability to reactor designers. The different modules for PHISICS currently under development are a nodal and semi-structured transport core solver (INSTANT), a depletion module (MRTAU) and a cross section interpolation (MIXER) module. The INSTANT module is the most developed of those mentioned above. Basic functionalities are ready to use, but the code is still in continuous development to extend its capabilities. This paper reports on the effort of coupling the nodal kinetics code package PHISICS (INSTANT/MRTAU/MIXER) to the thermal hydraulics system code RELAP5-3D, to enable full core and system modeling. This will make it possible to model coupled (thermal-hydraulics and neutronics) problems with more options for 3D neutron kinetics, compared to the existing diffusion theory neutron kinetics module in RELAP5-3D (NESTLE). In the second part of the paper, an overview of the OECD/NEA MHTGR-350 MW benchmark is given. This benchmark has been approved by the OECD, and is based on the General Atomics 350 MW Modular High Temperature Gas Reactor (MHTGR) design. The benchmark includes coupled neutronics thermal hydraulics exercises that require more capabilities than RELAP5-3D with NESTLE offers. Therefore, the MHTGR benchmark makes extensive use of the new PHISICS/RELAP5-3D coupling capabilities. The paper presents the preliminary results of the three steady state exercises specified in Phase I of the benchmark using PHISICS/RELAP5-3D.
2-D Circulation Control Airfoil Benchmark Experiments Intended for CFD Code Validation

NASA Technical Reports Server (NTRS)

Englar, Robert J.; Jones, Gregory S.; Allan, Brian G.; Lin, John C.

2009-01-01

A current NASA Research Announcement (NRA) project being conducted by Georgia Tech Research Institute (GTRI) personnel and NASA collaborators includes the development of Circulation Control (CC) blown airfoils to improve subsonic aircraft high-lift and cruise performance. The emphasis of this program is the development of CC active flow control concepts for high-lift augmentation, drag control, and cruise efficiency. A collaboration in this project includes work by NASA research engineers, whereas CFD validation and flow physics experimental research are part of NASA's systematic approach to developing design and optimization tools for CC applications to fixed-wing aircraft. The design space for CESTOL type aircraft is focusing on geometries that depend on advanced flow control technologies that include Circulation Control aerodynamics. The ability to consistently predict advanced aircraft performance requires improvements in design tools to include these advanced concepts. Validation of these tools will be based on experimental methods applied to complex flows that go beyond conventional aircraft modeling techniques. This paper focuses on recent/ongoing benchmark high-lift experiments and CFD efforts intended to provide 2-D CFD validation data sets related to NASA's Cruise Efficient Short Take Off and Landing (CESTOL) study. Both the experimental data and related CFD predictions are discussed.
Integral Full Core Multi-Physics PWR Benchmark with Measured Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Forget, Benoit; Smith, Kord; Kumar, Shikhar

In recent years, the importance of modeling and simulation has been highlighted extensively in the DOE research portfolio with concrete examples in nuclear engineering with the CASL and NEAMS programs. These research efforts and similar efforts worldwide aim at the development of high-fidelity multi-physics analysis tools for the simulation of current and next-generation nuclear power reactors. As with all analysis tools, verification and validation are essential to guarantee proper functioning of the software and methods employed. The current approach relies mainly on the validation of single-physics phenomena (e.g. critical experiments, flow loops, etc.) and there is a lack of relevant multiphysics benchmark measurements that are necessary to validate high-fidelity methods being developed today. This work introduces a new multi-cycle full-core Pressurized Water Reactor (PWR) depletion benchmark based on two operational cycles of a commercial nuclear power plant that provides a detailed description of fuel assemblies, burnable absorbers, in-core fission detectors, core loading and re-loading patterns. This benchmark enables analysts to develop extremely detailed reactor core models that can be used for testing and validation of coupled neutron transport, thermal-hydraulics, and fuel isotopic depletion. The benchmark also provides measured reactor data for Hot Zero Power (HZP) physics tests, boron letdown curves, and three-dimensional in-core flux maps from 58 instrumented assemblies. The benchmark description is now available online and has been used by many groups. However, much work remains to be done on the quantification of uncertainties and modeling sensitivities. This work aims to address these deficiencies and make this benchmark a true non-proprietary international benchmark for the validation of high-fidelity tools. This report details the BEAVRS uncertainty quantification for the first two cycles of operations and serves as the final report of the
Internal Quality Assurance Benchmarking. ENQA Workshop Report 20

ERIC Educational Resources Information Center

Blackstock, Douglas; Burquel, Nadine; Comet, Nuria; Kajaste, Matti; dos Santos, Sergio Machado; Marcos, Sandra; Moser, Marion; Ponds, Henri; Scheuthle, Harald; Sixto, Luis Carlos Velon

2012-01-01

The Internal Quality Assurance group of ENQA (IQA Group) has been organising a yearly seminar for its members since 2007. The main objective is to share experiences concerning the internal quality assurance of work processes in the participating agencies. The overarching theme of the 2011 seminar was how to use benchmarking as a tool for…
[Benchmarking in patient identification: An opportunity to learn].

PubMed

Salazar-de-la-Guerra, R M; Santotomás-Pajarrón, A; González-Prieto, V; Menéndez-Fraga, M D; Rocha Hurtado, C

To perform a benchmarking exercise on the safe identification of hospital patients involved in the "Club de las tres C" (Calidez, Calidad y Cuidados) in order to prepare a common procedure for this process. A descriptive study was conducted on the patient identification process in palliative care and stroke units in 5 medium-stay hospitals. The following steps were carried out: data collection from each hospital; organisation and data analysis; and preparation of a common procedure for this process. The data obtained for the safe identification of all stroke patients were: hospital 1 (93%), hospital 2 (93.1%), hospital 3 (100%), and hospital 5 (93.4%), and for the palliative care process: hospital 1 (93%), hospital 2 (92.3%), hospital 3 (92%), hospital 4 (98.3%), and hospital 5 (85.2%). The aim of the study has been accomplished successfully. Benchmarking activities have been developed and knowledge on the patient identification process has been shared. All hospitals had good results. Hospital 3 was best in the stroke identification process. Benchmarking patient identification is difficult, but a useful common procedure that collects the best practices has been identified among the 5 hospitals. Copyright © 2017 SECA. Published by Elsevier España, S.L.U. All rights reserved.
HyspIRI Low Latency Concept and Benchmarks

NASA Technical Reports Server (NTRS)

Mandl, Dan

2010-01-01

Topics include HyspIRI low latency data ops concept, HyspIRI data flow, ongoing efforts, experiment with Web Coverage Processing Service (WCPS) approach to injecting new algorithms into SensorWeb, low fidelity HyspIRI IPM testbed, compute cloud testbed, open cloud testbed environment, Global Lambda Integrated Facility (GLIF) and OCC collaboration with Starlight, delay tolerant network (DTN) protocol benchmarking, and EO-1 configuration for preliminary DTN prototype.
Modelling solute dispersion in periodic heterogeneous porous media: Model benchmarking against intermediate scale experiments

NASA Astrophysics Data System (ADS)

Majdalani, Samer; Guinot, Vincent; Delenne, Carole; Gebran, Hicham

2018-06-01

This paper is devoted to theoretical and experimental investigations of solute dispersion in heterogeneous porous media. Dispersion in heterogeneous porous media has been reported to be scale-dependent, a likely indication that the proposed dispersion models are incompletely formulated. A high quality experimental data set of breakthrough curves in periodic model heterogeneous porous media is presented. In contrast with most previously published experiments, the present experiments involve numerous replicates. This allows the statistical variability of experimental data to be accounted for. Several models are benchmarked against the data set: the Fickian-based advection-dispersion, mobile-immobile, multirate, multiple region advection dispersion models, and a newly proposed transport model based on pure advection. A salient property of the latter model is that its solutions exhibit a ballistic behaviour for small times, while tending to the Fickian behaviour for large time scales. Model performance is assessed using a novel objective function accounting for the statistical variability of the experimental data set, while putting equal emphasis on both small and large time scale behaviours. Besides being as accurate as the other models, the new purely advective model has the advantages that (i) it does not exhibit the undesirable effects associated with the usual Fickian operator (namely the infinite solute front propagation speed), and (ii) it allows dispersive transport to be simulated on every heterogeneity scale using scale-independent parameters.
Benchmarking expert system tools

NASA Technical Reports Server (NTRS)

Riley, Gary

1988-01-01

As part of its evaluation of new technologies, the Artificial Intelligence Section of the Mission Planning and Analysis Div. at NASA-Johnson has made timing tests of several expert system building tools. Among the production systems tested were Automated Reasoning Tool, several versions of OPS5, and CLIPS (C Language Integrated Production System), an expert system builder developed by the AI section. Also included in the test were a Zetalisp version of the benchmark along with four versions of the benchmark written in Knowledge Engineering Environment, an object oriented, frame based expert system tool. The benchmarks used for testing are studied.
Closed-Loop Neuromorphic Benchmarks

PubMed Central

Stewart, Terrence C.; DeWolf, Travis; Kleinhans, Ashley; Eliasmith, Chris

2015-01-01

Evaluating the effectiveness and performance of neuromorphic hardware is difficult. It is even more difficult when the task of interest is a closed-loop task; that is, a task where the output from the neuromorphic hardware affects some environment, which then in turn affects the hardware's future input. However, closed-loop situations are one of the primary potential uses of neuromorphic hardware. To address this, we present a methodology for generating closed-loop benchmarks that makes use of a hybrid of real physical embodiment and a type of “minimal” simulation. Minimal simulation has been shown to lead to robust real-world performance, while still maintaining the practical advantages of simulation, such as making it easy for the same benchmark to be used by many researchers. This method is flexible enough to allow researchers to explicitly modify the benchmarks to identify specific task domains where particular hardware excels. To demonstrate the method, we present a set of novel benchmarks that focus on motor control for an arbitrary system with unknown external forces. Using these benchmarks, we show that an error-driven learning rule can consistently improve motor control performance across a randomly generated family of closed-loop simulations, even when there are up to 15 interacting joints to be controlled. PMID:26696820
Benchmark gas core critical experiment.

NASA Technical Reports Server (NTRS)

Kunze, J. F.; Lofthouse, J. H.; Cooper, C. G.; Hyland, R. E.

1972-01-01

A critical experiment with spherical symmetry has been conducted on the gas core nuclear reactor concept. The nonspherical perturbations in the experiment were evaluated experimentally and produce corrections to the observed eigenvalue of approximately 1% delta k. The reactor consisted of a low density, central uranium hexafluoride gaseous core, surrounded by an annulus of void or low density hydrocarbon, which in turn was surrounded with a 97-cm-thick heavy water reflector.
Benchmarking of Neutron Production of Heavy-Ion Transport Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required.
GEN-IV Benchmarking of Triso Fuel Performance Models under accident conditions modeling input data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collin, Blaise Paul

This document presents the benchmark plan for the calculation of particle fuel performance on safety testing experiments that are representative of operational accidental transients. The benchmark is dedicated to the modeling of fission product release under accident conditions by fuel performance codes from around the world, and the subsequent comparison to post-irradiation experiment (PIE) data from the modeled heating tests. The accident condition benchmark is divided into three parts: • The modeling of a simplified benchmark problem to assess potential numerical calculation issues at low fission product release. • The modeling of the AGR-1 and HFR-EU1bis safety testing experiments. • The comparison of the AGR-1 and HFR-EU1bis modeling results with PIE data. The simplified benchmark case, thereafter named NCC (Numerical Calculation Case), is derived from “Case 5” of the International Atomic Energy Agency (IAEA) Coordinated Research Program (CRP) on coated particle fuel technology [IAEA 2012]. It is included so participants can evaluate their codes at low fission product release. “Case 5” of the IAEA CRP-6 showed large code-to-code discrepancies in the release of fission products, which were attributed to “effects of the numerical calculation method rather than the physical model” [IAEA 2012]. The NCC is therefore intended to check if these numerical effects subsist. The first two steps imply the involvement of the benchmark participants with a modeling effort following the guidelines and recommendations provided by this document. The third step involves the collection of the modeling results by Idaho National Laboratory (INL) and the comparison of these results with the available PIE data. The objective of this document is to provide all necessary input data to model the benchmark cases, and to give some methodology guidelines and recommendations in order to make all results suitable for comparison with each other. The participants
Research on computer systems benchmarking

NASA Technical Reports Server (NTRS)

Smith, Alan Jay (Principal Investigator)

1996-01-01

This grant addresses the topic of research on computer systems benchmarking and is more generally concerned with performance issues in computer systems. This report reviews work in those areas during the period of NASA support under this grant. The bulk of the work performed concerned benchmarking and analysis of CPUs, compilers, caches, and benchmark programs. The first part of this work concerned the issue of benchmark performance prediction. A new approach to benchmarking and machine characterization was reported, using a machine characterizer that measures the performance of a given system in terms of a Fortran abstract machine. Another report focused on analyzing compiler performance. The performance impact of optimization in the context of our methodology for CPU performance characterization was based on the abstract machine model. Benchmark programs are analyzed in another paper. A machine-independent model of program execution was developed to characterize both machine performance and program execution. By merging these machine and program characterizations, execution time can be estimated for arbitrary machine/program combinations. The work was continued into the domain of parallel and vector machines, including the issue of caches in vector processors and multiprocessors. All of the afore-mentioned accomplishments are more specifically summarized in this report, as well as those smaller in magnitude supported by this grant.
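The idea of merging a machine characterization with a program characterization can be illustrated with a minimal sketch. It assumes the simplest linear form of such a model, in which estimated run time is the sum over abstract operations of (operation count × time per operation). The operation names, timings, and counts below are hypothetical placeholders, not values from the report, and the report's actual model may include terms this sketch omits.

```python
# Hypothetical per-operation times in seconds for one machine
# (the "machine characterization").
machine_profile = {"fadd": 2.0e-9, "fmul": 3.0e-9, "load": 1.5e-9, "branch": 1.0e-9}

# Hypothetical dynamic operation counts for one program
# (the "program characterization").
program_profile = {"fadd": 4_000_000, "fmul": 2_000_000, "load": 6_000_000, "branch": 1_000_000}


def estimate_execution_time(machine, program):
    """Estimate run time as the sum of count * time over abstract operations."""
    missing = set(program) - set(machine)
    if missing:
        raise KeyError(f"machine profile lacks timings for: {sorted(missing)}")
    return sum(count * machine[op] for op, count in program.items())


estimated = estimate_execution_time(machine_profile, program_profile)
print(f"estimated execution time: {estimated:.3f} s")
```

Because the two profiles are measured independently, any machine profile can be paired with any program profile, which is what makes estimates for arbitrary machine/program combinations possible in this kind of model.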
Model Prediction Results for 2007 Ultrasonic Benchmark Problems

NASA Astrophysics Data System (ADS)

Kim, Hak-Joon; Song, Sung-Jin

2008-02-01

The World Federation of NDE Centers (WFNDEC) has addressed two types of problems for the 2007 ultrasonic benchmark problems: prediction of side-drilled hole responses with 45° and 60° refracted shear waves, and effects of surface curvatures on the ultrasonic responses of flat-bottomed holes. To solve this year's ultrasonic benchmark problems, we applied multi-Gaussian beam models for calculation of ultrasonic beam fields, and the Kirchhoff approximation and the separation of variables method for calculation of far-field scattering amplitudes of flat-bottomed holes and side-drilled holes, respectively. In this paper, we present comparison results of model predictions to experiments for side-drilled holes and discuss the effect of interface curvatures on ultrasonic responses by comparison of peak-to-peak amplitudes of flat-bottomed hole responses with different sizes and interface curvatures.
Qualification of CASMO5 / SIMULATE-3K against the SPERT-III E-core cold start-up experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grandi, G.; Moberg, L.

SIMULATE-3K is a three-dimensional kinetic code applicable to LWR Reactivity Initiated Accidents. S3K has been used to calculate several internationally recognized benchmarks. However, the feedback models in the benchmark exercises are different from the feedback models that SIMULATE-3K uses for LWR reactors. For this reason, it is worth comparing the SIMULATE-3K capabilities for Reactivity Initiated Accidents against kinetic experiments. The Special Power Excursion Reactor Test III was a pressurized-water, nuclear-research facility constructed to analyze the reactor kinetic behavior under initial conditions similar to those of commercial LWRs. The SPERT III E-core resembles a PWR in terms of fuel type, moderator, coolant flow rate, and system pressure. The initial test conditions (power, core flow, system pressure, core inlet temperature) are representative of cold start-up, hot start-up, hot standby, and hot full power. The qualification of S3K against the SPERT III E-core measurements is ongoing work at Studsvik. In this paper, the results for the 30 cold start-up tests are presented. The results show good agreement with the experiments for the main parameters of a reactivity initiated accident: peak power, energy release and compensated reactivity. Predicted and measured peak powers differ at most by 13%. Measured and predicted reactivity compensations at the time of the peak power differ by less than 0.01 $. Predicted and measured energy release differ at most by 13%. All differences are within the experimental uncertainty. (authors)
Benchmarking computational fluid dynamics models of lava flow simulation for hazard assessment, forecasting, and risk management

USGS Publications Warehouse

Dietterich, Hannah; Lev, Einat; Chen, Jiangzhi; Richardson, Jacob A.; Cashman, Katharine V.

2017-01-01

Numerical simulations of lava flow emplacement are valuable for assessing lava flow hazards, forecasting active flows, designing flow mitigation measures, interpreting past eruptions, and understanding the controls on lava flow behavior. Existing lava flow models vary in simplifying assumptions, physics, dimensionality, and the degree to which they have been validated against analytical solutions, experiments, and natural observations. In order to assess existing models and guide the development of new codes, we conduct a benchmarking study of computational fluid dynamics (CFD) models for lava flow emplacement, including VolcFlow, OpenFOAM, FLOW-3D, COMSOL, and MOLASSES. We model viscous, cooling, and solidifying flows over horizontal planes, sloping surfaces, and into topographic obstacles. We compare model results to physical observations made during well-controlled analogue and molten basalt experiments, and to analytical theory when available. Overall, the models accurately simulate viscous flow with some variability in flow thickness where flows intersect obstacles. OpenFOAM, COMSOL, and FLOW-3D can each reproduce experimental measurements of cooling viscous flows, and OpenFOAM and FLOW-3D simulations with temperature-dependent rheology match results from molten basalt experiments. We assess the goodness-of-fit of the simulation results and the computational cost. Our results guide the selection of numerical simulation codes for different applications, including inferring emplacement conditions of past lava flows, modeling the temporal evolution of ongoing flows during eruption, and probabilistic assessment of lava flow hazard prior to eruption. Finally, we outline potential experiments and desired key observational data from future flows that would extend existing benchmarking data sets.
Making Benchmark Testing Work

ERIC Educational Resources Information Center

Herman, Joan L.; Baker, Eva L.

2005-01-01

Many schools are moving to develop benchmark tests to monitor their students' progress toward state standards throughout the academic year. Benchmark tests can provide the ongoing information that schools need to guide instructional programs and to address student learning problems. The authors discuss six criteria that educators can use to…
HS06 Benchmark for an ARM Server

NASA Astrophysics Data System (ADS)

Kluth, Stefan

2014-06-01

We benchmarked an ARM Cortex-A9 based server system with a four-core CPU running at 1.1 GHz. The system used Ubuntu 12.04 as its operating system, and the HEPSPEC 2006 (HS06) benchmarking suite was compiled natively with gcc-4.4 on the system. The benchmark was run for various settings of the relevant gcc compiler options. We did not find significant influence from the compiler options on the benchmark result. The final HS06 benchmark result is 10.4.

NAS Grid Benchmarks. 1.0

NASA Technical Reports Server (NTRS)

VanderWijngaart, Rob; Frumkin, Michael; Biegel, Bryan A. (Technical Monitor)

2002-01-01

We provide a paper-and-pencil specification of a benchmark suite for computational grids. It is based on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks (NPB) and is called the NAS Grid Benchmarks (NGB). NGB problems are presented as data flow graphs encapsulating an instance of a slightly modified NPB task in each graph node, which communicates with other nodes by sending/receiving initialization data. Like NPB, NGB specifies several different classes (problem sizes). In this report we describe classes S, W, and A, and provide verification values for each. The implementor has the freedom to choose any language, grid environment, security model, fault tolerance/error correction mechanism, etc., as long as the resulting implementation passes the verification test and reports the turnaround time of the benchmark.
The Army Pollution Prevention Program: Improving Performance Through Benchmarking.

DTIC Science & Technology

1995-06-01

This report investigates the feasibility of using benchmarking as a method for...could use to determine to what degree it should integrate benchmarking with other quality management tools to support the pollution prevention program
RELAP5-3D Results for Phase I (Exercise 2) of the OECD/NEA MHTGR-350 MW Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom

2012-06-01

The coupling of the PHISICS code suite to the thermal hydraulics system code RELAP5-3D has recently been initiated at the Idaho National Laboratory (INL) to provide a fully coupled prismatic Very High Temperature Reactor (VHTR) system modeling capability as part of the NGNP methods development program. The PHISICS code consists of three modules: INSTANT (performing 3D nodal transport core calculations), MRTAU (depletion and decay heat generation) and a perturbation/mixer module. As part of the verification and validation activities, steady state results have been obtained for Exercise 2 of Phase I of the newly-defined OECD/NEA MHTGR-350 MW Benchmark. This exercise requires participants to calculate a steady-state solution for an End of Equilibrium Cycle 350 MW Modular High Temperature Reactor (MHTGR), using the provided geometry, material, and coolant bypass flow description. The paper provides an overview of the MHTGR Benchmark and presents typical steady state results (e.g. solid and gas temperatures, thermal conductivities) for Phase I Exercise 2. Preliminary results are also provided for the early test phase of Exercise 3 using a two-group cross-section library and the Relap5-3D model developed for Exercise 2.
RELAP5-3D results for phase I (Exercise 2) of the OECD/NEA MHTGR-350 MW benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strydom, G.; Epiney, A. S.

2012-07-01

The coupling of the PHISICS code suite to the thermal hydraulics system code RELAP5-3D has recently been initiated at the Idaho National Laboratory (INL) to provide a fully coupled prismatic Very High Temperature Reactor (VHTR) system modeling capability as part of the NGNP methods development program. The PHISICS code consists of three modules: INSTANT (performing 3D nodal transport core calculations), MRTAU (depletion and decay heat generation) and a perturbation/mixer module. As part of the verification and validation activities, steady state results have been obtained for Exercise 2 of Phase I of the newly-defined OECD/NEA MHTGR-350 MW Benchmark. This exercise requires participants to calculate a steady-state solution for an End of Equilibrium Cycle 350 MW Modular High Temperature Reactor (MHTGR), using the provided geometry, material, and coolant bypass flow description. The paper provides an overview of the MHTGR Benchmark and presents typical steady state results (e.g. solid and gas temperatures, thermal conductivities) for Phase I Exercise 2. Preliminary results are also provided for the early test phase of Exercise 3 using a two-group cross-section library and the Relap5-3D model developed for Exercise 2. (authors)
Length of stay benchmarks for inpatient rehabilitation after stroke.

PubMed

Meyer, Matthew; Britt, Eileen; McHale, Heather A; Teasell, Robert

2012-01-01

In Canada, no standardized benchmarks for length of stay (LOS) have been established for post-stroke inpatient rehabilitation. This paper describes the development of a severity specific median length of stay benchmarking strategy, assessment of its impact after one year of implementation in a Canadian rehabilitation hospital, and establishment of updated benchmarks that may be useful for comparison with other facilities across Canada. Patient data were retrospectively assessed for all patients admitted to a single post-acute stroke rehabilitation unit in Ontario, Canada between April 2005 and March 2008. Rehabilitation Patient Groups (RPGs) were used to establish stratified median length of stay benchmarks for each group that were incorporated into team rounds beginning in October 2009. Benchmark impact was assessed using mean LOS, FIM(®) gain, and discharge destination for each RPG group, collected prospectively for one year, compared against similar information from the previous calendar year. Benchmarks were then adjusted accordingly for future use. Between October 2009 and September 2010, a significant reduction in average LOS was noted compared to the previous year (35.3 vs. 41.2 days; p < 0.05). Reductions in LOS were noted in each RPG group including statistically significant reductions in 4 of the 7 groups. As intended, reductions in LOS were achieved with no significant reduction in mean FIM(®) gain or proportion of patients discharged home compared to the previous year. Adjusted benchmarks for LOS ranged from 13 to 48 days depending on the RPG group. After a single year of implementation, severity specific benchmarks helped the rehabilitation team reduce LOS while maintaining the same levels of functional gain and achieving the same rate of discharge to the community. © 2012 Informa UK, Ltd.
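The stratified-median approach described above can be sketched in a few lines. This is a minimal illustration, assuming each patient record is reduced to a (group label, length of stay in days) pair; the group labels and LOS values are hypothetical and are not data from the study.

```python
from collections import defaultdict
from statistics import median


def los_benchmarks(records):
    """Return the median length of stay (days) for each severity group."""
    by_group = defaultdict(list)
    for group, los_days in records:
        by_group[group].append(los_days)
    return {group: median(values) for group, values in by_group.items()}


# Hypothetical retrospective records: (severity group, LOS in days).
records = [
    ("group_A", 50), ("group_A", 46), ("group_A", 48),
    ("group_B", 12), ("group_B", 14),
]

benchmarks = los_benchmarks(records)
for group, days in sorted(benchmarks.items()):
    print(f"{group}: median LOS benchmark = {days} days")
```

The median is less sensitive than the mean to a few very long stays, which is one plausible reason to prefer it when setting a target per severity group.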
Benchmark On Sensitivity Calculation (Phase III)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ivanova, Tatiana; Laville, Cedric; Dyrda, James

2012-01-01

The sensitivities of the keff eigenvalue to neutron cross sections have become commonly used in similarity studies and as part of the validation algorithm for criticality safety assessments. To test calculations of the sensitivity coefficients, a benchmark study (Phase III) has been established by the OECD-NEA/WPNCS/EG UACSA (Expert Group on Uncertainty Analysis for Criticality Safety Assessment). This paper presents some sensitivity results generated by the benchmark participants using various computational tools based upon different computational methods: SCALE/TSUNAMI-3D and -1D, MONK, APOLLO2-MORET 5, DRAGON-SUSD3D and MMKKENO. The study demonstrates the performance of the tools. It also illustrates how model simplifications impact the sensitivity results and demonstrates the importance of 'implicit' (self-shielding) sensitivities. This work has been a useful step towards verification of the existing and developed sensitivity analysis methods.
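For orientation, the sensitivity coefficient of keff to a cross section is conventionally defined as the relative change in keff per relative change in that cross section. The notation below is the common textbook form and is an assumption here; the abstract does not state which definition or symbols the benchmark specification uses.

```latex
% Relative sensitivity of k_eff to the cross section \sigma_x
% (reaction x of a given nuclide, optionally in a given energy group)
S_{k,\sigma_x} \;=\; \frac{\partial k_{\mathrm{eff}} / k_{\mathrm{eff}}}{\partial \sigma_x / \sigma_x}
\;=\; \frac{\sigma_x}{k_{\mathrm{eff}}}\,\frac{\partial k_{\mathrm{eff}}}{\partial \sigma_x}
```

Under this definition, a coefficient of 0.1 means that a 1% increase in the cross section raises keff by roughly 0.1%.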
Benchmarking Is Associated With Improved Quality of Care in Type 2 Diabetes

PubMed Central

Hermans, Michel P.; Elisaf, Moses; Michel, Georges; Muls, Erik; Nobels, Frank; Vandenberghe, Hans; Brotons, Carlos

2013-01-01

OBJECTIVE To assess prospectively the effect of benchmarking on quality of primary care for patients with type 2 diabetes by using three major modifiable cardiovascular risk factors as critical quality indicators. RESEARCH DESIGN AND METHODS Primary care physicians treating patients with type 2 diabetes in six European countries were randomized to give standard care (control group) or standard care with feedback benchmarked against other centers in each country (benchmarking group). In both groups, laboratory tests were performed every 4 months. The primary end point was the percentage of patients achieving preset targets of the critical quality indicators HbA1c, LDL cholesterol, and systolic blood pressure (SBP) after 12 months of follow-up. RESULTS Of 4,027 patients enrolled, 3,996 patients were evaluable and 3,487 completed 12 months of follow-up. Primary end point of HbA1c target was achieved in the benchmarking group by 58.9 vs. 62.1% in the control group (P = 0.398) after 12 months; 40.0 vs. 30.1% patients met the SBP target (P < 0.001); 54.3 vs. 49.7% met the LDL cholesterol target (P = 0.006). Percentages of patients meeting all three targets increased during the study in both groups, with a statistically significant increase observed in the benchmarking group. The percentage of patients achieving all three targets at month 12 was significantly larger in the benchmarking group than in the control group (12.5 vs. 8.1%; P < 0.001). CONCLUSIONS In this prospective, randomized, controlled study, benchmarking was shown to be an effective tool for increasing achievement of critical quality indicators and potentially reducing patient cardiovascular residual risk profile. PMID:23846810
The Zoo, Benchmarks & You: How To Reach the Oregon State Benchmarks with Zoo Resources.

ERIC Educational Resources Information Center

2002

This document aligns Oregon state educational benchmarks and standards with Oregon Zoo resources. Benchmark areas examined include English, mathematics, science, social studies, and career and life roles. Brief descriptions of the programs offered by the zoo are presented. (SOE)
The Concepts "Benchmarks and Benchmarking" Used in Education Planning: Teacher Education as Example

ERIC Educational Resources Information Center

Steyn, H. J.

2015-01-01

Planning in education is a structured activity that includes several phases and steps that take into account several kinds of information (Steyn, Steyn, De Waal & Wolhuter, 2002: 146). One of the sets of information that are usually considered is the (so-called) "benchmarks" and "benchmarking" regarding the focus of a…
Kohn-Sham Band Structure Benchmark Including Spin-Orbit Coupling for 2D and 3D Solids

NASA Astrophysics Data System (ADS)

Huhn, William; Blum, Volker

2015-03-01

Accurate electronic band structures serve as a primary indicator of the suitability of a material for a given application, e.g., as electronic or catalytic materials. Computed band structures, however, are subject to a host of approximations, some of which are more obvious (e.g., the treatment of the exchange-correlation or self-energy) and others less obvious (e.g., the treatment of core, semicore, or valence electrons, handling of relativistic effects, or the accuracy of the underlying basis set used). We here provide a set of accurate Kohn-Sham band structure benchmarks, using the numeric atom-centered all-electron electronic structure code FHI-aims combined with the "traditional" PBE functional and the hybrid HSE functional, to calculate core, valence, and low-lying conduction bands of a set of 2D and 3D materials. Benchmarks are provided with and without effects of spin-orbit coupling, using quasi-degenerate perturbation theory to predict spin-orbit splittings. This work is funded by Fritz-Haber-Institut der Max-Planck-Gesellschaft.
Translational benchmark risk analysis

PubMed Central

Piegorsch, Walter W.

2010-01-01

Translational development – in the sense of translating a mature methodology from one area of application to another, evolving area – is discussed for the use of benchmark doses in quantitative risk assessment. Illustrations are presented with traditional applications of the benchmark paradigm in biology and toxicology, and also with risk endpoints that differ from traditional toxicological archetypes. It is seen that the benchmark approach can apply to a diverse spectrum of risk management settings. This suggests a promising future for this important risk-analytic tool. Extensions of the method to a wider variety of applications represent a significant opportunity for enhancing environmental, biomedical, industrial, and socio-economic risk assessments. PMID:20953283
A benchmark study of the sea-level equation in GIA modelling

NASA Astrophysics Data System (ADS)

Martinec, Zdenek; Klemann, Volker; van der Wal, Wouter; Riva, Riccardo; Spada, Giorgio; Simon, Karen; Blank, Bas; Sun, Yu; Melini, Daniele; James, Tom; Bradley, Sarah

2017-04-01

The sea-level load in glacial isostatic adjustment (GIA) is described by the so called sea-level equation (SLE), which represents the mass redistribution between ice sheets and oceans on a deforming earth. Various levels of complexity of SLE have been proposed in the past, ranging from a simple mean global sea level (the so-called eustatic sea level) to the load with a deforming ocean bottom, migrating coastlines and a changing shape of the geoid. Several approaches to solve the SLE have been derived, from purely analytical formulations to fully numerical methods. Despite various teams independently investigating GIA, there has been no systematic intercomparison amongst the solvers through which the methods may be validated. The goal of this paper is to present a series of benchmark experiments designed for testing and comparing numerical implementations of the SLE. Our approach starts with simple load cases even though the benchmark will not result in GIA predictions for a realistic loading scenario. In the longer term we aim for a benchmark with a realistic loading scenario, and also for benchmark solutions with rotational feedback. The current benchmark uses an earth model for which Love numbers have been computed and benchmarked in Spada et al (2011). In spite of the significant differences in the numerical methods employed, the test computations performed so far show a satisfactory agreement between the results provided by the participants. The differences found can often be attributed to the different approximations inherent to the various algorithms. Literature G. Spada, V. R. Barletta, V. Klemann, R. E. M. Riva, Z. Martinec, P. Gasperini, B. Lund, D. Wolf, L. L. A. Vermeersen, and M. A. King, 2011. A benchmark study for glacial isostatic adjustment codes. Geophys. J. Int. 185: 106-132 doi:10.1111/j.1365-
Benchmarking of neutron production of heavy-ion transport codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, I.; Ronningen, R. M.; Heilbronn, L.

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required. (authors)
Benchmarking an Unstructured-Grid Model for Tsunami Current Modeling

NASA Astrophysics Data System (ADS)

Zhang, Yinglong J.; Priest, George; Allan, Jonathan; Stimely, Laura

2016-12-01

We present model results derived from a tsunami current benchmarking workshop held by the NTHMP (National Tsunami Hazard Mitigation Program) in February 2015. Modeling was undertaken using our own 3D unstructured-grid model that has been previously certified by the NTHMP for tsunami inundation. Results for two benchmark tests are described here, including: (1) vortex structure in the wake of a submerged shoal and (2) impact of tsunami waves on Hilo Harbor in the 2011 Tohoku event. The modeled current velocities are compared with available lab and field data. We demonstrate that the model is able to accurately capture the velocity field in the two benchmark tests; in particular, the 3D model gives a much more accurate wake structure than the 2D model for the first test, with the root-mean-square error and mean bias no more than 2 cm/s and 8 mm/s, respectively, for the modeled velocity.
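The two error metrics quoted above, root-mean-square error and mean bias, can be computed as in the following minimal sketch. It assumes modeled and observed velocities have already been paired at the same locations and times; the sample values are hypothetical and are not taken from the workshop data.

```python
import math


def rmse_and_bias(modeled, observed):
    """Return (RMSE, mean bias) of modeled values against observations."""
    if len(modeled) != len(observed) or not modeled:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [m - o for m, o in zip(modeled, observed)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return rmse, bias


# Hypothetical current speeds in m/s at four gauge samples.
modeled = [0.10, 0.22, 0.31, 0.38]
observed = [0.12, 0.20, 0.30, 0.40]

rmse, bias = rmse_and_bias(modeled, observed)
print(f"RMSE = {rmse * 100:.2f} cm/s, mean bias = {bias * 1000:.2f} mm/s")
```

Mean bias shows whether the model systematically over- or under-predicts, while RMSE also penalizes scatter, so the two are usually reported together.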
Thermal Performance Benchmarking: Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feng, Xuhui

In FY16, the thermal performance of the 2014 Honda Accord Hybrid power electronics thermal management systems was benchmarked. Both experiments and numerical simulation were utilized to thoroughly study the thermal resistances and temperature distribution in the power module. Experimental results obtained from the water-ethylene glycol tests provided the junction-to-liquid thermal resistance. The finite element analysis (FEA) and computational fluid dynamics (CFD) models were found to yield a good match with experimental results. Both experimental and modeling results demonstrate that the passive stack is the dominant thermal resistance for both the motor and power electronics systems. The 2014 Accord power electronics systems yield steady-state thermal resistance values around 42-50 mm²·K/W, depending on the flow rates. At a typical flow rate of 10 liters per minute, the thermal resistance of the Accord system was found to be about 44 percent lower than that of the 2012 Nissan LEAF system that was benchmarked in FY15. The main reason for the difference is that the Accord power module used a metalized-ceramic substrate and eliminated the thermal interface material layers. FEA models were developed to study the transient performance of 2012 Nissan LEAF, 2014 Accord, and two other systems that feature conventional power module designs. The simulation results indicate that the 2012 LEAF power module has the lowest thermal impedance at a time scale less than one second. This is probably due to moving low thermally conductive materials further away from the heat source and enhancing the heat spreading effect from the copper-molybdenum plate close to the insulated gate bipolar transistors. When approaching steady state, the Honda system shows lower thermal impedance. Measurement results of the thermal resistance of the 2015 BMW i3 power electronic system indicate that the i3 insulated gate bipolar transistor module has significantly lower junction
Medical school benchmarking - from tools to programmes.

PubMed

Wilkinson, Tim J; Hudson, Judith N; Mccoll, Geoffrey J; Hu, Wendy C Y; Jolly, Brian C; Schuwirth, Lambert W T

2015-02-01

Benchmarking among medical schools is essential, but may result in unwanted effects. To apply a conceptual framework to selected benchmarking activities of medical schools. We present an analogy between the effects of assessment on student learning and the effects of benchmarking on medical school educational activities. A framework by which benchmarking can be evaluated was developed and applied to key current benchmarking activities in Australia and New Zealand. The analogy generated a conceptual framework that tested five questions to be considered in relation to benchmarking: what is the purpose? what are the attributes of value? what are the best tools to assess the attributes of value? what happens to the results? and, what is the likely "institutional impact" of the results? If the activities were compared against a blueprint of desirable medical graduate outcomes, notable omissions would emerge. Medical schools should benchmark their performance on a range of educational activities to ensure quality improvement and to assure stakeholders that standards are being met. Although benchmarking potentially has positive benefits, it could also result in perverse incentives with unforeseen and detrimental effects on learning if it is undertaken using only a few selected assessment tools.
INTEGRAL BENCHMARK DATA FOR NUCLEAR DATA TESTING THROUGH THE ICSBEP AND THE NEWLY ORGANIZED IRPHEP

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. Blair Briggs; Lori Scott; Yolanda Rugama

The status of the International Criticality Safety Benchmark Evaluation Project (ICSBEP) was last reported in a nuclear data conference at the International Conference on Nuclear Data for Science and Technology, ND-2004, in Santa Fe, New Mexico. Since that time the number and type of integral benchmarks have increased significantly. Included in the ICSBEP Handbook are criticality-alarm / shielding and fundamental physics benchmarks in addition to the traditional critical / subcritical benchmark data. Since ND-2004, a reactor physics counterpart to the ICSBEP, the International Reactor Physics Experiment Evaluation Project (IRPhEP), was initiated. The IRPhEP is patterned after the ICSBEP, but focuses on other integral measurements, such as buckling, spectral characteristics, reactivity effects, reactivity coefficients, kinetics measurements, reaction-rate and power distributions, nuclide compositions, and other miscellaneous-type measurements in addition to the critical configuration. The status of these two projects is discussed and selected benchmarks highlighted in this paper.
HPC Analytics Support. Requirements for Uncertainty Quantification Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paulson, Patrick R.; Purohit, Sumit; Rodriguez, Luke R.

2015-05-01

This report outlines techniques for extending benchmark generation products so they support uncertainty quantification by benchmarked systems. We describe how uncertainty quantification requirements can be presented to candidate analytical tools supporting SPARQL. We describe benchmark data sets for evaluating uncertainty quantification, as well as an approach for using our benchmark generator to produce data sets for generating benchmark data sets.
Issues in Benchmark Metric Selection

NASA Astrophysics Data System (ADS)

Crolotte, Alain

It is true that a metric can influence a benchmark but will esoteric metrics create more problems than they will solve? We answer this question affirmatively by examining the case of the TPC-D metric which used the much debated geometric mean for the single-stream test. We will show how a simple choice influenced the benchmark and its conduct and, to some extent, DBMS development. After examining other alternatives our conclusion is that the “real” measure for a decision-support benchmark is the arithmetic mean.
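The contrast between the two means discussed in this abstract can be illustrated with a short sketch. The query timings below are hypothetical and the code does not reproduce the actual TPC-D metric definition; it only shows, under those assumptions, why the choice of mean matters.

```python
import math


def arithmetic_mean(times):
    """Plain average: dominated by the slowest queries."""
    return sum(times) / len(times)


def geometric_mean(times):
    """n-th root of the product, computed via logs for numerical stability."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


# Hypothetical per-query elapsed times in seconds.
baseline = [1.0, 1.0, 1.0, 100.0]
tuned_short = [0.1, 1.0, 1.0, 100.0]  # a 1 s query made 10x faster
tuned_long = [1.0, 1.0, 1.0, 10.0]    # the 100 s query made 10x faster

for name, times in [("baseline", baseline),
                    ("tuned_short", tuned_short),
                    ("tuned_long", tuned_long)]:
    print(name, arithmetic_mean(times), geometric_mean(times))
```

With these numbers the geometric mean rewards both tunings equally, since each is a tenfold speed-up of one query, while the arithmetic mean improves substantially only when the long-running query is sped up. That asymmetry is the kind of effect the abstract attributes to the choice of metric.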
Benchmarking clinical photography services in the NHS.

PubMed

Arbon, Giles

2015-01-01

Benchmarking is used in services across the National Health Service (NHS) using various benchmarking programs. Clinical photography services do not have a program in place and services have to rely on ad hoc surveys of other services. A trial benchmarking exercise was undertaken with 13 services in NHS Trusts. This highlights valuable data and comparisons that can be used to benchmark and improve services throughout the profession.

Method and system for benchmarking computers

DOEpatents

Gustafson, John L.

1993-09-14

A testing system and method for benchmarking computer systems. The system includes a store containing a scalable set of tasks to be performed to produce a solution in ever-increasing degrees of resolution as a larger number of the tasks are performed. A timing and control module allots to each computer a fixed benchmarking interval in which to perform the stored tasks. Means are provided for determining, after completion of the benchmarking interval, the degree of progress through the scalable set of tasks and for producing a benchmarking rating relating to the degree of progress for each computer.
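The fixed-interval idea in this abstract can be sketched roughly as below. This is an assumption-laden illustration, not the patented system: the function `fixed_time_rating`, its `clock` parameter, and the toy task are all hypothetical, and "degree of progress" is simplified here to a count of completed tasks.

```python
import time


def fixed_time_rating(task, interval_s, clock=time.perf_counter):
    """Run task(0), task(1), ... until interval_s has elapsed.

    Returns the number of tasks started before the deadline, used here as a
    stand-in for the rating. The clock is injectable so the logic can be
    exercised deterministically.
    """
    deadline = clock() + interval_s
    completed = 0
    while clock() < deadline:
        task(completed)
        completed += 1
    return completed


def refine(step):
    """Toy scalable task: one more term of a slowly converging series."""
    return 1.0 / (step + 1) ** 2


# Every machine gets the same time budget; a faster one completes more steps.
rating = fixed_time_rating(refine, interval_s=0.01)
print("tasks completed in the interval:", rating)
```

The design choice is that the time is held constant and the amount of work varies, rather than timing a fixed workload, so the same benchmark remains meaningful across machines of very different speeds.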
Benchmarking 2010: Trends in Education Philanthropy

ERIC Educational Resources Information Center

Bearman, Jessica

2010-01-01

"Benchmarking 2010" offers insights into the current priorities, practices and concerns of education grantmakers. The report is divided into five sections: (1) Mapping the Education Grantmaking Landscape; (2) 2010 Funding Priorities; (3) Strategies for Leveraging Greater Impact; (4) Identifying Significant Trends in Education Funding; and (5)…
Ultracool dwarf benchmarks with Gaia primaries

NASA Astrophysics Data System (ADS)

Marocco, F.; Pinfield, D. J.; Cook, N. J.; Zapatero Osorio, M. R.; Montes, D.; Caballero, J. A.; Gálvez-Ortiz, M. C.; Gromadzki, M.; Jones, H. R. A.; Kurtev, R.; Smart, R. L.; Zhang, Z.; Cabrera Lavers, A. L.; García Álvarez, D.; Qi, Z. X.; Rickard, M. J.; Dover, L.

2017-10-01

We explore the potential of Gaia for the field of benchmark ultracool/brown dwarf companions, and present the results of an initial search for metal-rich/metal-poor systems. A simulated population of resolved ultracool dwarf companions to Gaia primary stars is generated and assessed. Of the order of ∼24 000 companions should be identifiable outside of the Galactic plane (|b| > 10 deg) with large-scale ground- and space-based surveys including late M, L, T and Y types. Our simulated companion parameter space covers 0.02 ≤ M/M⊙ ≤ 0.1, 0.1 ≤ age/Gyr ≤ 14 and -2.5 ≤ [Fe/H] ≤ 0.5, with systems required to have a false alarm probability <10⁻⁴, based on projected separation and expected constraints on common distance, common proper motion and/or common radial velocity. Within this bulk population, we identify smaller target subsets of rarer systems whose collective properties still span the full parameter space of the population, as well as systems containing primary stars that are good age calibrators. Our simulation analysis leads to a series of recommendations for candidate selection and observational follow-up that could identify ∼500 diverse Gaia benchmarks. As a test of the veracity of our methodology and simulations, our initial search uses UKIRT Infrared Deep Sky Survey and Sloan Digital Sky Survey to select secondaries, with the parameters of primaries taken from Tycho-2, Radial Velocity Experiment, Large sky Area Multi-Object fibre Spectroscopic Telescope and Tycho-Gaia Astrometric Solution. We identify and follow up 13 new benchmarks. These include M8-L2 companions, with metallicity constraints ranging in quality, but robust in the range -0.39 ≤ [Fe/H] ≤ +0.36, and with projected physical separation in the range 0.6 < s/kau < 76. Going forward, Gaia offers a very high yield of benchmark systems, from which diverse subsamples may be able to calibrate a range of foundational ultracool/sub-stellar theory and observation.
Performance Evaluation and Benchmarking of Next Intelligent Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

del Pobil, Angel; Madhavan, Raj; Bonsignorio, Fabio

Performance Evaluation and Benchmarking of Intelligent Systems presents research dedicated to the subject of performance evaluation and benchmarking of intelligent systems by drawing from the experiences and insights of leading experts gained both through theoretical development and practical implementation of intelligent systems in a variety of diverse application domains. This contributed volume offers a detailed and coherent picture of state-of-the-art, recent developments, and further research areas in intelligent systems. The chapters cover a broad range of applications, such as assistive robotics, planetary surveying, urban search and rescue, and line tracking for automotive assembly. Subsystems or components described in this book include human-robot interaction, multi-robot coordination, communications, perception, and mapping. Chapters are also devoted to simulation support and open source software for cognitive platforms, providing examples of the type of enabling underlying technologies that can help intelligent systems to propagate and increase in capabilities. Performance Evaluation and Benchmarking of Intelligent Systems serves as a professional reference for researchers and practitioners in the field. This book is also applicable to advanced courses for graduate level students and robotics professionals in a wide range of engineering and related disciplines including computer science, automotive, healthcare, manufacturing, and service robotics.
Benchmarking Gas Path Diagnostic Methods: A Public Approach

NASA Technical Reports Server (NTRS)

Simon, Donald L.; Bird, Jeff; Davison, Craig; Volponi, Al; Iverson, R. Eugene

2008-01-01

Recent technology reviews have identified the need for objective assessments of engine health management (EHM) technology. The need is two-fold: technology developers require relevant data and problems to design and validate new algorithms and techniques while engine system integrators and operators need practical tools to direct development and then evaluate the effectiveness of proposed solutions. This paper presents a publicly available gas path diagnostic benchmark problem that has been developed by the Propulsion and Power Systems Panel of The Technical Cooperation Program (TTCP) to help address these needs. The problem is coded in MATLAB (The MathWorks, Inc.) and coupled with a non-linear turbofan engine simulation to produce "snap-shot" measurements, with relevant noise levels, as if collected from a fleet of engines over their lifetime of use. Each engine within the fleet will experience unique operating and deterioration profiles, and may encounter randomly occurring relevant gas path faults including sensor, actuator and component faults. The challenge to the EHM community is to develop gas path diagnostic algorithms to reliably perform fault detection and isolation. An example solution to the benchmark problem is provided along with associated evaluation metrics. A plan is presented to disseminate this benchmark problem to the engine health management technical community and invite technology solutions.
Evaluating the Effectiveness of a State-Mandated Benchmark Reading Assessment: mClass Reading 3D (Text Reading and Comprehension)

ERIC Educational Resources Information Center

Snow, Amie B.; Morris, Darrell; Perney, Jan

2018-01-01

We examined which of two instruments (Text Reading and Comprehension inventory [TRC] or a traditional informal reading inventory [IRI]) provides the more valid assessment of a primary-grade student's reading instructional level. The TRC is currently the required, benchmark reading assessment for students in grades K-3 in the state of North…
Benchmarking--Measuring and Comparing for Continuous Improvement.

ERIC Educational Resources Information Center

Henczel, Sue

2002-01-01

Discussion of benchmarking focuses on the use of internal and external benchmarking by special librarians. Highlights include defining types of benchmarking; historical development; benefits, including efficiency, improved performance, increased competitiveness, and better decision making; problems, including inappropriate adaptation; developing a…
EPA's Benchmark Dose Modeling Software

EPA Science Inventory

The EPA developed the Benchmark Dose Software (BMDS) as a tool to help Agency risk assessors facilitate applying benchmark dose (BMD) methods to EPA’s human health risk assessment (HHRA) documents. The application of BMD methods overcomes many well-known limitations ...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 42 Public Health 4 2012-10-01 2012-10-01 false Benchmark health benefits coverage. 440.330 Section 440.330 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND HUMAN... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...
Indoor Modelling Benchmark for 3D Geometry Extraction

NASA Astrophysics Data System (ADS)

Thomson, C.; Boehm, J.

2014-06-01

A combination of faster, cheaper and more accurate hardware, more sophisticated software, and greater industry acceptance have all laid the foundations for an increased desire for accurate 3D parametric models of buildings. Pointclouds are the data source of choice currently with static terrestrial laser scanning the predominant tool for large, dense volume measurement. The current importance of pointclouds as the primary source of real world representation is endorsed by CAD software vendor acquisitions of pointcloud engines in 2011. Both the capture and modelling of indoor environments require great effort in time by the operator (and therefore cost). Automation is seen as a way to aid this by reducing the workload of the user and some commercial packages have appeared that provide automation to some degree. In the data capture phase, advances in indoor mobile mapping systems are speeding up the process, albeit currently with a reduction in accuracy. As a result this paper presents freely accessible pointcloud datasets of two typical areas of a building each captured with two different capture methods and each with an accurate wholly manually created model. These datasets are provided as a benchmark for the research community to gauge the performance and improvements of various techniques for indoor geometry extraction. With this in mind, non-proprietary, interoperable formats are provided such as E57 for the scans and IFC for the reference model. The datasets can be found at: http://indoor-bench.github.io/indoor-bench.
E2 and SN2 Reactions of X(-) + CH3CH2X (X = F, Cl); an ab Initio and DFT Benchmark Study.

PubMed

Bento, A Patrícia; Solà, Miquel; Bickelhaupt, F Matthias

2008-06-01

We have computed consistent benchmark potential energy surfaces (PESs) for the anti-E2, syn-E2, and SN2 pathways of X(-) + CH3CH2X with X = F and Cl. This benchmark has been used to evaluate the performance of 31 popular density functionals, covering local-density approximation, generalized gradient approximation (GGA), meta-GGA, and hybrid density-functional theory (DFT). The ab initio benchmark has been obtained by exploring the PESs using a hierarchical series of ab initio methods [up to CCSD(T)] in combination with a hierarchical series of Gaussian-type basis sets (up to aug-cc-pVQZ). Our best CCSD(T) estimates show that the overall barriers for the various pathways increase in the order anti-E2 (X = F) < SN2 (X = F) < SN2 (X = Cl) ∼ syn-E2 (X = F) < anti-E2 (X = Cl) < syn-E2 (X = Cl). Thus, anti-E2 dominates for F(-) + CH3CH2F, and SN2 dominates for Cl(-) + CH3CH2Cl, while syn-E2 is in all cases the least favorable pathway. Best overall agreement with our ab initio benchmark is obtained by representatives from each of the three categories of functionals, GGA, meta-GGA, and hybrid DFT, with mean absolute errors in, for example, central barriers of 4.3 (OPBE), 2.2 (M06-L), and 2.0 kcal/mol (M06), respectively. Importantly, the hybrid functional BHandH and the meta-GGA M06-L yield incorrect trends and qualitative features of the PESs (in particular, an erroneous preference for SN2 over the anti-E2 in the case of F(-) + CH3CH2F) even though they are among the best functionals as measured by their small mean absolute errors of 3.3 and 2.2 kcal/mol in reaction barriers. OLYP and B3LYP have somewhat higher mean absolute errors in central barriers (5.6 and 4.8 kcal/mol, respectively), but the error distribution is somewhat more uniform, and as a consequence, the correct trends are reproduced.
Computational Chemistry Comparison and Benchmark Database

National Institute of Standards and Technology Data Gateway

SRD 101 NIST Computational Chemistry Comparison and Benchmark Database (Web, free access) The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of molecules. The goals are to provide a benchmark set of molecules for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of thermochemical properties.
TRIPOLI-4® - MCNP5 ITER A-lite neutronic model benchmarking

NASA Astrophysics Data System (ADS)

Jaboulay, J.-C.; Cayla, P.-Y.; Fausser, C.; Lee, Y.-K.; Trama, J.-C.; Li-Puma, A.

2014-06-01

The aim of this paper is to present the capability of TRIPOLI-4®, the CEA Monte Carlo code, to model a large-scale fusion reactor with complex neutron source and geometry. In the past, numerous benchmarks were conducted for TRIPOLI-4® assessment on fusion applications. Analyses of experiments (KANT, OKTAVIAN, FNG) and numerical benchmarks (between TRIPOLI-4® and MCNP5) on the HCLL DEMO2007 and ITER models were carried out successively. In this previous ITER benchmark, however, only the neutron wall loading was analyzed; its main purpose was to present the MCAM (the FDS Team CAD import tool) extension for TRIPOLI-4®. Starting from this work, a more extended benchmark has been performed on the estimation of neutron flux, nuclear heating in the shielding blankets and tritium production rate in the European TBMs (HCLL and HCPB), and it is presented in this paper. The methodology to build the TRIPOLI-4® A-lite model is based on MCAM and the MCNP A-lite model (version 4.1). Simplified TBMs (from KIT) have been integrated in the equatorial port. Comparisons of neutron wall loading, flux, nuclear heating and tritium production rate show a good agreement between the two codes. Discrepancies fall mainly within the statistical error of the Monte Carlo codes.
Internal Benchmarking for Institutional Effectiveness

ERIC Educational Resources Information Center

Ronco, Sharron L.

2012-01-01

Internal benchmarking is an established practice in business and industry for identifying best in-house practices and disseminating the knowledge about those practices to other groups in the organization. Internal benchmarking can be done with structures, processes, outcomes, or even individuals. In colleges or universities with multicampuses or a…
Developing integrated benchmarks for DOE performance measurement

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barancik, J.I.; Kramer, C.F.; Thode, Jr. H.C.

1992-09-30

The objectives of this task were to describe and evaluate selected existing sources of information on occupational safety and health with emphasis on hazard and exposure assessment, abatement, training, reporting, and control identifying for exposure and outcome in preparation for developing DOE performance benchmarks. Existing resources and methodologies were assessed for their potential use as practical performance benchmarks. Strengths and limitations of current data resources were identified. Guidelines were outlined for developing new or improved performance factors, which then could become the basis for selecting performance benchmarks. Data bases for non-DOE comparison populations were identified so that DOE performance could be assessed relative to non-DOE occupational and industrial groups. Systems approaches were described which can be used to link hazards and exposure, event occurrence, and adverse outcome factors, as needed to generate valid, reliable, and predictive performance benchmarks. Data bases were identified which contain information relevant to one or more performance assessment categories. A list of 72 potential performance benchmarks was prepared to illustrate the kinds of information that can be produced through a benchmark development program. Current information resources which may be used to develop potential performance benchmarks are limited. There is need to develop an occupational safety and health information and data system in DOE, which is capable of incorporating demonstrated and documented performance benchmarks prior to, or concurrent with the development of hardware and software. A key to the success of this systems approach is rigorous development and demonstration of performance benchmark equivalents to users of such data before system hardware and software commitments are institutionalized.
Development and application of freshwater sediment-toxicity benchmarks for currently used pesticides

USGS Publications Warehouse

Nowell, Lisa H.; Norman, Julia E.; Ingersoll, Christopher G.; Moran, Patrick W.

2016-01-01

Sediment-toxicity benchmarks are needed to interpret the biological significance of currently used pesticides detected in whole sediments. Two types of freshwater sediment benchmarks for pesticides were developed using spiked-sediment bioassay (SSB) data from the literature. These benchmarks can be used to interpret sediment-toxicity data or to assess the potential toxicity of pesticides in whole sediment. The Likely Effect Benchmark (LEB) defines a pesticide concentration in whole sediment above which there is a high probability of adverse effects on benthic invertebrates, and the Threshold Effect Benchmark (TEB) defines a concentration below which adverse effects are unlikely. For compounds without available SSBs, benchmarks were estimated using equilibrium partitioning (EqP). When a sediment sample contains a pesticide mixture, benchmark quotients can be summed for all detected pesticides to produce an indicator of potential toxicity for that mixture. Benchmarks were developed for 48 pesticide compounds using SSB data and 81 compounds using the EqP approach. In an example application, data for pesticides measured in sediment from 197 streams across the United States were evaluated using these benchmarks, and compared to measured toxicity from whole-sediment toxicity tests conducted with the amphipod Hyalella azteca (28-d exposures) and the midge Chironomus dilutus (10-d exposures). Amphipod survival, weight, and biomass were significantly and inversely related to summed benchmark quotients, whereas midge survival, weight, and biomass showed no relationship to benchmarks. Samples with LEB exceedances were rare (n = 3), but all were toxic to amphipods (i.e., significantly different from control). Significant toxicity to amphipods was observed for 72% of samples exceeding one or more TEBs, compared to 18% of samples below all TEBs. Factors affecting toxicity below TEBs may include the presence of contaminants other than pesticides, physical
PHISICS/RELAP5-3D RESULTS FOR EXERCISES II-1 AND II-2 OF THE OECD/NEA MHTGR-350 BENCHMARK

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strydom, Gerhard

2016-03-01

The Idaho National Laboratory (INL) Advanced Reactor Technologies (ART) High-Temperature Gas-Cooled Reactor (HTGR) Methods group currently leads the Modular High-Temperature Gas-Cooled Reactor (MHTGR) 350 benchmark. The benchmark consists of a set of lattice-depletion, steady-state, and transient problems that can be used by HTGR simulation groups to assess the performance of their code suites. The paper summarizes the results obtained for the first two transient exercises defined for Phase II of the benchmark. The Parallel and Highly Innovative Simulation for INL Code System (PHISICS), coupled with the INL system code RELAP5-3D, was used to generate the results for the Depressurized Conduction Cooldown (DCC) (exercise II-1a) and Pressurized Conduction Cooldown (PCC) (exercise II-2) transients. These exercises require the time-dependent simulation of coupled neutronics and thermal-hydraulics phenomena, and utilize the steady-state solution previously obtained for exercise I-3 of Phase I. This paper also includes a comparison of the benchmark results obtained with a traditional system code “ring” model against a more detailed “block” model that includes kinetics feedback on an individual block level and thermal feedbacks on a triangular sub-mesh. The higher spatial fidelity that can be obtained by the block model is illustrated with comparisons of the maximum fuel temperatures, especially in the case of natural convection conditions that dominate the DCC and PCC events. Differences up to 125 K (or 10%) were observed between the ring and block model predictions of the DCC transient, mostly due to the block model’s capability of tracking individual block decay powers and more detailed helium flow distributions. In general, the block model only required DCC and PCC calculation times twice as long as the ring models, and it therefore seems that the additional development and calculation time required for the block model could be worth the gain that can
Benchmark for Strategic Performance Improvement.

ERIC Educational Resources Information Center

Gohlke, Annette

1997-01-01

Explains benchmarking, a total quality management tool used to measure and compare the work processes in a library with those in other libraries to increase library performance. Topics include the main groups of upper management, clients, and staff; critical success factors for each group; and benefits of benchmarking. (Author/LRW)
Beyond Benchmarking: Value-Adding Metrics

ERIC Educational Resources Information Center

Fitz-enz, Jac

2007-01-01

HR metrics has grown up a bit over the past two decades, moving away from simple benchmarking practices and toward a more inclusive approach to measuring institutional performance and progress. In this article, the acknowledged "father" of human capital performance benchmarking provides an overview of several aspects of today's HR metrics…
The NAS kernel benchmark program

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barton, J. T.

1985-01-01

A collection of benchmark test kernels that measure supercomputer performance has been developed for the use of the NAS (Numerical Aerodynamic Simulation) program at the NASA Ames Research Center. This benchmark program is described in detail and the specific ground rules are given for running the program as a performance test.

Gatemon Benchmarking and Two-Qubit Operation

NASA Astrophysics Data System (ADS)

Casparis, Lucas; Larsen, Thorvald; Olsen, Michael; Petersson, Karl; Kuemmeth, Ferdinand; Krogstrup, Peter; Nygard, Jesper; Marcus, Charles

Recent experiments have demonstrated superconducting transmon qubits with semiconductor nanowire Josephson junctions. These hybrid gatemon qubits utilize field effect tunability singular to semiconductors to allow complete qubit control using gate voltages, potentially a technological advantage over conventional flux-controlled transmons. Here, we present experiments with a two-qubit gatemon circuit. We characterize qubit coherence and stability and use randomized benchmarking to demonstrate single-qubit gate errors of ~0.5 % for all gates, including voltage-controlled Z rotations. We show coherent capacitive coupling between two gatemons and coherent SWAP operations. Finally, we perform a two-qubit controlled-phase gate with an estimated fidelity of ~91 %, demonstrating the potential of gatemon qubits for building scalable quantum processors. We acknowledge financial support from Microsoft Project Q and the Danish National Research Foundation.
How do I know if my forecasts are better? Using benchmarks in hydrological ensemble prediction

NASA Astrophysics Data System (ADS)

Pappenberger, F.; Ramos, M. H.; Cloke, H. L.; Wetterhall, F.; Alfieri, L.; Bogner, K.; Mueller, A.; Salamon, P.

2015-03-01

The skill of a forecast can be assessed by comparing the relative proximity of both the forecast and a benchmark to the observations. Example benchmarks include climatology or a naïve forecast. Hydrological ensemble prediction systems (HEPS) are currently transforming the hydrological forecasting environment but in this new field there is little information to guide researchers and operational forecasters on how benchmarks can be best used to evaluate their probabilistic forecasts. In this study, it is identified that the forecast skill calculated can vary depending on the benchmark selected and that the selection of a benchmark for determining forecasting system skill is sensitive to a number of hydrological and system factors. A benchmark intercomparison experiment is then undertaken using the continuous ranked probability score (CRPS), a reference forecasting system and a suite of 23 different methods to derive benchmarks. The benchmarks are assessed within the operational set-up of the European Flood Awareness System (EFAS) to determine those that are 'toughest to beat' and so give the most robust discrimination of forecast skill, particularly for the spatial average fields that EFAS relies upon. Evaluating against an observed discharge proxy the benchmark that has most utility for EFAS and avoids the most naïve skill across different hydrological situations is found to be meteorological persistency. This benchmark uses the latest meteorological observations of precipitation and temperature to drive the hydrological model. Hydrological long term average benchmarks, which are currently used in EFAS, are very easily beaten by the forecasting system and the use of these produces much naïve skill. When decomposed into seasons, the advanced meteorological benchmarks, which make use of meteorological observations from the past 20 years at the same calendar date, have the most skill discrimination. They are also good at discriminating skill in low flows and for all
How Benchmarking and Higher Education Came Together

ERIC Educational Resources Information Center

Levy, Gary D.; Ronco, Sharron L.

2012-01-01

This chapter introduces the concept of benchmarking and how higher education institutions began to use benchmarking for a variety of purposes. Here, benchmarking is defined as a strategic and structured approach whereby an organization compares aspects of its processes and/or outcomes to those of another organization or set of organizations to…
Benchmarking the ATLAS software through the Kit Validation engine

NASA Astrophysics Data System (ADS)

De Salvo, Alessandro; Brasolin, Franco

2010-04-01

Measuring the performance of experiment software is important both for choosing the most effective resources and for discovering bottlenecks in the code implementation. In this work we present the benchmark techniques used to measure the ATLAS software performance through the ATLAS offline testing engine Kit Validation and the online portal Global Kit Validation. The performance measurements, the data collection, the online analysis and display of the results will be presented. The results of the measurement on different platforms and architectures will be shown, giving a full report on the CPU power and memory consumption of the Monte Carlo generation, simulation, digitization and reconstruction of the most CPU-intensive channels. The impact of multi-core computing on the ATLAS software performance will also be presented, comparing the behavior of different architectures when increasing the number of concurrent processes. The benchmark techniques described in this paper have been used in the HEPiX group since the beginning of 2008 to help define the performance metrics for High Energy Physics applications, based on the real experiment software.
Technologies of polytechnic education in global benchmark higher education institutions

NASA Astrophysics Data System (ADS)

Kurushina, V. A.; Kurushina, E. V.; Zemenkova, M. Y.

2018-05-01

Russian polytechnic education is going through a sequence of transformations that started with the introduction of bachelor and master degrees in higher education in place of the previous “specialists”. The next stage of reform in Russian polytechnic education should raise the quality of the teaching and learning experience, which can be achieved by adopting the best educational practices of world-class universities through the benchmarking method. This paper gives an overview of some major distinctive features of the foreign benchmark higher education institution and the Russian university of polytechnic profile. The parameters that allowed the authors to select the foreign institution for comparison include the scope of educational profile, industrial specialization, connections with the leading regional corporations, size of the city and number of students. When considering the possibilities of using relevant world-level higher education practices, the authors emphasize the importance of forming a new engineering mentality, the role of computer technologies in engineering education, the provision of licensed software for the educational process which exceeds the level of a regional Russian university, and successful staff technologies (e.g., inviting “guest” lecturers or having 2-3 lecturers per course).
Numerical Benchmark of 3D Ground Motion Simulation in the Alpine valley of Grenoble, France.

NASA Astrophysics Data System (ADS)

Tsuno, S.; Chaljub, E.; Cornou, C.; Bard, P.

2006-12-01

Thanks to sophisticated numerical methods and access to increasing computational resources, our predictions of strong ground motion are becoming more and more realistic and need to be carefully compared. We report our effort of benchmarking numerical methods of ground motion simulation in the case of the valley of Grenoble in the French Alps. The Grenoble valley is typical of a moderate seismicity area where strong site effects occur. The benchmark consisted of computing the seismic response of the 'Y'-shaped Grenoble valley to (i) two local earthquakes (Ml<=3) for which recordings were available; and (ii) two local hypothetical events (Mw=6) occurring on the so-called Belledonne Border Fault (BBF) [1]. A free-style prediction was also proposed, in which participants were allowed to vary the source and/or the model parameters and were asked to provide the resulting uncertainty in their estimation of ground motion. We received a total of 18 contributions from 14 different groups; 7 of these use 3D methods, among which 3 could handle surface topography; the other half comprises predictions based upon 1D (2 contributions), 2D (4 contributions) and empirical Green's function (EGF) (3 contributions) methods. The maximum frequency analysed ranged between 2.5 Hz for 3D calculations and 40 Hz for EGF predictions. We present a detailed comparison of the different predictions using raw indicators (e.g. peak values of ground velocity and acceleration, Fourier spectra, site over reference spectral ratios, ...) as well as sophisticated misfit criteria based upon previous works [2,3]. We further discuss the variability in estimating the importance of particular effects such as non-linear rheology, or surface topography. References: [1] Thouvenot F. et al., The Belledonne Border Fault: identification of an active seismic strike-slip fault in the western Alps, Geophys. J. Int., 155 (1), p. 174-192, 2003. [2] Anderson J., Quantitative measure of the goodness-of-fit of
Benchmarking worker nodes using LHCb productions and comparing with HEPSpec06

NASA Astrophysics Data System (ADS)

Charpentier, P.

2017-10-01

In order to estimate the capabilities of a computing slot with limited processing time, it is necessary to know its “power” with rather good precision. This allows, for example, pilot jobs to match a task for which the required CPU-work is known, or to define the number of events to be processed knowing the CPU-work per event. Otherwise there is always a risk that the task is aborted because it exceeds the CPU capabilities of the resource. It also allows a better accounting of the consumed resources. Since 2007, CPU power in WLCG has traditionally been estimated using the HEP-Spec06 (HS06) benchmark suite, which was verified at the time to scale properly with a set of typical HEP applications. However, the hardware architecture of processors has evolved, and all WLCG experiments have moved to 64-bit applications and use different compilation flags from those advertised for running HS06. It is therefore interesting to check the scaling of HS06 with the HEP applications. For this purpose, we have been using CPU-intensive massive simulation productions from the LHCb experiment and compared their event throughput to the HS06 rating of the worker nodes. We also compared it with a much faster benchmark script that is used by the DIRAC framework, which LHCb uses, to evaluate the performance of the worker nodes at run time. This contribution reports on the findings of these comparisons: the main observation is that the scaling with HS06 is no longer fulfilled, while the fast benchmarks have a better scaling but are less precise. One can also clearly see that some hardware or software features, when enabled on the worker nodes, may enhance their performance beyond expectation from either benchmark, depending on external factors.
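The two calculations this record relies on, sizing a job from a known CPU-work per event and checking whether event throughput scales with a benchmark rating, can be sketched as follows. This is an illustrative sketch only: the function names, the safety margin and the sample numbers are hypothetical and are not taken from the DIRAC framework or from the LHCb productions.

```python
import math


def events_that_fit(power, seconds_available, work_per_event, safety=1.0):
    """Number of whole events a slot can process.

    power             benchmark rating of the slot (e.g. in HS06)
    seconds_available processing time left in the slot
    work_per_event    CPU-work per event, in rating-units x seconds
    safety            fraction of the budget to use (hypothetical margin)
    """
    if power <= 0 or seconds_available <= 0 or work_per_event <= 0:
        raise ValueError("power, time and work per event must be positive")
    budget = power * seconds_available * safety
    return math.floor(budget / work_per_event)


def scaling_spread(throughputs, ratings):
    """Relative spread of throughput/rating across worker nodes.

    If the benchmark scaled perfectly with the application, every node
    would give the same ratio and the spread would be zero.
    """
    ratios = [t / r for t, r in zip(throughputs, ratings)]
    mean = sum(ratios) / len(ratios)
    return (max(ratios) - min(ratios)) / mean


if __name__ == "__main__":
    # Hypothetical slot: rating 10, one hour left, 500 units of work per event.
    print(events_that_fit(10.0, 3600, 500.0))
    # Hypothetical nodes: events per second versus benchmark rating.
    print(scaling_spread([10.0, 20.0, 33.0], [5.0, 10.0, 15.0]))
```

A large spread for one benchmark and a small one for another is the kind of evidence the abstract summarizes as a scaling that is "no longer fulfilled" versus a "better scaling".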
Benchmarking: A Process for Improvement.

ERIC Educational Resources Information Center

Peischl, Thomas M.

One problem with the outcome-based measures used in higher education is that they measure quantity but not quality. Benchmarking, or the use of some external standard of quality to measure tasks, processes, and outputs, is partially solving that difficulty. Benchmarking allows for the establishment of a systematic process to indicate if outputs…
Bond Dissociation Energies for Diatomic Molecules Containing 3d Transition Metals: Benchmark Scalar-Relativistic Coupled-Cluster Calculations for 20 Molecules

DOE PAGES

Cheng, Lan; Gauss, Jürgen; Ruscic, Branko; ...

2017-01-12

Benchmark scalar-relativistic coupled-cluster calculations for dissociation energies of the 20 diatomic molecules containing 3d transition metals in the 3dMLBE20 database (J. Chem. Theory Comput. 2015, 11, 2036) are reported in this paper. Electron correlation and basis set effects are systematically studied. The agreement between theory and experiment is in general satisfactory. For a subset of 16 molecules, the standard deviation between computational and experimental values is 9 kJ/mol with the maximum deviation being 15 kJ/mol. The discrepancies between theory and experiment remain substantial (more than 20 kJ/mol) for VH, CrH, CoH, and FeH. To explore the source of the latter discrepancies, the analysis used to determine the experimental dissociation energies for VH and CrH is revisited. It is shown that, if improved values are used for the heterolytic C–H dissociation energies of di- and trimethylamine involved in the experimental determination, the experimental values for the dissociation energies of VH and CrH are increased by 18 kJ/mol, such that D0(VH) = 223 ± 7 kJ/mol and D0(CrH) = 204 ± 7 kJ/mol (or De(VH) = 233 ± 7 kJ/mol and De(CrH) = 214 ± 7 kJ/mol). Finally, the new experimental values agree quite well with the calculated values, showing the consistency of the computation and the measured reaction thresholds.
Benchmarking in national health service procurement in Scotland.

PubMed

Walker, Scott; Masson, Ron; Telford, Ronnie; White, David

2007-11-01

The paper reports the results of a study on benchmarking activities undertaken by the procurement organization within the National Health Service (NHS) in Scotland, namely National Procurement (previously Scottish Healthcare Supplies Contracts Branch). NHS performance is of course politically important, and benchmarking is increasingly seen as a means to improve performance, so the study was carried out to determine if the current benchmarking approaches could be enhanced. A review of the benchmarking activities used by the private sector, local government and NHS organizations was carried out to establish a framework of the motivations, benefits, problems and costs associated with benchmarking. This framework was used to carry out the research through case studies and a questionnaire survey of NHS procurement organizations both in Scotland and other parts of the UK. Nine of the 16 Scottish Health Boards surveyed reported carrying out benchmarking during the last three years. The findings of the research were that there were similarities in approaches between local government and NHS Scotland Health, but differences between NHS Scotland and other UK NHS procurement organizations. Benefits were seen as significant and it was recommended that National Procurement should pursue the formation of a benchmarking group with members drawn from NHS Scotland and external benchmarking bodies to establish measures to be used in benchmarking across the whole of NHS Scotland.
Benchmarking facilities providing care: An international overview of initiatives

PubMed Central

Thonon, Frédérique; Watson, Jonathan; Saghatchian, Mahasti

2015-01-01

We performed a literature review of existing benchmarking projects of health facilities to explore (1) the rationales for those projects, (2) the motivation for health facilities to participate, (3) the indicators used and (4) the success and threat factors linked to those projects. We studied both peer-reviewed and grey literature. We examined 23 benchmarking projects of different medical specialities. The majority of projects used a mix of structure, process and outcome indicators. For some projects, participants had a direct or indirect financial incentive to participate (such as reimbursement by Medicaid/Medicare or litigation costs related to quality of care). A positive impact was reported for most projects, mainly in terms of improvement of practice and adoption of guidelines and, to a lesser extent, improvement in communication. Only 1 project reported positive impact in terms of clinical outcomes. Success factors and threats are linked to both the benchmarking process (such as organisation of meetings, link with existing projects) and indicators used (such as adjustment for diagnostic-related groups). The results of this review will help coordinators of a benchmarking project to set it up successfully. PMID:26770800
Hospital benchmarking: are U.S. eye hospitals ready?

PubMed

de Korne, Dirk F; van Wijngaarden, Jeroen D H; Sol, Kees J C A; Betz, Robert; Thomas, Richard C; Schein, Oliver D; Klazinga, Niek S

2012-01-01

Benchmarking is increasingly considered a useful management instrument to improve quality in health care, but little is known about its applicability in hospital settings. The aims of this study were to assess the applicability of a benchmarking project in U.S. eye hospitals and compare the results with an international initiative. We evaluated multiple cases by applying an evaluation frame abstracted from the literature to five U.S. eye hospitals that used a set of 10 indicators for efficiency benchmarking. Qualitative analysis entailed 46 semistructured face-to-face interviews with stakeholders, document analyses, and questionnaires. The case studies only partially met the conditions of the evaluation frame. Although learning and quality improvement were stated as overall purposes, the benchmarking initiative was at first focused on efficiency only. No ophthalmic outcomes were included, and clinicians were skeptical about their reporting relevance and disclosure. However, in contrast with earlier findings in international eye hospitals, all U.S. hospitals worked with internal indicators that were integrated in their performance management systems and supported benchmarking. Benchmarking can support performance management in individual hospitals. Having a certain number of comparable institutes provide similar services in a noncompetitive milieu seems to lay fertile ground for benchmarking. International benchmarking is useful only when these conditions are not met nationally. Although the literature focuses on static conditions for effective benchmarking, our case studies show that it is a highly iterative and learning process. The journey of benchmarking seems to be more important than the destination. Improving patient value (health outcomes per unit of cost) requires, however, an integrative perspective where clinicians and administrators closely cooperate on both quality and efficiency issues. If these worlds do not share such a relationship, the added
Benchmark Factors in Student Retention.

ERIC Educational Resources Information Center

Waggener, Anna T.; Smith, Constance K.

The first purpose of this study was to identify significant factors affecting the first benchmark in retaining students in college--the decision to enroll in the first fall semester after orientation. The second purpose was to examine enrollment decisions at the second benchmark--the decision to re-enroll in the second fall semester after freshman…
SP2Bench: A SPARQL Performance Benchmark

NASA Astrophysics Data System (ADS)

Schmidt, Michael; Hornung, Thomas; Meier, Michael; Pinkel, Christoph; Lausen, Georg

A meaningful analysis and comparison of both existing storage schemes for RDF data and evaluation approaches for SPARQL queries necessitates a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is settled in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror vital key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries and performance metrics.
Benchmarking in a differentially heated rotating annulus experiment: Multiple equilibria in the light of laboratory experiments and simulations

NASA Astrophysics Data System (ADS)

Vincze, Miklos; Harlander, Uwe; Borchert, Sebastian; Achatz, Ulrich; Baumann, Martin; Egbers, Christoph; Fröhlich, Jochen; Hertel, Claudia; Heuveline, Vincent; Hickel, Stefan; von Larcher, Thomas; Remmler, Sebastian

2014-05-01

modes. Thus certain "benchmarks" have been created that can later be used as test cases for atmospheric numerical model validation. Both in the experiments and in the numerics, multiple equilibrium states have been observed in the form of hysteretic behavior depending on the initial conditions. The precise quantification of these state and wave mode transitions may shed light on some aspects of the basic underlying dynamics of the baroclinic annulus configuration that are still to be understood.
A suite of benchmark and challenge problems for enhanced geothermal systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

White, Mark; Fu, Pengcheng; McClure, Mark

A diverse suite of numerical simulators is currently being applied to predict or understand the performance of enhanced geothermal systems (EGS). To build confidence and identify critical development needs for these analytical tools, the United States Department of Energy, Geothermal Technologies Office sponsored a Code Comparison Study (GTO-CCS), with participants from universities, industry, and national laboratories. A principal objective for the study was to create a community forum for improvement and verification of numerical simulators for EGS modeling. Teams participating in the study were those representing U.S. national laboratories, universities, and industries, and each team brought unique numerical simulation capabilities to bear on the problems. Two classes of problems were developed during the study, benchmark problems and challenge problems. The benchmark problems were structured to test the ability of the collection of numerical simulators to solve various combinations of coupled thermal, hydrologic, geomechanical, and geochemical processes. This class of problems was strictly defined in terms of properties, driving forces, initial conditions, and boundary conditions. The challenge problems were based on the enhanced geothermal systems research conducted at Fenton Hill, near Los Alamos, New Mexico, between 1974 and 1995. The problems involved two phases of research, stimulation, development, and circulation in two separate reservoirs. The challenge problems had specific questions to be answered via numerical simulation in three topical areas: 1) reservoir creation/stimulation, 2) reactive and passive transport, and 3) thermal recovery. Whereas the benchmark class of problems was designed to test capabilities for modeling coupled processes under strictly specified conditions, the stated objective for the challenge class of problems was to demonstrate what new understanding of the Fenton Hill experiments could be realized via the application
Benchmark problems and solutions

NASA Technical Reports Server (NTRS)

Tam, Christopher K. W.

1995-01-01

The scientific committee, after careful consideration, adopted six categories of benchmark problems for the workshop. These problems do not cover all the important computational issues relevant to Computational Aeroacoustics (CAA). The deciding factor in limiting the number of categories to six was the amount of effort needed to solve these problems. For reference purposes, the benchmark problems are provided here. They are followed by the exact or approximate analytical solutions. At present, an exact solution for the Category 6 problem is not available.
featsel: A framework for benchmarking of feature selection algorithms and cost functions

NASA Astrophysics Data System (ADS)

Reis, Marcelo S.; Estrela, Gustavo; Ferreira, Carlos Eduardo; Barrera, Junior

In this paper, we introduce featsel, a framework for benchmarking of feature selection algorithms and cost functions. This framework allows the user to deal with the search space as a Boolean lattice and has its core coded in C++ for computational efficiency purposes. Moreover, featsel includes Perl scripts to add new algorithms and/or cost functions, generate random instances, plot graphs and organize results into tables. Besides, this framework already comes with dozens of algorithms and cost functions for benchmarking experiments. We also provide illustrative examples, in which featsel outperforms the popular Weka workbench in feature selection procedures on data sets from the UCI Machine Learning Repository.
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 42 Public Health 4 2013-10-01 2013-10-01 false Benchmark-equivalent health benefits coverage. 440... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has an aggregate...
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 42 Public Health 4 2011-10-01 2011-10-01 false Benchmark-equivalent health benefits coverage. 440... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has an aggregate...

Validation of the WIMSD4M cross-section generation code with benchmark results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deen, J.R.; Woodruff, W.L.; Leal, L.E.

1995-01-01

The WIMSD4 code has been adopted for cross-section generation in support of the Reduced Enrichment Research and Test Reactor (RERTR) program at Argonne National Laboratory (ANL). Subsequently, the code has undergone several updates, and significant improvements have been achieved. The capability of generating group-collapsed micro- or macroscopic cross sections from the ENDF/B-V library and the more recent evaluation, ENDF/B-VI, in the ISOTXS format makes the modified version of the WIMSD4 code, WIMSD4M, very attractive, not only for the RERTR program, but also for the reactor physics community. The intent of the present paper is to validate the WIMSD4M cross-section libraries for reactor modeling of fresh water moderated cores. The results of calculations performed with multigroup cross-section data generated with the WIMSD4M code will be compared against experimental results. These results correspond to calculations carried out with thermal reactor benchmarks of the Oak Ridge National Laboratory (ORNL) unreflected HEU critical spheres, the TRX LEU critical experiments, and calculations of a modified Los Alamos HEU D2O moderated benchmark critical system. The benchmark calculations were performed with the discrete-ordinates transport code, TWODANT, using WIMSD4M cross-section data. Transport calculations using the XSDRNPM module of the SCALE code system are also included. In addition to transport calculations, diffusion calculations with the DIF3D code were also carried out, since the DIF3D code is used in the RERTR program for reactor analysis and design. For completeness, Monte Carlo results of calculations performed with the VIM and MCNP codes are also presented.
Validation of the WIMSD4M cross-section generation code with benchmark results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leal, L.C.; Deen, J.R.; Woodruff, W.L.

1995-02-01

The WIMSD4 code has been adopted for cross-section generation in support of the Reduced Enrichment for Research and Test Reactors (RERTR) program at Argonne National Laboratory (ANL). Subsequently, the code has undergone several updates, and significant improvements have been achieved. The capability of generating group-collapsed micro- or macroscopic cross sections from the ENDF/B-V library and the more recent evaluation, ENDF/B-VI, in the ISOTXS format makes the modified version of the WIMSD4 code, WIMSD4M, very attractive, not only for the RERTR program, but also for the reactor physics community. The intent of the present paper is to validate the procedure to generate cross-section libraries for reactor analyses and calculations utilizing the WIMSD4M code. To do so, the results of calculations performed with group cross-section data generated with the WIMSD4M code will be compared against experimental results. These results correspond to calculations carried out with thermal reactor benchmarks of the Oak Ridge National Laboratory (ORNL) unreflected critical spheres, the TRX critical experiments, and calculations of a modified Los Alamos highly-enriched heavy-water moderated benchmark critical system. The benchmark calculations were performed with the discrete-ordinates transport code, TWODANT, using WIMSD4M cross-section data. Transport calculations using the XSDRNPM module of the SCALE code system are also included. In addition to transport calculations, diffusion calculations with the DIF3D code were also carried out, since the DIF3D code is used in the RERTR program for reactor analysis and design. For completeness, Monte Carlo results of calculations performed with the VIM and MCNP codes are also presented.
Energy benchmarking in wastewater treatment plants: the importance of site operation and layout.

PubMed

Belloir, C; Stanford, C; Soares, A

2015-01-01

Energy benchmarking is a powerful tool in the optimization of wastewater treatment plants (WWTPs), helping to reduce costs and greenhouse gas emissions. Traditionally, energy benchmarking methods focused solely on reporting electricity consumption; however, recent developments in this area have led to the inclusion of other types of energy, including electrical, manual, chemical and mechanical consumption, which can be expressed in kWh/m3. In this study, two full-scale WWTPs were benchmarked. Both incorporated preliminary, secondary (oxidation ditch) and tertiary treatment processes; Site 1 also had an additional primary treatment step. The results indicated that Site 1 required 2.32 kWh/m3 against 0.98 kWh/m3 for Site 2. Aeration presented the highest energy consumption for both sites, with 2.08 kWh/m3 required for Site 1 and 0.91 kWh/m3 for Site 2. Mechanical energy represented the second biggest consumption for Site 1 (9%, 0.212 kWh/m3) and chemical input was significant in Site 2 (4.1%, 0.026 kWh/m3). The analysis of the results indicated that Site 2 could be optimized by constructing a primary settling tank that would reduce the biochemical oxygen demand, total suspended solids and NH4 loads to the oxidation ditch by 55%, 75% and 12%, respectively, and at the same time reduce the aeration requirements by 49%. This study demonstrated the effectiveness of the energy benchmarking exercise in identifying the highest energy-consuming assets; nevertheless, it points out the need to develop a holistic overview of the WWTP and to include parameters such as effluent quality, site operation and plant layout to allow adequate benchmarking.
The National Practice Benchmark for oncology, 2014 report on 2013 data.

PubMed

Towle, Elaine L; Barr, Thomas R; Senese, James L

2014-11-01

The National Practice Benchmark (NPB) is a unique tool to measure oncology practices against others across the country in a way that allows meaningful comparisons despite differences in practice size or setting. In today's economic environment every oncology practice, regardless of business structure or affiliation, should be able to produce, monitor, and benchmark basic metrics to meet current business pressures for increased efficiency and efficacy of care. Although we recognize that the NPB survey results do not capture the experience of all oncology practices, practices that can and do participate demonstrate exceptional managerial capability, and this year those practices are recognized for their participation. In this report, we continue to emphasize the methodology introduced last year in which we reported medical revenue net of the cost of the drugs as net medical revenue for the hematology/oncology product line. The effect of this is to capture only the gross margin attributable to drugs as revenue. New this year, we introduce six measures of clinical data density and expand the radiation oncology benchmarks. Copyright © 2014 by American Society of Clinical Oncology.
LASL benchmark performance 1978. [CDC STAR-100, 6600, 7600, Cyber 73, and CRAY-1]

DOE Office of Scientific and Technical Information (OSTI.GOV)

McKnight, A.L.

1979-08-01

This report presents the results of running several benchmark programs on a CDC STAR-100, a Cray Research CRAY-1, a CDC 6600, a CDC 7600, and a CDC Cyber 73. The benchmark effort included CRAY-1's at several installations running different operating systems and compilers. This benchmark is part of an ongoing program at Los Alamos Scientific Laboratory to collect performance data and monitor the development trend of supercomputers. 3 tables.
A benchmark for subduction zone modeling

NASA Astrophysics Data System (ADS)

van Keken, P.; King, S.; Peacock, S.

2003-04-01

Our understanding of subduction zones hinges critically on the ability to discern their thermal structure and dynamics. Computational modeling has become an essential complementary approach to observational and experimental studies. The accurate modeling of subduction zones is challenging due to the unique geometry, complicated rheological description and influence of fluid and melt formation. The complicated physics causes problems for the accurate numerical solution of the governing equations. As a consequence, it is essential for the subduction zone community to be able to evaluate the abilities and limitations of various modeling approaches. The participants of a workshop on the modeling of subduction zones, held at the University of Michigan at Ann Arbor, MI, USA in 2002, formulated a number of case studies to be developed into a benchmark similar to previous mantle convection benchmarks (Blankenbach et al., 1989; Busse et al., 1991; Van Keken et al., 1997). Our initial benchmark focuses on the dynamics of the mantle wedge and investigates three different rheologies: constant viscosity, diffusion creep, and dislocation creep. In addition, we investigate the ability of codes to accurately model dynamic pressure and advection-dominated flows. Proceedings of the workshop and the formulation of the benchmark are available at www.geo.lsa.umich.edu/~keken/subduction02.html. We strongly encourage interested research groups to participate in this benchmark. At Nice 2003 we will provide an update and a first set of benchmark results. Interested researchers are encouraged to contact one of the authors for further details.
Reference Solutions for Benchmark Turbulent Flows in Three Dimensions

NASA Technical Reports Server (NTRS)

Diskin, Boris; Thomas, James L.; Pandya, Mohagna J.; Rumsey, Christopher L.

2016-01-01

A grid convergence study is performed to establish benchmark solutions for turbulent flows in three dimensions (3D) in support of the turbulence-model verification campaign at the Turbulence Modeling Resource (TMR) website. The three benchmark cases are subsonic flows around a 3D bump and a hemisphere-cylinder configuration and a supersonic internal flow through a square duct. Reference solutions are computed for the Reynolds-averaged Navier-Stokes equations with the Spalart-Allmaras turbulence model using a linear eddy-viscosity model for the external flows and a nonlinear eddy-viscosity model based on a quadratic constitutive relation for the internal flow. The study involves three widely used practical computational fluid dynamics codes developed and supported at NASA Langley Research Center: FUN3D, USM3D, and CFL3D. Reference steady-state solutions computed with these three codes on families of consistently refined grids are presented. Grid-to-grid and code-to-code variations are described in detail.
Benchmarking NNWSI flow and transport codes: COVE 1 results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hayden, N.K.

1985-06-01

The code verification (COVE) activity of the Nevada Nuclear Waste Storage Investigations (NNWSI) Project is the first step in certification of flow and transport codes used for NNWSI performance assessments of a geologic repository for disposing of high-level radioactive wastes. The goals of the COVE activity are (1) to demonstrate and compare the numerical accuracy and sensitivity of certain codes, (2) to identify and resolve problems in running typical NNWSI performance assessment calculations, and (3) to evaluate computer requirements for running the codes. This report describes the work done for COVE 1, the first step in benchmarking some of the codes. Isothermal calculations for the COVE 1 benchmarking have been completed using the hydrologic flow codes SAGUARO, TRUST, and GWVIP; the radionuclide transport codes FEMTRAN and TRUMP; and the coupled flow and transport code TRACR3D. This report presents the results of three cases of the benchmarking problem solved for COVE 1, a comparison of the results, questions raised regarding sensitivities to modeling techniques, and conclusions drawn regarding the status and numerical sensitivities of the codes. 30 refs.
Validation of the BUGJEFF311.BOLIB, BUGENDF70.BOLIB and BUGLE-B7 broad-group libraries on the PCA-Replica (H2O/Fe) neutron shielding benchmark experiment

NASA Astrophysics Data System (ADS)

Pescarini, Massimo; Orsi, Roberto; Frisoni, Manuela

2016-03-01

The PCA-Replica 12/13 (H2O/Fe) neutron shielding benchmark experiment was analysed using the TORT-3.2 3D SN code. PCA-Replica reproduces a PWR ex-core radial geometry with alternate layers of water and steel including a pressure vessel simulator. Three broad-group coupled neutron/photon working cross section libraries in FIDO-ANISN format with the same energy group structure (47 n + 20 γ) and based on different nuclear data were used alternately: the ENEA BUGJEFF311.BOLIB (JEFF-3.1.1) and BUGENDF70.BOLIB (ENDF/B-VII.0) libraries and the ORNL BUGLE-B7 (ENDF/B-VII.0) library. Dosimeter cross sections derived from the IAEA IRDF-2002 dosimetry file were employed. The calculated reaction rates for the Rh-103(n,n')Rh-103m, In-115(n,n')In-115m and S-32(n,p)P-32 threshold activation dosimeters and the calculated neutron spectra are compared with the corresponding experimental results.
A chemical EOR benchmark study of different reservoir simulators

NASA Astrophysics Data System (ADS)

Goudarzi, Ali; Delshad, Mojdeh; Sepehrnoori, Kamy

2016-09-01

Interest in chemical EOR processes has intensified in recent years due to the advancements in chemical formulations and injection techniques. Injecting Polymer (P), surfactant/polymer (SP), and alkaline/surfactant/polymer (ASP) are techniques for improving sweep and displacement efficiencies with the aim of improving oil production in both secondary and tertiary floods. There has been great interest in chemical flooding recently for different challenging situations. These include high temperature reservoirs, formations with extreme salinity and hardness, naturally fractured carbonates, and sandstone reservoirs with heavy and viscous crude oils. More oil reservoirs are reaching maturity where secondary polymer floods and tertiary surfactant methods have become increasingly important. This significance has added to the industry's interest in using reservoir simulators as tools for reservoir evaluation and management to minimize costs and increase the process efficiency. Reservoir simulators with special features are needed to represent coupled chemical and physical processes present in chemical EOR processes. The simulators need to be first validated against well controlled lab and pilot scale experiments to reliably predict the full field implementations. The available data from laboratory scale include 1) phase behavior and rheological data; and 2) results of secondary and tertiary coreflood experiments for P, SP, and ASP floods under reservoir conditions, i.e. chemical retentions, pressure drop, and oil recovery. Data collected from corefloods are used as benchmark tests comparing numerical reservoir simulators with chemical EOR modeling capabilities such as STARS of CMG, ECLIPSE-100 of Schlumberger, REVEAL of Petroleum Experts. The research UTCHEM simulator from The University of Texas at Austin is also included since it has been the benchmark for chemical flooding simulation for over 25 years. The results of this benchmark comparison will be utilized to improve
Benchmarking: your performance measurement and improvement tool.

PubMed

Senn, G F

2000-01-01

Many respected professional healthcare organizations and societies today are seeking to establish data-driven performance measurement strategies such as benchmarking. Clinicians are, however, resistant to "benchmarking" that is based on financial data alone, concerned that it may be adverse to the patients' best interests. Benchmarking of clinical procedures that uses physician's codes such as Current Procedural Terminology (CPTs) has greater credibility with practitioners. Better Performers, organizations that can perform procedures successfully at lower cost and in less time, become the "benchmark" against which other organizations can measure themselves. The Better Performers' strategies can be adopted by other facilities to save time or money while maintaining quality patient care.
Review of the GMD Benchmark Event in TPL-007-1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Backhaus, Scott N.; Rivera, Michael Kelly

2015-07-21

Los Alamos National Laboratory (LANL) examined the approaches suggested in NERC Standard TPL-007-1 for defining the geo-electric field for the Benchmark Geomagnetic Disturbance (GMD) Event. Specifically: (1) estimating the 100-year exceedance geo-electric field magnitude; (2) the scaling of the GMD Benchmark Event to geomagnetic latitudes below 60 degrees north; and (3) the effect of uncertainties in earth conductivity data on the conversion from geomagnetic field to geo-electric field. This document summarizes the review and presents recommendations for consideration.
Benchmark Airport Charges

NASA Technical Reports Server (NTRS)

deWit, A.; Cohn, N.

1999-01-01

The Netherlands Directorate General of Civil Aviation (DGCA) commissioned Hague Consulting Group (HCG) to complete a benchmark study of airport charges at twenty eight airports in Europe and around the world, based on 1996 charges. This study followed previous DGCA research on the topic but included more airports in much more detail. The main purpose of this new benchmark study was to provide insight into the levels and types of airport charges worldwide and into recent changes in airport charge policy and structure. This paper describes the 1996 analysis. It is intended that this work be repeated every year in order to follow developing trends and provide the most up-to-date information possible.
Benchmark Airport Charges

NASA Technical Reports Server (NTRS)

de Wit, A.; Cohn, N.

1999-01-01

The Netherlands Directorate General of Civil Aviation (DGCA) commissioned Hague Consulting Group (HCG) to complete a benchmark study of airport charges at twenty eight airports in Europe and around the world, based on 1996 charges. This study followed previous DGCA research on the topic but included more airports in much more detail. The main purpose of this new benchmark study was to provide insight into the levels and types of airport charges worldwide and into recent changes in airport charge policy and structure. This paper describes the 1996 analysis. It is intended that this work be repeated every year in order to follow developing trends and provide the most up-to-date information possible.
Benchmark Modeling of the Near-Field and Far-Field Wave Effects of Wave Energy Arrays

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rhinefrank, Kenneth E; Haller, Merrick C; Ozkan-Haller, H Tuba

2013-01-26

This project is an industry-led partnership between Columbia Power Technologies and Oregon State University that will perform benchmark laboratory experiments and numerical modeling of the near-field and far-field impacts of wave scattering from an array of wave energy devices. These benchmark experimental observations will help to fill a gaping hole in our present knowledge of the near-field effects of multiple, floating wave energy converters and are a critical requirement for estimating the potential far-field environmental effects of wave energy arrays. The experiments will be performed at the Hinsdale Wave Research Laboratory (Oregon State University) and will utilize an array of newly developed 'Buoys' that are realistic, lab-scale floating power converters. The array of Buoys will be subjected to realistic, directional wave forcing (1:33 scale) that will approximate the expected conditions (waves and water depths) to be found off the Central Oregon Coast. Experimental observations will include comprehensive in-situ wave and current measurements as well as a suite of novel optical measurements. These new optical capabilities will include imaging of the 3D wave scattering using a binocular stereo camera system, as well as 3D device motion tracking using a newly acquired LED system. These observing systems will capture the 3D motion history of individual Buoys as well as resolve the 3D scattered wave field; thus resolving the constructive and destructive wave interference patterns produced by the array at high resolution. These data combined with the device motion tracking will provide necessary information for array design in order to balance array performance with the mitigation of far-field impacts. As a benchmark data set, these data will be an important resource for testing of models for wave/buoy interactions, buoy performance, and far-field effects on wave and current patterns due to the presence of arrays. Under the proposed project we will
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2011 CFR

2011-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2014 CFR

2014-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2013 CFR

2013-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
42 CFR 440.330 - Benchmark health benefits coverage.

Code of Federal Regulations, 2010 CFR

2010-10-01

... Benchmark-Equivalent Coverage § 440.330 Benchmark health benefits coverage. Benchmark coverage is health...) Federal Employees Health Benefit Plan Equivalent Coverage (FEHBP—Equivalent Health Insurance Coverage). A benefit plan equivalent to the standard Blue Cross/Blue Shield preferred provider option service benefit...
Toxicological benchmarks for screening potential contaminants of concern for effects on soil and litter invertebrates and heterotrophic process

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.; Suter, G.W. II

1994-09-01

One of the initial stages in ecological risk assessments for hazardous waste sites is the screening of contaminants to determine which of them are worthy of further consideration as "contaminants of potential concern." This process is termed "contaminant screening." It is performed by comparing measured ambient concentrations of chemicals to benchmark concentrations. Currently, no standard benchmark concentrations exist for assessing contaminants in soil with respect to their toxicity to soil- and litter-dwelling invertebrates, including earthworms, other micro- and macroinvertebrates, or heterotrophic bacteria and fungi. This report presents a standard method for deriving benchmarks for this purpose, sets of data concerning effects of chemicals in soil on invertebrates and soil microbial processes, and benchmarks for chemicals potentially associated with United States Department of Energy sites. In addition, literature describing the experiments from which data were drawn for benchmark derivation. Chemicals that are found in soil at concentrations exceeding both the benchmarks and the background concentration for the soil type should be considered contaminants of potential concern.

40 CFR 141.172 - Disinfection profiling and benchmarking.

Code of Federal Regulations, 2011 CFR

2011-07-01

... benchmarking. 141.172 Section 141.172 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED... Disinfection-Systems Serving 10,000 or More People § 141.172 Disinfection profiling and benchmarking. (a... sanitary surveys conducted by the State. (c) Disinfection benchmarking. (1) Any system required to develop...
Raising Quality and Achievement. A College Guide to Benchmarking.

ERIC Educational Resources Information Center

Owen, Jane

This booklet introduces the principles and practices of benchmarking as a way of raising quality and achievement at further education colleges in Britain. Section 1 defines the concept of benchmarking. Section 2 explains what benchmarking is not and the steps that should be taken before benchmarking is initiated. The following aspects and…
Benchmarking forensic mental health organizations.

PubMed

Coombs, Tim; Taylor, Monica; Pirkis, Jane

2011-04-01

This paper describes the forensic mental health forums that were conducted as part of the National Mental Health Benchmarking Project (NMHBP). These forums encouraged participating organizations to compare their performance on a range of key performance indicators (KPIs) with that of their peers. Four forensic mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against previously agreed KPIs. They also undertook three special projects which explored some of the factors that might explain inter-organizational variation in performance. The inter-organizational range for many of the indicators was substantial. Observing this led participants to conduct the special projects to explore three factors which might help explain the variability - seclusion practices, delivery of community mental health services, and provision of court liaison services. The process of conducting the special projects gave participants insights into the practices and structures employed by their counterparts, and provided them with some important lessons for quality improvement. The forensic mental health benchmarking forums have demonstrated that benchmarking is feasible and likely to be useful in improving service performance and quality.
How to Advance TPC Benchmarks with Dependability Aspects

NASA Astrophysics Data System (ADS)

Almeida, Raquel; Poess, Meikel; Nambiar, Raghunath; Patil, Indira; Vieira, Marco

Transactional systems are the core of the information systems of most organizations. Although there is general acknowledgement that failures in these systems often entail significant impact both on the proceeds and reputation of companies, the benchmarks developed and managed by the Transaction Processing Performance Council (TPC) still maintain their focus on reporting bare performance. Each TPC benchmark has to pass a list of dependability-related tests (to verify ACID properties), but not all benchmarks require measuring their performances. While TPC-E measures the recovery time of some system failures, TPC-H and TPC-C only require functional correctness of such recovery. Consequently, systems used in TPC benchmarks are tuned mostly for performance. In this paper we argue that nowadays systems should be tuned for a more comprehensive suite of dependability tests, and that a dependability metric should be part of TPC benchmark publications. The paper discusses WHY and HOW this can be achieved. Two approaches are introduced and discussed: augmenting each TPC benchmark in a customized way, by extending each specification individually; and pursuing a more unified approach, defining a generic specification that could be adjoined to any TPC benchmark.
A Methodology for Benchmarking Relational Database Machines,

DTIC Science & Technology

1984-01-01

user benchmarks is to compare the multiple users to the best-case performance The data for each query classification coll and the performance...called a benchmark. The term benchmark originates from the markers used by surveyors in establishing common reference points for their measure...formatted databases. In order to further simplify the problem, we restrict our study to those DBMs which support the relational model. A survey
Protein Models Docking Benchmark 2

PubMed Central

Anishchenko, Ivan; Kundrotas, Petras J.; Tuzikov, Alexander V.; Vakser, Ilya A.

2015-01-01

Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have pre-defined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the “real case scenario,” as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu. PMID:25712716
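The model-to-native Cα RMSD used above to grade model inaccuracy can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the two structures have already been optimally superposed (for example by a Kabsch-type alignment, which is not shown), that atoms are paired one to one, and the coordinates below are invented toy values in Å.

```python
import math


def ca_rmsd(model, native):
    """Root-mean-square deviation between paired C-alpha coordinates.

    Both inputs are sequences of (x, y, z) tuples in angstroms, assumed
    to be already superposed and listed in the same residue order.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_m, atom_n in zip(model, native)
        for a, b in zip(atom_m, atom_n)
    )
    return math.sqrt(squared / len(model))


# Toy three-residue example (hypothetical coordinates).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 2.0, 0.0), (3.8, -2.0, 0.0), (7.6, 0.0, 2.0)]

rmsd = ca_rmsd(model, native)
in_benchmark_range = 1.0 <= rmsd <= 6.0  # the 1 to 6 angstrom range quoted in the abstract
print(f"C-alpha RMSD = {rmsd:.2f} A, within 1-6 A range: {in_benchmark_range}")
```

Each toy atom is displaced by exactly 2 Å, so the RMSD is 2 Å; a real comparison would read Cα coordinates from structure files and superpose them first.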
[Do you mean benchmarking?].

PubMed

Bonnet, F; Solignac, S; Marty, J

2008-03-01

The purpose of benchmarking is to settle improvement processes by comparing the activities to quality standards. The proposed methodology is illustrated by benchmark business cases performed inside medical plants on some items like nosocomial diseases or organization of surgery facilities. Moreover, the authors have built a specific graphic tool, enhanced with balance score numbers and mappings, so that the comparison between different anesthesia-reanimation services, which are willing to start an improvement program, is easy and relevant. This ready-made application is even more accurate as far as detailed tariffs of activities are implemented.
Benchmarking, Total Quality Management, and Libraries.

ERIC Educational Resources Information Center

Shaughnessy, Thomas W.

1993-01-01

Discussion of the use of Total Quality Management (TQM) in higher education and academic libraries focuses on the identification, collection, and use of reliable data. Methods for measuring quality, including benchmarking, are described; performance measures are considered; and benchmarking techniques are examined. (11 references) (MES)
Radiation Detection Computational Benchmark Scenarios

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shaver, Mark W.; Casella, Andrew M.; Wittman, Richard S.

2013-09-24

Modeling forms an important component of radiation detection development, allowing for testing of new detector designs, evaluation of existing equipment against a wide variety of potential threat sources, and assessing operation performance of radiation detection systems. This can, however, result in large and complex scenarios which are time consuming to model. A variety of approaches to radiation transport modeling exist with complementary strengths and weaknesses for different problems. This variety of approaches, and the development of promising new tools (such as ORNL’s ADVANTG) which combine benefits of multiple approaches, illustrates the need for a means of evaluating or comparing different techniques for radiation detection problems. This report presents a set of 9 benchmark problems for comparing different types of radiation transport calculations, identifying appropriate tools for classes of problems, and testing and guiding the development of new methods. The benchmarks were drawn primarily from existing or previous calculations with a preference for scenarios which include experimental data, or otherwise have results with a high level of confidence, are non-sensitive, and represent problem sets of interest to NA-22. From a technical perspective, the benchmarks were chosen to span a range of difficulty and to include gamma transport, neutron transport, or both and represent different important physical processes and a range of sensitivity to angular or energy fidelity. Following benchmark identification, existing information about geometry, measurements, and previous calculations were assembled. Monte Carlo results (MCNP decks) were reviewed or created and re-run in order to attain accurate computational times and to verify agreement with experimental data, when present. Benchmark information was then conveyed to ORNL in order to guide testing and development of hybrid calculations. The results of those ADVANTG calculations were then sent to
Reactor Physics Measurements and Benchmark Specifications for Oak Ridge Highly Enriched Uranium Sphere (ORSphere)

DOE PAGES

Marshall, Margaret A.

2014-11-04

In the early 1970s Dr. John T. Mihalczo (team leader), J.J. Lynn, and J.R. Taylor performed experiments at the Oak Ridge Critical Experiments Facility (ORCEF) with highly enriched uranium (HEU) metal (called Oak Ridge Alloy or ORALLOY) in an effort to recreate GODIVA I results with greater accuracy than those performed at Los Alamos National Laboratory in the 1950s. The purpose of the Oak Ridge ORALLOY Sphere (ORSphere) experiments was to estimate the unreflected and unmoderated critical mass of an idealized sphere of uranium metal corrected to a density, purity, and enrichment such that it could be compared with the GODIVA I experiments. Additionally, various material reactivity worths, the surface material worth coefficient, the delayed neutron fraction, the prompt neutron decay constant, relative fission density, and relative neutron importance were all measured. The critical assembly, material reactivity worths, the surface material worth coefficient, and the delayed neutron fraction were all evaluated as benchmark experiment measurements. The reactor physics measurements are the focus of this paper; although for clarity the critical assembly benchmark specifications are briefly discussed.
Structural Benchmark Creep Testing for Microcast MarM-247 Advanced Stirling Convertor E2 Heater Head Test Article SN18

NASA Technical Reports Server (NTRS)

Krause, David L.; Brewer, Ethan J.; Pawlik, Ralph

2013-01-01

This report provides test methodology details and qualitative results for the first structural benchmark creep test of an Advanced Stirling Convertor (ASC) heater head of ASC-E2 design heritage. The test article was recovered from a flight-like Microcast MarM-247 heater head specimen previously used in helium permeability testing. The test article was utilized for benchmark creep test rig preparation, wall thickness and diametral laser scan hardware metrological developments, and induction heater custom coil experiments. In addition, a benchmark creep test was performed, terminated after one week when through-thickness cracks propagated at thermocouple weld locations. Following this, it was used to develop a unique temperature measurement methodology using contact thermocouples, thereby enabling future benchmark testing to be performed without the use of conventional welded thermocouples, proven problematic for the alloy. This report includes an overview of heater head structural benchmark creep testing, the origin of this particular test article, test configuration developments accomplished using the test article, creep predictions for its benchmark creep test, qualitative structural benchmark creep test results, and a short summary.
Benchmarking Helps Measure Union Programs, Operations.

ERIC Educational Resources Information Center

Mann, Jerry

2001-01-01

Explores three examples of benchmarking by college student unions. Focuses on how a union can collect information from other unions for use as benchmarking standards for the purposes of selling a concept or justifying program increases, or for comparing a union's financial performance to other unions. (EV)
Present Status and Extensions of the Monte Carlo Performance Benchmark

NASA Astrophysics Data System (ADS)

Hoogenboom, J. Eduard; Petrovic, Bojan; Martin, William R.

2014-06-01

The NEA Monte Carlo Performance benchmark started in 2011 aiming to monitor over the years the abilities to perform a full-size Monte Carlo reactor core calculation with a detailed power production for each fuel pin with axial distribution. This paper gives an overview of the contributed results thus far. It shows that reaching a statistical accuracy of 1 % for most of the small fuel zones requires about 100 billion neutron histories. The efficiency of parallel execution of Monte Carlo codes on a large number of processor cores shows clear limitations for computer clusters with common type computer nodes. However, using true supercomputers the speedup of parallel calculations is increasing up to large numbers of processor cores. More experience is needed from calculations on true supercomputers using large numbers of processors in order to predict if the requested calculations can be done in a short time. As the specifications of the reactor geometry for this benchmark test are well suited for further investigations of full-core Monte Carlo calculations and a need is felt for testing other issues than its computational performance, proposals are presented for extending the benchmark to a suite of benchmark problems for evaluating fission source convergence for a system with a high dominance ratio, for coupling with thermal-hydraulics calculations to evaluate the use of different temperatures and coolant densities and to study the correctness and effectiveness of burnup calculations. Moreover, other contemporary proposals for a full-core calculation with realistic geometry and material composition will be discussed.
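The history counts quoted above can be put in context with the standard Monte Carlo scaling of statistical error with the inverse square root of the number of histories. The sketch below is an assumption-laden illustration, not a result from the paper: it takes the abstract's figure (about 1 % at roughly 100 billion histories) as a reference point, assumes independent histories with no correlation between fission source cycles, and uses hypothetical function names.

```python
import math


def relative_error(n_histories, n_ref, err_ref):
    """Scale a reference relative error to a new history count.

    Assumes the error behaves as 1/sqrt(N), which ignores cycle-to-cycle
    correlations that can matter in real eigenvalue calculations.
    """
    return err_ref * math.sqrt(n_ref / n_histories)


def histories_for_target(err_target, n_ref, err_ref):
    """Histories needed to reach a target relative error under 1/sqrt(N) scaling."""
    return n_ref * (err_ref / err_target) ** 2


N_REF = 1.0e11    # about 100 billion histories, as quoted in the abstract
ERR_REF = 0.01    # 1 % statistical accuracy at that history count

n_half_percent = histories_for_target(0.005, N_REF, ERR_REF)
err_at_one_billion = relative_error(1.0e9, N_REF, ERR_REF)

print(f"histories for 0.5 %: {n_half_percent:.2e}")
print(f"relative error at 1e9 histories: {err_at_one_billion:.1%}")
```

Under these assumptions, halving the error quadruples the required histories, which is why the paper's question of parallel efficiency on large core counts is central to the feasibility of full-core calculations.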
Benchmark Study of Global Clean Energy Manufacturing | Advanced

Science.gov Websites

Manufacturing Research | NREL Benchmark Study of Global Clean Energy Manufacturing Benchmark Study of Global Clean Energy Manufacturing Through a first-of-its-kind benchmark study, the Clean Energy Technology End Product.' The study examined four clean energy technologies: wind turbine components
Benchmarking: contexts and details matter.

PubMed

Zheng, Siyuan

2017-07-05

Benchmarking is an essential step in the development of computational tools. We take this opportunity to pitch in our opinions on tool benchmarking, in light of two correspondence articles published in Genome Biology. Please see related Li et al. and Newman et al. correspondence articles: www.dx.doi.org/10.1186/s13059-017-1256-5 and www.dx.doi.org/10.1186/s13059-017-1257-4.
Benchmarking a Visual-Basic based multi-component one-dimensional reactive transport modeling tool

NASA Astrophysics Data System (ADS)

Torlapati, Jagadish; Prabhakar Clement, T.

2013-01-01

We present the details of a comprehensive numerical modeling tool, RT1D, which can be used for simulating biochemical and geochemical reactive transport problems. The code can be run within the standard Microsoft EXCEL Visual Basic platform, and it does not require any additional software tools. The code can be easily adapted by others for simulating different types of laboratory-scale reactive transport experiments. We illustrate the capabilities of the tool by solving five benchmark problems with varying levels of reaction complexity. These literature-derived benchmarks are used to highlight the versatility of the code for solving a variety of practical reactive transport problems. The benchmarks are described in detail to provide a comprehensive database, which can be used by model developers to test other numerical codes. The VBA code presented in the study is a practical tool that can be used by laboratory researchers for analyzing both batch and column datasets within an EXCEL platform.
Diagnostic Algorithm Benchmarking

NASA Technical Reports Server (NTRS)

Poll, Scott

2011-01-01

A poster for the NASA Aviation Safety Program Annual Technical Meeting. It describes empirical benchmarking on diagnostic algorithms using data from the ADAPT Electrical Power System testbed and a diagnostic software framework.
Numerical modelling of gravel unconstrained flow experiments with the DAN3D and RASH3D codes

NASA Astrophysics Data System (ADS)

Sauthier, Claire; Pirulli, Marina; Pisani, Gabriele; Scavia, Claudio; Labiouse, Vincent

2015-12-01

Landslide continuum dynamic models have improved considerably in recent years, but a consensus on the best method of calibrating the input resistance parameter values for predictive analyses has not yet emerged. In the present paper, numerical simulations of a series of laboratory experiments performed at the Laboratory for Rock Mechanics of the EPF Lausanne were undertaken with the RASH3D and DAN3D numerical codes. They aimed at analysing the possibility of using calibrated ranges of parameters (1) in a code different from the one in which they were obtained and (2) to simulate potential-events made of a material with the same characteristics as back-analysed past-events, but involving a different volume and propagation path. For this purpose, one of the four benchmark laboratory tests was used as past-event to calibrate the dynamic basal friction angle assuming a Coulomb-type behaviour of the sliding mass, and this back-analysed value was then used to simulate the three other experiments, assumed as potential-events. The computational findings show good correspondence with experimental results in terms of characteristics of the final deposits (i.e., runout, length and width). Furthermore, the obtained best fit values of the dynamic basal friction angle for the two codes turn out to be close to each other and within the range of values measured with pseudo-dynamic tilting tests.
Payer leverage and hospital compliance with a benchmark: a population-based observational study

PubMed Central

Hollingsworth, John M; Krein, Sarah L; Miller, David C; DeMonner, Sonya; Hollenbeck, Brent K

2007-01-01

Background: Since 1976, Medicare has linked reimbursement for hospitals performing organ transplants to the attainment of certain benchmarks, including transplant volume. While Medicare is a stakeholder in all transplant services, its role in renal transplantation is likely greater, given its coverage of end-stage renal disease. Thus, Medicare's transplant experience allows us to examine the role of payer leverage in motivating hospital benchmark compliance. Methods: Nationally representative discharge data for kidney (n = 29,272), liver (n = 7,988), heart (n = 3,530), and lung (n = 1,880) transplants from the Nationwide Inpatient Sample (1993-2003) were employed. Logistic regression techniques with robust variance estimators were used to examine the relationship between hospital volume compliance and Medicare market share; generalized estimating equations were used to explore the association between patient-level operative mortality and hospital volume compliance. Results: Medicare's transplant market share varied by organ [57%, 28%, 27%, and 18% for kidney, lung, heart, and liver transplants, respectively (P < 0.001)]. Volume-based benchmark compliance varied by transplant type [85%, 75%, 44%, and 39% for kidney, liver, heart, and lung transplants, respectively (P < 0.001)], despite lower odds of operative mortality at compliant hospitals. Adjusting for organ supply, high market leverage was independently associated with compliance at hospitals transplanting kidneys (OR, 143.00; 95% CI, 18.53-1103.49), hearts (OR, 2.84; 95% CI, 1.51-5.34), and lungs (OR, 3.24; 95% CI, 1.57-6.67). Conclusion: These data highlight the influence of payer leverage, an important contextual factor in value-based purchasing initiatives. For uncommon diagnoses, these data suggest that at least 30% of a provider's patients might need to be "at risk" for an incentive to motivate compliance. PMID:17640364
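The reported odds ratios can be read back onto the log-odds scale on which logistic regression works. The sketch below assumes the intervals are symmetric Wald-type intervals on that scale with z = 1.96, which the abstract does not state; the function name is hypothetical. Under that assumption, the geometric midpoint of each interval should reproduce the reported odds ratio.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (coefficient, approximate standard error, geometric CI midpoint)."""
    beta = math.log(odds_ratio)
    # Assumes ci = exp(beta +/- z * se), i.e. symmetric on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    midpoint = math.exp((math.log(ci_low) + math.log(ci_high)) / 2.0)
    return beta, se, midpoint


# Values as reported in the abstract: (OR, lower 95% limit, upper 95% limit).
reported = {
    "kidney": (143.00, 18.53, 1103.49),
    "heart": (2.84, 1.51, 5.34),
    "lung": (3.24, 1.57, 6.67),
}

for organ, (or_, low, high) in reported.items():
    beta, se, midpoint = log_odds_summary(or_, low, high)
    print(f"{organ}: beta={beta:.3f}, se={se:.3f}, midpoint={midpoint:.2f}")
```

The very wide kidney interval corresponds to a large standard error on the log scale, which is worth keeping in mind when reading an odds ratio of 143.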
MoMaS reactive transport benchmark using PFLOTRAN

NASA Astrophysics Data System (ADS)

Park, H.

2017-12-01

The MoMaS benchmark was developed to enhance numerical simulation capability for reactive transport modeling in porous media. The benchmark was published in late September of 2009; it is not taken from a real chemical system but consists of realistic and numerically challenging tests. PFLOTRAN is a state-of-the-art, massively parallel subsurface flow and reactive transport code that is being used in multiple nuclear waste repository projects at Sandia National Laboratories, including the Waste Isolation Pilot Plant and Used Fuel Disposition. The MoMaS benchmark has three independent tests with easy, medium, and hard chemical complexity. This paper demonstrates how PFLOTRAN is applied to this benchmark exercise and shows results of the easy benchmark test case, which includes mixing of aqueous components and surface complexation. The surface complexation consists of monodentate and bidentate reactions, which introduce difficulty in defining the selectivity coefficient if the reaction applies to a bulk reference volume. The selectivity coefficient becomes porosity dependent for the bidentate reaction in heterogeneous porous media. The benchmark is solved by PFLOTRAN with minimal modification to address this issue, and unit conversions were made to suit PFLOTRAN.

Benchmarking and Threshold Standards in Higher Education. Staff and Educational Development Series.

ERIC Educational Resources Information Center

Smith, Helen, Ed.; Armstrong, Michael, Ed.; Brown, Sally, Ed.

This book explores the issues involved in developing standards in higher education, examining the practical issues involved in benchmarking and offering a critical analysis of the problems associated with this developmental tool. The book focuses primarily on experience in the United Kingdom (UK), but looks also at international activity in this…
International E-Benchmarking: Flexible Peer Development of Authentic Learning Principles in Higher Education

ERIC Educational Resources Information Center

Leppisaari, Irja; Vainio, Leena; Herrington, Jan; Im, Yeonwook

2011-01-01

More and more, social technologies and virtual work methods are facilitating new ways of crossing boundaries in professional development and international collaborations. This paper examines the peer development of higher education teachers through the experiences of the IVBM project (International Virtual Benchmarking, 2009-2010). The…
Benchmarking hypercube hardware and software

NASA Technical Reports Server (NTRS)

Grunwald, Dirk C.; Reed, Daniel A.

1986-01-01

It was long a truism in computer systems design that balanced systems achieve the best performance. Message passing parallel processors are no different. To quantify the balance of a hypercube design, an experimental methodology was developed and the associated suite of benchmarks was applied to several existing hypercubes. The benchmark suite includes tests of both processor speed in the absence of internode communication and message transmission speed as a function of communication patterns.
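To make "communication patterns" concrete: in a d-dimensional hypercube, nodes can be labelled with d-bit integers, and two nodes are directly linked exactly when their labels differ in one bit. The sketch below uses hypothetical helper names and is not taken from the benchmark suite. It lists a node's neighbours and counts the minimum number of hops a message must travel between two nodes.

```python
def neighbors(node, dim):
    """Labels of the nodes directly linked to `node` in a dim-dimensional hypercube."""
    return [node ^ (1 << k) for k in range(dim)]


def hops(a, b):
    """Minimum number of links between nodes a and b (Hamming distance of labels)."""
    return bin(a ^ b).count("1")


dim = 4
nodes = range(1 << dim)

# Two example patterns: nearest-neighbour exchange and bit-complement exchange.
nearest = max(hops(n, n ^ 1) for n in nodes)
complement = max(hops(n, n ^ ((1 << dim) - 1)) for n in nodes)
```

A nearest-neighbour pattern costs one hop per message, while the bit-complement pattern forces every message across all `dim` dimensions, so the two stress the interconnect very differently.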
Overview of Experiments for Physics of Fast Reactors from the International Handbooks of Evaluated Criticality Safety Benchmark Experiments and Evaluated Reactor Physics Benchmark Experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, J. D.; Briggs, J. B.; Gulliford, J.

Overview of Experiments to Study the Physics of Fast Reactors Represented in the International Directories of Critical and Reactor Experiments. John D. Bess, Idaho National Laboratory; Jim Gulliford, Tatiana Ivanova, Nuclear Energy Agency of the Organisation for Economic Cooperation and Development; E.V. Rozhikhin, M.Yu. Semenov, A.M. Tsibulya, Institute of Physics and Power Engineering. The study of the physics of fast reactors has traditionally used the experiments presented in the handbook of the Cross Section Evaluation Working Group, CSEWG (ENDF-202), issued by the Brookhaven National Laboratory in 1974. This handbook presents simplified homogeneous models of experiments with relevant experimental data, as amended. The Nuclear Energy Agency of the Organisation for Economic Cooperation and Development coordinates the activities of two international projects on the collection, evaluation and documentation of experimental data: the International Project on the assessment of critical experiments (since 1994) and the International Project on the assessment of reactor experiments (since 2005). The results of these projects are two international directories, updated every year, of critical (ICSBEP Handbook) and reactor (IRPhEP Handbook) experiments. The handbooks present detailed models of experiments with minimal amendments. Such models are of particular interest for calculations with modern codes. The directories contain a large number of experiments that are suitable for the study of the physics of fast reactors. Many of these experiments were performed at specialized critical facilities, such as BFS (Russia), ZPR and ZPPR (USA), and ZEBRA (UK), and at the experimental reactors JOYO (Japan) and FFTF (USA). Other experiments, such as compact metal assemblies, are also of interest in terms of the physics of fast reactors; they have been carried out at general-purpose critical facilities in Russian institutes (VNIITF and VNIIEF) and in the US (LANL, LLNL, and others). Also worth
Benchmarking for Excellence and the Nursing Process

NASA Technical Reports Server (NTRS)

Sleboda, Claire

1999-01-01

Nursing is a service profession. The services provided are essential to life and welfare. Therefore, setting the benchmark for high quality care is fundamental. Exploring the definition of a benchmark value will help to determine a best practice approach. A benchmark is the descriptive statement of a desired level of performance against which quality can be judged. It must be sufficiently well understood by managers and personnel in order that it may serve as a standard against which to measure value.
Toward Scalable Benchmarks for Mass Storage Systems

NASA Technical Reports Server (NTRS)

Miller, Ethan L.

1996-01-01

This paper presents guidelines for the design of a mass storage system benchmark suite, along with preliminary suggestions for programs to be included. The benchmarks will measure both peak and sustained performance of the system as well as predicting both short- and long-term behavior. These benchmarks should be both portable and scalable so they may be used on storage systems from tens of gigabytes to petabytes or more. By developing a standard set of benchmarks that reflect real user workload, we hope to encourage system designers and users to publish performance figures that can be compared with those of other systems. This will allow users to choose the system that best meets their needs and give designers a tool with which they can measure the performance effects of improvements to their systems.
NASA Software Engineering Benchmarking Study

NASA Technical Reports Server (NTRS)

Rarick, Heather L.; Godfrey, Sara H.; Kelly, John C.; Crumbley, Robert T.; Wilf, Joel M.

2013-01-01

was its software assurance practices, which seemed to rate well in comparison to the other organizational groups and also seemed to include a larger scope of activities. An unexpected benefit of the software benchmarking study was the identification of many opportunities for collaboration in areas including metrics, training, sharing of CMMI experiences and resources such as instructors and CMMI Lead Appraisers, and even sharing of assets such as documented processes. A further unexpected benefit of the study was the feedback on NASA practices that was received from some of the organizations interviewed. From that feedback, other potential areas where NASA could improve were highlighted, such as accuracy of software cost estimation and budgetary practices. The detailed report contains discussion of the practices noted in each of the topic areas, as well as a summary of observations and recommendations from each of the topic areas. The resulting 24 recommendations from the topic areas were then consolidated to eliminate duplication and culled into a set of 14 suggested actionable recommendations. This final set of actionable recommendations, listed below, are items that can be implemented to improve NASA's software engineering practices and to help address many of the items that were listed in the NASA top software engineering issues. 1. Develop and implement standard contract language for software procurements. 2. Advance accurate and trusted software cost estimates for both procured and in-house software and improve the capture of actual cost data to facilitate further improvements. 3. Establish a consistent set of objectives and expectations, specifically types of metrics at the Agency level, so key trends and models can be identified and used to continuously improve software processes and each software development effort. 4. Maintain the CMMI Maturity Level requirement for critical NASA projects and use CMMI to measure organizations developing software for NASA. 
5
Benchmarking short sequence mapping tools

PubMed Central

2013-01-01

Background: The development of next-generation sequencing instruments has led to the generation of millions of short sequences in a single run. The process of aligning these reads to a reference genome is time consuming and demands the development of fast and accurate alignment tools. However, the currently proposed tools make different compromises between the accuracy and the speed of mapping. Moreover, many important aspects are overlooked when comparing the performance of a newly developed tool to the state of the art. Therefore, there is a need for an objective evaluation method that covers all the aspects. In this work, we introduce a benchmarking suite to extensively analyze sequencing tools with respect to various aspects and provide an objective comparison. Results: We applied our benchmarking tests to 9 well-known mapping tools, namely Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST), using synthetic data and real RNA-Seq data. MAQ and RMAP are based on building hash tables for the reads, whereas the remaining tools are based on indexing the reference genome. The benchmarking tests reveal the strengths and weaknesses of each tool. The results show that no single tool outperforms all others in all metrics. However, Bowtie maintained the best throughput for most of the tests, while BWA performed better for longer read lengths. The benchmarking tests are not restricted to the mentioned tools and can be further applied to others. Conclusion: The mapping process is still a hard problem that is affected by many factors. In this work, we provided a benchmarking suite that reveals and evaluates the different factors affecting the mapping process. Still, there is no tool that outperforms all of the others in all the tests. Therefore, end users should clearly specify their needs in order to choose the tool that provides the best results. PMID:23758764
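The "index the reference genome" strategy mentioned above can be illustrated with a toy exact-match mapper. The sketch below builds a plain k-mer dictionary over the reference and uses the first k bases of each read as a seed. It is an illustrative assumption only: the function names are hypothetical, and the benchmarked tools use far more compact index structures and tolerate mismatches.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every length-k substring of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read, reference, index, k):
    """Return reference positions (ascending) where the read matches exactly."""
    if len(read) < k:
        return []
    hits = []
    # Seed with the read's first k-mer, then verify the full read at each hit.
    for pos in index.get(read[:k], []):
        if reference[pos:pos + len(read)] == read:
            hits.append(pos)
    return hits


reference = "ACGTACGTTAGC"
k = 4
index = build_kmer_index(reference, k)
for read in ("ACGTT", "TAGC", "GGGG"):
    print(read, map_read(read, reference, index, k))
```

Even in this toy form the trade-off discussed in the abstract is visible: a smaller k gives more candidate positions to verify (slower, more sensitive), and a larger k gives fewer (faster, less tolerant of errors near the seed).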
Benchmarking on Tsunami Currents with ComMIT

NASA Astrophysics Data System (ADS)

Sharghi vand, N.; Kanoglu, U.

2015-12-01

There were no standards for the validation and verification of tsunami numerical models before the 2004 Indian Ocean tsunami. Even so, a number of numerical models have been used for inundation mapping efforts, evaluation of critical structures, etc., without validation and verification. After 2004, the NOAA Center for Tsunami Research (NCTR) established standards for the validation and verification of tsunami numerical models (Synolakis et al. 2008 Pure Appl. Geophys. 165, 2197-2228), which will be used in the evaluation of critical structures such as nuclear power plants against tsunami attack. NCTR presented analytical, experimental and field benchmark problems aimed at estimating maximum runup, and these were widely accepted by the community. Recently, benchmark problems were suggested by the US National Tsunami Hazard Mitigation Program Mapping & Modeling Benchmarking Workshop: Tsunami Currents, held on February 9-10, 2015 in Portland, Oregon, USA (http://nws.weather.gov/nthmp/index.html). These benchmark problems concentrate on the validation and verification of tsunami numerical models with respect to tsunami currents. Three of the benchmark problems were: current measurements of the Japan 2011 tsunami in Hilo Harbor, Hawaii, USA, and in Tauranga Harbor, New Zealand, and a single long-period wave propagating onto a small-scale experimental model of the town of Seaside, Oregon, USA. These benchmark problems were implemented in the Community Modeling Interface for Tsunamis (ComMIT) (Titov et al. 2011 Pure Appl. Geophys. 168, 2121-2131), which is a user-friendly interface to the validated and verified Method of Splitting Tsunami (MOST) model (Titov and Synolakis 1995 J. Waterw. Port Coastal Ocean Eng. 121, 308-316) and was developed by NCTR. The modeling results are compared with the required benchmark data and show good agreement; the results are discussed. Acknowledgment: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant
Benchmarking child and adolescent mental health organizations.

PubMed

Brann, Peter; Walter, Garry; Coombs, Tim

2011-04-01

This paper describes aspects of the child and adolescent benchmarking forums that were part of the National Mental Health Benchmarking Project (NMHBP). These forums enabled participating child and adolescent mental health organizations to benchmark themselves against each other, with a view to understanding variability in performance against a range of key performance indicators (KPIs). Six child and adolescent mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against relevant KPIs. They also undertook two special projects designed to help them understand the variation in performance on given KPIs. There was considerable inter-organization variability on many of the KPIs. Even within organizations, there was often substantial variability over time. The variability in indicator data raised many questions for participants. This challenged participants to better understand and describe their local processes, prompted them to collect additional data, and stimulated them to make organizational comparisons. These activities fed into a process of reflection about their performance. Benchmarking has the potential to illuminate intra- and inter-organizational performance in the child and adolescent context.
SeSBench - An initiative to benchmark reactive transport models for environmental subsurface processes

NASA Astrophysics Data System (ADS)

Jacques, Diederik

2017-04-01

As soil functions are governed by a multitude of interacting hydrological, geochemical and biological processes, simulation tools coupling mathematical models for interacting processes are needed. Coupled reactive transport models are a typical example of such coupled tools, mainly focusing on hydrological and geochemical coupling (see e.g. Steefel et al., 2015). The mathematical and numerical complexity of both the tool itself and the specific conceptual model can increase rapidly. Therefore, numerical verification of such models is a prerequisite for guaranteeing reliability and confidence and for qualifying simulation tools and approaches for any further model application. In 2011, a first SeSBench (Subsurface Environmental Simulation Benchmarking) workshop was held in Berkeley (USA), followed by four others. The objective is to benchmark subsurface environmental simulation models and methods, with a current focus on reactive transport processes. The final outcome was a special issue in Computational Geosciences (2015, issue 3 - Reactive transport benchmarks for subsurface environmental simulation) with a collection of 11 benchmarks. Benchmarks, proposed by the participants of the workshops, should be relevant for environmental or geo-engineering applications; the latter were mostly related to radioactive waste disposal issues, excluding benchmarks defined for purely mathematical reasons. Another important feature is the tiered approach within a benchmark, with the definition of a single principal problem and different sub-problems. The latter typically benchmarked individual or simplified processes (e.g. inert solute transport, simplified geochemical conceptual model) or geometries (e.g. batch or one-dimensional, homogeneous). Finally, three codes should be involved in a benchmark. The SeSBench initiative contributes to confidence building for applying reactive transport codes. Furthermore, it illustrates the use of this type of model for different
Principles for Developing Benchmark Criteria for Staff Training in Responsible Gambling.

PubMed

Oehler, Stefan; Banzer, Raphaela; Gruenerbl, Agnes; Malischnig, Doris; Griffiths, Mark D; Haring, Christian

2017-03-01

One approach to minimizing the negative consequences of excessive gambling is staff training to reduce the rate of development of new cases of harm or disorder among their customers. The primary goal of the present study was to assess suitable benchmark criteria for the training of gambling employees at casinos and lottery retailers. The study utilised the Delphi Method, a survey with one qualitative and two quantitative phases. A total of 21 invited international experts in the responsible gambling field participated in all three phases. A total of 75 performance indicators were outlined and assigned to six categories: (1) criteria of content, (2) modelling, (3) qualification of trainer, (4) framework conditions, (5) sustainability and (6) statistical indicators. Nine of the 75 indicators were rated as very important by 90% or more of the experts. Unanimous support for importance was given to indicators such as (1) comprehensibility and (2) concrete action-guidance for dealing with problem gamblers. Additionally, the study examined the implementation of benchmarking, when it should be conducted, and who should be responsible. Results indicated that benchmarking should be conducted regularly, every 1-2 years, and that one institution should be clearly defined as primarily responsible for benchmarking. The results of the present study provide the basis for developing benchmarking for staff training in responsible gambling.
42 CFR 457.430 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 42 Public Health 4 2011-10-01 2011-10-01 false Benchmark-equivalent health benefits coverage. 457... STATES State Plan Requirements: Coverage and Benefits § 457.430 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has...
42 CFR 457.430 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 42 Public Health 4 2013-10-01 2013-10-01 false Benchmark-equivalent health benefits coverage. 457... STATES State Plan Requirements: Coverage and Benefits § 457.430 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has...
42 CFR 457.430 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 42 Public Health 4 2010-10-01 2010-10-01 false Benchmark-equivalent health benefits coverage. 457... STATES State Plan Requirements: Coverage and Benefits § 457.430 Benchmark-equivalent health benefits coverage. (a) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has...
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 42 Public Health 4 2012-10-01 2012-10-01 false Benchmark-equivalent health benefits coverage. 440.335 Section 440.335 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a...
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 42 Public Health 4 2014-10-01 2014-10-01 false Benchmark-equivalent health benefits coverage. 440.335 Section 440.335 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND... and Benchmark-Equivalent Coverage § 440.335 Benchmark-equivalent health benefits coverage. (a...
Results of the Australasian (Trans-Tasman Oncology Group) radiotherapy benchmarking exercise in preparation for participation in the PORTEC-3 trial.

PubMed

Jameson, Michael G; McNamara, Jo; Bailey, Michael; Metcalfe, Peter E; Holloway, Lois C; Foo, Kerwyn; Do, Viet; Mileshkin, Linda; Creutzberg, Carien L; Khaw, Pearly

2016-08-01

Protocol deviations in Randomised Controlled Trials have been found to result in a significant decrease in survival and local control. In some cases, the magnitude of the detrimental effect can be larger than the anticipated benefits of the interventions involved. The implementation of appropriate quality assurance of radiotherapy measures for clinical trials has been found to result in fewer deviations from protocol. This paper reports on a benchmarking study conducted in preparation for the PORTEC-3 trial in Australasia. A benchmarking CT dataset was sent to each of the Australasian investigators, who were asked to contour and plan the case according to the trial protocol using local treatment planning systems. These data were then sent back to the Trans-Tasman Oncology Group for collation and analysis. Thirty-three investigators from eighteen institutions across Australia and New Zealand took part in the study. The mean clinical target volume (CTV) was 383.4 (228.5-497.8) cm³ and the mean dose to a reference gold standard CTV was 48.8 (46.4-50.3) Gy. Although there were some large differences in the contouring of the CTV and its constituent parts, these did not translate into large variations in dosimetry. Where individual investigators had deviations from the trial contouring protocol, feedback was provided. The results of this study will be compared with the international study QA for the PORTEC-3 trial. © 2016 The Royal Australian and New Zealand College of Radiologists.
Clear, Complete, and Justified Problem Formulations for Aquatic Life Benchmark Values: Specifying the Dimensions

EPA Science Inventory

Nations that develop water quality benchmark values have relied primarily on standard data and methods. However, experience with chemicals such as Se, ammonia, and tributyltin has shown that standard methods do not adequately address some taxa, modes of exposure and effects. Deve...
Taking the Battle Upstream: Towards a Benchmarking Role for NATO

DTIC Science & Technology

2012-09-01

Benchmark ... 14 Figure 8. World Bank Benchmarking Work on Quality ... Search of a Benchmarking Theory for the Public Sector.” 16 Figure 8. World Bank Benchmarking Work on Quality of Governance One of the most ... the Ministries of Defense in the countries in which it works). Another interesting innovation is that for comparison purposes, McKinsey categorized
Human Health Benchmarks for Pesticides

EPA Pesticide Factsheets

Advanced testing methods now allow pesticides to be detected in water at very low levels. These small amounts of pesticides detected in drinking water or source water for drinking water do not necessarily indicate a health risk. The EPA has developed human health benchmarks for 363 pesticides to enable our partners to better determine whether the detection of a pesticide in drinking water or source waters for drinking water may indicate a potential health risk and to help them prioritize monitoring efforts. The table below includes benchmarks for acute (one-day) and chronic (lifetime) exposures for the most sensitive populations from exposure to pesticides that may be found in surface or ground water sources of drinking water. The table also includes benchmarks for 40 pesticides in drinking water that have the potential for cancer risk. The HHBP table includes pesticide active ingredients for which Health Advisories or enforceable National Primary Drinking Water Regulations (e.g., maximum contaminant levels) have not been developed.
Evaluation of control strategies using an oxidation ditch benchmark.

PubMed

Abusam, A; Keesman, K J; Spanjers, H; van, Straten G; Meinema, K

2002-01-01

This paper presents validation and implementation results of a benchmark developed for a specific full-scale oxidation ditch wastewater treatment plant (WWTP). A benchmark is a standard simulation procedure that can be used as a tool in evaluating various control strategies proposed for wastewater treatment plants. It is based on model and performance criteria development. Testing of this benchmark, by comparing benchmark predictions to real measurements of the electrical energy consumption and amounts of disposed sludge for a specific oxidation ditch WWTP, has shown that it can (reasonably) be used for evaluating the performance of this WWTP. The validated benchmark was then used to evaluate some basic and advanced control strategies. Some of the interesting results obtained are the following: (i) the influent flow splitting ratio between the first and the fourth aerated compartments of the ditch has no significant effect on the TN concentrations in the effluent, and (ii) for evaluation of long-term control strategies, future benchmarks need to be able to assess settlers' performance.
Benchmarking to improve the quality of cystic fibrosis care.

PubMed

Schechter, Michael S

2012-11-01

Benchmarking involves the ascertainment of healthcare programs with most favorable outcomes as a means to identify and spread effective strategies for delivery of care. The recent interest in the development of patient registries for patients with cystic fibrosis (CF) has been fueled in part by an interest in using them to facilitate benchmarking. This review summarizes reports of how benchmarking has been operationalized in attempts to improve CF care. Although certain goals of benchmarking can be accomplished with an exclusive focus on registry data analysis, benchmarking programs in Germany and the United States have supplemented these data analyses with exploratory interactions and discussions to better understand successful approaches to care and encourage their spread throughout the care network. Benchmarking allows the discovery and facilitates the spread of effective approaches to care. It provides a pragmatic alternative to traditional research methods such as randomized controlled trials, providing insights into methods that optimize delivery of care and allowing judgments about the relative effectiveness of different therapeutic approaches.
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2014 CFR

2014-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2012 CFR

2012-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2011 CFR

2011-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2010 CFR

2010-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
29 CFR 1952.153 - Compliance staffing benchmarks.

Code of Federal Regulations, 2013 CFR

2013-07-01

... further revision of its benchmarks to 64 safety inspectors and 50 industrial hygienists. After opportunity... Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH ADMINISTRATION... benchmarks of 50 safety and 27 health compliance officers. After opportunity for public comment and service...
Benchmark Evaluation of Fuel Effect and Material Worth Measurements for a Beryllium-Reflected Space Reactor Mockup

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marshall, Margaret A.; Bess, John D.

2015-02-01

The critical configuration of the small, compact critical assembly (SCCA) experiments performed at the Oak Ridge Critical Experiments Facility (ORCEF) in 1962-1965 have been evaluated as acceptable benchmark experiments for inclusion in the International Handbook of Evaluated Criticality Safety Benchmark Experiments. The initial intent of these experiments was to support the design of the Medium Power Reactor Experiment (MPRE) program, whose purpose was to study “power plants for the production of electrical power in space vehicles.” The third configuration in this series of experiments was a beryllium-reflected assembly of stainless-steel-clad, highly enriched uranium (HEU)-O2 fuel mockup of a potassium-cooled space power reactor. Reactivity measurements, cadmium ratio spectral measurements, and fission rate measurements were made through the core and top reflector. Fuel effect worth measurements and neutron moderating and absorbing material worths were also measured in the assembly fuel region. The cadmium ratios, fission rate, and worth measurements were evaluated for inclusion in the International Handbook of Evaluated Criticality Safety Benchmark Experiments. The fuel tube effect and neutron moderating and absorbing material worth measurements are the focus of this paper. Additionally, a measurement of the worth of potassium filling the core region was performed but has not yet been evaluated. Pellets of 93.15 wt.% enriched uranium dioxide (UO2) were stacked in 30.48 cm tall stainless steel fuel tubes (0.3 cm tall end caps). Each fuel tube had 26 pellets with a total mass of 295.8 g UO2 per tube. A total of 253 tubes were arranged in a 1.506-cm triangular lattice. An additional 7-tube cluster critical configuration was also measured but not used for any physics measurements. The core was surrounded on all sides by a beryllium reflector. The fuel effect worths were measured by removing fuel tubes at various radii. An accident scenario was also simulated by
Python/Lua Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Busby, L.

This is an adaptation of the pre-existing Scimark benchmark code to a variety of Python and Lua implementations. It also measures performance of the Fparser expression parser and C and C++ code on a variety of simple scientific expressions.
Treatment planning for spinal radiosurgery: A competitive multiplatform benchmark challenge.

PubMed

Moustakis, Christos; Chan, Mark K H; Kim, Jinkoo; Nilsson, Joakim; Bergman, Alanah; Bichay, Tewfik J; Palazon Cano, Isabel; Cilla, Savino; Deodato, Francesco; Doro, Raffaela; Dunst, Jürgen; Eich, Hans Theodor; Fau, Pierre; Fong, Ming; Haverkamp, Uwe; Heinze, Simon; Hildebrandt, Guido; Imhoff, Detlef; de Klerck, Erik; Köhn, Janett; Lambrecht, Ulrike; Loutfi-Krauss, Britta; Ebrahimi, Fatemeh; Masi, Laura; Mayville, Alan H; Mestrovic, Ante; Milder, Maaike; Morganti, Alessio G; Rades, Dirk; Ramm, Ulla; Rödel, Claus; Siebert, Frank-Andre; den Toom, Wilhelm; Wang, Lei; Wurster, Stefan; Schweikard, Achim; Soltys, Scott G; Ryu, Samuel; Blanck, Oliver

2018-05-25

To investigate the quality of treatment plans of spinal radiosurgery derived from different planning and delivery systems. The comparisons include robotic delivery and intensity modulated arc therapy (IMAT) approaches. Multiple centers with equal systems were used to reduce a bias based on individual's planning abilities. The study used a series of three complex spine lesions to maximize the difference in plan quality among the various approaches. Internationally recognized experts in the field of treatment planning and spinal radiosurgery from 12 centers with various treatment planning systems participated. For a complex spinal lesion, the results were compared against a previously published benchmark plan derived for CyberKnife radiosurgery (CKRS) using circular cones only. For two additional cases, one with multiple small lesions infiltrating three vertebrae and a single vertebra lesion treated with integrated boost, the results were compared against a benchmark plan generated using a best practice guideline for CKRS. All plans were rated based on a previously established ranking system. All 12 centers could reach equality (n = 4) or outperform (n = 8) the benchmark plan. For the multiple lesions and the single vertebra lesion plan only 5 and 3 of the 12 centers, respectively, reached equality or outperformed the best practice benchmark plan. However, the absolute differences in target and critical structure dosimetry were small and strongly planner-dependent rather than system-dependent. Overall, gantry-based IMAT with simple planning techniques (two coplanar arcs) produced faster treatments and significantly outperformed static gantry intensity modulated radiation therapy (IMRT) and multileaf collimator (MLC) or non-MLC CKRS treatment plan quality regardless of the system (mean rank out of 4 was 1.2 vs. 3.1, p = 0.002). High plan quality for complex spinal radiosurgery was achieved among all systems and all participating centers in this
An international land-biosphere model benchmarking activity for the IPCC Fifth Assessment Report (AR5)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoffman, Forrest M; Randerson, James T; Thornton, Peter E

2009-12-01

The need to capture important climate feedbacks in general circulation models (GCMs) has resulted in efforts to include atmospheric chemistry and land and ocean biogeochemistry into the next generation of production climate models, called Earth System Models (ESMs). While many terrestrial and ocean carbon models have been coupled to GCMs, recent work has shown that such models can yield a wide range of results (Friedlingstein et al., 2006). This work suggests that a more rigorous set of global offline and partially coupled experiments, along with detailed analyses of processes and comparisons with measurements, is needed. The Carbon-Land Model Intercomparison Project (C-LAMP) was designed to meet this need by providing a simulation protocol and model performance metrics based upon comparisons against best-available satellite- and ground-based measurements (Hoffman et al., 2007). Recently, a similar effort in Europe, called the International Land Model Benchmark (ILAMB) Project, was begun to assess the performance of European land surface models. These two projects will now serve as prototypes for a proposed international land-biosphere model benchmarking activity for those models participating in the IPCC Fifth Assessment Report (AR5). Initially used for model validation for terrestrial biogeochemistry models in the NCAR Community Land Model (CLM), C-LAMP incorporates a simulation protocol for both offline and partially coupled simulations using a prescribed historical trajectory of atmospheric CO2 concentrations. Models are confronted with data through comparisons against AmeriFlux site measurements, MODIS satellite observations, NOAA Globalview flask records, TRANSCOM inversions, and Free Air CO2 Enrichment (FACE) site measurements. Both sets of experiments have been performed using two different terrestrial biogeochemistry modules coupled to the CLM version 3 in the Community Climate System Model version 3 (CCSM3): the CASA model of Fung, et al., and
47 CFR 54.805 - Zone and study area above benchmark revenues calculated by the Administrator.

Code of Federal Regulations, 2010 CFR

2010-10-01

... Period Residential and Single-Line Business Lines times 12. If negative, the Zone Above Benchmark...) multiplied by all eligible telecommunications carrier zone Base Period Multi-line Business Lines times 12. If... 47 Telecommunication 3 2010-10-01 2010-10-01 false Zone and study area above benchmark revenues...
Quality Assurance Testing of Version 1.3 of U.S. EPA Benchmark Dose Software (Presentation)

EPA Science Inventory

EPA benchmark dose software (BMDS) is used to evaluate chemical dose-response data in support of Agency risk assessments, and must therefore be dependable. Quality assurance testing methods developed for BMDS were designed to assess model dependability with respect to curve-fitt...
Toxicological benchmarks for potential contaminants of concern for effects on soil and litter invertebrates and heterotrophic process

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.; Suter, G.W. II

1995-09-01

An important step in ecological risk assessments is screening the chemicals occurring on a site for contaminants of potential concern. Screening may be accomplished by comparing reported ambient concentrations to a set of toxicological benchmarks. Multiple endpoints for assessing risks posed by soil-borne contaminants to organisms directly impacted by them have been established. This report presents benchmarks for soil invertebrates and microbial processes and addresses only chemicals found at United States Department of Energy (DOE) sites. No benchmarks for pesticides are presented. After discussing methods, this report presents the results of the literature review and benchmark derivation for toxicity to earthworms (Sect. 3), heterotrophic microbes and their processes (Sect. 4), and other invertebrates (Sect. 5). The final sections compare the benchmarks to other criteria and background and draw conclusions concerning the utility of the benchmarks.
Ontology for Semantic Data Integration in the Domain of IT Benchmarking.

PubMed

Pfaff, Matthias; Neubig, Stefan; Krcmar, Helmut

2018-01-01

A domain-specific ontology for IT benchmarking has been developed to bridge the gap between a systematic characterization of IT services and their data-based valuation. Since information is generally collected during a benchmark exercise using questionnaires on a broad range of topics, such as employee costs, software licensing costs, and quantities of hardware, it is commonly stored as natural language text; thus, this information is stored in an intrinsically unstructured form. Although these data form the basis for identifying potentials for IT cost reductions, neither a uniform description of any measured parameters nor the relationship between such parameters exists. Hence, this work proposes an ontology for the domain of IT benchmarking, available at https://w3id.org/bmontology. The design of this ontology is based on requirements mainly elicited from a domain analysis, which considers analyzing documents and interviews with representatives from Small- and Medium-Sized Enterprises and Information and Communications Technology companies over the last eight years. The development of the ontology and its main concepts is described in detail (i.e., the conceptualization of benchmarking events, questionnaires, IT services, indicators and their values) together with its alignment with the DOLCE-UltraLite foundational ontology.
Benchmark problems for numerical implementations of phase field models

DOE PAGES

Jokisaari, A. M.; Voorhees, P. W.; Guyer, J. E.; ...

2016-10-01

Here, we present the first set of benchmark problems for phase field models that are being developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST). While many scientific research areas use a limited set of well-established software, the growing phase field community continues to develop a wide variety of codes and lacks benchmark problems to consistently evaluate the numerical performance of new implementations. Phase field modeling has become significantly more popular as computational power has increased and is now becoming mainstream, driving the need for benchmark problems to validate and verify new implementations. We follow the example set by the micromagnetics community to develop an evolving set of benchmark problems that test the usability, computational resources, numerical capabilities and physical scope of phase field simulation codes. In this paper, we propose two benchmark problems that cover the physics of solute diffusion and growth and coarsening of a second phase via a simple spinodal decomposition model and a more complex Ostwald ripening model. We demonstrate the utility of benchmark problems by comparing the results of simulations performed with two different adaptive time stepping techniques, and we discuss the needs of future benchmark problems. The development of benchmark problems will enable the results of quantitative phase field models to be confidently incorporated into integrated computational materials science and engineering (ICME), an important goal of the Materials Genome Initiative.
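As a rough illustration of what a "simple spinodal decomposition model" involves, the following is a minimal sketch of a generic one-dimensional Cahn-Hilliard equation integrated with explicit Euler steps on a periodic grid. It is not the benchmark formulation from the paper: the double-well free energy, the grid size, and the parameter names (mobility, kappa, dx, dt) are all assumptions chosen for illustration. The two checks at the end (conserved total composition, decreasing free energy) are the kind of sanity tests a benchmark comparison might start from.

```python
import random


def laplacian(u, dx):
    """Second-order finite-difference Laplacian on a periodic 1D grid."""
    n = len(u)
    return [(u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx**2 for i in range(n)]


def free_energy(c, dx, kappa):
    """Discrete free energy: double-well bulk term plus gradient term (assumed form)."""
    n = len(c)
    bulk = sum(0.25 * (ci**2 - 1.0) ** 2 for ci in c)
    grad = sum(0.5 * kappa * ((c[(i + 1) % n] - c[i]) / dx) ** 2 for i in range(n))
    return (bulk + grad) * dx


def step(c, dx, dt, mobility, kappa):
    """One explicit Euler step of dc/dt = M * lap(c^3 - c - kappa * lap(c))."""
    lap_c = laplacian(c, dx)
    mu = [ci**3 - ci - kappa * li for ci, li in zip(c, lap_c)]  # chemical potential
    lap_mu = laplacian(mu, dx)
    return [ci + dt * mobility * lm for ci, lm in zip(c, lap_mu)]


# Illustrative parameters (assumptions, not values from the benchmark paper).
random.seed(0)
n, dx, dt, mobility, kappa = 64, 1.0, 0.01, 1.0, 1.0
c = [0.01 * (random.random() - 0.5) for _ in range(n)]  # small noise around c = 0

mass0, energy0 = sum(c), free_energy(c, dx, kappa)
for _ in range(2000):
    c = step(c, dx, dt, mobility, kappa)
mass1, energy1 = sum(c), free_energy(c, dx, kappa)

print("mass drift:", abs(mass1 - mass0))
print("energy before/after:", energy0, energy1)
```

The explicit scheme is used here only because it is short; its time step is limited by the fourth-order term, which is one reason adaptive or implicit time stepping is of interest for such problems.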
A benchmark initiative on mantle convection with melting and melt segregation

NASA Astrophysics Data System (ADS)

Schmeling, Harro; Dohmen, Janik; Wallner, Herbert; Noack, Lena; Tosi, Nicola; Plesa, Ana-Catalina; Maurice, Maxime

2015-04-01

formulation. Variations of cases 1-3 may be tested, particularly studying the effect of melt extraction. The motivation of this presentation is to summarize first experiences, suggest possible modifications of the case definitions and call interested modelers to join this benchmark exercise. References: Blankenbach, B., Busse, F., Christensen, U., Cserepes, L., Gunkel, D., Hansen, U., Harder, H., Jarvis, G., Koch, M., Marquart, G., Moore, D., Olson, P., and Schmeling, H., 1989: A benchmark comparison for mantle convection codes, J. Geophys., 98, 23-38. Schmeling, H., 2000: Partial melting and melt segregation in a convecting mantle. In: Physics and Chemistry of Partially Molten Rocks, eds. N. Bagdassarov, D. Laporte, and A.B. Thompson, Kluwer Academic Publ., Dordrecht, pp. 141-178.
Using Benchmarking To Influence Tuition and Fee Decisions.

ERIC Educational Resources Information Center

Hubbell, Loren W. Loomis; Massa, Robert J.; Lapovsky, Lucie

2002-01-01

Discusses the use of benchmarking in managing enrollment. Using a case study, illustrates how benchmarking can help administrators develop strategies for planning and implementing admissions and pricing practices. (EV)

Benchmark study on glyphosate-resistant crop systems in the United States. Part 2: Perspectives.

PubMed

Owen, Micheal D K; Young, Bryan G; Shaw, David R; Wilson, Robert G; Jordan, David L; Dixon, Philip M; Weller, Stephen C

2011-07-01

A six-state, 5-year field project was initiated in 2006 to study weed management methods that foster the sustainability of genetically engineered (GE) glyphosate-resistant (GR) crop systems. The benchmark study field-scale experiments were initiated following a survey, conducted in the winter of 2005-2006, of farmer opinions on weed management practices and their views on GR weeds and management tactics. The main survey findings supported the premise that growers were generally less aware of the significance of evolved herbicide resistance and did not have a high recognition of the strong selection pressure from herbicides on the evolution of herbicide-resistant (HR) weeds. The results of the benchmark study survey indicated that there are educational challenges to implement sustainable GR-based crop systems and helped guide the development of the field-scale benchmark study. Paramount is the need to develop consistent and clearly articulated science-based management recommendations that enable farmers to reduce the potential for HR weeds. This paper provides background perspectives about the use of GR crops, the impact of these crops and an overview of different opinions about the use of GR crops on agriculture and society, as well as defining how the benchmark study will address these issues. Copyright © 2011 Society of Chemical Industry.
Benchmarking comparison and validation of MCNP photon interaction data

NASA Astrophysics Data System (ADS)

Colling, Bethany; Kodeli, I.; Lilley, S.; Packer, L. W.

2017-09-01

The objective of the research was to test available photoatomic data libraries for fusion relevant applications, comparing against experimental and computational neutronics benchmarks. Photon flux and heating were compared using the photon interaction data libraries (mcplib 04p, 05t, 84p and 12p). Suitable benchmark experiments (iron and water) were selected from the SINBAD database and analysed to compare experimental values with MCNP calculations using mcplib 04p, 84p and 12p. In both the computational and experimental comparisons, the majority of results with the 04p, 84p and 12p photon data libraries were within 1σ of the mean MCNP statistical uncertainty. Larger differences were observed when comparing computational results with the 05t test photon library. The Doppler broadening sampling bug in MCNP-5 is shown to be corrected for fusion relevant problems through use of the 84p photon data library. The recommended libraries for fusion neutronics are 84p (or 04p) with MCNP6 and 84p if using MCNP-5.
Benchmarking Outcomes in the Critically Injured Burn Patient

PubMed Central

Klein, Matthew B.; Goverman, Jeremy; Hayden, Douglas L.; Fagan, Shawn P.; McDonald-Smith, Grace P.; Alexander, Andrew K.; Gamelli, Richard L.; Gibran, Nicole S.; Finnerty, Celeste C.; Jeschke, Marc G.; Arnoldo, Brett; Wispelwey, Bram; Mindrinos, Michael N.; Xiao, Wenzhong; Honari, Shari E.; Mason, Philip H.; Schoenfeld, David A.; Herndon, David N.; Tompkins, Ronald G.

2014-01-01

Objective: To determine and compare outcomes with accepted benchmarks in burn care at six academic burn centers. Background: Since the 1960s, U.S. morbidity and mortality rates have declined tremendously for burn patients, likely related to improvements in surgical and critical care treatment. We describe the baseline patient characteristics and well-defined outcomes for major burn injuries. Methods: We followed 300 adults and 241 children from 2003–2009 through hospitalization using standard operating procedures developed at study onset. We created an extensive database on patient and injury characteristics, anatomic and physiological derangement, clinical treatment, and outcomes. These data were compared with existing benchmarks in burn care. Results: Study patients were critically injured as demonstrated by mean %TBSA (41.2±18.3 for adults and 57.8±18.2 for children) and presence of inhalation injury in 38% of the adults and 54.8% of the children. Mortality in adults was 14.1% for those less than 55 years old and 38.5% for those age ≥55 years. Mortality in patients less than 17 years old was 7.9%. Overall, the multiple organ failure rate was 27%. When controlling for age and %TBSA, presence of inhalation injury was not significant. Conclusions: This study provides the current benchmark for major burn patients. Mortality rates, notwithstanding significant %TBSA and presence of inhalation injury, have significantly declined compared to previous benchmarks. Modern day surgical and medically intensive management has markedly improved to the point where we can expect patients less than 55 years old with severe burn injuries and inhalation injury to survive these devastating conditions. PMID:24722222
Relative Positioning of Ocean Bottom Benchmarks.

DTIC Science & Technology

1985-12-01

... SCHOOL, Monterey, California. Thesis: Relative Positioning of Ocean Bottom Benchmarks, by Feng-Yu Kuo ... December 1985. Thesis Advisors: N. K. Saxena, S. P. Tucker. Approved for public release; distribution unlimited. Unclassified ... Title: Relative Positioning of ... Type of report and period covered: Master's Thesis; December 1985
CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction

PubMed Central

Puton, Tomasz; Kozlowski, Lukasz P.; Rother, Kristian M.; Bujnicki, Janusz M.

2013-01-01

We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks. PMID:23435231
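The abstract above does not say which accuracy measures the server reports. As a minimal sketch of how a predicted secondary structure can be scored against a reference, the example below assumes structures given in dot-bracket notation (without pseudoknots) and computes two commonly used base-pair measures, sensitivity and positive predictive value (PPV). The function names and the example structures are hypothetical and are not taken from CompaRNA.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def pair_accuracy(reference, predicted):
    """Return (sensitivity, PPV) of predicted base pairs against the reference."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    true_positives = len(ref & pred)
    sensitivity = true_positives / len(ref) if ref else 0.0
    ppv = true_positives / len(pred) if pred else 0.0
    return sensitivity, ppv


# Hypothetical 11-nucleotide hairpin: the prediction misses the innermost pair.
reference = "((((...))))"
predicted = "(((.....)))"
sensitivity, ppv = pair_accuracy(reference, predicted)
print(sensitivity, ppv)
```

Here three of the four reference pairs are recovered and every predicted pair is correct, so the sensitivity is 0.75 and the PPV is 1.0.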
Benchmarking is associated with improved quality of care in type 2 diabetes: the OPTIMISE randomized, controlled trial.

PubMed

Hermans, Michel P; Elisaf, Moses; Michel, Georges; Muls, Erik; Nobels, Frank; Vandenberghe, Hans; Brotons, Carlos

2013-11-01

To assess prospectively the effect of benchmarking on quality of primary care for patients with type 2 diabetes by using three major modifiable cardiovascular risk factors as critical quality indicators. Primary care physicians treating patients with type 2 diabetes in six European countries were randomized to give standard care (control group) or standard care with feedback benchmarked against other centers in each country (benchmarking group). In both groups, laboratory tests were performed every 4 months. The primary end point was the percentage of patients achieving preset targets of the critical quality indicators HbA1c, LDL cholesterol, and systolic blood pressure (SBP) after 12 months of follow-up. Of 4,027 patients enrolled, 3,996 patients were evaluable and 3,487 completed 12 months of follow-up. After 12 months, the HbA1c target of the primary end point was achieved by 58.9% of patients in the benchmarking group vs. 62.1% in the control group (P = 0.398); 40.0 vs. 30.1% of patients met the SBP target (P < 0.001); 54.3 vs. 49.7% met the LDL cholesterol target (P = 0.006). Percentages of patients meeting all three targets increased during the study in both groups, with a statistically significant increase observed in the benchmarking group. The percentage of patients achieving all three targets at month 12 was significantly larger in the benchmarking group than in the control group (12.5 vs. 8.1%; P < 0.001). In this prospective, randomized, controlled study, benchmarking was shown to be an effective tool for increasing achievement of critical quality indicators and potentially reducing patient cardiovascular residual risk profile.
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres.

PubMed

van Lent, Wineke A M; de Beer, Relinde D; van Harten, Wim H

2010-08-31

Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC); three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. For each multiple case study, a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments in case study 3 were considering implementing the recommendations. Additionally, success factors were found, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results, and adaptation of the identified better working methods to each centre's own setting. The improved benchmarking process and the success
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres

PubMed Central

2010-01-01

Background: Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Methods: Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC); three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. For each multiple case study, a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. Results: We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments in case study 3 were considering implementing the recommendations. Additionally, success factors were found, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results, and adaptation of the identified better working methods to each centre's own setting. Conclusions: The improved
Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module

DOE Office of Scientific and Technical Information (OSTI.GOV)

Snyder, Abigail C.; Link, Robert P.; Calvin, Katherine V.

Hindcasting experiments (conducting a model forecast for a time period in which observational data are available) are being undertaken increasingly often by the integrated assessment model (IAM) community, across many scales of models. When they are undertaken, the results are often evaluated using global aggregates or otherwise highly aggregated skill scores that mask deficiencies. We select a set of deviation-based measures that can be applied on different spatial scales (regional versus global) to make evaluating the large number of variable–region combinations in IAMs more tractable. We also identify performance benchmarks for these measures, based on the statistics of the observational dataset, that allow a model to be evaluated in absolute terms rather than relative to the performance of other models at similar tasks. An ideal evaluation method for hindcast experiments in IAMs would feature both absolute measures for evaluation of a single experiment for a single model and relative measures to compare the results of multiple experiments for a single model or the same experiment repeated across multiple models, such as in community intercomparison studies. The performance benchmarks highlight the use of this scheme for model evaluation in absolute terms, providing information about the reasons a model may perform poorly on a given measure and therefore identifying opportunities for improvement. To demonstrate the use of and types of results possible with the evaluation method, the measures are applied to the results of a past hindcast experiment focusing on land allocation in the Global Change Assessment Model (GCAM) version 3.0. The question of how to more holistically evaluate models as complex as IAMs is an area for future research. We find quantitative evidence that global aggregates alone are not sufficient for evaluating IAMs that require global supply to equal global demand at each time period, such as GCAM. The results of this work indicate
Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module

DOE PAGES

Snyder, Abigail C.; Link, Robert P.; Calvin, Katherine V.

2017-11-29

Hindcasting experiments (conducting a model forecast for a time period in which observational data are available) are being undertaken increasingly often by the integrated assessment model (IAM) community, across many scales of models. When they are undertaken, the results are often evaluated using global aggregates or otherwise highly aggregated skill scores that mask deficiencies. We select a set of deviation-based measures that can be applied on different spatial scales (regional versus global) to make evaluating the large number of variable–region combinations in IAMs more tractable. We also identify performance benchmarks for these measures, based on the statistics of the observational dataset, that allow a model to be evaluated in absolute terms rather than relative to the performance of other models at similar tasks. An ideal evaluation method for hindcast experiments in IAMs would feature both absolute measures for evaluation of a single experiment for a single model and relative measures to compare the results of multiple experiments for a single model or the same experiment repeated across multiple models, such as in community intercomparison studies. The performance benchmarks highlight the use of this scheme for model evaluation in absolute terms, providing information about the reasons a model may perform poorly on a given measure and therefore identifying opportunities for improvement. To demonstrate the use of and types of results possible with the evaluation method, the measures are applied to the results of a past hindcast experiment focusing on land allocation in the Global Change Assessment Model (GCAM) version 3.0. The question of how to more holistically evaluate models as complex as IAMs is an area for future research. We find quantitative evidence that global aggregates alone are not sufficient for evaluating IAMs that require global supply to equal global demand at each time period, such as GCAM. The results of this work indicate
Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module

NASA Astrophysics Data System (ADS)

Snyder, Abigail C.; Link, Robert P.; Calvin, Katherine V.

2017-11-01

Hindcasting experiments (conducting a model forecast for a time period in which observational data are available) are being undertaken increasingly often by the integrated assessment model (IAM) community, across many scales of models. When they are undertaken, the results are often evaluated using global aggregates or otherwise highly aggregated skill scores that mask deficiencies. We select a set of deviation-based measures that can be applied on different spatial scales (regional versus global) to make evaluating the large number of variable-region combinations in IAMs more tractable. We also identify performance benchmarks for these measures, based on the statistics of the observational dataset, that allow a model to be evaluated in absolute terms rather than relative to the performance of other models at similar tasks. An ideal evaluation method for hindcast experiments in IAMs would feature both absolute measures for evaluation of a single experiment for a single model and relative measures to compare the results of multiple experiments for a single model or the same experiment repeated across multiple models, such as in community intercomparison studies. The performance benchmarks highlight the use of this scheme for model evaluation in absolute terms, providing information about the reasons a model may perform poorly on a given measure and therefore identifying opportunities for improvement. To demonstrate the use of and types of results possible with the evaluation method, the measures are applied to the results of a past hindcast experiment focusing on land allocation in the Global Change Assessment Model (GCAM) version 3.0. The question of how to more holistically evaluate models as complex as IAMs is an area for future research. We find quantitative evidence that global aggregates alone are not sufficient for evaluating IAMs that require global supply to equal global demand at each time period, such as GCAM. The results of this work indicate it is
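To make the idea of a deviation-based measure with an observation-based benchmark concrete, here is a minimal sketch. It assumes one possible measure, the root-mean-square error normalized by the standard deviation of the observations; the abstract does not name the measures actually used, so this choice, the function names, and the sample data are all illustrative assumptions. Under this measure a value of 1.0 corresponds to a model that does no better than always predicting the observational mean.

```python
import math
from typing import Sequence


def rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square deviation between model output and observations."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and equally long")
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def normalized_rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """RMSE divided by the population standard deviation of the observations.

    Illustrative benchmark (an assumption, not taken from the paper):
    a constant prediction equal to the observational mean scores exactly 1.0.
    """
    mean = sum(obs) / len(obs)
    sd = math.sqrt(sum((o - mean) ** 2 for o in obs) / len(obs))
    if sd == 0.0:
        raise ValueError("observations have zero variance")
    return rmse(model, obs) / sd


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical regional series
    hindcast = [1.2, 1.9, 3.4, 3.8, 5.1]    # hypothetical model output
    print(normalized_rmse(hindcast, observed))
```

Because the score is normalized per series, it can be computed for each variable-region combination separately and compared against the same absolute threshold.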
Benchmarking the Integration of WAVEWATCH III Results into HAZUS-MH: Preliminary Results

NASA Technical Reports Server (NTRS)

Berglund, Judith; Holland, Donald; McKellip, Rodney; Sciaudone, Jeff; Vickery, Peter; Wang, Zhanxian; Ying, Ken

2005-01-01

The report summarizes the results from the preliminary benchmarking activities associated with the use of WAVEWATCH III (WW3) results in the HAZUS-MH MR1 flood module. Project partner Applied Research Associates (ARA) is integrating the WW3 model into HAZUS. The current version of HAZUS-MH predicts loss estimates from hurricane-related coastal flooding by using values of surge only. Using WW3, wave setup can be included with surge. Loss estimates resulting from the use of surge-only and surge-plus-wave-setup were compared. This benchmarking study is preliminary because the HAZUS-MH MR1 flood module was under development at the time of the study. In addition, WW3 is not scheduled to be fully integrated with HAZUS-MH and available for public release until 2008.
[The OPTIMISE study (Optimal Type 2 Diabetes Management Including Benchmarking and Standard Treatment). Results for Luxembourg].

PubMed

Michel, G

2012-01-01

The OPTIMISE study (NCT00681850) has been run in six European countries, including Luxembourg, to prospectively assess the effect of benchmarking on the quality of primary care in patients with type 2 diabetes, using major modifiable vascular risk factors as critical quality indicators. Primary care centers treating type 2 diabetic patients were randomized to give standard care (control group) or standard care with feedback benchmarked against other centers in each country (benchmarking group). The primary endpoint was the percentage of patients in the benchmarking group achieving pre-set targets of the critical quality indicators: glycated hemoglobin (HbA1c), systolic blood pressure (SBP) and low-density lipoprotein (LDL) cholesterol after 12 months of follow-up. In Luxembourg, in the benchmarking group, more patients achieved target for SBP (40.2% vs. 20%) and for LDL-cholesterol (50.4% vs. 44.2%). 12.9% of patients in the benchmarking group met all three targets, compared with 8.3% of patients in the control group. In this randomized, controlled study, benchmarking was shown to be an effective tool for improving critical quality indicator targets, which are the principal modifiable vascular risk factors in type 2 diabetes.
PMLB: a large benchmark suite for machine learning evaluation and comparison.

PubMed

Olson, Randal S; La Cava, William; Orzechowski, Patryk; Urbanowicz, Ryan J; Moore, Jason H

2017-01-01

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
Electric-Drive Vehicle Thermal Performance Benchmarking | Transportation

Science.gov Websites

studies are as follows: Characterize the thermal resistance and conductivity of various layers in the
Robust Tomography using Randomized Benchmarking

NASA Astrophysics Data System (ADS)

Silva, Marcus; Kimmel, Shelby; Johnson, Blake; Ryan, Colm; Ohki, Thomas

2013-03-01

Conventional randomized benchmarking (RB) can be used to estimate the fidelity of Clifford operations in a manner that is robust against preparation and measurement errors -- thus allowing for a more accurate and relevant characterization of the average error in Clifford gates compared to standard tomography protocols. Interleaved RB (IRB) extends this result to the extraction of error rates for individual Clifford gates. In this talk we will show how to combine multiple IRB experiments to extract all information about the unital part of any trace preserving quantum process. Consequently, one can compute the average fidelity to any unitary, not just the Clifford group, with tighter bounds than IRB. Moreover, the additional information can be used to design improvements in control. MS, BJ, CR and TO acknowledge support from IARPA under contract W911NF-10-1-0324.
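For orientation, the conventional RB and IRB estimates mentioned in the abstract above can be sketched numerically. The sketch assumes the standard decay model, in which the survival probability after m Clifford gates follows A·p^m + B, and the textbook conversions from the decay parameter p to an average error rate. It does not reproduce the tomography procedure of the talk. Function names and the sample decay parameters are hypothetical.

```python
def average_error_rate(p: float, n_qubits: int = 1) -> float:
    """Average error per Clifford, r = (d - 1)(1 - p) / d with d = 2**n_qubits."""
    d = 2 ** n_qubits
    return (d - 1) * (1.0 - p) / d


def interleaved_gate_error(p_ref: float, p_int: float, n_qubits: int = 1) -> float:
    """IRB point estimate for one gate from reference and interleaved decays.

    Uses the ratio p_int / p_ref as the decay attributed to the interleaved gate.
    """
    if p_ref <= 0.0:
        raise ValueError("p_ref must be positive")
    d = 2 ** n_qubits
    return (d - 1) * (1.0 - p_int / p_ref) / d


if __name__ == "__main__":
    # Hypothetical single-qubit decay parameters, as if obtained from curve fits.
    p_reference, p_interleaved = 0.99, 0.98
    r = average_error_rate(p_reference)
    print("average Clifford fidelity:", 1.0 - r)
    print("interleaved gate error:", interleaved_gate_error(p_reference, p_interleaved))
```

In practice p comes from fitting measured survival probabilities at several sequence lengths; the fit is left out here to keep the sketch dependency-free.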
New Reactor Physics Benchmark Data in the March 2012 Edition of the IRPhEP Handbook

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; Jim Gulliford

2012-11-01

The International Reactor Physics Experiment Evaluation Project (IRPhEP) was established to preserve integral reactor physics experimental data, including separate or special effects data for nuclear energy and technology applications. Numerous experiments that have been performed worldwide represent a large investment of infrastructure, expertise, and cost, and are valuable resources of data for present and future research. These valuable assets provide the basis for recording, development, and validation of methods. If the experimental data are lost, the high cost to repeat many of these measurements may be prohibitive. The purpose of the IRPhEP is to provide an extensively peer-reviewed set of reactor physics-related integral data that can be used by reactor designers and safety analysts to validate the analytical tools used to design next-generation reactors and establish the safety basis for operation of these reactors. Contributors from around the world collaborate in the evaluation and review of selected benchmark experiments for inclusion in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook) [1]. Several new evaluations have been prepared for inclusion in the March 2012 edition of the IRPhEP Handbook.
Unstructured Adaptive (UA) NAS Parallel Benchmark. Version 1.0

NASA Technical Reports Server (NTRS)

Feng, Huiyu; VanderWijngaart, Rob; Biswas, Rupak; Mavriplis, Catherine

2004-01-01

We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.
BENCHMARKING SUSTAINABILITY ENGINEERING EDUCATION

EPA Science Inventory

The goals of this project are to develop and apply a methodology for benchmarking curricula in sustainability engineering and to identify individuals active in sustainability engineering education.
Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias;

2006-01-01

The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.

Time-of-flight electron scattering from molecular hydrogen: Benchmark cross sections for excitation of the X 1Σg+→b 3Σu+ transition

NASA Astrophysics Data System (ADS)

Zawadzki, M.; Wright, R.; Dolmat, G.; Martin, M. F.; Hargreaves, L.; Fursa, D. V.; Zammit, M. C.; Scarlett, L. H.; Tapley, J. K.; Savage, J. S.; Bray, I.; Khakoo, M. A.

2018-05-01

The electron impact X 1Σg+→b 3Σu+ transition in molecular hydrogen is one of the most important dissociation pathways to forming atomic hydrogen atoms, and is of great importance in modeling astrophysical and industrial plasmas where molecular hydrogen is a substantial constituent. Recently, it has been found that the convergent close-coupling (CCC) cross sections of Zammit et al. [Phys. Rev. A 95, 022708 (2017), 10.1103/PhysRevA.95.022708] are up to a factor of 2 smaller than the currently recommended data. We have determined normalized differential cross sections for excitation of this transition from our experimental ratios of the inelastic to elastic scattering of electrons by molecular hydrogen using a transmission-free time-of-flight electron spectrometer, and find excellent agreement with the CCC calculations. Since there is already excellent agreement for the absolute elastic differential cross sections, we establish benchmark differential and integrated cross sections for the X 1Σg+→b 3Σu+ transition, with theory and experiment being essentially in complete agreement.
Sequoia Messaging Rate Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Friedley, Andrew

2008-01-22

The purpose of this benchmark is to measure the maximal message rate of a single compute node. The first num_cores ranks are expected to reside on the 'core' compute node for which message rate is being tested. After that, the next num_nbors ranks are neighbors for the first core rank, the next set of num_nbors ranks are neighbors for the second core rank, and so on. For example, testing an 8-core node (num_cores = 8) with 4 neighbors (num_nbors = 4) requires 8 + 8 * 4 = 40 ranks. The first 8 of those 40 ranks are expected to be on the 'core' node being benchmarked, while the rest of the ranks are on separate nodes.
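The rank layout described in this abstract can be written out as a short sketch. It assumes zero-based rank numbering (as in MPI). The helper names total_ranks and neighbor_ranks are hypothetical and are not part of the benchmark itself.

```python
from typing import List


def total_ranks(num_cores: int, num_nbors: int) -> int:
    """Ranks needed: the core ranks plus num_nbors neighbor ranks per core rank."""
    return num_cores + num_cores * num_nbors


def neighbor_ranks(core_rank: int, num_cores: int, num_nbors: int) -> List[int]:
    """Neighbor ranks of one core rank, in the order the abstract describes.

    Ranks 0..num_cores-1 sit on the core node; each following block of
    num_nbors ranks belongs to the next core rank in turn.
    """
    if not 0 <= core_rank < num_cores:
        raise ValueError("core_rank must be in [0, num_cores)")
    first = num_cores + core_rank * num_nbors
    return list(range(first, first + num_nbors))


if __name__ == "__main__":
    print(total_ranks(8, 4))        # 40
    print(neighbor_ranks(0, 8, 4))  # [8, 9, 10, 11]
    print(neighbor_ranks(7, 8, 4))  # [36, 37, 38, 39]
```

With num_cores = 8 and num_nbors = 4 this reproduces the 40-rank example: ranks 0 to 7 are on the node under test and ranks 8 to 39 are the neighbors placed on separate nodes.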
Using a health promotion model to promote benchmarking.

PubMed

Welby, Jane

2006-07-01

The North East (England) Neonatal Benchmarking Group has been established for almost a decade and has researched and developed a substantial number of evidence-based benchmarks. With no firm evidence that these were being used or that there was any standardisation of neonatal care throughout the region, the group embarked on a programme to review the benchmarks and determine what evidence-based guidelines were needed to support standardisation. A health promotion planning model was used by one subgroup to structure the programme; it enabled all members of the sub group to engage in the review process and provided the motivation and supporting documentation for implementation of changes in practice. The need for a regional guideline development group to complement the activity of the benchmarking group is being addressed.
Electric load shape benchmarking for small- and medium-sized commercial buildings

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Xuan; Hong, Tianzhen; Chen, Yixing

Small- and medium-sized commercial building owners and utility managers often look for opportunities for energy cost savings through energy efficiency and energy waste minimization. However, they currently lack easy access to low-cost tools that help interpret the massive amount of data needed to improve understanding of their energy use behaviors. Benchmarking is one of the techniques used in energy audits to identify which buildings are priorities for an energy analysis. Traditional energy performance indicators, such as the energy use intensity (annual energy per unit of floor area), consider only the total annual energy consumption, lacking consideration of the fluctuation of energy use behavior over time, which reveals the time of use information and represents distinct energy use behaviors during different time spans. To fill the gap, this study developed a general statistical method using 24-hour electric load shape benchmarking to compare a building or business/tenant space against peers. Specifically, the study developed new forms of benchmarking metrics and data analysis methods to infer the energy performance of a building based on its load shape. We first performed a data experiment with collected smart meter data using over 2,000 small- and medium-sized businesses in California. We then conducted a cluster analysis of the source data, and determined and interpreted the load shape features and parameters with peer group analysis. Finally, we implemented the load shape benchmarking feature in an open-access web-based toolkit (the Commercial Building Energy Saver) to provide straightforward and practical recommendations to users. The analysis techniques were generic and flexible for future datasets of other building types and in other utility territories.
Electric load shape benchmarking for small- and medium-sized commercial buildings

DOE PAGES

Luo, Xuan; Hong, Tianzhen; Chen, Yixing; ...

2017-07-28

Small- and medium-sized commercial building owners and utility managers often look for opportunities for energy cost savings through energy efficiency and energy waste minimization. However, they currently lack easy access to low-cost tools that help interpret the massive amount of data needed to improve understanding of their energy use behaviors. Benchmarking is one of the techniques used in energy audits to identify which buildings are priorities for an energy analysis. Traditional energy performance indicators, such as the energy use intensity (annual energy per unit of floor area), consider only the total annual energy consumption, lacking consideration of the fluctuation of energy use behavior over time, which reveals the time of use information and represents distinct energy use behaviors during different time spans. To fill the gap, this study developed a general statistical method using 24-hour electric load shape benchmarking to compare a building or business/tenant space against peers. Specifically, the study developed new forms of benchmarking metrics and data analysis methods to infer the energy performance of a building based on its load shape. We first performed a data experiment with collected smart meter data using over 2,000 small- and medium-sized businesses in California. We then conducted a cluster analysis of the source data, and determined and interpreted the load shape features and parameters with peer group analysis. Finally, we implemented the load shape benchmarking feature in an open-access web-based toolkit (the Commercial Building Energy Saver) to provide straightforward and practical recommendations to users. The analysis techniques were generic and flexible for future datasets of other building types and in other utility territories.
INL Results for Phases I and III of the OECD/NEA MHTGR-350 Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom; Javier Ortensi; Sonat Sen

2013-09-01

The Idaho National Laboratory (INL) Very High Temperature Reactor (VHTR) Technology Development Office (TDO) Methods Core Simulation group led the construction of the Organization for Economic Cooperation and Development (OECD) Modular High Temperature Reactor (MHTGR) 350 MW benchmark for comparing and evaluating prismatic VHTR analysis codes. The benchmark is sponsored by the OECD's Nuclear Energy Agency (NEA), and the project will yield a set of reference steady-state, transient, and lattice depletion problems that can be used by the Department of Energy (DOE), the Nuclear Regulatory Commission (NRC), and vendors to assess their code suites. The Methods group is responsible for defining the benchmark specifications, leading the data collection and comparison activities, and chairing the annual technical workshops. This report summarizes the latest INL results for Phase I (steady state) and Phase III (lattice depletion) of the benchmark. The INSTANT, Pronghorn and RattleSnake codes were used for the standalone core neutronics modeling of Exercise 1, and the results obtained from these codes are compared in Section 4. Exercise 2 of Phase I requires the standalone steady-state thermal fluids modeling of the MHTGR-350 design, and the results for the systems code RELAP5-3D are discussed in Section 5. The coupled neutronics and thermal fluids steady-state solution for Exercise 3 is reported in Section 6, utilizing the newly developed Parallel and Highly Innovative Simulation for INL Code System (PHISICS)/RELAP5-3D code suite. Finally, the lattice depletion models and results obtained for Phase III are compared in Section 7. The MHTGR-350 benchmark proved to be a challenging set of simulation problems to model accurately, and even with the simplifications introduced in the benchmark specification this activity is an important step in the code-to-code verification of modern prismatic VHTR codes. A final OECD/NEA comparison report will compare the Phase I and III
Aeroelasticity Benchmark Assessment: Subsonic Fixed Wing Program

NASA Technical Reports Server (NTRS)

Florance, Jennifer P.; Chwalowski, Pawel; Wieseman, Carol D.

2010-01-01

The fundamental technical challenge in computational aeroelasticity is the accurate prediction of unsteady aerodynamic phenomena and the effect on the aeroelastic response of a vehicle. Currently, a benchmarking standard for use in validating the accuracy of computational aeroelasticity codes does not exist. Many aeroelastic data sets have been obtained in wind-tunnel and flight testing throughout the world; however, none have been globally presented or accepted as an ideal data set. There are numerous reasons for this. One reason is that often, such aeroelastic data sets focus on the aeroelastic phenomena alone (flutter, for example) and do not contain associated information such as unsteady pressures and time-correlated structural dynamic deflections. Other available data sets focus solely on the unsteady pressures and do not address the aeroelastic phenomena. Other discrepancies can include omission of relevant data, such as flutter frequency and/or the acquisition of only qualitative deflection data. In addition to these content deficiencies, all of the available data sets present both experimental and computational technical challenges. Experimental issues include facility influences, nonlinearities beyond those being modeled, and data processing. From the computational perspective, technical challenges include modeling geometric complexities, coupling between the flow and the structure, grid issues, and boundary conditions. The Aeroelasticity Benchmark Assessment task seeks to examine the existing potential experimental data sets and ultimately choose the one that is viewed as the most suitable for computational benchmarking. An initial computational evaluation of that configuration will then be performed using the Langley-developed computational fluid dynamics (CFD) software FUN3D as part of its code validation process. In addition to the benchmarking activity, this task also includes an examination of future research directions. Researchers within the
A Field-Based Aquatic Life Benchmark for Conductivity in ...

EPA Pesticide Factsheets

EPA announced the availability of the final report, A Field-Based Aquatic Life Benchmark for Conductivity in Central Appalachian Streams. This report describes a method to characterize the relationship between the extirpation (the effective extinction) of invertebrate genera and salinity (measured as conductivity) and from that relationship derives a freshwater aquatic life benchmark. This benchmark of 300 µS/cm may be applied to waters in Appalachian streams that are dominated by calcium and magnesium salts of sulfate and bicarbonate at circum-neutral to mildly alkaline pH. This report provides scientific evidence for a conductivity benchmark in a specific region rather than for the entire United States.
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather; Robinson, William H.; Rech, Paolo

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for the hardware and software benchmarks.
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE PAGES

Quinn, Heather; Robinson, William H.; Rech, Paolo; ...

2015-12-17

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for the hardware and software benchmarks.
Standardised Benchmarking in the Quest for Orthologs

PubMed Central

Altenhoff, Adrian M.; Boeckmann, Brigitte; Capella-Gutierrez, Salvador; Dalquen, Daniel A.; DeLuca, Todd; Forslund, Kristoffer; Huerta-Cepas, Jaime; Linard, Benjamin; Pereira, Cécile; Pryszcz, Leszek P.; Schreiber, Fabian; Sousa da Silva, Alan; Szklarczyk, Damian; Train, Clément-Marie; Bork, Peer; Lecompte, Odile; von Mering, Christian; Xenarios, Ioannis; Sjölander, Kimmen; Juhl Jensen, Lars; Martin, Maria J.; Muffato, Matthieu; Gabaldón, Toni; Lewis, Suzanna E.; Thomas, Paul D.; Sonnhammer, Erik; Dessimoz, Christophe

2016-01-01

The identification of evolutionarily related genes across different species—orthologs in particular—forms the backbone of many comparative, evolutionary, and functional genomic analyses. Achieving high accuracy in orthology inference is thus essential. Yet the true evolutionary history of genes, required to ascertain orthology, is generally unknown. Furthermore, orthologs are used for very different applications across different phyla, with different requirements in terms of the precision-recall trade-off. As a result, assessing the performance of orthology inference methods remains difficult for both users and method developers. Here, we present a community effort to establish standards in orthology benchmarking and facilitate orthology benchmarking through an automated web-based service (http://orthology.benchmarkservice.org). Using this new service, we characterise the performance of 15 well-established orthology inference methods and resources on a battery of 20 different benchmarks. Standardised benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimal requirement for new tools and resources, and guides the development of more accurate orthology inference methods. PMID:27043882
Benchmarking CRISPR on-target sgRNA design.

PubMed

Yan, Jifang; Chuai, Guohui; Zhou, Chi; Zhu, Chenyu; Yang, Jing; Zhang, Chao; Gu, Feng; Xu, Han; Wei, Jia; Liu, Qi

2017-02-15

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-based gene editing has been widely implemented in various cell types and organisms. A major challenge in the effective application of the CRISPR system is the need to design highly efficient single-guide RNA (sgRNA) with minimal off-target cleavage. Several tools are available for sgRNA design, while limited tools were compared. In our opinion, benchmarking the performance of the available tools and indicating their applicable scenarios are important issues. Moreover, whether the reported sgRNA design rules are reproducible across different sgRNA libraries, cell types and organisms remains unclear. In our study, a systematic and unbiased benchmark of the sgRNA predicting efficacy was performed on nine representative on-target design tools, based on six benchmark data sets covering five different cell types. The benchmark study presented here provides novel quantitative insights into the available CRISPR tools. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Algorithm and Architecture Independent Benchmarking with SEAK

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tallent, Nathan R.; Manzano Franco, Joseph B.; Gawande, Nitin A.

2016-05-23

Many applications of high performance embedded computing are limited by performance or power bottlenecks. We have designed the Suite for Embedded Applications & Kernels (SEAK), a new benchmark suite, (a) to capture these bottlenecks in a way that encourages creative solutions; and (b) to facilitate rigorous, objective, end-user evaluation for their solutions. To avoid biasing solutions toward existing algorithms, SEAK benchmarks use a mission-centric (abstracted from a particular algorithm) and goal-oriented (functional) specification. To encourage solutions that are any combination of software or hardware, we use an end-user black-box evaluation that can capture tradeoffs between performance, power, accuracy, size, and weight. The tradeoffs are especially informative for procurement decisions. We call our benchmarks future proof because each mission-centric interface and evaluation remains useful despite shifting algorithmic preferences. It is challenging to create both concise and precise goal-oriented specifications for mission-centric problems. This paper describes the SEAK benchmark suite and presents an evaluation of sample solutions that highlights power and performance tradeoffs.
Benchmarking the MCNP code for Monte Carlo modelling of an in vivo neutron activation analysis system.

PubMed

Natto, S A; Lewis, D G; Ryde, S J

1998-01-01

The Monte Carlo computer code MCNP (version 4A) has been used to develop a personal computer-based model of the Swansea in vivo neutron activation analysis (IVNAA) system. The model included specification of the neutron source (252Cf), collimators, reflectors and shielding. The MCNP model was 'benchmarked' against fast neutron and thermal neutron fluence data obtained experimentally from the IVNAA system. The Swansea system allows two irradiation geometries using 'short' and 'long' collimators, which provide alternative dose rates for IVNAA. The data presented here relate to the short collimator, although results of similar accuracy were obtained using the long collimator. The fast neutron fluence was measured in air at a series of depths inside the collimator. The measurements agreed with the MCNP simulation within the statistical uncertainty (5-10%) of the calculations. The thermal neutron fluence was measured and calculated inside the cuboidal water phantom. The depth of maximum thermal fluence was 3.2 cm (measured) and 3.0 cm (calculated). The width of the 50% thermal fluence level across the phantom at its mid-depth was found to be the same by both MCNP and experiment. This benchmarking exercise has given us a high degree of confidence in MCNP as a tool for the design of IVNAA systems.
An Online Tool for Global Benchmarking of Risk-Adjusted Surgical Outcomes.

PubMed

Spence, Richard T; Chang, David C; Chu, Kathryn; Panieri, Eugenio; Mueller, Jessica L; Hutter, Matthew M

2017-01-01

Increasing evidence demonstrates significant variation in adverse outcomes following surgery between countries. In order to better quantify these variations, we hypothesize that freely available online risk calculators can be used as a tool to generate global benchmarking of risk-adjusted surgical outcomes. This is a prospective cohort study conducted at an academic teaching hospital in South Africa (GSH). Consecutive adult patients undergoing major general or vascular surgery who met the ACS-NSQIP inclusion criteria for a 3-month period were included. Data variables required by the ACS risk calculator were prospectively collected, and patients were followed for 30 days post-surgery for the occurrence of endpoints. Calculating observed-to-expected ratios for ten outcome measures of interest generated risk-adjusted outcomes benchmarked against the ACS-NSQIP consortium. A total of 373 major general and vascular surgery procedures met the inclusion criteria. The GSH operative cohort varied significantly compared to the 2012 ACS-NSQIP database. The risk-adjusted O/E ratios were significant for any complication O/E 1.91 (95 % CI 1.57-2.31), surgical site infections O/E 4.76 (95 % CI 3.71-6.01), renal failure O/E 3.29 (95 % CI 1.50-6.24), death O/E 3.43 (95 % CI 2.19-5.11), and total length of stay (LOS) O/E 3.43 (95 % CI 2.19-5.11). Freely available online risk calculators can be utilized as tools for global benchmarking of risk-adjusted surgical outcomes.
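The observed-to-expected (O/E) calculation referred to in this abstract can be illustrated with a minimal sketch. It assumes the expected count is the sum of the per-patient probabilities returned by a risk calculator, which is the usual definition but is not spelled out in the abstract. The function name and the sample values are hypothetical and are not taken from the study. Confidence intervals are omitted because the abstract does not state how they were derived.

```python
from typing import Sequence


def observed_to_expected(outcomes: Sequence[int],
                         predicted_risks: Sequence[float]) -> float:
    """O/E ratio for one outcome measure.

    outcomes: 1 if the patient experienced the event within follow-up, else 0.
    predicted_risks: per-patient event probability from the risk calculator.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("one predicted risk is required per patient")
    expected = sum(predicted_risks)
    if expected <= 0.0:
        raise ValueError("expected event count must be positive")
    return sum(outcomes) / expected


if __name__ == "__main__":
    # Hypothetical five-patient cohort: two events observed, 0.8 expected.
    events = [1, 0, 0, 1, 0]
    risks = [0.2, 0.1, 0.1, 0.3, 0.1]
    print(observed_to_expected(events, risks))
```

A ratio above 1 means more events occurred than the reference risk model predicted for a cohort with the same risk profile; a ratio below 1 means fewer.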
Cross-industry benchmarking: is it applicable to the operating room?

PubMed

Marco, A P; Hart, S

2001-01-01

The use of benchmarking has been growing in nonmedical industries. This concept is being increasingly applied to medicine as the industry strives to improve quality and improve financial performance. Benchmarks can be either internal (set by the institution) or external (use other's performance as a goal). In some industries, benchmarking has crossed industry lines to identify breakthroughs in thinking. In this article, we examine whether the airline industry can be used as a source of external process benchmarking for the operating room.
High-Strength Composite Fabric Tested at Structural Benchmark Test Facility

NASA Technical Reports Server (NTRS)

Krause, David L.

2002-01-01

Large sheets of ultrahigh strength fabric were put to the test at NASA Glenn Research Center's Structural Benchmark Test Facility. The material was stretched like a snare drum head until the last ounce of strength was reached, when it burst with a cacophonous release of tension. Along the way, the 3-ft square samples were also pulled, warped, tweaked, pinched, and yanked to predict the material's physical reactions to the many loads that it will experience during its proposed use. The material tested was a unique multi-ply composite fabric, reinforced with fibers that had a tensile strength eight times that of common carbon steel. The fiber plies were oriented at 0° and 90° to provide great membrane stiffness, and at 45° to provide an unusually high resistance to shear distortion. The fabric's heritage is in astronaut space suits and other NASA programs.
The national hydrologic bench-mark network

USGS Publications Warehouse

Cobb, Ernest D.; Biesecker, J.E.

1971-01-01

The United States is undergoing a dramatic growth of population and demands on its natural resources. The effects are widespread and often produce significant alterations of the environment. The hydrologic bench-mark network was established to provide data on stream basins which are little affected by these changes. The network is made up of selected stream basins which are not expected to be significantly altered by man. Data obtained from these basins can be used to document natural changes in hydrologic characteristics with time, to provide a better understanding of the hydrologic structure of natural basins, and to provide a comparative base for studying the effects of man on the hydrologic environment. There are 57 bench-mark basins in 37 States. These basins are in areas having a wide variety of climate and topography. The bench-mark basins and the types of data collected in the basins are described.
Validating Cellular Automata Lava Flow Emplacement Algorithms with Standard Benchmarks

NASA Astrophysics Data System (ADS)

Richardson, J. A.; Connor, L.; Charbonnier, S. J.; Connor, C.; Gallant, E.

2015-12-01

A major existing need in assessing lava flow simulators is a common set of validation benchmark tests. We propose three levels of benchmarks which test model output against increasingly complex standards. First, simulated lava flows should be morphologically identical, given changes in parameter space that should be inconsequential, such as slope direction. Second, lava flows simulated in simple parameter spaces can be tested against analytical solutions or empirical relationships seen in Bingham fluids. For instance, a lava flow simulated on a flat surface should produce a circular outline. Third, lava flows simulated over real world topography can be compared to recent real world lava flows, such as those at Tolbachik, Russia, and Fogo, Cape Verde. Success or failure of emplacement algorithms in these validation benchmarks can be determined using a Bayesian approach, which directly tests the ability of an emplacement algorithm to correctly forecast lava inundation. Here we focus on two posterior metrics, P(A|B) and P(¬A|¬B), which describe the positive and negative predictive value of flow algorithms. This is an improvement on less direct statistics such as model sensitivity and the Jaccard fitness coefficient. We have performed these validation benchmarks on a new, modular lava flow emplacement simulator that we have developed. This simulator, which we call MOLASSES, follows a Cellular Automata (CA) method. The code is developed in several interchangeable modules, which enables quick modification of the distribution algorithm from cell locations to their neighbors. By assessing several different distribution schemes with the benchmark tests, we have improved the performance of MOLASSES to correctly match early stages of the 2012-3 Tolbachik Flow, Kamchatka, Russia, to 80%. We also can evaluate model performance given uncertain input parameters using a Monte Carlo setup. This illuminates sensitivity to model uncertainty.
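The metrics named in this abstract can be sketched for a gridded comparison. The sketch assumes that A means "the cell was inundated by the real flow" and B means "the cell was inundated in the simulation"; the abstract does not define the events explicitly, and the function name and the eight-cell example are hypothetical.

```python
def flow_fit_metrics(simulated, observed):
    """Compare two equal-length sequences of 0/1 inundation flags, cell by cell.

    Returns a dict with:
      ppv         P(A|B):   fraction of simulated-inundated cells that were really inundated
      npv         P(not A|not B): fraction of simulated-dry cells that really stayed dry
      sensitivity fraction of really inundated cells that the simulation reached
      jaccard     intersection over union of the two inundated areas
    """
    if len(simulated) != len(observed):
        raise ValueError("grids must have the same number of cells")
    tp = sum(1 for s, o in zip(simulated, observed) if s and o)
    fp = sum(1 for s, o in zip(simulated, observed) if s and not o)
    fn = sum(1 for s, o in zip(simulated, observed) if not s and o)
    tn = sum(1 for s, o in zip(simulated, observed) if not s and not o)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
    }


# Hypothetical flattened 8-cell grid.
simulated = [1, 1, 1, 0, 0, 0, 0, 1]
observed = [1, 1, 0, 0, 0, 0, 1, 1]
metrics = flow_fit_metrics(simulated, observed)
print(metrics)
```

The two predictive values condition on what the model forecast, which is the direction a hazard forecaster cares about. Sensitivity and the Jaccard coefficient condition on, or mix in, what actually happened.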
Performance Characteristics of the Multi-Zone NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Jin, Haoqiang; VanderWijngaart, Rob F.

2003-01-01

We describe a new suite of computational benchmarks that models applications featuring multiple levels of parallelism. Such parallelism is often available in realistic flow computations on systems of grids, but had not previously been captured in benchmarks. The new suite, named NPB Multi-Zone, is extended from the NAS Parallel Benchmarks suite, and involves solving the application benchmarks LU, BT and SP on collections of loosely coupled discretization meshes. The solutions on the meshes are updated independently, but after each time step they exchange boundary value information. This strategy provides relatively easily exploitable coarse-grain parallelism between meshes. Three reference implementations are available: one serial, one hybrid using the Message Passing Interface (MPI) and OpenMP, and another hybrid using a shared memory multi-level programming model (SMP+OpenMP). We examine the effectiveness of hybrid parallelization paradigms in these implementations on three different parallel computers. We also use an empirical formula to investigate the performance characteristics of the multi-zone benchmarks.

Nonlinear viscoplasticity in ASPECT: benchmarking and applications to subduction

NASA Astrophysics Data System (ADS)

Glerum, Anne; Thieulot, Cedric; Fraters, Menno; Blom, Constantijn; Spakman, Wim

2018-03-01

ASPECT (Advanced Solver for Problems in Earth's ConvecTion) is a massively parallel finite element code originally designed for modeling thermal convection in the mantle with a Newtonian rheology. The code is characterized by modern numerical methods, high-performance parallelism and extensibility. This last characteristic is illustrated in this work: we have extended the use of ASPECT from global thermal convection modeling to upper-mantle-scale applications of subduction.
Subduction modeling generally requires the tracking of multiple materials with different properties and with nonlinear viscous and viscoplastic rheologies. To this end, we implemented a frictional plasticity criterion that is combined with a viscous diffusion and dislocation creep rheology. Because ASPECT uses compositional fields to represent different materials, all material parameters are made dependent on a user-specified number of fields.
The goal of this paper is primarily to describe and verify our implementations of complex, multi-material rheology by reproducing the results of four well-known two-dimensional benchmarks: the indentor benchmark, the brick experiment, the sandbox experiment and the slab detachment benchmark. Furthermore, we aim to provide hands-on examples for prospective users by demonstrating the use of multi-material viscoplasticity with three-dimensional, thermomechanical models of oceanic subduction, putting ASPECT on the map as a community code for high-resolution, nonlinear rheology subduction modeling.
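As a rough illustration of how a frictional plasticity criterion can be combined with diffusion and dislocation creep, the sketch below uses a common textbook formulation: a harmonic combination of the two creep viscosities, capped by a plastic viscosity derived from a two-dimensional Drucker-Prager yield stress. It is not taken from ASPECT's source code. The function names and all parameter values are hypothetical and chosen only to give plausible orders of magnitude.

```python
import math

R_GAS = 8.314  # gas constant, J / (mol K)

# Hypothetical flow-law parameters (prefactor A, stress exponent n,
# activation energy E in J/mol, activation volume V in m^3/mol).
DIFFUSION = dict(A=1e-9, n=1.0, E=375e3, V=6e-6)
DISLOCATION = dict(A=1e-16, n=3.5, E=530e3, V=1.4e-5)


def creep_viscosity(strain_rate, pressure, temperature, A, n, E, V):
    """Power-law creep viscosity in Pa s; n = 1 reduces to diffusion creep."""
    return (0.5 * A ** (-1.0 / n) * strain_rate ** ((1.0 - n) / n)
            * math.exp((E + pressure * V) / (n * R_GAS * temperature)))


def plastic_viscosity(strain_rate, pressure, cohesion=20e6, friction_angle_deg=20.0):
    """Viscosity that places the stress exactly on a 2-D Drucker-Prager yield surface."""
    phi = math.radians(friction_angle_deg)
    yield_stress = cohesion * math.cos(phi) + pressure * math.sin(phi)
    return yield_stress / (2.0 * strain_rate)


def effective_viscosity(strain_rate, pressure, temperature):
    """Composite creep viscosity, limited by plastic yielding."""
    eta_diff = creep_viscosity(strain_rate, pressure, temperature, **DIFFUSION)
    eta_disl = creep_viscosity(strain_rate, pressure, temperature, **DISLOCATION)
    eta_viscous = 1.0 / (1.0 / eta_diff + 1.0 / eta_disl)  # both mechanisms act together
    return min(eta_viscous, plastic_viscosity(strain_rate, pressure))


# Hot material creeps; cold material is limited by the yield criterion.
for temperature in (1600.0, 600.0):
    eta = effective_viscosity(1e-15, 1e9, temperature)
    print(f"T = {temperature:.0f} K: effective viscosity = {eta:.2e} Pa s")
```

Because the viscosity depends on the strain rate, which in turn depends on the velocity solution, a solver using such a rheology has to iterate. That is what makes the problem nonlinear.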
Benchmarking biology research organizations using a new, dedicated tool.

PubMed

van Harten, Willem H; van Bokhorst, Leonard; van Luenen, Henri G A M

2010-02-01

International competition forces fundamental research organizations to assess their relative performance. We present a benchmark tool for scientific research organizations where, contrary to existing models, the group leader is placed in a central position within the organization. We used it in a pilot benchmark study involving six research institutions. Our study shows that data collection and data comparison based on this new tool can be achieved. It proved possible to compare relative performance and organizational characteristics and to generate suggestions for improvement for most participants. However, strict definitions of the parameters used for the benchmark and a thorough insight into the organization of each of the benchmark partners are required to produce comparable data and draw firm conclusions.
EPA and EFSA approaches for Benchmark Dose modeling

EPA Science Inventory

Benchmark dose (BMD) modeling has become the preferred approach in the analysis of toxicological dose-response data for the purpose of deriving human health toxicity values. The software packages most often used are Benchmark Dose Software (BMDS, developed by EPA) and PROAST (de...
40 CFR 141.543 - How is the disinfection benchmark calculated?

Code of Federal Regulations, 2012 CFR

2012-07-01

... 40 Protection of Environment 24 2012-07-01 2012-07-01 false How is the disinfection benchmark... Disinfection-Systems Serving Fewer Than 10,000 People Disinfection Benchmark § 141.543 How is the disinfection benchmark calculated? If your system is making a significant change to its disinfection practice, it must...
40 CFR 141.543 - How is the disinfection benchmark calculated?

Code of Federal Regulations, 2014 CFR

2014-07-01

... 40 Protection of Environment 23 2014-07-01 2014-07-01 false How is the disinfection benchmark... Disinfection-Systems Serving Fewer Than 10,000 People Disinfection Benchmark § 141.543 How is the disinfection benchmark calculated? If your system is making a significant change to its disinfection practice, it must...
40 CFR 141.543 - How is the disinfection benchmark calculated?

Code of Federal Regulations, 2013 CFR

2013-07-01

... 40 Protection of Environment 24 2013-07-01 2013-07-01 false How is the disinfection benchmark... Disinfection-Systems Serving Fewer Than 10,000 People Disinfection Benchmark § 141.543 How is the disinfection benchmark calculated? If your system is making a significant change to its disinfection practice, it must...
40 CFR 141.543 - How is the disinfection benchmark calculated?

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 23 2011-07-01 2011-07-01 false How is the disinfection benchmark... Disinfection-Systems Serving Fewer Than 10,000 People Disinfection Benchmark § 141.543 How is the disinfection benchmark calculated? If your system is making a significant change to its disinfection practice, it must...
Towards Systematic Benchmarking of Climate Model Performance

NASA Astrophysics Data System (ADS)

Gleckler, P. J.

2014-12-01

The process by which climate models are evaluated has evolved substantially over the past decade, with the Coupled Model Intercomparison Project (CMIP) serving as a centralizing activity for coordinating model experimentation and enabling research. Scientists with a broad spectrum of expertise have contributed to the CMIP model evaluation process, resulting in many hundreds of publications that have served as a key resource for the IPCC process. For several reasons, efforts are now underway to further systematize some aspects of the model evaluation process. First, some model evaluation can now be considered routine and should not require "re-inventing the wheel" or a journal publication simply to update results with newer models. Second, the benefit of CMIP research to model development has not been optimal because the publication of results generally takes several years and is usually not reproducible for benchmarking newer model versions. And third, there are now hundreds of model versions and many thousands of simulations, but there is no community-based mechanism for routinely monitoring model performance changes. An important change in the design of CMIP6 can help address these limitations. CMIP6 will include a small set of standardized experiments as an ongoing exercise (CMIP "DECK": ongoing Diagnostic, Evaluation and Characterization of Klima), so that modeling groups can submit them at any time and not be overly constrained by deadlines. In this presentation, efforts to establish routine benchmarking of existing and future CMIP simulations will be described. To date, some benchmarking tools have been made available to all CMIP modeling groups to enable them to readily compare with CMIP5 simulations during the model development process. A natural extension of this effort is to make results from all CMIP simulations widely available, including the results from newer models as soon as the simulations become available for research. Making the results from routine
Nutrient cycle benchmarks for earth system land model

NASA Astrophysics Data System (ADS)

Zhu, Q.; Riley, W. J.; Tang, J.; Zhao, L.

2017-12-01

Projecting future biosphere-climate feedbacks using Earth system models (ESMs) relies heavily on robust modeling of land surface carbon dynamics. More importantly, soil nutrient (particularly, nitrogen (N) and phosphorus (P)) dynamics strongly modulate carbon dynamics, such as plant sequestration of atmospheric CO2. Prevailing ESM land models all consider nitrogen as a potentially limiting nutrient, and several consider phosphorus. However, including nutrient cycle processes in ESM land models potentially introduces large uncertainties that could be identified and addressed by improved observational constraints. We describe the development of two nutrient cycle benchmarks for ESM land models: (1) nutrient partitioning between plants and soil microbes inferred from 15N and 33P tracer studies and (2) nutrient limitation effects on the carbon cycle informed by long-term fertilization experiments. We used these benchmarks to evaluate critical hypotheses regarding nutrient cycling and their representation in ESMs. We found that a mechanistic representation of plant-microbe nutrient competition based on relevant functional traits best reproduced observed plant-microbe nutrient partitioning. We also found that for multiple-nutrient models (i.e., N and P), application of Liebig's law of the minimum is often inaccurate. Rather, the Multiple Nutrient Limitation (MNL) concept better reproduces observed carbon-nutrient interactions.
MPI, HPF or OpenMP: A Study with the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Hribar, Michelle; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1999-01-01

Porting applications to new high performance parallel and distributed platforms is a challenging task. Writing parallel code by hand is time consuming and costly, but the task can be simplified by high level languages and, better still, automated by parallelizing tools and compilers. The definition of HPF (High Performance Fortran, based on data parallel model) and OpenMP (based on shared memory parallel model) standards has offered great opportunity in this respect. Both provide simple and clear interfaces to languages like FORTRAN and simplify many tedious tasks encountered in writing message passing programs. In our study we implemented the parallel versions of the NAS Benchmarks with HPF and OpenMP directives. Comparison of their performance with the MPI implementation and pros and cons of different approaches will be discussed along with experience of using computer-aided tools to help parallelize these benchmarks. Based on the study, potentials of applying some of the techniques to realistic aerospace applications will be presented.
MPI, HPF or OpenMP: A Study with the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Hribar, M.; Waheed, A.; Yan, J.; Saini, Subhash (Technical Monitor)

1999-01-01

Porting applications to new high performance parallel and distributed platforms is a challenging task. Writing parallel code by hand is time consuming and costly, but this task can be simplified by high level languages and, better still, automated by parallelizing tools and compilers. The definition of HPF (High Performance Fortran, based on data parallel model) and OpenMP (based on shared memory parallel model) standards has offered great opportunity in this respect. Both provide simple and clear interfaces to languages like FORTRAN and simplify many tedious tasks encountered in writing message passing programs. In our study, we implemented the parallel versions of the NAS Benchmarks with HPF and OpenMP directives. Comparison of their performance with the MPI implementation and pros and cons of different approaches will be discussed along with experience of using computer-aided tools to help parallelize these benchmarks. Based on the study, potentials of applying some of the techniques to realistic aerospace applications will be presented.
Winning Strategy: Set Benchmarks of Early Success to Build Momentum for the Long Term

ERIC Educational Resources Information Center

Spiro, Jody

2012-01-01

Change is a highly personal experience. Everyone participating in the effort has different reactions to change, different concerns, and different motivations for being involved. The smart change leader sets benchmarks along the way so there are guideposts and pause points instead of an endless change process. "Early wins"--a term used to describe…
Benchmarking can add up for healthcare accounting.

PubMed

Czarnecki, M T

1994-09-01

In 1993, a healthcare accounting and finance benchmarking survey of hospital and nonhospital organizations gathered statistics about key common performance areas. A low response did not allow for statistically significant findings, but the survey identified performance measures that can be used in healthcare financial management settings. This article explains the benchmarking process and examines some of the 1993 study's findings.
Benchmarks for Evaluation of Distributed Denial of Service (DDOS)

DTIC Science & Technology

2008-01-01

publications: [1] E. Arikan, Attack Profiling for DDoS Benchmarks, MS Thesis, University of Delaware, August 2006. [2] J. Mirkovic, A. Hussain, B. Wilson...Sigmetrics 2007, June 2007 [5] J. Mirkovic, E. Arikan, S. Wei, S. Fahmy, R. Thomas, and P. Reiher, Benchmarks for DDoS Defense Evaluation, Proceedings of the...Security Experimentation, June 2006. [9] J. Mirkovic, E. Arikan, S. Wei, S. Fahmy, R. Thomas, P. Reiher, Benchmarks for DDoS Defense Evaluation
Benchmark matrix and guide: Part II.

PubMed

1991-01-01

In the last issue of the Journal of Quality Assurance (September/October 1991, Volume 13, Number 5, pp. 14-19), the benchmark matrix developed by Headquarters Air Force Logistics Command was published. Five horizontal levels on the matrix delineate progress in TQM: business as usual, initiation, implementation, expansion, and integration. The six vertical categories that are critical to the success of TQM are leadership, structure, training, recognition, process improvement, and customer focus. In this issue, "Benchmark Matrix and Guide: Part II" will show specifically how to apply the categories of leadership, structure, and training to the benchmark matrix progress levels. At the intersection of each category and level, specific behavior objectives are listed with supporting behaviors and guidelines. Some categories will have objectives that are relatively easy to accomplish, allowing quick progress from one level to the next. Other categories will take considerable time and effort to complete. In the next issue, Part III of this series will focus on recognition, process improvement, and customer focus.
A Competitive Benchmarking Study of Noncredit Program Administration.

ERIC Educational Resources Information Center

Alstete, Jeffrey W.

1996-01-01

A benchmarking project to measure administrative processes and financial ratios received 57 usable replies from 300 noncredit continuing education programs. Programs with strong financial surpluses were identified and their processes benchmarked (including response to inquiries, registrants, registrant/staff ratio, new courses, class size,…
The Learning Organisation: Results of a Benchmarking Study.

ERIC Educational Resources Information Center

Zairi, Mohamed

1999-01-01

Learning in corporations was assessed using these benchmarks: core qualities of creative organizations, characteristics of organizational creativity, attributes of flexible organizations, use of diversity and conflict, creative human resource management systems, and effective and successful teams. These benchmarks are key elements of the learning…
Surveys and Benchmarks

ERIC Educational Resources Information Center

Bers, Trudy

2012-01-01

Surveys and benchmarks continue to grow in importance for community colleges in response to several factors. One is the press for accountability, that is, for colleges to report the outcomes of their programs and services to demonstrate their quality and prudent use of resources, primarily to external constituents and governing boards at the state…
The Model Averaging for Dichotomous Response Benchmark Dose (MADr-BMD) Tool

EPA Pesticide Factsheets

Provides quantal response models, which are also used in the U.S. EPA benchmark dose software suite, and generates a model-averaged dose-response model to produce benchmark dose and benchmark dose lower bound estimates.
Developing a benchmark for emotional analysis of music

PubMed Central

Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

The music emotion recognition (MER) field has expanded rapidly in the last decade. Many new methods and new audio features have been developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the diversity of data representations and the scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2 Hz time resolution). Using DEAM, we organized the ‘Emotion in Music’ task at the MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER. PMID:28282400

Decoys Selection in Benchmarking Datasets: Overview and Perspectives

PubMed Central

Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

2018-01-01

Virtual Screening (VS) is designed to prospectively help identify potential hits, i.e., compounds capable of interacting with a given target and potentially modulating its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is best adapted to the query/target system under study and that yields the most reliable output. To this end, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compound subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds, which has changed considerably over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoy selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets. PMID:29416509
Developing a benchmark for emotional analysis of music.

PubMed

Aljanaki, Anna; Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

The music emotion recognition (MER) field has expanded rapidly in the last decade. Many new methods and new audio features have been developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the diversity of data representations and the scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2 Hz time resolution). Using DEAM, we organized the 'Emotion in Music' task at the MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER.
A large-scale benchmark of gene prioritization methods.

PubMed

Guala, Dimitri; Sonnhammer, Erik L L

2017-04-21

In order to maximize the use of results from high-throughput experimental studies, e.g. GWAS, for identification and diagnostics of new disease-associated genes, it is important to have properly analyzed and benchmarked gene prioritization tools. While prospective benchmarks are underpowered to provide statistically significant results in their attempt to differentiate the performance of gene prioritization tools, a strategy for retrospective benchmarking has been missing, and new tools usually only provide internal validations. The Gene Ontology (GO) contains genes clustered around annotation terms. This intrinsic property of GO can be utilized in construction of robust benchmarks, objective to the problem domain. We demonstrate how this can be achieved for network-based gene prioritization tools, utilizing the FunCoup network. We use cross-validation and a set of appropriate performance measures to compare state-of-the-art gene prioritization algorithms: three based on network diffusion, NetRank and two implementations of Random Walk with Restart, and MaxLink that utilizes network neighborhood. Our benchmark suite provides a systematic and objective way to compare the multitude of available and future gene prioritization tools, enabling researchers to select the best gene prioritization tool for the task at hand, and helping to guide the development of more accurate methods.
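Random Walk with Restart, one of the diffusion methods compared in this abstract, can be sketched generically as follows. This is a plain illustration of the technique on a toy undirected graph, not either of the implementations benchmarked by the authors. The node names, the restart probability and the function name are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visit probability.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds:     known disease genes; the walker jumps back to them
               with probability `restart` at every step.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy chain network: G1 - G2 - G3 - G4, with G1 as the only seed gene.
network = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2", "G4"], "G4": ["G3"]}
scores = random_walk_with_restart(network, seeds={"G1"})
ranking = sorted((n for n in scores if n != "G1"), key=scores.get, reverse=True)
print(ranking)
```

Candidates are ranked by score. In a cross-validation benchmark, some known genes would be hidden from the seed set and the ranks of the hidden genes would measure performance.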
A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge

PubMed Central

Gururaj, Anupama E.; Chen, Xiaoling; Pournejati, Saeid; Alter, George; Hersh, William R.; Demner-Fushman, Dina; Ohno-Machado, Lucila

2017-01-01

The rapid proliferation of publicly available biomedical datasets has provided abundant resources that are potentially of value as a means to reproduce prior experiments, and to generate and explore novel hypotheses. However, there are a number of barriers to the re-use of such datasets, which are distributed across a broad array of dataset repositories, focusing on different data types and indexed using different terminologies. New methods are needed to enable biomedical researchers to locate datasets of interest within this rapidly expanding information ecosystem, and new resources are needed for the formal evaluation of these methods as they emerge. In this paper, we describe the design and generation of a benchmark for information retrieval of biomedical datasets, which was developed and used for the 2016 bioCADDIE Dataset Retrieval Challenge. In the tradition of the seminal Cranfield experiments, and as exemplified by the Text Retrieval Conference (TREC), this benchmark includes a corpus (biomedical datasets), a set of queries, and relevance judgments relating these queries to elements of the corpus. This paper describes the process through which each of these elements was derived, with a focus on those aspects that distinguish this benchmark from typical information retrieval reference sets. Specifically, we discuss the origin of our queries in the context of a larger collaborative effort, the biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium, and the distinguishing features of biomedical dataset retrieval as a task. The resulting benchmark set has been made publicly available to advance research in the area of biomedical dataset retrieval. Database URL: https://biocaddie.org/benchmark-data PMID:29220453
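A Cranfield-style benchmark of this kind (corpus, queries, relevance judgments) is typically used by scoring a system's ranked results against the judgments. The sketch below shows two standard measures, precision at k and average precision, on made-up data. The dataset identifiers and function names are hypothetical, and the abstract does not state which measures the challenge itself used.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved items that are judged relevant."""
    top = ranked_ids[:k]
    return sum(1 for doc in top if doc in relevant_ids) / k


def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant item appears,
    divided by the total number of relevant items (unretrieved ones count as 0)."""
    if not relevant_ids:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked_ids, start=1):
        if doc in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


# Hypothetical query: the system returns four datasets, three are judged relevant overall.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
print(precision_at_k(ranked, relevant, 2))
print(average_precision(ranked, relevant))
```

Averaging such per-query scores over the whole query set gives a single figure that lets different retrieval systems be compared on the same corpus.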
Proposed biopsy performance benchmarks for MRI based on an audit of a large academic center.

PubMed

Sedora Román, Neda I; Mehta, Tejas S; Sharpe, Richard E; Slanetz, Priscilla J; Venkataraman, Shambhavi; Fein-Zachary, Valerie; Dialani, Vandana

2018-05-01

Performance benchmarks exist for mammography (MG); however, performance benchmarks for magnetic resonance imaging (MRI) are not yet fully developed. The purpose of our study was to perform an MRI audit based on established MG and screening MRI benchmarks and to review whether these benchmarks can be applied to an MRI practice. An IRB approved retrospective review of breast MRIs was performed at our center from 1/1/2011 through 12/31/13. For patients with biopsy recommendation, core biopsy and surgical pathology results were reviewed. The data were used to derive mean performance parameter values, including abnormal interpretation rate (AIR), positive predictive value (PPV), cancer detection rate (CDR), percentage of minimal cancers and axillary node negative cancers and compared with MG and screening MRI benchmarks. MRIs were also divided by screening and diagnostic indications to assess for differences in performance benchmarks amongst these two groups. Of the 2455 MRIs performed over 3 years, 1563 were performed for screening indications and 892 for diagnostic indications. With the exception of PPV2 for screening breast MRIs from 2011 to 2013, PPVs were met for our screening and diagnostic populations when compared to the MRI screening benchmarks established by the Breast Imaging Reporting and Data System (BI-RADS) 5 Atlas®. AIR and CDR were lower for screening indications as compared to diagnostic indications. New MRI screening benchmarks can be used for screening MRI audits while the American College of Radiology (ACR) desirable goals for diagnostic MG can be used for diagnostic MRI audits. Our study corroborates established findings regarding differences in AIR and CDR amongst screening versus diagnostic indications. © 2017 Wiley Periodicals, Inc.
Methodology and issues of integral experiments selection for nuclear data validation

NASA Astrophysics Data System (ADS)

Tatiana, Ivanova; Ivanov, Evgeny; Hill, Ian

2017-09-01

Nuclear data validation involves a large suite of Integral Experiments (IEs) for criticality, reactor physics and dosimetry applications. [1] Often benchmarks are taken from international Handbooks. [2, 3] Depending on the application, IEs have different degrees of usefulness in validation, and usually the use of a single benchmark is not advised; indeed, it may lead to erroneous interpretation and results. [1] This work aims at quantifying the importance of benchmarks used in application dependent cross section validation. The approach is based on the well-known General Linear Least Squared Method (GLLSM), extended to establish biases and uncertainties for given cross sections (within a given energy interval). The statistical treatment results in a vector of weighting factors for the integral benchmarks. These factors characterize the value added by a benchmark for nuclear data validation for the given application. The methodology is illustrated by one example, selecting benchmarks for 239Pu cross section validation. The studies were performed in the framework of Subgroup 39 (Methods and approaches to provide feedback from nuclear and covariance data adjustment for improvement of nuclear data files) established at the Working Party on International Nuclear Data Evaluation Cooperation (WPEC) of the Nuclear Science Committee under the Nuclear Energy Agency (NEA/OECD).
NAS Grid Benchmarks: A Tool for Grid Space Exploration

NASA Technical Reports Server (NTRS)

Frumkin, Michael; VanderWijngaart, Rob F.; Biegel, Bryan (Technical Monitor)

2001-01-01

We present an approach for benchmarking services provided by computational Grids. It is based on the NAS Parallel Benchmarks (NPB) and is called NAS Grid Benchmark (NGB) in this paper. We present NGB as a data flow graph encapsulating an instance of an NPB code in each graph node, which communicates with other nodes by sending/receiving initialization data. These nodes may be mapped to the same or different Grid machines. Like NPB, NGB will specify several different classes (problem sizes). NGB also specifies the generic Grid services sufficient for running the benchmark. The implementor has the freedom to choose any specific Grid environment. However, we describe a reference implementation in Java, and present some scenarios for using NGB.
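The data flow graph idea can be sketched in a few lines. The example below is only a toy under stated assumptions: the node names, the host names, and the `run_node` stand-in (which replaces an actual NPB code instance) are hypothetical, and it is written in Python rather than following the Java reference implementation mentioned in the abstract.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each node maps to the set of nodes it receives data from.
predecessors = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

# Hypothetical placement of nodes on Grid machines (same or different hosts).
placement = {"A": "host1", "B": "host2", "C": "host1", "D": "host3"}


def run_node(name, inputs):
    """Stand-in for one benchmark instance: consume input data, emit output data."""
    return sum(inputs) + 1


def run_graph(predecessors, placement, seed=0):
    """Run every node once all of its predecessors have delivered their data."""
    outputs = {}
    schedule = []
    for node in TopologicalSorter(predecessors).static_order():
        inputs = [outputs[p] for p in sorted(predecessors[node])] or [seed]
        outputs[node] = run_node(node, inputs)
        schedule.append((node, placement[node]))
    return outputs, schedule


outputs, schedule = run_graph(predecessors, placement)
```

A topological order guarantees that a node starts only after the nodes feeding it have finished, which is the only ordering constraint a data flow graph imposes.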
Simple Benchmark Specifications for Space Radiation Protection

NASA Technical Reports Server (NTRS)

Singleterry, Robert C. Jr.; Aghara, Sukesh K.

2013-01-01

This report defines space radiation benchmark specifications. This specification starts with simple, monoenergetic, mono-directional particles on slabs and progresses to human models in spacecraft. This report specifies everything from the models and sources needed to what the team performing the benchmark needs to produce in a report. Also included are brief descriptions of how OLTARIS, the NASA Langley website for space radiation analysis, performs its analysis.
IT-benchmarking of clinical workflows: concept, implementation, and evaluation.

PubMed

Thye, Johannes; Straede, Matthias-Christopher; Liebe, Jan-David; Hübner, Ursula

2014-01-01

Due to the emerging evidence of health IT as opportunity and risk for clinical workflows, health IT must undergo a continuous measurement of its efficacy and efficiency. IT-benchmarks are a proven means for providing this information. The aim of this study was to enhance the methodology of an existing benchmarking procedure by including, in particular, new indicators of clinical workflows and by proposing new types of visualisation. Drawing on the concept of information logistics, we propose four workflow descriptors that were applied to four clinical processes. General and specific indicators were derived from these descriptors and processes. 199 chief information officers (CIOs) took part in the benchmarking. These hospitals were assigned to reference groups of a similar size and ownership from a total of 259 hospitals. Stepwise and comprehensive feedback was given to the CIOs. Most participants who evaluated the benchmark rated the procedure as very good, good, or rather good (98.4%). Benchmark information was used by CIOs for getting a general overview, advancing IT, preparing negotiations with board members, and arguing for a new IT project.
Benchmarking the cost efficiency of community care in Australian child and adolescent mental health services: implications for future benchmarking.

PubMed

Furber, Gareth; Brann, Peter; Skene, Clive; Allison, Stephen

2011-06-01

The purpose of this study was to benchmark the cost efficiency of community care across six child and adolescent mental health services (CAMHS) drawn from different Australian states. Organizational, contact and outcome data from the National Mental Health Benchmarking Project (NMHBP) data-sets were used to calculate cost per "treatment hour" and cost per episode for the six participating organizations. We also explored the relationship between intake severity as measured by the Health of the Nation Outcome Scales for Children and Adolescents (HoNOSCA) and cost per episode. The average cost per treatment hour was $223, with cost differences across the six services ranging from a mean of $156 to $273 per treatment hour. The average cost per episode was $3349 (median $1577) and there were significant differences in the CAMHS organizational medians ranging from $388 to $7076 per episode. HoNOSCA scores explained at best 6% of the cost variance per episode. These large cost differences indicate that community CAMHS have the potential to make substantial gains in cost efficiency through collaborative benchmarking. Benchmarking forums need considerable financial and business expertise for detailed comparison of business models for service provision.
A benchmarking method to measure dietary absorption efficiency of chemicals by fish.

PubMed

Xiao, Ruiyang; Adolfsson-Erici, Margaretha; Åkerman, Gun; McLachlan, Michael S; MacLeod, Matthew

2013-12-01

Understanding the dietary absorption efficiency of chemicals in the gastrointestinal tract of fish is important from both a scientific and a regulatory point of view. However, reported fish absorption efficiencies for well-studied chemicals are highly variable. In the present study, the authors developed and exploited an internal chemical benchmarking method that has the potential to reduce uncertainty and variability and, thus, to improve the precision of measurements of fish absorption efficiency. The authors applied the benchmarking method to measure the gross absorption efficiency for 15 chemicals with a wide range of physicochemical properties and structures. They selected 2,2',5,6'-tetrachlorobiphenyl (PCB53) and decabromodiphenyl ethane as absorbable and nonabsorbable benchmarks, respectively. Quantities of chemicals determined in fish were benchmarked to the fraction of PCB53 recovered in fish, and quantities of chemicals determined in feces were benchmarked to the fraction of decabromodiphenyl ethane recovered in feces. The performance of the benchmarking procedure was evaluated based on the recovery of the test chemicals and precision of absorption efficiency from repeated tests. Benchmarking did not improve the precision of the measurements; after benchmarking, however, the median recovery for 15 chemicals was 106%, and variability of recoveries was reduced compared with before benchmarking, suggesting that benchmarking could account for incomplete extraction of chemical in fish and incomplete collection of feces from different tests. © 2013 SETAC.
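One possible reading of the benchmarking step is sketched below. It assumes that "benchmarked to the fraction recovered" means dividing each measured quantity by the recovered fraction of the corresponding benchmark chemical, and that absorption efficiency is the corrected amount in fish divided by the dose. Both the formulas and the numbers are illustrative assumptions, not the authors' published procedure or data.

```python
def benchmarked_recovery(dose, in_fish, in_feces,
                         absorbable_fraction_in_fish,
                         nonabsorbable_fraction_in_feces):
    """Correct measured amounts using the two internal benchmarks (assumed formulas).

    absorbable_fraction_in_fish     : fraction of the absorbable benchmark found in fish
    nonabsorbable_fraction_in_feces : fraction of the non-absorbable benchmark found in feces
    """
    # Scale up the fish measurement for incomplete extraction from fish tissue.
    fish_corrected = in_fish / absorbable_fraction_in_fish
    # Scale up the feces measurement for incomplete collection of feces.
    feces_corrected = in_feces / nonabsorbable_fraction_in_feces
    return {
        "raw_recovery": (in_fish + in_feces) / dose,
        "benchmarked_recovery": (fish_corrected + feces_corrected) / dose,
        "absorption_efficiency": fish_corrected / dose,
    }


# Hypothetical test chemical: 100 units dosed, 40 found in fish, 30 in feces.
result = benchmarked_recovery(dose=100.0, in_fish=40.0, in_feces=30.0,
                              absorbable_fraction_in_fish=0.8,
                              nonabsorbable_fraction_in_feces=0.6)
```

Under these made-up numbers the raw mass balance accounts for 70% of the dose, while the benchmarked mass balance closes at about 100%, which is the kind of improvement in recovery the abstract describes.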
Integrating CFD, CAA, and Experiments Towards Benchmark Datasets for Airframe Noise Problems

NASA Technical Reports Server (NTRS)

Choudhari, Meelan M.; Yamamoto, Kazuomi

2012-01-01

Airframe noise corresponds to the acoustic radiation due to turbulent flow in the vicinity of airframe components such as high-lift devices and landing gears. The combination of geometric complexity, high Reynolds number turbulence, multiple regions of separation, and a strong coupling with adjacent physical components makes the problem of airframe noise highly challenging. Since 2010, the American Institute of Aeronautics and Astronautics has organized an ongoing series of workshops devoted to Benchmark Problems for Airframe Noise Computations (BANC). The BANC workshops are aimed at enabling systematic progress in the understanding and high-fidelity prediction of airframe noise via collaborative investigations that integrate state-of-the-art computational fluid dynamics, computational aeroacoustics, and in-depth, holistic, and multifacility measurements targeting a selected set of canonical yet realistic configurations. This paper provides a brief summary of the BANC effort, including its technical objectives, strategy, and selected outcomes thus far.
[Benchmarking and other functions of ROM: back to basics].

PubMed

Barendregt, M

2015-01-01

Since 2011, outcome data in Dutch mental health care have been collected on a national scale. This has led to confusion about the position of benchmarking in the system known as routine outcome monitoring (ROM). The aim was to provide insight into the various objectives and uses of aggregated outcome data. A qualitative review was performed and the findings were analysed. Benchmarking is a strategy for finding best practices and for improving efficacy, and it belongs to the domain of quality management. Benchmarking involves comparing outcome data by means of instrumentation and is relatively tolerant with regard to the validity of the data. Although benchmarking is a function of ROM, it must be differentiated from the other functions of ROM. Clinical management, public accountability, research, payment for performance and information for patients are all functions of ROM which require different ways of data feedback and which make different demands on the validity of the underlying data. Benchmarking is often wrongly regarded as being simply a synonym for 'comparing institutions'. It is, however, a method which includes many more factors; it can be used to improve quality, has a more flexible approach to the validity of outcome data, and is less concerned than other ROM functions about funding and the amount of information given to patients. Benchmarking can make good use of currently available outcome data.
Implementation experiences of NASTRAN on CDC CYBER 74 SCOPE 3.4 operating system

NASA Technical Reports Server (NTRS)

Go, J. C.; Hill, R. G.

1973-01-01

The implementation of the NASTRAN system on the CDC CYBER 74 SCOPE 3.4 Operating System is described. The flexibility of the NASTRAN system made it possible to accomplish the change with no major problems. Various sizes of benchmark and test problems, ranging from two hours to less than one minute CP time were run on the CDC CYBER SCOPE 3.3, Univac EXEC-8, and CDC CYBER SCOPE 3.4. The NASTRAN installation deck is provided.
A Standard-Setting Study to Establish College Success Criteria to Inform the SAT® College and Career Readiness Benchmark. Research Report 2012-3

ERIC Educational Resources Information Center

Kobrin, Jennifer L.; Patterson, Brian F.; Wiley, Andrew; Mattern, Krista D.

2012-01-01

In 2011, the College Board released its SAT college and career readiness benchmark, which represents the level of academic preparedness associated with a high likelihood of college success and completion. The goal of this study, which was conducted in 2008, was to establish college success criteria to inform the development of the benchmark. The…
Phase 3 experiments of the JAERI/USDOE collaborative program on fusion blanket neutronics. Volume 1: Experiment

NASA Astrophysics Data System (ADS)

Oyama, Yukio; Konno, Chikara; Ikeda, Yujiro; Maekawa, Fujio; Kosako, Kazuaki; Nakamura, Tomoo; Maekawa, Hiroshi; Youssef, Mahmoud Z.; Kumar, Anil; Abdou, Mohamed A.

1994-02-01

A pseudo-line source has been realized by using an accelerator-based D-T point neutron source. The pseudo-line source is obtained by time averaging of a continuously moving point source or by superposition of finely distributed point sources. The line source is utilized for fusion blanket neutronics experiments with an annular geometry so as to simulate a part of a tokamak reactor. The source neutron characteristics were measured for two operational modes of the line source, continuous and step-wise, with the activation foil and the NE213 detectors, respectively. In order to give a source condition for a subsequent calculational analysis of the annular blanket experiment, the neutron source characteristics were calculated by a Monte Carlo code. The reliability of the Monte Carlo calculation was confirmed by comparison with the measured source characteristics. The annular blanket system was rectangular in shape with an inner cavity. The annular blanket consisted of a 15 mm-thick first wall (SS304) and a 406 mm-thick breeder zone with Li2O on the inside and Li2CO3 on the outside. The line source was produced at the center of the inner cavity by moving the annular blanket system over a span of 2 m. Three annular blanket configurations were examined: the reference blanket, the blanket covered with 25 mm-thick graphite armor, and the armor-blanket with a large opening. The neutronics parameters of tritium production rate, neutron spectrum and activation reaction rate were measured with specially developed techniques such as a multi-detector data acquisition system, a spectrum weighting function method and a ramp-controlled high voltage system. The present experiment provides unique data for a higher step of benchmark to test the reliability of neutronics design calculations for a realistic tokamak reactor.
Benchmarking with the BLASST Sessional Staff Standards Framework

ERIC Educational Resources Information Center

Luzia, Karina; Harvey, Marina; Parker, Nicola; McCormack, Coralie; Brown, Natalie R.

2013-01-01

Benchmarking as a type of knowledge-sharing around good practice within and between institutions is increasingly common in the higher education sector. More recently, benchmarking as a process that can contribute to quality enhancement has been deployed across numerous institutions with a view to systematising frameworks to assure and enhance the…
Thermal Performance Benchmarking: Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moreno, Gilbert

2016-04-08

The goal for this project is to thoroughly characterize the performance of state-of-the-art (SOA) automotive power electronics and electric motor thermal management systems. Information obtained from these studies will be used to: evaluate advantages and disadvantages of different thermal management strategies; establish baseline metrics for the thermal management systems; identify methods of improvement to advance the SOA; increase the publicly available information related to automotive traction-drive thermal management systems; help guide future electric drive technologies (EDT) research and development (R&D) efforts. The performance results combined with component efficiency and heat generation information obtained by Oak Ridge National Laboratory (ORNL) may then be used to determine the operating temperatures for the EDT components under drive-cycle conditions. In FY15, the 2012 Nissan LEAF power electronics and electric motor thermal management systems were benchmarked. Testing of the 2014 Honda Accord Hybrid power electronics thermal management system started in FY15; however, due to time constraints it was not possible to include results for this system in this report. The focus of this project is to benchmark the thermal aspects of the systems. ORNL's benchmarking of electric and hybrid electric vehicle technology reports provide detailed descriptions of the electrical and packaging aspects of these automotive systems.
OWL2 benchmarking for the evaluation of knowledge based systems.

PubMed

Khan, Sher Afgun; Qadir, Muhammad Abdul; Abbas, Muhammad Azeem; Afzal, Muhammad Tanvir

2017-01-01

OWL2 semantics are becoming increasingly popular for real domain applications such as gene engineering and health MIS. The present work identifies a research gap: negligible attention has been paid to the performance evaluation of Knowledge Base Systems (KBS) using OWL2 semantics. To fill this gap, an OWL2 benchmark for the evaluation of KBS is proposed. The proposed benchmark addresses the foundational blocks of an ontology benchmark, i.e. data schema, workload and performance metrics. The proposed benchmark is tested on memory-based, file-based, relational database and graph-based KBS for performance and scalability measures. The results show that the proposed benchmark is able to evaluate the behaviour of different state-of-the-art KBS on OWL2 semantics. On the basis of the results, end users (i.e. domain experts) would be able to select a KBS suitable for their domain.
Antibody-protein interactions: benchmark datasets and prediction tools evaluation

PubMed Central

Ponomarenko, Julia V; Bourne, Philip E

2007-01-01

Background: The ability to predict antibody binding sites (aka antigenic determinants or B-cell epitopes) for a given protein is a precursor to new vaccine design and diagnostics. Among the various methods of B-cell epitope identification, X-ray crystallography is one of the most reliable. Computational methods for B-cell epitope prediction exist that use these experimental data. As the number of structures of antibody-protein complexes grows, further interest in prediction methods using 3D structure is anticipated. This work aims to establish a benchmark for 3D structure-based epitope prediction methods. Results: Two B-cell epitope benchmark datasets inferred from the 3D structures of antibody-protein complexes were defined. The first is a dataset of 62 representative 3D structures of protein antigens with inferred structural epitopes. The second is a dataset of 82 structures of antibody-protein complexes containing different structural epitopes. Using these datasets, eight web-servers developed for antibody and protein binding sites prediction have been evaluated. In no method did performance exceed a 40% precision and 46% recall. The values of the area under the receiver operating characteristic curve for the evaluated methods were about 0.6 for ConSurf, DiscoTope, and PPI-PRED methods and above 0.65 but not exceeding 0.70 for protein-protein docking methods when the best of the top ten models for the bound docking were considered; the remaining methods performed close to random. The benchmark datasets are included as a supplement to this paper. Conclusion: It may be possible to improve epitope prediction methods through training on datasets which include only immune epitopes and through utilizing more features characterizing epitopes, for example, the evolutionary conservation score. Notwithstanding, overall poor performance may reflect the generality of antigenicity and hence the inability to decipher B-cell epitopes as an intrinsic feature of the protein. It

Benchmarking and beyond. Information trends in home care.

PubMed

Twiss, Amanda; Rooney, Heather; Lang, Christine

2002-11-01

With today's benchmarking concepts and tools, agencies have the unprecedented opportunity to use information as a strategic advantage. Because agencies are demanding more and better information, benchmark functionality has grown increasingly sophisticated. Agencies now require a new type of analysis, focused on high-level executive summaries while reducing the current "data overload."
Benchmarking strategies for measuring the quality of healthcare: problems and prospects.

PubMed

Lovaglio, Pietro Giorgio

2012-01-01

Over the last few years, increasing attention has been directed toward the problems inherent to measuring the quality of healthcare and implementing benchmarking strategies. Besides offering accreditation and certification processes, recent approaches measure the performance of healthcare institutions in order to evaluate their effectiveness, defined as the capacity to provide treatment that modifies and improves the patient's state of health. This paper, dealing with hospital effectiveness, focuses on research methods for effectiveness analyses within a strategy comparing different healthcare institutions. After introducing readers to the principal debates on benchmarking strategies, which depend on the perspective and type of indicators used, the paper focuses on the methodological problems related to performing consistent benchmarking analyses. In particular, statistical methods suitable for controlling case-mix and for analyzing aggregate data, rare events, and continuous outcomes measured with error are examined. Specific challenges of benchmarking strategies, such as the risk of risk adjustment (case-mix fallacy, underreporting, risk of comparing noncomparable hospitals), selection bias, and possible strategies for the development of consistent benchmarking analyses, are discussed. Finally, to demonstrate the feasibility of the illustrated benchmarking strategies, an application focused on determining regional benchmarks for patient satisfaction (using the 2009 Lombardy Region Patient Satisfaction Questionnaire) is proposed.
Benchmark Dataset for Whole Genome Sequence Compression.

PubMed

C L, Biji; S Nair, Achuthsankar

2017-01-01

Research in DNA data compression lacks a standard dataset for testing compression tools specific to DNA. This paper argues that the current state of achievement in DNA compression cannot be benchmarked in the absence of such a scientifically compiled whole genome sequence dataset, and proposes a benchmark dataset built using a multistage sampling procedure. Considering the genome sequences of organisms available in the National Center for Biotechnology Information (NCBI) as the universe, the proposed dataset selects 1,105 prokaryotes, 200 plasmids, 164 viruses, and 65 eukaryotes. This paper reports the results of using three established tools on the newly compiled dataset and shows that their strengths and weaknesses are evident only with a comparison based on the scientifically compiled benchmark dataset. The sample dataset and the respective links are available at https://sourceforge.net/projects/benchmarkdnacompressiondataset/.
Scalable randomized benchmarking of non-Clifford gates

NASA Astrophysics Data System (ADS)

Cross, Andrew; Magesan, Easwar; Bishop, Lev; Smolin, John; Gambetta, Jay

Randomized benchmarking is a widely used experimental technique to characterize the average error of quantum operations. Benchmarking procedures that scale to enable characterization of n-qubit circuits rely on efficient procedures for manipulating those circuits and, as such, have been limited to subgroups of the Clifford group. However, universal quantum computers require additional, non-Clifford gates to approximate arbitrary unitary transformations. We define a scalable randomized benchmarking procedure over n-qubit unitary matrices that correspond to protected non-Clifford gates for a class of stabilizer codes. We present efficient methods for representing and composing group elements, sampling them uniformly, and synthesizing corresponding poly(n)-sized circuits. The procedure provides experimental access to two independent parameters that together characterize the average gate fidelity of a group element. We acknowledge support from ARO under Contract W911NF-14-1-0124.
Fourth Computational Aeroacoustics (CAA) Workshop on Benchmark Problems

NASA Technical Reports Server (NTRS)

Dahl, Milo D. (Editor)

2004-01-01

This publication contains the proceedings of the Fourth Computational Aeroacoustics (CAA) Workshop on Benchmark Problems. In this workshop, as in previous workshops, the problems were devised to gauge the technological advancement of computational techniques to calculate all aspects of sound generation and propagation in air directly from the fundamental governing equations. A variety of benchmark problems have been previously solved ranging from simple geometries with idealized acoustic conditions to test the accuracy and effectiveness of computational algorithms and numerical boundary conditions; to sound radiation from a duct; to gust interaction with a cascade of airfoils; to the sound generated by a separating, turbulent viscous flow. By solving these and similar problems, workshop participants have shown the technical progress from the basic challenges to accurate CAA calculations to the solution of CAA problems of increasing complexity and difficulty. The fourth CAA workshop emphasized the application of CAA methods to the solution of realistic problems. The workshop was held at the Ohio Aerospace Institute in Cleveland, Ohio, on October 20 to 22, 2003. At that time, workshop participants presented their solutions to problems in one or more of five categories. Their solutions are presented in these proceedings along with the comparisons of their solutions to the benchmark solutions or experimental data. The five categories for the benchmark problems were as follows: Category 1: Basic Methods. The numerical computation of sound is affected by, among other issues, the choice of grid used and by the boundary conditions. Category 2: Complex Geometry. The ability to compute the sound in the presence of complex geometric surfaces is important in practical applications of CAA. Category 3: Sound Generation by Interacting With a Gust. The practical application of CAA for computing noise generated by turbomachinery involves the modeling of the noise source mechanism as a
Implementation of NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael; Schultz, Matthew; Jin, Hao-Qiang; Yan, Jerry

2000-01-01

A number of features make Java an attractive but debatable choice for High Performance Computing (HPC). In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would move Java closer to Fortran in the competition for CFD applications.
Data Race Benchmark Collection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liao, Chunhua; Lin, Pei-Hung; Asplund, Joshua

2017-03-21

This project is a benchmark suite of OpenMP parallel codes that have been checked for data races. The programs are marked to show which do and do not have races. This allows them to be leveraged while testing and developing race detection tools.
The Medical Library Association Benchmarking Network: development and implementation.

PubMed

Dudden, Rosalind Farnam; Corcoran, Kate; Kaplan, Janice; Magouirk, Jeff; Rand, Debra C; Smith, Bernie Todd

2006-04-01

This article explores the development and implementation of the Medical Library Association (MLA) Benchmarking Network from the initial idea and test survey, to the implementation of a national survey in 2002, to the establishment of a continuing program in 2004. Started as a program for hospital libraries, it has expanded to include other nonacademic health sciences libraries. The activities and timelines of MLA's Benchmarking Network task forces and editorial board from 1998 to 2004 are described. The Benchmarking Network task forces successfully developed an extensive questionnaire with parameters of size and measures of library activity and published a report of the data collected by September 2002. The data were available to all MLA members in the form of aggregate tables. Utilization of Web-based technologies proved feasible for data intake and interactive display. A companion article analyzes and presents some of the data. MLA has continued to develop the Benchmarking Network with the completion of a second survey in 2004. The Benchmarking Network has provided many small libraries with comparative data to present to their administrators. It is a challenge for the future to convince all MLA members to participate in this valuable program.
A benchmarking tool to evaluate computer tomography perfusion infarct core predictions against a DWI standard.

PubMed

Cereda, Carlo W; Christensen, Søren; Campbell, Bruce Cv; Mishra, Nishant K; Mlynash, Michael; Levi, Christopher; Straka, Matus; Wintermark, Max; Bammer, Roland; Albers, Gregory W; Parsons, Mark W; Lansberg, Maarten G

2016-10-01

Differences in research methodology have hampered the optimization of Computer Tomography Perfusion (CTP) for identification of the ischemic core. We aim to optimize CTP core identification using a novel benchmarking tool. The benchmarking tool consists of an imaging library and a statistical analysis algorithm to evaluate the performance of CTP. The tool was used to optimize and evaluate an in-house developed CTP-software algorithm. Imaging data of 103 acute stroke patients were included in the benchmarking tool. Median time from stroke onset to CT was 185 min (IQR 180-238), and the median time between completion of CT and start of MRI was 36 min (IQR 25-79). Volumetric accuracy of the CTP-ROIs was optimal at an rCBF threshold of <38%; at this threshold, the mean difference was 0.3 ml (SD 19.8 ml), the mean absolute difference was 14.3 (SD 13.7) ml, and CTP was 67% sensitive and 87% specific for identification of DWI positive tissue voxels. The benchmarking tool can play an important role in optimizing CTP software as it provides investigators with a novel method to directly compare the performance of alternative CTP software packages. © The Author(s) 2015.
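The voxel-wise comparison behind figures such as sensitivity, specificity, and volume difference can be sketched as follows. This is a minimal illustration, not the tool's algorithm: the flat voxel lists, the 0.38 relative-CBF cutoff applied as a simple comparison, the voxel volume, and all values are assumptions chosen to keep the example small.

```python
def threshold_core(rcbf_values, threshold):
    """Mark a voxel as predicted core when its relative CBF is below the threshold."""
    return [value < threshold for value in rcbf_values]


def voxel_metrics(predicted, reference, voxel_volume_ml):
    """Compare a predicted core mask with a reference (e.g. DWI-positive) mask."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Signed volume difference: predicted volume minus reference volume.
        "volume_difference_ml": (sum(predicted) - sum(reference)) * voxel_volume_ml,
    }


# Hypothetical eight-voxel example.
rcbf = [0.10, 0.30, 0.45, 0.90, 0.35, 0.80, 1.00, 0.95]
dwi_positive = [True, True, True, False, False, False, False, False]

core_mask = threshold_core(rcbf, threshold=0.38)
metrics = voxel_metrics(core_mask, dwi_positive, voxel_volume_ml=0.5)
```

Note that the volume difference here is zero even though two voxels are misclassified, which is why volumetric agreement and voxel-wise sensitivity and specificity are reported separately.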
The InterFrost benchmark of Thermo-Hydraulic codes for cold regions hydrology - first inter-comparison results

NASA Astrophysics Data System (ADS)

Grenier, Christophe; Roux, Nicolas; Anbergen, Hauke; Collier, Nathaniel; Costard, Francois; Ferry, Michel; Frampton, Andrew; Frederick, Jennifer; Holmen, Johan; Jost, Anne; Kokh, Samuel; Kurylyk, Barret; McKenzie, Jeffrey; Molson, John; Orgogozo, Laurent; Rivière, Agnès; Rühaak, Wolfram; Selroos, Jan-Olof; Therrien, René; Vidstrand, Patrik

2015-04-01

The impacts of climate change in boreal regions have received considerable attention recently due to the warming trends that have been experienced in recent decades and are expected to intensify in the future. Large portions of these regions, corresponding to permafrost areas, are covered by water bodies (lakes, rivers) that interact with the surrounding permafrost. For example, the thermal state of the surrounding soil influences the energy and water budget of the surface water bodies. Also, these water bodies generate taliks (unfrozen zones below) that disturb the thermal regimes of permafrost and may play a key role in the context of climate change. Recent field studies and modeling exercises indicate that a fully coupled 2D or 3D Thermo-Hydraulic (TH) approach is required to understand and model the past and future evolution of landscapes, rivers, lakes and associated groundwater systems in a changing climate. However, there is presently a paucity of 3D numerical studies of permafrost thaw and associated hydrological changes, and the lack of study can be partly attributed to the difficulty in verifying multi-dimensional results produced by numerical models. Numerical approaches can only be validated against analytical solutions for a purely thermal 1D equation with phase change (e.g. Neumann, Lunardini). When it comes to the coupled TH system (coupling two highly non-linear equations), the only possible approach is to compare the results from different codes on provided test cases and/or to have controlled experiments for validation. Such inter-code comparisons can propel discussions to try to improve code performances. A benchmark exercise was initiated in 2014 with a kick-off meeting in Paris in November. Participants from USA, Canada, Germany, Sweden and France convened, representing altogether 13 simulation codes. The benchmark exercises consist of several test cases inspired by existing literature (e.g. McKenzie et al., 2007) as well as new ones. They
Implementation and verification of global optimization benchmark problems

NASA Astrophysics Data System (ADS)

Posypkin, Mikhail; Usov, Alexander

2017-12-01

The paper considers the implementation and verification of a test suite containing 150 benchmarks for global deterministic box-constrained optimization. A C++ library for describing standard mathematical expressions was developed for this purpose. The library automates the process of generating the value of a function and its gradient at a given point, and the interval estimates of a function and its gradient on a given box, using a single description. Based on this functionality, we have developed a collection of tests for automatic verification of the proposed benchmarks. The verification has shown that literature sources contain mistakes in the benchmark descriptions. The library and the test suite are available for download and can be used freely.
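The idea of deriving values, gradients, and interval bounds from a single description can be illustrated with operator overloading. The sketch below is in Python rather than the paper's C++, supports only addition and multiplication, ignores outward rounding of interval endpoints, and uses a made-up test function; the class and function names are hypothetical and do not reflect the library's actual interface.

```python
class Dual:
    """A value together with its gradient (forward-mode differentiation)."""
    def __init__(self, val, grad):
        self.val, self.grad = val, tuple(grad)

    def _lift(self, other):
        # Treat plain numbers as constants with a zero gradient.
        return other if isinstance(other, Dual) else Dual(other, [0.0] * len(self.grad))

    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, [a + b for a, b in zip(self.grad, o.grad)])
    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        # Product rule applied component by component.
        return Dual(self.val * o.val,
                    [self.val * b + o.val * a for a, b in zip(self.grad, o.grad)])
    __rmul__ = __mul__


class Interval:
    """A closed interval [lo, hi]; arithmetic returns an enclosure of the result."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def _lift(self, other):
        return other if isinstance(other, Interval) else Interval(other, other)

    def __add__(self, other):
        o = self._lift(other)
        return Interval(self.lo + o.lo, self.hi + o.hi)
    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        products = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(products), max(products))
    __rmul__ = __mul__


def f(x, y):
    """Single description of a hypothetical test function f(x, y) = x*x + x*y."""
    return x * x + x * y


value = f(2.0, 3.0)                                            # plain evaluation
dual = f(Dual(2.0, [1.0, 0.0]), Dual(3.0, [0.0, 1.0]))         # value and gradient
bound = f(Interval(1.0, 2.0), Interval(-1.0, 1.0))             # enclosure on a box
```

The same function body `f` serves all three evaluations. The interval result [-1, 6] is wider than the true range [0, 6] on that box because `x` appears twice in the expression, a known overestimation effect of naive interval arithmetic.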
Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Gaeke, Brian R.; Husbands, Parry; Li, Xiaoye S.; Oliker, Leonid; Yelick, Katherine A.; Biegel, Bryan (Technical Monitor)

2002-01-01

The increasing gap between processor and memory performance has led to new architectural models for memory-intensive applications. In this paper, we explore the performance of a set of memory-intensive benchmarks and use them to compare the performance of conventional cache-based microprocessors to a mixed logic and DRAM processor called VIRAM. The benchmarks are based on problem statements, rather than specific implementations, and in each case we explore the fundamental hardware requirements of the problem, as well as alternative algorithms and data structures that can help expose fine-grained parallelism or simplify memory access patterns. The benchmarks are characterized by their memory access patterns, their basic control structures, and the ratio of computation to memory operation.
New NAS Parallel Benchmarks Results

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

1997-01-01

NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.
OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE - A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alan Black; Arnis Judzis

2003-01-01

a commercial deal with Halliburton on the supply of fluid hammers to the oil and gas business. (4) TerraTek is awaiting progress by Novatek (a DOE contractor) on the redesign and development of their next hammer tool. Their delay will require an extension to TerraTek's contracted program. (5) Smith International has sufficient interest in the program to start engineering and chroming of collars for testing at TerraTek. (6) Shell's Brian Tarr has agreed to join the Industry Advisory Group for the DOE project. The addition of Brian Tarr is welcomed as he has numerous years of experience with the Novatek tool and was involved in the early tests in Europe while with Mobil Oil. (7) Conoco's field trial of the Smith fluid hammer for an application in Vietnam was organized and has contributed to the increased interest in their tool. Progress during Q3 2002: (1) Smith International agreed to participate in the DOE Mud Hammer program. (2) Smith International chromed collars for upcoming benchmark tests at TerraTek, now scheduled for 4Q 2002. (3) ConocoPhillips had a field trial of the Smith fluid hammer offshore Vietnam. The hammer functioned properly, though the well encountered hole conditions and reaming problems. ConocoPhillips plans another field trial as a result. (4) DOE/NETL extended the contract for the fluid hammer program to allow Novatek to "optimize" their much delayed tool to 2003 and to allow Smith International to add "benchmarking" tests in light of SDS Digger Tools' current financial inability to participate. (5) ConocoPhillips joined the Industry Advisors for the mud hammer program. Progress during Q4 2002: (1) Smith International participated in the DOE Mud Hammer program through full scale benchmarking testing during the week of 4 November 2003. (2) TerraTek acknowledges Smith International, BP America, PDVSA, and ConocoPhillips for cost-sharing the Smith benchmarking tests allowing extension of the contract to add to the benchmarking testing program
Benchmarking in Education: Tech Prep, a Case in Point. IEE Brief Number 8.

ERIC Educational Resources Information Center

Inger, Morton

Benchmarking is a process by which organizations compare their practices, processes, and outcomes to standards of excellence in a systematic way. The benchmarking process entails the following essential steps: determining what to benchmark and establishing internal baseline data; identifying the benchmark; determining how that standard has been…
Comparison of the PHISICS/RELAP5-3D Ring and Block Model Results for Phase I of the OECD MHTGR-350 Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom

2014-04-01

The INL PHISICS code system consists of three modules providing improved core simulation capability: INSTANT (performing 3D nodal transport core calculations), MRTAU (depletion and decay heat generation) and a perturbation/mixer module. Coupling of the PHISICS code suite to the thermal hydraulics system code RELAP5-3D has recently been finalized, and as part of the code verification and validation program the exercises defined for Phase I of the OECD/NEA MHTGR 350 MW Benchmark were completed. This paper provides an overview of the MHTGR Benchmark, and presents selected results of the three steady state exercises 1-3 defined for Phase I. For Exercise 1, a stand-alone steady-state neutronics solution for an End of Equilibrium Cycle Modular High Temperature Reactor (MHTGR) was calculated with INSTANT, using the provided geometry, material descriptions, and detailed cross-section libraries. Exercise 2 required the modeling of a stand-alone thermal fluids solution. The RELAP5-3D results of four sub-cases are discussed, consisting of various combinations of coolant bypass flows and material thermophysical properties. Exercise 3 combined the first two exercises in a coupled neutronics and thermal fluids solution, and the coupled code suite PHISICS/RELAP5-3D was used to calculate the results of two sub-cases. The main focus of the paper is a comparison of the traditional RELAP5-3D “ring” model approach vs. a much more detailed model that includes kinetics feedback on individual block level and thermal feedbacks on a triangular sub-mesh. The higher fidelity of the block model is illustrated with comparison results on the temperature, power density and flux distributions, and the typical under-predictions produced by the ring model approach are highlighted.
Possibilities and challenges of a large international benchmarking in pediatric diabetology-The SWEET experience.

PubMed

Witsch, Michael; Kosteria, Ioanna; Kordonouri, Olga; Alonso, Guy; Archinkova, Margarita; Besancon, Stephane; Birkebæk, Niels H; Bratina, Natasa; Cherubini, Valentino; Hanas, Ragnar; Hasnani, Dhruvi; Iotova, Violeta; Raposo, João Filipe; Schwandt, Anke; Sumnik, Zdenek; Svensson, Jannet; Veeze, Henk

2016-10-01

Despite the existence of evidence-based guidelines for the care of children with diabetes, widespread gaps in knowledge, attitude, and practice remain. The purpose of this paper is to present a review of benchmarking practices and results of this process within SWEET, with a focus on current challenges and future directions. Biannually, members electronically transfer de-identified clinic data for 37 parameters to the SWEET database. Each center receives benchmarking and data validation reports. In 2015, 48 centers contributed data for 20 165 unique patients (51.6% male). After exclusion for missing data, 19 131 patients remain for further analysis. The median age is 14.2 years, with a median diabetes duration of 4.8 years; 96.0% of patients have type 1, 1.1% type 2, and 2.9% other diabetes types. Data completeness has increased over time. In 2015, median HbA1c of all patients' (diabetes type 1) medians was 7.8% (61.7 mmol/mol) with 39.1%, 41.4%, and 19.4% of patients having HbA1c < 7.5% (58 mmol/mol), 7.5%-9% (58-75 mmol/mol) and >9% (75 mmol/mol), respectively. Although HbA1c has been stable over time [7.7%-7.8% (60.7-61.7 mmol/mol)], there remains wide variation between centers. Fourteen centers achieve a median HbA1c <7.5% (58 mmol/mol). Our vision is that the participation in SWEET is encouraging members to deliver increasingly accurate and complete data. Dissemination of results and prospective projects serve as further motivation to improve data reporting. Comparing processes and outcomes will help members identify weaknesses and introduce innovative solutions, resulting in improved and more uniform care for patients with diabetes. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
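The paired HbA1c values quoted in this abstract (percent and mmol/mol) are consistent with the standard IFCC-NGSP master equation, IFCC = 10.929 × (NGSP − 2.15). The abstract does not state which conversion the authors applied, so the sketch below is only a consistency check under that assumption; the function name `ngsp_to_ifcc` is hypothetical.

```python
def ngsp_to_ifcc(hba1c_percent: float) -> float:
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).

    Uses the IFCC-NGSP master equation; assumed here, not taken
    from the abstract itself.
    """
    return 10.929 * (hba1c_percent - 2.15)


# Values reported in the abstract, as percent.
reported_percent = [7.5, 7.7, 7.8, 9.0]
converted = {p: ngsp_to_ifcc(p) for p in reported_percent}
```

Rounded to the precision used in the abstract, these conversions give 58, 60.7, 61.7, and 75 mmol/mol, matching the quoted figures.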
Benchmarking Strategies for Measuring the Quality of Healthcare: Problems and Prospects

PubMed Central

Lovaglio, Pietro Giorgio

2012-01-01

Over the last few years, increasing attention has been directed toward the problems inherent to measuring the quality of healthcare and implementing benchmarking strategies. Besides offering accreditation and certification processes, recent approaches measure the performance of healthcare institutions in order to evaluate their effectiveness, defined as the capacity to provide treatment that modifies and improves the patient's state of health. This paper, dealing with hospital effectiveness, focuses on research methods for effectiveness analyses within a strategy comparing different healthcare institutions. The paper, after having introduced readers to the principal debates on benchmarking strategies, which depend on the perspective and type of indicators used, focuses on the methodological problems related to performing consistent benchmarking analyses. Particularly, statistical methods suitable for controlling case-mix, analyzing aggregate data, rare events, and continuous outcomes measured with error are examined. Specific challenges of benchmarking strategies, such as the risk of risk adjustment (case-mix fallacy, underreporting, risk of comparing noncomparable hospitals), selection bias, and possible strategies for the development of consistent benchmarking analyses, are discussed. Finally, to demonstrate the feasibility of the illustrated benchmarking strategies, an application focused on determining regional benchmarks for patient satisfaction (using 2009 Lombardy Region Patient Satisfaction Questionnaire) is proposed. PMID:22666140
Development and Experimental Benchmark of Simulations to Predict Used Nuclear Fuel Cladding Temperatures during Drying and Transfer Operations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greiner, Miles

Radial hydride formation in high-burnup used fuel cladding has the potential to radically reduce its ductility and suitability for long-term storage and eventual transport. To avoid this formation, the maximum post-reactor temperature must remain sufficiently low to limit the cladding hoop stress, and so that hydrogen from the existing circumferential hydrides will not dissolve and become available to re-precipitate into radial hydrides under the slow cooling conditions during drying, transfer and early dry-cask storage. The objective of this research is to develop and experimentally benchmark computational fluid dynamics simulations of heat transfer in post-pool-storage drying operations, when high-burnup fuel cladding is likely to experience its highest temperature. These benchmarked tools can play a key role in evaluating dry cask storage systems for extended storage of high-burnup fuels and post-storage transportation, including fuel retrievability. The benchmarked tools will be used to aid the design of efficient drying processes, as well as estimate variations of surface temperatures as a means of inferring helium integrity inside the canister or cask. This work will be conducted effectively because the principal investigator has experience developing these types of simulations, and has constructed a test facility that can be used to benchmark them.
MIPS bacterial genomes functional annotation benchmark dataset.

PubMed

Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Wernen

2005-05-15

Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2010 CFR

2010-10-01

...) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has an aggregate... planning services and supplies and other appropriate preventive services, as designated by the Secretary... State for purposes of comparison in establishing the aggregate actuarial value of the benchmark...
Benchmarking in pathology: development of a benchmarking complexity unit and associated key performance indicators.

PubMed

Neil, Amanda; Pfeffer, Sally; Burnett, Leslie

2013-01-01

This paper details the development of a new type of pathology laboratory productivity unit, the benchmarking complexity unit (BCU). The BCU provides a comparative index of laboratory efficiency, regardless of test mix. It also enables estimation of a measure of how much complex pathology a laboratory performs, and the identification of peer organisations for the purposes of comparison and benchmarking. The BCU is based on the theory that wage rates reflect productivity at the margin. A weighting factor for the ratio of medical to technical staff time was dynamically calculated based on actual participant site data. Given this weighting, a complexity value for each test, at each site, was calculated. The median complexity value (number of BCUs) for that test across all participating sites was taken as its complexity value for the Benchmarking in Pathology Program. The BCU allowed implementation of an unbiased comparison unit and test listing that was found to be a robust indicator of the relative complexity for each test. Employing the BCU data, a number of Key Performance Indicators (KPIs) were developed, including three that address comparative organisational complexity, analytical depth and performance efficiency, respectively. Peer groups were also established using the BCU combined with simple organisational and environmental metrics. The BCU has enabled productivity statistics to be compared between organisations. The BCU corrects for differences in test mix and workload complexity of different organisations and also allows for objective stratification into peer groups.
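The aggregation step described above (one complexity value per test per site, then the median across sites) can be sketched as follows. The abstract does not give the formula for a site's complexity value, so the weighted sum of technical and medical staff time used here is purely an assumption, as are the function names and all of the numbers.

```python
from statistics import median


def site_complexity(medical_minutes: float,
                    technical_minutes: float,
                    medical_weight: float) -> float:
    """Hypothetical per-site complexity value for one test.

    Assumes technical time plus medical time scaled by a weighting
    factor; the published BCU calculation may differ.
    """
    return technical_minutes + medical_weight * medical_minutes


def program_complexity(site_values: list[float]) -> float:
    """Median across participating sites, as the abstract describes."""
    return median(site_values)


# Invented data for one test at three sites: (medical min, technical min).
sites = [(2.0, 10.0), (1.0, 12.0), (4.0, 8.0)]
weight = 3.0  # invented medical-to-technical weighting factor
values = [site_complexity(m, t, weight) for m, t in sites]
bcu_for_test = program_complexity(values)
```

Taking the median rather than the mean keeps a single atypical site from shifting the program-wide value for a test.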
Benchmarking and audit of breast units improves quality of care

PubMed Central

van Dam, P.A.; Verkinderen, L.; Hauspy, J.; Vermeulen, P.; Dirix, L.; Huizing, M.; Altintas, S.; Papadimitriou, K.; Peeters, M.; Tjalma, W.

2013-01-01

Quality Indicators (QIs) are measures of health care quality that make use of readily available hospital inpatient administrative data. Assessment of quality of care can be performed on different levels: national, regional, on a hospital basis or on an individual basis. It can be a mandatory or voluntary system. In all cases development of an adequate database for data extraction, and feedback of the findings is of paramount importance. In the present paper we performed a Medline search on “QIs and breast cancer” and “benchmarking and breast cancer care”, and we have added some data from personal experience. The current data clearly show that the use of QIs for breast cancer care, regular internal and external audit of performance of breast units, and benchmarking are effective to improve quality of care. Adherence to guidelines improves markedly (particularly regarding adjuvant treatment) and there are data emerging showing that this results in a better outcome. As quality assurance benefits patients, it will be a challenge for the medical and hospital community to develop affordable quality control systems, which are not leading to excessive workload. PMID:24753926
Teaching Benchmark Strategy for Fifth-Graders in Taiwan

ERIC Educational Resources Information Center

Yang, Der-Ching; Lai, M. L.

2013-01-01

The key purpose of this study was to examine how the use of benchmark strategy for comparing fractions was taught to fifth-graders in Taiwan. Twenty-six fifth graders from a public elementary school in southern Taiwan were selected to join this study. Results of this case study showed that students made much progress in the use of benchmark strategy when comparing fraction…
Regional restoration benchmarks for Acropora cervicornis

NASA Astrophysics Data System (ADS)

Schopmeyer, Stephanie A.; Lirman, Diego; Bartels, Erich; Gilliam, David S.; Goergen, Elizabeth A.; Griffin, Sean P.; Johnson, Meaghan E.; Lustic, Caitlin; Maxwell, Kerry; Walter, Cory S.

2017-12-01

Coral gardening plays an important role in the recovery of depleted populations of threatened Acropora cervicornis in the Caribbean. Over the past decade, high survival coupled with fast growth of in situ nursery corals have allowed practitioners to create healthy and genotypically diverse nursery stocks. Currently, thousands of corals are propagated and outplanted onto degraded reefs on a yearly basis, representing a substantial increase in the abundance, biomass, and overall footprint of A. cervicornis. Here, we combined an extensive dataset collected by restoration practitioners to document early (1-2 yr) restoration success metrics in Florida and Puerto Rico, USA. By reporting region-specific data on the impacts of fragment collection on donor colonies, survivorship and productivity of nursery corals, and survivorship and productivity of outplanted corals during normal conditions, we provide the basis for a stop-light indicator framework for new or existing restoration programs to evaluate their performance. We show that current restoration methods are very effective, that no excess damage is caused to donor colonies, and that once outplanted, corals behave just as wild colonies. We also provide science-based benchmarks that can be used by programs to evaluate successes and challenges of their efforts, and to make modifications where needed. We propose that up to 10% of the biomass can be collected from healthy, large A. cervicornis donor colonies for nursery propagation. We also propose the following benchmarks for the first year of activities for A. cervicornis restoration: (1) >75% live tissue cover on donor colonies; (2) >80% survivorship of nursery corals; and (3) >70% survivorship of outplanted corals. Finally, we report productivity means of 4.4 cm yr⁻¹ for nursery corals and 4.8 cm yr⁻¹ for outplants as a frame of reference for ranking performance within programs. Such benchmarks, and potential subsequent adaptive actions, are needed to fully assess the
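The three first-year benchmarks proposed in this abstract lend themselves to a simple check, sketched below. Only the thresholds (>75%, >80%, >70%) come from the abstract; the dictionary keys, the function name, and the example program's numbers are hypothetical, and the sketch reports a plain pass or fail per metric rather than guessing at the cut-offs a full stop-light scheme would need.

```python
# First-year thresholds quoted in the abstract (percent, strict ">").
FIRST_YEAR_BENCHMARKS = {
    "donor_live_tissue_pct": 75.0,
    "nursery_survival_pct": 80.0,
    "outplant_survival_pct": 70.0,
}


def meets_benchmarks(metrics: dict[str, float]) -> dict[str, bool]:
    """Return, per metric, whether the program exceeds the benchmark."""
    return {
        name: metrics[name] > threshold
        for name, threshold in FIRST_YEAR_BENCHMARKS.items()
    }


# Invented monitoring results for one restoration program.
example_program = {
    "donor_live_tissue_pct": 80.0,
    "nursery_survival_pct": 78.0,
    "outplant_survival_pct": 71.0,
}
report = meets_benchmarks(example_program)
```

In this invented case the nursery survivorship falls short of its benchmark while the other two metrics pass, which is the kind of result that would prompt a program to review its nursery practices.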
7 CFR 1709.5 - Determination of energy cost benchmarks.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 7 Agriculture 11 2012-01-01 2012-01-01 false Determination of energy cost benchmarks. 1709.5... SERVICE, DEPARTMENT OF AGRICULTURE ASSISTANCE TO HIGH ENERGY COST COMMUNITIES General Requirements § 1709.5 Determination of energy cost benchmarks. (a) The Administrator shall establish, using the most...
7 CFR 1709.5 - Determination of energy cost benchmarks.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 7 Agriculture 11 2014-01-01 2014-01-01 false Determination of energy cost benchmarks. 1709.5... SERVICE, DEPARTMENT OF AGRICULTURE ASSISTANCE TO HIGH ENERGY COST COMMUNITIES General Requirements § 1709.5 Determination of energy cost benchmarks. (a) The Administrator shall establish, using the most...
7 CFR 1709.5 - Determination of energy cost benchmarks.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 7 Agriculture 11 2010-01-01 2010-01-01 false Determination of energy cost benchmarks. 1709.5... SERVICE, DEPARTMENT OF AGRICULTURE ASSISTANCE TO HIGH ENERGY COST COMMUNITIES General Requirements § 1709.5 Determination of energy cost benchmarks. (a) The Administrator shall establish, using the most...
7 CFR 1709.5 - Determination of energy cost benchmarks.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 7 Agriculture 11 2011-01-01 2011-01-01 false Determination of energy cost benchmarks. 1709.5... SERVICE, DEPARTMENT OF AGRICULTURE ASSISTANCE TO HIGH ENERGY COST COMMUNITIES General Requirements § 1709.5 Determination of energy cost benchmarks. (a) The Administrator shall establish, using the most...
7 CFR 1709.5 - Determination of energy cost benchmarks.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 7 Agriculture 11 2013-01-01 2013-01-01 false Determination of energy cost benchmarks. 1709.5... SERVICE, DEPARTMENT OF AGRICULTURE ASSISTANCE TO HIGH ENERGY COST COMMUNITIES General Requirements § 1709.5 Determination of energy cost benchmarks. (a) The Administrator shall establish, using the most...
Nomenclatural Benchmarking: The roles of digital typification and telemicroscopy

USDA-ARS's Scientific Manuscript database

The process of nomenclatural benchmarking is the examination of type specimens of all available names to ascertain which currently accepted species the specimen bearing the name falls within. We propose a strategy for addressing four challenges for nomenclatural benchmarking. First, there is the mat...
Using Toyota's A3 Thinking for Analyzing MBA Business Cases

ERIC Educational Resources Information Center

Anderson, Joe S.; Morgan, James N.; Williams, Susan K.

2011-01-01

A3 Thinking is fundamental to Toyota's benchmark management philosophy and to their lean production system. It is used to solve problems, gain agreement, mentor team members, and lead organizational improvements. A structured problem-solving approach, A3 Thinking builds improvement opportunities through experience. We used "The Toyota…
The Medical Library Association Benchmarking Network: development and implementation

PubMed Central

Dudden, Rosalind Farnam; Corcoran, Kate; Kaplan, Janice; Magouirk, Jeff; Rand, Debra C.; Smith, Bernie Todd

2006-01-01

Objective: This article explores the development and implementation of the Medical Library Association (MLA) Benchmarking Network from the initial idea and test survey, to the implementation of a national survey in 2002, to the establishment of a continuing program in 2004. Started as a program for hospital libraries, it has expanded to include other nonacademic health sciences libraries. Methods: The activities and timelines of MLA's Benchmarking Network task forces and editorial board from 1998 to 2004 are described. Results: The Benchmarking Network task forces successfully developed an extensive questionnaire with parameters of size and measures of library activity and published a report of the data collected by September 2002. The data were available to all MLA members in the form of aggregate tables. Utilization of Web-based technologies proved feasible for data intake and interactive display. A companion article analyzes and presents some of the data. MLA has continued to develop the Benchmarking Network with the completion of a second survey in 2004. Conclusions: The Benchmarking Network has provided many small libraries with comparative data to present to their administrators. It is a challenge for the future to convince all MLA members to participate in this valuable program. PMID:16636702
Modification and benchmarking of MCNP for low-energy tungsten spectra.

PubMed

Mercier, J R; Kopp, D T; McDavid, W D; Dove, S B; Lancaster, J L; Tucker, D M

2000-12-01

The MCNP Monte Carlo radiation transport code was modified for diagnostic medical physics applications. In particular, the modified code was thoroughly benchmarked for the production of polychromatic tungsten x-ray spectra in the 30-150 kV range. Validating the modified code for coupled electron-photon transport with benchmark spectra was supplemented with independent electron-only and photon-only transport benchmarks. Major revisions to the code included the proper treatment of characteristic K x-ray production and scoring, new impact ionization cross sections, and new bremsstrahlung cross sections. Minor revisions included updated photon cross sections, electron-electron bremsstrahlung production, and K x-ray yield. The modified MCNP code is benchmarked to electron backscatter factors, x-ray spectra production, and primary and scatter photon transport.
The General Concept of Benchmarking and Its Application in Higher Education in Europe

ERIC Educational Resources Information Center

Nazarko, Joanicjusz; Kuzmicz, Katarzyna Anna; Szubzda-Prutis, Elzbieta; Urban, Joanna

2009-01-01

The purposes of this paper are twofold: a presentation of the theoretical basis of benchmarking and a discussion on practical benchmarking applications. Benchmarking is also analyzed as a productivity accelerator. The authors study benchmarking usage in the private and public sectors with due consideration of the specificities of the two areas.…
Benchmarking neuromorphic vision: lessons learnt from computer vision

PubMed Central

Tan, Cheston; Lallee, Stephane; Orchard, Garrick

2015-01-01

Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision. PMID:26528120
Benchmarking FEniCS for mantle convection simulations

NASA Astrophysics Data System (ADS)

Vynnytska, L.; Rognes, M. E.; Clark, S. R.

2013-01-01

This paper evaluates the usability of the FEniCS Project for mantle convection simulations by numerical comparison to three established benchmarks. The benchmark problems all concern convection processes in an incompressible fluid induced by temperature or composition variations, and cover three cases: (i) steady-state convection with depth- and temperature-dependent viscosity, (ii) time-dependent convection with constant viscosity and internal heating, and (iii) a Rayleigh-Taylor instability. These problems are modeled by the Stokes equations for the fluid and advection-diffusion equations for the temperature and composition. The FEniCS Project provides a novel platform for the automated solution of differential equations by finite element methods. In particular, it offers a significant flexibility with regard to modeling and numerical discretization choices; we have here used a discontinuous Galerkin method for the numerical solution of the advection-diffusion equations. Our numerical results are in agreement with the benchmarks, and demonstrate the applicability of both the discontinuous Galerkin method and FEniCS for such applications.
A One-group, One-dimensional Transport Benchmark in Cylindrical Geometry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barry Ganapol; Abderrafi M. Ougouag

A 1-D, 1-group computational benchmark in cylindrical geometry is described. This neutron transport benchmark is useful for evaluating reactor concepts that possess azimuthal symmetry such as a pebble-bed reactor.
A novel hybrid meta-heuristic technique applied to the well-known benchmark optimization problems

NASA Astrophysics Data System (ADS)

Abtahi, Amir-Reza; Bijari, Afsane

2017-03-01

In this paper, a hybrid meta-heuristic algorithm, based on imperialistic competition algorithm (ICA), harmony search (HS), and simulated annealing (SA) is presented. The body of the proposed hybrid algorithm is based on ICA. The proposed hybrid algorithm inherits the advantages of the process of harmony creation in the HS algorithm to improve the exploitation phase of the ICA algorithm. In addition, the proposed hybrid algorithm uses SA to strike a balance between the exploration and exploitation phases. The proposed hybrid algorithm is compared with several meta-heuristic methods, including genetic algorithm (GA), HS, and ICA on several well-known benchmark instances. The comprehensive experiments and statistical analysis on standard benchmark functions certify the superiority of the proposed method over the other algorithms. The proposed hybrid algorithm shows promising efficacy and can be used in several real-life engineering and management problems.
Key performance indicators to benchmark hospital information systems - a delphi study.

PubMed

Hübner-Bloder, G; Ammenwerth, E

2009-01-01

To identify the key performance indicators for hospital information systems (HIS) that can be used for HIS benchmarking. A Delphi survey with one qualitative and two quantitative rounds. Forty-four HIS experts from health care IT practice and academia participated in all three rounds. Seventy-seven performance indicators were identified and organized into eight categories: technical quality, software quality, architecture and interface quality, IT vendor quality, IT support and IT department quality, workflow support quality, IT outcome quality, and IT costs. The highest ranked indicators are related to clinical workflow support and user satisfaction. Isolated technical indicators or cost indicators were not seen as useful. The experts favored an interdisciplinary group of all the stakeholders, led by hospital management, to conduct the HIS benchmarking. They proposed benchmarking activities both at regular (annual) intervals and at defined events (for example after IT introduction). Most of the experts stated that in their institutions no HIS benchmarking activities are being performed at the moment. In the context of IT governance, IT benchmarking is gaining importance in the healthcare area. The identified indicators reflect the view of health care IT professionals and researchers. Research is needed to further validate and operationalize key performance indicators, to provide an IT benchmarking framework, and to provide open repositories for a comparison of the HIS benchmarks of different hospitals.

Information-Theoretic Benchmarking of Land Surface Models

NASA Astrophysics Data System (ADS)

Nearing, Grey; Mocko, David; Kumar, Sujay; Peters-Lidard, Christa; Xia, Youlong

2016-04-01

Benchmarking is a type of model evaluation that compares model performance against a baseline metric that is derived, typically, from a different existing model. Statistical benchmarking was used to qualitatively show that land surface models do not fully utilize information in boundary conditions [1] several years before Gong et al. [2] discovered the particular type of benchmark that makes it possible to *quantify* the amount of information lost by an incorrect or imperfect model structure. This theoretical development laid the foundation for a formal theory of model benchmarking [3]. We here extend that theory to separate uncertainty contributions from the three major components of dynamical systems models [4]: model structures, model parameters, and boundary conditions (which describe time-dependent details of each prediction scenario). The key to this new development is the use of large-sample [5] data sets that span multiple soil types, climates, and biomes, which allows us to segregate uncertainty due to parameters from the two other sources. The benefit of this approach for uncertainty quantification and segregation is that it does not rely on Bayesian priors (although it is strictly coherent with Bayes' theorem and with probability theory), and therefore the partitioning of uncertainty into different components is *not* dependent on any a priori assumptions. We apply this methodology to assess the information use efficiency of the four land surface models that comprise the North American Land Data Assimilation System (Noah, Mosaic, SAC-SMA, and VIC). Specifically, we looked at the ability of these models to estimate soil moisture and latent heat fluxes. We found that in the case of soil moisture, about 25% of net information loss was from boundary conditions, around 45% was from model parameters, and 30-40% was from the model structures. In the case of latent heat flux, boundary conditions contributed about 50% of net uncertainty, and model structures contributed
A proposed benchmark problem for cargo nuclear threat monitoring

NASA Astrophysics Data System (ADS)

Wesley Holmes, Thomas; Calderon, Adan; Peeples, Cody R.; Gardner, Robin P.

2011-10-01

There is currently a great deal of technical and political effort focused on reducing the risk of potential attacks on the United States involving radiological dispersal devices or nuclear weapons. This paper proposes a benchmark problem for gamma-ray and X-ray cargo monitoring with results calculated using MCNP5, v1.51. The primary goal is to provide a benchmark problem that will allow researchers in this area to evaluate Monte Carlo models for both speed and accuracy in both forward and inverse calculational codes and approaches for nuclear security applications. A previous benchmark problem was developed by one of the authors (RPG) for two similar oil well logging problems (Gardner and Verghese, 1991, [1]). One of those benchmarks has recently been used by at least two researchers in the nuclear threat area to evaluate the speed and accuracy of Monte Carlo codes combined with variance reduction techniques. This apparent need has prompted us to design this benchmark problem specifically for the nuclear threat researcher. This benchmark consists of conceptual design and preliminary calculational results using gamma-ray interactions on a system containing three thicknesses of three different shielding materials. A point source is placed inside the three materials: lead, aluminum, and plywood. The first two materials are in right circular cylindrical form while the third is a cube. The entire system rests on a sufficiently thick lead base so as to reduce undesired scattering events. The configuration was arranged in such a manner that as a gamma ray moves from the source outward it first passes through the lead circular cylinder, then the aluminum circular cylinder, and finally the wooden cube before reaching the detector. A 2 in.×4 in.×16 in. box style NaI (Tl) detector was placed 1 m from the point source located in the center with the 4 in.×16 in. side facing the system. The two sources used in the benchmark are 137Cs and 235U.
Ground truth and benchmarks for performance evaluation

NASA Astrophysics Data System (ADS)

Takeuchi, Ayako; Shneier, Michael; Hong, Tsai Hong; Chang, Tommy; Scrapper, Christopher; Cheok, Geraldine S.

2003-09-01

Progress in algorithm development and transfer of results to practical applications such as military robotics requires the setup of standard tasks and of standard qualitative and quantitative measurements for performance evaluation and validation. Although the evaluation and validation of algorithms have been discussed for over a decade, the research community still faces a lack of well-defined and standardized methodology. The range of fundamental problems includes a lack of quantifiable measures of performance, a lack of data from state-of-the-art sensors in calibrated real-world environments, and a lack of facilities for conducting realistic experiments. In this research, we propose three methods for creating ground truth databases and benchmarks using multiple sensors. The databases and benchmarks will provide researchers with high quality data from suites of sensors operating in complex environments representing real problems of great relevance to the development of autonomous driving systems. At NIST, we have prototyped a High Mobility Multi-purpose Wheeled Vehicle (HMMWV) system with a suite of sensors including a Riegl ladar, GDRS ladar, stereo CCD, several color cameras, Global Position System (GPS), Inertial Navigation System (INS), pan/tilt encoders, and odometry. All sensors are calibrated with respect to each other in space and time. This allows a database of features and terrain elevation to be built. Ground truth for each sensor can then be extracted from the database. The main goal of this research is to provide ground truth databases for researchers and engineers to evaluate algorithms for effectiveness, efficiency, reliability, and robustness, thus advancing the development of algorithms.
Application of the RNS3D Code to a Circular-Rectangular Transition Duct With and Without Inlet Swirl and Comparison with Experiments

NASA Technical Reports Server (NTRS)

Cavicchi, Richard H.

1999-01-01

Circular-rectangular transition ducts are used between engine exhausts and nozzles with rectangular cross sections that are designed for high performance aircraft. NASA Glenn Research Center has made experimental investigations of a series of circular-rectangular transition ducts to provide benchmark flow data for comparison with numerical calculations. These ducts are all designed with superellipse cross sections to facilitate grid generation. In response to this challenge, the three-dimensional RNS3D code has been applied to one of these transition ducts. This particular duct has a length-to-inlet diameter ratio of 1.5 and an exit-plane aspect ratio of 3.0. The inlet Mach number is 0.35. Two GRC experiments and the code were run for this duct without inlet swirl. One GRC experiment and the code were also run with inlet swirl. With no inlet swirl the code was successful in predicting pressures and secondary flow conditions, including a pair of counter-rotating vortices at both sidewalls of the exit plane. All these phenomena have been reported from the two GRC experiments. However, these vortices were suppressed in the one experiment when inlet swirl was used; whereas the RNS3D code still predicted them. The experiment was unable to provide data near the sidewalls, the very region where the vortices were predicted.
15 CFR 801.12 - Rules and regulations for the BE-140, Benchmark Survey of Insurance Transactions by U.S...

Code of Federal Regulations, 2010 CFR

2010-01-01

... insurance company. Part 3 requests information needed to determine whether a report is required, the types..., Benchmark Survey of Insurance Transactions by U.S. Insurance Companies with Foreign Persons. 801.12 Section... Transactions by U.S. Insurance Companies with Foreign Persons. (a) The BE-140, Benchmark Survey of Insurance...
Technical Report: Installed Cost Benchmarks and Deployment Barriers for

Science.gov Websites

Cost Benchmarks and Deployment Barriers for Residential Solar Photovoltaics with Energy Storage Q1 2016 Installed Cost Benchmarks and Deployment Barriers for Residential Solar with Energy Storage Researchers from NREL published a report that provides detailed component and system-level cost breakdowns for
What Are the ACT College Readiness Benchmarks? Information Brief

ERIC Educational Resources Information Center

ACT, Inc., 2013

2013-01-01

The ACT College Readiness Benchmarks are the minimum ACT® college readiness assessment scores required for students to have a high probability of success in credit-bearing college courses--English Composition, social sciences courses, College Algebra, or Biology. This report identifies the College Readiness Benchmarks on the ACT Compass scale…
Apples to Oranges: Benchmarking Vocational Education and Training Programmes

ERIC Educational Resources Information Center

Bogetoft, Peter; Wittrup, Jesper

2017-01-01

This paper discusses methods for benchmarking vocational education and training colleges and presents results from a number of models. It is conceptually difficult to benchmark vocational colleges. The colleges typically offer a wide range of course programmes, and the students come from different socioeconomic backgrounds. We solve the…
Implementation and validation of a conceptual benchmarking framework for patient blood management.

PubMed

Kastner, Peter; Breznik, Nada; Gombotz, Hans; Hofmann, Axel; Schreier, Günter

2015-01-01

Public health authorities and healthcare professionals are obliged to ensure high quality health service. Because of the high variability of the utilisation of blood and blood components, benchmarking is indicated in transfusion medicine. Implementation and validation of a benchmarking framework for Patient Blood Management (PBM) based on the report from the second Austrian Benchmark trial. Core modules for automatic report generation have been implemented with KNIME (Konstanz Information Miner) and validated by comparing the output with the results of the second Austrian benchmark trial. Delta analysis shows a deviation <0.1% for 95% (max. 1.4%). The framework provides a reliable tool for PBM benchmarking. The next step is technical integration with hospital information systems.
Preliminary Results for the OECD/NEA Time Dependent Benchmark using Rattlesnake, Rattlesnake-IQS and TDKENO

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeHart, Mark D.; Mausolff, Zander; Weems, Zach

2016-08-01

One goal of the MAMMOTH M&S project is to validate the analysis capabilities within MAMMOTH. Historical data has shown limited value for validation of full three-dimensional (3D) multi-physics methods. Initial analysis considered the TREAT startup minimum critical core and one of the startup transient tests. At present, validation is focusing on measurements taken during the M8CAL test calibration series. These exercises will be valuable in preliminary assessment of the ability of MAMMOTH to perform coupled multi-physics calculations; calculations performed to date are being used to validate the neutron transport solver Rattlesnake and the fuels performance code BISON. Other validation projects outside of TREAT are available for single-physics benchmarking. Because the transient solution capability of Rattlesnake is one of the key attributes that makes it unique for TREAT transient simulations, validation of the transient solution of Rattlesnake using other time dependent kinetics benchmarks has considerable value. The Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) has recently developed a computational benchmark for transient simulations. This benchmark considered both two-dimensional (2D) and 3D configurations for a total number of 26 different transients. All are negative reactivity insertions, typically returning to the critical state after some time.
Benchmarking multimedia performance

NASA Astrophysics Data System (ADS)

Zandi, Ahmad; Sudharsanan, Subramania I.

1998-03-01

With the introduction of faster processors and special instruction sets tailored to multimedia, a number of exciting applications are now feasible on the desktop. Among these is DVD playback consisting, among other things, of MPEG-2 video and Dolby digital audio or MPEG-2 audio. Other multimedia applications such as video conferencing and speech recognition are also becoming popular on computer systems. In view of this tremendous interest in multimedia, a group of major computer companies have formed the Multimedia Benchmarks Committee as part of the Standard Performance Evaluation Corp. to address the performance issues of multimedia applications. The approach is multi-tiered, with three tiers of fidelity from minimal to fully compliant. In each case the fidelity of the bitstream reconstruction as well as the quality of the video or audio output are measured and the system is classified accordingly. At the next step the performance of the system is measured. In many multimedia applications, such as DVD playback, the application needs to be run at a specific rate. In this case the measurement of the excess processing power makes all the difference. All these make a system level, application based, multimedia benchmark very challenging. Several ideas and methodologies for each aspect of the problems will be presented and analyzed.
Benchmarks Momentum on Increase

ERIC Educational Resources Information Center

McNeil, Michele

2008-01-01

No longer content with the patchwork quilt of assessments used to measure states' K-12 performance, top policy groups are pushing states toward international benchmarking as a way to better prepare students for a competitive global economy. The National Governors Association, the Council of Chief State School Officers, and the standards-advocacy…
Benchmarks: WICHE Region 2012

ERIC Educational Resources Information Center

Western Interstate Commission for Higher Education, 2013

2013-01-01

Benchmarks: WICHE Region 2012 presents information on the West's progress in improving access to, success in, and financing of higher education. The information is updated annually to monitor change over time and encourage its use as a tool for informed discussion in policy and education communities. To establish a general context for the…
A review on the benchmarking concept in Malaysian construction safety performance

NASA Astrophysics Data System (ADS)

Ishak, Nurfadzillah; Azizan, Muhammad Azizi

2018-02-01

The construction industry is one of the major industries propelling Malaysia's economy and contributes substantially to the nation's GDP growth, yet the high fatality rates on construction sites have caused concern among safety practitioners and stakeholders. Hence, there is a need for benchmarking the performance of Malaysia's construction industry, especially in terms of safety. Benchmarking can create fertile ground for ideas, but only in a receptive environment; organizations that share good practices and compare their safety performance against others benefit most in improving their safety culture. This research was conducted to study awareness of the importance of benchmarking, evaluate current practice and improvement, and identify the constraints on implementing benchmarking of safety performance in the industry. Additionally, interviews with construction professionals yielded different views on this concept. A comparison was made to show the different understandings of the benchmarking approach and of how safety performance can be benchmarked. Nevertheless, benchmarking is viewed as having one mission: to evaluate objectives, identified through benchmarking, that will improve the organization's safety performance. Finally, the expected result of this research is to help Malaysia's construction industry implement best practice in safety performance management through the concept of benchmarking.
Evaluation of the Pool Critical Assembly Benchmark with Explicitly-Modeled Geometry using MCNP6

DOE PAGES

Kulesza, Joel A.; Martz, Roger Lee

2017-03-01

Despite being one of the most widely used benchmarks for qualifying light water reactor (LWR) radiation transport methods and data, no benchmark calculation of the Oak Ridge National Laboratory (ORNL) Pool Critical Assembly (PCA) pressure vessel wall benchmark facility (PVWBF) using MCNP6 with explicitly modeled core geometry exists. As such, this paper provides results for such an analysis. First, a criticality calculation is used to construct the fixed source term. Next, ADVANTG-generated variance reduction parameters are used within the final MCNP6 fixed source calculations. These calculations provide unadjusted dosimetry results using three sets of dosimetry reaction cross sections of varying ages (those packaged with MCNP6, from the IRDF-2002 multi-group library, and from the ACE-formatted IRDFF v1.05 library). These results are then compared to two different sets of measured reaction rates. The comparison agrees in an overall sense within 2% and on a specific reaction- and dosimetry location-basis within 5%. Except for the neptunium dosimetry, the individual foil raw calculation-to-experiment comparisons usually agree within 10% but are typically greater than unity. Finally, in the course of developing these calculations, geometry that has previously not been completely specified is provided herein for the convenience of future analysts.
Benchmarking methods and data sets for ligand enrichment assessment in virtual screening.

PubMed

Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

2015-01-01

Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in the prospective (i.e. real-world) efforts. However, the intrinsic differences of benchmarking sets to the real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods as well as data sets and highlight three main types of biases found in benchmarking sets, i.e. "analogue bias", "artificial enrichment" and "false negative". In addition, we introduce our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations to three important human histone deacetylases (HDACs) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The leave-one-out cross-validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased as measured by property matching, ROC curves and AUCs.
Benchmarking Methods and Data Sets for Ligand Enrichment Assessment in Virtual Screening

PubMed Central

Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

2014-01-01

Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in the prospective (i.e. real-world) efforts. However, the intrinsic differences of benchmarking sets to the real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods as well as data sets and highlight three main types of biases found in benchmarking sets, i.e. “analogue bias”, “artificial enrichment” and “false negative”. In addition, we introduced our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations to three important human histone deacetylase (HDAC) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The Leave-One-Out Cross-Validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased in terms of property matching, ROC curves and AUCs. PMID:25481478
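Since the abstract evaluates benchmarking sets by ROC curves and AUCs, a small worked example of the AUC may help. This is a minimal sketch under stated assumptions: the scores are made-up illustrative numbers, not data from the paper, and `roc_auc` is a hypothetical helper that uses the pairwise (Mann-Whitney) definition of the area under the ROC curve.

```python
def roc_auc(active_scores, decoy_scores):
    """Area under the ROC curve via pairwise comparison: the fraction of
    (active, decoy) pairs in which the active is scored higher.
    Ties count as half a win."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical screening scores (higher = predicted more likely active).
actives = [0.9, 0.8, 0.4]
decoys = [0.7, 0.3, 0.2, 0.1]

auc = roc_auc(actives, decoys)  # 11 of 12 pairs rank the active first
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys; a biased benchmarking set can inflate this value without any real gain in screening performance, which is the concern the abstract raises.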
Requirements for benchmarking personal image retrieval systems

NASA Astrophysics Data System (ADS)

Bouguet, Jean-Yves; Dulong, Carole; Kozintsev, Igor; Wu, Yi

2006-01-01

It is now common to have accumulated tens of thousands of personal pictures. Efficient access to that many pictures can only be achieved with a robust image retrieval system. This application is of high interest to Intel processor architects. It is highly compute intensive, and could motivate end users to upgrade their personal computers to the next generations of processors. A key question is how to assess the robustness of a personal image retrieval system. Personal image databases are very different from the digital libraries that have been used by many Content Based Image Retrieval Systems [1]. For example, a personal image database has a lot of pictures of people, but a small set of different people, typically family, relatives, and friends. Pictures are taken in a limited set of places like home, work, school, and vacation destinations. The most frequent queries are searches for people and for places. These attributes, and many others, affect how a personal image retrieval system should be benchmarked, and benchmarks need to be different from existing ones based on art images or medical images, for example. The attributes of the data set do not change the list of components needed for the benchmarking of such systems as specified in [2]: data sets, query tasks, ground truth, evaluation measures, and benchmarking events. This paper proposes a way to build these components to be representative of personal image databases and of the corresponding usage models.
A New Performance Improvement Model: Adding Benchmarking to the Analysis of Performance Indicator Data.

PubMed

Al-Kuwaiti, Ahmed; Homa, Karen; Maruthamuthu, Thennarasu

2016-01-01

A performance improvement model was developed that focuses on the analysis and interpretation of performance indicator (PI) data using statistical process control and benchmarking. PIs are suitable for comparison with benchmarks only if the data fall within the statistically accepted limit, that is, show only random variation. Specifically, if there is no significant special-cause variation over a period of time, then the data are ready to be benchmarked. The proposed Define, Measure, Control, Internal Threshold, and Benchmark model is adapted from the Define, Measure, Analyze, Improve, Control (DMAIC) model. The model consists of the following five steps: Step 1. Define the process; Step 2. Monitor and measure the variation over the period of time; Step 3. Check the variation of the process; if stable (no significant variation), go to Step 4; otherwise, control variation with the help of an action plan; Step 4. Develop an internal threshold and compare the process with it; Step 5.1. Compare the process with an internal benchmark; and Step 5.2. Compare the process with an external benchmark. The steps are illustrated through the use of health care-associated infection (HAI) data collected for 2013 and 2014 from the Infection Control Unit, King Fahd Hospital, University of Dammam, Saudi Arabia. Monitoring variation is an important strategy in understanding and learning about a process. In the example, HAI was monitored for variation in 2013, and the need to have a more predictable process prompted the need to control variation by an action plan. The action plan was successful, as noted by the shift in the 2014 data, compared to the historical average, and, in addition, the variation was reduced. The model is subject to limitations: For example, it cannot be used without benchmarks, which need to be calculated the same way with similar patient populations, and it focuses only on the "Analyze" part of the DMAIC model.
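The stability check in Step 3, followed by the benchmark comparison in Step 5, can be sketched as below. This is an illustration only, not the authors' procedure: it assumes an individuals (I-MR) control chart with three-sigma limits estimated from the average moving range, and the monthly rates, the benchmark value, and all function names are hypothetical.

```python
from statistics import mean


def control_limits(values):
    """Three-sigma limits for an individuals chart. Sigma is estimated
    as the mean moving range divided by 1.128 (the d2 constant for
    subgroups of size 2)."""
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma


def is_stable(values):
    """True if every point lies inside the control limits, i.e. no
    special-cause signal by this (single, simplified) rule."""
    lcl, _, ucl = control_limits(values)
    return all(lcl <= v <= ucl for v in values)


def compare_with_benchmark(values, benchmark):
    """Benchmark only a stable process; otherwise variation must be
    controlled first. Lower rates are assumed to be better."""
    if not is_stable(values):
        return "control variation first"
    return "meets benchmark" if mean(values) <= benchmark else "above benchmark"


# Hypothetical monthly infection rates (per 1000 patient-days).
stable_rates = [2.0, 2.2, 1.9, 2.1, 2.0, 2.1]
spiky_rates = [2.0, 2.2, 1.9, 2.1, 2.0, 6.0]

result_stable = compare_with_benchmark(stable_rates, benchmark=2.5)
result_spiky = compare_with_benchmark(spiky_rates, benchmark=2.5)
```

A production control chart would normally apply additional run rules (for example, several consecutive points on one side of the center line); only the simplest out-of-limits rule is shown here.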
Toxicological benchmarks for screening potential contaminants of concern for effects on aquatic biota: 1996 revision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Suter, G.W. II; Tsao, C.L.

1996-06-01

This report presents potential screening benchmarks for protection of aquatic life from contaminants in water. Because there is no guidance for screening benchmarks, a set of alternative benchmarks is presented herein. This report presents the alternative benchmarks for chemicals that have been detected on the Oak Ridge Reservation. It also presents the data used to calculate the benchmarks and the sources of the data. It compares the benchmarks and discusses their relative conservatism and utility. Also included are updates of benchmark values where appropriate, new benchmark values, replacement of secondary sources by primary sources, and a more complete documentation of the sources and derivation of all values.

A Benchmarking Initiative for Reactive Transport Modeling Applied to Subsurface Environmental Applications

NASA Astrophysics Data System (ADS)

Steefel, C. I.

2015-12-01

Over the last 20 years, we have seen the evolution of multicomponent reactive transport modeling and the expanding range and increasing complexity of subsurface environmental applications it is being used to address. Reactive transport modeling is being asked to provide accurate assessments of engineering performance and risk for important issues with far-reaching consequences. As a result, the complexity and detail of subsurface processes, properties, and conditions that can be simulated have significantly expanded. Closed form solutions are necessary and useful, but limited to situations that are far simpler than typical applications that combine many physical and chemical processes, in many cases in coupled form. In the absence of closed form and yet realistic solutions for complex applications, numerical benchmark problems with an accepted set of results will be indispensable to qualifying codes for various environmental applications. The intent of this benchmarking exercise, now underway for more than five years, is to develop and publish a set of well-described benchmark problems that can be used to demonstrate simulator conformance with norms established by the subsurface science and engineering community. The objective is not to verify this or that specific code (the reactive transport codes play a supporting role in this regard) but rather to use the codes to verify that a common solution of the problem can be achieved. Thus, the objective of each of the manuscripts is to present an environmentally-relevant benchmark problem that tests the conceptual model capabilities, numerical implementation, process coupling, and accuracy. The benchmark problems developed to date include 1) microbially-mediated reactions, 2) isotopes, 3) multi-component diffusion, 4) uranium fate and transport, 5) metal mobility in mining affected systems, and 6) waste repositories and related aspects.
Implementation of the NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan (Technical Monitor)

2002-01-01

Several features make Java an attractive choice for High Performance Computing (HPC). In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for CFD applications.
GROWTH OF THE INTERNATIONAL CRITICALITY SAFETY AND REACTOR PHYSICS EXPERIMENT EVALUATION PROJECTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. Blair Briggs; John D. Bess; Jim Gulliford

2011-09-01

Since the International Conference on Nuclear Criticality Safety (ICNC) 2007, the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and the International Reactor Physics Experiment Evaluation Project (IRPhEP) have continued to expand their efforts and broaden their scope. Eighteen countries participated in the ICSBEP in 2007. Now, there are 20, with recent contributions from Sweden and Argentina. The IRPhEP has also expanded from eight contributing countries in 2007 to 16 in 2011. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Criticality Safety Benchmark Experiments' [1] have increased from 442 evaluations (38000 pages), containing benchmark specifications for 3955 critical or subcritical configurations, to 516 evaluations (nearly 55000 pages), containing benchmark specifications for 4405 critical or subcritical configurations, in the 2010 Edition of the ICSBEP Handbook. The contents of the Handbook have also increased from 21 to 24 criticality-alarm-placement/shielding configurations with multiple dose points for each, and from 20 to 200 configurations categorized as fundamental physics measurements relevant to criticality safety applications. Approximately 25 new evaluations and 150 additional configurations are expected to be added to the 2011 edition of the Handbook. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Reactor Physics Benchmark Experiments' [2] have increased from 16 different experimental series that were performed at 12 different reactor facilities to 53 experimental series that were performed at 30 different reactor facilities in the 2011 edition of the Handbook. Considerable effort has also been made to improve the functionality of the searchable database, DICE (Database for the International Criticality Benchmark Evaluation Project) and verify the accuracy of the data contained therein. DICE will be discussed in separate papers at ICNC 2011. The status of the ICSBEP and
Benchmark Problems of the Geothermal Technologies Office Code Comparison Study

DOE Office of Scientific and Technical Information (OSTI.GOV)

White, Mark D.; Podgorney, Robert; Kelkar, Sharad M.

problems involved two phases of research, stimulation, development, and circulation in two separate reservoirs. The challenge problems had specific questions to be answered via numerical simulation in three topical areas: 1) reservoir creation/stimulation, 2) reactive and passive transport, and 3) thermal recovery. Whereas the benchmark class of problems were designed to test capabilities for modeling coupled processes under strictly specified conditions, the stated objective for the challenge class of problems was to demonstrate what new understanding of the Fenton Hill experiments could be realized via the application of modern numerical simulation tools by recognized expert practitioners.
Building Bridges Between Geoscience and Data Science through Benchmark Data Sets

NASA Astrophysics Data System (ADS)

Thompson, D. R.; Ebert-Uphoff, I.; Demir, I.; Gel, Y.; Hill, M. C.; Karpatne, A.; Güereque, M.; Kumar, V.; Cabral, E.; Smyth, P.

2017-12-01

The changing nature of observational field data demands richer and more meaningful collaboration between data scientists and geoscientists. Thus, among other efforts, the Working Group on Case Studies of the NSF-funded RCN on Intelligent Systems Research To Support Geosciences (IS-GEO) is developing a framework to strengthen such collaborations through the creation of benchmark datasets. Benchmark datasets provide an interface between disciplines without requiring extensive background knowledge. The goals are to create (1) a means for two-way communication between geoscience and data science researchers; (2) new collaborations, which may lead to new approaches for data analysis in the geosciences; and (3) a public, permanent repository of complex data sets, representative of geoscience problems, useful to coordinate efforts in research and education. The group identified 10 key elements and characteristics for ideal benchmarks. High impact: A problem with high potential impact. Active research area: A group of geoscientists should be eager to continue working on the topic. Challenge: The problem should be challenging for data scientists. Data science generality and versatility: It should stimulate development of new general and versatile data science methods. Rich information content: Ideally the data set provides stimulus for analysis at many different levels. Hierarchical problem statement: A hierarchy of suggested analysis tasks, from relatively straightforward to open-ended tasks. Means for evaluating success: Data scientists and geoscientists need means to evaluate whether the algorithms are successful and achieve intended purpose. Quick start guide: Introduction for data scientists on how to easily read the data to enable rapid initial data exploration. Geoscience context: Summary for data scientists of the specific data collection process, instruments used, any pre-processing and the science questions to be answered. Citability: A suitable identifier to
Issues in benchmarking human reliability analysis methods : a literature review.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lois, Erasmia; Forester, John Alan; Tran, Tuan Q.

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessment (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study is currently underway that compares HRA methods with each other and against operator performance in simulator studies. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.
Engine Benchmarking - Final CRADA Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wallner, Thomas

Detailed benchmarking of the powertrains of three light-duty vehicles was performed. Results were presented and provided to CRADA partners. The vehicles included a MY2011 Audi A4, a MY2012 Mini Cooper and a MY2014 Nissan Versa.
[Benchmarking of university trauma centers in Germany. Research and teaching].

PubMed

Gebhard, F; Raschke, M; Ruchholtz, S; Meffert, R; Marzi, I; Pohlemann, T; Südkamp, N; Josten, C; Zwipp, H

2011-07-01

Benchmarking is a very popular business process and is now used in research as well. The aim of the present study is to elucidate key figures of German university trauma departments regarding research and teaching. The data set is based upon the monthly reports given by the administration in each university. The study shows that only well-known parameters such as fund-raising and impact factors can be used to benchmark university-based trauma centers. The German federal system does not allow nationwide benchmarking.
Benchmarking for maximum value.

PubMed

Baldwin, Ed

2009-03-01

Speaking at the most recent Healthcare Estates conference, Ed Baldwin, of international built asset consultancy EC Harris LLP, examined the role of benchmarking and market-testing--two of the key methods used to evaluate the quality and cost-effectiveness of hard and soft FM services provided under PFI healthcare schemes to ensure they are offering maximum value for money.
MARC calculations for the second WIPP structural benchmark problem

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morgan, H.S.

1981-05-01

This report describes calculations made with the MARC structural finite element code for the second WIPP structural benchmark problem. Specific aspects of problem implementation such as element choice, slip line modeling, creep law implementation, and thermal-mechanical coupling are discussed in detail. Also included are the computational results specified in the benchmark problem formulation.
Local implementation of the Essence of Care benchmarks.

PubMed

Jones, Sue

This study aimed to understand clinical practice benchmarking from the perspective of nurses working in a large acute NHS trust, and to determine whether the nurses perceived that their commitment to Essence of Care led to improvements in care, which factors influenced their role in the process, and which organisational factors influenced benchmarking. An ethnographic case study approach was adopted. Six themes emerged from the data. Two organisational issues emerged: leadership and the values and/or culture of the organisation. The findings suggested that the leadership ability of the Essence of Care link nurses and the value placed on this work by the organisation were key to the success of benchmarking. A model for successful implementation of the Essence of Care is proposed based on the findings of this study, which lends itself to testing by other organisations.
Analytical three-dimensional neutron transport benchmarks for verification of nuclear engineering codes. Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ganapol, B.D.; Kornreich, D.E.

Because of the requirement of accountability and quality control in the scientific world, a demand for high-quality analytical benchmark calculations has arisen in the neutron transport community. The intent of these benchmarks is to provide a numerical standard to which production neutron transport codes may be compared in order to verify proper operation. The overall investigation as modified in the second year renewal application includes the following three primary tasks. Task 1 on two-dimensional neutron transport is divided into (a) single medium searchlight problem (SLP) and (b) two-adjacent half-space SLP. Task 2 on three-dimensional neutron transport covers (a) point source in arbitrary geometry, (b) single medium SLP, and (c) two-adjacent half-space SLP. Task 3 on code verification includes deterministic and probabilistic codes. The primary aim of the proposed investigation was to provide a suite of comprehensive two- and three-dimensional analytical benchmarks for neutron transport theory applications. This objective has been achieved. The suite of benchmarks in infinite media and the three-dimensional SLP are a relatively comprehensive set of one-group benchmarks for isotropically scattering media. Because of time and resource limitations, the extensions of the benchmarks to include multi-group and anisotropic scattering are not included here. Presently, however, enormous advances in the solution for the planar Green's function in an anisotropically scattering medium have been made and will eventually be implemented in the two- and three-dimensional solutions considered under this grant. Of particular note in this work are the numerical results for the three-dimensional SLP, which have never before been presented. The results presented were made possible only because of the tremendous advances in computing power that have occurred during the past decade.
Benchmarking for On-Scalp MEG Sensors.

PubMed

Xie, Minshu; Schneiderman, Justin F; Chukharkin, Maxim L; Kalabukhov, Alexei; Riaz, Bushra; Lundqvist, Daniel; Whitmarsh, Stephen; Hamalainen, Matti; Jousmaki, Veikko; Oostenveld, Robert; Winkler, Dag

2017-06-01

We present a benchmarking protocol for quantitatively comparing emerging on-scalp magnetoencephalography (MEG) sensor technologies to their counterparts in state-of-the-art MEG systems. As a means of validation, we compare a high-critical-temperature superconducting quantum interference device (high-Tc SQUID) with the low-Tc SQUIDs of an Elekta Neuromag TRIUX system in MEG recordings of auditory and somatosensory evoked fields (SEFs) on one human subject. We measure the expected signal gain for the auditory-evoked fields (deeper sources) and notice some unfamiliar features in the on-scalp sensor-based recordings of SEFs (shallower sources). The experimental results serve as a proof of principle for the benchmarking protocol. This approach is straightforward, general to various on-scalp MEG sensors, and convenient to use on human subjects. The unexpected features in the SEFs suggest on-scalp MEG sensors may reveal information about neuromagnetic sources that is otherwise difficult to extract from state-of-the-art MEG recordings. Because this is the first systematically established on-scalp MEG benchmarking protocol, magnetic sensor developers can employ it to prove the utility of their technology in MEG recordings. Further exploration of the SEFs with on-scalp MEG sensors may reveal unique information about their sources.
OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE - A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arnis Judzis

2003-01-01

This document details the progress to date on the "OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE -- A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING" contract for the quarter starting October 2002 through December 2002. Even though we are awaiting the optimization portion of the testing program, accomplishments included the following: (1) Smith International participated in the DOE Mud Hammer program through full scale benchmarking testing during the week of 4 November 2003. (2) TerraTek acknowledges Smith International, BP America, PDVSA, and ConocoPhillips for cost-sharing the Smith benchmarking tests allowing extension of the contract to add to the benchmarking testing program. (3) Following the benchmark testing of the Smith International hammer, representatives from DOE/NETL, TerraTek, Smith International and PDVSA met at TerraTek in Salt Lake City to review observations, performance and views on the optimization step for 2003. (4) The December 2002 issue of Journal of Petroleum Technology (Society of Petroleum Engineers) highlighted the DOE fluid hammer testing program and reviewed last year's paper on the benchmark performance of the SDS Digger and Novatek hammers. (5) TerraTek's Sid Green presented a technical review for DOE/NETL personnel in Morgantown on "Impact Rock Breakage" and its importance in improving fluid hammer performance. Much discussion has taken place on the issues surrounding mud hammer performance at depth conditions.
A note on bound constraints handling for the IEEE CEC'05 benchmark function suite.

PubMed

Liao, Tianjun; Molina, Daniel; de Oca, Marco A Montes; Stützle, Thomas

2014-01-01

The benchmark functions and some of the algorithms proposed for the special session on real parameter optimization of the 2005 IEEE Congress on Evolutionary Computation (CEC'05) have played and still play an important role in the assessment of the state of the art in continuous optimization. In this article, we show that if bound constraints are not enforced for the final reported solutions, state-of-the-art algorithms produce infeasible best candidate solutions for the majority of functions of the IEEE CEC'05 benchmark function suite. This occurs even though the optima of the CEC'05 functions are within the specified bounds. This phenomenon has important implications on algorithm comparisons, and therefore on algorithm designs. This article's goal is to draw the attention of the community to the fact that some authors might have drawn wrong conclusions from experiments using the CEC'05 problems.
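
The feasibility issue described above can be illustrated with a minimal Python sketch. It assumes a box-constrained search space; the bounds of [-100, 100], the plain sphere objective, and the candidate vector are illustrative stand-ins rather than actual CEC'05 definitions or results, and clipping is only one of several possible repair strategies.

```python
from typing import List, Sequence


def is_feasible(x: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> bool:
    """Return True if every coordinate lies inside its [lower, upper] interval."""
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))


def clip_to_bounds(x: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> List[float]:
    """Repair a solution by moving each out-of-range coordinate to the nearest bound."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def sphere(x: Sequence[float]) -> float:
    """Stand-in objective (plain sphere function), not a CEC'05 function."""
    return sum(xi * xi for xi in x)


# Hypothetical bounds and a hypothetical "best" solution reported by an optimizer.
lower = [-100.0, -100.0, -100.0]
upper = [100.0, 100.0, 100.0]
candidate = [12.5, -100.4, 99.9]  # second coordinate is slightly out of bounds

if not is_feasible(candidate, lower, upper):
    repaired = clip_to_bounds(candidate, lower, upper)
    # Report the objective value of the repaired (feasible) point, not the infeasible one.
    print("infeasible:", sphere(candidate), "repaired:", sphere(repaired))
```

Checking feasibility on the final reported solution, rather than only during the search, is the point of the sketch: two algorithms are only comparable if both report objective values at points inside the bounds.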
42 CFR 422.308 - Adjustments to capitation rates, benchmarks, bids, and payments.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 42 Public Health 3 2013-10-01 2013-10-01 false Adjustments to capitation rates, benchmarks, bids, and payments. 422.308 Section 422.308 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICARE PROGRAM (CONTINUED) MEDICARE ADVANTAGE PROGRAM...
42 CFR 422.308 - Adjustments to capitation rates, benchmarks, bids, and payments.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 42 Public Health 3 2011-10-01 2011-10-01 false Adjustments to capitation rates, benchmarks, bids, and payments. 422.308 Section 422.308 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICARE PROGRAM MEDICARE ADVANTAGE PROGRAM Payments to...
42 CFR 422.308 - Adjustments to capitation rates, benchmarks, bids, and payments.

Code of Federal Regulations, 2012 CFR

2012-10-01

... 42 Public Health 3 2012-10-01 2012-10-01 false Adjustments to capitation rates, benchmarks, bids, and payments. 422.308 Section 422.308 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICARE PROGRAM (CONTINUED) MEDICARE ADVANTAGE PROGRAM...
42 CFR 422.308 - Adjustments to capitation rates, benchmarks, bids, and payments.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 42 Public Health 3 2014-10-01 2014-10-01 false Adjustments to capitation rates, benchmarks, bids, and payments. 422.308 Section 422.308 Public Health CENTERS FOR MEDICARE & MEDICAID SERVICES, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICARE PROGRAM (CONTINUED) MEDICARE ADVANTAGE PROGRAM...
A new numerical benchmark for variably saturated variable-density flow and transport in porous media

NASA Astrophysics Data System (ADS)

Guevara, Carlos; Graf, Thomas

2016-04-01

In subsurface hydrological systems, spatial and temporal variations in solute concentration and/or temperature may affect fluid density and viscosity. These variations could lead to potentially unstable situations, in which a dense fluid overlies a less dense fluid. These situations could produce instabilities that appear as dense plume fingers migrating downwards counteracted by vertical upwards flow of freshwater (Simmons et al., Transp. Porous Media, 2002). As a result of unstable variable-density flow, solute transport rates are increased over large distances and times as compared to constant-density flow. The numerical simulation of variable-density flow in saturated and unsaturated media requires corresponding benchmark problems against which a computer model is validated (Diersch and Kolditz, Adv. Water Resour., 2002). Recorded data from a laboratory-scale experiment of variable-density flow and solute transport in saturated and unsaturated porous media (Simmons et al., Transp. Porous Media, 2002) are used to define a new numerical benchmark. The HydroGeoSphere code (Therrien et al., 2004) coupled with PEST (www.pesthomepage.org) is used to obtain an optimized parameter set capable of adequately representing the data set by Simmons et al. (2002). Fingering in the numerical model is triggered using random hydraulic conductivity fields. Due to the inherent randomness, a large number of simulations were conducted in this study. The optimized benchmark model adequately predicts the plume behavior and the fate of solutes. This benchmark is useful for model verification of variable-density flow problems in saturated and/or unsaturated media.

Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived from calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.
Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE PAGES

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.; ...

2016-03-07

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived from calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.
Benchmarking routine psychological services: a discussion of challenges and methods.

PubMed

Delgadillo, Jaime; McMillan, Dean; Leach, Chris; Lucock, Mike; Gilbody, Simon; Wood, Nick

2014-01-01

Policy developments in recent years have led to important changes in the level of access to evidence-based psychological treatments. Several methods have been used to investigate the effectiveness of these treatments in routine care, with different approaches to outcome definition and data analysis. This report presents a review of challenges and methods for the evaluation of evidence-based treatments delivered in routine mental healthcare, followed by a case example of a benchmarking method applied in primary care. High, average and poor performance benchmarks were calculated through a meta-analysis of published data from services working under the Improving Access to Psychological Therapies (IAPT) Programme in England. Pre-post treatment effect sizes (ES) and confidence intervals were estimated to illustrate a benchmarking method enabling services to evaluate routine clinical outcomes. High, average and poor performance ES for routine IAPT services were estimated to be 0.91, 0.73 and 0.46 for depression (using PHQ-9) and 1.02, 0.78 and 0.52 for anxiety (using GAD-7). Data from one specific IAPT service exemplify how to evaluate and contextualize routine clinical performance against these benchmarks. The main contribution of this report is to summarize key recommendations for the selection of an adequate set of psychometric measures, the operational definition of outcomes, and the statistical evaluation of clinical performance. A benchmarking method is also presented, which may enable a robust evaluation of clinical performance against national benchmarks. Limitations included significant heterogeneity among data sources, and wide variations in ES and data completeness.
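
As a rough illustration of comparing a service against the effect-size benchmarks quoted above, the following Python sketch may help. Only the six benchmark values come from the abstract. The effect-size formula (mean pre-post change divided by the pre-treatment standard deviation) is one common convention and is assumed here, not confirmed by the source; the banding rule and the PHQ-9 scores are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence

# Benchmark effect sizes quoted in the abstract (PHQ-9 for depression, GAD-7 for anxiety).
BENCHMARKS: Dict[str, Dict[str, float]] = {
    "depression": {"high": 0.91, "average": 0.73, "poor": 0.46},
    "anxiety": {"high": 1.02, "average": 0.78, "poor": 0.52},
}


def pre_post_effect_size(pre: Sequence[float], post: Sequence[float]) -> float:
    """Assumed convention: (mean pre - mean post) / sample SD of pre-treatment scores."""
    return (mean(pre) - mean(post)) / stdev(pre)


def compare_to_benchmarks(es: float, measure: str) -> str:
    """Hypothetical banding rule: name the highest benchmark the ES reaches."""
    bands = BENCHMARKS[measure]
    for label in ("high", "average", "poor"):
        if es >= bands[label]:
            return label
    return "below poor"


# Made-up PHQ-9 scores for five patients, before and after treatment.
phq9_pre = [15, 18, 12, 20, 16]
phq9_post = [9, 12, 8, 15, 10]

es = pre_post_effect_size(phq9_pre, phq9_post)
print(f"ES = {es:.2f}, band = {compare_to_benchmarks(es, 'depression')}")
```

A real evaluation would also report a confidence interval for the ES, as the abstract notes, since a small service can land in a different band by chance alone.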
Gaia FGK benchmark stars: Metallicity

NASA Astrophysics Data System (ADS)

Jofré, P.; Heiter, U.; Soubiran, C.; Blanco-Cuaresma, S.; Worley, C. C.; Pancino, E.; Cantat-Gaudin, T.; Magrini, L.; Bergemann, M.; González Hernández, J. I.; Hill, V.; Lardo, C.; de Laverny, P.; Lind, K.; Masseron, T.; Montes, D.; Mucciarelli, A.; Nordlander, T.; Recio Blanco, A.; Sobeck, J.; Sordo, R.; Sousa, S. G.; Tabernero, H.; Vallenari, A.; Van Eck, S.

2014-04-01

Context. To calibrate automatic pipelines that determine atmospheric parameters of stars, one needs a sample of stars, or "benchmark stars", with well-defined parameters to be used as a reference. Aims: We provide detailed documentation of the iron abundance determination of the 34 FGK-type benchmark stars that are selected to be the pillars for calibration of the one billion Gaia stars. They cover a wide range of temperatures, surface gravities, and metallicities. Methods: Up to seven different methods were used to analyze an observed spectral library of high resolutions and high signal-to-noise ratios. The metallicity was determined by assuming a value of effective temperature and surface gravity obtained from fundamental relations; that is, these parameters were known a priori and independently from the spectra. Results: We present a set of metallicity values obtained in a homogeneous way for our sample of benchmark stars. In addition to this value, we provide detailed documentation of the associated uncertainties. Finally, we report a value of the metallicity of the cool giant ψ Phe for the first time. Based on NARVAL and HARPS data obtained within the Gaia DPAC (Data Processing and Analysis Consortium) and coordinated by the GBOG (Ground-Based Observations for Gaia) working group and on data retrieved from the ESO-ADP database. Tables 6-76 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/564/A133
Derivation of Draft Ecological Soil Screening Levels for TNT and RDX Utilizing Terrestrial Plant and Soil Invertebrate Toxicity Benchmarks

DTIC Science & Technology

2012-11-01

... TSL Soils Utilizing Growth Benchmarks for Alfalfa, Barnyard Grass, and Perennial Ryegrass ... Derivation of Terrestrial Plant-Based Draft Eco-SSL Value for RDX Weathered-and-Aged in SSL or TSL Soils Utilizing Growth Benchmarks for Alfalfa ... studies were conducted using the following plant species: Dicotyledonous symbiotic species alfalfa (Medicago sativa L.); Monocotyledonous ...
Identifying key genes in glaucoma based on a benchmarked dataset and the gene regulatory network.

PubMed

Chen, Xi; Wang, Qiao-Ling; Zhang, Meng-Hui

2017-10-01

The current study aimed to identify key genes in glaucoma based on a benchmarked dataset and gene regulatory network (GRN). Local and global noise was added to the gene expression dataset to produce a benchmarked dataset. Differentially-expressed genes (DEGs) between patients with glaucoma and normal controls were identified utilizing the Linear Models for Microarray Data (Limma) package based on the benchmarked dataset. A total of 5 GRN inference methods, including Zscore, GeneNet, context likelihood of relatedness (CLR) algorithm, Partial Correlation coefficient with Information Theory (PCIT) and GEne Network Inference with Ensemble of Trees (Genie3), were evaluated using receiver operating characteristic (ROC) and precision and recall (PR) curves. The inference method with the best performance was selected to construct the GRN. Subsequently, topological centrality analysis (degree, closeness and betweenness) was conducted to identify key genes in the GRN of glaucoma. Finally, the key genes were validated by performing reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A total of 176 DEGs were detected from the benchmarked dataset. The ROC and PR curves of the 5 methods were analyzed and it was determined that Genie3 had a clear advantage over the other methods; thus, Genie3 was used to construct the GRN. Following topological centrality analysis, 14 key genes for glaucoma were identified, including IL6, EPHA2 and GSTT1, and 5 of these 14 key genes were validated by RT-qPCR. Therefore, the current study identified 14 key genes in glaucoma, which may be potential biomarkers to use in the diagnosis of glaucoma and aid in identifying the molecular mechanism of this disease.
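
The ROC-based comparison of inference methods mentioned above can be sketched in Python as follows. This is a generic area-under-the-ROC-curve calculation over predicted edge scores, not the authors' pipeline; the gene labels, edge scores, and reference edges are invented for illustration.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def auc_from_scores(scores: Dict[Edge, float], true_edges: Set[Edge]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen true edge receives a higher score than a randomly chosen false edge
    (ties count as one half)."""
    pos = [s for edge, s in scores.items() if edge in true_edges]
    neg = [s for edge, s in scores.items() if edge not in true_edges]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one false edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical edge confidences from one inference method (higher = more confident).
predicted: Dict[Edge, float] = {
    ("g1", "g2"): 0.9,
    ("g1", "g3"): 0.8,
    ("g2", "g3"): 0.3,
    ("g2", "g4"): 0.6,
    ("g3", "g4"): 0.1,
}
# Hypothetical reference network used to judge the predictions.
reference: Set[Edge] = {("g1", "g2"), ("g2", "g4")}

print(f"AUC = {auc_from_scores(predicted, reference):.3f}")
```

Running the same function on the edge scores of each candidate method and keeping the one with the largest value mirrors, in simplified form, how a best-performing method could be selected.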
Benchmarking Ada tasking on tightly coupled multiprocessor architectures

NASA Technical Reports Server (NTRS)

Collard, Philippe; Goforth, Andre; Marquardt, Matthew

1989-01-01

The development of benchmarks and performance measures for parallel Ada tasking is reported with emphasis on the macroscopic behavior of the benchmark across a set of load parameters. The application chosen for the study was the NASREM model for telerobot control, relevant to many NASA missions. The results of the study demonstrate the potential of parallel Ada in accomplishing the task of developing a control system for a system such as the Flight Telerobotic Servicer using the NASREM framework.
Marking Closely or on the Bench?: An Australian's Benchmark Statement.

ERIC Educational Resources Information Center

Jones, Roy

2000-01-01

Reviews the benchmark statements of the Quality Assurance Agency for Higher Education in the United Kingdom. Examines the various sections within the benchmark. States that in terms of emphasizing the positive attributes of the geography discipline the statements have wide utility and applicability. (CMK)
40 CFR 141.543 - How is the disinfection benchmark calculated?

Code of Federal Regulations, 2010 CFR

2010-07-01

...) WATER PROGRAMS (CONTINUED) NATIONAL PRIMARY DRINKING WATER REGULATIONS Enhanced Filtration and Disinfection-Systems Serving Fewer Than 10,000 People Disinfection Benchmark § 141.543 How is the disinfection... 40 Protection of Environment 22 2010-07-01 2010-07-01 false How is the disinfection benchmark...
40 CFR 141.709 - Developing the disinfection profile and benchmark.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 23 2011-07-01 2011-07-01 false Developing the disinfection profile... Cryptosporidium Disinfection Profiling and Benchmarking Requirements § 141.709 Developing the disinfection profile and benchmark. (a) Systems required to develop disinfection profiles under § 141.708 must follow the...
40 CFR 141.709 - Developing the disinfection profile and benchmark.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 40 Protection of Environment 23 2014-07-01 2014-07-01 false Developing the disinfection profile... Cryptosporidium Disinfection Profiling and Benchmarking Requirements § 141.709 Developing the disinfection profile and benchmark. (a) Systems required to develop disinfection profiles under § 141.708 must follow the...
40 CFR 141.709 - Developing the disinfection profile and benchmark.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 40 Protection of Environment 24 2012-07-01 2012-07-01 false Developing the disinfection profile... Cryptosporidium Disinfection Profiling and Benchmarking Requirements § 141.709 Developing the disinfection profile and benchmark. (a) Systems required to develop disinfection profiles under § 141.708 must follow the...
40 CFR 141.709 - Developing the disinfection profile and benchmark.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 40 Protection of Environment 24 2013-07-01 2013-07-01 false Developing the disinfection profile... Cryptosporidium Disinfection Profiling and Benchmarking Requirements § 141.709 Developing the disinfection profile and benchmark. (a) Systems required to develop disinfection profiles under § 141.708 must follow the...
Rethinking the reference collection: exploring benchmarks and e-book availability.

PubMed

Husted, Jeffrey T; Czechowski, Leslie J

2012-01-01

Librarians in the Health Sciences Library System at the University of Pittsburgh explored the possibility of developing an electronic reference collection that would replace the print reference collection, thus providing access to these valuable materials to a widely dispersed user population. The librarians evaluated the print reference collection and standard collection development lists as potential benchmarks for the electronic collection, and they determined which books were available in electronic format. They decided that the low availability of electronic versions of titles in each benchmark group rendered the creation of an electronic reference collection using either benchmark impractical.
A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database.

PubMed

Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin

2015-12-01

Face recognition with still face images has been widely studied, while the research on video-based face recognition is relatively inadequate, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Video-to-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively, taking video or still image as query or target. To the best of our knowledge, few datasets and evaluation protocols have been benchmarked for all three scenarios. In order to facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX Face DB. Specifically, we make three contributions. First, we collect and release a large-scale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more effort, and our COX Face DB is a good benchmark database for evaluation.
An automated protocol for performance benchmarking a widefield fluorescence microscope.

PubMed

Halter, Michael; Bier, Elianna; DeRose, Paul C; Cooksey, Gregory A; Choquette, Steven J; Plant, Anne L; Elliott, John T

2014-11-01

Widefield fluorescence microscopy is a highly used tool for visually assessing biological samples and for quantifying cell responses. Despite its widespread use in high content analysis and other imaging applications, few published methods exist for evaluating and benchmarking the analytical performance of a microscope. Easy-to-use benchmarking methods would facilitate the use of fluorescence imaging as a quantitative analytical tool in research applications, and would aid the determination of instrumental method validation for commercial product development applications. We describe and evaluate an automated method to characterize a fluorescence imaging system's performance by benchmarking the detection threshold, saturation, and linear dynamic range to a reference material. The benchmarking procedure is demonstrated using two different materials as the reference material, uranyl-ion-doped glass and Schott 475 GG filter glass. Both are suitable candidate reference materials that are homogeneously fluorescent and highly photostable, and the Schott 475 GG filter glass is currently commercially available. In addition to benchmarking the analytical performance, we also demonstrate that the reference materials provide for accurate day to day intensity calibration. Published 2014 Wiley Periodicals Inc. This article is a US government work and, as such, is in the public domain in the United States of America.
An overview of the Evaluation of Oxygen Interaction with Materials-third phase (EOIM-3) experiment: Space Shuttle Mission 46

NASA Technical Reports Server (NTRS)

Leger, Lubert J.; Koontz, Steven L.; Visentine, James T.; Hunton, Donald

1993-01-01

The interaction of the atomic oxygen (AO) component of the low earth orbit (LEO) environment with spacecraft materials has been the subject of several flight experiments over the past 11 years. The effect of AO interactions with materials has been shown to be significant for long-lived spacecraft such as Space Station Freedom and has resulted in materials changes for externally exposed surfaces. The data obtained from previous flight experiments, augmented by limited ground-based evaluation, have been used to evaluate hardware performance and select materials. Questions pertaining to the accuracy of this data base remain, resulting from the use of long-term ambient density models to estimate the O-atom fluxes and fluences needed to calculate materials reactivity in short-term flight experiments. The EOIM-3 flight experiment was designed to produce benchmark AO reactivity data and was carried out during STS-46. Ambient density measurements were made with a quadrupole mass spectrometer which was calibrated for AO measurements in a unique ground-based test facility. The combination of these data with the predictions of ambient density models allows an assessment of the accuracy of measured reaction rates on a wide variety of materials, many of which had never been tested in LEO before. The mass spectrometer is also used to obtain a better definition of the local neutral and plasma environments resulting from interaction of the ambient atmosphere with various spacecraft surfaces. In addition, the EOIM-3 experiment was designed to produce information on the effects of temperature, mechanical stress, and solar exposure on the AO reactivity of a wide range of materials. An overview of the EOIM-3 methods and results is presented.
Benchmarking of energy consumption in municipal wastewater treatment plants - a survey of over 200 plants in Italy.

PubMed

Vaccari, M; Foladori, P; Nembrini, S; Vitali, F

2018-05-01

One of the largest surveys in Europe of energy consumption in wastewater treatment plants (WWTPs) is presented, based on 241 Italian WWTPs and a total population equivalent (PE) of more than 9,000,000 PE. The study contributes towards standardised, resilient data and benchmarking and helps to identify potential energy savings. In the energy benchmark, three indicators were used: specific energy consumption expressed per population equivalent (kWh PE⁻¹ year⁻¹), per cubic meter (kWh/m³), and per unit of chemical oxygen demand (COD) removed (kWh/kgCOD). The indicator kWh/m³, even though widely applied, resulted in a biased benchmark because it is highly influenced by stormwater and infiltration. Plants with combined networks (often used in Europe) showed an apparently better energy performance. Conversely, the indicator kWh PE⁻¹ year⁻¹ resulted in a more meaningful definition of a benchmark. High energy efficiency was associated with: (i) large capacity of the plant, (ii) higher COD concentration in wastewater, (iii) separate sewer systems, (iv) capacity utilisation over 80%, and (v) high organic loads, but without overloading. The 25th percentile was proposed as a benchmark for four size classes: 23 kWh PE⁻¹ y⁻¹ for large plants > 100,000 PE; 42 kWh PE⁻¹ y⁻¹ for capacity 10,000 < PE < 100,000; 48 kWh PE⁻¹ y⁻¹ for capacity 2,000 < PE < 10,000; and 76 kWh PE⁻¹ y⁻¹ for small plants < 2,000 PE.
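The percentile-based benchmark described in this abstract can be illustrated with a minimal Python sketch. It assumes each plant is given as a (PE, annual kWh) pair; the function names, the handling of class boundaries, and the sample figures are hypothetical and are not taken from the survey. The quartile interpolation rule used here is the standard library's "inclusive" method, which may differ from the one the authors applied.

```python
from statistics import quantiles


def size_class(pe):
    """Map a population equivalent (PE) to one of the four size classes.

    Boundary handling (which class gets exactly 2,000 PE, etc.) is an
    assumption; the abstract only gives strict inequalities.
    """
    if pe < 2_000:
        return "< 2,000 PE"
    if pe < 10_000:
        return "2,000-10,000 PE"
    if pe < 100_000:
        return "10,000-100,000 PE"
    return "> 100,000 PE"


def specific_energy(kwh_per_year, pe):
    """Specific energy consumption in kWh per PE per year."""
    return kwh_per_year / pe


def benchmark_by_class(plants):
    """Return the 25th percentile of kWh/PE/year for each size class.

    plants: iterable of (pe, kwh_per_year) pairs (hypothetical input format).
    Classes with fewer than two plants are skipped.
    """
    groups = {}
    for pe, kwh in plants:
        groups.setdefault(size_class(pe), []).append(specific_energy(kwh, pe))
    return {
        name: quantiles(values, n=4, method="inclusive")[0]
        for name, values in groups.items()
        if len(values) >= 2
    }


# Hypothetical large plants, all 200,000 PE, using 20 to 60 kWh/PE/year.
sample = [(200_000, kwh) for kwh in
          (4_000_000, 6_000_000, 8_000_000, 10_000_000, 12_000_000)]
print(benchmark_by_class(sample))
```

A plant whose specific consumption lies above its class value would be flagged as having room for energy savings relative to the best-performing quarter of its peers.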
Toxicological benchmarks for screening potential contaminants of concern for effects on aquatic biota: 1994 Revision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Suter, G.W. II; Mabrey, J.B.

1994-07-01

This report presents potential screening benchmarks for protection of aquatic life from contaminants in water. Because there is no guidance for screening benchmarks, a set of alternative benchmarks is presented herein. The alternative benchmarks are based on different conceptual approaches to estimating concentrations causing significant effects. For the upper screening benchmark, there are the acute National Ambient Water Quality Criteria (NAWQC) and the Secondary Acute Values (SAV). The SAV concentrations are values estimated with 80% confidence not to exceed the unknown acute NAWQC for those chemicals with no NAWQC. The alternative chronic benchmarks are the chronic NAWQC, the Secondary Chronic Value (SCV), the lowest chronic values for fish and daphnids from chronic toxicity tests, the estimated EC20 for a sensitive species, and the concentration estimated to cause a 20% reduction in the recruit abundance of largemouth bass. It is recommended that ambient chemical concentrations be compared to all of these benchmarks. If NAWQC are exceeded, the chemicals must be contaminants of concern because the NAWQC are applicable or relevant and appropriate requirements (ARARs). If NAWQC are not exceeded, but other benchmarks are, contaminants should be selected on the basis of the number of benchmarks exceeded and the conservatism of the particular benchmark values, as discussed in the text. To the extent that toxicity data are available, this report presents the alternative benchmarks for chemicals that have been detected on the Oak Ridge Reservation. It also presents the data used to calculate benchmarks and the sources of the data. It compares the benchmarks and discusses their relative conservatism and utility.
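The screening rule described in this abstract (an exceeded NAWQC makes a chemical a contaminant of concern; otherwise the other exceeded benchmarks are weighed) can be sketched in Python as follows. The function name, the dictionary keys, and all numeric values are hypothetical placeholders, not figures from the report, and the sketch does not attempt the conservatism weighting that the report leaves to expert judgment.

```python
def screen_chemical(concentration, benchmarks,
                    regulatory=("acute_NAWQC", "chronic_NAWQC")):
    """Compare one ambient concentration against every available benchmark.

    benchmarks: dict mapping a benchmark name to its concentration limit,
    in the same units as `concentration` (names here are hypothetical).
    """
    exceeded = sorted(name for name, limit in benchmarks.items()
                      if concentration > limit)
    # Exceeding any NAWQC is decisive on its own.
    is_concern = any(name in regulatory for name in exceeded)
    return {
        "exceeded": exceeded,
        "contaminant_of_concern": is_concern,
        # Other benchmarks exceeded: selection needs further weighing.
        "needs_weighing": bool(exceeded) and not is_concern,
    }


# Hypothetical benchmark values for a single chemical.
limits = {
    "acute_NAWQC": 100.0,
    "chronic_NAWQC": 50.0,
    "SCV": 20.0,
    "lowest_chronic_value_fish": 30.0,
}
print(screen_chemical(25.0, limits))
print(screen_chemical(60.0, limits))
```

Returning the full list of exceeded benchmarks, rather than a single yes/no flag, keeps the count available for the "number of benchmarks exceeded" criterion mentioned in the abstract.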
Numerical simulation of air distribution in a room with a sidewall jet under benchmark test conditions

NASA Astrophysics Data System (ADS)

Zasimova, Marina; Ivanov, Nikolay

2018-05-01

The goal of the study is to validate Large Eddy Simulation (LES) data on mixing ventilation in an isothermal room under the conditions of the benchmark experiments by Hurnik et al. (2015). The focus is on the accuracy of the predicted mean and rms velocity fields in the quasi-free jet zone of the room, with a 3D jet supplied from a sidewall rectangular diffuser. Calculations were carried out using the ANSYS Fluent 16.2 software with an algebraic wall-modeled LES subgrid-scale model. CFD results on the mean velocity vector are compared with the Laser Doppler Anemometry data. The difference between the mean velocity vector and the mean air speed in the jet zone, both LES-computed, is presented and discussed.

Using chemical benchmarking to determine the persistence of chemicals in a Swedish lake.

PubMed

Zou, Hongyan; Radke, Michael; Kierkegaard, Amelie; MacLeod, Matthew; McLachlan, Michael S

2015-02-03

It is challenging to measure the persistence of chemicals under field conditions. In this work, two approaches for measuring persistence in the field were compared: the chemical mass balance approach, and a novel chemical benchmarking approach. Ten pharmaceuticals, an X-ray contrast agent, and an artificial sweetener were studied in a Swedish lake. Acesulfame K was selected as a benchmark to quantify persistence using the chemical benchmarking approach. The 95% confidence intervals of the half-life for transformation in the lake system ranged from 780-5700 days for carbamazepine to <1-2 days for ketoprofen. The persistence estimates obtained using the benchmarking approach agreed well with those from the mass balance approach (1-21% difference), indicating that chemical benchmarking can be a valid and useful method to measure the persistence of chemicals under field conditions. Compared to the mass balance approach, the benchmarking approach partially or completely eliminates the need to quantify mass flow of chemicals, so it is particularly advantageous when the quantification of mass flow of chemicals is difficult. Furthermore, the benchmarking approach allows for ready comparison and ranking of the persistence of different chemicals.
Benchmarking Diagnostic Algorithms on an Electrical Power System Testbed

NASA Technical Reports Server (NTRS)

Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Wright, Stephanie

2009-01-01

Diagnostic algorithms (DAs) are key to enabling automated health management. These algorithms are designed to detect and isolate anomalies of either a component or the whole system based on observations received from sensors. In recent years a wide range of algorithms, both model-based and data-driven, have been developed to increase autonomy and improve system reliability and affordability. However, the lack of support to perform systematic benchmarking of these algorithms continues to create barriers for effective development and deployment of diagnostic technologies. In this paper, we present our efforts to benchmark a set of DAs on a common platform using a framework that was developed to evaluate and compare various performance metrics for diagnostic technologies. The diagnosed system is an electrical power system, namely the Advanced Diagnostics and Prognostics Testbed (ADAPT) developed and located at the NASA Ames Research Center. The paper presents the fundamentals of the benchmarking framework, the ADAPT system, description of faults and data sets, the metrics used for evaluation, and an in-depth analysis of benchmarking results obtained from testing ten diagnostic algorithms on the ADAPT electrical power system testbed.
Measurement, Standards, and Peer Benchmarking: One Hospital's Journey.

PubMed

Martin, Brian S; Arbore, Mark

2016-04-01

Peer-to-peer benchmarking is an important component of rapid-cycle performance improvement in patient safety and quality-improvement efforts. Institutions should carefully examine critical success factors before engagement in peer-to-peer benchmarking in order to maximize growth and change opportunities. Solutions for Patient Safety has proven to be a high-yield engagement for Children's Hospital of Pittsburgh of University of Pittsburgh Medical Center, with measurable improvement in both organizational process and culture. Copyright © 2016 Elsevier Inc. All rights reserved.
Benchmarking protein classification algorithms via supervised cross-validation.

PubMed

Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

2008-04-24

Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. A combination of supervised and
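The idea of supervised cross-validation described in this abstract, holding out whole known subtypes instead of random members, can be illustrated with a simplified Python sketch. It is a leave-one-subtype-out split over a single labeled class. The function name and the record format are hypothetical, and the sketch omits the negative-set construction and the graph-theoretic distance used in the actual benchmark collection.

```python
from collections import defaultdict


def leave_one_subtype_out(records):
    """Yield (held_out_subtype, train_ids, test_ids) splits.

    records: iterable of (record_id, subtype) pairs for one protein class.
    Every member of the held-out subtype goes to the test set, so the
    classifier never sees that subtype during training. In random k-fold
    cross-validation, members of one subtype usually land on both sides.
    """
    by_subtype = defaultdict(list)
    for record_id, subtype in records:
        by_subtype[subtype].append(record_id)

    for held_out in sorted(by_subtype):
        test_ids = list(by_subtype[held_out])
        train_ids = [rid for subtype in sorted(by_subtype)
                     if subtype != held_out
                     for rid in by_subtype[subtype]]
        yield held_out, train_ids, test_ids


# Hypothetical members of one superfamily, labeled by family.
proteins = [("p1", "famA"), ("p2", "famA"), ("p3", "famB"), ("p4", "famC")]
for held_out, train_ids, test_ids in leave_one_subtype_out(proteins):
    print(held_out, train_ids, test_ids)
```

Because the test subtype is absent from training, the resulting score reflects generalization to a novel subtype, which is consistent with the lower performance estimates the abstract reports relative to random schemes.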
COMPETITIVE BIDDING IN MEDICARE ADVANTAGE: EFFECT OF BENCHMARK CHANGES ON PLAN BIDS

PubMed Central

Song, Zirui; Landrum, Mary Beth; Chernew, Michael E.

2013-01-01

Bidding has been proposed to replace or complement the administered prices that Medicare pays to hospitals and health plans. In 2006, the Medicare Advantage program implemented a competitive bidding system to determine plan payments. In perfectly competitive models, plans bid their costs and thus bids are insensitive to the benchmark. Under many other models of competition, bids respond to changes in the benchmark. We conceptualize the bidding system and use an instrumental variable approach to study the effect of benchmark changes on bids. We use 2006–2010 plan payment data from the Centers for Medicare and Medicaid Services, published county benchmarks, actual realized fee-for-service costs, and Medicare Advantage enrollment. We find that a $1 increase in the benchmark leads to about a $0.53 increase in bids, suggesting that plans in the Medicare Advantage market have meaningful market power. PMID:24308881
Competitive bidding in Medicare Advantage: effect of benchmark changes on plan bids.

PubMed

Song, Zirui; Landrum, Mary Beth; Chernew, Michael E

2013-12-01

Bidding has been proposed to replace or complement the administered prices that Medicare pays to hospitals and health plans. In 2006, the Medicare Advantage program implemented a competitive bidding system to determine plan payments. In perfectly competitive models, plans bid their costs and thus bids are insensitive to the benchmark. Under many other models of competition, bids respond to changes in the benchmark. We conceptualize the bidding system and use an instrumental variable approach to study the effect of benchmark changes on bids. We use 2006-2010 plan payment data from the Centers for Medicare and Medicaid Services, published county benchmarks, actual realized fee-for-service costs, and Medicare Advantage enrollment. We find that a $1 increase in the benchmark leads to about a $0.53 increase in bids, suggesting that plans in the Medicare Advantage market have meaningful market power. Copyright © 2013 Elsevier B.V. All rights reserved.
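The pass-through estimate in this abstract comes from an instrumental variable approach. As a rough illustration of the general technique only, the Python sketch below computes the simple single-instrument IV (Wald) estimator, cov(z, y) / cov(z, x). The instrument, the variable names, and the data are all hypothetical. The abstract does not say which instrument or specification the authors used, and a real analysis would include controls and standard errors.

```python
def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def iv_estimate(instrument, regressor, outcome):
    """Single-instrument IV estimate of d(outcome)/d(regressor).

    With one instrument and one endogenous regressor, two-stage least
    squares reduces to this ratio of covariances.
    """
    return covariance(instrument, outcome) / covariance(instrument, regressor)


# Hypothetical data: z shifts the benchmark x; bids y follow x with a
# pass-through of 0.53 dollars per dollar by construction.
z = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [700.0, 712.0, 725.0, 739.0, 750.0]
y = [0.53 * xi + 300.0 for xi in x]
print(round(iv_estimate(z, x, y), 4))
```

A pass-through below 1 in this kind of estimate is what the abstract interprets as evidence of market power: under perfect competition bids would not move with the benchmark at all.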
Using Institutional Survey Data to Jump-Start Your Benchmarking Process

ERIC Educational Resources Information Center

Chow, Timothy K. C.

2012-01-01

Guided by the missions and visions, higher education institutions utilize benchmarking processes to identify better and more efficient ways to carry out their operations. Aside from the initial planning and organization steps involved in benchmarking, a matching or selection step is crucial for identifying other institutions that have good…
Practical Considerations when Using Benchmarking for Accountability in Higher Education

ERIC Educational Resources Information Center

Achtemeier, Sue D.; Simpson, Ronald D.

2005-01-01

The qualitative study on which this article is based examined key individuals' perceptions, both within a research university community and beyond in its external governing board, of how to improve benchmarking as an accountability method in higher education. Differing understanding of benchmarking revealed practical implications for using it as…
Analysis of benchmark critical experiments with ENDF/B-VI data sets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hardy, J. Jr.; Kahler, A.C.

1991-12-31

Several clean critical experiments were analyzed with ENDF/B-VI data to assess the adequacy of the data for U-235, U-238 and oxygen. These experiments were (1) a set of homogeneous U-235/H₂O assemblies spanning a wide range of hydrogen/uranium ratio, and (2) TRX-1, a simple, H₂O-moderated Bettis lattice of slightly-enriched uranium metal rods. The analyses used the Monte Carlo program RCP01, with explicit three-dimensional geometry and detailed representation of cross sections. For the homogeneous criticals, calculated k_crit values for large, thermal assemblies show good agreement with experiment. This supports the evaluated thermal criticality parameters for U-235. However, for assemblies with smaller H/U ratios, k_crit values increase significantly with increasing leakage and flux-spectrum hardness. These trends suggest that leakage is underpredicted and that the resonance eta of the ENDF/B-VI U-235 is too large. For TRX-1, reasonably good agreement is found with measured lattice parameters (reaction-rate ratios). Of primary interest is rho28, the ratio of above-thermal to thermal U-238 capture. Calculated rho28 is 2.3 (± 1.7) % above measurement, suggesting that U-238 resonance capture remains slightly overpredicted with ENDF/B-VI. However, agreement is better than observed with earlier versions of ENDF/B.
How to Use Benchmark and Cross-section Studies to Improve Data Libraries and Models

NASA Astrophysics Data System (ADS)

Wagner, V.; Suchopár, M.; Vrzalová, J.; Chudoba, P.; Svoboda, O.; Tichý, P.; Krása, A.; Majerle, M.; Kugler, A.; Adam, J.; Baldin, A.; Furman, W.; Kadykov, M.; Solnyshkin, A.; Tsoupko-Sitnikov, S.; Tyutyunikov, S.; Vladimirovna, N.; Závorka, L.

2016-06-01

Improvements of the Monte Carlo transport codes and cross-section libraries are very important steps towards the use of accelerator-driven transmutation systems. We have conducted many benchmark experiments with different set-ups consisting of lead, natural uranium and moderator irradiated by relativistic protons and deuterons within the framework of the collaboration “Energy and Transmutation of Radioactive Waste”. Unfortunately, the knowledge of the total or partial cross-sections of important reactions is insufficient. For this reason we have started extensive studies of different reaction cross-sections. We measure cross-sections of important neutron reactions by means of the quasi-monoenergetic neutron sources based on the cyclotrons at the Nuclear Physics Institute in Řež and at The Svedberg Laboratory in Uppsala. Measurements of partial cross-sections of relativistic deuteron reactions were the second direction of our studies. The new results obtained in recent years will be shown. Possible use of these data for improvement of libraries, models and benchmark studies will be discussed.
Benchmarking in pathology: development of an activity-based costing model.

PubMed

Burnett, Leslie; Wilson, Roger; Pfeffer, Sally; Lowry, John

2012-12-01

Benchmarking in Pathology (BiP) allows pathology laboratories to determine the unit cost of all laboratory tests and procedures, and also provides organisational productivity indices allowing comparisons of performance with other BiP participants. We describe 14 years of progressive enhancement to a BiP program, including the implementation of 'avoidable costs' as the accounting basis for allocation of costs rather than previous approaches using 'total costs'. A hierarchical tree-structured activity-based costing model distributes 'avoidable costs' attributable to the pathology activities component of a pathology laboratory operation. The hierarchical tree model permits costs to be allocated across multiple laboratory sites and organisational structures. This has enabled benchmarking on a number of levels, including test profiles and non-testing related workload activities. The development of methods for dealing with variable cost inputs, allocation of indirect costs using imputation techniques, panels of tests, and blood-bank record keeping, have been successfully integrated into the costing model. A variety of laboratory management reports are produced, including the 'cost per test' of each pathology 'test' output. Benchmarking comparisons may be undertaken at any and all of the 'cost per test' and 'cost per Benchmarking Complexity Unit' level, 'discipline/department' (sub-specialty) level, or overall laboratory/site and organisational levels. We have completed development of a national BiP program. An activity-based costing methodology based on avoidable costs overcomes many problems of previous benchmarking studies based on total costs. The use of benchmarking complexity adjustment permits correction for varying test-mix and diagnostic complexity between laboratories. Use of iterative communication strategies with program participants can overcome many obstacles and lead to innovations.
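The hierarchical, tree-structured allocation described in this abstract can be sketched in Python. Each node carries its own avoidable cost and passes its total down to its children until the leaves, which represent tests, yield a cost per test. The proportional-by-weight allocation rule, the field names, and all figures are assumptions made for illustration; the abstract does not specify how the BiP model splits costs between branches.

```python
def cost_per_test(node, inherited=0.0, result=None):
    """Walk a cost tree and return {test_name: cost per test}.

    node: dict with "name", optional "cost" (the node's own avoidable
    cost), and either "children" (each child carrying a "weight") or,
    for a leaf, "volume" (number of tests performed).
    Splitting in proportion to "weight" is an assumed rule.
    """
    if result is None:
        result = {}
    total = node.get("cost", 0.0) + inherited
    children = node.get("children", [])
    if not children:
        result[node["name"]] = total / node["volume"]
        return result
    weight_sum = sum(child["weight"] for child in children)
    for child in children:
        share = total * child["weight"] / weight_sum
        cost_per_test(child, share, result)
    return result


# Hypothetical laboratory with one overhead pool and two tests.
laboratory = {
    "name": "laboratory",
    "cost": 1000.0,
    "children": [
        {"name": "test_A", "cost": 250.0, "weight": 3, "volume": 100},
        {"name": "test_B", "cost": 50.0, "weight": 1, "volume": 50},
    ],
}
print(cost_per_test(laboratory))
```

Because allocation is recursive, the same function handles deeper trees, for example site, then department, then test, which is the kind of multi-level comparison the abstract describes.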
A call for benchmarking transposable element annotation methods.

PubMed

Hoen, Douglas R; Hickey, Glenn; Bourque, Guillaume; Casacuberta, Josep; Cordaux, Richard; Feschotte, Cédric; Fiston-Lavier, Anna-Sophie; Hua-Van, Aurélie; Hubley, Robert; Kapusta, Aurélie; Lerat, Emmanuelle; Maumus, Florian; Pollock, David D; Quesneville, Hadi; Smit, Arian; Wheeler, Travis J; Bureau, Thomas E; Blanchette, Mathieu

2015-01-01

DNA derived from transposable elements (TEs) constitutes large parts of the genomes of complex eukaryotes, with major impacts not only on genomic research but also on how organisms evolve and function. Although a variety of methods and tools have been developed to detect and annotate TEs, there are as yet no standard benchmarks, that is, no standard way to measure or compare their accuracy. This lack of accuracy assessment calls into question conclusions from a wide range of research that depends explicitly or implicitly on TE annotation. In the absence of standard benchmarks, toolmakers are impeded in improving their tools, annotators cannot properly assess which tools might best suit their needs, and downstream researchers cannot judge how accuracy limitations might impact their studies. We therefore propose that the TE research community create and adopt standard TE annotation benchmarks, and we call for other researchers to join the authors in making this long-overdue effort a success.
Testing New Programming Paradigms with NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

2000-01-01

Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also in the increasing complexity of real applications. Technologies have been developed with the aim of scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. In recent years, new efforts have been made to define new parallel programming paradigms. The best examples are HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplifying the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy; however, due to the immaturity of compiler technology, its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promises portability in a heterogeneous environment and offers the possibility to "compile once and run anywhere." To test these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage
Measuring How Benchmark Assessments Affect Student Achievement. Issues & Answers. REL 2007-No. 039

ERIC Educational Resources Information Center

Henderson, Susan; Petrosino, Anthony; Guckenburg, Sarah; Hamilton, Stephen

2007-01-01

This report examines a Massachusetts pilot program for quarterly benchmark exams in middle-school mathematics, finding that program schools do not show greater gains in student achievement after a year. But that finding might reflect limited data rather than ineffective benchmark assessments. Benchmark assessments are used in many districts…
Benchmark levels for the consumptive water footprint of crop production for different environmental conditions: a case study for winter wheat in China

NASA Astrophysics Data System (ADS)

Zhuo, La; Mekonnen, Mesfin M.; Hoekstra, Arjen Y.

2016-11-01

Meeting growing food demands while simultaneously shrinking the water footprint (WF) of agricultural production is one of the greatest societal challenges. Benchmarks for the WF of crop production can serve as a reference and be helpful in setting WF reduction targets. The consumptive WF of crops, the consumption of rainwater stored in the soil (green WF), and the consumption of irrigation water (blue WF) over the crop growing period varies spatially and temporally depending on environmental factors like climate and soil. The study explores which environmental factors should be distinguished when determining benchmark levels for the consumptive WF of crops. Hereto we determine benchmark levels for the consumptive WF of winter wheat production in China for all separate years in the period 1961-2008, for rain-fed vs. irrigated croplands, for wet vs. dry years, for warm vs. cold years, for four different soil classes, and for two different climate zones. We simulate consumptive WFs of winter wheat production with the crop water productivity model AquaCrop at a 5 by 5 arcmin resolution, accounting for water stress only. The results show that (i) benchmark levels determined for individual years for the country as a whole remain within a range of ±20 % around long-term mean levels over 1961-2008, (ii) the WF benchmarks for irrigated winter wheat are 8-10 % larger than those for rain-fed winter wheat, (iii) WF benchmarks for wet years are 1-3 % smaller than for dry years, (iv) WF benchmarks for warm years are 7-8 % smaller than for cold years, (v) WF benchmarks differ by about 10-12 % across different soil texture classes, and (vi) WF benchmarks for the humid zone are 26-31 % smaller than for the arid zone, which has relatively higher reference evapotranspiration in general and lower yields in rain-fed fields. We conclude that when determining benchmark levels for the consumptive WF of a crop, it is useful to primarily distinguish between different climate zones. If
Learning Through Benchmarking: Developing a Relational, Prospective Approach to Benchmarking ICT in Learning and Teaching

ERIC Educational Resources Information Center

Ellis, Robert A.; Moore, Roger R.

2006-01-01

This study discusses benchmarking the use of information and communication technologies (ICT) in teaching and learning between two universities with different missions: one an Australian campus-based metropolitan university and the other a British distance-education provider. It argues that the differences notwithstanding, it is possible to…
Development of Benchmark Examples for Static Delamination Propagation and Fatigue Growth Predictions

NASA Technical Reports Server (NTRS)

Kruger, Ronald

2011-01-01

The development of benchmark examples for static delamination propagation and cyclic delamination onset and growth prediction is presented and demonstrated for a commercial code. The example is based on a finite element model of an End-Notched Flexure (ENF) specimen. The example is independent of the analysis software used and allows the assessment of the automated delamination propagation, onset and growth prediction capabilities in commercial finite element codes based on the virtual crack closure technique (VCCT). First, static benchmark examples were created for the specimen. Second, based on the static results, benchmark examples for cyclic delamination growth were created. Third, the load-displacement relationship from a propagation analysis and the benchmark results were compared, and good agreement could be achieved by selecting the appropriate input parameters. Fourth, starting from an initially straight front, the delamination was allowed to grow under cyclic loading. The number of cycles to delamination onset and the number of cycles during stable delamination growth for each growth increment were obtained from the automated analysis and compared to the benchmark examples. Again, good agreement between the results obtained from the growth analysis and the benchmark results could be achieved by selecting the appropriate input parameters. The benchmarking procedure proved valuable by highlighting the issues associated with the input parameters of the particular implementation. Selecting the appropriate input parameters, however, was not straightforward and often required an iterative procedure. Overall, the results are encouraging but further assessment for mixed-mode delamination is required.
Benchmarking high performance computing architectures with CMS’ skeleton framework

NASA Astrophysics Data System (ADS)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-10-01

In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures: machines such as Cori Phase 1&2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.
Comparison of the PHISICS/RELAP5-3D ring and block model results for phase I of the OECD/NEA MHTGR-350 benchmark

DOE PAGES

Strydom, G.; Epiney, A. S.; Alfonsi, Andrea; ...

2015-12-02

The PHISICS code system has been under development at INL since 2010. It consists of several modules providing improved coupled core simulation capability: INSTANT (3D nodal transport core calculations), MRTAU (depletion and decay heat generation) and modules performing criticality searches, fuel shuffling and generalized perturbation. Coupling of the PHISICS code suite to the thermal hydraulics system code RELAP5-3D was finalized in 2013, and as part of the verification and validation effort the first phase of the OECD/NEA MHTGR-350 Benchmark has now been completed. The theoretical basis and latest development status of the coupled PHISICS/RELAP5-3D tool are described in more detail in a concurrent paper. This paper provides an overview of the OECD/NEA MHTGR-350 Benchmark and presents the results of Exercises 2 and 3 defined for Phase I. Exercise 2 required the modelling of a stand-alone thermal fluids solution at End of Equilibrium Cycle for the Modular High Temperature Reactor (MHTGR). The RELAP5-3D results of four sub-cases are discussed, consisting of various combinations of coolant bypass flows and material thermophysical properties. Exercise 3 required a coupled neutronics and thermal fluids solution, and the PHISICS/RELAP5-3D code suite was used to calculate the results of two sub-cases. The main focus of the paper is a comparison of results obtained with the traditional RELAP5-3D “ring” model approach against a much more detailed model that includes kinetics feedback on individual block level and thermal feedbacks on a triangular sub-mesh. The higher fidelity that can be obtained by this “block” model is illustrated with comparison results on the temperature, power density and flux distributions. Furthermore, it is shown that the ring model leads to significantly lower fuel temperatures (up to 10%) when compared with the higher fidelity block model, and that the additional model development and run-time efforts are worth the gains obtained
Benchmarking: measuring the outcomes of evidence-based practice.

PubMed

DeLise, D C; Leasure, A R

2001-01-01

Measurement of the outcomes associated with implementation of evidence-based practice changes is becoming increasingly emphasized by multiple health care disciplines. A final step to the process of implementing and sustaining evidence-supported practice changes is that of outcomes evaluation and monitoring. The comparison of outcomes to internal and external measures is known as benchmarking. This article discusses evidence-based practice, provides an overview of outcomes evaluation, and describes the process of benchmarking to improve practice. A case study is used to illustrate this concept.
