benchmark validation comparisons: Topics by Science.gov

Sample records for benchmark validation comparisons

Shift Verification and Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pandya, Tara M.; Evans, Thomas M.; Davidson, Gregory G

2016-09-07

This documentation outlines the verification and validation of Shift for the Consortium for Advanced Simulation of Light Water Reactors (CASL). Five main types of problems were used for validation: small criticality benchmark problems; full-core reactor benchmarks for light water reactors; fixed-source coupled neutron-photon dosimetry benchmarks; depletion/burnup benchmarks; and full-core reactor performance benchmarks. We compared Shift results to measured data and other simulated Monte Carlo radiation transport code results, and found very good agreement in a variety of comparison measures. These include prediction of critical eigenvalue, radial and axial pin power distributions, rod worth, leakage spectra, and nuclide inventories over amore » burn cycle. Based on this validation of Shift, we are confident in Shift to provide reference results for CASL benchmarking.« less
Benchmarking protein classification algorithms via supervised cross-validation.

PubMed

Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

2008-04-24

Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. A combination of supervised and random sampling was used to construct model datasets, suitable for algorithm comparison.
LipidQC: Method Validation Tool for Visual Comparison to SRM 1950 Using NIST Interlaboratory Comparison Exercise Lipid Consensus Mean Estimate Values.

PubMed

Ulmer, Candice Z; Ragland, Jared M; Koelmel, Jeremy P; Heckert, Alan; Jones, Christina M; Garrett, Timothy J; Yost, Richard A; Bowden, John A

2017-12-19

As advances in analytical separation techniques, mass spectrometry instrumentation, and data processing platforms continue to spur growth in the lipidomics field, more structurally unique lipid species are detected and annotated. The lipidomics community is in need of benchmark reference values to assess the validity of various lipidomics workflows in providing accurate quantitative measurements across the diverse lipidome. LipidQC addresses the harmonization challenge in lipid quantitation by providing a semiautomated process, independent of analytical platform, for visual comparison of experimental results of National Institute of Standards and Technology Standard Reference Material (SRM) 1950, "Metabolites in Frozen Human Plasma", against benchmark consensus mean concentrations derived from the NIST Lipidomics Interlaboratory Comparison Exercise.
A Review of Flood Loss Models as Basis for Harmonization and Benchmarking

PubMed Central

Kreibich, Heidi; Franco, Guillermo; Marechal, David

2016-01-01

Risk-based approaches have been increasingly accepted and operationalized in flood risk management during recent decades. For instance, commercial flood risk models are used by the insurance industry to assess potential losses, establish the pricing of policies and determine reinsurance needs. Despite considerable progress in the development of loss estimation tools since the 1980s, loss estimates still reflect high uncertainties and disparities that often lead to questioning their quality. This requires an assessment of the validity and robustness of loss models as it affects prioritization and investment decision in flood risk management as well as regulatory requirements and business decisions in the insurance industry. Hence, more effort is needed to quantify uncertainties and undertake validations. Due to a lack of detailed and reliable flood loss data, first order validations are difficult to accomplish, so that model comparisons in terms of benchmarking are essential. It is checked if the models are informed by existing data and knowledge and if the assumptions made in the models are aligned with the existing knowledge. When this alignment is confirmed through validation or benchmarking exercises, the user gains confidence in the models. Before these benchmarking exercises are feasible, however, a cohesive survey of existing knowledge needs to be undertaken. With that aim, this work presents a review of flood loss–or flood vulnerability–relationships collected from the public domain and some professional sources. Our survey analyses 61 sources consisting of publications or software packages, of which 47 are reviewed in detail. This exercise results in probably the most complete review of flood loss models to date containing nearly a thousand vulnerability functions. These functions are highly heterogeneous and only about half of the loss models are found to be accompanied by explicit validation at the time of their proposal. This paper exemplarily presents an approach for a quantitative comparison of disparate models via the reduction to the joint input variables of all models. Harmonization of models for benchmarking and comparison requires profound insight into the model structures, mechanisms and underlying assumptions. Possibilities and challenges are discussed that exist in model harmonization and the application of the inventory in a benchmarking framework. PMID:27454604
A Review of Flood Loss Models as Basis for Harmonization and Benchmarking.

PubMed

Gerl, Tina; Kreibich, Heidi; Franco, Guillermo; Marechal, David; Schröter, Kai

2016-01-01

Risk-based approaches have been increasingly accepted and operationalized in flood risk management during recent decades. For instance, commercial flood risk models are used by the insurance industry to assess potential losses, establish the pricing of policies and determine reinsurance needs. Despite considerable progress in the development of loss estimation tools since the 1980s, loss estimates still reflect high uncertainties and disparities that often lead to questioning their quality. This requires an assessment of the validity and robustness of loss models as it affects prioritization and investment decision in flood risk management as well as regulatory requirements and business decisions in the insurance industry. Hence, more effort is needed to quantify uncertainties and undertake validations. Due to a lack of detailed and reliable flood loss data, first order validations are difficult to accomplish, so that model comparisons in terms of benchmarking are essential. It is checked if the models are informed by existing data and knowledge and if the assumptions made in the models are aligned with the existing knowledge. When this alignment is confirmed through validation or benchmarking exercises, the user gains confidence in the models. Before these benchmarking exercises are feasible, however, a cohesive survey of existing knowledge needs to be undertaken. With that aim, this work presents a review of flood loss-or flood vulnerability-relationships collected from the public domain and some professional sources. Our survey analyses 61 sources consisting of publications or software packages, of which 47 are reviewed in detail. This exercise results in probably the most complete review of flood loss models to date containing nearly a thousand vulnerability functions. These functions are highly heterogeneous and only about half of the loss models are found to be accompanied by explicit validation at the time of their proposal. This paper exemplarily presents an approach for a quantitative comparison of disparate models via the reduction to the joint input variables of all models. Harmonization of models for benchmarking and comparison requires profound insight into the model structures, mechanisms and underlying assumptions. Possibilities and challenges are discussed that exist in model harmonization and the application of the inventory in a benchmarking framework.
Issues in benchmarking human reliability analysis methods : a literature review.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lois, Erasmia; Forester, John Alan; Tran, Tuan Q.

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessment (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study is currently underway that compares HRA methods with each other and against operator performance in simulator studies. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted,more » reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.« less
Issues in Benchmarking Human Reliability Analysis Methods: A Literature Review

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ronald L. Boring; Stacey M. L. Hendrickson; John A. Forester

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessments (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study comparing and evaluating HRA methods in assessing operator performance in simulator experiments is currently underway. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing pastmore » benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.« less
Performance Comparison of NAMI DANCE and FLOW-3D® Models in Tsunami Propagation, Inundation and Currents using NTHMP Benchmark Problems

NASA Astrophysics Data System (ADS)

Velioglu Sogut, Deniz; Yalciner, Ahmet Cevdet

2018-06-01

Field observations provide valuable data regarding nearshore tsunami impact, yet only in inundation areas where tsunami waves have already flooded. Therefore, tsunami modeling is essential to understand tsunami behavior and prepare for tsunami inundation. It is necessary that all numerical models used in tsunami emergency planning be subject to benchmark tests for validation and verification. This study focuses on two numerical codes, NAMI DANCE and FLOW-3D®, for validation and performance comparison. NAMI DANCE is an in-house tsunami numerical model developed by the Ocean Engineering Research Center of Middle East Technical University, Turkey and Laboratory of Special Research Bureau for Automation of Marine Research, Russia. FLOW-3D® is a general purpose computational fluid dynamics software, which was developed by scientists who pioneered in the design of the Volume-of-Fluid technique. The codes are validated and their performances are compared via analytical, experimental and field benchmark problems, which are documented in the ``Proceedings and Results of the 2011 National Tsunami Hazard Mitigation Program (NTHMP) Model Benchmarking Workshop'' and the ``Proceedings and Results of the NTHMP 2015 Tsunami Current Modeling Workshop". The variations between the numerical solutions of these two models are evaluated through statistical error analysis.
Benchmarking the Collocation Stand-Alone Library and Toolkit (CSALT)

NASA Technical Reports Server (NTRS)

Hughes, Steven; Knittel, Jeremy; Shoan, Wendy; Kim, Youngkwang; Conway, Claire; Conway, Darrel J.

2017-01-01

This paper describes the processes and results of Verification and Validation (VV) efforts for the Collocation Stand Alone Library and Toolkit (CSALT). We describe the test program and environments, the tools used for independent test data, and comparison results. The VV effort employs classical problems with known analytic solutions, solutions from other available software tools, and comparisons to benchmarking data available in the public literature. Presenting all test results are beyond the scope of a single paper. Here we present high-level test results for a broad range of problems, and detailed comparisons for selected problems.
Benchmarking the Collocation Stand-Alone Library and Toolkit (CSALT)

NASA Technical Reports Server (NTRS)

Hughes, Steven; Knittel, Jeremy; Shoan, Wendy (Compiler); Kim, Youngkwang; Conway, Claire (Compiler); Conway, Darrel

2017-01-01

This paper describes the processes and results of Verification and Validation (V&V) efforts for the Collocation Stand Alone Library and Toolkit (CSALT). We describe the test program and environments, the tools used for independent test data, and comparison results. The V&V effort employs classical problems with known analytic solutions, solutions from other available software tools, and comparisons to benchmarking data available in the public literature. Presenting all test results are beyond the scope of a single paper. Here we present high-level test results for a broad range of problems, and detailed comparisons for selected problems.
Cyber-Based Turbulent Combustion Simulation

DTIC Science & Technology

2012-02-28

flame thickness by comparing with benchmark of AFRL/RZ ( UNICORN ) suppressing the oscillatory numerical behavior. These improvements in numerical...fraction with the benchmark results of AFRL/RZ. This validating base is generated by the UNICORN program on the finest mesh available and the local...shared kinematic and thermodynamic data from the UNICORN program. The most important and meaningful conclusion can be drawn from this comparison is
Experimental validation benchmark data for CFD of transient convection from forced to natural with flow reversal on a vertical flat plate

DOE PAGES

Lance, Blake W.; Smith, Barton L.

2016-06-23

Transient convection has been investigated experimentally for the purpose of providing Computational Fluid Dynamics (CFD) validation benchmark data. A specialized facility for validation benchmark experiments called the Rotatable Buoyancy Tunnel was used to acquire thermal and velocity measurements of flow over a smooth, vertical heated plate. The initial condition was forced convection downward with subsequent transition to mixed convection, ending with natural convection upward after a flow reversal. Data acquisition through the transient was repeated for ensemble-averaged results. With simple flow geometry, validation data were acquired at the benchmark level. All boundary conditions (BCs) were measured and their uncertainties quantified.more » Temperature profiles on all four walls and the inlet were measured, as well as as-built test section geometry. Inlet velocity profiles and turbulence levels were quantified using Particle Image Velocimetry. System Response Quantities (SRQs) were measured for comparison with CFD outputs and include velocity profiles, wall heat flux, and wall shear stress. Extra effort was invested in documenting and preserving the validation data. Details about the experimental facility, instrumentation, experimental procedure, materials, BCs, and SRQs are made available through this paper. As a result, the latter two are available for download and the other details are included in this work.« less
The Earthquake‐Source Inversion Validation (SIV) Project

USGS Publications Warehouse

Mai, P. Martin; Schorlemmer, Danijel; Page, Morgan T.; Ampuero, Jean-Paul; Asano, Kimiyuki; Causse, Mathieu; Custodio, Susana; Fan, Wenyuan; Festa, Gaetano; Galis, Martin; Gallovic, Frantisek; Imperatori, Walter; Käser, Martin; Malytskyy, Dmytro; Okuwaki, Ryo; Pollitz, Fred; Passone, Luca; Razafindrakoto, Hoby N. T.; Sekiguchi, Haruko; Song, Seok Goo; Somala, Surendra N.; Thingbaijam, Kiran K. S.; Twardzik, Cedric; van Driel, Martin; Vyas, Jagdish C.; Wang, Rongjiang; Yagi, Yuji; Zielke, Olaf

2016-01-01

Finite‐fault earthquake source inversions infer the (time‐dependent) displacement on the rupture surface from geophysical data. The resulting earthquake source models document the complexity of the rupture process. However, multiple source models for the same earthquake, obtained by different research teams, often exhibit remarkable dissimilarities. To address the uncertainties in earthquake‐source inversion methods and to understand strengths and weaknesses of the various approaches used, the Source Inversion Validation (SIV) project conducts a set of forward‐modeling exercises and inversion benchmarks. In this article, we describe the SIV strategy, the initial benchmarks, and current SIV results. Furthermore, we apply statistical tools for quantitative waveform comparison and for investigating source‐model (dis)similarities that enable us to rank the solutions, and to identify particularly promising source inversion approaches. All SIV exercises (with related data and descriptions) and statistical comparison tools are available via an online collaboration platform, and we encourage source modelers to use the SIV benchmarks for developing and testing new methods. We envision that the SIV efforts will lead to new developments for tackling the earthquake‐source imaging problem.
Benchmark datasets for phylogenomic pipeline validation, applications for foodborne pathogen surveillance

PubMed Central

Rand, Hugh; Shumway, Martin; Trees, Eija K.; Simmons, Mustafa; Agarwala, Richa; Davis, Steven; Tillman, Glenn E.; Defibaugh-Chavez, Stephanie; Carleton, Heather A.; Klimke, William A.; Katz, Lee S.

2017-01-01

Background As next generation sequence technology has advanced, there have been parallel advances in genome-scale analysis programs for determining evolutionary relationships as proxies for epidemiological relationship in public health. Most new programs skip traditional steps of ortholog determination and multi-gene alignment, instead identifying variants across a set of genomes, then summarizing results in a matrix of single-nucleotide polymorphisms or alleles for standard phylogenetic analysis. However, public health authorities need to document the performance of these methods with appropriate and comprehensive datasets so they can be validated for specific purposes, e.g., outbreak surveillance. Here we propose a set of benchmark datasets to be used for comparison and validation of phylogenomic pipelines. Methods We identified four well-documented foodborne pathogen events in which the epidemiology was concordant with routine phylogenomic analyses (reference-based SNP and wgMLST approaches). These are ideal benchmark datasets, as the trees, WGS data, and epidemiological data for each are all in agreement. We have placed these sequence data, sample metadata, and “known” phylogenetic trees in publicly-accessible databases and developed a standard descriptive spreadsheet format describing each dataset. To facilitate easy downloading of these benchmarks, we developed an automated script that uses the standard descriptive spreadsheet format. Results Our “outbreak” benchmark datasets represent the four major foodborne bacterial pathogens (Listeria monocytogenes, Salmonella enterica, Escherichia coli, and Campylobacter jejuni) and one simulated dataset where the “known tree” can be accurately called the “true tree”. The downloading script and associated table files are available on GitHub: https://github.com/WGS-standards-and-analysis/datasets. Discussion These five benchmark datasets will help standardize comparison of current and future phylogenomic pipelines, and facilitate important cross-institutional collaborations. Our work is part of a global effort to provide collaborative infrastructure for sequence data and analytic tools—we welcome additional benchmark datasets in our recommended format, and, if relevant, we will add these on our GitHub site. Together, these datasets, dataset format, and the underlying GitHub infrastructure present a recommended path for worldwide standardization of phylogenomic pipelines. PMID:29372115
Benchmark datasets for phylogenomic pipeline validation, applications for foodborne pathogen surveillance.

PubMed

Timme, Ruth E; Rand, Hugh; Shumway, Martin; Trees, Eija K; Simmons, Mustafa; Agarwala, Richa; Davis, Steven; Tillman, Glenn E; Defibaugh-Chavez, Stephanie; Carleton, Heather A; Klimke, William A; Katz, Lee S

2017-01-01

As next generation sequence technology has advanced, there have been parallel advances in genome-scale analysis programs for determining evolutionary relationships as proxies for epidemiological relationship in public health. Most new programs skip traditional steps of ortholog determination and multi-gene alignment, instead identifying variants across a set of genomes, then summarizing results in a matrix of single-nucleotide polymorphisms or alleles for standard phylogenetic analysis. However, public health authorities need to document the performance of these methods with appropriate and comprehensive datasets so they can be validated for specific purposes, e.g., outbreak surveillance. Here we propose a set of benchmark datasets to be used for comparison and validation of phylogenomic pipelines. We identified four well-documented foodborne pathogen events in which the epidemiology was concordant with routine phylogenomic analyses (reference-based SNP and wgMLST approaches). These are ideal benchmark datasets, as the trees, WGS data, and epidemiological data for each are all in agreement. We have placed these sequence data, sample metadata, and "known" phylogenetic trees in publicly-accessible databases and developed a standard descriptive spreadsheet format describing each dataset. To facilitate easy downloading of these benchmarks, we developed an automated script that uses the standard descriptive spreadsheet format. Our "outbreak" benchmark datasets represent the four major foodborne bacterial pathogens ( Listeria monocytogenes , Salmonella enterica , Escherichia coli , and Campylobacter jejuni ) and one simulated dataset where the "known tree" can be accurately called the "true tree". The downloading script and associated table files are available on GitHub: https://github.com/WGS-standards-and-analysis/datasets. These five benchmark datasets will help standardize comparison of current and future phylogenomic pipelines, and facilitate important cross-institutional collaborations. Our work is part of a global effort to provide collaborative infrastructure for sequence data and analytic tools-we welcome additional benchmark datasets in our recommended format, and, if relevant, we will add these on our GitHub site. Together, these datasets, dataset format, and the underlying GitHub infrastructure present a recommended path for worldwide standardization of phylogenomic pipelines.
Validation of updated neutronic calculation models proposed for Atucha-II PHWR. Part I: Benchmark comparisons of WIMS-D5 and DRAGON cell and control rod parameters with MCNP5

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mollerach, R.; Leszczynski, F.; Fink, J.

2006-07-01

In 2005 the Argentine Government took the decision to complete the construction of the Atucha-II nuclear power plant, which has been progressing slowly during the last ten years. Atucha-II is a 745 MWe nuclear station moderated and cooled with heavy water, of German (Siemens) design located in Argentina. It has a pressure-vessel design with 451 vertical coolant channels, and the fuel assemblies (FA) are clusters of 37 natural UO{sub 2} rods with an active length of 530 cm. For the reactor physics area, a revision and update calculation methods and models (cell, supercell and reactor) was recently carried out coveringmore » cell, supercell (control rod) and core calculations. As a validation of the new models some benchmark comparisons were done with Monte Carlo calculations with MCNP5. This paper presents comparisons of cell and supercell benchmark problems based on a slightly idealized model of the Atucha-I core obtained with the WIMS-D5 and DRAGON codes with MCNP5 results. The Atucha-I core was selected because it is smaller, similar from a neutronic point of view, and more symmetric than Atucha-II Cell parameters compared include cell k-infinity, relative power levels of the different rings of fuel rods, and some two-group macroscopic cross sections. Supercell comparisons include supercell k-infinity changes due to the control rods (tubes) of steel and hafnium. (authors)« less
Validation of COG10 and ENDFB6R7 on the Auk Workstation for General Application to Plutonium Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Percher, Catherine G

2011-08-08

The COG 10 code package1 on the Auk workstation is now validated with the ENBFB6R7 neutron cross section library for general application to plutonium (Pu) systems by comparison of the calculated keffective to the expected keffective of several relevant experimental benchmarks. This validation is supplemental to the installation and verification of COG 10 on the Auk workstation2.
The Earthquake Source Inversion Validation (SIV) - Project: Summary, Status, Outlook

NASA Astrophysics Data System (ADS)

Mai, P. M.

2017-12-01

Finite-fault earthquake source inversions infer the (time-dependent) displacement on the rupture surface from geophysical data. The resulting earthquake source models document the complexity of the rupture process. However, this kinematic source inversion is ill-posed and returns non-unique solutions, as seen for instance in multiple source models for the same earthquake, obtained by different research teams, that often exhibit remarkable dissimilarities. To address the uncertainties in earthquake-source inversions and to understand strengths and weaknesses of various methods, the Source Inversion Validation (SIV) project developed a set of forward-modeling exercises and inversion benchmarks. Several research teams then use these validation exercises to test their codes and methods, but also to develop and benchmark new approaches. In this presentation I will summarize the SIV strategy, the existing benchmark exercises and corresponding results. Using various waveform-misfit criteria and newly developed statistical comparison tools to quantify source-model (dis)similarities, the SIV platforms is able to rank solutions and identify particularly promising source inversion approaches. Existing SIV exercises (with related data and descriptions) and all computational tools remain available via the open online collaboration platform; additional exercises and benchmark tests will be uploaded once they are fully developed. I encourage source modelers to use the SIV benchmarks for developing and testing new methods. The SIV efforts have already led to several promising new techniques for tackling the earthquake-source imaging problem. I expect that future SIV benchmarks will provide further innovations and insights into earthquake source kinematics that will ultimately help to better understand the dynamics of the rupture process.
Validation of COG10 and ENDFB6R7 on the Auk Workstation for General Application to Highly Enriched Uranium Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Percher, Catherine G.

2011-08-08

The COG 10 code package1 on the Auk workstation is now validated with the ENBFB6R7 neutron cross section library for general application to highly enriched uranium (HEU) systems by comparison of the calculated keffective to the expected keffective of several relevant experimental benchmarks. This validation is supplemental to the installation and verification of COG 10 on the Auk workstation2.
Benchmarking of Neutron Production of Heavy-Ion Transport Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models andmore » codes and additional benchmarking are required.« less

Benchmarking of Heavy Ion Transport Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in designing and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models andmore » codes and additional benchmarking are required.« less
Subgroup Benchmark Calculations for the Intra-Pellet Nonuniform Temperature Cases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Kang Seog; Jung, Yeon Sang; Liu, Yuxuan

A benchmark suite has been developed by Seoul National University (SNU) for intrapellet nonuniform temperature distribution cases based on the practical temperature profiles according to the thermal power levels. Though a new subgroup capability for nonuniform temperature distribution was implemented in MPACT, no validation calculation has been performed for the new capability. This study focuses on bench-marking the new capability through a code-to-code comparison. Two continuous-energy Monte Carlo codes, McCARD and CE-KENO, are engaged in obtaining reference solutions, and the MPACT results are compared to the SNU nTRACER using a similar cross section library and subgroup method to obtain self-shieldedmore » cross sections.« less
Benchmarking comparison and validation of MCNP photon interaction data

NASA Astrophysics Data System (ADS)

Colling, Bethany; Kodeli, I.; Lilley, S.; Packer, L. W.

2017-09-01

The objective of the research was to test available photoatomic data libraries for fusion relevant applications, comparing against experimental and computational neutronics benchmarks. Photon flux and heating was compared using the photon interaction data libraries (mcplib 04p, 05t, 84p and 12p). Suitable benchmark experiments (iron and water) were selected from the SINBAD database and analysed to compare experimental values with MCNP calculations using mcplib 04p, 84p and 12p. In both the computational and experimental comparisons, the majority of results with the 04p, 84p and 12p photon data libraries were within 1σ of the mean MCNP statistical uncertainty. Larger differences were observed when comparing computational results with the 05t test photon library. The Doppler broadening sampling bug in MCNP-5 is shown to be corrected for fusion relevant problems through use of the 84p photon data library. The recommended libraries for fusion neutronics are 84p (or 04p) with MCNP6 and 84p if using MCNP-5.
Multiscale benchmarking of drug delivery vectors.

PubMed

Summers, Huw D; Ware, Matthew J; Majithia, Ravish; Meissner, Kenith E; Godin, Biana; Rees, Paul

2016-10-01

Cross-system comparisons of drug delivery vectors are essential to ensure optimal design. An in-vitro experimental protocol is presented that separates the role of the delivery vector from that of its cargo in determining the cell response, thus allowing quantitative comparison of different systems. The technique is validated through benchmarking of the dose-response of human fibroblast cells exposed to the cationic molecule, polyethylene imine (PEI); delivered as a free molecule and as a cargo on the surface of CdSe nanoparticles and Silica microparticles. The exposure metrics are converted to a delivered dose with the transport properties of the different scale systems characterized by a delivery time, τ. The benchmarking highlights an agglomeration of the free PEI molecules into micron sized clusters and identifies the metric determining cell death as the total number of PEI molecules presented to cells, determined by the delivery vector dose and the surface density of the cargo. Copyright © 2016 Elsevier Inc. All rights reserved.
Using chemical benchmarking to determine the persistence of chemicals in a Swedish lake.

PubMed

Zou, Hongyan; Radke, Michael; Kierkegaard, Amelie; MacLeod, Matthew; McLachlan, Michael S

2015-02-03

It is challenging to measure the persistence of chemicals under field conditions. In this work, two approaches for measuring persistence in the field were compared: the chemical mass balance approach, and a novel chemical benchmarking approach. Ten pharmaceuticals, an X-ray contrast agent, and an artificial sweetener were studied in a Swedish lake. Acesulfame K was selected as a benchmark to quantify persistence using the chemical benchmarking approach. The 95% confidence intervals of the half-life for transformation in the lake system ranged from 780-5700 days for carbamazepine to <1-2 days for ketoprofen. The persistence estimates obtained using the benchmarking approach agreed well with those from the mass balance approach (1-21% difference), indicating that chemical benchmarking can be a valid and useful method to measure the persistence of chemicals under field conditions. Compared to the mass balance approach, the benchmarking approach partially or completely eliminates the need to quantify mass flow of chemicals, so it is particularly advantageous when the quantification of mass flow of chemicals is difficult. Furthermore, the benchmarking approach allows for ready comparison and ranking of the persistence of different chemicals.
Benchmarking of neutron production of heavy-ion transport codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, I.; Ronningen, R. M.; Heilbronn, L.

Document available in abstract form only, full text of document follows: Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondarymore » neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required. (authors)« less
Benchmarks--Standards Comparisons. Math Competencies: EFF Benchmarks Comparison [and] Reading Competencies: EFF Benchmarks Comparison [and] Writing Competencies: EFF Benchmarks Comparison.

ERIC Educational Resources Information Center

Kent State Univ., OH. Ohio Literacy Resource Center.

This document is intended to show the relationship between Ohio's Standards and Competencies, Equipped for the Future's (EFF's) Standards and Components of Performance, and Ohio's Revised Benchmarks. The document is divided into three parts, with Part 1 covering mathematics instruction, Part 2 covering reading instruction, and Part 3 covering…
Validation and Comparison of 2D and 3D Codes for Nearshore Motion of Long Waves Using Benchmark Problems

NASA Astrophysics Data System (ADS)

Velioǧlu, Deniz; Cevdet Yalçıner, Ahmet; Zaytsev, Andrey

2016-04-01

Tsunamis are huge waves with long wave periods and wave lengths that can cause great devastation and loss of life when they strike a coast. The interest in experimental and numerical modeling of tsunami propagation and inundation increased considerably after the 2011 Great East Japan earthquake. In this study, two numerical codes, FLOW 3D and NAMI DANCE, that analyze tsunami propagation and inundation patterns are considered. Flow 3D simulates linear and nonlinear propagating surface waves as well as long waves by solving three-dimensional Navier-Stokes (3D-NS) equations. NAMI DANCE uses finite difference computational method to solve 2D depth-averaged linear and nonlinear forms of shallow water equations (NSWE) in long wave problems, specifically tsunamis. In order to validate these two codes and analyze the differences between 3D-NS and 2D depth-averaged NSWE equations, two benchmark problems are applied. One benchmark problem investigates the runup of long waves over a complex 3D beach. The experimental setup is a 1:400 scale model of Monai Valley located on the west coast of Okushiri Island, Japan. Other benchmark problem is discussed in 2015 National Tsunami Hazard Mitigation Program (NTHMP) Annual meeting in Portland, USA. It is a field dataset, recording the Japan 2011 tsunami in Hilo Harbor, Hawaii. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and benchmark data. The differences between 3D-NS and 2D depth-averaged NSWE equations are highlighted. All results are presented with discussions and comparisons. Acknowledgements: Partial support by Japan-Turkey Joint Research Project by JICA on earthquakes and tsunamis in Marmara Region (JICA SATREPS - MarDiM Project), 603839 ASTARTE Project of EU, UDAP-C-12-14 project of AFAD Turkey, 108Y227, 113M556 and 213M534 projects of TUBITAK Turkey, RAPSODI (CONCERT_Dis-021) of CONCERT-Japan Joint Call and Istanbul Metropolitan Municipality are all acknowledged.
Plutonium Critical Mass Curve Comparison to Mass at Upper Subcritical Limit (USL) Using Whisper

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alwin, Jennifer Louise; Zhang, Ning

Whisper is computational software designed to assist the nuclear criticality safety analyst with validation studies with the MCNP ® Monte Carlo radiation transport package. Standard approaches to validation rely on the selection of benchmarks based upon expert judgment. Whisper uses sensitivity/uncertainty (S/U) methods to select relevant benchmarks to a particular application or set of applications being analyzed. Using these benchmarks, Whisper computes a calculational margin. Whisper attempts to quantify the margin of subcriticality (MOS) from errors in software and uncertainties in nuclear data. The combination of the Whisper-derived calculational margin and MOS comprise the baseline upper subcritical limit (USL), tomore » which an additional margin may be applied by the nuclear criticality safety analyst as appropriate to ensure subcriticality. A series of critical mass curves for plutonium, similar to those found in Figure 31 of LA-10860-MS, have been generated using MCNP6.1.1 and the iterative parameter study software, WORM_Solver. The baseline USL for each of the data points of the curves was then computed using Whisper 1.1. The USL was then used to determine the equivalent mass for plutonium metal-water system. ANSI/ANS-8.1 states that it is acceptable to use handbook data, such as the data directly from the LA-10860-MS, as it is already considered validated (Section 4.3 4) “Use of subcritical limit data provided in ANSI/ANS standards or accepted reference publications does not require further validation.”). This paper attempts to take a novel approach to visualize traditional critical mass curves and allows comparison with the amount of mass for which the k eff is equal to the USL (calculational margin + margin of subcriticality). However, the intent is to plot the critical mass data along with USL, not to suggest that already accepted handbook data should have new and more rigorous requirements for validation.« less
The Validity of the Comparative Interrupted Time Series Design for Evaluating the Effect of School-Level Interventions.

PubMed

Jacob, Robin; Somers, Marie-Andree; Zhu, Pei; Bloom, Howard

2016-06-01

In this article, we examine whether a well-executed comparative interrupted time series (CITS) design can produce valid inferences about the effectiveness of a school-level intervention. This article also explores the trade-off between bias reduction and precision loss across different methods of selecting comparison groups for the CITS design and assesses whether choosing matched comparison schools based only on preintervention test scores is sufficient to produce internally valid impact estimates. We conduct a validation study of the CITS design based on the federal Reading First program as implemented in one state using results from a regression discontinuity design as a causal benchmark. Our results contribute to the growing base of evidence regarding the validity of nonexperimental designs. We demonstrate that the CITS design can, in our example, produce internally valid estimates of program impacts when multiple years of preintervention outcome data (test scores in the present case) are available and when a set of reasonable criteria are used to select comparison organizations (schools in the present case). © The Author(s) 2016.
Open-source platform to benchmark fingerprints for ligand-based virtual screening

PubMed Central

2013-01-01

Similarity-search methods using molecular fingerprints are an important tool for ligand-based virtual screening. A huge variety of fingerprints exist and their performance, usually assessed in retrospective benchmarking studies using data sets with known actives and known or assumed inactives, depends largely on the validation data sets used and the similarity measure used. Comparing new methods to existing ones in any systematic way is rather difficult due to the lack of standard data sets and evaluation procedures. Here, we present a standard platform for the benchmarking of 2D fingerprints. The open-source platform contains all source code, structural data for the actives and inactives used (drawn from three publicly available collections of data sets), and lists of randomly selected query molecules to be used for statistically valid comparisons of methods. This allows the exact reproduction and comparison of results for future studies. The results for 12 standard fingerprints together with two simple baseline fingerprints assessed by seven evaluation methods are shown together with the correlations between methods. High correlations were found between the 12 fingerprints and a careful statistical analysis showed that only the two baseline fingerprints were different from the others in a statistically significant way. High correlations were also found between six of the seven evaluation methods, indicating that despite their seeming differences, many of these methods are similar to each other. PMID:23721588
NASA low speed centrifugal compressor

NASA Technical Reports Server (NTRS)

Hathaway, Michael D.

1990-01-01

The flow characteristics of a low speed centrifugal compressor were examined at NASA Lewis Research Center to improve understanding of the flow in centrifugal compressors, to provide models of various flow phenomena, and to acquire benchmark data for three dimensional viscous flow code validation. The paper describes the objectives, test facilities' instrumentation, and experiment preliminary comparisons.
Revisiting Turbulence Model Validation for High-Mach Number Axisymmetric Compression Corner Flows

NASA Technical Reports Server (NTRS)

Georgiadis, Nicholas J.; Rumsey, Christopher L.; Huang, George P.

2015-01-01

Two axisymmetric shock-wave/boundary-layer interaction (SWBLI) cases are used to benchmark one- and two-equation Reynolds-averaged Navier-Stokes (RANS) turbulence models. This validation exercise was executed in the philosophy of the NASA Turbulence Modeling Resource and the AIAA Turbulence Model Benchmarking Working Group. Both SWBLI cases are from the experiments of Kussoy and Horstman for axisymmetric compression corner geometries with SWBLI inducing flares of 20 and 30 degrees, respectively. The freestream Mach number was approximately 7. The RANS closures examined are the Spalart-Allmaras one-equation model and the Menter family of kappa - omega two equation models including the Baseline and Shear Stress Transport formulations. The Wind-US and CFL3D RANS solvers are employed to simulate the SWBLI cases. Comparisons of RANS solutions to experimental data are made for a boundary layer survey plane just upstream of the SWBLI region. In the SWBLI region, comparisons of surface pressure and heat transfer are made. The effects of inflow modeling strategy, grid resolution, grid orthogonality, turbulent Prandtl number, and code-to-code variations are also addressed.
Benchmark Simulations of the Thermal-Hydraulic Responses during EBR-II Inherent Safety Tests using SAM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hu, Rui; Sumner, Tyler S.

2016-04-17

An advanced system analysis tool SAM is being developed for fast-running, improved-fidelity, and whole-plant transient analyses at Argonne National Laboratory under DOE-NE’s Nuclear Energy Advanced Modeling and Simulation (NEAMS) program. As an important part of code development, companion validation activities are being conducted to ensure the performance and validity of the SAM code. This paper presents the benchmark simulations of two EBR-II tests, SHRT-45R and BOP-302R, whose data are available through the support of DOE-NE’s Advanced Reactor Technology (ART) program. The code predictions of major primary coolant system parameter are compared with the test results. Additionally, the SAS4A/SASSYS-1 code simulationmore » results are also included for a code-to-code comparison.« less
Validation and Performance Comparison of Numerical Codes for Tsunami Inundation

NASA Astrophysics Data System (ADS)

Velioglu, D.; Kian, R.; Yalciner, A. C.; Zaytsev, A.

2015-12-01

In inundation zones, tsunami motion turns from wave motion to flow of water. Modelling of this phenomenon is a complex problem since there are many parameters affecting the tsunami flow. In this respect, the performance of numerical codes that analyze tsunami inundation patterns becomes important. The computation of water surface elevation is not sufficient for proper analysis of tsunami behaviour in shallow water zones and on land and hence for the development of mitigation strategies. Velocity and velocity patterns are also crucial parameters and have to be computed at the highest accuracy. There are numerous numerical codes to be used for simulating tsunami inundation. In this study, FLOW 3D and NAMI DANCE codes are selected for validation and performance comparison. Flow 3D simulates linear and nonlinear propagating surface waves as well as long waves by solving three-dimensional Navier-Stokes (3D-NS) equations. FLOW 3D is used specificaly for flood problems. NAMI DANCE uses finite difference computational method to solve linear and nonlinear forms of shallow water equations (NSWE) in long wave problems, specifically tsunamis. In this study, these codes are validated and their performances are compared using two benchmark problems which are discussed in 2015 National Tsunami Hazard Mitigation Program (NTHMP) Annual meeting in Portland, USA. One of the problems is an experiment of a single long-period wave propagating up a piecewise linear slope and onto a small-scale model of the town of Seaside, Oregon. Other benchmark problem is an experiment of a single solitary wave propagating up a triangular shaped shelf with an island feature located at the offshore point of the shelf. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and benchmark data. All results are presented with discussions and comparisons. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No 603839 (Project ASTARTE - Assessment, Strategy and Risk Reduction for Tsunamis in Europe)
Validation of tungsten cross sections in the neutron energy region up to 100 keV

NASA Astrophysics Data System (ADS)

Pigni, Marco T.; Žerovnik, Gašper; Leal, Luiz. C.; Trkov, Andrej

2017-09-01

Following a series of recent cross section evaluations on tungsten isotopes performed at Oak Ridge National Laboratory (ORNL), this paper presents the validation work carried out to test the performance of the evaluated cross sections based on lead-slowing-down (LSD) benchmarks conducted in Grenoble. ORNL completed the resonance parameter evaluation of four tungsten isotopes - 182,183,184,186W - in August 2014 and submitted it as an ENDF-compatible file to be part of the next release of the ENDF/B-VIII.0 nuclear data library. The evaluations were performed with support from the US Nuclear Criticality Safety Program in an effort to provide improved tungsten cross section and covariance data for criticality safety sensitivity analyses. The validation analysis based on the LSD benchmarks showed an improved agreement with the experimental response when the ORNL tungsten evaluations were included in the ENDF/B-VII.1 library. Comparison with the results obtained with the JEFF-3.2 nuclear data library are also discussed.
Land Ice Verification and Validation Kit

DOE Office of Scientific and Technical Information (OSTI.GOV)

2015-07-15

To address a pressing need to better understand the behavior and complex interaction of ice sheets within the global Earth system, significant development of continental-scale, dynamical ice-sheet models is underway. The associated verification and validation process of these models is being coordinated through a new, robust, python-based extensible software package, the Land Ice Verification and Validation toolkit (LIVV). This release provides robust and automated verification and a performance evaluation on LCF platforms. The performance V&V involves a comprehensive comparison of model performance relative to expected behavior on a given computing platform. LIVV operates on a set of benchmark and testmore » data, and provides comparisons for a suite of community prioritized tests, including configuration and parameter variations, bit-4-bit evaluation, and plots of tests where differences occur.« less
Accuracy of a simplified method for shielded gamma-ray skyshine sources

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bassett, M.S.; Shultis, J.K.

1989-11-01

Rigorous transport or Monte Carlo methods for estimating far-field gamma-ray skyshine doses generally are computationally intensive. consequently, several simplified techniques such as point-kernel methods and methods based on beam response functions have been proposed. For unshielded skyshine sources, these simplified methods have been shown to be quite accurate from comparisons to benchmark problems and to benchmark experimental results. For shielded sources, the simplified methods typically use exponential attenuation and photon buildup factors to describe the effect of the shield. However, the energy and directional redistribution of photons scattered in the shield is usually ignored, i.e., scattered photons are assumed tomore » emerge from the shield with the same energy and direction as the uncollided photons. The accuracy of this shield treatment is largely unknown due to the paucity of benchmark results for shielded sources. In this paper, the validity of such a shield treatment is assessed by comparison to a composite method, which accurately calculates the energy and angular distribution of photons penetrating the shield.« less
Computational Chemistry Comparison and Benchmark Database

National Institute of Standards and Technology Data Gateway

SRD 101 NIST Computational Chemistry Comparison and Benchmark Database (Web, free access) The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of molecules. The goals are to provide a benchmark set of molecules for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of thermochemical properties.
a Proposed Benchmark Problem for Scatter Calculations in Radiographic Modelling

NASA Astrophysics Data System (ADS)

Jaenisch, G.-R.; Bellon, C.; Schumm, A.; Tabary, J.; Duvauchelle, Ph.

2009-03-01

Code Validation is a permanent concern in computer modelling, and has been addressed repeatedly in eddy current and ultrasonic modeling. A good benchmark problem is sufficiently simple to be taken into account by various codes without strong requirements on geometry representation capabilities, focuses on few or even a single aspect of the problem at hand to facilitate interpretation and to avoid that compound errors compensate themselves, yields a quantitative result and is experimentally accessible. In this paper we attempt to address code validation for one aspect of radiographic modeling, the scattered radiation prediction. Many NDT applications can not neglect scattered radiation, and the scatter calculation thus is important to faithfully simulate the inspection situation. Our benchmark problem covers the wall thickness range of 10 to 50 mm for single wall inspections, with energies ranging from 100 to 500 keV in the first stage, and up to 1 MeV with wall thicknesses up to 70 mm in the extended stage. A simple plate geometry is sufficient for this purpose, and the scatter data is compared on a photon level, without a film model, which allows for comparisons with reference codes like MCNP. We compare results of three Monte Carlo codes (McRay, Sindbad and Moderato) as well as an analytical first order scattering code (VXI), and confront them to results obtained with MCNP. The comparison with an analytical scatter model provides insights into the application domain where this kind of approach can successfully replace Monte-Carlo calculations.

Transport methods and interactions for space radiations

NASA Technical Reports Server (NTRS)

Wilson, John W.; Townsend, Lawrence W.; Schimmerling, Walter S.; Khandelwal, Govind S.; Khan, Ferdous S.; Nealy, John E.; Cucinotta, Francis A.; Simonsen, Lisa C.; Shinn, Judy L.; Norbury, John W.

1991-01-01

A review of the program in space radiation protection at the Langley Research Center is given. The relevant Boltzmann equations are given with a discussion of approximation procedures for space applications. The interaction coefficients are related to solution of the many-body Schroedinger equation with nuclear and electromagnetic forces. Various solution techniques are discussed to obtain relevant interaction cross sections with extensive comparison with experiments. Solution techniques for the Boltzmann equations are discussed in detail. Transport computer code validation is discussed through analytical benchmarking, comparison with other codes, comparison with laboratory experiments and measurements in space. Applications to lunar and Mars missions are discussed.
Validation of tsunami inundation model TUNA-RP using OAR-PMEL-135 benchmark problem set

NASA Astrophysics Data System (ADS)

Koh, H. L.; Teh, S. Y.; Tan, W. K.; Kh'ng, X. Y.

2017-05-01

A standard set of benchmark problems, known as OAR-PMEL-135, is developed by the US National Tsunami Hazard Mitigation Program for tsunami inundation model validation. Any tsunami inundation model must be tested for its accuracy and capability using this standard set of benchmark problems before it can be gainfully used for inundation simulation. The authors have previously developed an in-house tsunami inundation model known as TUNA-RP. This inundation model solves the two-dimensional nonlinear shallow water equations coupled with a wet-dry moving boundary algorithm. This paper presents the validation of TUNA-RP against the solutions provided in the OAR-PMEL-135 benchmark problem set. This benchmark validation testing shows that TUNA-RP can indeed perform inundation simulation with accuracy consistent with that in the tested benchmark problem set.
Key performance indicators to benchmark hospital information systems - a delphi study.

PubMed

Hübner-Bloder, G; Ammenwerth, E

2009-01-01

To identify the key performance indicators for hospital information systems (HIS) that can be used for HIS benchmarking. A Delphi survey with one qualitative and two quantitative rounds. Forty-four HIS experts from health care IT practice and academia participated in all three rounds. Seventy-seven performance indicators were identified and organized into eight categories: technical quality, software quality, architecture and interface quality, IT vendor quality, IT support and IT department quality, workflow support quality, IT outcome quality, and IT costs. The highest ranked indicators are related to clinical workflow support and user satisfaction. Isolated technical indicators or cost indicators were not seen as useful. The experts favored an interdisciplinary group of all the stakeholders, led by hospital management, to conduct the HIS benchmarking. They proposed benchmarking activities both in regular (annual) intervals as well as at defined events (for example after IT introduction). Most of the experts stated that in their institutions no HIS benchmarking activities are being performed at the moment. In the context of IT governance, IT benchmarking is gaining importance in the healthcare area. The found indicators reflect the view of health care IT professionals and researchers. Research is needed to further validate and operationalize key performance indicators, to provide an IT benchmarking framework, and to provide open repositories for a comparison of the HIS benchmarks of different hospitals.
Implementation and validation of a conceptual benchmarking framework for patient blood management.

PubMed

Kastner, Peter; Breznik, Nada; Gombotz, Hans; Hofmann, Axel; Schreier, Günter

2015-01-01

Public health authorities and healthcare professionals are obliged to ensure high quality health service. Because of the high variability of the utilisation of blood and blood components, benchmarking is indicated in transfusion medicine. Implementation and validation of a benchmarking framework for Patient Blood Management (PBM) based on the report from the second Austrian Benchmark trial. Core modules for automatic report generation have been implemented with KNIME (Konstanz Information Miner) and validated by comparing the output with the results of the second Austrian benchmark trial. Delta analysis shows a deviation <0.1% for 95% (max. 1.4%). The framework provides a reliable tool for PBM benchmarking. The next step is technical integration with hospital information systems.
The InterFrost benchmark of Thermo-Hydraulic codes for cold regions hydrology - first inter-comparison results

NASA Astrophysics Data System (ADS)

Grenier, Christophe; Roux, Nicolas; Anbergen, Hauke; Collier, Nathaniel; Costard, Francois; Ferrry, Michel; Frampton, Andrew; Frederick, Jennifer; Holmen, Johan; Jost, Anne; Kokh, Samuel; Kurylyk, Barret; McKenzie, Jeffrey; Molson, John; Orgogozo, Laurent; Rivière, Agnès; Rühaak, Wolfram; Selroos, Jan-Olof; Therrien, René; Vidstrand, Patrik

2015-04-01

The impacts of climate change in boreal regions has received considerable attention recently due to the warming trends that have been experienced in recent decades and are expected to intensify in the future. Large portions of these regions, corresponding to permafrost areas, are covered by water bodies (lakes, rivers) that interact with the surrounding permafrost. For example, the thermal state of the surrounding soil influences the energy and water budget of the surface water bodies. Also, these water bodies generate taliks (unfrozen zones below) that disturb the thermal regimes of permafrost and may play a key role in the context of climate change. Recent field studies and modeling exercises indicate that a fully coupled 2D or 3D Thermo-Hydraulic (TH) approach is required to understand and model the past and future evolution of landscapes, rivers, lakes and associated groundwater systems in a changing climate. However, there is presently a paucity of 3D numerical studies of permafrost thaw and associated hydrological changes, and the lack of study can be partly attributed to the difficulty in verifying multi-dimensional results produced by numerical models. Numerical approaches can only be validated against analytical solutions for a purely thermic 1D equation with phase change (e.g. Neumann, Lunardini). When it comes to the coupled TH system (coupling two highly non-linear equations), the only possible approach is to compare the results from different codes to provided test cases and/or to have controlled experiments for validation. Such inter-code comparisons can propel discussions to try to improve code performances. A benchmark exercise was initialized in 2014 with a kick-off meeting in Paris in November. Participants from USA, Canada, Germany, Sweden and France convened, representing altogether 13 simulation codes. The benchmark exercises consist of several test cases inspired by existing literature (e.g. McKenzie et al., 2007) as well as new ones. They range from simpler, purely thermal cases (benchmark T1) to more complex, coupled 2D TH cases (benchmarks TH1, TH2, and TH3). Some experimental cases conducted in cold room complement the validation approach. A web site hosted by LSCE (Laboratoire des Sciences du Climat et de l'Environnement) is an interaction platform for the participants and hosts the test cases database at the following address: https://wiki.lsce.ipsl.fr/interfrost. The results of the first stage of the benchmark exercise will be presented. We will mainly focus on the inter-comparison of participant results for the coupled cases (TH1, TH2 & TH3). Further perspectives of the exercise will also be presented. Extensions to more complex physical conditions (e.g. unsaturated conditions and geometrical deformations) are contemplated. In addition, 1D vertical cases of interest to the Climate Modeling community will be proposed. Keywords: Permafrost; Numerical modeling; River-soil interaction; Arctic systems; soil freeze-thaw
Uncertainty in Earth System Models: Benchmarks for Ocean Model Performance and Validation

NASA Astrophysics Data System (ADS)

Ogunro, O. O.; Elliott, S.; Collier, N.; Wingenter, O. W.; Deal, C.; Fu, W.; Hoffman, F. M.

2017-12-01

The mean ocean CO2 sink is a major component of the global carbon budget, with marine reservoirs holding about fifty times more carbon than the atmosphere. Phytoplankton play a significant role in the net carbon sink through photosynthesis and drawdown, such that about a quarter of anthropogenic CO2 emissions end up in the ocean. Biology greatly increases the efficiency of marine environments in CO2 uptake and ultimately reduces the impact of the persistent rise in atmospheric concentrations. However, a number of challenges remain in appropriate representation of marine biogeochemical processes in Earth System Models (ESM). These threaten to undermine the community effort to quantify seasonal to multidecadal variability in ocean uptake of atmospheric CO2. In a bid to improve analyses of marine contributions to climate-carbon cycle feedbacks, we have developed new analysis methods and biogeochemistry metrics as part of the International Ocean Model Benchmarking (IOMB) effort. Our intent is to meet the growing diagnostic and benchmarking needs of ocean biogeochemistry models. The resulting software package has been employed to validate DOE ocean biogeochemistry results by comparison with observational datasets. Several other international ocean models contributing results to the fifth phase of the Coupled Model Intercomparison Project (CMIP5) were analyzed simultaneously. Our comparisons suggest that the biogeochemical processes determining CO2 entry into the global ocean are not well represented in most ESMs. Polar regions continue to show notable biases in many critical biogeochemical and physical oceanographic variables. Some of these disparities could have first order impacts on the conversion of atmospheric CO2 to organic carbon. In addition, single forcing simulations show that the current ocean state can be partly explained by the uptake of anthropogenic emissions. Combined effects of two or more of these forcings on ocean biogeochemical cycles and ecosystems are challenging to predict since additive or antagonistic effects may occur. A benchmarking tool for accurate assessment and validation of marine biogeochemical outputs will be indispensable as the model community continues to improve ESM developments. It will provide a first order tool in understanding climate-carbon cycle feedbacks.
Reverse Engineering Validation using a Benchmark Synthetic Gene Circuit in Human Cells

PubMed Central

Kang, Taek; White, Jacob T.; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas

2013-01-01

Multi-component biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network. PMID:23654266
Reverse engineering validation using a benchmark synthetic gene circuit in human cells.

PubMed

Kang, Taek; White, Jacob T; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas

2013-05-17

Multicomponent biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network.
Assessing Ecosystem Model Performance in Semiarid Systems

NASA Astrophysics Data System (ADS)

Thomas, A.; Dietze, M.; Scott, R. L.; Biederman, J. A.

2017-12-01

In ecosystem process modelling, comparing outputs to benchmark datasets observed in the field is an important way to validate models, allowing the modelling community to track model performance over time and compare models at specific sites. Multi-model comparison projects as well as models themselves have largely been focused on temperate forests and similar biomes. Semiarid regions, on the other hand, are underrepresented in land surface and ecosystem modelling efforts, and yet will be disproportionately impacted by disturbances such as climate change due to their sensitivity to changes in the water balance. Benchmarking models at semiarid sites is an important step in assessing and improving models' suitability for predicting the impact of disturbance on semiarid ecosystems. In this study, several ecosystem models were compared at a semiarid grassland in southwestern Arizona using PEcAn, or the Predictive Ecosystem Analyzer, an open-source eco-informatics toolbox ideal for creating the repeatable model workflows necessary for benchmarking. Models included SIPNET, DALEC, JULES, ED2, GDAY, LPJ-GUESS, MAESPA, CLM, CABLE, and FATES. Comparison between model output and benchmarks such as net ecosystem exchange (NEE) tended to produce high root mean square error and low correlation coefficients, reflecting poor simulation of seasonality and the tendency for models to create much higher carbon sources than observed. These results indicate that ecosystem models do not currently adequately represent semiarid ecosystem processes.
A Machine-to-Machine protocol benchmark for eHealth applications - Use case: Respiratory rehabilitation.

PubMed

Talaminos-Barroso, Alejandro; Estudillo-Valderrama, Miguel A; Roa, Laura M; Reina-Tosina, Javier; Ortega-Ruiz, Francisco

2016-06-01

M2M (Machine-to-Machine) communications represent one of the main pillars of the new paradigm of the Internet of Things (IoT), and is making possible new opportunities for the eHealth business. Nevertheless, the large number of M2M protocols currently available hinders the election of a suitable solution that satisfies the requirements that can demand eHealth applications. In the first place, to develop a tool that provides a benchmarking analysis in order to objectively select among the most relevant M2M protocols for eHealth solutions. In the second place, to validate the tool with a particular use case: the respiratory rehabilitation. A software tool, called Distributed Computing Framework (DFC), has been designed and developed to execute the benchmarking tests and facilitate the deployment in environments with a large number of machines, with independence of the protocol and performance metrics selected. DDS, MQTT, CoAP, JMS, AMQP and XMPP protocols were evaluated considering different specific performance metrics, including CPU usage, memory usage, bandwidth consumption, latency and jitter. The results obtained allowed to validate a case of use: respiratory rehabilitation of chronic obstructive pulmonary disease (COPD) patients in two scenarios with different types of requirement: Home-Based and Ambulatory. The results of the benchmark comparison can guide eHealth developers in the choice of M2M technologies. In this regard, the framework presented is a simple and powerful tool for the deployment of benchmark tests under specific environments and conditions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Developing integrated benchmarks for DOE performance measurement

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barancik, J.I.; Kramer, C.F.; Thode, Jr. H.C.

1992-09-30

The objectives of this task were to describe and evaluate selected existing sources of information on occupational safety and health with emphasis on hazard and exposure assessment, abatement, training, reporting, and control identifying for exposure and outcome in preparation for developing DOE performance benchmarks. Existing resources and methodologies were assessed for their potential use as practical performance benchmarks. Strengths and limitations of current data resources were identified. Guidelines were outlined for developing new or improved performance factors, which then could become the basis for selecting performance benchmarks. Data bases for non-DOE comparison populations were identified so that DOE performance couldmore » be assessed relative to non-DOE occupational and industrial groups. Systems approaches were described which can be used to link hazards and exposure, event occurrence, and adverse outcome factors, as needed to generate valid, reliable, and predictive performance benchmarks. Data bases were identified which contain information relevant to one or more performance assessment categories . A list of 72 potential performance benchmarks was prepared to illustrate the kinds of information that can be produced through a benchmark development program. Current information resources which may be used to develop potential performance benchmarks are limited. There is need to develop an occupational safety and health information and data system in DOE, which is capable of incorporating demonstrated and documented performance benchmarks prior to, or concurrent with the development of hardware and software. A key to the success of this systems approach is rigorous development and demonstration of performance benchmark equivalents to users of such data before system hardware and software commitments are institutionalized.« less
Methodology and issues of integral experiments selection for nuclear data validation

NASA Astrophysics Data System (ADS)

Tatiana, Ivanova; Ivanov, Evgeny; Hill, Ian

2017-09-01

Nuclear data validation involves a large suite of Integral Experiments (IEs) for criticality, reactor physics and dosimetry applications. [1] Often benchmarks are taken from international Handbooks. [2, 3] Depending on the application, IEs have different degrees of usefulness in validation, and usually the use of a single benchmark is not advised; indeed, it may lead to erroneous interpretation and results. [1] This work aims at quantifying the importance of benchmarks used in application dependent cross section validation. The approach is based on well-known General Linear Least Squared Method (GLLSM) extended to establish biases and uncertainties for given cross sections (within a given energy interval). The statistical treatment results in a vector of weighting factors for the integral benchmarks. These factors characterize the value added by a benchmark for nuclear data validation for the given application. The methodology is illustrated by one example, selecting benchmarks for 239Pu cross section validation. The studies were performed in the framework of Subgroup 39 (Methods and approaches to provide feedback from nuclear and covariance data adjustment for improvement of nuclear data files) established at the Working Party on International Nuclear Data Evaluation Cooperation (WPEC) of the Nuclear Science Committee under the Nuclear Energy Agency (NEA/OECD).
Aerodynamic Validation of Emerging Projectile Configurations

DTIC Science & Technology

2011-12-01

was benchmarked against modern aerodynamic prediction programs like ANSYS CFX and Aero-Prediction 09 (AP09). Next, a comparison was made between two...types of angle of attack generation methods in ANSYS CFX . The research then focused on controlled tilting of the projectile’s nose to investigate the...resulting aerodynamic effects. ANSYS CFX was found to provide better agreement with the experimental data than AP09. 14. SUBJECT
Validation of Tendril TrueHome Using Software-to-Software Comparison

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maguire, Jeffrey B; Horowitz, Scott G; Moore, Nathan

This study performed comparative evaluation of EnergyPlus version 8.6 and Tendril TrueHome, two physics-based home energy simulation models, to identify differences in energy consumption predictions between the two programs and resolve discrepancies between them. EnergyPlus is considered a benchmark, best-in-class software tool for building energy simulation. This exercise sought to improve both software tools through additional evaluation/scrutiny.
DSMC Simulations of Hypersonic Flows and Comparison With Experiments

NASA Technical Reports Server (NTRS)

Moss, James N.; Bird, Graeme A.; Markelov, Gennady N.

2004-01-01

This paper presents computational results obtained with the direct simulation Monte Carlo (DSMC) method for several biconic test cases in which shock interactions and flow separation-reattachment are key features of the flow. Recent ground-based experiments have been performed for several biconic configurations, and surface heating rate and pressure measurements have been proposed for code validation studies. The present focus is to expand on the current validating activities for a relatively new DSMC code called DS2V that Bird (second author) has developed. Comparisons with experiments and other computations help clarify the agreement currently being achieved between computations and experiments and to identify the range of measurement variability of the proposed validation data when benchmarked with respect to the current computations. For the test cases with significant vibrational nonequilibrium, the effect of the vibrational energy surface accommodation on heating and other quantities is demonstrated.
The development of a virtual reality training curriculum for colonoscopy.

PubMed

Sugden, Colin; Aggarwal, Rajesh; Banerjee, Amrita; Haycock, Adam; Thomas-Gibson, Siwan; Williams, Christopher B; Darzi, Ara

2012-07-01

The development of a structured virtual reality (VR) training curriculum for colonoscopy using high-fidelity simulation. Colonoscopy requires detailed knowledge and technical skill. Changes to working practices in recent times have reduced the availability of traditional training opportunities. Much might, therefore, be achieved by applying novel technologies such as VR simulation to colonoscopy. Scientifically developed device-specific curricula aim to maximize the yield of laboratory-based training by focusing on validated modules and linking progression to the attainment of benchmarked proficiency criteria. Fifty participants comprised of 30 novices (<10 colonoscopies), 10 intermediates (100 to 500 colonoscopies), and 10 experienced (>500 colonoscopies) colonoscopists were recruited to participate. Surrogates of proficiency, such as number of procedures undertaken, determined prospective allocation to 1 of 3 groups (novice, intermediate, and experienced). Construct validity and learning value (comparison between groups and within groups respectively) for each task and metric on the chosen simulator model determined suitability for inclusion in the curriculum. Eight tasks in possession of construct validity and significant learning curves were included in the curriculum: 3 abstract tasks, 4 part-procedural tasks, and 1 procedural task. The whole-procedure task was valid for 11 metrics including the following: "time taken to complete the task" (1238, 343, and 293 s; P < 0.001) and "insertion length with embedded tip" (23.8, 3.6, and 4.9 cm; P = 0.005). Learning curves consistently plateaued at or beyond the ninth attempt. Valid metrics were used to define benchmarks, derived from the performance of the experienced cohort, for each included task. A comprehensive, stratified, benchmarked, whole-procedure curriculum has been developed for a modern high-fidelity VR colonoscopy simulator.
Estimation of hand hygiene opportunities on an adult medical ward using 24-hour camera surveillance: validation of the HOW2 Benchmark Study.

PubMed

Diller, Thomas; Kelly, J William; Blackhurst, Dawn; Steed, Connie; Boeker, Sue; McElveen, Danielle C

2014-06-01

We previously published a formula to estimate the number of hand hygiene opportunities (HHOs) per patient-day using the World Health Organization's "Five Moments for Hand Hygiene" methodology (HOW2 Benchmark Study). HHOs can be used as a denominator for calculating hand hygiene compliance rates when product utilization data are available. This study validates the previously derived HHO estimate using 24-hour video surveillance of health care worker hand hygiene activity. The validation study utilized 24-hour video surveillance recordings of 26 patients' hospital stays to measure the actual number of HHOs per patient-day on a medicine ward in a large teaching hospital. Statistical methods were used to compare these results to those obtained by episodic observation of patient activity in the original derivation study. Total hours of data collection were 81.3 and 1,510.8, resulting in 1,740 and 4,522 HHOs in the derivation and validation studies, respectively. Comparisons of the mean and median HHOs per 24-hour period did not differ significantly. HHOs were 71.6 (95% confidence interval: 64.9-78.3) and 73.9 (95% confidence interval: 69.1-84.1), respectively. This study validates the HOW2 Benchmark Study and confirms that expected numbers of HHOs can be estimated from the unit's patient census and patient-to-nurse ratio. These data can be used as denominators in calculations of hand hygiene compliance rates from electronic monitoring using the "Five Moments for Hand Hygiene" methodology. Copyright © 2014 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Mosby, Inc. All rights reserved.
Radiant Energy Measurements from a Scaled Jet Engine Axisymmetric Exhaust Nozzle for a Baseline Code Validation Case

NASA Technical Reports Server (NTRS)

Baumeister, Joseph F.

1994-01-01

A non-flowing, electrically heated test rig was developed to verify computer codes that calculate radiant energy propagation from nozzle geometries that represent aircraft propulsion nozzle systems. Since there are a variety of analysis tools used to evaluate thermal radiation propagation from partially enclosed nozzle surfaces, an experimental benchmark test case was developed for code comparison. This paper briefly describes the nozzle test rig and the developed analytical nozzle geometry used to compare the experimental and predicted thermal radiation results. A major objective of this effort was to make available the experimental results and the analytical model in a format to facilitate conversion to existing computer code formats. For code validation purposes this nozzle geometry represents one validation case for one set of analysis conditions. Since each computer code has advantages and disadvantages based on scope, requirements, and desired accuracy, the usefulness of this single nozzle baseline validation case can be limited for some code comparisons.
Benchmark Campaign of the COST Action GNSS4SWEC: Main Goals and Achievements

NASA Astrophysics Data System (ADS)

Dick, G.; Dousa, J.; Kacmarik, M.; Pottiaux, E.; Zus, F.; Brenot, H. H.; Moeller, G.; Kaplon, J.; Morel, L.; Hordyniec, P.

2016-12-01

This talk will give an overview of achievements of the Benchmark campaign, one of the central activities in the framework of the COST Action ES 1206 GNSS4SWEC. The main goal of the campaign is supporting the development and validation of advanced Global Navigation Satellite System (GNSS) tropospheric products, in particular high-resolution and ultra-fast/real-time zenith total delays (ZTD) and asymmetry products in terms of tropospheric horizontal gradients and slant delays.For the Benchmark campaign a complex data set of GNSS observations and various meteorological data were collected for a two-month period in 2013 (May-June) which included severe weather events in central Europe. An initial processing of data sets from GNSS and numerical weather models (NWM) provided independently estimated tropospheric reference products - ZTDs, tropospheric horizontal gradients and others. The comparison of horizontal tropospheric gradients from GNSS and NWM data demonstrated a very good agreement among independent solutions with negligible biases and an accuracy of about 0.5 mm. Visual comparisons of maps of zenith wet delays and tropospheric horizontal gradients showed very promising results for future exploitations of advanced GNSS tropospheric products in meteorological applications such as severe weather event monitoring and weather nowcasting.The benchmark data set is also used for an extensive validation of line-of-sight tropospheric Slant Total Delays (STD) from GNSS, NWM-raytracing and Water Vapour Radiometer (WVR) solutions. Six institutions delivered their STDs based on GNSS observations processed using different software and strategies. STDs from NWM ray-tracing came from three institutions using three different NWM models. Results show generally a very good mutual agreement among all solutions from all techniques. Among all an influence of adding not cleaned as well as cleaned GNSS post-fit residuals, i.e. residuals with eliminated and not eliminated non-tropospheric systematic effects such as multipath, to estimated STDs will be presented.
Watershed Regressions for Pesticides (WARP) for Predicting Annual Maximum and Annual Maximum Moving-Average Concentrations of Atrazine in Streams

USGS Publications Warehouse

Stone, Wesley W.; Gilliom, Robert J.; Crawford, Charles G.

2008-01-01

Regression models were developed for predicting annual maximum and selected annual maximum moving-average concentrations of atrazine in streams using the Watershed Regressions for Pesticides (WARP) methodology developed by the National Water-Quality Assessment Program (NAWQA) of the U.S. Geological Survey (USGS). The current effort builds on the original WARP models, which were based on the annual mean and selected percentiles of the annual frequency distribution of atrazine concentrations. Estimates of annual maximum and annual maximum moving-average concentrations for selected durations are needed to characterize the levels of atrazine and other pesticides for comparison to specific water-quality benchmarks for evaluation of potential concerns regarding human health or aquatic life. Separate regression models were derived for the annual maximum and annual maximum 21-day, 60-day, and 90-day moving-average concentrations. Development of the regression models used the same explanatory variables, transformations, model development data, model validation data, and regression methods as those used in the original development of WARP. The models accounted for 72 to 75 percent of the variability in the concentration statistics among the 112 sampling sites used for model development. Predicted concentration statistics from the four models were within a factor of 10 of the observed concentration statistics for most of the model development and validation sites. Overall, performance of the models for the development and validation sites supports the application of the WARP models for predicting annual maximum and selected annual maximum moving-average atrazine concentration in streams and provides a framework to interpret the predictions in terms of uncertainty. For streams with inadequate direct measurements of atrazine concentrations, the WARP model predictions for the annual maximum and the annual maximum moving-average atrazine concentrations can be used to characterize the probable levels of atrazine for comparison to specific water-quality benchmarks. Sites with a high probability of exceeding a benchmark for human health or aquatic life can be prioritized for monitoring.

The MCNP6 Analytic Criticality Benchmark Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B.

2016-06-16

Analytical benchmarks provide an invaluable tool for verifying computer codes used to simulate neutron transport. Several collections of analytical benchmark problems [1-4] are used routinely in the verification of production Monte Carlo codes such as MCNP® [5,6]. Verification of a computer code is a necessary prerequisite to the more complex validation process. The verification process confirms that a code performs its intended functions correctly. The validation process involves determining the absolute accuracy of code results vs. nature. In typical validations, results are computed for a set of benchmark experiments using a particular methodology (code, cross-section data with uncertainties, and modeling)more » and compared to the measured results from the set of benchmark experiments. The validation process determines bias, bias uncertainty, and possibly additional margins. Verification is generally performed by the code developers, while validation is generally performed by code users for a particular application space. The VERIFICATION_KEFF suite of criticality problems [1,2] was originally a set of 75 criticality problems found in the literature for which exact analytical solutions are available. Even though the spatial and energy detail is necessarily limited in analytical benchmarks, typically to a few regions or energy groups, the exact solutions obtained can be used to verify that the basic algorithms, mathematics, and methods used in complex production codes perform correctly. The present work has focused on revisiting this benchmark suite. A thorough review of the problems resulted in discarding some of them as not suitable for MCNP benchmarking. For the remaining problems, many of them were reformulated to permit execution in either multigroup mode or in the normal continuous-energy mode for MCNP. Execution of the benchmarks in continuous-energy mode provides a significant advance to MCNP verification methods.« less
Benchmarking and validation activities within JEFF project

NASA Astrophysics Data System (ADS)

Cabellos, O.; Alvarez-Velarde, F.; Angelone, M.; Diez, C. J.; Dyrda, J.; Fiorito, L.; Fischer, U.; Fleming, M.; Haeck, W.; Hill, I.; Ichou, R.; Kim, D. H.; Klix, A.; Kodeli, I.; Leconte, P.; Michel-Sendis, F.; Nunnenmann, E.; Pecchia, M.; Peneliau, Y.; Plompen, A.; Rochman, D.; Romojaro, P.; Stankovskiy, A.; Sublet, J. Ch.; Tamagno, P.; Marck, S. van der

2017-09-01

The challenge for any nuclear data evaluation project is to periodically release a revised, fully consistent and complete library, with all needed data and covariances, and ensure that it is robust and reliable for a variety of applications. Within an evaluation effort, benchmarking activities play an important role in validating proposed libraries. The Joint Evaluated Fission and Fusion (JEFF) Project aims to provide such a nuclear data library, and thus, requires a coherent and efficient benchmarking process. The aim of this paper is to present the activities carried out by the new JEFF Benchmarking and Validation Working Group, and to describe the role of the NEA Data Bank in this context. The paper will also review the status of preliminary benchmarking for the next JEFF-3.3 candidate cross-section files.
The Isprs Benchmark on Indoor Modelling

NASA Astrophysics Data System (ADS)

Khoshelham, K.; Díaz Vilariño, L.; Peter, M.; Kang, Z.; Acharya, D.

2017-09-01

Automated generation of 3D indoor models from point cloud data has been a topic of intensive research in recent years. While results on various datasets have been reported in literature, a comparison of the performance of different methods has not been possible due to the lack of benchmark datasets and a common evaluation framework. The ISPRS benchmark on indoor modelling aims to address this issue by providing a public benchmark dataset and an evaluation framework for performance comparison of indoor modelling methods. In this paper, we present the benchmark dataset comprising several point clouds of indoor environments captured by different sensors. We also discuss the evaluation and comparison of indoor modelling methods based on manually created reference models and appropriate quality evaluation criteria. The benchmark dataset is available for download at: http://www2.isprs.org/commissions/comm4/wg5/benchmark-on-indoor-modelling.html.
TRUST. I. A 3D externally illuminated slab benchmark for dust radiative transfer

NASA Astrophysics Data System (ADS)

Gordon, K. D.; Baes, M.; Bianchi, S.; Camps, P.; Juvela, M.; Kuiper, R.; Lunttila, T.; Misselt, K. A.; Natale, G.; Robitaille, T.; Steinacker, J.

2017-07-01

Context. The radiative transport of photons through arbitrary three-dimensional (3D) structures of dust is a challenging problem due to the anisotropic scattering of dust grains and strong coupling between different spatial regions. The radiative transfer problem in 3D is solved using Monte Carlo or Ray Tracing techniques as no full analytic solution exists for the true 3D structures. Aims: We provide the first 3D dust radiative transfer benchmark composed of a slab of dust with uniform density externally illuminated by a star. This simple 3D benchmark is explicitly formulated to provide tests of the different components of the radiative transfer problem including dust absorption, scattering, and emission. Methods: The details of the external star, the slab itself, and the dust properties are provided. This benchmark includes models with a range of dust optical depths fully probing cases that are optically thin at all wavelengths to optically thick at most wavelengths. The dust properties adopted are characteristic of the diffuse Milky Way interstellar medium. This benchmark includes solutions for the full dust emission including single photon (stochastic) heating as well as two simplifying approximations: One where all grains are considered in equilibrium with the radiation field and one where the emission is from a single effective grain with size-distribution-averaged properties. A total of six Monte Carlo codes and one Ray Tracing code provide solutions to this benchmark. Results: The solution to this benchmark is given as global spectral energy distributions (SEDs) and images at select diagnostic wavelengths from the ultraviolet through the infrared. Comparison of the results revealed that the global SEDs are consistent on average to a few percent for all but the scattered stellar flux at very high optical depths. The image results are consistent within 10%, again except for the stellar scattered flux at very high optical depths. The lack of agreement between different codes of the scattered flux at high optical depths is quantified for the first time. Convergence tests using one of the Monte Carlo codes illustrate the sensitivity of the solutions to various model parameters. Conclusions: We provide the first 3D dust radiative transfer benchmark and validate the accuracy of this benchmark through comparisons between multiple independent codes and detailed convergence tests.
Preliminary Results for the OECD/NEA Time Dependent Benchmark using Rattlesnake, Rattlesnake-IQS and TDKENO

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeHart, Mark D.; Mausolff, Zander; Weems, Zach

2016-08-01

One goal of the MAMMOTH M&S project is to validate the analysis capabilities within MAMMOTH. Historical data has shown limited value for validation of full three-dimensional (3D) multi-physics methods. Initial analysis considered the TREAT startup minimum critical core and one of the startup transient tests. At present, validation is focusing on measurements taken during the M8CAL test calibration series. These exercises will valuable in preliminary assessment of the ability of MAMMOTH to perform coupled multi-physics calculations; calculations performed to date are being used to validate the neutron transport solver Rattlesnake\\cite{Rattlesnake} and the fuels performance code BISON. Other validation projects outsidemore » of TREAT are available for single-physics benchmarking. Because the transient solution capability of Rattlesnake is one of the key attributes that makes it unique for TREAT transient simulations, validation of the transient solution of Rattlesnake using other time dependent kinetics benchmarks has considerable value. The Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) has recently developed a computational benchmark for transient simulations. This benchmark considered both two-dimensional (2D) and 3D configurations for a total number of 26 different transients. All are negative reactivity insertions, typically returning to the critical state after some time.« less
Navier-Stokes analysis of a liquid rocket engine disk cavity

NASA Technical Reports Server (NTRS)

Benjamin, Theodore G.; Mcconnaughey, Paul K.

1991-01-01

This paper presents a Navier-Stokes analysis of hydrodynamic phenomena occurring in the aft disk cavity of a liquid rocket engine turbine. The cavity analyzed in the Space Shuttle Main Engine Alternate Turbopump currently being developed by NASA and Pratt and Whitney. Comparison of results obtained from the Navier-Stokes code for two rotating disk datasets available in the literature are presented as benchmark validations. The benchmark results obtained using the code show good agreement relative to experimental data, and the turbine disk cavity was analyzed with comparable grid resolution, dissipation levels, and turbulence models. Predicted temperatures in the cavity show that little mixing of hot and cold fluid occurs in the cavity and the flow is dominated by swirl and pumping up the rotating disk.
All inclusive benchmarking.

PubMed

Ellis, Judith

2006-07-01

The aim of this article is to review published descriptions of benchmarking activity and synthesize benchmarking principles to encourage the acceptance and use of Essence of Care as a new benchmarking approach to continuous quality improvement, and to promote its acceptance as an integral and effective part of benchmarking activity in health services. The Essence of Care, was launched by the Department of Health in England in 2001 to provide a benchmarking tool kit to support continuous improvement in the quality of fundamental aspects of health care, for example, privacy and dignity, nutrition and hygiene. The tool kit is now being effectively used by some frontline staff. However, use is inconsistent, with the value of the tool kit, or the support clinical practice benchmarking requires to be effective, not always recognized or provided by National Health Service managers, who are absorbed with the use of quantitative benchmarking approaches and measurability of comparative performance data. This review of published benchmarking literature, was obtained through an ever-narrowing search strategy commencing from benchmarking within quality improvement literature through to benchmarking activity in health services and including access to not only published examples of benchmarking approaches and models used but the actual consideration of web-based benchmarking data. This supported identification of how benchmarking approaches have developed and been used, remaining true to the basic benchmarking principles of continuous improvement through comparison and sharing (Camp 1989). Descriptions of models and exemplars of quantitative and specifically performance benchmarking activity in industry abound (Camp 1998), with far fewer examples of more qualitative and process benchmarking approaches in use in the public services and then applied to the health service (Bullivant 1998). The literature is also in the main descriptive in its support of the effectiveness of benchmarking activity and although this does not seem to have restricted its popularity in quantitative activity, reticence about the value of the more qualitative approaches, for example Essence of Care, needs to be overcome in order to improve the quality of patient care and experiences. The perceived immeasurability and subjectivity of Essence of Care and clinical practice benchmarks means that these benchmarking approaches are not always accepted or supported by health service organizations as valid benchmarking activity. In conclusion, Essence of Care benchmarking is a sophisticated clinical practice benchmarking approach which needs to be accepted as an integral part of health service benchmarking activity to support improvement in the quality of patient care and experiences.
Performance comparison of SNP detection tools with illumina exome sequencing data—an assessment using both family pedigree information and sample-matched SNP array data

PubMed Central

Yi, Ming; Zhao, Yongmei; Jia, Li; He, Mei; Kebebew, Electron; Stephens, Robert M.

2014-01-01

To apply exome-seq-derived variants in the clinical setting, there is an urgent need to identify the best variant caller(s) from a large collection of available options. We have used an Illumina exome-seq dataset as a benchmark, with two validation scenarios—family pedigree information and SNP array data for the same samples, permitting global high-throughput cross-validation, to evaluate the quality of SNP calls derived from several popular variant discovery tools from both the open-source and commercial communities using a set of designated quality metrics. To the best of our knowledge, this is the first large-scale performance comparison of exome-seq variant discovery tools using high-throughput validation with both Mendelian inheritance checking and SNP array data, which allows us to gain insights into the accuracy of SNP calling through such high-throughput validation in an unprecedented way, whereas the previously reported comparison studies have only assessed concordance of these tools without directly assessing the quality of the derived SNPs. More importantly, the main purpose of our study was to establish a reusable procedure that applies high-throughput validation to compare the quality of SNP discovery tools with a focus on exome-seq, which can be used to compare any forthcoming tool(s) of interest. PMID:24831545
NIST TXR Validation of S-HIS radiances and a UW-SSEC Blackbody

NASA Astrophysics Data System (ADS)

Taylor, J. K.; O'Connell, J.; Rice, J. P.; Revercomb, H. E.; Best, F. A.; Tobin, D. C.; Knuteson, R. O.; Adler, D. P.; Ciganovich, N. C.; Dutcher, S. T.; Laporte, D. D.; Ellington, S. D.; Werner, M. W.; Garcia, R. K.

2007-12-01

The ability to accurately validate infrared spectral radiances measured from space by direct comparison with airborne spectrometer radiances was first demonstrated using the Scanning High-resolution Interferometer Sounder (S-HIS) aircraft instrument flown under the AIRS on the NASA Aqua spacecraft in 2002 with subsequent successful comparisons in 2004 and 2006. The comparisons span a range of conditions, including arctic and tropical atmospheres, daytime and nighttime, and ocean and land surfaces. Similar comprehensive and successful comparisons have also been conducted with S-HIS for the MODIS sensors, the Tropospheric Emission Spectrometer (TES), and most recently the MetOp Infrared Atmospheric Sounding Interferometer (IASI). These comparisons are part of a larger picture that already shows great progress toward transforming our ability to make, and verify, highly accurate spectral radiance observations from space. A key challenge, especially for climate, is to carefully define the absolute accuracy of satellite radiances. Our vision of the near-term future of spectrally resolved infrared radiance observation includes a new space-borne mission that provides benchmark observations of the emission spectrum for climate. This concept, referred to as the CLimate Absolute Radiance and REfractivity Observatory (CLARREO) in the recent NRC Decadal Survey provides more complete spectral and time-of-day coverage and would fly basic physical standards to eliminate the need to assume on-board reference stability. Therefore, the spectral radiances from this mission will also serve as benchmarks to propagate a highly accurate calibration to other space-borne IR instruments. For the current approach of calibrating infrared flight sensors, in which thermal vacuum tests are conducted before launch and stability is assumed after launch, in-flight calibration validation is essential for highly accurate applications. At present, airborne observations provide the only source of direct radiance validation with resulting traceable uncertainties approaching the level required for remote sensing and climate applications (0.1 K 3- sigma). For the calibration validation process to be accurate, repeatable, and meaningful, the reference instrument must be extremely well characterized and understood, carefully maintained, and accurately calibrated, with the calibration accuracy of the reference instrument tied to absolute standards. Tests of the S-HIS absolute calibration have been conducted using the NIST transfer radiometer (TXR). The TXR provides a more direct connection to the Blackbody reference sources maintained by NIST than the normal traceability of blackbody temperature scales and paint emissivity measurements. Two basic tests were conducted: (1) comparison of radiances measured by the S-HIS to those from the TXR, and (2) measuring the reflectivity of a UW-SSEC blackbody by using the TXR as a stable detector. Preliminary results from both tests are very promising for confirming and refining the expected absolute accuracy of the S-HIS.
Electric Power Consumption Coefficients for U.S. Industries: Regional Estimation and Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boero, Riccardo

Economic activity relies on electric power provided by electrical generation, transmission, and distribution systems. This paper presents a method developed at Los Alamos National Laboratory to estimate electric power consumption by different industries in the United States. Results are validated through comparisons with existing literature and benchmarking data sources. We also discuss the limitations and applications of the presented method, such as estimating indirect electric power consumption and assessing the economic impact of power outages based on input-output economic models.
Comparison/Validation Study of Lattice Boltzmann and Navier Stokes for Various Benchmark Applications: Report 1 in Discrete Nano-Scale Mechanics and Simulations Series

DTIC Science & Technology

2014-09-15

solver, OpenFOAM version 2.1.‡ In particular, the incompressible laminar flow equations (Eq. 6-8) were solved in conjunction with the pressure im- plicit...central differencing and upwinding schemes, respectively. Since the OpenFOAM code is inherently transient, steady-state conditions were ob- tained...collaborative effort between Kitware and Los Alamos National Laboratory. ‡ OpenFOAM is a free, open-source computational fluid dynamics software developed
Comparison of the CENTRM resonance processor to the NITAWL resonance processor in SCALE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hollenbach, D.F.; Petrie, L.M.

1998-01-01

This report compares the MTAWL and CENTRM resonance processors in the SCALE code system. The cases examined consist of the International OECD/NEA Criticality Working Group Benchmark 20 problem. These cases represent fuel pellets partially dissolved in a borated solution. The assumptions inherent to the Nordheim Integral Treatment, used in MTAWL, are not valid for these problems. CENTRM resolves this limitation by explicitly calculating a problem dependent point flux from point cross sections, which is then used to create group cross sections.
Integral Full Core Multi-Physics PWR Benchmark with Measured Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Forget, Benoit; Smith, Kord; Kumar, Shikhar

In recent years, the importance of modeling and simulation has been highlighted extensively in the DOE research portfolio with concrete examples in nuclear engineering with the CASL and NEAMS programs. These research efforts and similar efforts worldwide aim at the development of high-fidelity multi-physics analysis tools for the simulation of current and next-generation nuclear power reactors. Like all analysis tools, verification and validation is essential to guarantee proper functioning of the software and methods employed. The current approach relies mainly on the validation of single physic phenomena (e.g. critical experiment, flow loops, etc.) and there is a lack of relevantmore » multiphysics benchmark measurements that are necessary to validate high-fidelity methods being developed today. This work introduces a new multi-cycle full-core Pressurized Water Reactor (PWR) depletion benchmark based on two operational cycles of a commercial nuclear power plant that provides a detailed description of fuel assemblies, burnable absorbers, in-core fission detectors, core loading and re-loading patterns. This benchmark enables analysts to develop extremely detailed reactor core models that can be used for testing and validation of coupled neutron transport, thermal-hydraulics, and fuel isotopic depletion. The benchmark also provides measured reactor data for Hot Zero Power (HZP) physics tests, boron letdown curves, and three-dimensional in-core flux maps from 58 instrumented assemblies. The benchmark description is now available online and has been used by many groups. However, much work remains to be done on the quantification of uncertainties and modeling sensitivities. This work aims to address these deficiencies and make this benchmark a true non-proprietary international benchmark for the validation of high-fidelity tools. This report details the BEAVRS uncertainty quantification for the first two cycle of operations and serves as the final report of the project.« less
Benchmarking on Tsunami Currents with ComMIT

NASA Astrophysics Data System (ADS)

Sharghi vand, N.; Kanoglu, U.

2015-12-01

There were no standards for the validation and verification of tsunami numerical models before 2004 Indian Ocean tsunami. Even, number of numerical models has been used for inundation mapping effort, evaluation of critical structures, etc. without validation and verification. After 2004, NOAA Center for Tsunami Research (NCTR) established standards for the validation and verification of tsunami numerical models (Synolakis et al. 2008 Pure Appl. Geophys. 165, 2197-2228), which will be used evaluation of critical structures such as nuclear power plants against tsunami attack. NCTR presented analytical, experimental and field benchmark problems aimed to estimate maximum runup and accepted widely by the community. Recently, benchmark problems were suggested by the US National Tsunami Hazard Mitigation Program Mapping & Modeling Benchmarking Workshop: Tsunami Currents on February 9-10, 2015 at Portland, Oregon, USA (http://nws.weather.gov/nthmp/index.html). These benchmark problems concentrated toward validation and verification of tsunami numerical models on tsunami currents. Three of the benchmark problems were: current measurement of the Japan 2011 tsunami in Hilo Harbor, Hawaii, USA and in Tauranga Harbor, New Zealand, and single long-period wave propagating onto a small-scale experimental model of the town of Seaside, Oregon, USA. These benchmark problems were implemented in the Community Modeling Interface for Tsunamis (ComMIT) (Titov et al. 2011 Pure Appl. Geophys. 168, 2121-2131), which is a user-friendly interface to the validated and verified Method of Splitting Tsunami (MOST) (Titov and Synolakis 1995 J. Waterw. Port Coastal Ocean Eng. 121, 308-316) model and is developed by NCTR. The modeling results are compared with the required benchmark data, providing good agreements and results are discussed. Acknowledgment: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement no 603839 (Project ASTARTE - Assessment, Strategy and Risk Reduction for Tsunamis in Europe)
Assessment of the monitoring and evaluation system for integrated community case management (ICCM) in Ethiopia: a comparison against global benchmark indicators.

PubMed

Mamo, Dereje; Hazel, Elizabeth; Lemma, Israel; Guenther, Tanya; Bekele, Abeba; Demeke, Berhanu

2014-10-01

Program managers require feasible, timely, reliable, and valid measures of iCCM implementation to identify problems and assess progress. The global iCCM Task Force developed benchmark indicators to guide implementers to develop or improve monitoring and evaluation (M&E) systems. To assesses Ethiopia's iCCM M&E system by determining the availability and feasibility of the iCCM benchmark indicators. We conducted a desk review of iCCM policy documents, monitoring tools, survey reports, and other rele- vant documents; and key informant interviews with government and implementing partners involved in iCCM scale-up and M&E. Currently, Ethiopia collects data to inform most (70% [33/47]) iCCM benchmark indicators, and modest extra effort could boost this to 83% (39/47). Eight (17%) are not available given the current system. Most benchmark indicators that track coordination and policy, human resources, service delivery and referral, supervision, and quality assurance are available through the routine monitoring systems or periodic surveys. Indicators for supply chain management are less available due to limited consumption data and a weak link with treatment data. Little information is available on iCCM costs. Benchmark indicators can detail the status of iCCM implementation; however, some indicators may not fit country priorities, and others may be difficult to collect. The government of Ethiopia and partners should review and prioritize the benchmark indicators to determine which should be included in the routine M&E system, especially since iCCMdata are being reviewed for addition to the HMIS. Moreover, the Health Extension Worker's reporting burden can be minimized by an integrated reporting approach.
Maximal Unbiased Benchmarking Data Sets for Human Chemokine Receptors and Comparative Analysis.

PubMed

Xia, Jie; Reid, Terry-Elinor; Wu, Song; Zhang, Liangren; Wang, Xiang Simon

2018-05-29

Chemokine receptors (CRs) have long been druggable targets for the treatment of inflammatory diseases and HIV-1 infection. As a powerful technique, virtual screening (VS) has been widely applied to identifying small molecule leads for modern drug targets including CRs. For rational selection of a wide variety of VS approaches, ligand enrichment assessment based on a benchmarking data set has become an indispensable practice. However, the lack of versatile benchmarking sets for the whole CRs family that are able to unbiasedly evaluate every single approach including both structure- and ligand-based VS somewhat hinders modern drug discovery efforts. To address this issue, we constructed Maximal Unbiased Benchmarking Data sets for human Chemokine Receptors (MUBD-hCRs) using our recently developed tools of MUBD-DecoyMaker. The MUBD-hCRs encompasses 13 subtypes out of 20 chemokine receptors, composed of 404 ligands and 15756 decoys so far and is readily expandable in the future. It had been thoroughly validated that MUBD-hCRs ligands are chemically diverse while its decoys are maximal unbiased in terms of "artificial enrichment", "analogue bias". In addition, we studied the performance of MUBD-hCRs, in particular CXCR4 and CCR5 data sets, in ligand enrichment assessments of both structure- and ligand-based VS approaches in comparison with other benchmarking data sets available in the public domain and demonstrated that MUBD-hCRs is very capable of designating the optimal VS approach. MUBD-hCRs is a unique and maximal unbiased benchmarking set that covers major CRs subtypes so far.
The challenge of benchmarking health systems: is ICT innovation capacity more systemic than organizational dependent?

PubMed

Lapão, Luís Velez

2015-01-01

The article by Catan et al. presents a benchmarking exercise comparing Israel and Portugal on the implementation of Information and Communication Technologies in the healthcare sector. Special attention was given to e-Health and m-Health. The authors collected information via a set of interviews with key stakeholders. They compared two different cultures and societies, which have reached slightly different implementation outcomes. Although the comparison is very enlightening, it is also challenging. Benchmarking exercises present a set of challenges, such as the choice of methodologies and the assessment of the impact on organizational strategy. Precise benchmarking methodology is a valid tool for eliciting information about alternatives for improving health systems. However, many beneficial interventions, which benchmark as effective, fail to translate into meaningful healthcare outcomes across contexts. There is a relationship between results and the innovational and competitive environments. Differences in healthcare governance and financing models are well known; but little is known about their impact on Information and Communication Technology implementation. The article by Catan et al. provides interesting clues about this issue. Public systems (such as those of Portugal, UK, Sweden, Spain, etc.) present specific advantages and disadvantages concerning Information and Communication Technology development and implementation. Meanwhile, private systems based fundamentally on insurance packages, (such as Israel, Germany, Netherlands or USA) present a different set of advantages and disadvantages - especially a more open context for innovation. Challenging issues from both the Portuguese and Israeli cases will be addressed. Clearly, more research is needed on both benchmarking methodologies and on ICT implementation strategies.
Evaluation of control strategies using an oxidation ditch benchmark.

PubMed

Abusam, A; Keesman, K J; Spanjers, H; van, Straten G; Meinema, K

2002-01-01

This paper presents validation and implementation results of a benchmark developed for a specific full-scale oxidation ditch wastewater treatment plant. A benchmark is a standard simulation procedure that can be used as a tool in evaluating various control strategies proposed for wastewater treatment plants. It is based on model and performance criteria development. Testing of this benchmark, by comparing benchmark predictions to real measurements of the electrical energy consumptions and amounts of disposed sludge for a specific oxidation ditch WWTP, has shown that it can (reasonably) be used for evaluating the performance of this WWTP. Subsequently, the validated benchmark was then used in evaluating some basic and advanced control strategies. Some of the interesting results obtained are the following: (i) influent flow splitting ratio, between the first and the fourth aerated compartments of the ditch, has no significant effect on the TN concentrations in the effluent, and (ii) for evaluation of long-term control strategies, future benchmarks need to be able to assess settlers' performance.
Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data

PubMed Central

2014-01-01

Background The rapid evolution in high-throughput sequencing (HTS) technologies has opened up new perspectives in several research fields and led to the production of large volumes of sequence data. A fundamental step in HTS data analysis is the mapping of reads onto reference sequences. Choosing a suitable mapper for a given technology and a given application is a subtle task because of the difficulty of evaluating mapping algorithms. Results In this paper, we present a benchmark procedure to compare mapping algorithms used in HTS using both real and simulated datasets and considering four evaluation criteria: computational resource and time requirements, robustness of mapping, ability to report positions for reads in repetitive regions, and ability to retrieve true genetic variation positions. To measure robustness, we introduced a new definition for a correctly mapped read taking into account not only the expected start position of the read but also the end position and the number of indels and substitutions. We developed CuReSim, a new read simulator, that is able to generate customized benchmark data for any kind of HTS technology by adjusting parameters to the error types. CuReSim and CuReSimEval, a tool to evaluate the mapping quality of the CuReSim simulated reads, are freely available. We applied our benchmark procedure to evaluate 14 mappers in the context of whole genome sequencing of small genomes with Ion Torrent data for which such a comparison has not yet been established. Conclusions A benchmark procedure to compare HTS data mappers is introduced with a new definition for the mapping correctness as well as tools to generate simulated reads and evaluate mapping quality. The application of this procedure to Ion Torrent data from the whole genome sequencing of small genomes has allowed us to validate our benchmark procedure and demonstrate that it is helpful for selecting a mapper based on the intended application, questions to be addressed, and the technology used. This benchmark procedure can be used to evaluate existing or in-development mappers as well as to optimize parameters of a chosen mapper for any application and any sequencing platform. PMID:24708189
Validation of the analytical methods in the LWR code BOXER for gadolinium-loaded fuel pins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paratte, J.M.; Arkuszewski, J.J.; Kamboj, B.K.

1990-01-01

Due to the very high absorption occurring in gadolinium-loaded fuel pins, calculations of lattices with such pins present are a demanding test of the analysis methods in light water reactor (LWR) cell and assembly codes. Considerable effort has, therefore, been devoted to the validation of code methods for gadolinia fuel. The goal of the work reported in this paper is to check the analysis methods in the LWR cell/assembly code BOXER and its associated cross-section processing code ETOBOX, by comparison of BOXER results with those from a very accurate Monte Carlo calculation for a gadolinium benchmark problem. Initial results ofmore » such a comparison have been previously reported. However, the Monte Carlo calculations, done with the MCNP code, were performed at Los Alamos National Laboratory using ENDF/B-V data, while the BOXER calculations were performed at the Paul Scherrer Institute using JEF-1 nuclear data. This difference in the basic nuclear data used for the two calculations, caused by the restricted nature of these evaluated data files, led to associated uncertainties in a comparison of the results for methods validation. In the joint investigations at the Georgia Institute of Technology and PSI, such uncertainty in this comparison was eliminated by using ENDF/B-V data for BOXER calculations at Georgia Tech.« less

Benchmarking Multilayer-HySEA model for landslide generated tsunami. HTHMP validation process.

NASA Astrophysics Data System (ADS)

Macias, J.; Escalante, C.; Castro, M. J.

2017-12-01

Landslide tsunami hazard may be dominant along significant parts of the coastline around the world, in particular in the USA, as compared to hazards from other tsunamigenic sources. This fact motivated NTHMP about the need of benchmarking models for landslide generated tsunamis, following the same methodology already used for standard tsunami models when the source is seismic. To perform the above-mentioned validation process, a set of candidate benchmarks were proposed. These benchmarks are based on a subset of available laboratory data sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) close the list of proposed benchmarks. A total of 7 benchmarks. The Multilayer-HySEA model including non-hydrostatic effects has been used to perform all the benchmarking problems dealing with laboratory experiments proposed in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017 by NTHMP. The aim of this presentation is to show some of the latest numerical results obtained with the Multilayer-HySEA (non-hydrostatic) model in the framework of this validation effort.Acknowledgements. This research has been partially supported by the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and University of Malaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

DOE PAGES

Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; ...

2015-12-21

This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Somemore » specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000 ® problems. These benchmark and scaling studies show promising results.« less
[Benchmarking and other functions of ROM: back to basics].

PubMed

Barendregt, M

2015-01-01

Since 2011 outcome data in the Dutch mental health care have been collected on a national scale. This has led to confusion about the position of benchmarking in the system known as routine outcome monitoring (rom). To provide insight into the various objectives and uses of aggregated outcome data. A qualitative review was performed and the findings were analysed. Benchmarking is a strategy for finding best practices and for improving efficacy and it belongs to the domain of quality management. Benchmarking involves comparing outcome data by means of instrumentation and is relatively tolerant with regard to the validity of the data. Although benchmarking is a function of rom, it must be differentiated form other functions from rom. Clinical management, public accountability, research, payment for performance and information for patients are all functions of rom which require different ways of data feedback and which make different demands on the validity of the underlying data. Benchmarking is often wrongly regarded as being simply a synonym for 'comparing institutions'. It is, however, a method which includes many more factors; it can be used to improve quality and has a more flexible approach to the validity of outcome data and is less concerned than other rom functions about funding and the amount of information given to patients. Benchmarking can make good use of currently available outcome data.
Scale/TSUNAMI Sensitivity Data for ICSBEP Evaluations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rearden, Bradley T; Reed, Davis Allan; Lefebvre, Robert A

2011-01-01

The Tools for Sensitivity and Uncertainty Analysis Methodology Implementation (TSUNAMI) software developed at Oak Ridge National Laboratory (ORNL) as part of the Scale code system provide unique methods for code validation, gap analysis, and experiment design. For TSUNAMI analysis, sensitivity data are generated for each application and each existing or proposed experiment used in the assessment. The validation of diverse sets of applications requires potentially thousands of data files to be maintained and organized by the user, and a growing number of these files are available through the International Handbook of Evaluated Criticality Safety Benchmark Experiments (IHECSBE) distributed through themore » International Criticality Safety Benchmark Evaluation Program (ICSBEP). To facilitate the use of the IHECSBE benchmarks in rigorous TSUNAMI validation and gap analysis techniques, ORNL generated SCALE/TSUNAMI sensitivity data files (SDFs) for several hundred benchmarks for distribution with the IHECSBE. For the 2010 edition of IHECSBE, the sensitivity data were generated using 238-group cross-section data based on ENDF/B-VII.0 for 494 benchmark experiments. Additionally, ORNL has developed a quality assurance procedure to guide the generation of Scale inputs and sensitivity data, as well as a graphical user interface to facilitate the use of sensitivity data in identifying experiments and applying them in validation studies.« less
Modified reactive tabu search for the symmetric traveling salesman problems

NASA Astrophysics Data System (ADS)

Lim, Yai-Fung; Hong, Pei-Yee; Ramli, Razamin; Khalid, Ruzelan

2013-09-01

Reactive tabu search (RTS) is an improved method of tabu search (TS) and it dynamically adjusts tabu list size based on how the search is performed. RTS can avoid disadvantage of TS which is in the parameter tuning in tabu list size. In this paper, we proposed a modified RTS approach for solving symmetric traveling salesman problems (TSP). The tabu list size of the proposed algorithm depends on the number of iterations when the solutions do not override the aspiration level to achieve a good balance between diversification and intensification. The proposed algorithm was tested on seven chosen benchmarked problems of symmetric TSP. The performance of the proposed algorithm is compared with that of the TS by using empirical testing, benchmark solution and simple probabilistic analysis in order to validate the quality of solution. The computational results and comparisons show that the proposed algorithm provides a better quality solution than that of the TS.
Use of integral experiments in support to the validation of JEFF-3.2 nuclear data evaluation

NASA Astrophysics Data System (ADS)

Leclaire, Nicolas; Cochet, Bertrand; Jinaphanh, Alexis; Haeck, Wim

2017-09-01

For many years now, IRSN has developed its own Monte Carlo continuous energy capability, which allows testing various nuclear data libraries. In that prospect, a validation database of 1136 experiments was built from cases used for the validation of the APOLLO2-MORET 5 multigroup route of the CRISTAL V2.0 package. In this paper, the keff obtained for more than 200 benchmarks using the JEFF-3.1.1 and JEFF-3.2 libraries are compared to benchmark keff values and main discrepancies are analyzed regarding the neutron spectrum. Special attention is paid on benchmarks for which the results have been highly modified between both JEFF-3 versions.
Benchmarking criticality analysis of TRIGA fuel storage racks.

PubMed

Robinson, Matthew Loren; DeBey, Timothy M; Higginbotham, Jack F

2017-01-01

A criticality analysis was benchmarked to sub-criticality measurements of the hexagonal fuel storage racks at the United States Geological Survey TRIGA MARK I reactor in Denver. These racks, which hold up to 19 fuel elements each, are arranged at 0.61m (2 feet) spacings around the outer edge of the reactor. A 3-dimensional model was created of the racks using MCNP5, and the model was verified experimentally by comparison to measured subcritical multiplication data collected in an approach to critical loading of two of the racks. The validated model was then used to show that in the extreme condition where the entire circumference of the pool was lined with racks loaded with used fuel the storage array is subcritical with a k value of about 0.71; well below the regulatory limit of 0.8. A model was also constructed of the rectangular 2×10 fuel storage array used in many other TRIGA reactors to validate the technique against the original TRIGA licensing sub-critical analysis performed in 1966. The fuel used in this study was standard 20% enriched (LEU) aluminum or stainless steel clad TRIGA fuel. Copyright Â© 2016. Published by Elsevier Ltd.
Benchmark for Numerical Models of Stented Coronary Bifurcation Flow.

PubMed

García Carrascal, P; García García, J; Sierra Pallares, J; Castro Ruiz, F; Manuel Martín, F J

2018-09-01

In-stent restenosis ails many patients who have undergone stenting. When the stented artery is a bifurcation, the intervention is particularly critical because of the complex stent geometry involved in these structures. Computational fluid dynamics (CFD) has been shown to be an effective approach when modeling blood flow behavior and understanding the mechanisms that underlie in-stent restenosis. However, these CFD models require validation through experimental data in order to be reliable. It is with this purpose in mind that we performed particle image velocimetry (PIV) measurements of velocity fields within flows through a simplified coronary bifurcation. Although the flow in this simplified bifurcation differs from the actual blood flow, it emulates the main fluid dynamic mechanisms found in hemodynamic flow. Experimental measurements were performed for several stenting techniques in both steady and unsteady flow conditions. The test conditions were strictly controlled, and uncertainty was accurately predicted. The results obtained in this research represent readily accessible, easy to emulate, detailed velocity fields and geometry, and they have been successfully used to validate our numerical model. These data can be used as a benchmark for further development of numerical CFD modeling in terms of comparison of the main flow pattern characteristics.
Taking the Battle Upstream: Towards a Benchmarking Role for NATO

DTIC Science & Technology

2012-09-01

Benchmark.........................................................................................14 Figure 8. World Bank Benchmarking Work on Quality...Search of a Benchmarking Theory for the Public Sector.” 16 Figure 8. World Bank Benchmarking Work on Quality of Governance One of the most...the Ministries of Defense in the countries in which it works ). Another interesting innovation is that for comparison purposes, McKinsey categorized
Validation and Inter-comparison Against Observations of GODAE Ocean View Ocean Prediction Systems

NASA Astrophysics Data System (ADS)

Xu, J.; Davidson, F. J. M.; Smith, G. C.; Lu, Y.; Hernandez, F.; Regnier, C.; Drevillon, M.; Ryan, A.; Martin, M.; Spindler, T. D.; Brassington, G. B.; Oke, P. R.

2016-02-01

For weather forecasts, validation of forecast performance is done at the end user level as well as by the meteorological forecast centers. In the development of Ocean Prediction Capacity, the same level of care for ocean forecast performance and validation is needed. Herein we present results from a validation against observations of 6 Global Ocean Forecast Systems under the GODAE OceanView International Collaboration Network. These systems include the Global Ocean Ice Forecast System (GIOPS) developed by the Government of Canada, two systems PSY3 and PSY4 from the French Mercator-Ocean Ocean Forecasting Group, the FOAM system from UK met office, HYCOM-RTOFS from NOAA/NCEP/NWA of USA, and the Australian Bluelink-OceanMAPS system from the CSIRO, the Australian Meteorological Bureau and the Australian Navy.The observation data used in the comparison are sea surface temperature, sub-surface temperature, sub-surface salinity, sea level anomaly, and sea ice total concentration data. Results of the inter-comparison demonstrate forecast performance limits, strengths and weaknesses of each of the six systems. This work establishes validation protocols and routines by which all new prediction systems developed under the CONCEPTS Collaborative Network will be benchmarked prior to approval for operations. This includes anticipated delivery of CONCEPTS regional prediction systems over the next two years including a pan Canadian 1/12th degree resolution ice ocean prediction system and limited area 1/36th degree resolution prediction systems. The validation approach of comparing forecasts to observations at the time and location of the observation is called Class 4 metrics. It has been adopted by major international ocean prediction centers, and will be recommended to JCOMM-WMO as routine validation approach for operational oceanography worldwide.
Validation of hydrogen gas stratification and mixing models

DOE PAGES

Wu, Hsingtzu; Zhao, Haihua

2015-05-26

Two validation benchmarks confirm that the BMIX++ code is capable of simulating unintended hydrogen release scenarios efficiently. The BMIX++ (UC Berkeley mechanistic MIXing code in C++) code has been developed to accurately and efficiently predict the fluid mixture distribution and heat transfer in large stratified enclosures for accident analyses and design optimizations. The BMIX++ code uses a scaling based one-dimensional method to achieve large reduction in computational effort compared to a 3-D computational fluid dynamics (CFD) simulation. Two BMIX++ benchmark models have been developed. One is for a single buoyant jet in an open space and another is for amore » large sealed enclosure with both a jet source and a vent near the floor. Both of them have been validated by comparisons with experimental data. Excellent agreements are observed. The entrainment coefficients of 0.09 and 0.08 are found to fit the experimental data for hydrogen leaks with the Froude number of 99 and 268 best, respectively. In addition, the BIX++ simulation results of the average helium concentration for an enclosure with a vent and a single jet agree with the experimental data within a margin of about 10% for jet flow rates ranging from 1.21 × 10⁻⁴ to 3.29 × 10⁻⁴ m³/s. In conclusion, computing time for each BMIX++ model with a normal desktop computer is less than 5 min.« less
Test One to Test Many: A Unified Approach to Quantum Benchmarks

NASA Astrophysics Data System (ADS)

Bai, Ge; Chiribella, Giulio

2018-04-01

Quantum benchmarks are routinely used to validate the experimental demonstration of quantum information protocols. Many relevant protocols, however, involve an infinite set of input states, of which only a finite subset can be used to test the quality of the implementation. This is a problem, because the benchmark for the finitely many states used in the test can be higher than the original benchmark calculated for infinitely many states. This situation arises in the teleportation and storage of coherent states, for which the benchmark of 50% fidelity is commonly used in experiments, although finite sets of coherent states normally lead to higher benchmarks. Here, we show that the average fidelity over all coherent states can be indirectly probed with a single setup, requiring only two-mode squeezing, a 50-50 beam splitter, and homodyne detection. Our setup enables a rigorous experimental validation of quantum teleportation, storage, amplification, attenuation, and purification of noisy coherent states. More generally, we prove that every quantum benchmark can be tested by preparing a single entangled state and measuring a single observable.
A comparison of five benchmarks

NASA Technical Reports Server (NTRS)

Huss, Janice E.; Pennline, James A.

1987-01-01

Five benchmark programs were obtained and run on the NASA Lewis CRAY X-MP/24. A comparison was made between the programs codes and between the methods for calculating performance figures. Several multitasking jobs were run to gain experience in how parallel performance is measured.
PMLB: a large benchmark suite for machine learning evaluation and comparison.

PubMed

Olson, Randal S; La Cava, William; Orzechowski, Patryk; Urbanowicz, Ryan J; Moore, Jason H

2017-01-01

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
ELAPSE - NASA AMES LISP AND ADA BENCHMARK SUITE: EFFICIENCY OF LISP AND ADA PROCESSING - A SYSTEM EVALUATION

NASA Technical Reports Server (NTRS)

Davis, G. J.

1994-01-01

One area of research of the Information Sciences Division at NASA Ames Research Center is devoted to the analysis and enhancement of processors and advanced computer architectures, specifically in support of automation and robotic systems. To compare systems' abilities to efficiently process Lisp and Ada, scientists at Ames Research Center have developed a suite of non-parallel benchmarks called ELAPSE. The benchmark suite was designed to test a single computer's efficiency as well as alternate machine comparisons on Lisp, and/or Ada languages. ELAPSE tests the efficiency with which a machine can execute the various routines in each environment. The sample routines are based on numeric and symbolic manipulations and include two-dimensional fast Fourier transformations, Cholesky decomposition and substitution, Gaussian elimination, high-level data processing, and symbol-list references. Also included is a routine based on a Bayesian classification program sorting data into optimized groups. The ELAPSE benchmarks are available for any computer with a validated Ada compiler and/or Common Lisp system. Of the 18 routines that comprise ELAPSE, provided within this package are 14 developed or translated at Ames. The others are readily available through literature. The benchmark that requires the most memory is CHOLESKY.ADA. Under VAX/VMS, CHOLESKY.ADA requires 760K of main memory. ELAPSE is available on either two 5.25 inch 360K MS-DOS format diskettes (standard distribution) or a 9-track 1600 BPI ASCII CARD IMAGE format magnetic tape. The contents of the diskettes are compressed using the PKWARE archiving tools. The utility to unarchive the files, PKUNZIP.EXE, is included. The ELAPSE benchmarks were written in 1990. VAX and VMS are trademarks of Digital Equipment Corporation. MS-DOS is a registered trademark of Microsoft Corporation.
Benchmarking clinical photography services in the NHS.

PubMed

Arbon, Giles

2015-01-01

Benchmarking is used in services across the National Health Service (NHS) using various benchmarking programs. Clinical photography services do not have a program in place and services have to rely on ad hoc surveys of other services. A trial benchmarking exercise was undertaken with 13 services in NHS Trusts. This highlights valuable data and comparisons that can be used to benchmark and improve services throughout the profession.
Quality in E-Learning--A Conceptual Framework Based on Experiences from Three International Benchmarking Projects

ERIC Educational Resources Information Center

Ossiannilsson, E.; Landgren, L.

2012-01-01

Between 2008 and 2010, Lund University took part in three international benchmarking projects, "E-xcellence+," the "eLearning Benchmarking Exercise 2009," and the "First Dual-Mode Distance Learning Benchmarking Club." A comparison of these models revealed a rather high level of correspondence. From this finding and…
Performance of Landslide-HySEA tsunami model for NTHMP benchmarking validation process

NASA Astrophysics Data System (ADS)

Macias, Jorge

2017-04-01

In its FY2009 Strategic Plan, the NTHMP required that all numerical tsunami inundation models be verified as accurate and consistent through a model benchmarking process. This was completed in 2011, but only for seismic tsunami sources and in a limited manner for idealized solid underwater landslides. Recent work by various NTHMP states, however, has shown that landslide tsunami hazard may be dominant along significant parts of the US coastline, as compared to hazards from other tsunamigenic sources. To perform the above-mentioned validation process, a set of candidate benchmarks were proposed. These benchmarks are based on a subset of available laboratory date sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) close the list of proposed benchmarks. The Landslide-HySEA model has participated in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017. The aim of this presentation is to show some of the numerical results obtained for Landslide-HySEA in the framework of this benchmarking validation/verification effort. Acknowledgements. This research has been partially supported by the Junta de Andalucía research project TESELA (P11-RNM7069), the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
An international land-biosphere model benchmarking activity for the IPCC Fifth Assessment Report (AR5)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoffman, Forrest M; Randerson, James T; Thornton, Peter E

2009-12-01

The need to capture important climate feedbacks in general circulation models (GCMs) has resulted in efforts to include atmospheric chemistry and land and ocean biogeochemistry into the next generation of production climate models, called Earth System Models (ESMs). While many terrestrial and ocean carbon models have been coupled to GCMs, recent work has shown that such models can yield a wide range of results (Friedlingstein et al., 2006). This work suggests that a more rigorous set of global offline and partially coupled experiments, along with detailed analyses of processes and comparisons with measurements, are needed. The Carbon-Land Model Intercomparison Projectmore » (C-LAMP) was designed to meet this need by providing a simulation protocol and model performance metrics based upon comparisons against best-available satellite- and ground-based measurements (Hoffman et al., 2007). Recently, a similar effort in Europe, called the International Land Model Benchmark (ILAMB) Project, was begun to assess the performance of European land surface models. These two projects will now serve as prototypes for a proposed international land-biosphere model benchmarking activity for those models participating in the IPCC Fifth Assessment Report (AR5). Initially used for model validation for terrestrial biogeochemistry models in the NCAR Community Land Model (CLM), C-LAMP incorporates a simulation protocol for both offline and partially coupled simulations using a prescribed historical trajectory of atmospheric CO2 concentrations. Models are confronted with data through comparisons against AmeriFlux site measurements, MODIS satellite observations, NOAA Globalview flask records, TRANSCOM inversions, and Free Air CO2 Enrichment (FACE) site measurements. Both sets of experiments have been performed using two different terrestrial biogeochemistry modules coupled to the CLM version 3 in the Community Climate System Model version 3 (CCSM3): the CASA model of Fung, et al., and the carbon-nitrogen (CN) model of Thornton. Comparisons of the CLM3 offline results against observational datasets have been performed and are described in Randerson et al. (2009). CLM version 4 has been evaluated using C-LAMP, showing improvement in many of the metrics. Efforts are now underway to initiate a Nitrogen-Land Model Intercomparison Project (N-LAMP) to better constrain the effects of the nitrogen cycle in biosphere models. Presented will be new results from C-LAMP for CLM4, initial N-LAMP developments, and the proposed land-biosphere model benchmarking activity.« less
Benchmarking an unstructured grid sediment model in an energetic estuary

DOE PAGES

Lopez, Jesse E.; Baptista, António M.

2016-12-14

A sediment model coupled to the hydrodynamic model SELFE is validated against a benchmark combining a set of idealized tests and an application to a field-data rich energetic estuary. After sensitivity studies, model results for the idealized tests largely agree with previously reported results from other models in addition to analytical, semi-analytical, or laboratory results. Results of suspended sediment in an open channel test with fixed bottom are sensitive to turbulence closure and treatment for hydrodynamic bottom boundary. Results for the migration of a trench are very sensitive to critical stress and erosion rate, but largely insensitive to turbulence closure.more » The model is able to qualitatively represent sediment dynamics associated with estuarine turbidity maxima in an idealized estuary. Applied to the Columbia River estuary, the model qualitatively captures sediment dynamics observed by fixed stations and shipborne profiles. Representation of the vertical structure of suspended sediment degrades when stratification is underpredicted. Across all tests, skill metrics of suspended sediments lag those of hydrodynamics even when qualitatively representing dynamics. The benchmark is fully documented in an openly available repository to encourage unambiguous comparisons against other models.« less

A Monte-Carlo Benchmark of TRIPOLI-4® and MCNP on ITER neutronics

NASA Astrophysics Data System (ADS)

Blanchet, David; Pénéliau, Yannick; Eschbach, Romain; Fontaine, Bruno; Cantone, Bruno; Ferlet, Marc; Gauthier, Eric; Guillon, Christophe; Letellier, Laurent; Proust, Maxime; Mota, Fernando; Palermo, Iole; Rios, Luis; Guern, Frédéric Le; Kocan, Martin; Reichle, Roger

2017-09-01

Radiation protection and shielding studies are often based on the extensive use of 3D Monte-Carlo neutron and photon transport simulations. ITER organization hence recommends the use of MCNP-5 code (version 1.60), in association with the FENDL-2.1 neutron cross section data library, specifically dedicated to fusion applications. The MCNP reference model of the ITER tokamak, the `C-lite', is being continuously developed and improved. This article proposes to develop an alternative model, equivalent to the 'C-lite', but for the Monte-Carlo code TRIPOLI-4®. A benchmark study is defined to test this new model. Since one of the most critical areas for ITER neutronics analysis concerns the assessment of radiation levels and Shutdown Dose Rates (SDDR) behind the Equatorial Port Plugs (EPP), the benchmark is conducted to compare the neutron flux through the EPP. This problem is quite challenging with regard to the complex geometry and considering the important neutron flux attenuation ranging from 1014 down to 108 n•cm-2•s-1. Such code-to-code comparison provides independent validation of the Monte-Carlo simulations, improving the confidence in neutronic results.
Fisk-based criteria to support validation of detection methods for drinking water and air.

DOE Office of Scientific and Technical Information (OSTI.GOV)

MacDonell, M.; Bhattacharyya, M.; Finster, M.

2009-02-18

This report was prepared to support the validation of analytical methods for threat contaminants under the U.S. Environmental Protection Agency (EPA) National Homeland Security Research Center (NHSRC) program. It is designed to serve as a resource for certain applications of benchmark and fate information for homeland security threat contaminants. The report identifies risk-based criteria from existing health benchmarks for drinking water and air for potential use as validation targets. The focus is on benchmarks for chronic public exposures. The priority sources are standard EPA concentration limits for drinking water and air, along with oral and inhalation toxicity values. Many contaminantsmore » identified as homeland security threats to drinking water or air would convert to other chemicals within minutes to hours of being released. For this reason, a fate analysis has been performed to identify potential transformation products and removal half-lives in air and water so appropriate forms can be targeted for detection over time. The risk-based criteria presented in this report to frame method validation are expected to be lower than actual operational targets based on realistic exposures following a release. Note that many target criteria provided in this report are taken from available benchmarks without assessing the underlying toxicological details. That is, although the relevance of the chemical form and analogues are evaluated, the toxicological interpretations and extrapolations conducted by the authoring organizations are not. It is also important to emphasize that such targets in the current analysis are not health-based advisory levels to guide homeland security responses. This integrated evaluation of chronic public benchmarks and contaminant fate has identified more than 200 risk-based criteria as method validation targets across numerous contaminants and fate products in drinking water and air combined. The gap in directly applicable values is considerable across the full set of threat contaminants, so preliminary indicators were developed from other well-documented benchmarks to serve as a starting point for validation efforts. By this approach, at least preliminary context is available for water or air, and sometimes both, for all chemicals on the NHSRC list that was provided for this evaluation. This means that a number of concentrations presented in this report represent indirect measures derived from related benchmarks or surrogate chemicals, as described within the many results tables provided in this report.« less
42 CFR 440.335 - Benchmark-equivalent health benefits coverage.

Code of Federal Regulations, 2010 CFR

2010-10-01

...) Aggregate actuarial value. Benchmark-equivalent coverage is health benefits coverage that has an aggregate... planning services and supplies and other appropriate preventive services, as designated by the Secretary... State for purposes of comparison in establishing the aggregate actuarial value of the benchmark...
Validating Cellular Automata Lava Flow Emplacement Algorithms with Standard Benchmarks

NASA Astrophysics Data System (ADS)

Richardson, J. A.; Connor, L.; Charbonnier, S. J.; Connor, C.; Gallant, E.

2015-12-01

A major existing need in assessing lava flow simulators is a common set of validation benchmark tests. We propose three levels of benchmarks which test model output against increasingly complex standards. First, imulated lava flows should be morphologically identical, given changes in parameter space that should be inconsequential, such as slope direction. Second, lava flows simulated in simple parameter spaces can be tested against analytical solutions or empirical relationships seen in Bingham fluids. For instance, a lava flow simulated on a flat surface should produce a circular outline. Third, lava flows simulated over real world topography can be compared to recent real world lava flows, such as those at Tolbachik, Russia, and Fogo, Cape Verde. Success or failure of emplacement algorithms in these validation benchmarks can be determined using a Bayesian approach, which directly tests the ability of an emplacement algorithm to correctly forecast lava inundation. Here we focus on two posterior metrics, P(A|B) and P(¬A|¬B), which describe the positive and negative predictive value of flow algorithms. This is an improvement on less direct statistics such as model sensitivity and the Jaccard fitness coefficient. We have performed these validation benchmarks on a new, modular lava flow emplacement simulator that we have developed. This simulator, which we call MOLASSES, follows a Cellular Automata (CA) method. The code is developed in several interchangeable modules, which enables quick modification of the distribution algorithm from cell locations to their neighbors. By assessing several different distribution schemes with the benchmark tests, we have improved the performance of MOLASSES to correctly match early stages of the 2012-3 Tolbachik Flow, Kamchakta Russia, to 80%. We also can evaluate model performance given uncertain input parameters using a Monte Carlo setup. This illuminates sensitivity to model uncertainty.
Comparison of Origin 2000 and Origin 3000 Using NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Turney, Raymond D.

2001-01-01

This report describes results of benchmark tests on the Origin 3000 system currently being installed at the NASA Ames National Advanced Supercomputing facility. This machine will ultimately contain 1024 R14K processors. The first part of the system, installed in November, 2000 and named mendel, is an Origin 3000 with 128 R12K processors. For comparison purposes, the tests were also run on lomax, an Origin 2000 with R12K processors. The BT, LU, and SP application benchmarks in the NAS Parallel Benchmark Suite and the kernel benchmark FT were chosen to determine system performance and measure the impact of changes on the machine as it evolves. Having been written to measure performance on Computational Fluid Dynamics applications, these benchmarks are assumed appropriate to represent the NAS workload. Since the NAS runs both message passing (MPI) and shared-memory, compiler directive type codes, both MPI and OpenMP versions of the benchmarks were used. The MPI versions used were the latest official release of the NAS Parallel Benchmarks, version 2.3. The OpenMP versiqns used were PBN3b2, a beta version that is in the process of being released. NPB 2.3 and PBN 3b2 are technically different benchmarks, and NPB results are not directly comparable to PBN results.
Engineering department physical plant staffing requirements.

PubMed

Cole, C

1997-05-01

There is a considerable effort in the health care arena to establish credible engineering manpower yardsticks that are universally applicable as a benchmark. This document was created by using one facility's own benchmark criteria that can be used to help develop either internal or competitive benchmarking comparisons.
Development of a flattening filter free multiple source model for use as an independent, Monte Carlo, dose calculation, quality assurance tool for clinical trials.

PubMed

Faught, Austin M; Davidson, Scott E; Popple, Richard; Kry, Stephen F; Etzel, Carol; Ibbott, Geoffrey S; Followill, David S

2017-09-01

The Imaging and Radiation Oncology Core-Houston (IROC-H) Quality Assurance Center (formerly the Radiological Physics Center) has reported varying levels of compliance from their anthropomorphic phantom auditing program. IROC-H studies have suggested that one source of disagreement between institution submitted calculated doses and measurement is the accuracy of the institution's treatment planning system dose calculations and heterogeneity corrections used. In order to audit this step of the radiation therapy treatment process, an independent dose calculation tool is needed. Monte Carlo multiple source models for Varian flattening filter free (FFF) 6 MV and FFF 10 MV therapeutic x-ray beams were commissioned based on central axis depth dose data from a 10 × 10 cm 2 field size and dose profiles for a 40 × 40 cm 2 field size. The models were validated against open-field measurements in a water tank for field sizes ranging from 3 × 3 cm 2 to 40 × 40 cm 2 . The models were then benchmarked against IROC-H's anthropomorphic head and neck phantom and lung phantom measurements. Validation results, assessed with a ±2%/2 mm gamma criterion, showed average agreement of 99.9% and 99.0% for central axis depth dose data for FFF 6 MV and FFF 10 MV models, respectively. Dose profile agreement using the same evaluation technique averaged 97.8% and 97.9% for the respective models. Phantom benchmarking comparisons were evaluated with a ±3%/2 mm gamma criterion, and agreement averaged 90.1% and 90.8% for the respective models. Multiple source models for Varian FFF 6 MV and FFF 10 MV beams have been developed, validated, and benchmarked for inclusion in an independent dose calculation quality assurance tool for use in clinical trial audits. © 2017 American Association of Physicists in Medicine.
Fuzzy Structures Analysis of Aircraft Panels in NASTRAN

NASA Technical Reports Server (NTRS)

Sparrow, Victor W.; Buehrle, Ralph D.

2001-01-01

This paper concerns an application of the fuzzy structures analysis (FSA) procedures of Soize to prototypical aerospace panels in MSC/NASTRAN, a large commercial finite element program. A brief introduction to the FSA procedures is first provided. The implementation of the FSA methods is then disclosed, and the method is validated by comparison to published results for the forced vibrations of a fuzzy beam. The results of the new implementation show excellent agreement to the benchmark results. The ongoing effort at NASA Langley and Penn State to apply these fuzzy structures analysis procedures to real aircraft panels is then described.
U.S. IOOS coastal and ocean modeling testbed: Inter-model evaluation of tides, waves, and hurricane surge in the Gulf of Mexico

NASA Astrophysics Data System (ADS)

Kerr, P. C.; Donahue, A. S.; Westerink, J. J.; Luettich, R. A.; Zheng, L. Y.; Weisberg, R. H.; Huang, Y.; Wang, H. V.; Teng, Y.; Forrest, D. R.; Roland, A.; Haase, A. T.; Kramer, A. W.; Taylor, A. A.; Rhome, J. R.; Feyen, J. C.; Signell, R. P.; Hanson, J. L.; Hope, M. E.; Estes, R. M.; Dominguez, R. A.; Dunbar, R. P.; Semeraro, L. N.; Westerink, H. J.; Kennedy, A. B.; Smith, J. M.; Powell, M. D.; Cardone, V. J.; Cox, A. T.

2013-10-01

A Gulf of Mexico performance evaluation and comparison of coastal circulation and wave models was executed through harmonic analyses of tidal simulations, hindcasts of Hurricane Ike (2008) and Rita (2005), and a benchmarking study. Three unstructured coastal circulation models (ADCIRC, FVCOM, and SELFE) validated with similar skill on a new common Gulf scale mesh (ULLR) with identical frictional parameterization and forcing for the tidal validation and hurricane hindcasts. Coupled circulation and wave models, SWAN+ADCIRC and WWMII+SELFE, along with FVCOM loosely coupled with SWAN, also validated with similar skill. NOAA's official operational forecast storm surge model (SLOSH) was implemented on local and Gulf scale meshes with the same wind stress and pressure forcing used by the unstructured models for hindcasts of Ike and Rita. SLOSH's local meshes failed to capture regional processes such as Ike's forerunner and the results from the Gulf scale mesh further suggest shortcomings may be due to a combination of poor mesh resolution, missing internal physics such as tides and nonlinear advection, and SLOSH's internal frictional parameterization. In addition, these models were benchmarked to assess and compare execution speed and scalability for a prototypical operational simulation. It was apparent that a higher number of computational cores are needed for the unstructured models to meet similar operational implementation requirements to SLOSH, and that some of them could benefit from improved parallelization and faster execution speed.
Refinement, Validation and Benchmarking of a Model for E-Government Service Quality

NASA Astrophysics Data System (ADS)

Magoutas, Babis; Mentzas, Gregoris

This paper presents the refinement and validation of a model for Quality of e-Government Services (QeGS). We built upon our previous work where a conceptualized model was identified and put focus on the confirmatory phase of the model development process, in order to come up with a valid and reliable QeGS model. The validated model, which was benchmarked with very positive results with similar models found in the literature, can be used for measuring the QeGS in a reliable and valid manner. This will form the basis for a continuous quality improvement process, unleashing the full potential of e-government services for both citizens and public administrations.
FY 2002 Customer Satisfaction & Top 200 Users Survey Composite Report

DTIC Science & Technology

2002-11-01

Federal Government Benchmark 68.6% 71.1% DTIC Excels by +8.4 +11 *ACSI is the official service quality benchmark for the...care. § The American Customer Satisfaction Index (ACSI), the official service quality benchmark for the Federal Government, is currently 71.1%; DTIC...ACSI is the official service quality benchmark for the Federal GovernmentFig 1FY 20020Comparison of Customer Satisfaction (Customer Care
Solutions of the two-dimensional Hubbard model: Benchmarks and results from a wide range of numerical algorithms

DOE PAGES

LeBlanc, J. P. F.; Antipov, Andrey E.; Becca, Federico; ...

2015-12-14

Numerical results for ground-state and excited-state properties (energies, double occupancies, and Matsubara-axis self-energies) of the single-orbital Hubbard model on a two-dimensional square lattice are presented, in order to provide an assessment of our ability to compute accurate results in the thermodynamic limit. Many methods are employed, including auxiliary-field quantum Monte Carlo, bare and bold-line diagrammatic Monte Carlo, method of dual fermions, density matrix embedding theory, density matrix renormalization group, dynamical cluster approximation, diffusion Monte Carlo within a fixed-node approximation, unrestricted coupled cluster theory, and multireference projected Hartree-Fock methods. Comparison of results obtained by different methods allows for the identification ofmore » uncertainties and systematic errors. The importance of extrapolation to converged thermodynamic-limit values is emphasized. Furthermore, cases where agreement between different methods is obtained establish benchmark results that may be useful in the validation of new approaches and the improvement of existing methods.« less
Benchmarking in emergency health systems.

PubMed

Kennedy, Marcus P; Allen, Jacqueline; Allen, Greg

2002-12-01

This paper discusses the role of benchmarking as a component of quality management. It describes the historical background of benchmarking, its competitive origin and the requirement in today's health environment for a more collaborative approach. The classical 'functional and generic' types of benchmarking are discussed with a suggestion to adopt a different terminology that describes the purpose and practicalities of benchmarking. Benchmarking is not without risks. The consequence of inappropriate focus and the need for a balanced overview of process is explored. The competition that is intrinsic to benchmarking is questioned and the negative impact it may have on improvement strategies in poorly performing organizations is recognized. The difficulty in achieving cross-organizational validity in benchmarking is emphasized, as is the need to scrutinize benchmarking measures. The cost effectiveness of benchmarking projects is questioned and the concept of 'best value, best practice' in an environment of fixed resources is examined.
Establishing Benchmarks for Outcome Indicators: A Statistical Approach to Developing Performance Standards.

ERIC Educational Resources Information Center

Henry, Gary T.; And Others

1992-01-01

A statistical technique is presented for developing performance standards based on benchmark groups. The benchmark groups are selected using a multivariate technique that relies on a squared Euclidean distance method. For each observation unit (a school district in the example), a unique comparison group is selected. (SLD)
42 CFR 440.320 - State plan requirements: Optional enrollment for exempt individuals.

Code of Federal Regulations, 2010 CFR

2010-10-01

... benchmark or benchmark-equivalent benefit package, the State must effectively inform the individual prior to...-equivalent benefit package and the costs under such a package and provide a comparison of how they differ... benchmark-equivalent benefit package. (4) For individuals who the State determines have become exempt...
42 CFR 440.320 - State plan requirements: Optional enrollment for exempt individuals.

Code of Federal Regulations, 2011 CFR

2011-10-01

... benchmark or benchmark-equivalent benefit package, the State must effectively inform the individual prior to...-equivalent benefit package and the costs under such a package and provide a comparison of how they differ... benchmark-equivalent benefit package. (4) For individuals who the State determines have become exempt...
42 CFR 440.320 - State plan requirements: Optional enrollment for exempt individuals.

Code of Federal Regulations, 2012 CFR

2012-10-01

... benchmark or benchmark-equivalent benefit package, the State must effectively inform the individual prior to...-equivalent benefit package and the costs under such a package and provide a comparison of how they differ... benchmark-equivalent benefit package. (4) For individuals who the State determines have become exempt...
42 CFR 440.320 - State plan requirements: Optional enrollment for exempt individuals.

Code of Federal Regulations, 2013 CFR

2013-10-01

... benchmark or benchmark-equivalent benefit package, the State must effectively inform the individual prior to...-equivalent benefit package and the costs under such a package and provide a comparison of how they differ... benchmark-equivalent benefit package. (4) For individuals who the State determines have become exempt...
42 CFR 440.320 - State plan requirements: Optional enrollment for exempt individuals.

Code of Federal Regulations, 2014 CFR

2014-10-01

... benchmark or benchmark-equivalent benefit package, the State must effectively inform the individual prior to...-equivalent benefit package and the costs under such a package and provide a comparison of how they differ... benchmark-equivalent benefit package. (4) For individuals who the State determines have become exempt...
Comparison of the PHISICS/RELAP5-3D Ring and Block Model Results for Phase I of the OECD MHTGR-350 Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom

2014-04-01

The INL PHISICS code system consists of three modules providing improved core simulation capability: INSTANT (performing 3D nodal transport core calculations), MRTAU (depletion and decay heat generation) and a perturbation/mixer module. Coupling of the PHISICS code suite to the thermal hydraulics system code RELAP5-3D has recently been finalized, and as part of the code verification and validation program the exercises defined for Phase I of the OECD/NEA MHTGR 350 MW Benchmark were completed. This paper provides an overview of the MHTGR Benchmark, and presents selected results of the three steady state exercises 1-3 defined for Phase I. For Exercise 1,more » a stand-alone steady-state neutronics solution for an End of Equilibrium Cycle Modular High Temperature Reactor (MHTGR) was calculated with INSTANT, using the provided geometry, material descriptions, and detailed cross-section libraries. Exercise 2 required the modeling of a stand-alone thermal fluids solution. The RELAP5-3D results of four sub-cases are discussed, consisting of various combinations of coolant bypass flows and material thermophysical properties. Exercise 3 combined the first two exercises in a coupled neutronics and thermal fluids solution, and the coupled code suite PHISICS/RELAP5-3D was used to calculate the results of two sub-cases. The main focus of the paper is a comparison of the traditional RELAP5-3D “ring” model approach vs. a much more detailed model that include kinetics feedback on individual block level and thermal feedbacks on a triangular sub-mesh. The higher fidelity of the block model is illustrated with comparison results on the temperature, power density and flux distributions, and the typical under-predictions produced by the ring model approach are highlighted.« less

SU-F-T-305: Clinical Effects of Dosimetric Leaf Gap (DLG) Values Between Matched Varian Truebeam (TB) Linacs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mihailidis, D; Mallah, J; Zhu, D

2016-06-15

Purpose: The dosimetric leaf gap (DLG) is an important parameter to be measured for dynamic beam delivery of modern linacs, like the Varian Truebeam (TB). The clinical effects of DLG-values on IMRT and/or VMAT commissioning of two “matched” TB linacs will be presented.Methods and Materials: The DLG values on two TB linacs were measured for all energy modalities (filtered and FFF-modes) as part of the dynamic delivery mode commissioning (IMRT and/or VMAT. After the standard beam data was modeled in eclipse treatment planning system (TPS) and validated, IMRT validation was performed based on TG1191 benchmark, IROC Head-Neck (H&N) phantom andmore » sample of clinical cases, all measured on both linacs. Although there was a single-set of data entered in the TPS, a noticeable difference was observed for the DLG-values between the linacs. The TG119, IROC phantom and selected patient plans were furnished with DLG-values of TB1 for both linacs and the delivery was performed on both TB linacs for comparison. Results: The DLG values of TB1 was first used for both linacs to perform the testing comparisons. The QA comparison of TG119 plans revealed a great dependence of the results to the DLG-values used for the linac for all energy modalities studied, especially when moving from 3%/3mm to 2%/2mm γ-analysis. Conclusion: The DLG-values have a definite influence on the dynamic dose, delivery that increases with the plan complexity. We recommend that the measured DLG-values are assigned to each of the “matched” linacs, even if a single set of beam data describes multiple linacs. The user should perform a detail test of the dynamic delivery of each linac based on end-to-end benchmark suites like TG119 and IROC phantoms.1Ezzel G., et al., “IMRT commissioning: Multiple institution planning and dosimetry comparisons, a report from AAPM Task Group 119.” Med. Phys. 36:5359–5373 (2009). partly supported by CAMC Cancer Center and Alliance Oncology.« less
Benchmarking of calculation schemes in APOLLO2 and COBAYA3 for WER lattices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zheleva, N.; Ivanov, P.; Todorova, G.

This paper presents solutions of the NURISP WER lattice benchmark using APOLLO2, TRIPOLI4 and COBAYA3 pin-by-pin. The main objective is to validate MOC based calculation schemes for pin-by-pin cross-section generation with APOLLO2 against TRIPOLI4 reference results. A specific objective is to test the APOLLO2 generated cross-sections and interface discontinuity factors in COBAYA3 pin-by-pin calculations with unstructured mesh. The VVER-1000 core consists of large hexagonal assemblies with 2 mm inter-assembly water gaps which require the use of unstructured meshes in the pin-by-pin core simulators. The considered 2D benchmark problems include 19-pin clusters, fuel assemblies and 7-assembly clusters. APOLLO2 calculation schemes withmore » the step characteristic method (MOC) and the higher-order Linear Surface MOC have been tested. The comparison of APOLLO2 vs. TRIPOLI4 results shows a very close agreement. The 3D lattice solver in COBAYA3 uses transport corrected multi-group diffusion approximation with interface discontinuity factors of Generalized Equivalence Theory (GET) or Black Box Homogenization (BBH) type. The COBAYA3 pin-by-pin results in 2, 4 and 8 energy groups are close to the reference solutions when using side-dependent interface discontinuity factors. (authors)« less
Quantitative comparison of DNA methylation assays for biomarker development and clinical applications.

PubMed

2016-07-01

DNA methylation patterns are altered in numerous diseases and often correlate with clinically relevant information such as disease subtypes, prognosis and drug response. With suitable assays and after validation in large cohorts, such associations can be exploited for clinical diagnostics and personalized treatment decisions. Here we describe the results of a community-wide benchmarking study comparing the performance of all widely used methods for DNA methylation analysis that are compatible with routine clinical use. We shipped 32 reference samples to 18 laboratories in seven different countries. Researchers in those laboratories collectively contributed 21 locus-specific assays for an average of 27 predefined genomic regions, as well as six global assays. We evaluated assay sensitivity on low-input samples and assessed the assays' ability to discriminate between cell types. Good agreement was observed across all tested methods, with amplicon bisulfite sequencing and bisulfite pyrosequencing showing the best all-round performance. Our technology comparison can inform the selection, optimization and use of DNA methylation assays in large-scale validation studies, biomarker development and clinical diagnostics.
A large-scale benchmark of gene prioritization methods.

PubMed

Guala, Dimitri; Sonnhammer, Erik L L

2017-04-21

In order to maximize the use of results from high-throughput experimental studies, e.g. GWAS, for identification and diagnostics of new disease-associated genes, it is important to have properly analyzed and benchmarked gene prioritization tools. While prospective benchmarks are underpowered to provide statistically significant results in their attempt to differentiate the performance of gene prioritization tools, a strategy for retrospective benchmarking has been missing, and new tools usually only provide internal validations. The Gene Ontology(GO) contains genes clustered around annotation terms. This intrinsic property of GO can be utilized in construction of robust benchmarks, objective to the problem domain. We demonstrate how this can be achieved for network-based gene prioritization tools, utilizing the FunCoup network. We use cross-validation and a set of appropriate performance measures to compare state-of-the-art gene prioritization algorithms: three based on network diffusion, NetRank and two implementations of Random Walk with Restart, and MaxLink that utilizes network neighborhood. Our benchmark suite provides a systematic and objective way to compare the multitude of available and future gene prioritization tools, enabling researchers to select the best gene prioritization tool for the task at hand, and helping to guide the development of more accurate methods.
Stratification of unresponsive patients by an independently validated index of brain complexity

PubMed Central

Casarotto, Silvia; Comanducci, Angela; Rosanova, Mario; Sarasso, Simone; Fecchio, Matteo; Napolitani, Martino; Pigorini, Andrea; G. Casali, Adenauer; Trimarchi, Pietro D.; Boly, Melanie; Gosseries, Olivia; Bodart, Olivier; Curto, Francesco; Landi, Cristina; Mariotti, Maurizio; Devalle, Guya; Laureys, Steven; Tononi, Giulio

2016-01-01

Objective Validating objective, brain‐based indices of consciousness in behaviorally unresponsive patients represents a challenge due to the impossibility of obtaining independent evidence through subjective reports. Here we address this problem by first validating a promising metric of consciousness—the Perturbational Complexity Index (PCI)—in a benchmark population who could confirm the presence or absence of consciousness through subjective reports, and then applying the same index to patients with disorders of consciousness (DOCs). Methods The benchmark population encompassed 150 healthy controls and communicative brain‐injured subjects in various states of conscious wakefulness, disconnected consciousness, and unconsciousness. Receiver operating characteristic curve analysis was performed to define an optimal cutoff for discriminating between the conscious and unconscious conditions. This cutoff was then applied to a cohort of noncommunicative DOC patients (38 in a minimally conscious state [MCS] and 43 in a vegetative state [VS]). Results We found an empirical cutoff that discriminated with 100% sensitivity and specificity between the conscious and the unconscious conditions in the benchmark population. This cutoff resulted in a sensitivity of 94.7% in detecting MCS and allowed the identification of a number of unresponsive VS patients (9 of 43) with high values of PCI, overlapping with the distribution of the benchmark conscious condition. Interpretation Given its high sensitivity and specificity in the benchmark and MCS population, PCI offers a reliable, independently validated stratification of unresponsive patients that has important physiopathological and therapeutic implications. In particular, the high‐PCI subgroup of VS patients may retain a capacity for consciousness that is not expressed in behavior. Ann Neurol 2016;80:718–729 PMID:27717082
ICSBEP Benchmarks For Nuclear Data Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Briggs, J. Blair

2005-05-24

The International Criticality Safety Benchmark Evaluation Project (ICSBEP) was initiated in 1992 by the United States Department of Energy. The ICSBEP became an official activity of the Organization for Economic Cooperation and Development (OECD) -- Nuclear Energy Agency (NEA) in 1995. Representatives from the United States, United Kingdom, France, Japan, the Russian Federation, Hungary, Republic of Korea, Slovenia, Serbia and Montenegro (formerly Yugoslavia), Kazakhstan, Spain, Israel, Brazil, Poland, and the Czech Republic are now participating. South Africa, India, China, and Germany are considering participation. The purpose of the ICSBEP is to identify, evaluate, verify, and formally document a comprehensive andmore » internationally peer-reviewed set of criticality safety benchmark data. The work of the ICSBEP is published as an OECD handbook entitled ''International Handbook of Evaluated Criticality Safety Benchmark Experiments.'' The 2004 Edition of the Handbook contains benchmark specifications for 3331 critical or subcritical configurations that are intended for use in validation efforts and for testing basic nuclear data. New to the 2004 Edition of the Handbook is a draft criticality alarm / shielding type benchmark that should be finalized in 2005 along with two other similar benchmarks. The Handbook is being used extensively for nuclear data testing and is expected to be a valuable resource for code and data validation and improvement efforts for decades to come. Specific benchmarks that are useful for testing structural materials such as iron, chromium, nickel, and manganese; beryllium; lead; thorium; and 238U are highlighted.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Frayce, D.; Khayat, R.E.; Derdouri, A.

The dual reciprocity boundary element method (DRBEM) is implemented to solve three-dimensional transient heat conduction problems in the presence of arbitrary sources, typically as these problems arise in materials processing. The DRBEM has a major advantage over conventional BEM, since it avoids the computation of volume integrals. These integrals stem from transient, nonlinear, and/or source terms. Thus there is no need to discretize the inner domain, since only a number of internal points are needed for the computation. The validity of the method is assessed upon comparison with results from benchmark problems where analytical solutions exist. There is generally goodmore » agreement. Comparison against finite element results is also favorable. Calculations are carried out in order to assess the influence of the number and location of internal nodes. The influence of the ratio of the numbers of internal to boundary nodes is also examined.« less
GEN-IV Benchmarking of Triso Fuel Performance Models under accident conditions modeling input data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collin, Blaise Paul

This document presents the benchmark plan for the calculation of particle fuel performance on safety testing experiments that are representative of operational accidental transients. The benchmark is dedicated to the modeling of fission product release under accident conditions by fuel performance codes from around the world, and the subsequent comparison to post-irradiation experiment (PIE) data from the modeled heating tests. The accident condition benchmark is divided into three parts: • The modeling of a simplified benchmark problem to assess potential numerical calculation issues at low fission product release. • The modeling of the AGR-1 and HFR-EU1bis safety testing experiments. •more » The comparison of the AGR-1 and HFR-EU1bis modeling results with PIE data. The simplified benchmark case, thereafter named NCC (Numerical Calculation Case), is derived from “Case 5” of the International Atomic Energy Agency (IAEA) Coordinated Research Program (CRP) on coated particle fuel technology [IAEA 2012]. It is included so participants can evaluate their codes at low fission product release. “Case 5” of the IAEA CRP-6 showed large code-to-code discrepancies in the release of fission products, which were attributed to “effects of the numerical calculation method rather than the physical model” [IAEA 2012]. The NCC is therefore intended to check if these numerical effects subsist. The first two steps imply the involvement of the benchmark participants with a modeling effort following the guidelines and recommendations provided by this document. The third step involves the collection of the modeling results by Idaho National Laboratory (INL) and the comparison of these results with the available PIE data. The objective of this document is to provide all necessary input data to model the benchmark cases, and to give some methodology guidelines and recommendations in order to make all results suitable for comparison with each other. The participants should read this document thoroughly to make sure all the data needed for their calculations is provided in the document. Missing data will be added to a revision of the document if necessary. 09/2016: Tables 6 and 8 updated. AGR-2 input data added« less
GraphBench

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sukumar, Sreenivas R.; Hong, Seokyong; Lee, Sangkeun

2016-06-01

GraphBench is a benchmark suite for graph pattern mining and graph analysis systems. The benchmark suite is a significant addition to conducting apples-apples comparison of graph analysis software (databases, in-memory tools, triple stores, etc.)
Parallelization of Unsteady Adaptive Mesh Refinement for Unstructured Navier-Stokes Solvers

NASA Technical Reports Server (NTRS)

Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.

2014-01-01

This paper explores the implementation of the MPI parallelization in a Navier-Stokes solver using adaptive mesh re nement. Viscous and inviscid test problems are considered for the purpose of benchmarking, as are implicit and explicit time advancement methods. The main test problem for comparison includes e ects from boundary layers and other viscous features and requires a large number of grid points for accurate computation. Ex- perimental validation against double cone experiments in hypersonic ow are shown. The adaptive mesh re nement shows promise for a staple test problem in the hypersonic com- munity. Extension to more advanced techniques for more complicated ows is described.
Experimental and computational surface and flow-field results for an all-body hypersonic aircraft

NASA Technical Reports Server (NTRS)

Lockman, William K.; Lawrence, Scott L.; Cleary, Joseph W.

1990-01-01

The objective of the present investigation is to establish a benchmark experimental data base for a generic hypersonic vehicle shape for validation and/or calibration of advanced computational fluid dynamics computer codes. This paper includes results from the comprehensive test program conducted in the NASA/Ames 3.5-foot Hypersonic Wind Tunnel for a generic all-body hypersonic aircraft model. Experimental and computational results on flow visualization, surface pressures, surface convective heat transfer, and pitot-pressure flow-field surveys are presented. Comparisons of the experimental results with computational results from an upwind parabolized Navier-Stokes code developed at Ames demonstrate the capabilities of this code.
Benchmarking gate-based quantum computers

NASA Astrophysics Data System (ADS)

Michielsen, Kristel; Nocon, Madita; Willsch, Dennis; Jin, Fengping; Lippert, Thomas; De Raedt, Hans

2017-11-01

With the advent of public access to small gate-based quantum processors, it becomes necessary to develop a benchmarking methodology such that independent researchers can validate the operation of these processors. We explore the usefulness of a number of simple quantum circuits as benchmarks for gate-based quantum computing devices and show that circuits performing identity operations are very simple, scalable and sensitive to gate errors and are therefore very well suited for this task. We illustrate the procedure by presenting benchmark results for the IBM Quantum Experience, a cloud-based platform for gate-based quantum computing.
Definitive Characterization of CA 19-9 in Resectable Pancreatic Cancer Using a Reference Set of Serum and Plasma Specimens.

PubMed

Haab, Brian B; Huang, Ying; Balasenthil, Seetharaman; Partyka, Katie; Tang, Huiyuan; Anderson, Michelle; Allen, Peter; Sasson, Aaron; Zeh, Herbert; Kaul, Karen; Kletter, Doron; Ge, Shaokui; Bern, Marshall; Kwon, Richard; Blasutig, Ivan; Srivastava, Sudhir; Frazier, Marsha L; Sen, Subrata; Hollingsworth, Michael A; Rinaudo, Jo Ann; Killary, Ann M; Brand, Randall E

2015-01-01

The validation of candidate biomarkers often is hampered by the lack of a reliable means of assessing and comparing performance. We present here a reference set of serum and plasma samples to facilitate the validation of biomarkers for resectable pancreatic cancer. The reference set includes a large cohort of stage I-II pancreatic cancer patients, recruited from 5 different institutions, and relevant control groups. We characterized the performance of the current best serological biomarker for pancreatic cancer, CA 19-9, using plasma samples from the reference set to provide a benchmark for future biomarker studies and to further our knowledge of CA 19-9 in early-stage pancreatic cancer and the control groups. CA 19-9 distinguished pancreatic cancers from the healthy and chronic pancreatitis groups with an average sensitivity and specificity of 70-74%, similar to previous studies using all stages of pancreatic cancer. Chronic pancreatitis patients did not show CA 19-9 elevations, but patients with benign biliary obstruction had elevations nearly as high as the cancer patients. We gained additional information about the biomarker by comparing two distinct assays. The two CA 9-9 assays agreed well in overall performance but diverged in measurements of individual samples, potentially due to subtle differences in antibody specificity as revealed by glycan array analysis. Thus, the reference set promises be a valuable resource for biomarker validation and comparison, and the CA 19-9 data presented here will be useful for benchmarking and for exploring relationships to CA 19-9.
Definitive Characterization of CA 19-9 in Resectable Pancreatic Cancer Using a Reference Set of Serum and Plasma Specimens

PubMed Central

Haab, Brian B.; Huang, Ying; Balasenthil, Seetharaman; Partyka, Katie; Tang, Huiyuan; Anderson, Michelle; Allen, Peter; Sasson, Aaron; Zeh, Herbert; Kaul, Karen; Kletter, Doron; Ge, Shaokui; Bern, Marshall; Kwon, Richard; Blasutig, Ivan; Srivastava, Sudhir; Frazier, Marsha L.; Sen, Subrata; Hollingsworth, Michael A.; Rinaudo, Jo Ann; Killary, Ann M.; Brand, Randall E.

2015-01-01

The validation of candidate biomarkers often is hampered by the lack of a reliable means of assessing and comparing performance. We present here a reference set of serum and plasma samples to facilitate the validation of biomarkers for resectable pancreatic cancer. The reference set includes a large cohort of stage I-II pancreatic cancer patients, recruited from 5 different institutions, and relevant control groups. We characterized the performance of the current best serological biomarker for pancreatic cancer, CA 19–9, using plasma samples from the reference set to provide a benchmark for future biomarker studies and to further our knowledge of CA 19–9 in early-stage pancreatic cancer and the control groups. CA 19–9 distinguished pancreatic cancers from the healthy and chronic pancreatitis groups with an average sensitivity and specificity of 70–74%, similar to previous studies using all stages of pancreatic cancer. Chronic pancreatitis patients did not show CA 19–9 elevations, but patients with benign biliary obstruction had elevations nearly as high as the cancer patients. We gained additional information about the biomarker by comparing two distinct assays. The two CA 9–9 assays agreed well in overall performance but diverged in measurements of individual samples, potentially due to subtle differences in antibody specificity as revealed by glycan array analysis. Thus, the reference set promises be a valuable resource for biomarker validation and comparison, and the CA 19–9 data presented here will be useful for benchmarking and for exploring relationships to CA 19–9. PMID:26431551
FY2012 summary of tasks completed on PROTEUS-thermal work.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, C.H.; Smith, M.A.

2012-06-06

PROTEUS is a suite of the neutronics codes, both old and new, that can be used within the SHARP codes being developed under the NEAMS program. Discussion here is focused on updates and verification and validation activities of the SHARP neutronics code, DeCART, for application to thermal reactor analysis. As part of the development of SHARP tools, the different versions of the DeCART code created for PWR, BWR, and VHTR analysis were integrated. Verification and validation tests for the integrated version were started, and the generation of cross section libraries based on the subgroup method was revisited for the targetedmore » reactor types. The DeCART code has been reorganized in preparation for an efficient integration of the different versions for PWR, BWR, and VHTR analysis. In DeCART, the old-fashioned common blocks and header files have been replaced by advanced memory structures. However, the changing of variable names was minimized in order to limit problems with the code integration. Since the remaining stability problems of DeCART were mostly caused by the CMFD methodology and modules, significant work was performed to determine whether they could be replaced by more stable methods and routines. The cross section library is a key element to obtain accurate solutions. Thus, the procedure for generating cross section libraries was revisited to provide libraries tailored for the targeted reactor types. To improve accuracy in the cross section library, an attempt was made to replace the CENTRM code by the MCNP Monte Carlo code as a tool obtaining reference resonance integrals. The use of the Monte Carlo code allows us to minimize problems or approximations that CENTRM introduces since the accuracy of the subgroup data is limited by that of the reference solutions. The use of MCNP requires an additional set of libraries without resonance cross sections so that reference calculations can be performed for a unit cell in which only one isotope of interest includes resonance cross sections, among the isotopes in the composition. The OECD MHTGR-350 benchmark core was simulated using DeCART as initial focus of the verification/validation efforts. Among the benchmark problems, Exercise 1 of Phase 1 is a steady-state benchmark case for the neutronics calculation for which block-wise cross sections were provided in 26 energy groups. This type of problem was designed for a homogenized geometry solver like DIF3D rather than the high-fidelity code DeCART. Instead of the homogenized block cross sections given in the benchmark, the VHTR-specific 238-group ENDF/B-VII.0 library of DeCART was directly used for preliminary calculations. Initial results showed that the multiplication factors of a fuel pin and a fuel block with or without a control rod hole were off by 6, -362, and -183 pcm Dk from comparable MCNP solutions, respectively. The 2-D and 3-D one-third core calculations were also conducted for the all-rods-out (ARO) and all-rods-in (ARI) configurations, producing reasonable results. Figure 1 illustrates the intermediate (1.5 eV - 17 keV) and thermal (below 1.5 eV) group flux distributions. As seen from VHTR cores with annular fuels, the intermediate group fluxes are relatively high in the fuel region, but the thermal group fluxes are higher in the inner and outer graphite reflector regions than in the fuel region. To support the current project, a new three-year I-NERI collaboration involving ANL and KAERI was started in November 2011, focused on performing in-depth verification and validation of high-fidelity multi-physics simulation codes for LWR and VHTR. The work scope includes generating improved cross section libraries for the targeted reactor types, developing benchmark models for verification and validation of the neutronics code with or without thermo-fluid feedback, and performing detailed comparisons of predicted reactor parameters against both Monte Carlo solutions and experimental measurements. The following list summarizes the work conducted so far for PROTEUS-Thermal Tasks: Unification of different versions of DeCART was initiated, and at the same time code modernization was conducted to make code unification efficient; (2) Regeneration of cross section libraries was attempted for the targeted reactor types, and the procedure for generating cross section libraries was updated by replacing CENTRM with MCNP for reference resonance integrals; (3) The MHTGR-350 benchmark core was simulated using DeCART with VHTR-specific 238-group ENDF/B-VII.0 library, and MCNP calculations were performed for comparison; and (4) Benchmark problems for PWR and BWR analysis were prepared for the DeCART verification/validation effort. In the coming months, the work listed above will be completed. Cross section libraries will be generated with optimized group structures for specific reactor types.« less
A chemical EOR benchmark study of different reservoir simulators

NASA Astrophysics Data System (ADS)

Goudarzi, Ali; Delshad, Mojdeh; Sepehrnoori, Kamy

2016-09-01

Interest in chemical EOR processes has intensified in recent years due to the advancements in chemical formulations and injection techniques. Injecting Polymer (P), surfactant/polymer (SP), and alkaline/surfactant/polymer (ASP) are techniques for improving sweep and displacement efficiencies with the aim of improving oil production in both secondary and tertiary floods. There has been great interest in chemical flooding recently for different challenging situations. These include high temperature reservoirs, formations with extreme salinity and hardness, naturally fractured carbonates, and sandstone reservoirs with heavy and viscous crude oils. More oil reservoirs are reaching maturity where secondary polymer floods and tertiary surfactant methods have become increasingly important. This significance has added to the industry's interest in using reservoir simulators as tools for reservoir evaluation and management to minimize costs and increase the process efficiency. Reservoir simulators with special features are needed to represent coupled chemical and physical processes present in chemical EOR processes. The simulators need to be first validated against well controlled lab and pilot scale experiments to reliably predict the full field implementations. The available data from laboratory scale include 1) phase behavior and rheological data; and 2) results of secondary and tertiary coreflood experiments for P, SP, and ASP floods under reservoir conditions, i.e. chemical retentions, pressure drop, and oil recovery. Data collected from corefloods are used as benchmark tests comparing numerical reservoir simulators with chemical EOR modeling capabilities such as STARS of CMG, ECLIPSE-100 of Schlumberger, REVEAL of Petroleum Experts. The research UTCHEM simulator from The University of Texas at Austin is also included since it has been the benchmark for chemical flooding simulation for over 25 years. The results of this benchmark comparison will be utilized to improve chemical design for field-scale studies using commercial simulators. The benchmark tests illustrate the potential of commercial simulators for chemical flooding projects and provide a comprehensive table of strengths and limitations of each simulator for a given chemical EOR process. Mechanistic simulations of chemical EOR processes will provide predictive capability and can aid in optimization of the field injection projects. The objective of this paper is not to compare the computational efficiency and solution algorithms; it only focuses on the process modeling comparison.
BIGHORN Computational Fluid Dynamics Theory, Methodology, and Code Verification & Validation Benchmark Problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xia, Yidong; Andrs, David; Martineau, Richard Charles

This document presents the theoretical background for a hybrid finite-element / finite-volume fluid flow solver, namely BIGHORN, based on the Multiphysics Object Oriented Simulation Environment (MOOSE) computational framework developed at the Idaho National Laboratory (INL). An overview of the numerical methods used in BIGHORN are discussed and followed by a presentation of the formulation details. The document begins with the governing equations for the compressible fluid flow, with an outline of the requisite constitutive relations. A second-order finite volume method used for solving the compressible fluid flow problems is presented next. A Pressure-Corrected Implicit Continuous-fluid Eulerian (PCICE) formulation for timemore » integration is also presented. The multi-fluid formulation is being developed. Although multi-fluid is not fully-developed, BIGHORN has been designed to handle multi-fluid problems. Due to the flexibility in the underlying MOOSE framework, BIGHORN is quite extensible, and can accommodate both multi-species and multi-phase formulations. This document also presents a suite of verification & validation benchmark test problems for BIGHORN. The intent for this suite of problems is to provide baseline comparison data that demonstrates the performance of the BIGHORN solution methods on problems that vary in complexity from laminar to turbulent flows. Wherever possible, some form of solution verification has been attempted to identify sensitivities in the solution methods, and suggest best practices when using BIGHORN.« less
7 CFR 25.404 - Validation of designation.

Code of Federal Regulations, 2010 CFR

2010-01-01

... maintain a process for ensuring ongoing broad-based participation by community residents consistent with the approved application and planning process outlined in the strategic plan. (1) Continuous... benchmarks, the process it will use for reviewing goals and benchmarks and revising its strategic plan. (2...
Selecting Students for Pre-Algebra: Examination of the Relative Utility of the Anchorage Pre-Algebra Screening Tests and the State of Alaska Standards Based Benchmark 2 Mathematics Study. An Examination of Consequential Validity and Recommendation.

ERIC Educational Resources Information Center

Fenton, Ray

This study examined the relative efficacy of the Anchorage (Alaska) Pre-Algebra Test and the State of Alaska Benchmark in 2 Math examination as tools used in the process of recommending grade 6 students for grade 7 Pre-Algebra placement. The consequential validity of the tests is explored in the context of class placements and grades earned. The…
Validation of Shielding Analysis Capability of SuperMC with SINBAD

NASA Astrophysics Data System (ADS)

Chen, Chaobin; Yang, Qi; Wu, Bin; Han, Yuncheng; Song, Jing

2017-09-01

Abstract: The shielding analysis capability of SuperMC was validated with the Shielding Integral Benchmark Archive Database (SINBAD). The SINBAD was compiled by RSICC and NEA, it includes numerous benchmark experiments performed with the D-T fusion neutron source facilities of OKTAVIAN, FNS, IPPE, etc. The results from SuperMC simulation were compared with experimental data and MCNP results. Very good agreement with deviation lower than 1% was achieved and it suggests that SuperMC is reliable in shielding calculation.

Robust validation of approximate 1-matrix functionals with few-electron harmonium atoms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cioslowski, Jerzy, E-mail: jerzy@wmf.univ.szczecin.pl; Piris, Mario; Matito, Eduard

2015-12-07

A simple comparison between the exact and approximate correlation components U of the electron-electron repulsion energy of several states of few-electron harmonium atoms with varying confinement strengths provides a stringent validation tool for 1-matrix functionals. The robustness of this tool is clearly demonstrated in a survey of 14 known functionals, which reveals their substandard performance within different electron correlation regimes. Unlike spot-testing that employs dissociation curves of diatomic molecules or more extensive benchmarking against experimental atomization energies of molecules comprising some standard set, the present approach not only uncovers the flaws and patent failures of the functionals but, even moremore » importantly, also allows for pinpointing their root causes. Since the approximate values of U are computed at exact 1-densities, the testing requires minimal programming and thus is particularly suitable for rapid screening of new functionals.« less
In Search of a Time Efficient Approach to Crack and Delamination Growth Predictions in Composites

NASA Technical Reports Server (NTRS)

Krueger, Ronald; Carvalho, Nelson

2016-01-01

Analysis benchmarking was used to assess the accuracy and time efficiency of algorithms suitable for automated delamination growth analysis. First, the Floating Node Method (FNM) was introduced and its combination with a simple exponential growth law (Paris Law) and Virtual Crack Closure technique (VCCT) was discussed. Implementation of the method into a user element (UEL) in Abaqus/Standard(Registered TradeMark) was also presented. For the assessment of growth prediction capabilities, an existing benchmark case based on the Double Cantilever Beam (DCB) specimen was briefly summarized. Additionally, the development of new benchmark cases based on the Mixed-Mode Bending (MMB) specimen to assess the growth prediction capabilities under mixed-mode I/II conditions was discussed in detail. A comparison was presented, in which the benchmark cases were used to assess the existing low-cycle fatigue analysis tool in Abaqus/Standard(Registered TradeMark) in comparison to the FNM-VCCT fatigue growth analysis implementation. The low-cycle fatigue analysis tool in Abaqus/Standard(Registered TradeMark) was able to yield results that were in good agreement with the DCB benchmark example. Results for the MMB benchmark cases, however, only captured the trend correctly. The user element (FNM-VCCT) always yielded results that were in excellent agreement with all benchmark cases, at a fraction of the analysis time. The ability to assess the implementation of two methods in one finite element code illustrated the value of establishing benchmark solutions.
Benchmarking a geostatistical procedure for the homogenisation of annual precipitation series

NASA Astrophysics Data System (ADS)

Caineta, Júlio; Ribeiro, Sara; Henriques, Roberto; Soares, Amílcar; Costa, Ana Cristina

2014-05-01

The European project COST Action ES0601, Advances in homogenisation methods of climate series: an integrated approach (HOME), has brought to attention the importance of establishing reliable homogenisation methods for climate data. In order to achieve that, a benchmark data set, containing monthly and daily temperature and precipitation data, was created to be used as a comparison basis for the effectiveness of those methods. Several contributions were submitted and evaluated by a number of performance metrics, validating the results against realistic inhomogeneous data. HOME also led to the development of new homogenisation software packages, which included feedback and lessons learned during the project. Preliminary studies have suggested a geostatistical stochastic approach, which uses Direct Sequential Simulation (DSS), as a promising methodology for the homogenisation of precipitation data series. Based on the spatial and temporal correlation between the neighbouring stations, DSS calculates local probability density functions at a candidate station to detect inhomogeneities. The purpose of the current study is to test and compare this geostatistical approach with the methods previously presented in the HOME project, using surrogate precipitation series from the HOME benchmark data set. The benchmark data set contains monthly precipitation surrogate series, from which annual precipitation data series were derived. These annual precipitation series were subject to exploratory analysis and to a thorough variography study. The geostatistical approach was then applied to the data set, based on different scenarios for the spatial continuity. Implementing this procedure also promoted the development of a computer program that aims to assist on the homogenisation of climate data, while minimising user interaction. Finally, in order to compare the effectiveness of this methodology with the homogenisation methods submitted during the HOME project, the obtained results were evaluated using the same performance metrics. This comparison opens new perspectives for the development of an innovative procedure based on the geostatistical stochastic approach. Acknowledgements: The authors gratefully acknowledge the financial support of "Fundação para a Ciência e Tecnologia" (FCT), Portugal, through the research project PTDC/GEO-MET/4026/2012 ("GSIMCLI - Geostatistical simulation with local distributions for the homogenization and interpolation of climate data").
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lopez, Jesse E.; Baptista, António M.

A sediment model coupled to the hydrodynamic model SELFE is validated against a benchmark combining a set of idealized tests and an application to a field-data rich energetic estuary. After sensitivity studies, model results for the idealized tests largely agree with previously reported results from other models in addition to analytical, semi-analytical, or laboratory results. Results of suspended sediment in an open channel test with fixed bottom are sensitive to turbulence closure and treatment for hydrodynamic bottom boundary. Results for the migration of a trench are very sensitive to critical stress and erosion rate, but largely insensitive to turbulence closure.more » The model is able to qualitatively represent sediment dynamics associated with estuarine turbidity maxima in an idealized estuary. Applied to the Columbia River estuary, the model qualitatively captures sediment dynamics observed by fixed stations and shipborne profiles. Representation of the vertical structure of suspended sediment degrades when stratification is underpredicted. Across all tests, skill metrics of suspended sediments lag those of hydrodynamics even when qualitatively representing dynamics. The benchmark is fully documented in an openly available repository to encourage unambiguous comparisons against other models.« less
Resilience landscapes for Congo basin rainforests vs. climate and management impacts

NASA Astrophysics Data System (ADS)

Pietsch, Stephan Alexander; Gautam, Sishir; Elias Bednar, Johannes; Stanzl, Patrick; Mosnier, Aline; Obersteiner, Michael

2015-04-01

Past climate change caused severe disturbances of the Central African rainforest belt, with forest fragmentation and re-expansion due to drier and wetter climate conditions. Besides climate, human induced forest degradation affected biodiversity, structure and carbon storage of Congo basin rainforests. Information on climatically stable, mature rainforest, unaffected by human induced disturbances, provides means of assessing the impact of forest degradation and may serve as benchmarks of carbon carrying capacity over regions with similar site and climate conditions. BioGeoChemical (BGC) ecosystem models explicitly consider the impacts of site and climate conditions and may assess benchmark levels over regions devoid of undisturbed conditions. We will present a BGC-model validation for the Western Congolian Lowland Rainforest (WCLRF) using field data from a recently confirmed forest refuge, show model - data comparisons for disturbed und undisturbed forests under different site and climate conditions as well as for sites with repeated assessment of biodiversity and standing biomass during recovery from intensive exploitation. We will present climatic thresholds for WCLRF stability, and construct resilience landscapes for current day conditions vs. climate and management impacts.
Hysteresis in the Central African Rainforest

NASA Astrophysics Data System (ADS)

Pietsch, Stephan Alexander; Elias Bednar, Johannes; Gautam, Sishir; Petritsch, Richard; Schier, Franziska; Stanzl, Patrick

2014-05-01

Past climate change caused severe disturbances of the Central African rainforest belt, with forest fragmentation and re-expansion due to drier and wetter climate conditions. Besides climate, human induced forest degradation affected biodiversity, structure and carbon storage of Congo basin rainforests. Information on climatically stable, mature rainforest, unaffected by human induced disturbances, provides means of assessing the impact of forest degradation and may serve as benchmarks of carbon carrying capacity over regions with similar site and climate conditions. BioGeoChemical (BGC) ecosystem models explicitly consider the impacts of site and climate conditions and may assess benchmark levels over regions devoid of undisturbed conditions. We will present a BGC-model validation for the Western Congolian Lowland Rainforest (WCLRF) using field data from a recently confirmed forest refuge, show model - data comparisons for disturbed und undisturbed forests under different site and climate conditions as well as for sites with repeated assessment of biodiversity and standing biomass during recovery from intensive exploitation. We will present climatic thresholds for WCLRF stability, analyse the relationship between resilience, standing C-stocks and change in climate and finally provide evidence of hysteresis.
Big heart data: advancing health informatics through data sharing in cardiovascular imaging.

PubMed

Suinesiaputra, Avan; Medrano-Gracia, Pau; Cowan, Brett R; Young, Alistair A

2015-07-01

The burden of heart disease is rapidly worsening due to the increasing prevalence of obesity and diabetes. Data sharing and open database resources for heart health informatics are important for advancing our understanding of cardiovascular function, disease progression and therapeutics. Data sharing enables valuable information, often obtained at considerable expense and effort, to be reused beyond the specific objectives of the original study. Many government funding agencies and journal publishers are requiring data reuse, and are providing mechanisms for data curation and archival. Tools and infrastructure are available to archive anonymous data from a wide range of studies, from descriptive epidemiological data to gigabytes of imaging data. Meta-analyses can be performed to combine raw data from disparate studies to obtain unique comparisons or to enhance statistical power. Open benchmark datasets are invaluable for validating data analysis algorithms and objectively comparing results. This review provides a rationale for increased data sharing and surveys recent progress in the cardiovascular domain. We also highlight the potential of recent large cardiovascular epidemiological studies enabling collaborative efforts to facilitate data sharing, algorithms benchmarking, disease modeling and statistical atlases.
Comparison of deterministic and stochastic methods for time-dependent Wigner simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao, Sihong, E-mail: sihong@math.pku.edu.cn; Sellier, Jean Michel, E-mail: jeanmichel.sellier@parallel.bas.bg

2015-11-01

Recently a Monte Carlo method based on signed particles for time-dependent simulations of the Wigner equation has been proposed. While it has been thoroughly validated against physical benchmarks, no technical study about its numerical accuracy has been performed. To this end, this paper presents the first step towards the construction of firm mathematical foundations for the signed particle Wigner Monte Carlo method. An initial investigation is performed by means of comparisons with a cell average spectral element method, which is a highly accurate deterministic method and utilized to provide reference solutions. Several different numerical tests involving the time-dependent evolution ofmore » a quantum wave-packet are performed and discussed in deep details. In particular, this allows us to depict a set of crucial criteria for the signed particle Wigner Monte Carlo method to achieve a satisfactory accuracy.« less
Review and standardization of cell phone exposure calculations using the SAM phantom and anatomically correct head models.

PubMed

Beard, Brian B; Kainz, Wolfgang

2004-10-13

We reviewed articles using computational RF dosimetry to compare the Specific Anthropomorphic Mannequin (SAM) to anatomically correct models of the human head. Published conclusions based on such comparisons have varied widely. We looked for reasons that might cause apparently similar comparisons to produce dissimilar results. We also looked at the information needed to adequately compare the results of computational RF dosimetry studies. We concluded studies were not comparable because of differences in definitions, models, and methodology. Therefore we propose a protocol, developed by an IEEE standards group, as an initial step in alleviating this problem. The protocol calls for a benchmark validation study comparing the SAM phantom to two anatomically correct models of the human head. It also establishes common definitions and reporting requirements that will increase the comparability of all computational RF dosimetry studies of the human head.
Review and standardization of cell phone exposure calculations using the SAM phantom and anatomically correct head models

PubMed Central

Beard, Brian B; Kainz, Wolfgang

2004-01-01

We reviewed articles using computational RF dosimetry to compare the Specific Anthropomorphic Mannequin (SAM) to anatomically correct models of the human head. Published conclusions based on such comparisons have varied widely. We looked for reasons that might cause apparently similar comparisons to produce dissimilar results. We also looked at the information needed to adequately compare the results of computational RF dosimetry studies. We concluded studies were not comparable because of differences in definitions, models, and methodology. Therefore we propose a protocol, developed by an IEEE standards group, as an initial step in alleviating this problem. The protocol calls for a benchmark validation study comparing the SAM phantom to two anatomically correct models of the human head. It also establishes common definitions and reporting requirements that will increase the comparability of all computational RF dosimetry studies of the human head. PMID:15482601
Summary of comparison and analysis of results from exercises 1 and 2 of the OECD PBMR coupled neutronics/thermal hydraulics transient benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mkhabela, P.; Han, J.; Tyobeka, B.

2006-07-01

The Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) has accepted, through the Nuclear Science Committee (NSC), the inclusion of the Pebble-Bed Modular Reactor 400 MW design (PBMR-400) coupled neutronics/thermal hydraulics transient benchmark problem as part of their official activities. The scope of the benchmark is to establish a well-defined problem, based on a common given library of cross sections, to compare methods and tools in core simulation and thermal hydraulics analysis with a specific focus on transient events through a set of multi-dimensional computational test problems. The benchmark includes three steady state exercises andmore » six transient exercises. This paper describes the first two steady state exercises, their objectives and the international participation in terms of organization, country and computer code utilized. This description is followed by a comparison and analysis of the participants' results submitted for these two exercises. The comparison of results from different codes allows for an assessment of the sensitivity of a result to the method employed and can thus help to focus the development efforts on the most critical areas. The two first exercises also allow for removing of user-related modeling errors and prepare core neutronics and thermal-hydraulics models of the different codes for the rest of the exercises in the benchmark. (authors)« less
Benchmarking neuromorphic vision: lessons learnt from computer vision

PubMed Central

Tan, Cheston; Lallee, Stephane; Orchard, Garrick

2015-01-01

Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision. PMID:26528120
Benthic invertebrates of benchmark streams in agricultural areas of eastern Wisconsin, Western Lake Michigan Drainages

USGS Publications Warehouse

Rheaume, S.J.; Lenz, B.N.; Scudder, B.C.

1996-01-01

Information gathered from these benchmark streams can be used as a regional reference for comparison with other streams in agricultural areas, based on communities of aquatic biota, habitat, and water quality.
Using relative survival measures for cross-sectional and longitudinal benchmarks of countries, states, and districts: the BenchRelSurv- and BenchRelSurvPlot-macros

PubMed Central

2013-01-01

Background The objective of screening programs is to discover life threatening diseases in as many patients as early as possible and to increase the chance of survival. To be able to compare aspects of health care quality, methods are needed for benchmarking that allow comparisons on various health care levels (regional, national, and international). Objectives Applications and extensions of algorithms can be used to link the information on disease phases with relative survival rates and to consolidate them in composite measures. The application of the developed SAS-macros will give results for benchmarking of health care quality. Data examples for breast cancer care are given. Methods A reference scale (expected, E) must be defined at a time point at which all benchmark objects (observed, O) are measured. All indices are defined as O/E, whereby the extended standardized screening-index (eSSI), the standardized case-mix-index (SCI), the work-up-index (SWI), and the treatment-index (STI) address different health care aspects. The composite measures called overall-performance evaluation (OPE) and relative overall performance indices (ROPI) link the individual indices differently for cross-sectional or longitudinal analyses. Results Algorithms allow a time point and a time interval associated comparison of the benchmark objects in the indices eSSI, SCI, SWI, STI, OPE, and ROPI. Comparisons between countries, states and districts are possible. Exemplarily comparisons between two countries are made. The success of early detection and screening programs as well as clinical health care quality for breast cancer can be demonstrated while the population’s background mortality is concerned. Conclusions If external quality assurance programs and benchmark objects are based on population-based and corresponding demographic data, information of disease phase and relative survival rates can be combined to indices which offer approaches for comparative analyses between benchmark objects. Conclusions on screening programs and health care quality are possible. The macros can be transferred to other diseases if a disease-specific phase scale of prognostic value (e.g. stage) exists. PMID:23316692
ELECTRA © Launch and Re-Entry Safety Analysis Tool

NASA Astrophysics Data System (ADS)

Lazare, B.; Arnal, M. H.; Aussilhou, C.; Blazquez, A.; Chemama, F.

2010-09-01

French Space Operation Act gives as prime objective to National Technical Regulations to protect people, properties, public health and environment. In this frame, an independent technical assessment of French space operation is delegated to CNES. To perform this task and also for his owns operations CNES needs efficient state-of-the-art tools for evaluating risks. The development of the ELECTRA© tool, undertaken in 2007, meets the requirement for precise quantification of the risks involved in launching and re-entry of spacecraft. The ELECTRA© project draws on the proven expertise of CNES technical centers in the field of flight analysis and safety, spaceflight dynamics and the design of spacecraft. The ELECTRA© tool was specifically designed to evaluate the risks involved in the re-entry and return to Earth of all or part of a spacecraft. It will also be used for locating and visualizing nominal or accidental re-entry zones while comparing them with suitable geographic data such as population density, urban areas, and shipping lines, among others. The method chosen for ELECTRA© consists of two main steps: calculating the possible reentry trajectories for each fragment after the spacecraft breaks up; calculating the risks while taking into account the energy of the fragments, the population density and protection afforded by buildings. For launch operations and active re-entry, the risk calculation will be weighted by the probability of instantaneous failure of the spacecraft and integrated for the whole trajectory. ELECTRA©’s development is today at the end of the validation phase, last step before delivery to users. Validation process has been performed in different ways: numerical application way for the risk formulation; benchmarking process for casualty area, level of energy of the fragments entries and level of protection housing module; best practices in space transportation industries concerning dependability evaluation; benchmarking process for world population repartition leading to the choice of a worldwide used model called GPW V3. Then, the complementary part for validation has been numerous system tests, most of them by comparison with already existing tools, operationally used for example into the European Space port in French Guyana. The purpose of this article is to review the method and models chosen by CNES for describing physical phenomena and the results of validation process including comparison with other risk assessment tools.
Construct Validity of Fresh Frozen Human Cadaver as a Training Model in Minimal Access Surgery

PubMed Central

Macafee, David; Pranesh, Nagarajan; Horgan, Alan F.

2012-01-01

Background: The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance. Methods: Junior surgical trainees, novices (<3 laparoscopic procedure performed) in laparoscopic surgery, performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts score was used to define a benchmark. Mann-Whitney U test was used for construct validity analysis and 1-sample t test to compare performances of the novice group with the benchmark safe score. Results: Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition. Conclusion: Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions. PMID:23318058
Seismo-acoustic ray model benchmarking against experimental tank data.

PubMed

Camargo Rodríguez, Orlando; Collis, Jon M; Simpson, Harry J; Ey, Emanuel; Schneiderwind, Joseph; Felisberto, Paulo

2012-08-01

Acoustic predictions of the recently developed traceo ray model, which accounts for bottom shear properties, are benchmarked against tank experimental data from the EPEE-1 and EPEE-2 (Elastic Parabolic Equation Experiment) experiments. Both experiments are representative of signal propagation in a Pekeris-like shallow-water waveguide over a non-flat isotropic elastic bottom, where significant interaction of the signal with the bottom can be expected. The benchmarks show, in particular, that the ray model can be as accurate as a parabolic approximation model benchmarked in similar conditions. The results of benchmarking are important, on one side, as a preliminary experimental validation of the model and, on the other side, demonstrates the reliability of the ray approach for seismo-acoustic applications.
Modification and benchmarking of MCNP for low-energy tungsten spectra.

PubMed

Mercier, J R; Kopp, D T; McDavid, W D; Dove, S B; Lancaster, J L; Tucker, D M

2000-12-01

The MCNP Monte Carlo radiation transport code was modified for diagnostic medical physics applications. In particular, the modified code was thoroughly benchmarked for the production of polychromatic tungsten x-ray spectra in the 30-150 kV range. Validating the modified code for coupled electron-photon transport with benchmark spectra was supplemented with independent electron-only and photon-only transport benchmarks. Major revisions to the code included the proper treatment of characteristic K x-ray production and scoring, new impact ionization cross sections, and new bremsstrahlung cross sections. Minor revisions included updated photon cross sections, electron-electron bremsstrahlung production, and K x-ray yield. The modified MCNP code is benchmarked to electron backscatter factors, x-ray spectra production, and primary and scatter photon transport.
Revisiting Yasinsky and Henry`s benchmark using modern nodal codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feltus, M.A.; Becker, M.W.

1995-12-31

The numerical experiments analyzed by Yasinsky and Henry are quite trivial by comparison with today`s standards because they used the finite difference code WIGLE for their benchmark. Also, this problem is a simple slab (one-dimensional) case with no feedback mechanisms. This research attempts to obtain STAR (Ref. 2) and NEM (Ref. 3) code results in order to produce a more modern kinetics benchmark with results comparable WIGLE.
Towards the Solution of the Many-Electron Problem in Real Materials: Equation of State of the Hydrogen Chain with State-of-the-Art Many-Body Methods

DOE PAGES

Motta, Mario; Ceperley, David M.; Chan, Garnet Kin-Lic; ...

2017-09-28

We present numerical results for the equation of state of an infinite chain of hydrogen atoms. A variety of modern many-body methods are employed, with exhaustive cross-checks and validation. Approaches for reaching the continuous space limit and the thermodynamic limit are investigated, proposed, and tested. The detailed comparisons provide a benchmark for assessing the current state of the art in many-body computation, and for the development of new methods. The ground-state energy per atom in the linear chain is accurately determined versus bond length, with a confidence bound given on all uncertainties.

On accuracy of the wave finite element predictions of wavenumbers and power flow: A benchmark problem

NASA Astrophysics Data System (ADS)

Søe-Knudsen, Alf; Sorokin, Sergey

2011-06-01

This rapid communication is concerned with justification of the 'rule of thumb', which is well known to the community of users of the finite element (FE) method in dynamics, for the accuracy assessment of the wave finite element (WFE) method. An explicit formula linking the size of a window in the dispersion diagram, where the WFE method is trustworthy, with the coarseness of a FE mesh employed is derived. It is obtained by the comparison of the exact Pochhammer-Chree solution for an elastic rod having the circular cross-section with its WFE approximations. It is shown that the WFE power flow predictions are also valid within this window.
Low-frequency quadrupole impedance of undulators and wigglers

DOE PAGES

Blednykh, A.; Bassi, G.; Hidaka, Y.; ...

2016-10-25

An analytical expression of the low-frequency quadrupole impedance for undulators and wigglers is derived and benchmarked against beam-based impedance measurements done at the 3 GeV NSLS-II storage ring. The adopted theoretical model, valid for an arbitrary number of electromagnetic layers with parallel geometry, allows to calculate the quadrupole impedance for arbitrary values of the magnetic permeability μ r. Here, in the comparison of the analytical results with the measurements for variable magnet gaps, two limit cases of the permeability have been studied: the case of perfect magnets (μ r → ∞), and the case in which the magnets are fullymore » saturated (μ r = 1).« less
A semi-implicit level set method for multiphase flows and fluid-structure interaction problems

NASA Astrophysics Data System (ADS)

Cottet, Georges-Henri; Maitre, Emmanuel

2016-06-01

In this paper we present a novel semi-implicit time-discretization of the level set method introduced in [8] for fluid-structure interaction problems. The idea stems from a linear stability analysis derived on a simplified one-dimensional problem. The semi-implicit scheme relies on a simple filter operating as a pre-processing on the level set function. It applies to multiphase flows driven by surface tension as well as to fluid-structure interaction problems. The semi-implicit scheme avoids the stability constraints that explicit scheme need to satisfy and reduces significantly the computational cost. It is validated through comparisons with the original explicit scheme and refinement studies on two-dimensional benchmarks.
Proposal of Evolutionary Simplex Method for Global Optimization Problem

NASA Astrophysics Data System (ADS)

Shimizu, Yoshiaki

To make an agile decision in a rational manner, role of optimization engineering has been notified increasingly under diversified customer demand. With this point of view, in this paper, we have proposed a new evolutionary method serving as an optimization technique in the paradigm of optimization engineering. The developed method has prospects to solve globally various complicated problem appearing in real world applications. It is evolved from the conventional method known as Nelder and Mead’s Simplex method by virtue of idea borrowed from recent meta-heuristic method such as PSO. Mentioning an algorithm to handle linear inequality constraints effectively, we have validated effectiveness of the proposed method through comparison with other methods using several benchmark problems.
Towards the Solution of the Many-Electron Problem in Real Materials: Equation of State of the Hydrogen Chain with State-of-the-Art Many-Body Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Motta, Mario; Ceperley, David M.; Chan, Garnet Kin-Lic

We present numerical results for the equation of state of an infinite chain of hydrogen atoms. A variety of modern many-body methods are employed, with exhaustive cross-checks and validation. Approaches for reaching the continuous space limit and the thermodynamic limit are investigated, proposed, and tested. The detailed comparisons provide a benchmark for assessing the current state of the art in many-body computation, and for the development of new methods. The ground-state energy per atom in the linear chain is accurately determined versus bond length, with a confidence bound given on all uncertainties.
Comparison of the PHISICS/RELAP5-3D ring and block model results for phase I of the OECD/NEA MHTGR-350 benchmark

DOE PAGES

Strydom, G.; Epiney, A. S.; Alfonsi, Andrea; ...

2015-12-02

The PHISICS code system has been under development at INL since 2010. It consists of several modules providing improved coupled core simulation capability: INSTANT (3D nodal transport core calculations), MRTAU (depletion and decay heat generation) and modules performing criticality searches, fuel shuffling and generalized perturbation. Coupling of the PHISICS code suite to the thermal hydraulics system code RELAP5-3D was finalized in 2013, and as part of the verification and validation effort the first phase of the OECD/NEA MHTGR-350 Benchmark has now been completed. The theoretical basis and latest development status of the coupled PHISICS/RELAP5-3D tool are described in more detailmore » in a concurrent paper. This paper provides an overview of the OECD/NEA MHTGR-350 Benchmark and presents the results of Exercises 2 and 3 defined for Phase I. Exercise 2 required the modelling of a stand-alone thermal fluids solution at End of Equilibrium Cycle for the Modular High Temperature Reactor (MHTGR). The RELAP5-3D results of four sub-cases are discussed, consisting of various combinations of coolant bypass flows and material thermophysical properties. Exercise 3 required a coupled neutronics and thermal fluids solution, and the PHISICS/RELAP5-3D code suite was used to calculate the results of two sub-cases. The main focus of the paper is a comparison of results obtained with the traditional RELAP5-3D “ring” model approach against a much more detailed model that include kinetics feedback on individual block level and thermal feedbacks on a triangular sub-mesh. The higher fidelity that can be obtained by this “block” model is illustrated with comparison results on the temperature, power density and flux distributions. Furthermore, it is shown that the ring model leads to significantly lower fuel temperatures (up to 10%) when compared with the higher fidelity block model, and that the additional model development and run-time efforts are worth the gains obtained in the improved spatial temperature and flux distributions.« less
An End-to-End simulator for the development of atmospheric corrections and temperature - emissivity separation algorithms in the TIR spectral domain

NASA Astrophysics Data System (ADS)

Rock, Gilles; Fischer, Kim; Schlerf, Martin; Gerhards, Max; Udelhoven, Thomas

2017-04-01

The development and optimization of image processing algorithms requires the availability of datasets depicting every step from earth surface to the sensor's detector. The lack of ground truth data obliges to develop algorithms on simulated data. The simulation of hyperspectral remote sensing data is a useful tool for a variety of tasks such as the design of systems, the understanding of the image formation process, and the development and validation of data processing algorithms. An end-to-end simulator has been set up consisting of a forward simulator, a backward simulator and a validation module. The forward simulator derives radiance datasets based on laboratory sample spectra, applies atmospheric contributions using radiative transfer equations, and simulates the instrument response using configurable sensor models. This is followed by the backward simulation branch, consisting of an atmospheric correction (AC), a temperature and emissivity separation (TES) or a hybrid AC and TES algorithm. An independent validation module allows the comparison between input and output dataset and the benchmarking of different processing algorithms. In this study, hyperspectral thermal infrared scenes of a variety of surfaces have been simulated to analyze existing AC and TES algorithms. The ARTEMISS algorithm was optimized and benchmarked against the original implementations. The errors in TES were found to be related to incorrect water vapor retrieval. The atmospheric characterization could be optimized resulting in increasing accuracies in temperature and emissivity retrieval. Airborne datasets of different spectral resolutions were simulated from terrestrial HyperCam-LW measurements. The simulated airborne radiance spectra were subjected to atmospheric correction and TES and further used for a plant species classification study analyzing effects related to noise and mixed pixels.
Comparison and validation of point spread models for imaging in natural waters.

PubMed

Hou, Weilin; Gray, Deric J; Weidemann, Alan D; Arnone, Robert A

2008-06-23

It is known that scattering by particulates within natural waters is the main cause of the blur in underwater images. Underwater images can be better restored or enhanced with knowledge of the point spread function (PSF) of the water. This will extend the performance range as well as the information retrieval from underwater electro-optical systems, which is critical in many civilian and military applications, including target and especially mine detection, search and rescue, and diver visibility. A better understanding of the physical process involved also helps to predict system performance and simulate it accurately on demand. The presented effort first reviews several PSF models, including the introduction of a semi-analytical PSF given optical properties of the medium, including scattering albedo, mean scattering angles and the optical range. The models under comparison include the empirical model of Duntley, a modified PSF model by Dolin et al, as well as the numerical integration of analytical forms from Wells, as a benchmark of theoretical results. For experimental results, in addition to that of Duntley, we validate the above models with measured point spread functions by applying field measured scattering properties with Monte Carlo simulations. Results from these comparisons suggest it is sufficient but necessary to have the three parameters listed above to model PSFs. The simplified approach introduced also provides adequate accuracy and flexibility for imaging applications, as shown by examples of restored underwater images.
Validation of electronic structure methods for isomerization reactions of large organic molecules.

PubMed

Luo, Sijie; Zhao, Yan; Truhlar, Donald G

2011-08-14

In this work the ISOL24 database of isomerization energies of large organic molecules presented by Huenerbein et al. [Phys. Chem. Chem. Phys., 2010, 12, 6940] is updated, resulting in the new benchmark database called ISOL24/11, and this database is used to test 50 electronic model chemistries. To accomplish the update, the very expensive and highly accurate CCSD(T)-F12a/aug-cc-pVDZ method is first exploited to investigate a six-reaction subset of the 24 reactions, and by comparison of various methods with the benchmark, MCQCISD-MPW is confirmed to be of high accuracy. The final ISOL24/11 database is composed of six reaction energies calculated by CCSD(T)-F12a/aug-cc-pVDZ and 18 calculated by MCQCISD-MPW. We then tested 40 single-component density functionals (both local and hybrid), eight doubly hybrid functionals, and two other methods against ISOL24/11. It is found that the SCS-MP3/CBS method, which is used as benchmark for the original ISOL24, has an MUE of 1.68 kcal mol(-1), which is close to or larger than some of the best tested DFT methods. Using the new benchmark, we find ωB97X-D and MC3MPWB to be the best single-component and doubly hybrid functionals respectively, with PBE0-D3 and MC3MPW performing almost as well. The best single-component density functionals without molecular mechanics dispersion-like terms are M08-SO, M08-HX, M05-2X, and M06-2X. The best single-component density functionals without Hartree-Fock exchange are M06-L-D3 when MM terms are included and M06-L when they are not.
Benchmarking biology research organizations using a new, dedicated tool.

PubMed

van Harten, Willem H; van Bokhorst, Leonard; van Luenen, Henri G A M

2010-02-01

International competition forces fundamental research organizations to assess their relative performance. We present a benchmark tool for scientific research organizations where, contrary to existing models, the group leader is placed in a central position within the organization. We used it in a pilot benchmark study involving six research institutions. Our study shows that data collection and data comparison based on this new tool can be achieved. It proved possible to compare relative performance and organizational characteristics and to generate suggestions for improvement for most participants. However, strict definitions of the parameters used for the benchmark and a thorough insight into the organization of each of the benchmark partners is required to produce comparable data and draw firm conclusions.
Self-reported personality disorder in the children in the community sample: convergent and prospective validity in late adolescence and adulthood.

PubMed

Crawford, Thomas N; Cohen, Patricia; Johnson, Jeffrey G; Kasen, Stephanie; First, Michael B; Gordon, Kathy; Brook, Judith S

2005-02-01

Approximately 800 youths from the Children in the Community Study (Cohen & Cohen, 1996) have been assessed prospectively for over 20 years to study personality disorders (PDs) in adolescents and young adults. In this article we evaluate the Children in the Community Self-Report (CIC-SR) Scales, which were designed to assess DSM-IV PDs using self-reported prospective data from this longitudinal sample. To evaluate convergent validity, we assessed concordance between the CIC-SR Scales and the Structured Clinical Interview for DSM-IV Personality Disorders (SCID-II; First, Gibbon, Spitzer, Williams, & Benjamin, 1995) in 644 participants at mean age 33. To assess predictive validity, we used CIC-SR Scales at mean age 22 to predict subsequent CIC-SR and SCID-II Personality Questionnaire scores at mean age 33. In these analyses the CIC-SR Scales matched or exceeded benchmarks established in previous comparisons between self-report instruments and structured clinical interviews. Unlike other self-report scales, the CIC-SR did not appear to overestimate diagnoses when compared with SCID-II clinical diagnoses.
Model development and validation of geometrically complex eddy current coils using finite element methods

NASA Astrophysics Data System (ADS)

Brown, Alexander; Eviston, Connor

2017-02-01

Multiple FEM models of complex eddy current coil geometries were created and validated to calculate the change of impedance due to the presence of a notch. Capable realistic simulations of eddy current inspections are required for model assisted probability of detection (MAPOD) studies, inversion algorithms, experimental verification, and tailored probe design for NDE applications. An FEM solver was chosen to model complex real world situations including varying probe dimensions and orientations along with complex probe geometries. This will also enable creation of a probe model library database with variable parameters. Verification and validation was performed using other commercially available eddy current modeling software as well as experimentally collected benchmark data. Data analysis and comparison showed that the created models were able to correctly model the probe and conductor interactions and accurately calculate the change in impedance of several experimental scenarios with acceptable error. The promising results of the models enabled the start of an eddy current probe model library to give experimenters easy access to powerful parameter based eddy current models for alternate project applications.
Validation of numerical codes for impact and explosion cratering: Impacts on strengthless and metal targets

NASA Astrophysics Data System (ADS)

Pierazzo, E.; Artemieva, N.; Asphaug, E.; Baldwin, E. C.; Cazamias, J.; Coker, R.; Collins, G. S.; Crawford, D. A.; Davison, T.; Elbeshausen, D.; Holsapple, K. A.; Housen, K. R.; Korycansky, D. G.; Wünnemann, K.

2008-12-01

Over the last few decades, rapid improvement of computer capabilities has allowed impact cratering to be modeled with increasing complexity and realism, and has paved the way for a new era of numerical modeling of the impact process, including full, three-dimensional (3D) simulations. When properly benchmarked and validated against observation, computer models offer a powerful tool for understanding the mechanics of impact crater formation. This work presents results from the first phase of a project to benchmark and validate shock codes. A variety of 2D and 3D codes were used in this study, from commercial products like AUTODYN, to codes developed within the scientific community like SOVA, SPH, ZEUS-MP, iSALE, and codes developed at U.S. National Laboratories like CTH, SAGE/RAGE, and ALE3D. Benchmark calculations of shock wave propagation in aluminum-on-aluminum impacts were performed to examine the agreement between codes for simple idealized problems. The benchmark simulations show that variability in code results is to be expected due to differences in the underlying solution algorithm of each code, artificial stability parameters, spatial and temporal resolution, and material models. Overall, the inter-code variability in peak shock pressure as a function of distance is around 10 to 20%. In general, if the impactor is resolved by at least 20 cells across its radius, the underestimation of peak shock pressure due to spatial resolution is less than 10%. In addition to the benchmark tests, three validation tests were performed to examine the ability of the codes to reproduce the time evolution of crater radius and depth observed in vertical laboratory impacts in water and two well-characterized aluminum alloys. Results from these calculations are in good agreement with experiments. There appears to be a general tendency of shock physics codes to underestimate the radius of the forming crater. Overall, the discrepancy between the model and experiment results is between 10 and 20%, similar to the inter-code variability.
Benchmark Lisp And Ada Programs

NASA Technical Reports Server (NTRS)

Davis, Gloria; Galant, David; Lim, Raymond; Stutz, John; Gibson, J.; Raghavan, B.; Cheesema, P.; Taylor, W.

1992-01-01

Suite of nonparallel benchmark programs, ELAPSE, designed for three tests: comparing efficiency of computer processing via Lisp vs. Ada; comparing efficiencies of several computers processing via Lisp; or comparing several computers processing via Ada. Tests efficiency which computer executes routines in each language. Available for computer equipped with validated Ada compiler and/or Common Lisp system.
Developing of Indicators of an E-Learning Benchmarking Model for Higher Education Institutions

ERIC Educational Resources Information Center

Sae-Khow, Jirasak

2014-01-01

This study was the development of e-learning indicators used as an e-learning benchmarking model for higher education institutes. Specifically, it aimed to: 1) synthesize the e-learning indicators; 2) examine content validity by specialists; and 3) explore appropriateness of the e-learning indicators. Review of related literature included…
Generation IV benchmarking of TRISO fuel performance models under accident conditions: Modeling input data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collin, Blaise P.

2014-09-01

This document presents the benchmark plan for the calculation of particle fuel performance on safety testing experiments that are representative of operational accidental transients. The benchmark is dedicated to the modeling of fission product release under accident conditions by fuel performance codes from around the world, and the subsequent comparison to post-irradiation experiment (PIE) data from the modeled heating tests. The accident condition benchmark is divided into three parts: the modeling of a simplified benchmark problem to assess potential numerical calculation issues at low fission product release; the modeling of the AGR-1 and HFR-EU1bis safety testing experiments; and, the comparisonmore » of the AGR-1 and HFR-EU1bis modeling results with PIE data. The simplified benchmark case, thereafter named NCC (Numerical Calculation Case), is derived from ''Case 5'' of the International Atomic Energy Agency (IAEA) Coordinated Research Program (CRP) on coated particle fuel technology [IAEA 2012]. It is included so participants can evaluate their codes at low fission product release. ''Case 5'' of the IAEA CRP-6 showed large code-to-code discrepancies in the release of fission products, which were attributed to ''effects of the numerical calculation method rather than the physical model''[IAEA 2012]. The NCC is therefore intended to check if these numerical effects subsist. The first two steps imply the involvement of the benchmark participants with a modeling effort following the guidelines and recommendations provided by this document. The third step involves the collection of the modeling results by Idaho National Laboratory (INL) and the comparison of these results with the available PIE data. The objective of this document is to provide all necessary input data to model the benchmark cases, and to give some methodology guidelines and recommendations in order to make all results suitable for comparison with each other. The participants should read this document thoroughly to make sure all the data needed for their calculations is provided in the document. Missing data will be added to a revision of the document if necessary.« less
Benchmarking methods and data sets for ligand enrichment assessment in virtual screening.

PubMed

Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

2015-01-01

Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in the prospective (i.e. real-world) efforts. However, the intrinsic differences of benchmarking sets to the real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods as well as data sets and highlight three main types of biases found in benchmarking sets, i.e. "analogue bias", "artificial enrichment" and "false negative". In addition, we introduce our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations to three important human histone deacetylases (HDACs) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The leave-one-out cross-validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased as measured by property matching, ROC curves and AUCs. Copyright © 2014 Elsevier Inc. All rights reserved.
Benchmarking Methods and Data Sets for Ligand Enrichment Assessment in Virtual Screening

PubMed Central

Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

2014-01-01

Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in the prospective (i.e. real-world) efforts. However, the intrinsic differences of benchmarking sets to the real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods as well as data sets and highlight three main types of biases found in benchmarking sets, i.e. “analogue bias”, “artificial enrichment” and “false negative”. In addition, we introduced our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations to three important human histone deacetylase (HDAC) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The Leave-One-Out Cross-Validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased in terms of property matching, ROC curves and AUCs. PMID:25481478
Benchmarking the ATLAS software through the Kit Validation engine

NASA Astrophysics Data System (ADS)

De Salvo, Alessandro; Brasolin, Franco

2010-04-01

The measurement of the experiment software performance is a very important metric in order to choose the most effective resources to be used and to discover the bottlenecks of the code implementation. In this work we present the benchmark techniques used to measure the ATLAS software performance through the ATLAS offline testing engine Kit Validation and the online portal Global Kit Validation. The performance measurements, the data collection, the online analysis and display of the results will be presented. The results of the measurement on different platforms and architectures will be shown, giving a full report on the CPU power and memory consumption of the Monte Carlo generation, simulation, digitization and reconstruction of the most CPU-intensive channels. The impact of the multi-core computing on the ATLAS software performance will also be presented, comparing the behavior of different architectures when increasing the number of concurrent processes. The benchmark techniques described in this paper have been used in the HEPiX group since the beginning of 2008 to help defining the performance metrics for the High Energy Physics applications, based on the real experiment software.
IEA-Task 31 WAKEBENCH: Towards a protocol for wind farm flow model evaluation. Part 2: Wind farm wake models

NASA Astrophysics Data System (ADS)

Moriarty, Patrick; Sanz Rodrigo, Javier; Gancarski, Pawel; Chuchfield, Matthew; Naughton, Jonathan W.; Hansen, Kurt S.; Machefaux, Ewan; Maguire, Eoghan; Castellani, Francesco; Terzi, Ludovico; Breton, Simon-Philippe; Ueda, Yuko

2014-06-01

Researchers within the International Energy Agency (IEA) Task 31: Wakebench have created a framework for the evaluation of wind farm flow models operating at the microscale level. The framework consists of a model evaluation protocol integrated with a web-based portal for model benchmarking (www.windbench.net). This paper provides an overview of the building-block validation approach applied to wind farm wake models, including best practices for the benchmarking and data processing procedures for validation datasets from wind farm SCADA and meteorological databases. A hierarchy of test cases has been proposed for wake model evaluation, from similarity theory of the axisymmetric wake and idealized infinite wind farm, to single-wake wind tunnel (UMN-EPFL) and field experiments (Sexbierum), to wind farm arrays in offshore (Horns Rev, Lillgrund) and complex terrain conditions (San Gregorio). A summary of results from the axisymmetric wake, Sexbierum, Horns Rev and Lillgrund benchmarks are used to discuss the state-of-the-art of wake model validation and highlight the most relevant issues for future development.

KEY COMPARISON: CCQM-K27-Subsequent: Key Comparison (subsequent) for the determination of ethanol in aqueous matrix

NASA Astrophysics Data System (ADS)

Schantz, Michele M.; Duewer, David L.; Parris, Reenie M.; May, Willie E.; Archer, Marcellé; Mussell, Chris; Carter, David; Konopelko, Leonid A.; Kustikov, Yury A.; Krylov, Anatoli I.; Fatina, Olga V.

2005-01-01

Ethanol is important both forensically ('drunk driving' or driving while under the influence, 'DWI', regulations) and commercially (alcoholic beverages). Blood- and breath-alcohol testing can be imposed on individuals operating private vehicles such as cars, boats, or snowmobiles, or operators of commercial vehicles like trucks, planes, and ships. The various levels of blood alcohol that determine whether these operators are considered legally impaired vary depending on the circumstances and locality. Accurate calibration and validation of instrumentation is critical in areas of forensic testing where quantitative analysis directly affects the outcome of criminal prosecutions, as is the case with the determination of ethanol in blood and breath. Additionally, the accurate assessment of the alcoholic content of beverages is a commercially important commodity. In 2002, the CCQM conducted a Key Comparison (CCQM-K27) for the determination of ethanol in aqueous matrix with nine participants. A report on this project has been approved by the CCQM and can be found at the BIPM website and in this Technical Supplement. CCQM-K27 comprised three samples, one at low mass fraction of ethanol in water (nominal concentration of 0.8 mg/g), one at high level (nominal concentration of 120 mg/g), and one wine matrix (nominal concentration of 81 mg/g). Overall agreement among eight participants using gas chromatography with flame ionization detection (GC-FID), titrimetry, isotope dilution gas chromatography/mass spectrometry (GC-IDMS), and gas chromatography-combustion-isotope ratio mass spectrometry (ID-GC-C-IRMS) was good. The ninth participant used a headspace GC-FID method that had not been validated in an earlier pilot study (CCQM-P35). A follow-on Key Comparison, CCQM-K27-Subsequent, was initiated in 2003 to accommodate laboratories that had not been ready to benchmark their methods in the original CCQM-K27 study or that wished to benchmark a different method. Four levels of ethanol in water were used in the subsequent study (nominal concentrations of 0.2 mg/g, 1 mg/g, 3 mg/g, and 60 mg/g). The three participants in the CCQM-K27-Subsequent Key Comparison demonstrated their ability to measure ethanol in aqueous matrix in the concentration range of 0.2 mg/g to 60 mg/g. Main text. To reach the main text of this paper, click on Final Report. The final report has been peer-reviewed and approved for publication by the CCQM, according to the provisions of the Mutual Recognition Arrangement (MRA).
ECHO: health care performance assessment in several European health systems.

PubMed

Bernal-Delgado, E; Christiansen, T; Bloor, K; Mateus, C; Yazbeck, A M; Munck, J; Bremner, J

2015-02-01

Strengthening health-care effectiveness, increasing accessibility and improving resilience are key goals in the upcoming European Union health-care agenda. European Collaboration for Health-Care Optimization (ECHO), an international research project on health-care performance assessment funded by the seventh framework programme, has provided evidence and methodology to allow the attainment of those goals. This article aims at describing ECHO, analysing its main instruments and discussing some of the ECHO policy implications. Using patient-level administrative data, a series of observational studies (ecological and cross-section with associated time-series analyses) were conducted to analyze population and patients' exposure to health care. Operationally, several performance dimensions such as health-care inequalities, quality, safety and efficiency were analyzed using a set of validated indicators. The main instruments in ECHO were: (i) building a homogeneous data infrastructure; (ii) constructing coding crosswalks to allow comparisons between countries; (iii) making geographical units of analysis comparable; and (iv) allowing comparisons through the use of common benchmarks. ECHO has provided some innovations in international comparisons of health-care performance, mainly derived from the massive pooling of patient-level data and thus: (i) has expanded the usual approach based on average figures, providing insight into within and across country variation at various meaningful policy levels, (ii) the important effort made on data homogenization has increased comparability, increasing stakeholders' reliance on data and improving the acceptance of findings and (iii) has been able to provide more flexible and reliable benchmarking, allowing stakeholders to make critical use of the evidence. © The Author 2015. Published by Oxford University Press on behalf of the European Public Health Association. All rights reserved.
Final report on CCQM-K27.2: Second Subsequent study: determination of ethanol in aqueous media

NASA Astrophysics Data System (ADS)

Schantz, Michele M.; Parris, Reenie M.; May, Willie E.; Rosso, Adriana; Puglisi, Celia; Marques Rodrigues Caixeiro, Janaína; Massiff, Gabriela; Camacho Frías, Evangelina; Pérez Urquiza, Melina; Archer, Marcellé; Visser, M. S.; deVos, Betty-Jayne

2013-01-01

Ethanol is important both forensically ('drunk driving' or driving while under the influence, 'DWI', regulations) and commercially (alcoholic beverages). Blood- and breath-alcohol testing can be imposed on individuals operating private vehicles such as cars, boats or snowmobiles, or operators of commercial vehicles like trucks, planes and ships. The various levels of blood alcohol that determine whether these operators are considered legally impaired vary depending on the circumstances and locality. Accurate calibration and validation of instrumentation is critical in areas of forensic testing where quantitative analysis directly affects the outcome of criminal prosecutions, as is the case with the determination of ethanol in blood and breath. Additionally, the accurate assessment of the alcoholic content of beverages is a commercially important commodity. In 2002, the CCQM conducted a key comparison (CCQM-K27) for the determination of ethanol in aqueous matrix with nine participants. A report on this project has been approved by the CCQM and can be found at the BIPM website. CCQM-K27 comprised three samples, one at low mass fraction of ethanol in water (nominal concentration of 0.8 mg/g), one at high level (nominal concentration of 120 mg/g) and one wine matrix (nominal concentration of 81 mg/g). Overall agreement among eight participants using gas chromatography with flame ionization detection (GC-FID), titrimetry, isotope dilution gas chromatography/mass spectrometry (GC-IDMS) and gas chromatography-combustion-isotope ratio mass spectrometry (ID-GC-C-IRMS) was good. The ninth participant used a headspace GC-FID method that had not been validated in an earlier pilot study (CCQM-P35). A follow-on key comparison, CCQM-K27-Subsequent, was initiated in 2003 to accommodate laboratories that had not been ready to benchmark their methods in the original CCQM-K27 study or that wished to benchmark a different method. Four levels of ethanol in water were used in the subsequent study (nominal concentrations of 0.2 mg/g, 1 mg/g, 3 mg/g and 60 mg/g). The three participants in the CCQM-K27-Subsequent key comparison demonstrated their ability to measure ethanol in aqueous matrix in the concentration range of 0.2 mg/g to 60 mg/g. A report on this project has been approved by the CCQM and can be found at the BIPM website. A second follow-on key comparison, CCQM-K27.2 Second Subsequent, was initiated in 2006 to accommodate laboratories that had not been ready to benchmark their methods in the previous two CCQM-K27 studies. Two levels of ethanol in water were used in the second subsequent study ranging in concentration between 0.5 mg/g and 4 mg/g. Four of the five participants in the CCQM-K27.2 Second Subsequent key comparison demonstrated their ability to measure ethanol in aqueous matrix in that concentration range. Main text. To reach the main text of this paper, click on Final Report. Note that this text is that which appears in Appendix B of the BIPM key comparison database kcdb.bipm.org/. The final report has been peer-reviewed and approved for publication by the CCQM, according to the provisions of the CIPM Mutual Recognition Arrangement (CIPM MRA).
Second Computational Aeroacoustics (CAA) Workshop on Benchmark Problems

NASA Technical Reports Server (NTRS)

Tam, C. K. W. (Editor); Hardin, J. C. (Editor)

1997-01-01

The proceedings of the Second Computational Aeroacoustics (CAA) Workshop on Benchmark Problems held at Florida State University are the subject of this report. For this workshop, problems arising in typical industrial applications of CAA were chosen. Comparisons between numerical solutions and exact solutions are presented where possible.
A Transnational Comparison of Lecturer Self-Efficacy

ERIC Educational Resources Information Center

Hemmings, Brian Colin; Kay, Russell; Sharp, John; Taylor, Claire

2012-01-01

Benchmarking within higher education is now relatively commonplace, as institutions increasingly compete directly with one another to improve the overall "quality" of what they do and attempt to establish and better their position among peers as measured against sector standards. The benchmarking of confidence among academic staff in…
Benchmarking Procedures for High-Throughput Context Specific Reconstruction Algorithms

PubMed Central

Pacheco, Maria P.; Pfau, Thomas; Sauter, Thomas

2016-01-01

Recent progress in high-throughput data acquisition has shifted the focus from data generation to processing and understanding of how to integrate collected information. Context specific reconstruction based on generic genome scale models like ReconX or HMR has the potential to become a diagnostic and treatment tool tailored to the analysis of specific individuals. The respective computational algorithms require a high level of predictive power, robustness and sensitivity. Although multiple context specific reconstruction algorithms were published in the last 10 years, only a fraction of them is suitable for model building based on human high-throughput data. Beside other reasons, this might be due to problems arising from the limitation to only one metabolic target function or arbitrary thresholding. This review describes and analyses common validation methods used for testing model building algorithms. Two major methods can be distinguished: consistency testing and comparison based testing. The first is concerned with robustness against noise, e.g., missing data due to the impossibility to distinguish between the signal and the background of non-specific binding of probes in a microarray experiment, and whether distinct sets of input expressed genes corresponding to i.e., different tissues yield distinct models. The latter covers methods comparing sets of functionalities, comparison with existing networks or additional databases. We test those methods on several available algorithms and deduce properties of these algorithms that can be compared with future developments. The set of tests performed, can therefore serve as a benchmarking procedure for future algorithms. PMID:26834640
Model Prediction Results for 2007 Ultrasonic Benchmark Problems

NASA Astrophysics Data System (ADS)

Kim, Hak-Joon; Song, Sung-Jin

2008-02-01

The World Federation of NDE Centers (WFNDEC) has addressed two types of problems for the 2007 ultrasonic benchmark problems: prediction of side-drilled hole responses with 45° and 60° refracted shear waves, and effects of surface curvatures on the ultrasonic responses of flat-bottomed hole. To solve this year's ultrasonic benchmark problems, we applied multi-Gaussian beam models for calculation of ultrasonic beam fields and the Kirchhoff approximation and the separation of variables method for calculation of far-field scattering amplitudes of flat-bottomed holes and side-drilled holes respectively In this paper, we present comparison results of model predictions to experiments for side-drilled holes and discuss effect of interface curvatures on ultrasonic responses by comparison of peak-to-peak amplitudes of flat-bottomed hole responses with different sizes and interface curvatures.
SP2Bench: A SPARQL Performance Benchmark

NASA Astrophysics Data System (ADS)

Schmidt, Michael; Hornung, Thomas; Meier, Michael; Pinkel, Christoph; Lausen, Georg

A meaningful analysis and comparison of both existing storage schemes for RDF data and evaluation approaches for SPARQL queries necessitates a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is settled in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror vital key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries and performance metrics.
The electronegativity equalization method and the split charge equilibration applied to organic systems: parametrization, validation, and comparison.

PubMed

Verstraelen, Toon; Van Speybroeck, Veronique; Waroquier, Michel

2009-07-28

An extensive benchmark of the electronegativity equalization method (EEM) and the split charge equilibration (SQE) model on a very diverse set of organic molecules is presented. These models efficiently compute atomic partial charges and are used in the development of polarizable force fields. The predicted partial charges that depend on empirical parameters are calibrated to reproduce results from quantum mechanical calculations. Recently, SQE is presented as an extension of the EEM to obtain the correct size dependence of the molecular polarizability. In this work, 12 parametrization protocols are applied to each model and the optimal parameters are benchmarked systematically. The training data for the empirical parameters comprise of MP2/Aug-CC-pVDZ calculations on 500 organic molecules containing the elements H, C, N, O, F, S, Cl, and Br. These molecules have been selected by an ingenious and autonomous protocol from an initial set of almost 500,000 small organic molecules. It is clear that the SQE model outperforms the EEM in all benchmark assessments. When using Hirshfeld-I charges for the calibration, the SQE model optimally reproduces the molecular electrostatic potential from the ab initio calculations. Applications on chain molecules, i.e., alkanes, alkenes, and alpha alanine helices, confirm that the EEM gives rise to a divergent behavior for the polarizability, while the SQE model shows the correct trends. We conclude that the SQE model is an essential component of a polarizable force field, showing several advantages over the original EEM.
An Unbiased Method To Build Benchmarking Sets for Ligand-Based Virtual Screening and its Application To GPCRs

PubMed Central

2015-01-01

Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the “artificial enrichment” and “analogue bias” of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD. PMID:24749745
An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs.

PubMed

Xia, Jie; Jin, Hongwei; Liu, Zhenming; Zhang, Liangren; Wang, Xiang Simon

2014-05-27

Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could bring the biases to the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCRs targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligands and decoys. We evaluated the quality of our Unbiased Ligand Set (ULS) and Unbiased Decoy Set (UDS) using three common LBVS approaches, with Leave-One-Out (LOO) Cross-Validation (CV) and a metric of average AUC of the ROC curves. Our method has greatly reduced the "artificial enrichment" and "analogue bias" of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD). In addition, we addressed an important issue about the ratio of decoys per ligand and found that for a range of 30 to 100 it does not affect the quality of the benchmarking set, so we kept the original ratio of 39 from the GLL/GDD.
The InterFrost benchmark of Thermo-Hydraulic codes for cold regions hydrology - first inter-comparison phase results

NASA Astrophysics Data System (ADS)

Grenier, Christophe; Rühaak, Wolfram

2016-04-01

Climate change impacts in permafrost regions have received considerable attention recently due to the pronounced warming trends experienced in recent decades and which have been projected into the future. Large portions of these permafrost regions are characterized by surface water bodies (lakes, rivers) that interact with the surrounding permafrost often generating taliks (unfrozen zones) within the permafrost that allow for hydrologic interactions between the surface water bodies and underlying aquifers and thus influence the hydrologic response of a landscape to climate change. Recent field studies and modeling exercises indicate that a fully coupled 2D or 3D Thermo-Hydraulic (TH) approach is required to understand and model past and future evolution such units (Kurylyk et al. 2014). However, there is presently a paucity of 3D numerical studies of permafrost thaw and associated hydrological changes, which can be partly attributed to the difficulty in verifying multi-dimensional results produced by numerical models. A benchmark exercise was initialized at the end of 2014. Participants convened from USA, Canada, Europe, representing 13 simulation codes. The benchmark exercises consist of several test cases inspired by existing literature (e.g. McKenzie et al., 2007) as well as new ones (Kurylyk et al. 2014; Grenier et al. in prep.; Rühaak et al. 2015). They range from simpler, purely thermal 1D cases to more complex, coupled 2D TH cases (benchmarks TH1, TH2, and TH3). Some experimental cases conducted in a cold room complement the validation approach. A web site hosted by LSCE (Laboratoire des Sciences du Climat et de l'Environnement) is an interaction platform for the participants and hosts the test case databases at the following address: https://wiki.lsce.ipsl.fr/interfrost. The results of the first stage of the benchmark exercise will be presented. We will mainly focus on the inter-comparison of participant results for the coupled cases TH2 & TH3. Both cases are essentially theoretical but include the full complexity of the coupled non-linear set of equations (heat transfer with conduction, advection, phase change and Darcian flow). The complete set of inter-comparison results shows that the participating codes all produce simulations which are quantitatively similar and correspond to physical intuition. From a quantitative perspective, they agree well over the whole set of performance measures. The differences among the simulation results will be discussed in more depth throughout the test cases especially for the identification of the threshold times for each system as these exhibited the least agreement. However, the results suggest that in spite of the difficulties associated with the resolution of the set of TH equations (coupled and non-linear structure with phase change providing steep slopes), the developed codes provide robust results with a qualitatively reasonable representation of the processes and offer a quantitatively realistic basis. Further perspectives of the exercise will also be presented.
How Sound Is NSSE? Investigating the Psychometric Properties of NSSE at a Public, Research-Extensive Institution

ERIC Educational Resources Information Center

Campbell, Corbin M.; Cabrera, Alberto F.

2011-01-01

The National Survey of Student Engagement (NSSE) Benchmarks has emerged as a competing paradigm for assessing institutional effectiveness vis-a-vis the U.S. News & World Report. However, Porter (2009) has critiqued it for failing to meet validity and reliability standards. This study investigated whether the NSSE five benchmarks had construct…
Results Oriented Benchmarking: The Evolution of Benchmarking at NASA from Competitive Comparisons to World Class Space Partnerships

NASA Technical Reports Server (NTRS)

Bell, Michael A.

1999-01-01

Informal benchmarking using personal or professional networks has taken place for many years at the Kennedy Space Center (KSC). The National Aeronautics and Space Administration (NASA) recognized early on, the need to formalize the benchmarking process for better utilization of resources and improved benchmarking performance. The need to compete in a faster, better, cheaper environment has been the catalyst for formalizing these efforts. A pioneering benchmarking consortium was chartered at KSC in January 1994. The consortium known as the Kennedy Benchmarking Clearinghouse (KBC), is a collaborative effort of NASA and all major KSC contractors. The charter of this consortium is to facilitate effective benchmarking, and leverage the resulting quality improvements across KSC. The KBC acts as a resource with experienced facilitators and a proven process. One of the initial actions of the KBC was to develop a holistic methodology for Center-wide benchmarking. This approach to Benchmarking integrates the best features of proven benchmarking models (i.e., Camp, Spendolini, Watson, and Balm). This cost-effective alternative to conventional Benchmarking approaches has provided a foundation for consistent benchmarking at KSC through the development of common terminology, tools, and techniques. Through these efforts a foundation and infrastructure has been built which allows short duration benchmarking studies yielding results gleaned from world class partners that can be readily implemented. The KBC has been recognized with the Silver Medal Award (in the applied research category) from the International Benchmarking Clearinghouse.
Benchmarking health IT among OECD countries: better data for better policy

PubMed Central

Adler-Milstein, Julia; Ronchi, Elettra; Cohen, Genna R; Winn, Laura A Pannella; Jha, Ashish K

2014-01-01

Objective To develop benchmark measures of health information and communication technology (ICT) use to facilitate cross-country comparisons and learning. Materials and methods The effort is led by the Organisation for Economic Co-operation and Development (OECD). Approaches to definition and measurement within four ICT domains were compared across seven OECD countries in order to identify functionalities in each domain. These informed a set of functionality-based benchmark measures, which were refined in collaboration with representatives from more than 20 OECD and non-OECD countries. We report on progress to date and remaining work to enable countries to begin to collect benchmark data. Results The four benchmarking domains include provider-centric electronic record, patient-centric electronic record, health information exchange, and tele-health. There was broad agreement on functionalities in the provider-centric electronic record domain (eg, entry of core patient data, decision support), and less agreement in the other three domains in which country representatives worked to select benchmark functionalities. Discussion Many countries are working to implement ICTs to improve healthcare system performance. Although many countries are looking to others as potential models, the lack of consistent terminology and approach has made cross-national comparisons and learning difficult. Conclusions As countries develop and implement strategies to increase the use of ICTs to promote health goals, there is a historic opportunity to enable cross-country learning. To facilitate this learning and reduce the chances that individual countries flounder, a common understanding of health ICT adoption and use is needed. The OECD-led benchmarking process is a crucial step towards achieving this. PMID:23721983
Benchmarking health IT among OECD countries: better data for better policy.

PubMed

Adler-Milstein, Julia; Ronchi, Elettra; Cohen, Genna R; Winn, Laura A Pannella; Jha, Ashish K

2014-01-01

To develop benchmark measures of health information and communication technology (ICT) use to facilitate cross-country comparisons and learning. The effort is led by the Organisation for Economic Co-operation and Development (OECD). Approaches to definition and measurement within four ICT domains were compared across seven OECD countries in order to identify functionalities in each domain. These informed a set of functionality-based benchmark measures, which were refined in collaboration with representatives from more than 20 OECD and non-OECD countries. We report on progress to date and remaining work to enable countries to begin to collect benchmark data. The four benchmarking domains include provider-centric electronic record, patient-centric electronic record, health information exchange, and tele-health. There was broad agreement on functionalities in the provider-centric electronic record domain (eg, entry of core patient data, decision support), and less agreement in the other three domains in which country representatives worked to select benchmark functionalities. Many countries are working to implement ICTs to improve healthcare system performance. Although many countries are looking to others as potential models, the lack of consistent terminology and approach has made cross-national comparisons and learning difficult. As countries develop and implement strategies to increase the use of ICTs to promote health goals, there is a historic opportunity to enable cross-country learning. To facilitate this learning and reduce the chances that individual countries flounder, a common understanding of health ICT adoption and use is needed. The OECD-led benchmarking process is a crucial step towards achieving this.
Fission yields data generation and benchmarks of decay heat estimation of a nuclear fuel

NASA Astrophysics Data System (ADS)

Gil, Choong-Sup; Kim, Do Heon; Yoo, Jae Kwon; Lee, Jounghwa

2017-09-01

Fission yields data with the ENDF-6 format of 235U, 239Pu, and several actinides dependent on incident neutron energies have been generated using the GEF code. In addition, fission yields data libraries of ORIGEN-S, -ARP modules in the SCALE code, have been generated with the new data. The decay heats by ORIGEN-S using the new fission yields data have been calculated and compared with the measured data for validation in this study. The fission yields data ORIGEN-S libraries based on ENDF/B-VII.1, JEFF-3.1.1, and JENDL/FPY-2011 have also been generated, and decay heats were calculated using the ORIGEN-S libraries for analyses and comparisons.
Designing an activity-based costing model for a non-admitted prisoner healthcare setting.

PubMed

Cai, Xiao; Moore, Elizabeth; McNamara, Martin

2013-09-01

To design and deliver an activity-based costing model within a non-admitted prisoner healthcare setting. Key phases from the NSW Health clinical redesign methodology were utilised: diagnostic, solution design and implementation. The diagnostic phase utilised a range of strategies to identify issues requiring attention in the development of the costing model. The solution design phase conceptualised distinct 'building blocks' of activity and cost based on the speciality of clinicians providing care. These building blocks enabled the classification of activity and comparisons of costs between similar facilities. The implementation phase validated the model. The project generated an activity-based costing model based on actual activity performed, gained acceptability among clinicians and managers, and provided the basis for ongoing efficiency and benchmarking efforts.
Measuring effectiveness of electronic medical records systems: towards building a composite index for benchmarking hospitals.

PubMed

Otieno, George Ochieng; Hinako, Toyama; Motohiro, Asonuma; Daisuke, Koide; Keiko, Naitoh

2008-10-01

Many hospitals are currently in the process of developing and implementing electronic medical records (EMR) systems. This is a critical time for developing a framework that can measure and allow for comparison the effectiveness of EMR systems across hospitals that have implemented these systems. The motivation for this study comes from the realization that there is limited research on the understanding of the effectiveness of EMR systems, and a lack of appropriate reference theoretical framework for measuring the effectiveness of EMR systems. In this paper, we propose a conceptual framework for generating a composite index (CI) for measuring the effectiveness of EMR systems in hospitals. Data used to test the framework and associated research objectives were derived from a cross-sectional survey of five stakeholders of EMR systems including chief medical officers, chief nursing officers, chief information officers, doctors and nurses in 20 Japanese hospitals. Using statistical means of standardization and principal component analysis (PCA) procedure, CI was developed by summing up the scores of four dimensions-system quality, information quality, use and user satisfaction. The process included formulating items for each dimension, condensing the data into factors relevant to the dimension and calculating the CI by summing up the product of each dimension with its respective principal component score coefficient. The Cronbach's alpha for the four dimensions used in developing CI was .843. Validation of CI revealed that it was correlated to internal dimensions (system quality, R=.828; information quality, R=.909; use, R=.969; and user satisfaction, R=.679) and to external factors (JAHIS level, R=.832 and patient safety culture, R=.585). These results suggest that CI could be a reliable and valid measure of the effectiveness of EMR systems in the responding hospitals. On benchmarking of hospitals, 30.0% (6/20) of the responding hospitals performed less than satisfactory on CI and that majority of the hospitals performed poorly on user satisfaction. CI has provided a standard way, through quantitative means, of measuring, comparing and categorizing the effectiveness of EMR systems in hospitals. CI can be a powerful tool for benchmarking the effectiveness of EMR systems in hospitals in ways that can guide hospitals in computerization process as well as benchmark their systems against other hospitals.
Benchmarking: measuring the outcomes of evidence-based practice.

PubMed

DeLise, D C; Leasure, A R

2001-01-01

Measurement of the outcomes associated with implementation of evidence-based practice changes is becoming increasingly emphasized by multiple health care disciplines. A final step to the process of implementing and sustaining evidence-supported practice changes is that of outcomes evaluation and monitoring. The comparison of outcomes to internal and external measures is known as benchmarking. This article discusses evidence-based practice, provides an overview of outcomes evaluation, and describes the process of benchmarking to improve practice. A case study is used to illustrate this concept.

Simulating Next-Generation Sequencing Datasets from Empirical Mutation and Sequencing Models

PubMed Central

Stephens, Zachary D.; Hudson, Matthew E.; Mainzer, Liudmila S.; Taschuk, Morgan; Weber, Matthew R.; Iyer, Ravishankar K.

2016-01-01

An obstacle to validating and benchmarking methods for genome analysis is that there are few reference datasets available for which the “ground truth” about the mutational landscape of the sample genome is known and fully validated. Additionally, the free and public availability of real human genome datasets is incompatible with the preservation of donor privacy. In order to better analyze and understand genomic data, we need test datasets that model all variants, reflecting known biology as well as sequencing artifacts. Read simulators can fulfill this requirement, but are often criticized for limited resemblance to true data and overall inflexibility. We present NEAT (NExt-generation sequencing Analysis Toolkit), a set of tools that not only includes an easy-to-use read simulator, but also scripts to facilitate variant comparison and tool evaluation. NEAT has a wide variety of tunable parameters which can be set manually on the default model or parameterized using real datasets. The software is freely available at github.com/zstephens/neat-genreads. PMID:27893777
Measuring comparative hospital performance.

PubMed

Griffith, John R; Alexander, Jeffrey A; Jelinek, Richard C

2002-01-01

Leading healthcare provider organizations now use a "balanced scorecard" of performance measures, expanding information reviewed at the governance level to include financial, customer, and internal performance information, as well as providing an opportunity to learn and grow to provide better strategic guidance. The approach, successfully used by other industries, uses competitor data and benchmarks to identify opportunities for improved mission achievement. This article evaluates one set of nine multidimensional hospital performance measures derived from Medicare reports (cash flow, asset turnover, mortality, complications, length of inpatient stay, cost per case, occupancy, change in occupancy, and percent of revenue from outpatient care). The study examines the content validity, reliability and sensitivity, validity of comparison, and independence and concludes that seven of the nine measures (all but the two occupancy measures) represent a potentially useful set for evaluating most U.S. hospitals. This set reflects correctable differences in performance between hospitals serving similar populations, that is, the measures reflect relative performance and identify opportunities to make the organization more successful.
Analytic Validation of Immunohistochemistry Assays: New Benchmark Data From a Survey of 1085 Laboratories.

PubMed

Stuart, Lauren N; Volmar, Keith E; Nowak, Jan A; Fatheree, Lisa A; Souers, Rhona J; Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

- A cooperative agreement between the College of American Pathologists (CAP) and the United States Centers for Disease Control and Prevention was undertaken to measure laboratories' awareness and implementation of an evidence-based laboratory practice guideline (LPG) on immunohistochemical (IHC) validation practices published in 2014. - To establish new benchmark data on IHC laboratory practices. - A 2015 survey on IHC assay validation practices was sent to laboratories subscribed to specific CAP proficiency testing programs and to additional nonsubscribing laboratories that perform IHC testing. Specific questions were designed to capture laboratory practices not addressed in a 2010 survey. - The analysis was based on responses from 1085 laboratories that perform IHC staining. Ninety-six percent (809 of 844) always documented validation of IHC assays. Sixty percent (648 of 1078) had separate procedures for predictive and nonpredictive markers, 42.7% (220 of 515) had procedures for laboratory-developed tests, 50% (349 of 697) had procedures for testing cytologic specimens, and 46.2% (363 of 785) had procedures for testing decalcified specimens. Minimum case numbers were specified by 85.9% (720 of 838) of laboratories for nonpredictive markers and 76% (584 of 768) for predictive markers. Median concordance requirements were 95% for both types. For initial validation, 75.4% (538 of 714) of laboratories adopted the 20-case minimum for nonpredictive markers and 45.9% (266 of 579) adopted the 40-case minimum for predictive markers as outlined in the 2014 LPG. The most common method for validation was correlation with morphology and expected results. Laboratories also reported which assay changes necessitated revalidation and their minimum case requirements. - Benchmark data on current IHC validation practices and procedures may help laboratories understand the issues and influence further refinement of LPG recommendations.
FireHose Streaming Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karl Anderson, Steve Plimpton

2015-01-27

The FireHose Streaming Benchmarks are a suite of stream-processing benchmarks defined to enable comparison of streaming software and hardware, both quantitatively vis-a-vis the rate at which they can process data, and qualitatively by judging the effort involved to implement and run the benchmarks. Each benchmark has two parts. The first is a generator which produces and outputs datums at a high rate in a specific format. The second is an analytic which reads the stream of datums and is required to perform a well-defined calculation on the collection of datums, typically to find anomalous datums that have been created inmore » the stream by the generator. The FireHose suite provides code for the generators, sample code for the analytics (which users are free to re-implement in their own custom frameworks), and a precise definition of each benchmark calculation.« less
Comparison of PIV with 4D-Flow in a physiological accurate flow phantom

NASA Astrophysics Data System (ADS)

Sansom, Kurt; Balu, Niranjan; Liu, Haining; Aliseda, Alberto; Yuan, Chun; Canton, Maria De Gador

2016-11-01

Validation of 4D MRI flow sequences with planar particle image velocimetry (PIV) is performed in a physiologically-accurate flow phantom. A patient-specific phantom of a carotid artery is connected to a pulsatile flow loop to simulate the 3D unsteady flow in the cardiovascular anatomy. Cardiac-cycle synchronized MRI provides time-resolved 3D blood velocity measurements in clinical tool that is promising but lacks a robust validation framework. PIV at three different Reynolds numbers (540, 680, and 815, chosen based on +/- 20 % of the average velocity from the patient-specific CCA waveform) and four different Womersley numbers (3.30, 3.68, 4.03, and 4.35, chosen to reflect a physiological range of heart rates) are compared to 4D-MRI measurements. An accuracy assessment of raw velocity measurements and a comparison of estimated and measureable flow parameters such as wall shear stress, fluctuating velocity rms, and Lagrangian particle residence time, will be presented, with justification for their biomechanics relevance to the pathophysiology of arterial disease: atherosclerosis and intimal hyperplasia. Lastly, the framework is applied to a new 4D-Flow MRI sequence and post processing techniques to provide a quantitative assessment with the benchmarked data. Department of Education GAANN Fellowship.
Finite-Difference Modeling of Acoustic and Gravity Wave Propagation in Mars Atmosphere: Application to Infrasounds Emitted by Meteor Impacts

NASA Astrophysics Data System (ADS)

Garcia, Raphael F.; Brissaud, Quentin; Rolland, Lucie; Martin, Roland; Komatitsch, Dimitri; Spiga, Aymeric; Lognonné, Philippe; Banerdt, Bruce

2017-10-01

The propagation of acoustic and gravity waves in planetary atmospheres is strongly dependent on both wind conditions and attenuation properties. This study presents a finite-difference modeling tool tailored for acoustic-gravity wave applications that takes into account the effect of background winds, attenuation phenomena (including relaxation effects specific to carbon dioxide atmospheres) and wave amplification by exponential density decrease with height. The simulation tool is implemented in 2D Cartesian coordinates and first validated by comparison with analytical solutions for benchmark problems. It is then applied to surface explosions simulating meteor impacts on Mars in various Martian atmospheric conditions inferred from global climate models. The acoustic wave travel times are validated by comparison with 2D ray tracing in a windy atmosphere. Our simulations predict that acoustic waves generated by impacts can refract back to the surface on wind ducts at high altitude. In addition, due to the strong nighttime near-surface temperature gradient on Mars, the acoustic waves are trapped in a waveguide close to the surface, which allows a night-side detection of impacts at large distances in Mars plains. Such theoretical predictions are directly applicable to future measurements by the INSIGHT NASA Discovery mission.
Reliability and Validity of Michigan School Libraries for the 21st Century Measurement Benchmarks

ERIC Educational Resources Information Center

Floyd, Natosha N.

2016-01-01

The purpose of this study was to examine the psychometric properties of the Michigan School Libraries for the 21st Century Measurement Benchmarks (SL21). The instrument consists of 19 items with three subscales: Building the 21st Century Learning Environment Subscale, Teaching for 21st Century Learning Subscale, and Leading the Way to 21st Century…
Evaluating the Effectiveness of a State-Mandated Benchmark Reading Assessment: mClass Reading 3D (Text Reading and Comprehension)

ERIC Educational Resources Information Center

Snow, Amie B.; Morris, Darrell; Perney, Jan

2018-01-01

We examined which of two instruments (Text Reading and Comprehension inventory [TRC] or a traditional informal reading inventory [IRI]) provides the more valid assessment of a primary-grade student's reading instructional level. The TRC is currently the required, benchmark reading assessment for students in grades K-3 in the state of North…
Thought Experiment to Examine Benchmark Performance for Fusion Nuclear Data

NASA Astrophysics Data System (ADS)

Murata, Isao; Ohta, Masayuki; Kusaka, Sachie; Sato, Fuminobu; Miyamaru, Hiroyuki

2017-09-01

There are many benchmark experiments carried out so far with DT neutrons especially aiming at fusion reactor development. These integral experiments seemed vaguely to validate the nuclear data below 14 MeV. However, no precise studies exist now. The author's group thus started to examine how well benchmark experiments with DT neutrons can play a benchmarking role for energies below 14 MeV. Recently, as a next phase, to generalize the above discussion, the energy range was expanded to the entire region. In this study, thought experiments with finer energy bins have thus been conducted to discuss how to generally estimate performance of benchmark experiments. As a result of thought experiments with a point detector, the sensitivity for a discrepancy appearing in the benchmark analysis is "equally" due not only to contribution directly conveyed to the deterctor, but also due to indirect contribution of neutrons (named (A)) making neutrons conveying the contribution, indirect controbution of neutrons (B) making the neutrons (A) and so on. From this concept, it would become clear from a sensitivity analysis in advance how well and which energy nuclear data could be benchmarked with a benchmark experiment.
Benchmarking: A Study of School and School District Effect and Efficiency.

ERIC Educational Resources Information Center

Swanson, Austin D.; Engert, Frank

The "New York State School Report Card" provides a vehicle for benchmarking with respect to student achievement. In this study, additional tools were developed for making external comparisons with respect to achievement, and tools were added for assessing fiscal policy and efficiency. Data from school years 1993-94 through 1995-96 were…
The Kelvin-Helmholtz instability of boundary-layer plasmas in the kinetic regime

DOE Office of Scientific and Technical Information (OSTI.GOV)

Steinbusch, Benedikt, E-mail: b.steinbusch@fz-juelich.de; Gibbon, Paul, E-mail: p.gibbon@fz-juelich.de; Department of Mathematics, Centre for Mathematical Plasma Astrophysics, Katholieke Universiteit Leuven

2016-05-15

The dynamics of the Kelvin-Helmholtz instability are investigated in the kinetic, high-frequency regime with a novel, two-dimensional, mesh-free tree code. In contrast to earlier studies which focused on specially prepared equilibrium configurations in order to compare with fluid theory, a more naturally occurring plasma-vacuum boundary layer is considered here with relevance to both space plasma and linear plasma devices. Quantitative comparisons of the linear phase are made between the fluid and kinetic models. After establishing the validity of this technique via comparison to linear theory and conventional particle-in-cell simulation for classical benchmark problems, a quantitative analysis of the more complexmore » magnetized plasma-vacuum layer is presented and discussed. It is found that in this scenario, the finite Larmor orbits of the ions result in significant departures from the effective shear velocity and width underlying the instability growth, leading to generally slower development and stronger nonlinear coupling between fast growing short-wavelength modes and longer wavelengths.« less
Benchmarking Student Diversity at Public Universities in the United States: Accounting for State Population Composition

PubMed Central

Franklin, Rachel S.

2014-01-01

Regions rely at least partially on the internal production of a qualified workforce in order to maintain their economic competitiveness. Increasingly, at least from a university or corporate point of view, a qualified workforce is viewed as one that is racially and ethnically diverse. However, the conceptualization and measurement of ethnic and racial diversity in higher education appears to be often based on normative values rather than solid benchmarks, making any regional comparisons or goals difficult to specify. Ideally, at least as a starting point, public state universities would, while attempting to increase overall student diversity, benchmark their progress against the state population composition. This paper combines enrollment data from the National Center for Education Statistics (NCES) with U.S. Census Bureau population estimates data to provide a point of comparison for state universities. The paper has two goals: first a university-level comparison of diversity scores, as measured by the interaction index and, second, an analysis of how university student population composition compares to that of the population the university was originally intended to serve – the state population. PMID:25506123
Benchmarking Student Diversity at Public Universities in the United States: Accounting for State Population Composition.

PubMed

Franklin, Rachel S

2012-10-01

Regions rely at least partially on the internal production of a qualified workforce in order to maintain their economic competitiveness. Increasingly, at least from a university or corporate point of view, a qualified workforce is viewed as one that is racially and ethnically diverse. However, the conceptualization and measurement of ethnic and racial diversity in higher education appears to be often based on normative values rather than solid benchmarks, making any regional comparisons or goals difficult to specify. Ideally, at least as a starting point, public state universities would, while attempting to increase overall student diversity, benchmark their progress against the state population composition. This paper combines enrollment data from the National Center for Education Statistics (NCES) with U.S. Census Bureau population estimates data to provide a point of comparison for state universities. The paper has two goals: first a university-level comparison of diversity scores, as measured by the interaction index and, second, an analysis of how university student population composition compares to that of the population the university was originally intended to serve - the state population.
A broad scope knowledge based model for optimization of VMAT in esophageal cancer: validation and assessment of plan quality among different treatment centers.

PubMed

Fogliata, Antonella; Nicolini, Giorgia; Clivio, Alessandro; Vanetti, Eugenio; Laksar, Sarbani; Tozzi, Angelo; Scorsetti, Marta; Cozzi, Luca

2015-10-31

To evaluate the performance of a broad scope model-based optimisation process for volumetric modulated arc therapy applied to esophageal cancer. A set of 70 previously treated patients in two different institutions, were selected to train a model for the prediction of dose-volume constraints. The model was built with a broad-scope purpose, aiming to be effective for different dose prescriptions and tumour localisations. It was validated on three groups of patients from the same institution and from another clinic not providing patients for the training phase. Comparison of the automated plans was done against reference cases given by the clinically accepted plans. Quantitative improvements (statistically significant for the majority of the analysed dose-volume parameters) were observed between the benchmark and the test plans. Of 624 dose-volume objectives assessed for plan evaluation, in 21 cases (3.3 %) the reference plans failed to respect the constraints while the model-based plans succeeded. Only in 3 cases (<0.5 %) the reference plans passed the criteria while the model-based failed. In 5.3 % of the cases both groups of plans failed and in the remaining cases both passed the tests. Plans were optimised using a broad scope knowledge-based model to determine the dose-volume constraints. The results showed dosimetric improvements when compared to the benchmark data. Particularly the plans optimised for patients from the third centre, not participating to the training, resulted in superior quality. The data suggests that the new engine is reliable and could encourage its application to clinical practice.
Excore Modeling with VERAShift

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pandya, Tara M.; Evans, Thomas M.

It is important to be able to accurately predict the neutron flux outside the immediate reactor core for a variety of safety and material analyses. Monte Carlo radiation transport calculations are required to produce the high fidelity excore responses. Under this milestone VERA (specifically the VERAShift package) has been extended to perform excore calculations by running radiation transport calculations with Shift. This package couples VERA-CS with Shift to perform excore tallies for multiple state points concurrently, with each component capable of parallel execution on independent domains. Specifically, this package performs fluence calculations in the core barrel and vessel, or, performsmore » the requested tallies in any user-defined excore regions. VERAShift takes advantage of the general geometry package in Shift. This gives VERAShift the flexibility to explicitly model features outside the core barrel, including detailed vessel models, detectors, and power plant details. A very limited set of experimental and numerical benchmarks is available for excore simulation comparison. The Consortium for the Advanced Simulation of Light Water Reactors (CASL) has developed a set of excore benchmark problems to include as part of the VERA-CS verification and validation (V&V) problems. The excore capability in VERAShift has been tested on small representative assembly problems, multiassembly problems, and quarter-core problems. VERAView has also been extended to visualize these vessel fluence results from VERAShift. Preliminary vessel fluence results for quarter-core multistate calculations look very promising. Further development is needed to determine the details relevant to excore simulations. Validation of VERA for fluence and excore detectors still needs to be performed against experimental and numerical results.« less
Benchmarking in pathology: development of an activity-based costing model.

PubMed

Burnett, Leslie; Wilson, Roger; Pfeffer, Sally; Lowry, John

2012-12-01

Benchmarking in Pathology (BiP) allows pathology laboratories to determine the unit cost of all laboratory tests and procedures, and also provides organisational productivity indices allowing comparisons of performance with other BiP participants. We describe 14 years of progressive enhancement to a BiP program, including the implementation of 'avoidable costs' as the accounting basis for allocation of costs rather than previous approaches using 'total costs'. A hierarchical tree-structured activity-based costing model distributes 'avoidable costs' attributable to the pathology activities component of a pathology laboratory operation. The hierarchical tree model permits costs to be allocated across multiple laboratory sites and organisational structures. This has enabled benchmarking on a number of levels, including test profiles and non-testing related workload activities. The development of methods for dealing with variable cost inputs, allocation of indirect costs using imputation techniques, panels of tests, and blood-bank record keeping, have been successfully integrated into the costing model. A variety of laboratory management reports are produced, including the 'cost per test' of each pathology 'test' output. Benchmarking comparisons may be undertaken at any and all of the 'cost per test' and 'cost per Benchmarking Complexity Unit' level, 'discipline/department' (sub-specialty) level, or overall laboratory/site and organisational levels. We have completed development of a national BiP program. An activity-based costing methodology based on avoidable costs overcomes many problems of previous benchmarking studies based on total costs. The use of benchmarking complexity adjustment permits correction for varying test-mix and diagnostic complexity between laboratories. Use of iterative communication strategies with program participants can overcome many obstacles and lead to innovations.
Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

PubMed

Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

2016-11-01

The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets (disease-gene-variant) that overlapped with entries in UniProt and 5,384 triplets without overlap in UniProt. Analysis of the overlapping triplets and of a stratified sample of the non-overlapping triplets revealed accuracies of 93% and 80% for the respective categories (cumulative accuracy, 77%). We conclude that our process represents an important and broadly applicable improvement to the state of the art for curation of disease-gene-variant relationships.
Monte Carlo reference data sets for imaging research: Executive summary of the report of AAPM Research Committee Task Group 195.

PubMed

Sechopoulos, Ioannis; Ali, Elsayed S M; Badal, Andreu; Badano, Aldo; Boone, John M; Kyprianou, Iacovos S; Mainegra-Hing, Ernesto; McMillan, Kyle L; McNitt-Gray, Michael F; Rogers, D W O; Samei, Ehsan; Turner, Adam C

2015-10-01

The use of Monte Carlo simulations in diagnostic medical imaging research is widespread due to its flexibility and ability to estimate quantities that are challenging to measure empirically. However, any new Monte Carlo simulation code needs to be validated before it can be used reliably. The type and degree of validation required depends on the goals of the research project, but, typically, such validation involves either comparison of simulation results to physical measurements or to previously published results obtained with established Monte Carlo codes. The former is complicated due to nuances of experimental conditions and uncertainty, while the latter is challenging due to typical graphical presentation and lack of simulation details in previous publications. In addition, entering the field of Monte Carlo simulations in general involves a steep learning curve. It is not a simple task to learn how to program and interpret a Monte Carlo simulation, even when using one of the publicly available code packages. This Task Group report provides a common reference for benchmarking Monte Carlo simulations across a range of Monte Carlo codes and simulation scenarios. In the report, all simulation conditions are provided for six different Monte Carlo simulation cases that involve common x-ray based imaging research areas. The results obtained for the six cases using four publicly available Monte Carlo software packages are included in tabular form. In addition to a full description of all simulation conditions and results, a discussion and comparison of results among the Monte Carlo packages and the lessons learned during the compilation of these results are included. This abridged version of the report includes only an introductory description of the six cases and a brief example of the results of one of the cases. This work provides an investigator the necessary information to benchmark his/her Monte Carlo simulation software against the reference cases included here before performing his/her own novel research. In addition, an investigator entering the field of Monte Carlo simulations can use these descriptions and results as a self-teaching tool to ensure that he/she is able to perform a specific simulation correctly. Finally, educators can assign these cases as learning projects as part of course objectives or training programs.
CFD validation experiments for hypersonic flows

NASA Technical Reports Server (NTRS)

Marvin, Joseph G.

1992-01-01

A roadmap for CFD code validation is introduced. The elements of the roadmap are consistent with air-breathing vehicle design requirements and related to the important flow path components: forebody, inlet, combustor, and nozzle. Building block and benchmark validation experiments are identified along with their test conditions and measurements. Based on an evaluation criteria, recommendations for an initial CFD validation data base are given and gaps identified where future experiments could provide new validation data.
Evaluation of Graph Pattern Matching Workloads in Graph Analysis Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, Seokyong; Lee, Sangkeun; Lim, Seung-Hwan

2016-01-01

Graph analysis has emerged as a powerful method for data scientists to represent, integrate, query, and explore heterogeneous data sources. As a result, graph data management and mining became a popular area of research, and led to the development of plethora of systems in recent years. Unfortunately, the number of emerging graph analysis systems and the wide range of applications, coupled with a lack of apples-to-apples comparisons, make it difficult to understand the trade-offs between different systems and the graph operations for which they are designed. A fair comparison of these systems is a challenging task for the following reasons:more » multiple data models, non-standardized serialization formats, various query interfaces to users, and diverse environments they operate in. To address these key challenges, in this paper we present a new benchmark suite by extending the Lehigh University Benchmark (LUBM) to cover the most common capabilities of various graph analysis systems. We provide the design process of the benchmark, which generalizes the workflow for data scientists to conduct the desired graph analysis on different graph analysis systems. Equipped with this extended benchmark suite, we present performance comparison for nine subgraph pattern retrieval operations over six graph analysis systems, namely NetworkX, Neo4j, Jena, Titan, GraphX, and uRiKA. Through the proposed benchmark suite, this study reveals both quantitative and qualitative findings in (1) implications in loading data into each system; (2) challenges in describing graph patterns for each query interface; and (3) different sensitivity of each system to query selectivity. We envision that this study will pave the road for: (i) data scientists to select the suitable graph analysis systems, and (ii) data management system designers to advance graph analysis systems.« less

RASSP Benchmark 4 Technical Description.

DTIC Science & Technology

1998-01-09

be carried out. Based on results of the study, an implementation of all, or part, of the system described in this benchmark technical description...validate interface and timing constraints. The ISA level of modeling defines the limit of detail expected in the VHDL virtual prototype. It does not...develop a set of candidate architectures and perform an architecture trade-off study. Candidate proces- sor implementations must then be examined for
Omega Hawaii Antenna System: Modification and Validation Tests. Volume 2. Data Sheets.

DTIC Science & Technology

1979-10-19

a benchmark because of potential hotel construction . DS 5-1 DATA SHEET 5 (DS-5) RADIO FIELD INTENSITY MEASUREMENTS OMEGA STATION: HAWAII SITE NO. C 1A...27.5 1008 11.05 26.5 1007 Ft 11.80 28.1 COMMENT Not considered for a benchmark because of potential hotel construction . DS 5-5 DATA SHEET 5 (DS-5) RADIO
Two-dimensional free-surface flow under gravity: A new benchmark case for SPH method

NASA Astrophysics Data System (ADS)

Wu, J. Z.; Fang, L.

2018-02-01

Currently there are few free-surface benchmark cases with analytical results for the Smoothed Particle Hydrodynamics (SPH) simulation. In the present contribution we introduce a two-dimensional free-surface flow under gravity, and obtain an analytical expression on the surface height difference and a theoretical estimation on the surface fractal dimension. They are preliminarily validated and supported by SPH calculations.
Picking ChIP-seq peak detectors for analyzing chromatin modification experiments

PubMed Central

Micsinai, Mariann; Parisi, Fabio; Strino, Francesco; Asp, Patrik; Dynlacht, Brian D.; Kluger, Yuval

2012-01-01

Numerous algorithms have been developed to analyze ChIP-Seq data. However, the complexity of analyzing diverse patterns of ChIP-Seq signals, especially for epigenetic marks, still calls for the development of new algorithms and objective comparisons of existing methods. We developed Qeseq, an algorithm to detect regions of increased ChIP read density relative to background. Qeseq employs critical novel elements, such as iterative recalibration and neighbor joining of reads to identify enriched regions of any length. To objectively assess its performance relative to other 14 ChIP-Seq peak finders, we designed a novel protocol based on Validation Discriminant Analysis (VDA) to optimally select validation sites and generated two validation datasets, which are the most comprehensive to date for algorithmic benchmarking of key epigenetic marks. In addition, we systematically explored a total of 315 diverse parameter configurations from these algorithms and found that typically optimal parameters in one dataset do not generalize to other datasets. Nevertheless, default parameters show the most stable performance, suggesting that they should be used. This study also provides a reproducible and generalizable methodology for unbiased comparative analysis of high-throughput sequencing tools that can facilitate future algorithmic development. PMID:22307239
Picking ChIP-seq peak detectors for analyzing chromatin modification experiments.

PubMed

Micsinai, Mariann; Parisi, Fabio; Strino, Francesco; Asp, Patrik; Dynlacht, Brian D; Kluger, Yuval

2012-05-01

Numerous algorithms have been developed to analyze ChIP-Seq data. However, the complexity of analyzing diverse patterns of ChIP-Seq signals, especially for epigenetic marks, still calls for the development of new algorithms and objective comparisons of existing methods. We developed Qeseq, an algorithm to detect regions of increased ChIP read density relative to background. Qeseq employs critical novel elements, such as iterative recalibration and neighbor joining of reads to identify enriched regions of any length. To objectively assess its performance relative to other 14 ChIP-Seq peak finders, we designed a novel protocol based on Validation Discriminant Analysis (VDA) to optimally select validation sites and generated two validation datasets, which are the most comprehensive to date for algorithmic benchmarking of key epigenetic marks. In addition, we systematically explored a total of 315 diverse parameter configurations from these algorithms and found that typically optimal parameters in one dataset do not generalize to other datasets. Nevertheless, default parameters show the most stable performance, suggesting that they should be used. This study also provides a reproducible and generalizable methodology for unbiased comparative analysis of high-throughput sequencing tools that can facilitate future algorithmic development.
[Do you mean benchmarking?].

PubMed

Bonnet, F; Solignac, S; Marty, J

2008-03-01

The purpose of benchmarking is to settle improvement processes by comparing the activities to quality standards. The proposed methodology is illustrated by benchmark business cases performed inside medical plants on some items like nosocomial diseases or organization of surgery facilities. Moreover, the authors have built a specific graphic tool, enhanced with balance score numbers and mappings, so that the comparison between different anesthesia-reanimation services, which are willing to start an improvement program, is easy and relevant. This ready-made application is even more accurate as far as detailed tariffs of activities are implemented.
Revenues and Expenditures: Peer and Benchmark Comparisons--University of Hawai'i Community Colleges, Fiscal Year 1995-96.

ERIC Educational Resources Information Center

Hawaii Univ., Honolulu. Institutional Research Office.

This report presents information comparing the University of Hawaii Community Colleges (UHCC) to benchmark and peer-group institutions on selected financial measures. The primary data sources for this report were the Integrated Postsecondary Education Data System (IPEDS) Finance Survey for the 1995-1996 fiscal year and the IPEDS Fall Enrollment…
Is Higher Better? Determinants and Comparisons of Performance on the Major Field Test in Business

ERIC Educational Resources Information Center

Bielinska-Kwapisz, Agnieszka; Brown, F. William; Semenik, Richard

2012-01-01

Student performance on the Major Field Achievement Test in Business is an important benchmark for college of business programs. The authors' results indicate that such benchmarking can only be meaningful if certain student characteristics are taken into account. The differences in achievement between cohorts are explored in detail by separating…
Revenues and Expenditures: Peer and Benchmark Comparisons, University of Hawai'i, Fiscal Year 1994-95.

ERIC Educational Resources Information Center

Hawaii Univ., Honolulu.

The University of Hawaii's (UH) three university and seven community college campuses are compared with benchmark and peer group institutions with regard to selected financial measures. The primary data sources for this report were the Integrated Postsecondary Education Data System (IPEDS) Finance Survey, Fiscal Year 1994-95. Tables show data on…
Practical application of the benchmarking technique to increase reliability and efficiency of power installations and main heat-mechanic equipment of thermal power plants

NASA Astrophysics Data System (ADS)

Rimov, A. A.; Chukanova, T. I.; Trofimov, Yu. V.

2016-12-01

Data on the comparative analysis variants of the quality of power installations (benchmarking) applied in the power industry is systematized. It is shown that the most efficient variant of implementation of the benchmarking technique is the analysis of statistical distributions of the indicators in the composed homogenous group of the uniform power installations. The benchmarking technique aimed at revealing the available reserves on improvement of the reliability and heat efficiency indicators of the power installations of the thermal power plants is developed in the furtherance of this approach. The technique provides a possibility of reliable comparison of the quality of the power installations in their homogenous group limited by the number and adoption of the adequate decision on improving some or other technical characteristics of this power installation. The technique provides structuring of the list of the comparison indicators and internal factors affecting them represented according to the requirements of the sectoral standards and taking into account the price formation characteristics in the Russian power industry. The mentioned structuring ensures traceability of the reasons of deviation of the internal influencing factors from the specified values. The starting point for further detail analysis of the delay of the certain power installation indicators from the best practice expressed in the specific money equivalent is positioning of this power installation on distribution of the key indicator being a convolution of the comparison indicators. The distribution of the key indicator is simulated by the Monte-Carlo method after receiving the actual distributions of the comparison indicators: specific lost profit due to the short supply of electric energy and short delivery of power, specific cost of losses due to the nonoptimal expenditures for repairs, and specific cost of excess fuel equivalent consumption. The quality loss indicators are developed facilitating the analysis of the benchmarking results permitting to represent the quality loss of this power installation in the form of the difference between the actual value of the key indicator or comparison indicator and the best quartile of the existing distribution. The uncertainty of the obtained values of the quality loss indicators was evaluated by transforming the standard uncertainties of the input values into the expanded uncertainties of the output values with the confidence level of 95%. The efficiency of the technique is demonstrated in terms of benchmarking of the main thermal and mechanical equipment of the extraction power-generating units T-250 and power installations of the thermal power plants with the main steam pressure 130 atm.
Benchmark of Machine Learning Methods for Classification of a SENTINEL-2 Image

NASA Astrophysics Data System (ADS)

Pirotti, F.; Sunar, F.; Piragnolo, M.

2016-06-01

Thanks to mainly ESA and USGS, a large bulk of free images of the Earth is readily available nowadays. One of the main goals of remote sensing is to label images according to a set of semantic categories, i.e. image classification. This is a very challenging issue since land cover of a specific class may present a large spatial and spectral variability and objects may appear at different scales and orientations. In this study, we report the results of benchmarking 9 machine learning algorithms tested for accuracy and speed in training and classification of land-cover classes in a Sentinel-2 dataset. The following machine learning methods (MLM) have been tested: linear discriminant analysis, k-nearest neighbour, random forests, support vector machines, multi layered perceptron, multi layered perceptron ensemble, ctree, boosting, logarithmic regression. The validation is carried out using a control dataset which consists of an independent classification in 11 land-cover classes of an area about 60 km2, obtained by manual visual interpretation of high resolution images (20 cm ground sampling distance) by experts. In this study five out of the eleven classes are used since the others have too few samples (pixels) for testing and validating subsets. The classes used are the following: (i) urban (ii) sowable areas (iii) water (iv) tree plantations (v) grasslands. Validation is carried out using three different approaches: (i) using pixels from the training dataset (train), (ii) using pixels from the training dataset and applying cross-validation with the k-fold method (kfold) and (iii) using all pixels from the control dataset. Five accuracy indices are calculated for the comparison between the values predicted with each model and control values over three sets of data: the training dataset (train), the whole control dataset (full) and with k-fold cross-validation (kfold) with ten folds. Results from validation of predictions of the whole dataset (full) show the random forests method with the highest values; kappa index ranging from 0.55 to 0.42 respectively with the most and least number pixels for training. The two neural networks (multi layered perceptron and its ensemble) and the support vector machines - with default radial basis function kernel - methods follow closely with comparable performance.
Development and efficacy assessments of tea seed oil makeup remover.

PubMed

Parnsamut, N; Kanlayavattanakul, M; Lourith, N

2017-05-01

The efficacy of tea seed oil to clean foundation and eyeliner was evaluated. The safe and efficient tea seed oil makeup remover was developed. In vitro cleansing efficacy of makeup remover was UV-spectrophotometric validated. The stability evaluation by means of accelerated stability test was conducted. In vitro and in vivo cleansing efficacy of the removers was conducted in a comparison with benchmark majorly containing olive oil. Tea seed oil cleaned 90.64±4.56% of foundation and 87.62±8.35% of eyeliner. The stable with most appropriate textures base was incorporated with tea seed oil. Three tea seed oil removers (50, 55 and 60%) were stabled. The 60% tea seed oil remover significantly removed foundation better than others (94.48±3.37%; P<0.001) and the benchmark (92.32±1.33%), but insignificant removed eyeliner (87.50±5.15%; P=0.059). Tea seed oil remover caused none of skin irritation as examined in 20 human volunteers. A single-blind, randomized control exhibited that the tea seed oil remover gained a better preference over the benchmark (75.42±8.10 and 70.00±7.78%; P=0.974). The safe and efficient tea seed oil makeup removers had been developed. The consumers' choices towards the makeup remover containing the bio-oils are widen. In vitro cleansing efficacy during the course of makeup remover development using UV-spectrophotometric method feasible for pharmaceutic industries is encouraged. Copyright © 2016 Académie Nationale de Pharmacie. Published by Elsevier Masson SAS. All rights reserved.
Expectations of clinical teachers and faculty regarding development of the CanMEDS-Family Medicine competencies: Laval developmental benchmarks scale for family medicine residency training.

PubMed

Lacasse, Miriam; Théorêt, Johanne; Tessier, Sylvie; Arsenault, Louise

2014-01-01

The CanMEDS-Family Medicine (CanMEDS-FM) framework defines the expected terminal enabling competencies (EC) for family medicine (FM) residency training in Canada. However, benchmarks throughout the 2-year program are not yet defined. This study aimed to identify expected time frames for achievement of the CanMEDS-FM competencies during FM residency training and create a developmental benchmarks scale for family medicine residency training. This 2011-2012 study followed a Delphi methodology. Selected faculty and clinical teachers identified, via questionnaire, the expected time of EC achievement from beginning of residency to one year in practice (0, 6, 12, […] 36 months). The 15-85th percentile intervals became the expected competency achievement interval. Content validity of the obtained benchmarks was assessed through a second Delphi round. The 1st and 2nd rounds were completed by 33 and 27 respondents, respectively. A developmental benchmarks scale was designed after the 1st round to illustrate expectations regarding achievement of each EC. The 2nd round (content validation) led to minor adjustments (1.9±2.7 months) of intervals for 44 of the 92 competencies, the others remaining unchanged. The Laval Developmental Benchmarks Scale for Family Medicine clarifies expectations regarding achievement of competencies throughout FM training. In a competency-based education system this now allows identification and management of outlying residents, both those excelling and needing remediation. Further research should focus on assessment of the scale reliability after pilot implementation in family medicine clinical teaching units at Laval University, and corroborate the established timeline in other sites.
a Dosimetry Assessment for the Core Restraint of AN Advanced Gas Cooled Reactor

NASA Astrophysics Data System (ADS)

Thornton, D. A.; Allen, D. A.; Tyrrell, R. J.; Meese, T. C.; Huggon, A. P.; Whiley, G. S.; Mossop, J. R.

2009-08-01

This paper describes calculations of neutron damage rates within the core restraint structures of Advanced Gas Cooled Reactors (AGRs). Using advanced features of the Monte Carlo radiation transport code MCBEND, and neutron source data from core follow calculations performed with the reactor physics code PANTHER, a detailed model of the reactor cores of two of British Energy's AGR power plants has been developed for this purpose. Because there are no relevant neutron fluence measurements directly supporting this assessment, results of benchmark comparisons and successful validation of MCBEND for Magnox reactors have been used to estimate systematic and random uncertainties on the predictions. In particular, it has been necessary to address the known under-prediction of lower energy fast neutron responses associated with the penetration of large thicknesses of graphite.
Celeris: A GPU-accelerated open source software with a Boussinesq-type wave solver for real-time interactive simulation and visualization

NASA Astrophysics Data System (ADS)

Tavakkol, Sasan; Lynett, Patrick

2017-08-01

In this paper, we introduce an interactive coastal wave simulation and visualization software, called Celeris. Celeris is an open source software which needs minimum preparation to run on a Windows machine. The software solves the extended Boussinesq equations using a hybrid finite volume-finite difference method and supports moving shoreline boundaries. The simulation and visualization are performed on the GPU using Direct3D libraries, which enables the software to run faster than real-time. Celeris provides a first-of-its-kind interactive modeling platform for coastal wave applications and it supports simultaneous visualization with both photorealistic and colormapped rendering capabilities. We validate our software through comparison with three standard benchmarks for non-breaking and breaking waves.
Nonlocal elasticity and shear deformation effects on thermal buckling of a CNT embedded in a viscoelastic medium

NASA Astrophysics Data System (ADS)

Zenkour, A. M.

2018-05-01

The thermal buckling analysis of carbon nanotubes embedded in a visco-Pasternak's medium is investigated. The Eringen's nonlocal elasticity theory, in conjunction with the first-order Donnell's shell theory, is used for this purpose. The surrounding medium is considered as a three-parameter viscoelastic foundation model, Winkler-Pasternak's model as well as a viscous damping coefficient. The governing equilibrium equations are obtained and solved for carbon nanotubes subjected to different thermal and mechanical loads. The effects of nonlocal parameter, radius and length of nanotube, and the three foundation parameters on the thermal buckling of the nanotube are studied. Sample critical buckling loads are reported and graphically illustrated to check the validity of the present results and to present benchmarks for future comparisons.
Imidazole derivatives as angiotensin II AT1 receptor blockers: Benchmarks, drug-like calculations and quantitative structure-activity relationships modeling

NASA Astrophysics Data System (ADS)

Alloui, Mebarka; Belaidi, Salah; Othmani, Hasna; Jaidane, Nejm-Eddine; Hochlaf, Majdi

2018-03-01

We performed benchmark studies on the molecular geometry, electron properties and vibrational analysis of imidazole using semi-empirical, density functional theory and post Hartree-Fock methods. These studies validated the use of AM1 for the treatment of larger systems. Then, we treated the structural, physical and chemical relationships for a series of imidazole derivatives acting as angiotensin II AT1 receptor blockers using AM1. QSAR studies were done for these imidazole derivatives using a combination of various physicochemical descriptors. A multiple linear regression procedure was used to design the relationships between molecular descriptor and the activity of imidazole derivatives. Results validate the derived QSAR model.
Towards unbiased benchmarking of evolutionary and hybrid algorithms for real-valued optimisation

NASA Astrophysics Data System (ADS)

MacNish, Cara

2007-12-01

Randomised population-based algorithms, such as evolutionary, genetic and swarm-based algorithms, and their hybrids with traditional search techniques, have proven successful and robust on many difficult real-valued optimisation problems. This success, along with the readily applicable nature of these techniques, has led to an explosion in the number of algorithms and variants proposed. In order for the field to advance it is necessary to carry out effective comparative evaluations of these algorithms, and thereby better identify and understand those properties that lead to better performance. This paper discusses the difficulties of providing benchmarking of evolutionary and allied algorithms that is both meaningful and logistically viable. To be meaningful the benchmarking test must give a fair comparison that is free, as far as possible, from biases that favour one style of algorithm over another. To be logistically viable it must overcome the need for pairwise comparison between all the proposed algorithms. To address the first problem, we begin by attempting to identify the biases that are inherent in commonly used benchmarking functions. We then describe a suite of test problems, generated recursively as self-similar or fractal landscapes, designed to overcome these biases. For the second, we describe a server that uses web services to allow researchers to 'plug in' their algorithms, running on their local machines, to a central benchmarking repository.
A CFD validation roadmap for hypersonic flows

NASA Technical Reports Server (NTRS)

Marvin, Joseph G.

1992-01-01

A roadmap for computational fluid dynamics (CFD) code validation is developed. The elements of the roadmap are consistent with air-breathing vehicle design requirements and related to the important flow path components: forebody, inlet, combustor, and nozzle. Building block and benchmark validation experiments are identified along with their test conditions and measurements. Based on an evaluation criteria, recommendations for an initial CFD validation data base are given and gaps identified where future experiments would provide the needed validation data.
A CFD validation roadmap for hypersonic flows

NASA Technical Reports Server (NTRS)

Marvin, Joseph G.

1993-01-01

A roadmap for computational fluid dynamics (CFD) code validation is developed. The elements of the roadmap are consistent with air-breathing vehicle design requirements and related to the important flow path components: forebody, inlet, combustor, and nozzle. Building block and benchmark validation experiments are identified along with their test conditions and measurements. Based on an evaluation criteria, recommendations for an initial CFD validation data base are given and gaps identified where future experiments would provide the needed validation data.

Benchmark Shock Tube Experiments for Radiative Heating Relevant to Earth Re-Entry

NASA Technical Reports Server (NTRS)

Brandis, A. M.; Cruden, B. A.

2017-01-01

Detailed spectrally and spatially resolved radiance has been measured in the Electric Arc Shock Tube (EAST) facility for conditions relevant to high speed entry into a variety of atmospheres, including Earth, Venus, Titan, Mars and the Outer Planets. The tests that measured radiation relevant for Earth re-entry are the focus of this work and are taken from campaigns 47, 50, 52 and 57. These tests covered conditions from 8 km/s to 15.5 km/s at initial pressures ranging from 0.05 Torr to 1 Torr, of which shots at 0.1 and 0.2 Torr are analyzed in this paper. These conditions cover a range of points of interest for potential fight missions, including return from Low Earth Orbit, the Moon and Mars. The large volume of testing available from EAST is useful for statistical analysis of radiation data, but is problematic for identifying representative experiments for performing detailed analysis. Therefore, the intent of this paper is to select a subset of benchmark test data that can be considered for further detailed study. These benchmark shots are intended to provide more accessible data sets for future code validation studies and facility-to-facility comparisons. The shots that have been selected as benchmark data are the ones in closest agreement to a line of best fit through all of the EAST results, whilst also showing the best experimental characteristics, such as test time and convergence to equilibrium. The EAST data are presented in different formats for analysis. These data include the spectral radiance at equilibrium, the spatial dependence of radiance over defined wavelength ranges and the mean non-equilibrium spectral radiance (so-called 'spectral non-equilibrium metric'). All the information needed to simulate each experimental trace, including free-stream conditions, shock time of arrival (i.e. x-t) relation, and the spectral and spatial resolution functions, are provided.
Design of an Evolutionary Approach for Intrusion Detection

PubMed Central

2013-01-01

A novel evolutionary approach is proposed for effective intrusion detection based on benchmark datasets. The proposed approach can generate a pool of noninferior individual solutions and ensemble solutions thereof. The generated ensembles can be used to detect the intrusions accurately. For intrusion detection problem, the proposed approach could consider conflicting objectives simultaneously like detection rate of each attack class, error rate, accuracy, diversity, and so forth. The proposed approach can generate a pool of noninferior solutions and ensembles thereof having optimized trade-offs values of multiple conflicting objectives. In this paper, a three-phase, approach is proposed to generate solutions to a simple chromosome design in the first phase. In the first phase, a Pareto front of noninferior individual solutions is approximated. In the second phase of the proposed approach, the entire solution set is further refined to determine effective ensemble solutions considering solution interaction. In this phase, another improved Pareto front of ensemble solutions over that of individual solutions is approximated. The ensemble solutions in improved Pareto front reported improved detection results based on benchmark datasets for intrusion detection. In the third phase, a combination method like majority voting method is used to fuse the predictions of individual solutions for determining prediction of ensemble solution. Benchmark datasets, namely, KDD cup 1999 and ISCX 2012 dataset, are used to demonstrate and validate the performance of the proposed approach for intrusion detection. The proposed approach can discover individual solutions and ensemble solutions thereof with a good support and a detection rate from benchmark datasets (in comparison with well-known ensemble methods like bagging and boosting). In addition, the proposed approach is a generalized classification approach that is applicable to the problem of any field having multiple conflicting objectives, and a dataset can be represented in the form of labelled instances in terms of its features. PMID:24376390
Neural network approach to quantum-chemistry data: accurate prediction of density functional theory energies.

PubMed

Balabin, Roman M; Lomakina, Ekaterina I

2009-08-21

Artificial neural network (ANN) approach has been applied to estimate the density functional theory (DFT) energy with large basis set using lower-level energy values and molecular descriptors. A total of 208 different molecules were used for the ANN training, cross validation, and testing by applying BLYP, B3LYP, and BMK density functionals. Hartree-Fock results were reported for comparison. Furthermore, constitutional molecular descriptor (CD) and quantum-chemical molecular descriptor (QD) were used for building the calibration model. The neural network structure optimization, leading to four to five hidden neurons, was also carried out. The usage of several low-level energy values was found to greatly reduce the prediction error. An expected error, mean absolute deviation, for ANN approximation to DFT energies was 0.6+/-0.2 kcal mol(-1). In addition, the comparison of the different density functionals with the basis sets and the comparison of multiple linear regression results were also provided. The CDs were found to overcome limitation of the QD. Furthermore, the effective ANN model for DFT/6-311G(3df,3pd) and DFT/6-311G(2df,2pd) energy estimation was developed, and the benchmark results were provided.
Validating vignette and conjoint survey experiments against real-world behavior

PubMed Central

Hainmueller, Jens; Hangartner, Dominik; Yamamoto, Teppei

2015-01-01

Survey experiments, like vignette and conjoint analyses, are widely used in the social sciences to elicit stated preferences and study how humans make multidimensional choices. However, there is a paucity of research on the external validity of these methods that examines whether the determinants that explain hypothetical choices made by survey respondents match the determinants that explain what subjects actually do when making similar choices in real-world situations. This study compares results from conjoint and vignette analyses on which immigrant attributes generate support for naturalization with closely corresponding behavioral data from a natural experiment in Switzerland, where some municipalities used referendums to decide on the citizenship applications of foreign residents. Using a representative sample from the same population and the official descriptions of applicant characteristics that voters received before each referendum as a behavioral benchmark, we find that the effects of the applicant attributes estimated from the survey experiments perform remarkably well in recovering the effects of the same attributes in the behavioral benchmark. We also find important differences in the relative performances of the different designs. Overall, the paired conjoint design, where respondents evaluate two immigrants side by side, comes closest to the behavioral benchmark; on average, its estimates are within 2% percentage points of the effects in the behavioral benchmark. PMID:25646415
Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived frommore » calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.« less
Thermo-hydro-mechanical-chemical processes in fractured-porous media: Benchmarks and examples

NASA Astrophysics Data System (ADS)

Kolditz, O.; Shao, H.; Görke, U.; Kalbacher, T.; Bauer, S.; McDermott, C. I.; Wang, W.

2012-12-01

The book comprises an assembly of benchmarks and examples for porous media mechanics collected over the last twenty years. Analysis of thermo-hydro-mechanical-chemical (THMC) processes is essential to many applications in environmental engineering, such as geological waste deposition, geothermal energy utilisation, carbon capture and storage, water resources management, hydrology, even climate change. In order to assess the feasibility as well as the safety of geotechnical applications, process-based modelling is the only tool to put numbers, i.e. to quantify future scenarios. This charges a huge responsibility concerning the reliability of computational tools. Benchmarking is an appropriate methodology to verify the quality of modelling tools based on best practices. Moreover, benchmarking and code comparison foster community efforts. The benchmark book is part of the OpenGeoSys initiative - an open source project to share knowledge and experience in environmental analysis and scientific computation.
Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE PAGES

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.; ...

2016-03-07

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived frommore » calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.« less
Benchmarking: applications to transfusion medicine.

PubMed

Apelseth, Torunn Oveland; Molnar, Laura; Arnold, Emmy; Heddle, Nancy M

2012-10-01

Benchmarking is as a structured continuous collaborative process in which comparisons for selected indicators are used to identify factors that, when implemented, will improve transfusion practices. This study aimed to identify transfusion medicine studies reporting on benchmarking, summarize the benchmarking approaches used, and identify important considerations to move the concept of benchmarking forward in the field of transfusion medicine. A systematic review of published literature was performed to identify transfusion medicine-related studies that compared at least 2 separate institutions or regions with the intention of benchmarking focusing on 4 areas: blood utilization, safety, operational aspects, and blood donation. Forty-five studies were included: blood utilization (n = 35), safety (n = 5), operational aspects of transfusion medicine (n = 5), and blood donation (n = 0). Based on predefined criteria, 7 publications were classified as benchmarking, 2 as trending, and 36 as single-event studies. Three models of benchmarking are described: (1) a regional benchmarking program that collects and links relevant data from existing electronic sources, (2) a sentinel site model where data from a limited number of sites are collected, and (3) an institutional-initiated model where a site identifies indicators of interest and approaches other institutions. Benchmarking approaches are needed in the field of transfusion medicine. Major challenges include defining best practices and developing cost-effective methods of data collection. For those interested in initiating a benchmarking program, the sentinel site model may be most effective and sustainable as a starting point, although the regional model would be the ideal goal. Copyright © 2012 Elsevier Inc. All rights reserved.
BACT Simulation User Guide (Version 7.0)

NASA Technical Reports Server (NTRS)

Waszak, Martin R.

1997-01-01

This report documents the structure and operation of a simulation model of the Benchmark Active Control Technology (BACT) Wind-Tunnel Model. The BACT system was designed, built, and tested at NASA Langley Research Center as part of the Benchmark Models Program and was developed to perform wind-tunnel experiments to obtain benchmark quality data to validate computational fluid dynamics and computational aeroelasticity codes, to verify the accuracy of current aeroservoelasticity design and analysis tools, and to provide an active controls testbed for evaluating new and innovative control algorithms for flutter suppression and gust load alleviation. The BACT system has been especially valuable as a control system testbed.
Benchmarking the D-Wave Two

NASA Astrophysics Data System (ADS)

Job, Joshua; Wang, Zhihui; Rønnow, Troels; Troyer, Matthias; Lidar, Daniel

2014-03-01

We report on experimental work benchmarking the performance of the D-Wave Two programmable annealer on its native Ising problem, and a comparison to available classical algorithms. In this talk we will focus on the comparison with an algorithm originally proposed and implemented by Alex Selby. This algorithm uses dynamic programming to repeatedly optimize over randomly selected maximal induced trees of the problem graph starting from a random initial state. If one is looking for a quantum advantage over classical algorithms, one should compare to classical algorithms which are designed and optimized to maximally take advantage of the structure of the type of problem one is using for the comparison. In that light, this classical algorithm should serve as a good gauge for any potential quantum speedup for the D-Wave Two.
High-order continuum kinetic method for modeling plasma dynamics in phase space

DOE PAGES

Vogman, G. V.; Colella, P.; Shumlak, U.

2014-12-15

Continuum methods offer a high-fidelity means of simulating plasma kinetics. While computationally intensive, these methods are advantageous because they can be cast in conservation-law form, are not susceptible to noise, and can be implemented using high-order numerical methods. Advances in continuum method capabilities for modeling kinetic phenomena in plasmas require the development of validation tools in higher dimensional phase space and an ability to handle non-cartesian geometries. To that end, a new benchmark for validating Vlasov-Poisson simulations in 3D (x,v x,v y) is presented. The benchmark is based on the Dory-Guest-Harris instability and is successfully used to validate a continuummore » finite volume algorithm. To address challenges associated with non-cartesian geometries, unique features of cylindrical phase space coordinates are described. Preliminary results of continuum kinetic simulations in 4D (r,z,v r,v z) phase space are presented.« less
Assessing validity of observational intervention studies - the Benchmarking Controlled Trials.

PubMed

Malmivaara, Antti

2016-09-01

Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. To create and pilot test a checklist for appraising methodological validity of a BCT. The checklist was created by extracting the most essential elements from the comprehensive set of criteria in the previous paper on BCTs. Also checklists and scientific papers on observational studies and respective systematic reviews were utilized. Ten BCTs published in the Lancet and in the New England Journal of Medicine were used to assess feasibility of the created checklist. The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies. However, the piloted checklist should be validated in further studies. Key messages Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. This paper presents a checklist for appraising methodological validity of BCTs and pilot-tests the checklist with ten BCTs published in leading medical journals. The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies.
A Web Resource for Standardized Benchmark Datasets, Metrics, and Rosetta Protocols for Macromolecular Modeling and Design.

PubMed

Ó Conchúir, Shane; Barlow, Kyle A; Pache, Roland A; Ollikainen, Noah; Kundert, Kale; O'Meara, Matthew J; Smith, Colin A; Kortemme, Tanja

2015-01-01

The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a "best practice" set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.
ANAlyte: A modular image analysis tool for ANA testing with indirect immunofluorescence.

PubMed

Di Cataldo, Santa; Tonti, Simone; Bottino, Andrea; Ficarra, Elisa

2016-05-01

The automated analysis of indirect immunofluorescence images for Anti-Nuclear Autoantibody (ANA) testing is a fairly recent field that is receiving ever-growing interest from the research community. ANA testing leverages on the categorization of intensity level and fluorescent pattern of IIF images of HEp-2 cells to perform a differential diagnosis of important autoimmune diseases. Nevertheless, it suffers from tremendous lack of repeatability due to subjectivity in the visual interpretation of the images. The automatization of the analysis is seen as the only valid solution to this problem. Several works in literature address individual steps of the work-flow, nonetheless integrating such steps and assessing their effectiveness as a whole is still an open challenge. We present a modular tool, ANAlyte, able to characterize a IIF image in terms of fluorescent intensity level and fluorescent pattern without any user-interactions. For this purpose, ANAlyte integrates the following: (i) Intensity Classifier module, that categorizes the intensity level of the input slide based on multi-scale contrast assessment; (ii) Cell Segmenter module, that splits the input slide into individual HEp-2 cells; (iii) Pattern Classifier module, that determines the fluorescent pattern of the slide based on the pattern of the individual cells. To demonstrate the accuracy and robustness of our tool, we experimentally validated ANAlyte on two different public benchmarks of IIF HEp-2 images with rigorous leave-one-out cross-validation strategy. We obtained overall accuracy of fluorescent intensity and pattern classification respectively around 85% and above 90%. We assessed all results by comparisons with some of the most representative state of the art works. Unlike most of the other works in the recent literature, ANAlyte aims at the automatization of all the major steps of ANA image analysis. Results on public benchmarks demonstrate that the tool can characterize HEp-2 slides in terms of intensity and fluorescent pattern with accuracy better or comparable with the state of the art techniques, even when such techniques are run on manually segmented cells. Hence, ANAlyte can be proposed as a valid solution to the problem of ANA testing automatization. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Assessing the Accuracy of Generalized Inferences From Comparison Group Studies Using a Within-Study Comparison Approach: The Methodology.

PubMed

Jaciw, Andrew P

2016-06-01

Various studies have examined bias in impact estimates from comparison group studies (CGSs) of job training programs, and in education, where results are benchmarked against experimental results. Such within-study comparison (WSC) approaches investigate levels of bias in CGS-based impact estimates, as well as the success of various design and analytic strategies for reducing bias. This article reviews past literature and summarizes conditions under which CGSs replicate experimental benchmark results. It extends the framework to, and develops the methodology for, situations where results from CGSs are generalized to untreated inference populations. Past research is summarized; methods are developed to examine bias in program impact estimates based on cross-site comparisons in a multisite trial that are evaluated against site-specific experimental benchmarks. Students in Grades K-3 in 79 schools in Tennessee; students in Grades 4-8 in 82 schools in Alabama. Grades K-3 Stanford Achievement Test (SAT) in reading and math scores; Grades 4-8 SAT10 reading scores. Past studies show that bias in CGS-based estimates can be limited through strong design, with local matching, and appropriate analysis involving pretest covariates and variables that represent selection processes. Extension of the methodology to investigate accuracy of generalized estimates from CGSs shows bias from confounders and effect moderators. CGS results, when extrapolated to untreated inference populations, may be biased due to variation in outcomes and impact. Accounting for effects of confounders or moderators may reduce bias. © The Author(s) 2016.
Computers for real time flight simulation: A market survey

NASA Technical Reports Server (NTRS)

Bekey, G. A.; Karplus, W. J.

1977-01-01

An extensive computer market survey was made to determine those available systems suitable for current and future flight simulation studies at Ames Research Center. The primary requirement is for the computation of relatively high frequency content (5 Hz) math models representing powered lift flight vehicles. The Rotor Systems Research Aircraft (RSRA) was used as a benchmark vehicle for computation comparison studies. The general nature of helicopter simulations and a description of the benchmark model are presented, and some of the sources of simulation difficulties are examined. A description of various applicable computer architectures is presented, along with detailed discussions of leading candidate systems and comparisons between them.
A comparison of common programming languages used in bioinformatics.

PubMed

Fourment, Mathieu; Gillings, Michael R

2008-02-05

The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.
A homology-based pipeline for global prediction of post-translational modification sites

NASA Astrophysics Data System (ADS)

Chen, Xiang; Shi, Shao-Ping; Xu, Hao-Dong; Suo, Sheng-Bao; Qiu, Jian-Ding

2016-05-01

The pathways of protein post-translational modifications (PTMs) have been shown to play particularly important roles for almost any biological process. Identification of PTM substrates along with information on the exact sites is fundamental for fully understanding or controlling biological processes. Alternative computational strategies would help to annotate PTMs in a high-throughput manner. Traditional algorithms are suited for identifying the common organisms and tissues that have a complete PTM atlas or extensive experimental data. While annotation of rare PTMs in most organisms is a clear challenge. In this work, to this end we have developed a novel homology-based pipeline named PTMProber that allows identification of potential modification sites for most of the proteomes lacking PTMs data. Cross-promotion E-value (CPE) as stringent benchmark has been used in our pipeline to evaluate homology to known modification sites. Independent-validation tests show that PTMProber achieves over 58.8% recall with high precision by CPE benchmark. Comparisons with other machine-learning tools show that PTMProber pipeline performs better on general predictions. In addition, we developed a web-based tool to integrate this pipeline at http://bioinfo.ncu.edu.cn/PTMProber/index.aspx. In addition to pre-constructed prediction models of PTM, the website provides an extensional functionality to allow users to customize models.
Benchmarking nitrogen removal suspended-carrier biofilm systems using dynamic simulation.

PubMed

Vanhooren, H; Yuan, Z; Vanrolleghem, P A

2002-01-01

We are witnessing an enormous growth in biological nitrogen removal from wastewater. It presents specific challenges beyond traditional COD (carbon) removal. A possibility for optimised process design is the use of biomass-supporting media. In this paper, attached growth processes (AGP) are evaluated using dynamic simulations. The advantages of these systems that were qualitatively described elsewhere, are validated quantitatively based on a simulation benchmark for activated sludge treatment systems. This simulation benchmark is extended with a biofilm model that allows for fast and accurate simulation of the conversion of different substrates in a biofilm. The economic feasibility of this system is evaluated using the data generated with the benchmark simulations. Capital savings due to volume reduction and reduced sludge production are weighed out against increased aeration costs. In this evaluation, effluent quality is integrated as well.
Autoreject: Automated artifact rejection for MEG and EEG data.

PubMed

Jas, Mainak; Engemann, Denis A; Bekhti, Yousra; Raimondo, Federico; Gramfort, Alexandre

2017-10-01

We present an automated algorithm for unified rejection and repair of bad trials in magnetoencephalography (MEG) and electroencephalography (EEG) signals. Our method capitalizes on cross-validation in conjunction with a robust evaluation metric to estimate the optimal peak-to-peak threshold - a quantity commonly used for identifying bad trials in M/EEG. This approach is then extended to a more sophisticated algorithm which estimates this threshold for each sensor yielding trial-wise bad sensors. Depending on the number of bad sensors, the trial is then repaired by interpolation or by excluding it from subsequent analysis. All steps of the algorithm are fully automated thus lending itself to the name Autoreject. In order to assess the practical significance of the algorithm, we conducted extensive validation and comparisons with state-of-the-art methods on four public datasets containing MEG and EEG recordings from more than 200 subjects. The comparisons include purely qualitative efforts as well as quantitatively benchmarking against human supervised and semi-automated preprocessing pipelines. The algorithm allowed us to automate the preprocessing of MEG data from the Human Connectome Project (HCP) going up to the computation of the evoked responses. The automated nature of our method minimizes the burden of human inspection, hence supporting scalability and reliability demanded by data analysis in modern neuroscience. Copyright © 2017 Elsevier Inc. All rights reserved.

Benchmark problems for numerical implementations of phase field models

DOE PAGES

Jokisaari, A. M.; Voorhees, P. W.; Guyer, J. E.; ...

2016-10-01

Here, we present the first set of benchmark problems for phase field models that are being developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST). While many scientific research areas use a limited set of well-established software, the growing phase field community continues to develop a wide variety of codes and lacks benchmark problems to consistently evaluate the numerical performance of new implementations. Phase field modeling has become significantly more popular as computational power has increased and is now becoming mainstream, driving the need for benchmark problems to validate and verifymore » new implementations. We follow the example set by the micromagnetics community to develop an evolving set of benchmark problems that test the usability, computational resources, numerical capabilities and physical scope of phase field simulation codes. In this paper, we propose two benchmark problems that cover the physics of solute diffusion and growth and coarsening of a second phase via a simple spinodal decomposition model and a more complex Ostwald ripening model. We demonstrate the utility of benchmark problems by comparing the results of simulations performed with two different adaptive time stepping techniques, and we discuss the needs of future benchmark problems. The development of benchmark problems will enable the results of quantitative phase field models to be confidently incorporated into integrated computational materials science and engineering (ICME), an important goal of the Materials Genome Initiative.« less
Educating Next Generation Nuclear Criticality Safety Engineers at the Idaho National Laboratory

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. D. Bess; J. B. Briggs; A. S. Garcia

2011-09-01

One of the challenges in educating our next generation of nuclear safety engineers is the limitation of opportunities to receive significant experience or hands-on training prior to graduation. Such training is generally restricted to on-the-job-training before this new engineering workforce can adequately provide assessment of nuclear systems and establish safety guidelines. Participation in the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and the International Reactor Physics Experiment Evaluation Project (IRPhEP) can provide students and young professionals the opportunity to gain experience and enhance critical engineering skills. The ICSBEP and IRPhEP publish annual handbooks that contain evaluations of experiments along withmore » summarized experimental data and peer-reviewed benchmark specifications to support the validation of neutronics codes, nuclear cross-section data, and the validation of reactor designs. Participation in the benchmark process not only benefits those who use these Handbooks within the international community, but provides the individual with opportunities for professional development, networking with an international community of experts, and valuable experience to be used in future employment. Traditionally students have participated in benchmarking activities via internships at national laboratories, universities, or companies involved with the ICSBEP and IRPhEP programs. Additional programs have been developed to facilitate the nuclear education of students while participating in the benchmark projects. These programs include coordination with the Center for Space Nuclear Research (CSNR) Next Degree Program, the Collaboration with the Department of Energy Idaho Operations Office to train nuclear and criticality safety engineers, and student evaluations as the basis for their Master's thesis in nuclear engineering.« less
Derivation, Validation and Application of a Pragmatic Risk Prediction Index for Benchmarking of Surgical Outcomes.

PubMed

Spence, Richard T; Chang, David C; Kaafarani, Haytham M A; Panieri, Eugenio; Anderson, Geoffrey A; Hutter, Matthew M

2018-02-01

Despite the existence of multiple validated risk assessment and quality benchmarking tools in surgery, their utility outside of high-income countries is limited. We sought to derive, validate and apply a scoring system that is both (1) feasible, and (2) reliably predicts mortality in a middle-income country (MIC) context. A 5-step methodology was used: (1) development of a de novo surgical outcomes database modeled around the American College of Surgeons' National Surgical Quality Improvement Program (ACS-NSQIP) in South Africa (SA dataset), (2) use of the resultant data to identify all predictors of in-hospital death with more than 90% capture indicating feasibility of collection, (3) use these predictors to derive and validate an integer-based score that reliably predicts in-hospital death in the 2012 ACS-NSQIP, (4) apply the score in the original SA dataset and demonstrate its performance, (5) identify threshold cutoffs of the score to prompt action and drive quality improvement. Following step one-three above, the 13 point Codman's score was derived and validated on 211,737 and 109,079 patients, respectively, and includes: age 65 (1), partially or completely dependent functional status (1), preoperative transfusions ≥4 units (1), emergency operation (2), sepsis or septic shock (2) American Society of Anesthesia score ≥3 (3) and operative procedure (1-3). Application of the score to 373 patients in the SA dataset showed good discrimination and calibration to predict an in-hospital death. A Codman Score of 8 is an optimal cutoff point for defining expected and unexpected deaths. We have designed a novel risk prediction score specific for a MIC context. The Codman Score can prove useful for both (1) preoperative decision-making and (2) benchmarking the quality of surgical care in MIC's.
Validation of adenosine triphosphate to audit manual cleaning of flexible endoscope channels.

PubMed

Alfa, Michelle J; Fatima, Iram; Olson, Nancy

2013-03-01

Compliance with cleaning of flexible endoscope channels cannot be verified using visual inspection. Adenosine triphosphate (ATP) has been suggested as a possible rapid cleaning monitor for flexible endoscope channels. There have not been published validation studies to specify the level of ATP that indicates inadequate cleaning has been achieved. The objective of this study was to validate the Clean-Trace (3M Inc, St. Paul, MN) ATP water test method for monitoring manual cleaning of flexible endoscopes. This was a simulated use study using a duodenoscope as the test device. Artificial test soil containing 10(6) colony-forming units of Pseudomonas aeruginosa and Enterococcus faecalis was used to perfuse all channels. The flush sample method for the suction-biopsy (L1) or air-water channel (L2) using 40 and 20 mLs sterile reverse osmosis water, respectively, was validated. Residuals of ATP, protein, hemoglobin, and bioburden were quantitated from channel samples taken from uncleaned, partially cleaned, and fully cleaned duodenoscopes. The benchmarks for clean were as follows: <6.4 μg/cm(2) protein, <2.2 μg/cm(2) hemoglobin, and <4-log10 colony-forming units/cm(2) bioburden. The average ATP in clean channel samples was 27.7 RLUs and 154 RLUs for L1 and L2, respectively (<200 RLUs for all channels). The average protein, hemoglobin, and bioburden benchmarks were achieved if <200 RLUs were detected. If the channel sample was >200 RLUs, the residual organic and bioburden levels would exceed the acceptable benchmarks. Our data validated that flexible endoscopes that have complete manual cleaning will have <200 RLUs by the Clean-Trace ATP test. Copyright © 2013 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Mosby, Inc. All rights reserved.
Estimation of suspended particulate matter in turbid coastal waters: application to hyperspectral satellite imagery.

PubMed

Zhao, Jun; Cao, Wenxi; Xu, Zhantang; Ye, Haibin; Yang, Yuezhong; Wang, Guifen; Zhou, Wen; Sun, Zhaohua

2018-04-16

An empirical algorithm is proposed to estimate suspended particulate matter (SPM) ranging from 0.675 to 25.7 mg L -1 in the turbid Pearl River estuary (PRE). Comparisons between model predicted and in situ measured SPM resulted in R 2 s of 0.97 and 0.88 and mean absolute percentage errors (MAPEs) of 23.96% and 29.69% by using the calibration and validation data sets, respectively. The developed algorithm demonstrated the highest accuracy when compared with existing ones for turbid coastal waters. The diurnal dynamics of SPM was revealed by applying the proposed algorithm to reflectance data collected by a moored buoy in the PRE. The established algorithm was implemented to Hyperspectral Imager for the Coastal Ocean (HICO) data and the distribution pattern of SPM in the PRE was elucidated. Validation of HICO-derived reflectance data by using concurrent MODIS/Aqua data as a benchmark indicated their reliability. Factors influencing variability of SPM in the PRE were analyzed, which implicated the combined effects of wind, tide, rainfall, and circulation as the cause.
The Modeling of Advanced BWR Fuel Designs with the NRC Fuel Depletion Codes PARCS/PATHS

DOE PAGES

Ward, Andrew; Downar, Thomas J.; Xu, Y.; ...

2015-04-22

The PATHS (PARCS Advanced Thermal Hydraulic Solver) code was developed at the University of Michigan in support of U.S. Nuclear Regulatory Commission research to solve the steady-state, two-phase, thermal-hydraulic equations for a boiling water reactor (BWR) and to provide thermal-hydraulic feedback for BWR depletion calculations with the neutronics code PARCS (Purdue Advanced Reactor Core Simulator). The simplified solution methodology, including a three-equation drift flux formulation and an optimized iteration scheme, yields very fast run times in comparison to conventional thermal-hydraulic systems codes used in the industry, while still retaining sufficient accuracy for applications such as BWR depletion calculations. Lastly, themore » capability to model advanced BWR fuel designs with part-length fuel rods and heterogeneous axial channel flow geometry has been implemented in PATHS, and the code has been validated against previously benchmarked advanced core simulators as well as BWR plant and experimental data. We describe the modifications to the codes and the results of the validation in this paper.« less
Nonlinear system identification of smart structures under high impact loads

NASA Astrophysics Data System (ADS)

Sarp Arsava, Kemal; Kim, Yeesock; El-Korchi, Tahar; Park, Hyo Seon

2013-05-01

The main purpose of this paper is to develop numerical models for the prediction and analysis of the highly nonlinear behavior of integrated structure control systems subjected to high impact loading. A time-delayed adaptive neuro-fuzzy inference system (TANFIS) is proposed for modeling of the complex nonlinear behavior of smart structures equipped with magnetorheological (MR) dampers under high impact forces. Experimental studies are performed to generate sets of input and output data for training and validation of the TANFIS models. The high impact load and current signals are used as the input disturbance and control signals while the displacement and acceleration responses from the structure-MR damper system are used as the output signals. The benchmark adaptive neuro-fuzzy inference system (ANFIS) is used as a baseline. Comparisons of the trained TANFIS models with experimental results demonstrate that the TANFIS modeling framework is an effective way to capture nonlinear behavior of integrated structure-MR damper systems under high impact loading. In addition, the performance of the TANFIS model is much better than that of ANFIS in both the training and the validation processes.
Benchmarking initiatives in the water industry.

PubMed

Parena, R; Smeets, E

2001-01-01

Customer satisfaction and service care are every day pushing professionals in the water industry to seek to improve their performance, lowering costs and increasing the provided service level. Process Benchmarking is generally recognised as a systematic mechanism of comparing one's own utility with other utilities or businesses with the intent of self-improvement by adopting structures or methods used elsewhere. The IWA Task Force on Benchmarking, operating inside the Statistics and Economics Committee, has been committed to developing a general accepted concept of Process Benchmarking to support water decision-makers in addressing issues of efficiency. In a first step the Task Force disseminated among the Committee members a questionnaire focused on providing suggestions about the kind, the evolution degree and the main concepts of Benchmarking adopted in the represented Countries. A comparison among the guidelines adopted in The Netherlands and Scandinavia has recently challenged the Task Force in drafting a methodology for a worldwide process benchmarking in water industry. The paper provides a framework of the most interesting benchmarking experiences in the water sector and describes in detail both the final results of the survey and the methodology focused on identification of possible improvement areas.
A review on the benchmarking concept in Malaysian construction safety performance

NASA Astrophysics Data System (ADS)

Ishak, Nurfadzillah; Azizan, Muhammad Azizi

2018-02-01

Construction industry is one of the major industries that propels Malaysia's economy in highly contributes to our nation's GDP growth, yet the high fatality rates on construction sites have caused concern among safety practitioners and the stakeholders. Hence, there is a need of benchmarking in performance of Malaysia's construction industry especially in terms of safety. This concept can create a fertile ground for ideas, but only in a receptive environment, organization that share good practices and compare their safety performance against other benefit most to establish improvement in safety culture. This research was conducted to study the awareness important, evaluate current practice and improvement, and also identify the constraint in implement of benchmarking on safety performance in our industry. Additionally, interviews with construction professionals were come out with different views on this concept. Comparison has been done to show the different understanding of benchmarking approach and how safety performance can be benchmarked. But, it's viewed as one mission, which to evaluate objectives identified through benchmarking that will improve the organization's safety performance. Finally, the expected result from this research is to help Malaysia's construction industry implement best practice in safety performance management through the concept of benchmarking.
Validating the performance of correlated fission multiplicity implementation in radiation transport codes with subcritical neutron multiplication benchmark experiments

DOE PAGES

Arthur, Jennifer; Bahran, Rian; Hutchinson, Jesson; ...

2018-06-14

Historically, radiation transport codes have uncorrelated fission emissions. In reality, the particles emitted by both spontaneous and induced fissions are correlated in time, energy, angle, and multiplicity. This work validates the performance of various current Monte Carlo codes that take into account the underlying correlated physics of fission neutrons, specifically neutron multiplicity distributions. The performance of 4 Monte Carlo codes - MCNP®6.2, MCNP®6.2/FREYA, MCNP®6.2/CGMF, and PoliMi - was assessed using neutron multiplicity benchmark experiments. In addition, MCNP®6.2 simulations were run using JEFF-3.2 and JENDL-4.0, rather than ENDF/B-VII.1, data for 239Pu and 240Pu. The sensitive benchmark parameters that in this workmore » represent the performance of each correlated fission multiplicity Monte Carlo code include the singles rate, the doubles rate, leakage multiplication, and Feynman histograms. Although it is difficult to determine which radiation transport code shows the best overall performance in simulating subcritical neutron multiplication inference benchmark measurements, it is clear that correlations exist between the underlying nuclear data utilized by (or generated by) the various codes, and the correlated neutron observables of interest. This could prove useful in nuclear data validation and evaluation applications, in which a particular moment of the neutron multiplicity distribution is of more interest than the other moments. It is also quite clear that, because transport is handled by MCNP®6.2 in 3 of the 4 codes, with the 4th code (PoliMi) being based on an older version of MCNP®, the differences in correlated neutron observables of interest are most likely due to the treatment of fission event generation in each of the different codes, as opposed to the radiation transport.« less
Validating the performance of correlated fission multiplicity implementation in radiation transport codes with subcritical neutron multiplication benchmark experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arthur, Jennifer; Bahran, Rian; Hutchinson, Jesson

Historically, radiation transport codes have uncorrelated fission emissions. In reality, the particles emitted by both spontaneous and induced fissions are correlated in time, energy, angle, and multiplicity. This work validates the performance of various current Monte Carlo codes that take into account the underlying correlated physics of fission neutrons, specifically neutron multiplicity distributions. The performance of 4 Monte Carlo codes - MCNP®6.2, MCNP®6.2/FREYA, MCNP®6.2/CGMF, and PoliMi - was assessed using neutron multiplicity benchmark experiments. In addition, MCNP®6.2 simulations were run using JEFF-3.2 and JENDL-4.0, rather than ENDF/B-VII.1, data for 239Pu and 240Pu. The sensitive benchmark parameters that in this workmore » represent the performance of each correlated fission multiplicity Monte Carlo code include the singles rate, the doubles rate, leakage multiplication, and Feynman histograms. Although it is difficult to determine which radiation transport code shows the best overall performance in simulating subcritical neutron multiplication inference benchmark measurements, it is clear that correlations exist between the underlying nuclear data utilized by (or generated by) the various codes, and the correlated neutron observables of interest. This could prove useful in nuclear data validation and evaluation applications, in which a particular moment of the neutron multiplicity distribution is of more interest than the other moments. It is also quite clear that, because transport is handled by MCNP®6.2 in 3 of the 4 codes, with the 4th code (PoliMi) being based on an older version of MCNP®, the differences in correlated neutron observables of interest are most likely due to the treatment of fission event generation in each of the different codes, as opposed to the radiation transport.« less
A European benchmarking system to evaluate in-hospital mortality rates in acute coronary syndrome: the EURHOBOP project.

PubMed

Dégano, Irene R; Subirana, Isaac; Torre, Marina; Grau, María; Vila, Joan; Fusco, Danilo; Kirchberger, Inge; Ferrières, Jean; Malmivaara, Antti; Azevedo, Ana; Meisinger, Christa; Bongard, Vanina; Farmakis, Dimitros; Davoli, Marina; Häkkinen, Unto; Araújo, Carla; Lekakis, John; Elosua, Roberto; Marrugat, Jaume

2015-03-01

Hospital performance models in acute myocardial infarction (AMI) are useful to assess patient management. While models are available for individual countries, mainly US, cross-European performance models are lacking. Thus, we aimed to develop a system to benchmark European hospitals in AMI and percutaneous coronary intervention (PCI), based on predicted in-hospital mortality. We used the EURopean HOspital Benchmarking by Outcomes in ACS Processes (EURHOBOP) cohort to develop the models, which included 11,631 AMI patients and 8276 acute coronary syndrome (ACS) patients who underwent PCI. Models were validated with a cohort of 55,955 European ACS patients. Multilevel logistic regression was used to predict in-hospital mortality in European hospitals for AMI and PCI. Administrative and clinical models were constructed with patient- and hospital-level covariates, as well as hospital- and country-based random effects. Internal cross-validation and external validation showed good discrimination at the patient level and good calibration at the hospital level, based on the C-index (0.736-0.819) and the concordance correlation coefficient (55.4%-80.3%). Mortality ratios (MRs) showed excellent concordance between administrative and clinical models (97.5% for AMI and 91.6% for PCI). Exclusion of transfers and hospital stays ≤1day did not affect in-hospital mortality prediction in sensitivity analyses, as shown by MR concordance (80.9%-85.4%). Models were used to develop a benchmarking system to compare in-hospital mortality rates of European hospitals with similar characteristics. The developed system, based on the EURHOBOP models, is a simple and reliable tool to compare in-hospital mortality rates between European hospitals in AMI and PCI. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Benchmarking working conditions for health and safety in the frontline healthcare industry: Perspectives from Australia and Malaysia.

PubMed

McLinton, Sarven S; Loh, May Young; Dollard, Maureen F; Tuckey, Michelle M R; Idris, Mohd Awang; Morton, Sharon

2018-04-06

To present benchmarks for working conditions in healthcare industries as an initial effort into international surveillance. The healthcare industry is fundamental to sustaining the health of Australians, yet it is under immense pressure. Budgets are limited, demands are increasing as are workplace injuries and all of these factors compromise patient care. Urgent attention is needed to reduce strains on workers and costs in health care, however, little work has been done to benchmark psychosocial factors in healthcare working conditions in the Asia-Pacific. Intercultural comparisons are important to provide an evidence base for public policy. A cross-sectional design was used (like other studies of prevalence), including a mixed-methods approach with qualitative interviews to better contextualize the results. Data on psychosocial factors and other work variables were collected from healthcare workers in three hospitals in Australia (N = 1,258) and Malaysia (N = 1,125). 2015 benchmarks were calculated for each variable and comparison was conducted via independent samples t tests. Healthcare samples were also compared with benchmarks for non-healthcare general working populations from their respective countries: Australia (N = 973) and Malaysia (N = 225). Our study benchmarks healthcare working conditions in Australia and Malaysia against the general working population, identifying trends that indicate the industry is in need of intervention strategies and job redesign initiatives that better support psychological health and safety. We move toward a better understanding of the precursors of psychosocial safety climate in a broader context, including similarities and differences between Australia and Malaysia in national culture, government occupational health and safety policies and top-level management practices. © 2018 John Wiley & Sons Ltd.
Benchmarking in pathology: development of a benchmarking complexity unit and associated key performance indicators.

PubMed

Neil, Amanda; Pfeffer, Sally; Burnett, Leslie

2013-01-01

This paper details the development of a new type of pathology laboratory productivity unit, the benchmarking complexity unit (BCU). The BCU provides a comparative index of laboratory efficiency, regardless of test mix. It also enables estimation of a measure of how much complex pathology a laboratory performs, and the identification of peer organisations for the purposes of comparison and benchmarking. The BCU is based on the theory that wage rates reflect productivity at the margin. A weighting factor for the ratio of medical to technical staff time was dynamically calculated based on actual participant site data. Given this weighting, a complexity value for each test, at each site, was calculated. The median complexity value (number of BCUs) for that test across all participating sites was taken as its complexity value for the Benchmarking in Pathology Program. The BCU allowed implementation of an unbiased comparison unit and test listing that was found to be a robust indicator of the relative complexity for each test. Employing the BCU data, a number of Key Performance Indicators (KPIs) were developed, including three that address comparative organisational complexity, analytical depth and performance efficiency, respectively. Peer groups were also established using the BCU combined with simple organisational and environmental metrics. The BCU has enabled productivity statistics to be compared between organisations. The BCU corrects for differences in test mix and workload complexity of different organisations and also allows for objective stratification into peer groups.
Measurement of predictive validity in violence risk assessment studies: a second-order systematic review.

PubMed

Singh, Jay P; Desmarais, Sarah L; Van Dorn, Richard A

2013-01-01

The objective of the present review was to examine how predictive validity is analyzed and reported in studies of instruments used to assess violence risk. We reviewed 47 predictive validity studies published between 1990 and 2011 of 25 instruments that were included in two recent systematic reviews. Although all studies reported receiver operating characteristic curve analyses and the area under the curve (AUC) performance indicator, this methodology was defined inconsistently and findings often were misinterpreted. In addition, there was between-study variation in benchmarks used to determine whether AUCs were small, moderate, or large in magnitude. Though virtually all of the included instruments were designed to produce categorical estimates of risk - through the use of either actuarial risk bins or structured professional judgments - only a minority of studies calculated performance indicators for these categorical estimates. In addition to AUCs, other performance indicators, such as correlation coefficients, were reported in 60% of studies, but were infrequently defined or interpreted. An investigation of sources of heterogeneity did not reveal significant variation in reporting practices as a function of risk assessment approach (actuarial vs. structured professional judgment), study authorship, geographic location, type of journal (general vs. specialized audience), sample size, or year of publication. Findings suggest a need for standardization of predictive validity reporting to improve comparison across studies and instruments. Copyright © 2013 John Wiley & Sons, Ltd.
Energy saving in WWTP: Daily benchmarking under uncertainty and data availability limitations.

PubMed

Torregrossa, D; Schutz, G; Cornelissen, A; Hernández-Sancho, F; Hansen, J

2016-07-01

Efficient management of Waste Water Treatment Plants (WWTPs) can produce significant environmental and economic benefits. Energy benchmarking can be used to compare WWTPs, identify targets and use these to improve their performance. Different authors have performed benchmark analysis on monthly or yearly basis but their approaches suffer from a time lag between an event, its detection, interpretation and potential actions. The availability of on-line measurement data on many WWTPs should theoretically enable the decrease of the management response time by daily benchmarking. Unfortunately this approach is often impossible because of limited data availability. This paper proposes a methodology to perform a daily benchmark analysis under database limitations. The methodology has been applied to the Energy Online System (EOS) developed in the framework of the project "INNERS" (INNovative Energy Recovery Strategies in the urban water cycle). EOS calculates a set of Key Performance Indicators (KPIs) for the evaluation of energy and process performances. In EOS, the energy KPIs take in consideration the pollutant load in order to enable the comparison between different plants. For example, EOS does not analyse the energy consumption but the energy consumption on pollutant load. This approach enables the comparison of performances for plants with different loads or for a single plant under different load conditions. The energy consumption is measured by on-line sensors, while the pollutant load is measured in the laboratory approximately every 14 days. Consequently, the unavailability of the water quality parameters is the limiting factor in calculating energy KPIs. In this paper, in order to overcome this limitation, the authors have developed a methodology to estimate the required parameters and manage the uncertainty in the estimation. By coupling the parameter estimation with an interval based benchmark approach, the authors propose an effective, fast and reproducible way to manage infrequent inlet measurements. Its use enables benchmarking on a daily basis and prepares the ground for further investigation. Copyright © 2016 Elsevier Inc. All rights reserved.
Benchmarking child and adolescent mental health organizations.

PubMed

Brann, Peter; Walter, Garry; Coombs, Tim

2011-04-01

This paper describes aspects of the child and adolescent benchmarking forums that were part of the National Mental Health Benchmarking Project (NMHBP). These forums enabled participating child and adolescent mental health organizations to benchmark themselves against each other, with a view to understanding variability in performance against a range of key performance indicators (KPIs). Six child and adolescent mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against relevant KPIs. They also undertook two special projects designed to help them understand the variation in performance on given KPIs. There was considerable inter-organization variability on many of the KPIs. Even within organizations, there was often substantial variability over time. The variability in indicator data raised many questions for participants. This challenged participants to better understand and describe their local processes, prompted them to collect additional data, and stimulated them to make organizational comparisons. These activities fed into a process of reflection about their performance. Benchmarking has the potential to illuminate intra- and inter-organizational performance in the child and adolescent context.
Solidification of a binary alloy: Finite-element, single-domain simulation and new benchmark solutions

NASA Astrophysics Data System (ADS)

Le Bars, Michael; Worster, M. Grae

2006-07-01

A finite-element simulation of binary alloy solidification based on a single-domain formulation is presented and tested. Resolution of phase change is first checked by comparison with the analytical results of Worster [M.G. Worster, Solidification of an alloy from a cooled boundary, J. Fluid Mech. 167 (1986) 481-501] for purely diffusive solidification. Fluid dynamical processes without phase change are then tested by comparison with previous numerical studies of thermal convection in a pure fluid [G. de Vahl Davis, Natural convection of air in a square cavity: a bench mark numerical solution, Int. J. Numer. Meth. Fluids 3 (1983) 249-264; D.A. Mayne, A.S. Usmani, M. Crapper, h-adaptive finite element solution of high Rayleigh number thermally driven cavity problem, Int. J. Numer. Meth. Heat Fluid Flow 10 (2000) 598-615; D.C. Wan, B.S.V. Patnaik, G.W. Wei, A new benchmark quality solution for the buoyancy driven cavity by discrete singular convolution, Numer. Heat Transf. 40 (2001) 199-228], in a porous medium with a constant porosity [G. Lauriat, V. Prasad, Non-darcian effects on natural convection in a vertical porous enclosure, Int. J. Heat Mass Transf. 32 (1989) 2135-2148; P. Nithiarasu, K.N. Seetharamu, T. Sundararajan, Natural convective heat transfer in an enclosure filled with fluid saturated variable porosity medium, Int. J. Heat Mass Transf. 40 (1997) 3955-3967] and in a mixed liquid-porous medium with a spatially variable porosity [P. Nithiarasu, K.N. Seetharamu, T. Sundararajan, Natural convective heat transfer in an enclosure filled with fluid saturated variable porosity medium, Int. J. Heat Mass Transf. 40 (1997) 3955-3967; N. Zabaras, D. Samanta, A stabilized volume-averaging finite element method for flow in porous media and binary alloy solidification processes, Int. J. Numer. Meth. Eng. 60 (2004) 1103-1138]. Finally, new benchmark solutions for simultaneous flow through both fluid and porous domains and for convective solidification processes are presented, based on the similarity solutions in corner-flow geometries recently obtained by Le Bars and Worster [M. Le Bars, M.G. Worster, Interfacial conditions between a pure fluid and a porous medium: implications for binary alloy solidification, J. Fluid Mech. (in press)]. Good agreement is found for all tests, hence validating our physical and numerical methods. More generally, the computations presented here could now be considered as standard and reliable analytical benchmarks for numerical simulations, specifically and independently testing the different processes underlying binary alloy solidification.
Working Group 1 "Advanced GNSS Processing Techniques" of the COST Action GNSS4SWEC: Overview of main achievements

NASA Astrophysics Data System (ADS)

Douša, Jan; Dick, Galina; Kačmařík, Michal; Václavovic, Pavel; Pottiaux, Eric; Zus, Florian; Brenot, Hugues; Moeller, Gregor; Hinterberger, Fabian; Pacione, Rosa; Stuerze, Andrea; Eben, Kryštof; Teferle, Norman; Ding, Wenwu; Morel, Laurent; Kaplon, Jan; Hordyniec, Pavel; Rohm, Witold

2017-04-01

The COST Action ES1206 GNSS4SWEC addresses new exploitations of the synergy between developments in GNSS and meteorological communities. The Working Group 1 (Advanced GNSS processing techniques) deals with implementing and assessing new methods for GNSS tropospheric monitoring and precise positioning exploiting all modern GNSS constellations, signals, products etc. Besides other goals, WG1 coordinates development of advanced tropospheric products in support of weather numerical and non-numerical nowcasting. These are ultra-fast and high-resolution tropospheric products available in real time or in a sub-hourly fashion and parameters in support of monitoring an anisotropy of the troposphere, e.g. horizontal gradients and tropospheric slant path delays. This talk gives an overview of WG1 activities and, particularly, achievements in two activities, Benchmark and Real-time demonstration campaigns. For the Benchmark campaign a complex data set of GNSS observations and various meteorological data were collected for a two-month period in 2013 (May-June) which included severe weather events in central Europe. An initial processing of data sets from GNSS and numerical weather models (NWM) provided independently estimated reference parameters - ZTDs and tropospheric horizontal gradients. The comparison of horizontal tropospheric gradients from GNSS and NWM data demonstrated a very good agreement among independent solutions with negligible biases and an accuracy of about 0.5 mm. Visual comparisons of maps of zenith wet delays and tropospheric horizontal gradients showed very promising results for future exploitations of advanced GNSS tropospheric products in meteorological applications such as severe weather event monitoring and weather nowcasting. The Benchmark data set is also used for an extensive validation of line-of-sight tropospheric Slant Total Delays (STD) from GNSS, NWM-raytracing and Water Vapour Radiometer (WVR) solutions. Seven institutions delivered their STDs estimated based on GNSS observations processed using different software and strategies. STDs from NWM ray-tracing came from three institutions using four different NWM models. Results show generally a very good mutual agreement among all solutions from all techniques. The influence of adding not cleaned GNSS post-fit residuals, i.e. residuals that still contains non-tropospheric systematic effects such as multipath, to estimated STDs will be presented. The Real-time demonstration campaign aims at enhancing and assessing ultra-fast GNSS tropospheric products for severe weather and NWM nowcasting. Results are showed from real-time demonstrations as well as offline production simulating real-time using Benchmark campaign.
Using benchmarking techniques and the 2011 maternity practices infant nutrition and care (mPINC) survey to improve performance among peer groups across the United States.

PubMed

Edwards, Roger A; Dee, Deborah; Umer, Amna; Perrine, Cria G; Shealy, Katherine R; Grummer-Strawn, Laurence M

2014-02-01

A substantial proportion of US maternity care facilities engage in practices that are not evidence-based and that interfere with breastfeeding. The CDC Survey of Maternity Practices in Infant Nutrition and Care (mPINC) showed significant variation in maternity practices among US states. The purpose of this article is to use benchmarking techniques to identify states within relevant peer groups that were top performers on mPINC survey indicators related to breastfeeding support. We used 11 indicators of breastfeeding-related maternity care from the 2011 mPINC survey and benchmarking techniques to organize and compare hospital-based maternity practices across the 50 states and Washington, DC. We created peer categories for benchmarking first by region (grouping states by West, Midwest, South, and Northeast) and then by size (grouping states by the number of maternity facilities and dividing each region into approximately equal halves based on the number of facilities). Thirty-four states had scores high enough to serve as benchmarks, and 32 states had scores low enough to reflect the lowest score gap from the benchmark on at least 1 indicator. No state served as the benchmark on more than 5 indicators and no state was furthest from the benchmark on more than 7 indicators. The small peer group benchmarks in the South, West, and Midwest were better than the large peer group benchmarks on 91%, 82%, and 36% of the indicators, respectively. In the West large, the Midwest large, the Midwest small, and the South large peer groups, 4-6 benchmarks showed that less than 50% of hospitals have ideal practice in all states. The evaluation presents benchmarks for peer group state comparisons that provide potential and feasible targets for improvement.

Benchmarking Discount Rate in Natural Resource Damage Assessment with Risk Aversion.

PubMed

Wu, Desheng; Chen, Shuzhen

2017-08-01

Benchmarking a credible discount rate is of crucial importance in natural resource damage assessment (NRDA) and restoration evaluation. This article integrates a holistic framework of NRDA with prevailing low discount rate theory, and proposes a discount rate benchmarking decision support system based on service-specific risk aversion. The proposed approach has the flexibility of choosing appropriate discount rates for gauging long-term services, as opposed to decisions based simply on duration. It improves injury identification in NRDA since potential damages and side-effects to ecosystem services are revealed within the service-specific framework. A real embankment case study demonstrates valid implementation of the method. © 2017 Society for Risk Analysis.
Experimental and theoretical study of magnetohydrodynamic ship models.

PubMed

Cébron, David; Viroulet, Sylvain; Vidal, Jérémie; Masson, Jean-Paul; Viroulet, Philippe

2017-01-01

Magnetohydrodynamic (MHD) ships represent a clear demonstration of the Lorentz force in fluids, which explains the number of students practicals or exercises described on the web. However, the related literature is rather specific and no complete comparison between theory and typical small scale experiments is currently available. This work provides, in a self-consistent framework, a detailed presentation of the relevant theoretical equations for small MHD ships and experimental measurements for future benchmarks. Theoretical results of the literature are adapted to these simple battery/magnets powered ships moving on salt water. Comparison between theory and experiments are performed to validate each theoretical step such as the Tafel and the Kohlrausch laws, or the predicted ship speed. A successful agreement is obtained without any adjustable parameter. Finally, based on these results, an optimal design is then deduced from the theory. Therefore this work provides a solid theoretical and experimental ground for small scale MHD ships, by presenting in detail several approximations and how they affect the boat efficiency. Moreover, the theory is general enough to be adapted to other contexts, such as large scale ships or industrial flow measurement techniques.
Experimental and theoretical study of magnetohydrodynamic ship models

PubMed Central

Viroulet, Sylvain; Vidal, Jérémie; Masson, Jean-Paul; Viroulet, Philippe

2017-01-01

Magnetohydrodynamic (MHD) ships represent a clear demonstration of the Lorentz force in fluids, which explains the number of students practicals or exercises described on the web. However, the related literature is rather specific and no complete comparison between theory and typical small scale experiments is currently available. This work provides, in a self-consistent framework, a detailed presentation of the relevant theoretical equations for small MHD ships and experimental measurements for future benchmarks. Theoretical results of the literature are adapted to these simple battery/magnets powered ships moving on salt water. Comparison between theory and experiments are performed to validate each theoretical step such as the Tafel and the Kohlrausch laws, or the predicted ship speed. A successful agreement is obtained without any adjustable parameter. Finally, based on these results, an optimal design is then deduced from the theory. Therefore this work provides a solid theoretical and experimental ground for small scale MHD ships, by presenting in detail several approximations and how they affect the boat efficiency. Moreover, the theory is general enough to be adapted to other contexts, such as large scale ships or industrial flow measurement techniques. PMID:28665941
Decoys Selection in Benchmarking Datasets: Overview and Perspectives

PubMed Central

Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

2018-01-01

Virtual Screening (VS) is designed to prospectively help identifying potential hits, i.e., compounds capable of interacting with a given target and potentially modulate its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is the most adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compounds subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds that has considerably changed over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoys selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets. PMID:29416509
Validation of Short-Term Noise Assessment Procedures: FY16 Summary of Procedures, Progress, and Preliminary Results

DTIC Science & Technology

Validation project. This report describes the procedure used to generate the noise models output dataset , and then it compares that dataset to the...benchmark, the Engineer Research and Development Centers Long-Range Sound Propagation dataset . It was found that the models consistently underpredict the
The Benchmarking Capacity of a General Outcome Measure of Academic Language in Science and Social Studies

ERIC Educational Resources Information Center

Mooney, Paul; Lastrapes, Renée E.

2016-01-01

The amount of research evaluating the technical merits of general outcome measures of science and social studies achievement is growing. This study targeted criterion validity for critical content monitoring. Questions addressed the concurrent criterion validity of alternate presentation formats of critical content monitoring and the measure's…
VizieR Online Data Catalog: Surface gravity determination in late-type stars (Morel+, 2012)

NASA Astrophysics Data System (ADS)

Morel, T.; Miglio, A.

2012-06-01

The frequency of maximum oscillation power measured in dwarfs and giants exhibiting solar-like pulsations provides a precise, and potentially accurate, inference of the stellar surface gravity. An extensive comparison for about 40 well-studied pulsating stars with gravities derived using classical methods (ionization balance, pressure-sensitive spectral features or location with respect to evolutionary tracks) supports the validity of this technique and reveals an overall remarkable agreement with mean differences not exceeding 0.05dex (although with a dispersion of up to ~0.2dex). It is argued that interpolation in theoretical isochrones may be the most precise way of estimating the gravity by traditional means in nearby dwarfs. Attention is drawn to the usefulness of seismic targets as benchmarks in the context of large-scale surveys. (1 data file).
Uterine Artery Embolization: An Analysis of Online Patient Information Quality and Readability with Historical Comparison.

PubMed

Murray, Timothy E; Mansoor, Tayyaub; Bowden, Dermot J; O'Neill, Damien C; Lee, Michael J

2018-05-01

Investigators aimed to assess online information describing uterine artery embolization (UAE) to examine the quality and readability of websites patients are accessing. A list of applicable, commonly used searchable terms was generated, including "Uterine Artery Embolization," "Fibroid Embolization," "Uterine Fibroid Embolization," and "Uterine Artery Embolisation." Each possible term was assessed across the five most-used English language search engines to determine the most commonly used term. The most common term was then investigated across each search engine, with the first 25 pages returned by each engine included for analysis. Duplicate pages, nontext content such as video or audio, and pages behind paywalls were excluded. Pages were analyzed for quality and readability using validated tools including DISCERN score, JAMA Benchmark Criteria, HONcode Certification, Flesch Reading Ease Score, Flesch-Kincaid Grade Level, and Gunning-Fog Index. Secondary features such as age, rank, author, and publisher were recorded. The most common applicable term was "Uterine Artery Embolization" (492,900 results). Mean DISCERN quality of information provided by UAE websites is "fair"; however, it has declined since comparative 2012 studies. Adherence to JAMA Benchmark Criteria has reduced to 6.7%. UAE website readability remains more difficult than the World Health Organization-recommended 7-8th grade reading levels. HONcode-certified websites (35.6%) demonstrated significantly higher quality than noncertified websites. Quality of online UAE information remains "fair." Adherence to JAMA benchmark criteria is poor. Readability is above recommended 7-8th grade levels. HONcode certification was predictive of higher website quality, a useful guide to patients requesting additional information. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Benchmarking the QUAD4/TRIA3 element

NASA Technical Reports Server (NTRS)

Pitrof, Stephen M.; Venkayya, Vipperla B.

1993-01-01

The QUAD4 and TRIA3 elements are the primary plate/shell elements in NASTRAN. These elements enable the user to analyze thin plate/shell structures for membrane, bending and shear phenomena. They are also very new elements in the NASTRAN library. These elements are extremely versatile and constitute a substantially enhanced analysis capability in NASTRAN. However, with the versatility comes the burden of understanding a myriad of modeling implications and their effect on accuracy and analysis quality. The validity of many aspects of these elements were established through a series of benchmark problem results and comparison with those available in the literature and obtained from other programs like MSC/NASTRAN and CSAR/NASTRAN. Never-the-less such a comparison is never complete because of the new and creative use of these elements in complex modeling situations. One of the important features of QUAD4 and TRIA3 elements is the offset capability which allows the midsurface of the plate to be noncoincident with the surface of the grid points. None of the previous elements, with the exception of bar (beam), has this capability. The offset capability played a crucial role in the design of QUAD4 and TRIA3 elements. It allowed modeling layered composites, laminated plates and sandwich plates with the metal and composite face sheets. Even though the basic implementation of the offset capability is found to be sound in the previous applications, there is some uncertainty in relatively simple applications. The main purpose of this paper is to test the integrity of the offset capability and provide guidelines for its effective use. For the purpose of simplicity, references in this paper to the QUAD4 element will also include the TRIA3 element.
Benchmark Simulation of Natural Circulation Cooling System with Salt Working Fluid Using SAM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahmed, K. K.; Scarlat, R. O.; Hu, R.

Liquid salt-cooled reactors, such as the Fluoride Salt-Cooled High-Temperature Reactor (FHR), offer passive decay heat removal through natural circulation using Direct Reactor Auxiliary Cooling System (DRACS) loops. The behavior of such systems should be well-understood through performance analysis. The advanced system thermal-hydraulics tool System Analysis Module (SAM) from Argonne National Laboratory has been selected for this purpose. The work presented here is part of a larger study in which SAM modeling capabilities are being enhanced for the system analyses of FHR or Molten Salt Reactors (MSR). Liquid salt thermophysical properties have been implemented in SAM, as well as properties ofmore » Dowtherm A, which is used as a simulant fluid for scaled experiments, for future code validation studies. Additional physics modules to represent phenomena specific to salt-cooled reactors, such as freezing of coolant, are being implemented in SAM. This study presents a useful first benchmark for the applicability of SAM to liquid salt-cooled reactors: it provides steady-state and transient comparisons for a salt reactor system. A RELAP5-3D model of the Mark-1 Pebble-Bed FHR (Mk1 PB-FHR), and in particular its DRACS loop for emergency heat removal, provides steady state and transient results for flow rates and temperatures in the system that are used here for code-to-code comparison with SAM. The transient studied is a loss of forced circulation with SCRAM event. To the knowledge of the authors, this is the first application of SAM to FHR or any other molten salt reactors. While building these models in SAM, any gaps in the code’s capability to simulate such systems are identified and addressed immediately, or listed as future improvements to the code.« less
Optimized selection of benchmark test parameters for image watermark algorithms based on Taguchi methods and corresponding influence on design decisions for real-world applications

NASA Astrophysics Data System (ADS)

Rodriguez, Tony F.; Cushman, David A.

2003-06-01

With the growing commercialization of watermarking techniques in various application scenarios it has become increasingly important to quantify the performance of watermarking products. The quantification of relative merits of various products is not only essential in enabling further adoption of the technology by society as a whole, but will also drive the industry to develop testing plans/methodologies to ensure quality and minimize cost (to both vendors & customers.) While the research community understands the theoretical need for a publicly available benchmarking system to quantify performance, there has been less discussion on the practical application of these systems. By providing a standard set of acceptance criteria, benchmarking systems can dramatically increase the quality of a particular watermarking solution, validating the product performances if they are used efficiently and frequently during the design process. In this paper we describe how to leverage specific design of experiments techniques to increase the quality of a watermarking scheme, to be used with the benchmark tools being developed by the Ad-Hoc Watermark Verification Group. A Taguchi Loss Function is proposed for an application and orthogonal arrays used to isolate optimal levels for a multi-factor experimental situation. Finally, the results are generalized to a population of cover works and validated through an exhaustive test.
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are usedmore » to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.« less
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE PAGES

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.; ...

2017-06-13

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are usedmore » to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.« less
OECD-NEA Expert Group on Multi-Physics Experimental Data, Benchmarks and Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valentine, Timothy; Rohatgi, Upendra S.

High-fidelity, multi-physics modeling and simulation (M&S) tools are being developed and utilized for a variety of applications in nuclear science and technology and show great promise in their abilities to reproduce observed phenomena for many applications. Even with the increasing fidelity and sophistication of coupled multi-physics M&S tools, the underpinning models and data still need to be validated against experiments that may require a more complex array of validation data because of the great breadth of the time, energy and spatial domains of the physical phenomena that are being simulated. The Expert Group on Multi-Physics Experimental Data, Benchmarks and Validationmore » (MPEBV) of the Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) was formed to address the challenges with the validation of such tools. The work of the MPEBV expert group is shared among three task forces to fulfill its mandate and specific exercises are being developed to demonstrate validation principles for common industrial challenges. This paper describes the overall mission of the group, the specific objectives of the task forces, the linkages among the task forces, and the development of a validation exercise that focuses on a specific reactor challenge problem.« less
Evaluation of the Pool Critical Assembly Benchmark with Explicitly-Modeled Geometry using MCNP6

DOE PAGES

Kulesza, Joel A.; Martz, Roger Lee

2017-03-01

Despite being one of the most widely used benchmarks for qualifying light water reactor (LWR) radiation transport methods and data, no benchmark calculation of the Oak Ridge National Laboratory (ORNL) Pool Critical Assembly (PCA) pressure vessel wall benchmark facility (PVWBF) using MCNP6 with explicitly modeled core geometry exists. As such, this paper provides results for such an analysis. First, a criticality calculation is used to construct the fixed source term. Next, ADVANTG-generated variance reduction parameters are used within the final MCNP6 fixed source calculations. These calculations provide unadjusted dosimetry results using three sets of dosimetry reaction cross sections of varyingmore » ages (those packaged with MCNP6, from the IRDF-2002 multi-group library, and from the ACE-formatted IRDFF v1.05 library). These results are then compared to two different sets of measured reaction rates. The comparison agrees in an overall sense within 2% and on a specific reaction- and dosimetry location-basis within 5%. Except for the neptunium dosimetry, the individual foil raw calculation-to-experiment comparisons usually agree within 10% but is typically greater than unity. Finally, in the course of developing these calculations, geometry that has previously not been completely specified is provided herein for the convenience of future analysts.« less
Assessing validity of observational intervention studies – the Benchmarking Controlled Trials

PubMed Central

Malmivaara, Antti

2016-01-01

Abstract Background: Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. Aims: To create and pilot test a checklist for appraising methodological validity of a BCT. Methods: The checklist was created by extracting the most essential elements from the comprehensive set of criteria in the previous paper on BCTs. Also checklists and scientific papers on observational studies and respective systematic reviews were utilized. Ten BCTs published in the Lancet and in the New England Journal of Medicine were used to assess feasibility of the created checklist. Results: The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. Conclusions: The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies. However, the piloted checklist should be validated in further studies.Key messagesBenchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations.This paper presents a checklist for appraising methodological validity of BCTs and pilot-tests the checklist with ten BCTs published in leading medical journals. The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies.The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies. PMID:27238631
Construct validity and expert benchmarking of the haptic virtual reality dental simulator.

PubMed

Suebnukarn, Siriwan; Chaisombat, Monthalee; Kongpunwijit, Thanapohn; Rhienmora, Phattanapon

2014-10-01

The aim of this study was to demonstrate construct validation of the haptic virtual reality (VR) dental simulator and to define expert benchmarking criteria for skills assessment. Thirty-four self-selected participants (fourteen novices, fourteen intermediates, and six experts in endodontics) at one dental school performed ten repetitions of three mode tasks of endodontic cavity preparation: easy (mandibular premolar with one canal), medium (maxillary premolar with two canals), and hard (mandibular molar with three canals). The virtual instrument's path length was registered by the simulator. The outcomes were assessed by an expert. The error scores in easy and medium modes accurately distinguished the experts from novices and intermediates at the onset of training, when there was a significant difference between groups (ANOVA, p<0.05). The trend was consistent until trial 5. From trial 6 on, the three groups achieved similar scores. No significant difference was found between groups at the end of training. Error score analysis was not able to distinguish any group at the hard level of training. Instrument path length showed a difference in performance according to groups at the onset of training (ANOVA, p<0.05). This study established construct validity for the haptic VR dental simulator by demonstrating its discriminant capabilities between that of experts and non-experts. The experts' error scores and path length were used to define benchmarking criteria for optimal performance.
FDA Benchmark Medical Device Flow Models for CFD Validation.

PubMed

Malinauskas, Richard A; Hariharan, Prasanna; Day, Steven W; Herbertson, Luke H; Buesen, Martin; Steinseifer, Ulrich; Aycock, Kenneth I; Good, Bryan C; Deutsch, Steven; Manning, Keefe B; Craven, Brent A

Computational fluid dynamics (CFD) is increasingly being used to develop blood-contacting medical devices. However, the lack of standardized methods for validating CFD simulations and blood damage predictions limits its use in the safety evaluation of devices. Through a U.S. Food and Drug Administration (FDA) initiative, two benchmark models of typical device flow geometries (nozzle and centrifugal blood pump) were tested in multiple laboratories to provide experimental velocities, pressures, and hemolysis data to support CFD validation. In addition, computational simulations were performed by more than 20 independent groups to assess current CFD techniques. The primary goal of this article is to summarize the FDA initiative and to report recent findings from the benchmark blood pump model study. Discrepancies between CFD predicted velocities and those measured using particle image velocimetry most often occurred in regions of flow separation (e.g., downstream of the nozzle throat, and in the pump exit diffuser). For the six pump test conditions, 57% of the CFD predictions of pressure head were within one standard deviation of the mean measured values. Notably, only 37% of all CFD submissions contained hemolysis predictions. This project aided in the development of an FDA Guidance Document on factors to consider when reporting computational studies in medical device regulatory submissions. There is an accompanying podcast available for this article. Please visit the journal's Web site (www.asaiojournal.com) to listen.
Availability of Neutronics Benchmarks in the ICSBEP and IRPhEP Handbooks for Computational Tools Testing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Briggs, J. Blair; Ivanova, Tatiana

2017-02-01

In the past several decades, numerous experiments have been performed worldwide to support reactor operations, measurements, design, and nuclear safety. Those experiments represent an extensive international investment in infrastructure, expertise, and cost, representing significantly valuable resources of data supporting past, current, and future research activities. Those valuable assets represent the basis for recording, development, and validation of our nuclear methods and integral nuclear data [1]. The loss of these experimental data, which has occurred all too much in the recent years, is tragic. The high cost to repeat many of these measurements can be prohibitive, if not impossible, to surmount.more » Two international projects were developed, and are under the direction of the Organisation for Co-operation and Development Nuclear Energy Agency (OECD NEA) to address the challenges of not just data preservation, but evaluation of the data to determine its merit for modern and future use. The International Criticality Safety Benchmark Evaluation Project (ICSBEP) was established to identify and verify comprehensive critical benchmark data sets; evaluate the data, including quantification of biases and uncertainties; compile the data and calculations in a standardized format; and formally document the effort into a single source of verified benchmark data [2]. Similarly, the International Reactor Physics Experiment Evaluation Project (IRPhEP) was established to preserve integral reactor physics experimental data, including separate or special effects data for nuclear energy and technology applications [3]. Annually, contributors from around the world continue to collaborate in the evaluation and review of select benchmark experiments for preservation and dissemination. The extensively peer-reviewed integral benchmark data can then be utilized to support nuclear design and safety analysts to validate the analytical tools, methods, and data needed for next-generation reactor design, safety analysis requirements, and all other front- and back-end activities contributing to the overall nuclear fuel cycle where quality neutronics calculations are paramount.« less
Benchmark Dataset for Whole Genome Sequence Compression.

PubMed

C L, Biji; S Nair, Achuthsankar

2017-01-01

The research in DNA data compression lacks a standard dataset to test out compression tools specific to DNA. This paper argues that the current state of achievement in DNA compression is unable to be benchmarked in the absence of such scientifically compiled whole genome sequence dataset and proposes a benchmark dataset using multistage sampling procedure. Considering the genome sequence of organisms available in the National Centre for Biotechnology and Information (NCBI) as the universe, the proposed dataset selects 1,105 prokaryotes, 200 plasmids, 164 viruses, and 65 eukaryotes. This paper reports the results of using three established tools on the newly compiled dataset and show that their strength and weakness are evident only with a comparison based on the scientifically compiled benchmark dataset. The sample dataset and the respective links are available @ https://sourceforge.net/projects/benchmarkdnacompressiondataset/.

Integrated Disposal Facility FY 2016: ILAW Verification and Validation of the eSTOMP Simulator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Freedman, Vicky L.; Bacon, Diana H.; Fang, Yilin

2016-05-13

This document describes two sets of simulations carried out to further verify and validate the eSTOMP simulator. In this report, a distinction is made between verification and validation, and the focus is on verifying eSTOMP through a series of published benchmarks on cementitious wastes, and validating eSTOMP based on a lysimeter experiment for the glassified waste. These activities are carried out within the context of a scientific view of validation that asserts that models can only be invalidated, and that model validation (and verification) is a subjective assessment.
Standards and Guidelines for Numerical Models for Tsunami Hazard Mitigation

NASA Astrophysics Data System (ADS)

Titov, V.; Gonzalez, F.; Kanoglu, U.; Yalciner, A.; Synolakis, C. E.

2006-12-01

An increased number of nations around the workd need to develop tsunami mitigation plans which invariably involve inundation maps for warning guidance and evacuation planning. There is the risk that inundation maps may be produced with older or untested methodology, as there are currently no standards for modeling tools. In the aftermath of the 2004 megatsunami, some models were used to model inundation for Cascadia events with results much larger than sediment records and existing state-of-the-art studies suggest leading to confusion among emergency management. Incorrectly assessing tsunami impact is hazardous, as recent events in 2006 in Tonga, Kythira, Greece and Central Java have suggested (Synolakis and Bernard, 2006). To calculate tsunami currents, forces and runup on coastal structures, and inundation of coastlines one must calculate the evolution of the tsunami wave from the deep ocean to its target site, numerically. No matter what the numerical model, validation (the process of ensuring that the model solves the parent equations of motion accurately) and verification (the process of ensuring that the model used represents geophysical reality appropriately) both are an essential. Validation ensures that the model performs well in a wide range of circumstances and is accomplished through comparison with analytical solutions. Verification ensures that the computational code performs well over a range of geophysical problems. A few analytic solutions have been validated themselves with laboratory data. Even fewer existing numerical models have been both validated with the analytical solutions and verified with both laboratory measurements and field measurements, thus establishing a gold standard for numerical codes for inundation mapping. While there is in principle no absolute certainty that a numerical code that has performed well in all the benchmark tests will also produce correct inundation predictions with any given source motions, validated codes reduce the level of uncertainty in their results to the uncertainty in the geophysical initial conditions. Further, when coupled with real--time free--field tsunami measurements from tsunameters, validated codes are the only choice for realistic forecasting of inundation; the consequences of failure are too ghastly to take chances with numerical procedures that have not been validated. We discuss a ten step process of benchmark tests for models used for inundation mapping. The associated methodology and algorithmes have to first be validated with analytical solutions, then verified with laboratory measurements and field data. The models need to be published in the scientific literature in peer-review journals indexed by ISI. While this process may appear onerous, it reflects our state of knowledge, and is the only defensible methodology when human lives are at stake. Synolakis, C.E., and Bernard, E.N, Tsunami science before and beyond Boxing Day 2004, Phil. Trans. R. Soc. A 364 1845, 2231--2263, 2005.
Exploring the QSAR's predictive truthfulness of the novel N-tuple discrete derivative indices on benchmark datasets.

PubMed

Martínez-Santiago, O; Marrero-Ponce, Y; Vivas-Reyes, R; Rivera-Borroto, O M; Hurtado, E; Treto-Suarez, M A; Ramos, Y; Vergara-Murillo, F; Orozco-Ugarriza, M E; Martínez-López, Y

2017-05-01

Graph derivative indices (GDIs) have recently been defined over N-atoms (N = 2, 3 and 4) simultaneously, which are based on the concept of derivatives in discrete mathematics (finite difference), metaphorical to the derivative concept in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the concept of the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). These GDIs have been successfully applied in the description of physicochemical properties like reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure activity/property relationship (QSAR/QSPR) studies. Although satisfactory results have been obtained in previous modelling studies with the aforementioned indices, it is necessary to develop new, more rigorous analysis to assess the true predictive performance of the novel structure codification. So, in the present paper, an assessment and statistical validation of the performance of these novel approaches in QSAR studies are executed, as well as a comparison with those of other QSAR procedures reported in the literature. To achieve the main aim of this research, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation/validation of several QSAR methods and/or many different MDs (fundamentally 3D MDs). Three to seven variable QSAR models were built for each chemical dataset, according to the original dissection into training/test sets. The models were developed by using multiple linear regression (MLR) coupled with a genetic algorithm as the feature wrapper selection technique in the MobyDigs software. Each family of GDIs (for duplex, triplex and quadruplex) behaves similarly in all modelling, although there were some exceptions. However, when all families were used in combination, the results achieved were quantitatively higher than those reported by other authors in similar experiments. Comparisons with respect to external correlation coefficients (q 2 ext ) revealed that the models based on GDIs possess superior predictive ability in seven of the eight datasets analysed, outperforming methodologies based on similar or more complex techniques and confirming the good predictive power of the obtained models. For the q 2 ext values, the non-parametric comparison revealed significantly different results to those reported so far, which demonstrated that the models based on DIVATI's indices presented the best global performance and yielded significantly better predictions than the 12 0-3D QSAR procedures used in the comparison. Therefore, GDIs are suitable for structure codification of the molecules and constitute a good alternative to build QSARs for the prediction of physicochemical, biological and environmental endpoints.
Identifying key genes in glaucoma based on a benchmarked dataset and the gene regulatory network.

PubMed

Chen, Xi; Wang, Qiao-Ling; Zhang, Meng-Hui

2017-10-01

The current study aimed to identify key genes in glaucoma based on a benchmarked dataset and gene regulatory network (GRN). Local and global noise was added to the gene expression dataset to produce a benchmarked dataset. Differentially-expressed genes (DEGs) between patients with glaucoma and normal controls were identified utilizing the Linear Models for Microarray Data (Limma) package based on benchmarked dataset. A total of 5 GRN inference methods, including Zscore, GeneNet, context likelihood of relatedness (CLR) algorithm, Partial Correlation coefficient with Information Theory (PCIT) and GEne Network Inference with Ensemble of Trees (Genie3) were evaluated using receiver operating characteristic (ROC) and precision and recall (PR) curves. The interference method with the best performance was selected to construct the GRN. Subsequently, topological centrality (degree, closeness and betweenness) was conducted to identify key genes in the GRN of glaucoma. Finally, the key genes were validated by performing reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A total of 176 DEGs were detected from the benchmarked dataset. The ROC and PR curves of the 5 methods were analyzed and it was determined that Genie3 had a clear advantage over the other methods; thus, Genie3 was used to construct the GRN. Following topological centrality analysis, 14 key genes for glaucoma were identified, including IL6 , EPHA2 and GSTT1 and 5 of these 14 key genes were validated by RT-qPCR. Therefore, the current study identified 14 key genes in glaucoma, which may be potential biomarkers to use in the diagnosis of glaucoma and aid in identifying the molecular mechanism of this disease.
Preparation and benchmarking of ANSL-V cross sections for advanced neutron source reactor studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arwood, J.W.; Ford, W.E. III; Greene, N.M.

1987-01-01

Validity of selected data from the fine-group neutron library was satisfactorily tested in performance parameter calculations for the BAPL-1, TRX-1, and ZEEP-1 thermal lattice benchmarks. BAPL-2 is an H/sub 2/O moderated, uranium oxide lattice; TRX-1 is an H/sub 2/O moderated, 1.31 weight percent enriched uranium metal lattice; ZEEP-1 is a D/sub 2/O-moderated, natural uranium lattice. 26 refs., 1 tab.
IMAGESEER - IMAGEs for Education and Research

NASA Technical Reports Server (NTRS)

Le Moigne, Jacqueline; Grubb, Thomas; Milner, Barbara

2012-01-01

IMAGESEER is a new Web portal that brings easy access to NASA image data for non-NASA researchers, educators, and students. The IMAGESEER Web site and database are specifically designed to be utilized by the university community, to enable teaching image processing (IP) techniques on NASA data, as well as to provide reference benchmark data to validate new IP algorithms. Along with the data and a Web user interface front-end, basic knowledge of the application domains, benchmark information, and specific NASA IP challenges (or case studies) are provided.
Benchmarking infrastructure for mutation text mining

PubMed Central

2014-01-01

Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600
Benchmarking infrastructure for mutation text mining.

PubMed

Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

2014-02-25

Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
Hypersonic Experimental and Computational Capability, Improvement and Validation. Volume 2

NASA Technical Reports Server (NTRS)

Muylaert, Jean (Editor); Kumar, Ajay (Editor); Dujarric, Christian (Editor)

1998-01-01

The results of the phase 2 effort conducted under AGARD Working Group 18 on Hypersonic Experimental and Computational Capability, Improvement and Validation are presented in this report. The first volume, published in May 1996, mainly focused on the design methodology, plans and some initial results of experiments that had been conducted to serve as validation benchmarks. The current volume presents the detailed experimental and computational data base developed during this effort.
Status and plans for the ANOPP/HSR prediction system

NASA Technical Reports Server (NTRS)

Nolan, Sandra K.

1992-01-01

ANOPP is a comprehensive prediction system which was developed and validated by NASA. Because ANOPP is a system prediction program, it allows aerospace industry researchers to create trade-off studies with a variety of aircraft noise problems. The extensive validation of ANOPP allows the program results to be used as a benchmark for testing other prediction codes.
Benchmark radar targets for the validation of computational electromagnetics programs

NASA Technical Reports Server (NTRS)

Woo, Alex C.; Wang, Helen T. G.; Schuh, Michael J.; Sanders, Michael L.

1993-01-01

Results are presented of a set of computational electromagnetics validation measurements referring to three-dimensional perfectly conducting smooth targets, performed for the Electromagnetic Code Consortium. Plots are presented for both the low- and high-frequency measurements of the NASA almond, an ogive, a double ogive, a cone-sphere, and a cone-sphere with a gap.
Using GTO-Velo to Facilitate Communication and Sharing of Simulation Results in Support of the Geothermal Technologies Office Code Comparison Study

DOE Office of Scientific and Technical Information (OSTI.GOV)

White, Signe K.; Purohit, Sumit; Boyd, Lauren W.

The Geothermal Technologies Office Code Comparison Study (GTO-CCS) aims to support the DOE Geothermal Technologies Office in organizing and executing a model comparison activity. This project is directed at testing, diagnosing differences, and demonstrating modeling capabilities of a worldwide collection of numerical simulators for evaluating geothermal technologies. Teams of researchers are collaborating in this code comparison effort, and it is important to be able to share results in a forum where technical discussions can easily take place without requiring teams to travel to a common location. Pacific Northwest National Laboratory has developed an open-source, flexible framework called Velo that providesmore » a knowledge management infrastructure and tools to support modeling and simulation for a variety of types of projects in a number of scientific domains. GTO-Velo is a customized version of the Velo Framework that is being used as the collaborative tool in support of the GTO-CCS project. Velo is designed around a novel integration of a collaborative Web-based environment and a scalable enterprise Content Management System (CMS). The underlying framework provides a flexible and unstructured data storage system that allows for easy upload of files that can be in any format. Data files are organized in hierarchical folders and each folder and each file has a corresponding wiki page for metadata. The user interacts with Velo through a web browser based wiki technology, providing the benefit of familiarity and ease of use. High-level folders have been defined in GTO-Velo for the benchmark problem descriptions, descriptions of simulator/code capabilities, a project notebook, and folders for participating teams. Each team has a subfolder with write access limited only to the team members, where they can upload their simulation results. The GTO-CCS participants are charged with defining the benchmark problems for the study, and as each GTO-CCS Benchmark problem is defined, the problem creator can provide a description using a template on the metadata page corresponding to the benchmark problem folder. Project documents, references and videos of the weekly online meetings are shared via GTO-Velo. A results comparison tool allows users to plot their uploaded simulation results on the fly, along with those of other teams, to facilitate weekly discussions of the benchmark problem results being generated by the teams. GTO-Velo is an invaluable tool providing the project coordinators and team members with a framework for collaboration among geographically dispersed organizations.« less
Systematic Benchmarking of Diagnostic Technologies for an Electrical Power System

NASA Technical Reports Server (NTRS)

Kurtoglu, Tolga; Jensen, David; Poll, Scott

2009-01-01

Automated health management is a critical functionality for complex aerospace systems. A wide variety of diagnostic algorithms have been developed to address this technical challenge. Unfortunately, the lack of support to perform large-scale V&V (verification and validation) of diagnostic technologies continues to create barriers to effective development and deployment of such algorithms for aerospace vehicles. In this paper, we describe a formal framework developed for benchmarking of diagnostic technologies. The diagnosed system is the Advanced Diagnostics and Prognostics Testbed (ADAPT), a real-world electrical power system (EPS), developed and maintained at the NASA Ames Research Center. The benchmarking approach provides a systematic, empirical basis to the testing of diagnostic software and is used to provide performance assessment for different diagnostic algorithms.
Open Rotor - Analysis of Diagnostic Data

NASA Technical Reports Server (NTRS)

Envia, Edmane

2011-01-01

NASA is researching open rotor propulsion as part of its technology research and development plan for addressing the subsonic transport aircraft noise, emission and fuel burn goals. The low-speed wind tunnel test for investigating the aerodynamic and acoustic performance of a benchmark blade set at the approach and takeoff conditions has recently concluded. A high-speed wind tunnel diagnostic test campaign has begun to investigate the performance of this benchmark open rotor blade set at the cruise condition. Databases from both speed regimes will comprise a comprehensive collection of benchmark open rotor data for use in assessing/validating aerodynamic and noise prediction tools (component & system level) as well as providing insights into the physics of open rotors to help guide the development of quieter open rotors.
DE-NE0008277_PROTEUS final technical report 2018

DOE Office of Scientific and Technical Information (OSTI.GOV)

Enqvist, Andreas

This project details re-evaluations of experiments of gas-cooled fast reactor (GCFR) core designs performed in the 1970s at the PROTEUS reactor and create a series of International Reactor Physics Experiment Evaluation Project (IRPhEP) benchmarks. Currently there are no gas-cooled fast reactor (GCFR) experiments available in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook). These experiments are excellent candidates for reanalysis and development of multiple benchmarks because these experiments provide high-quality integral nuclear data relevant to the validation and refinement of thorium, neptunium, uranium, plutonium, iron, and graphite cross sections. It would be cost prohibitive to reproduce suchmore » a comprehensive suite of experimental data to support any future GCFR endeavors.« less
Benchmarking the cost efficiency of community care in Australian child and adolescent mental health services: implications for future benchmarking.

PubMed

Furber, Gareth; Brann, Peter; Skene, Clive; Allison, Stephen

2011-06-01

The purpose of this study was to benchmark the cost efficiency of community care across six child and adolescent mental health services (CAMHS) drawn from different Australian states. Organizational, contact and outcome data from the National Mental Health Benchmarking Project (NMHBP) data-sets were used to calculate cost per "treatment hour" and cost per episode for the six participating organizations. We also explored the relationship between intake severity as measured by the Health of the Nations Outcome Scales for Children and Adolescents (HoNOSCA) and cost per episode. The average cost per treatment hour was $223, with cost differences across the six services ranging from a mean of $156 to $273 per treatment hour. The average cost per episode was $3349 (median $1577) and there were significant differences in the CAMHS organizational medians ranging from $388 to $7076 per episode. HoNOSCA scores explained at best 6% of the cost variance per episode. These large cost differences indicate that community CAMHS have the potential to make substantial gains in cost efficiency through collaborative benchmarking. Benchmarking forums need considerable financial and business expertise for detailed comparison of business models for service provision.
An automated protocol for performance benchmarking a widefield fluorescence microscope.

PubMed

Halter, Michael; Bier, Elianna; DeRose, Paul C; Cooksey, Gregory A; Choquette, Steven J; Plant, Anne L; Elliott, John T

2014-11-01

Widefield fluorescence microscopy is a highly used tool for visually assessing biological samples and for quantifying cell responses. Despite its widespread use in high content analysis and other imaging applications, few published methods exist for evaluating and benchmarking the analytical performance of a microscope. Easy-to-use benchmarking methods would facilitate the use of fluorescence imaging as a quantitative analytical tool in research applications, and would aid the determination of instrumental method validation for commercial product development applications. We describe and evaluate an automated method to characterize a fluorescence imaging system's performance by benchmarking the detection threshold, saturation, and linear dynamic range to a reference material. The benchmarking procedure is demonstrated using two different materials as the reference material, uranyl-ion-doped glass and Schott 475 GG filter glass. Both are suitable candidate reference materials that are homogeneously fluorescent and highly photostable, and the Schott 475 GG filter glass is currently commercially available. In addition to benchmarking the analytical performance, we also demonstrate that the reference materials provide for accurate day to day intensity calibration. Published 2014 Wiley Periodicals Inc. Published 2014 Wiley Periodicals Inc. This article is a US government work and, as such, is in the public domain in the United States of America.
MIPS bacterial genomes functional annotation benchmark dataset.

PubMed

Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Wernen

2005-05-15

Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab
Risk adjustment as basis for rational benchmarking: the example of colon carcinoma.

PubMed

Ptok, Henry; Marusch, Frank; Schmidt, Uwe; Gastinger, Ingo; Wenisch, Hubertus J C; Lippert, Hans

2011-01-01

The results of resection of colorectal carcinoma can vary greatly from one hospital to another. However, this does not necessarily reflect differences in the quality of treatment. The purpose of this study was to compare various tools for the risk-adjusted assessment of treatment results after resection of colorectal carcinoma within the context of hospital benchmarking. On the basis of a data pool provided by a multicentric observation study of patients with colon cancer, the postoperative in-hospital mortality rates at two high-volume hospitals ("A" and "B") were compared. After univariate comparison, risk-adjusted comparison of postoperative mortality was performed by logistic regression analysis (LReA), propensity-score analysis (PScA), and the CR-POSSUM score. Postoperative complications were compared by LReA and PScA. Although postoperative mortality differed significantly (P = 0.041) in univariate comparison of hospitals A and B (2.9% vs. 6.4%), no significant difference was found by LReA or PScA. Similarly, the observed mortality at these did not differ significantly from the mortality estimated by the CR-POSSUM score (hospital A, 2.9%/4.9%, P = 0.298; hospital B, 6.4%/6.5%, P = 1.000). Significant differences were seen in risk-adjusted comparison of most postoperative complications (by both LReA and PScA), but there were no differences in the rates of relaparotomy or anastomotic leakage that required surgery. For the hard outcome variable "postoperative mortality," none of the three risk adjustment procedures showed any difference between the hospitals. The CR-POSSUM score can be regarded as the most practicable tool for risk-adjusted comparison of the outcome of colon-carcinoma resection in clinical benchmarking.
A fresh approach to stellar benchmarking

NASA Astrophysics Data System (ADS)

Beaton, Rachael

2018-06-01

An avalanche of data is about to revolutionize astronomy, but the options for validating those data have been limited. High-precision measurements from the Hubble Space Telescope enable a much-needed alternative option.

Structured Uncertainty Bound Determination From Data for Control and Performance Validation

NASA Technical Reports Server (NTRS)

Lim, Kyong B.

2003-01-01

This report attempts to document the broad scope of issues that must be satisfactorily resolved before one can expect to methodically obtain, with a reasonable confidence, a near-optimal robust closed loop performance in physical applications. These include elements of signal processing, noise identification, system identification, model validation, and uncertainty modeling. Based on a recently developed methodology involving a parameterization of all model validating uncertainty sets for a given linear fractional transformation (LFT) structure and noise allowance, a new software, Uncertainty Bound Identification (UBID) toolbox, which conveniently executes model validation tests and determine uncertainty bounds from data, has been designed and is currently available. This toolbox also serves to benchmark the current state-of-the-art in uncertainty bound determination and in turn facilitate benchmarking of robust control technology. To help clarify the methodology and use of the new software, two tutorial examples are provided. The first involves the uncertainty characterization of a flexible structure dynamics, and the second example involves a closed loop performance validation of a ducted fan based on an uncertainty bound from data. These examples, along with other simulation and experimental results, also help describe the many factors and assumptions that determine the degree of success in applying robust control theory to practical problems.
Designs of Empirical Evaluations of Nonexperimental Methods in Field Settings.

PubMed

Wong, Vivian C; Steiner, Peter M

2018-01-01

Over the last three decades, a research design has emerged to evaluate the performance of nonexperimental (NE) designs and design features in field settings. It is called the within-study comparison (WSC) approach or the design replication study. In the traditional WSC design, treatment effects from a randomized experiment are compared to those produced by an NE approach that shares the same target population. The nonexperiment may be a quasi-experimental design, such as a regression-discontinuity or an interrupted time-series design, or an observational study approach that includes matching methods, standard regression adjustments, and difference-in-differences methods. The goals of the WSC are to determine whether the nonexperiment can replicate results from a randomized experiment (which provides the causal benchmark estimate), and the contexts and conditions under which these methods work in practice. This article presents a coherent theory of the design and implementation of WSCs for evaluating NE methods. It introduces and identifies the multiple purposes of WSCs, required design components, common threats to validity, design variants, and causal estimands of interest in WSCs. It highlights two general approaches for empirical evaluations of methods in field settings, WSC designs with independent and dependent benchmark and NE arms. This article highlights advantages and disadvantages for each approach, and conditions and contexts under which each approach is optimal for addressing methodological questions.
Performance evaluation of tile-based Fisher Ratio analysis using a benchmark yeast metabolome dataset.

PubMed

Watson, Nathanial E; Parsons, Brendon A; Synovec, Robert E

2016-08-12

Performance of tile-based Fisher Ratio (F-ratio) data analysis, recently developed for discovery-based studies using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC-TOFMS), is evaluated with a metabolomics dataset that had been previously analyzed in great detail, but while taking a brute force approach. The previously analyzed data (referred to herein as the benchmark dataset) were intracellular extracts from Saccharomyces cerevisiae (yeast), either metabolizing glucose (repressed) or ethanol (derepressed), which define the two classes in the discovery-based analysis to find metabolites that are statistically different in concentration between the two classes. Beneficially, this previously analyzed dataset provides a concrete means to validate the tile-based F-ratio software. Herein, we demonstrate and validate the significant benefits of applying tile-based F-ratio analysis. The yeast metabolomics data are analyzed more rapidly in about one week versus one year for the prior studies with this dataset. Furthermore, a null distribution analysis is implemented to statistically determine an adequate F-ratio threshold, whereby the variables with F-ratio values below the threshold can be ignored as not class distinguishing, which provides the analyst with confidence when analyzing the hit table. Forty-six of the fifty-four benchmarked changing metabolites were discovered by the new methodology while consistently excluding all but one of the benchmarked nineteen false positive metabolites previously identified. Copyright © 2016 Elsevier B.V. All rights reserved.
ARABIC TRANSLATION AND ADAPTATION OF THE HOSPITAL CONSUMER ASSESSMENT OF HEALTHCARE PROVIDERS AND SYSTEMS (HCAHPS) PATIENT SATISFACTION SURVEY INSTRUMENT.

PubMed

Dockins, James; Abuzahrieh, Ramzi; Stack, Martin

2015-01-01

To translate and adapt an effective, validated, benchmarked, and widely used patient satisfaction measurement tool for use with an Arabic-speaking population. Translation of survey's items, survey administration process development, evaluation of reliability, and international benchmarking Three hundred-bed tertiary care hospital in Jeddah, Saudi Arabia. 645 patients discharged during 2011 from the hospital's inpatient care units. INTERVENTIONS; The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) instrument was translated into Arabic, a randomized weekly sample of patients was selected, and the survey was administered via telephone during 2011 to patients or their relatives. Scores were compiled for each of the HCAHPS questions and then for each of the six HCAHPS clinical composites, two non-clinical items, and two global items. Clinical composite scores, as well as the two non-clinical and two global items were analyzed for the 645 respondents. Clinical composites were analyzed using Spearman's correlation coefficient and Cronbach's alpha to demonstrate acceptable internal consistency for these items and scales demonstrated acceptable internal consistency for the clinical composites. (Spearman's correlation coefficient = 0.327 - 0.750, P < 0.01; Cronbach's alpha = 0.516 - 0.851) All ten HCAHPS measures were compared quarterly to US national averages with results that closely paralleled the US benchmarks. . The Arabic translation and adaptation of the HCAHPS is a valid, reliable, and feasible tool for evaluation and benchmarking of inpatient satisfaction in Arabic speaking populations.
New Reactor Physics Benchmark Data in the March 2012 Edition of the IRPhEP Handbook

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; Jim Gulliford

2012-11-01

The International Reactor Physics Experiment Evaluation Project (IRPhEP) was established to preserve integral reactor physics experimental data, including separate or special effects data for nuclear energy and technology applications. Numerous experiments that have been performed worldwide, represent a large investment of infrastructure, expertise, and cost, and are valuable resources of data for present and future research. These valuable assets provide the basis for recording, development, and validation of methods. If the experimental data are lost, the high cost to repeat many of these measurements may be prohibitive. The purpose of the IRPhEP is to provide an extensively peer-reviewed set ofmore » reactor physics-related integral data that can be used by reactor designers and safety analysts to validate the analytical tools used to design next-generation reactors and establish the safety basis for operation of these reactors. Contributors from around the world collaborate in the evaluation and review of selected benchmark experiments for inclusion in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook) [1]. Several new evaluations have been prepared for inclusion in the March 2012 edition of the IRPhEP Handbook.« less
Benchmarking Ligand-Based Virtual High-Throughput Screening with the PubChem Database

PubMed Central

Butkiewicz, Mariusz; Lowe, Edward W.; Mueller, Ralf; Mendenhall, Jeffrey L.; Teixeira, Pedro L.; Weaver, C. David; Meiler, Jens

2013-01-01

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is public domain through PubChem and carefully collated through confirmation screens validating active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 for a TPR cutoff of 25% are observed. PMID:23299552
Developing and Trialling an independent, scalable and repeatable IT-benchmarking procedure for healthcare organisations.

PubMed

Liebe, J D; Hübner, U

2013-01-01

Continuous improvements of IT-performance in healthcare organisations require actionable performance indicators, regularly conducted, independent measurements and meaningful and scalable reference groups. Existing IT-benchmarking initiatives have focussed on the development of reliable and valid indicators, but less on the questions about how to implement an environment for conducting easily repeatable and scalable IT-benchmarks. This study aims at developing and trialling a procedure that meets the afore-mentioned requirements. We chose a well established, regularly conducted (inter-) national IT-survey of healthcare organisations (IT-Report Healthcare) as the environment and offered the participants of the 2011 survey (CIOs of hospitals) to enter a benchmark. The 61 structural and functional performance indicators covered among others the implementation status and integration of IT-systems and functions, global user satisfaction and the resources of the IT-department. Healthcare organisations were grouped by size and ownership. The benchmark results were made available electronically and feedback on the use of these results was requested after several months. Fifty-ninehospitals participated in the benchmarking. Reference groups consisted of up to 141 members depending on the number of beds (size) and the ownership (public vs. private). A total of 122 charts showing single indicator frequency views were sent to each participant. The evaluation showed that 94.1% of the CIOs who participated in the evaluation considered this benchmarking beneficial and reported that they would enter again. Based on the feedback of the participants we developed two additional views that provide a more consolidated picture. The results demonstrate that establishing an independent, easily repeatable and scalable IT-benchmarking procedure is possible and was deemed desirable. Based on these encouraging results a new benchmarking round which includes process indicators is currently conducted.
Using Benchmarking Techniques and the 2011 Maternity Practices Infant Nutrition and Care (mPINC) Survey to Improve Performance among Peer Groups across the United States

PubMed Central

Edwards, Roger A.; Dee, Deborah; Umer, Amna; Perrine, Cria G.; Shealy, Katherine R.; Grummer-Strawn, Laurence M.

2015-01-01

Background A substantial proportion of US maternity care facilities engage in practices that are not evidence-based and that interfere with breastfeeding. The CDC Survey of Maternity Practices in Infant Nutrition and Care (mPINC) showed significant variation in maternity practices among US states. Objective The purpose of this article is to use benchmarking techniques to identify states within relevant peer groups that were top performers on mPINC survey indicators related to breastfeeding support. Methods We used 11 indicators of breastfeeding-related maternity care from the 2011 mPINC survey and benchmarking techniques to organize and compare hospital-based maternity practices across the 50 states and Washington, DC. We created peer categories for benchmarking first by region (grouping states by West, Midwest, South, and Northeast) and then by size (grouping states by the number of maternity facilities and dividing each region into approximately equal halves based on the number of facilities). Results Thirty-four states had scores high enough to serve as benchmarks, and 32 states had scores low enough to reflect the lowest score gap from the benchmark on at least 1 indicator. No state served as the benchmark on more than 5 indicators and no state was furthest from the benchmark on more than 7 indicators. The small peer group benchmarks in the South, West, and Midwest were better than the large peer group benchmarks on 91%, 82%, and 36% of the indicators, respectively. In the West large, the Midwest large, the Midwest small, and the South large peer groups, 4–6 benchmarks showed that less than 50% of hospitals have ideal practice in all states. Conclusion The evaluation presents benchmarks for peer group state comparisons that provide potential and feasible targets for improvement. PMID:24394963
Notes on numerical reliability of several statistical analysis programs

USGS Publications Warehouse

Landwehr, J.M.; Tasker, Gary D.

1999-01-01

This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
Energy Behavior Change and Army Net Zero Energy; Gaps in the Army’s Approach to Changing Energy Behavior

DTIC Science & Technology

2014-06-13

age, and design , installations set benchmarks for utility use and cost. This benchmark includes a buffer above and below the baseline. If residents...sustainability officers from each government agency (US President 2009, 6). The order requires that each federal agency designate a senior...conducting direct comparisons of pre and post intervention data (Judd et al. 2013, 15). Soldiers were the primary occupants of the three buildings with
Integrated control/structure optimization by multilevel decomposition

NASA Technical Reports Server (NTRS)

Zeiler, Thomas A.; Gilbert, Michael G.

1990-01-01

A method for integrated control/structure optimization by multilevel decomposition is presented. It is shown that several previously reported methods were actually partial decompositions wherein only the control was decomposed into a subsystem design. One of these partially decomposed problems was selected as a benchmark example for comparison. The system is fully decomposed into structural and control subsystem designs and an improved design is produced. Theory, implementation, and results for the method are presented and compared with the benchmark example.
Device Discovery in Frequency Hopping Wireless Ad Hoc Networks

DTIC Science & Technology

2004-09-01

10.1. Benchmark scatternet configuration used for outreach compar- ison. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 10.2. Average...Slave - Slave A B C DE Figure 10.1: Benchmark scatternet configuration used for outreach comparison. Additionally: • All nodes are within range of one...MSTSs ISOM mean = 6.97 MSTSs NISOM mean = 7.21 MSTSs Exponential distribution MSTSs P ic on et A -D p ac ke t g en er at io n ti m e pr ob ab il it y
An evaluative model of system performance in manned teleoperational systems

NASA Technical Reports Server (NTRS)

Haines, Richard F.

1989-01-01

Manned teleoperational systems are used in aerospace operations in which humans must interact with machines remotely. Manual guidance of remotely piloted vehicles, controling a wind tunnel, carrying out a scientific procedure remotely are examples of teleoperations. A four input parameter throughput (Tp) model is presented which can be used to evaluate complex, manned, teleoperations-based systems and make critical comparisons among candidate control systems. The first two parameters of this model deal with nominal (A) and off-nominal (B) predicted events while the last two focus on measured events of two types, human performance (C) and system performance (D). Digital simulations showed that the expression A(1-B)/C+D) produced the greatest homogeneity of variance and distribution symmetry. Results from a recently completed manned life science telescience experiment will be used to further validate the model. Complex, interacting teleoperational systems may be systematically evaluated using this expression much like a computer benchmark is used.
Crack Path Selection in Thermally Loaded Borosilicate/Steel Bibeam Specimen

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grutzik, Scott Joseph; Reedy, Jr., E. D.

Here, we have developed a novel specimen for studying crack paths in glass. Under certain conditions, the specimen reaches a state where the crack must select between multiple paths satisfying the K II = 0 condition. This path selection is a simple but challenging benchmark case for both analytical and numerical methods of predicting crack propagation. We document the development of the specimen, using an uncracked and instrumented test case to study the effect of adhesive choice and validate the accuracy of both a simple beam theory model and a finite element model. In addition, we present preliminary fracture testmore » results and provide a comparison to the path predicted by two numerical methods (mesh restructuring and XFEM). The directional stability of the crack path and differences in kink angle predicted by various crack kinking criteria is analyzed with a finite element model.« less
Crack Path Selection in Thermally Loaded Borosilicate/Steel Bibeam Specimen

DOE PAGES

Grutzik, Scott Joseph; Reedy, Jr., E. D.

2017-08-04

Here, we have developed a novel specimen for studying crack paths in glass. Under certain conditions, the specimen reaches a state where the crack must select between multiple paths satisfying the K II = 0 condition. This path selection is a simple but challenging benchmark case for both analytical and numerical methods of predicting crack propagation. We document the development of the specimen, using an uncracked and instrumented test case to study the effect of adhesive choice and validate the accuracy of both a simple beam theory model and a finite element model. In addition, we present preliminary fracture testmore » results and provide a comparison to the path predicted by two numerical methods (mesh restructuring and XFEM). The directional stability of the crack path and differences in kink angle predicted by various crack kinking criteria is analyzed with a finite element model.« less
Dissociative recombination by frame transformation to Siegert pseudostates: A comparison with a numerically solvable model

NASA Astrophysics Data System (ADS)

Hvizdoš, Dávid; Váňa, Martin; Houfek, Karel; Greene, Chris H.; Rescigno, Thomas N.; McCurdy, C. William; Čurík, Roman

2018-02-01

We present a simple two-dimensional model of the indirect dissociative recombination process. The model has one electronic and one nuclear degree of freedom and it can be solved to high precision, without making any physically motivated approximations, by employing the exterior complex scaling method together with the finite-elements method and discrete variable representation. The approach is applied to solve a model for dissociative recombination of H2 + in the singlet ungerade channels, and the results serve as a benchmark to test validity of several physical approximations commonly used in the computational modeling of dissociative recombination for real molecular targets. The second, approximate, set of calculations employs a combination of multichannel quantum defect theory and frame transformation into a basis of Siegert pseudostates. The cross sections computed with the two methods are compared in detail for collision energies from 0 to 2 eV.
Breast cancer survival in New Zealand women.

PubMed

Campbell, Ian D; Scott, Nina; Seneviratne, Sanjeewa; Kollias, James; Walters, David; Taylor, Corey; Webster, Fleur; Zorbas, Helen; Roder, David M

2015-01-01

The Quality Audit (BQA) of Breast Surgeons of Australia and New Zealand includes a broad range of data and is the largest New Zealand (NZ) breast cancer (BC) database outside the NZ Cancer Registry. We used BQA data to compare BC survival by ethnicity, deprivation, remoteness, clinical characteristic and case load. BQA and death data were linked using the National Health Index. Disease-specific survival for invasive cases was benchmarked against Australian BQA data and NZ population-based survivals. Validity was explored by comparison with expected survival by risk factor. Compared with 93% for Australian audit cases, 5-year survival was 90% for NZ audit cases overall, 87% for Maori, 84% for Pacific and 91% for other. BC survival in NZ appears lower than in Australia, with inequities by ethnicity. Differences may be due to access, timeliness and quality of health services, patient risk profiles, BQA coverage and death-record methodology. © 2014 Royal Australasian College of Surgeons.
Fully Decentralized Semi-supervised Learning via Privacy-preserving Matrix Completion.

PubMed

Fierimonte, Roberto; Scardapane, Simone; Uncini, Aurelio; Panella, Massimo

2016-08-26

Distributed learning refers to the problem of inferring a function when the training data are distributed among different nodes. While significant work has been done in the contexts of supervised and unsupervised learning, the intermediate case of Semi-supervised learning in the distributed setting has received less attention. In this paper, we propose an algorithm for this class of problems, by extending the framework of manifold regularization. The main component of the proposed algorithm consists of a fully distributed computation of the adjacency matrix of the training patterns. To this end, we propose a novel algorithm for low-rank distributed matrix completion, based on the framework of diffusion adaptation. Overall, the distributed Semi-supervised algorithm is efficient and scalable, and it can preserve privacy by the inclusion of flexible privacy-preserving mechanisms for similarity computation. The experimental results and comparison on a wide range of standard Semi-supervised benchmarks validate our proposal.
Safety and control of accelerator-driven subcritical systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rief, H.; Takahashi, H.

1995-10-01

To study control and safety of accelertor driven nuclear systems, a one point kinetic model was developed and programed. It deals with fast transients as a function of reactivity insertion. Doppler feedback, and the intensity of an external neutron source. The model allows for a simultaneous calculation of an equivalent critical reactor. It was validated by a comparison with a benchmark specified by the Nuclear Energy Agency Committee of Reactor Physics. Additional features are the possibility of inserting a linear or quadratic time dependent reactivity ramp which may account for gravity induced accidents like earthquakes, the possibility to shut downmore » the external neutron source by an exponential decay law of the form exp({minus}t/{tau}), and a graphical display of the power and reactivity changes. The calculations revealed that such boosters behave quite benignly even if they are only slightly subcritical.« less
Structural modeling for multicell composite rotor blades

NASA Technical Reports Server (NTRS)

Rehfield, Lawrence W.; Atilgan, Ali R.

1987-01-01

Composite material systems are currently good candidates for aerospace structures, primarily for the design flexibility they offer, i.e., it is possible to tailor the material and manufacturing approach to the application. A working definition of elastic or structural tailoring is the use of structural concept, fiber orientation, ply stacking sequence, and a blend of materials to achieve specific performance goals. In the design process, choices of materials and dimensions are made which produce specific response characteristics, and which permit the selected goals to be achieved. Common choices for tailoring goals are preventing instabilities or vibration resonances or enhancing damage tolerance. An essential, enabling factor in the design of tailored composite structures is structural modeling that accurately, but simply, characterizes response. The objective of this paper is to present a new multicell beam model for composite rotor blades and to validate predictions based on the new model by comparison with a finite element simulation in three benchmark static load cases.

INL Results for Phases I and III of the OECD/NEA MHTGR-350 Benchmark

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerhard Strydom; Javier Ortensi; Sonat Sen

2013-09-01

The Idaho National Laboratory (INL) Very High Temperature Reactor (VHTR) Technology Development Office (TDO) Methods Core Simulation group led the construction of the Organization for Economic Cooperation and Development (OECD) Modular High Temperature Reactor (MHTGR) 350 MW benchmark for comparing and evaluating prismatic VHTR analysis codes. The benchmark is sponsored by the OECD's Nuclear Energy Agency (NEA), and the project will yield a set of reference steady-state, transient, and lattice depletion problems that can be used by the Department of Energy (DOE), the Nuclear Regulatory Commission (NRC), and vendors to assess their code suits. The Methods group is responsible formore » defining the benchmark specifications, leading the data collection and comparison activities, and chairing the annual technical workshops. This report summarizes the latest INL results for Phase I (steady state) and Phase III (lattice depletion) of the benchmark. The INSTANT, Pronghorn and RattleSnake codes were used for the standalone core neutronics modeling of Exercise 1, and the results obtained from these codes are compared in Section 4. Exercise 2 of Phase I requires the standalone steady-state thermal fluids modeling of the MHTGR-350 design, and the results for the systems code RELAP5-3D are discussed in Section 5. The coupled neutronics and thermal fluids steady-state solution for Exercise 3 are reported in Section 6, utilizing the newly developed Parallel and Highly Innovative Simulation for INL Code System (PHISICS)/RELAP5-3D code suit. Finally, the lattice depletion models and results obtained for Phase III are compared in Section 7. The MHTGR-350 benchmark proved to be a challenging simulation set of problems to model accurately, and even with the simplifications introduced in the benchmark specification this activity is an important step in the code-to-code verification of modern prismatic VHTR codes. A final OECD/NEA comparison report will compare the Phase I and III results of all other international participants in 2014, while the remaining Phase II transient case results will be reported in 2015.« less
Benchmarking of surgical complications in gynaecological oncology: prospective multicentre study.

PubMed

Burnell, M; Iyer, R; Gentry-Maharaj, A; Nordin, A; Liston, R; Manchanda, R; Das, N; Gornall, R; Beardmore-Gray, A; Hillaby, K; Leeson, S; Linder, A; Lopes, A; Meechan, D; Mould, T; Nevin, J; Olaitan, A; Rufford, B; Shanbhag, S; Thackeray, A; Wood, N; Reynolds, K; Ryan, A; Menon, U

2016-12-01

To explore the impact of risk-adjustment on surgical complication rates (CRs) for benchmarking gynaecological oncology centres. Prospective cohort study. Ten UK accredited gynaecological oncology centres. Women undergoing major surgery on a gynaecological oncology operating list. Patient co-morbidity, surgical procedures and intra-operative (IntraOp) complications were recorded contemporaneously by surgeons for 2948 major surgical procedures. Postoperative (PostOp) complications were collected from hospitals and patients. Risk-prediction models for IntraOp and PostOp complications were created using penalised (lasso) logistic regression using over 30 potential patient/surgical risk factors. Observed and risk-adjusted IntraOp and PostOp CRs for individual hospitals were calculated. Benchmarking using colour-coded funnel plots and observed-to-expected ratios was undertaken. Overall, IntraOp CR was 4.7% (95% CI 4.0-5.6) and PostOp CR was 25.7% (95% CI 23.7-28.2). The observed CRs for all hospitals were under the upper 95% control limit for both IntraOp and PostOp funnel plots. Risk-adjustment and use of observed-to-expected ratio resulted in one hospital moving to the >95-98% CI (red) band for IntraOp CRs. Use of only hospital-reported data for PostOp CRs would have resulted in one hospital being unfairly allocated to the red band. There was little concordance between IntraOp and PostOp CRs. The funnel plots and overall IntraOp (≈5%) and PostOp (≈26%) CRs could be used for benchmarking gynaecological oncology centres. Hospital benchmarking using risk-adjusted CRs allows fairer institutional comparison. IntraOp and PostOp CRs are best assessed separately. As hospital under-reporting is common for postoperative complications, use of patient-reported outcomes is important. Risk-adjusted benchmarking of surgical complications for ten UK gynaecological oncology centres allows fairer comparison. © 2016 Royal College of Obstetricians and Gynaecologists.
Assessing Student Understanding of the "New Biology": Development and Evaluation of a Criterion-Referenced Genomics and Bioinformatics Assessment

NASA Astrophysics Data System (ADS)

Campbell, Chad Edward

Over the past decade, hundreds of studies have introduced genomics and bioinformatics (GB) curricula and laboratory activities at the undergraduate level. While these publications have facilitated the teaching and learning of cutting-edge content, there has yet to be an evaluation of these assessment tools to determine if they are meeting the quality control benchmarks set forth by the educational research community. An analysis of these assessment tools indicated that <10% referenced any quality control criteria and that none of the assessments met more than one of the quality control benchmarks. In the absence of evidence that these benchmarks had been met, it is unclear whether these assessment tools are capable of generating valid and reliable inferences about student learning. To remedy this situation the development of a robust GB assessment aligned with the quality control benchmarks was undertaken in order to ensure evidence-based evaluation of student learning outcomes. Content validity is a central piece of construct validity, and it must be used to guide instrument and item development. This study reports on: (1) the correspondence of content validity evidence gathered from independent sources; (2) the process of item development using this evidence; (3) the results from a pilot administration of the assessment; (4) the subsequent modification of the assessment based on the pilot administration results and; (5) the results from the second administration of the assessment. Twenty-nine different subtopics within GB (Appendix B: Genomics and Bioinformatics Expert Survey) were developed based on preliminary GB textbook analyses. These subtopics were analyzed using two methods designed to gather content validity evidence: (1) a survey of GB experts (n=61) and (2) a detailed content analyses of GB textbooks (n=6). By including only the subtopics that were shown to have robust support across these sources, 22 GB subtopics were established for inclusion in the assessment. An expert panel subsequently developed, evaluated, and revised two multiple-choice items to align with each of the 22 subtopics, producing a final item pool of 44 items. These items were piloted with student samples of varying content exposure levels. Both Classical Test Theory (CTT) and Item Response Theory (IRT) methodologies were used to evaluate the assessment's validity, reliability and ability inferences, and its ability to differentiate students with different magnitudes of content exposure. A total of 18 items were subsequently modified and reevaluated by an expert panel. The 26 original and 18 modified items were once again piloted with student samples of varying content exposure levels. Both CTT and IRT methodologies were once again used to evaluate student responses in order to evaluate the assessment's validity and reliability inferences as well as its ability to differentiate students with different magnitudes of content exposure. Interviews with students from different content exposure levels were also performed in order to gather convergent validity evidence (external validity evidence) as well as substantive validity evidence. Also included are the limitations of the assessment and a set of guidelines on how the assessment can best be used.
Predicting field-scale dispersion under realistic conditions with the polar Markovian velocity process model

NASA Astrophysics Data System (ADS)

Dünser, Simon; Meyer, Daniel W.

2016-06-01

In most groundwater aquifers, dispersion of tracers is dominated by flow-field inhomogeneities resulting from the underlying heterogeneous conductivity or transmissivity field. This effect is referred to as macrodispersion. Since in practice, besides a few point measurements the complete conductivity field is virtually never available, a probabilistic treatment is needed. To quantify the uncertainty in tracer concentrations from a given geostatistical model for the conductivity, Monte Carlo (MC) simulation is typically used. To avoid the excessive computational costs of MC, the polar Markovian velocity process (PMVP) model was recently introduced delivering predictions at about three orders of magnitude smaller computing times. In artificial test cases, the PMVP model has provided good results in comparison with MC. In this study, we further validate the model in a more challenging and realistic setup. The setup considered is derived from the well-known benchmark macrodispersion experiment (MADE), which is highly heterogeneous and non-stationary with a large number of unevenly scattered conductivity measurements. Validations were done against reference MC and good overall agreement was found. Moreover, simulations of a simplified setup with a single measurement were conducted in order to reassess the model's most fundamental assumptions and to provide guidance for model improvements.
Exploratory factor analysis of the Clinical Learning Environment, Supervision and Nurse Teacher Scale (CLES+T).

PubMed

Watson, Paul Barry; Seaton, Philippa; Sims, Deborah; Jamieson, Isabel; Mountier, Jane; Whittle, Rose; Saarikoski, Mikko

2014-01-01

The Clinical Learning Environment, Supervision and Nurse Teacher (CLES+T) scale measures student nurses' perceptions of clinical learning environments. This study evaluates the construct validity and internal reliability of the CLES+T in hospital settings in New Zealand. Comparisons are made between New Zealand and Finnish data. The CLES+T scale was completed by 416 Bachelor of Nursing students following hospital clinical placements between October 2008 and December 2009. Construct validity and internal reliability were assessed using exploratory factor analysis and Cronbach's alpha. Exploratory factor analysis supports 4 factors. Cronbach's alpha ranged from .82 to .93. All items except 1 loaded on the same factors found in unpublished Finnish data. The first factor combined 2 previous components from the published Finnish component analysis and was renamed: connecting with, and learning in, communities of clinical practice. The remaining 3 factors (Nurse teacher, Supervisory relationship, and Leadership style of the manager) corresponded to previous components and their conceptualizations. The CLES+T has good internal reliability and a consistent factor structure across samples. The consistency across international samples supports faculties and hospitals using the CLES+T to benchmark the quality of clinical learning environments provided to students.
Small drinking water systems under spatiotemporal water quality variability: a risk-based performance benchmarking framework.

PubMed

Bereskie, Ty; Haider, Husnain; Rodriguez, Manuel J; Sadiq, Rehan

2017-08-23

Traditional approaches for benchmarking drinking water systems are binary, based solely on the compliance and/or non-compliance of one or more water quality performance indicators against defined regulatory guidelines/standards. The consequence of water quality failure is dependent on location within a water supply system as well as time of the year (i.e., season) with varying levels of water consumption. Conventional approaches used for water quality comparison purposes fail to incorporate spatiotemporal variability and degrees of compliance and/or non-compliance. This can lead to misleading or inaccurate performance assessment data used in the performance benchmarking process. In this research, a hierarchical risk-based water quality performance benchmarking framework is proposed to evaluate small drinking water systems (SDWSs) through cross-comparison amongst similar systems. The proposed framework (R WQI framework) is designed to quantify consequence associated with seasonal and location-specific water quality issues in a given drinking water supply system to facilitate more efficient decision-making for SDWSs striving for continuous performance improvement. Fuzzy rule-based modelling is used to address imprecision associated with measuring performance based on singular water quality guidelines/standards and the uncertainties present in SDWS operations and monitoring. This proposed R WQI framework has been demonstrated using data collected from 16 SDWSs in Newfoundland and Labrador and Quebec, Canada, and compared to the Canadian Council of Ministers of the Environment WQI, a traditional, guidelines/standard-based approach. The study found that the R WQI framework provides an in-depth state of water quality and benchmarks SDWSs more rationally based on the frequency of occurrence and consequence of failure events.
[Benchmark experiment to verify radiation transport calculations for dosimetry in radiation therapy].

PubMed

Renner, Franziska

2016-09-01

Monte Carlo simulations are regarded as the most accurate method of solving complex problems in the field of dosimetry and radiation transport. In (external) radiation therapy they are increasingly used for the calculation of dose distributions during treatment planning. In comparison to other algorithms for the calculation of dose distributions, Monte Carlo methods have the capability of improving the accuracy of dose calculations - especially under complex circumstances (e.g. consideration of inhomogeneities). However, there is a lack of knowledge of how accurate the results of Monte Carlo calculations are on an absolute basis. A practical verification of the calculations can be performed by direct comparison with the results of a benchmark experiment. This work presents such a benchmark experiment and compares its results (with detailed consideration of measurement uncertainty) with the results of Monte Carlo calculations using the well-established Monte Carlo code EGSnrc. The experiment was designed to have parallels to external beam radiation therapy with respect to the type and energy of the radiation, the materials used and the kind of dose measurement. Because the properties of the beam have to be well known in order to compare the results of the experiment and the simulation on an absolute basis, the benchmark experiment was performed using the research electron accelerator of the Physikalisch-Technische Bundesanstalt (PTB), whose beam was accurately characterized in advance. The benchmark experiment and the corresponding Monte Carlo simulations were carried out for two different types of ionization chambers and the results were compared. Considering the uncertainty, which is about 0.7 % for the experimental values and about 1.0 % for the Monte Carlo simulation, the results of the simulation and the experiment coincide. Copyright © 2015. Published by Elsevier GmbH.
International health IT benchmarking: learning from cross-country comparisons.

PubMed

Zelmer, Jennifer; Ronchi, Elettra; Hyppönen, Hannele; Lupiáñez-Villanueva, Francisco; Codagnone, Cristiano; Nøhr, Christian; Huebner, Ursula; Fazzalari, Anne; Adler-Milstein, Julia

2017-03-01

To pilot benchmark measures of health information and communication technology (ICT) availability and use to facilitate cross-country learning. A prior Organization for Economic Cooperation and Development-led effort involving 30 countries selected and defined functionality-based measures for availability and use of electronic health records, health information exchange, personal health records, and telehealth. In this pilot, an Organization for Economic Cooperation and Development Working Group compiled results for 38 countries for a subset of measures with broad coverage using new and/or adapted country-specific or multinational surveys and other sources from 2012 to 2015. We also synthesized country learnings to inform future benchmarking. While electronic records are widely used to store and manage patient information at the point of care-all but 2 pilot countries reported use by at least half of primary care physicians; many had rates above 75%-patient information exchange across organizations/settings is less common. Large variations in the availability and use of telehealth and personal health records also exist. Pilot participation demonstrated interest in cross-national benchmarking. Using the most comparable measures available to date, it showed substantial diversity in health ICT availability and use in all domains. The project also identified methodological considerations (e.g., structural and health systems issues that can affect measurement) important for future comparisons. While health policies and priorities differ, many nations aim to increase access, quality, and/or efficiency of care through effective ICT use. By identifying variations and describing key contextual factors, benchmarking offers the potential to facilitate cross-national learning and accelerate the progress of individual countries. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Better Medicare Cost Report data are needed to help hospitals benchmark costs and performance.

PubMed

Magnus, S A; Smith, D G

2000-01-01

To evaluate costs and achieve cost control in the face of new technology and demands for efficiency from both managed care and governmental payers, hospitals need to benchmark their costs against those of other comparable hospitals. Since they typically use Medicare Cost Report (MCR) data for this purpose, a variety of cost accounting problems with the MCR may hamper hospitals' understanding of their relative costs and performance. Managers and researchers alike need to investigate the validity, accuracy, and timeliness of the MCR's cost accounting data.
Encoding color information for visual tracking: Algorithms and benchmark.

PubMed

Liang, Pengpeng; Blasch, Erik; Ling, Haibin

2015-12-01

While color information is known to provide rich discriminative clues for visual inference, most modern visual trackers limit themselves to the grayscale realm. Despite recent efforts to integrate color in tracking, there is a lack of comprehensive understanding of the role color information can play. In this paper, we attack this problem by conducting a systematic study from both the algorithm and benchmark perspectives. On the algorithm side, we comprehensively encode 10 chromatic models into 16 carefully selected state-of-the-art visual trackers. On the benchmark side, we compile a large set of 128 color sequences with ground truth and challenge factor annotations (e.g., occlusion). A thorough evaluation is conducted by running all the color-encoded trackers, together with two recently proposed color trackers. A further validation is conducted on an RGBD tracking benchmark. The results clearly show the benefit of encoding color information for tracking. We also perform detailed analysis on several issues, including the behavior of various combinations between color model and visual tracker, the degree of difficulty of each sequence for tracking, and how different challenge factors affect the tracking performance. We expect the study to provide the guidance, motivation, and benchmark for future work on encoding color in visual tracking.
PFLOTRAN-RepoTREND Source Term Comparison Summary.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Frederick, Jennifer M.

Code inter-comparison studies are useful exercises to verify and benchmark independently developed software to ensure proper function, especially when the software is used to model high-consequence systems which cannot be physically tested in a fully representative environment. This summary describes the results of the first portion of the code inter-comparison between PFLOTRAN and RepoTREND, which compares the radionuclide source term used in a typical performance assessment.
Benchmark gamma-ray skyshine experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nason, R.R.; Shultis, J.K.; Faw, R.E.

1982-01-01

A benchmark gamma-ray skyshine experiment is descibed in which /sup 60/Co sources were either collimated into an upward 150-deg conical beam or shielded vertically by two different thicknesses of concrete. A NaI(Tl) spectrometer and a high pressure ion chamber were used to measure, respectively, the energy spectrum and the 4..pi..-exposure rate of the air-reflected gamma photons up to 700 m from the source. Analyses of the data and comparison to DOT discrete ordinates calculations are presented.
Integrated control/structure optimization by multilevel decomposition

NASA Technical Reports Server (NTRS)

Zeiler, Thomas A.; Gilbert, Michael G.

1990-01-01

A method for integrated control/structure optimization by multilevel decomposition is presented. It is shown that several previously reported methods were actually partial decompositions wherein only the control was decomposed into a subsystem design. One of these partially decomposed problems was selected as a benchmark example for comparison. The present paper fully decomposes the system into structural and control subsystem designs and produces an improved design. Theory, implementation, and results for the method are presented and compared with the benchmark example.
Validation of a Low-Thrust Mission Design Tool Using Operational Navigation Software

NASA Technical Reports Server (NTRS)

Englander, Jacob A.; Knittel, Jeremy M.; Williams, Ken; Stanbridge, Dale; Ellison, Donald H.

2017-01-01

Design of flight trajectories for missions employing solar electric propulsion requires a suitably high-fidelity design tool. In this work, the Evolutionary Mission Trajectory Generator (EMTG) is presented as a medium-high fidelity design tool that is suitable for mission proposals. EMTG is validated against the high-heritage deep-space navigation tool MIRAGE, demonstrating both the accuracy of EMTG's model and an operational mission design and navigation procedure using both tools. The validation is performed using a benchmark mission to the Jupiter Trojans.
New NAS Parallel Benchmarks Results

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

1997-01-01

NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.
Toxicological Benchmarks for Screening of Potential Contaminants of Concern for Effects on Aquatic Biota on the Oak Ridge Reservation, Oak Ridge, Tennessee

DOE Office of Scientific and Technical Information (OSTI.GOV)

Suter, G.W., II

1993-01-01

One of the initial stages in ecological risk assessment of hazardous waste sites is the screening of contaminants to determine which, if any, of them are worthy of further consideration; this process is termed contaminant screening. Screening is performed by comparing concentrations in ambient media to benchmark concentrations that are either indicative of a high likelihood of significant effects (upper screening benchmarks) or of a very low likelihood of significant effects (lower screening benchmarks). Exceedance of an upper screening benchmark indicates that the chemical in question is clearly of concern and remedial actions are likely to be needed. Exceedance ofmore » a lower screening benchmark indicates that a contaminant is of concern unless other information indicates that the data are unreliable or the comparison is inappropriate. Chemicals with concentrations below the lower benchmark are not of concern if the ambient data are judged to be adequate. This report presents potential screening benchmarks for protection of aquatic life from contaminants in water. Because there is no guidance for screening benchmarks, a set of alternative benchmarks is presented herein. The alternative benchmarks are based on different conceptual approaches to estimating concentrations causing significant effects. For the upper screening benchmark, there are the acute National Ambient Water Quality Criteria (NAWQC) and the Secondary Acute Values (SAV). The SAV concentrations are values estimated with 80% confidence not to exceed the unknown acute NAWQC for those chemicals with no NAWQC. The alternative chronic benchmarks are the chronic NAWQC, the Secondary Chronic Value (SCV), the lowest chronic values for fish and daphnids, the lowest EC20 for fish and daphnids from chronic toxicity tests, the estimated EC20 for a sensitive species, and the concentration estimated to cause a 20% reduction in the recruit abundance of largemouth bass. It is recommended that ambient chemical concentrations be compared to all of these benchmarks. If NAWQC are exceeded, the chemicals must be contaminants of concern because the NAWQC are applicable or relevant and appropriate requirements (ARARs). If NAWQC are not exceeded, but other benchmarks are, contaminants should be selected on the basis of the number of benchmarks exceeded and the conservatism of the particular benchmark values, as discussed in the text. To the extent that toxicity data are available, this report presents the alternative benchmarks for chemicals that have been detected on the Oak Ridge Reservation. It also presents the data used to calculate the benchmarks and the sources of the data. It compares the benchmarks and discusses their relative conservatism and utility. This report supersedes a prior aquatic benchmarks report (Suter and Mabrey 1994). It adds two new types of benchmarks. It also updates the benchmark values where appropriate, adds some new benchmark values, replaces secondary sources with primary sources, and provides more complete documentation of the sources and derivation of all values.« less
Large scale study of multiple-molecule queries

PubMed Central

2009-01-01

Background In ligand-based screening, as well as in other chemoinformatics applications, one seeks to effectively search large repositories of molecules in order to retrieve molecules that are similar typically to a single molecule lead. However, in some case, multiple molecules from the same family are available to seed the query and search for other members of the same family. Multiple-molecule query methods have been less studied than single-molecule query methods. Furthermore, the previous studies have relied on proprietary data and sometimes have not used proper cross-validation methods to assess the results. In contrast, here we develop and compare multiple-molecule query methods using several large publicly available data sets and background. We also create a framework based on a strict cross-validation protocol to allow unbiased benchmarking for direct comparison in future studies across several performance metrics. Results Fourteen different multiple-molecule query methods were defined and benchmarked using: (1) 41 publicly available data sets of related molecules with similar biological activity; and (2) publicly available background data sets consisting of up to 175,000 molecules randomly extracted from the ChemDB database and other sources. Eight of the fourteen methods were parameter free, and six of them fit one or two free parameters to the data using a careful cross-validation protocol. All the methods were assessed and compared for their ability to retrieve members of the same family against the background data set by using several performance metrics including the Area Under the Accumulation Curve (AUAC), Area Under the Curve (AUC), F1-measure, and BEDROC metrics. Consistent with the previous literature, the best parameter-free methods are the MAX-SIM and MIN-RANK methods, which score a molecule to a family by the maximum similarity, or minimum ranking, obtained across the family. One new parameterized method introduced in this study and two previously defined methods, the Exponential Tanimoto Discriminant (ETD), the Tanimoto Power Discriminant (TPD), and the Binary Kernel Discriminant (BKD), outperform most other methods but are more complex, requiring one or two parameters to be fit to the data. Conclusion Fourteen methods for multiple-molecule querying of chemical databases, including novel methods, (ETD) and (TPD), are validated using publicly available data sets, standard cross-validation protocols, and established metrics. The best results are obtained with ETD, TPD, BKD, MAX-SIM, and MIN-RANK. These results can be replicated and compared with the results of future studies using data freely downloadable from http://cdb.ics.uci.edu/. PMID:20298525
Mean Length of Utterance in Children with Specific Language Impairment and in Younger Control Children Shows Concurrent Validity and Stable and Parallel Growth Trajectories

ERIC Educational Resources Information Center

Rice, Mabel L.; Redmond, Sean M.; Hoffman, Lesa

2006-01-01

Purpose: Although mean length of utterance (MLU) is a useful benchmark in studies of children with specific language impairment (SLI), some empirical and interpretive issues are unresolved. The authors report on 2 studies examining, respectively, the concurrent validity and temporal stability of MLU equivalency between children with SLI and…
Length of stay benchmarks for inpatient rehabilitation after stroke.

PubMed

Meyer, Matthew; Britt, Eileen; McHale, Heather A; Teasell, Robert

2012-01-01

In Canada, no standardized benchmarks for length of stay (LOS) have been established for post-stroke inpatient rehabilitation. This paper describes the development of a severity specific median length of stay benchmarking strategy, assessment of its impact after one year of implementation in a Canadian rehabilitation hospital, and establishment of updated benchmarks that may be useful for comparison with other facilities across Canada. Patient data were retrospectively assessed for all patients admitted to a single post-acute stroke rehabilitation unit in Ontario, Canada between April 2005 and March 2008. Rehabilitation Patient Groups (RPGs) were used to establish stratified median length of stay benchmarks for each group that were incorporated into team rounds beginning in October 2009. Benchmark impact was assessed using mean LOS, FIM(®) gain, and discharge destination for each RPG group, collected prospectively for one year, compared against similar information from the previous calendar year. Benchmarks were then adjusted accordingly for future use. Between October 2009 and September 2010, a significant reduction in average LOS was noted compared to the previous year (35.3 vs. 41.2 days; p < 0.05). Reductions in LOS were noted in each RPG group including statistically significant reductions in 4 of the 7 groups. As intended, reductions in LOS were achieved with no significant reduction in mean FIM(®) gain or proportion of patients discharged home compared to the previous year. Adjusted benchmarks for LOS ranged from 13 to 48 days depending on the RPG group. After a single year of implementation, severity specific benchmarks helped the rehabilitation team reduce LOS while maintaining the same levels of functional gain and achieving the same rate of discharge to the community. © 2012 Informa UK, Ltd.
Harmonizing lipidomics: NIST interlaboratory comparison exercise for lipidomics using SRM 1950-Metabolites in Frozen Human Plasma.

PubMed

Bowden, John A; Heckert, Alan; Ulmer, Candice Z; Jones, Christina M; Koelmel, Jeremy P; Abdullah, Laila; Ahonen, Linda; Alnouti, Yazen; Armando, Aaron M; Asara, John M; Bamba, Takeshi; Barr, John R; Bergquist, Jonas; Borchers, Christoph H; Brandsma, Joost; Breitkopf, Susanne B; Cajka, Tomas; Cazenave-Gassiot, Amaury; Checa, Antonio; Cinel, Michelle A; Colas, Romain A; Cremers, Serge; Dennis, Edward A; Evans, James E; Fauland, Alexander; Fiehn, Oliver; Gardner, Michael S; Garrett, Timothy J; Gotlinger, Katherine H; Han, Jun; Huang, Yingying; Neo, Aveline Huipeng; Hyötyläinen, Tuulia; Izumi, Yoshihiro; Jiang, Hongfeng; Jiang, Houli; Jiang, Jiang; Kachman, Maureen; Kiyonami, Reiko; Klavins, Kristaps; Klose, Christian; Köfeler, Harald C; Kolmert, Johan; Koal, Therese; Koster, Grielof; Kuklenyik, Zsuzsanna; Kurland, Irwin J; Leadley, Michael; Lin, Karen; Maddipati, Krishna Rao; McDougall, Danielle; Meikle, Peter J; Mellett, Natalie A; Monnin, Cian; Moseley, M Arthur; Nandakumar, Renu; Oresic, Matej; Patterson, Rainey; Peake, David; Pierce, Jason S; Post, Martin; Postle, Anthony D; Pugh, Rebecca; Qiu, Yunping; Quehenberger, Oswald; Ramrup, Parsram; Rees, Jon; Rembiesa, Barbara; Reynaud, Denis; Roth, Mary R; Sales, Susanne; Schuhmann, Kai; Schwartzman, Michal Laniado; Serhan, Charles N; Shevchenko, Andrej; Somerville, Stephen E; St John-Williams, Lisa; Surma, Michal A; Takeda, Hiroaki; Thakare, Rhishikesh; Thompson, J Will; Torta, Federico; Triebl, Alexander; Trötzmüller, Martin; Ubhayasekera, S J Kumari; Vuckovic, Dajana; Weir, Jacquelyn M; Welti, Ruth; Wenk, Markus R; Wheelock, Craig E; Yao, Libin; Yuan, Min; Zhao, Xueqing Heather; Zhou, Senlin

2017-12-01

As the lipidomics field continues to advance, self-evaluation within the community is critical. Here, we performed an interlaboratory comparison exercise for lipidomics using Standard Reference Material (SRM) 1950-Metabolites in Frozen Human Plasma, a commercially available reference material. The interlaboratory study comprised 31 diverse laboratories, with each laboratory using a different lipidomics workflow. A total of 1,527 unique lipids were measured across all laboratories and consensus location estimates and associated uncertainties were determined for 339 of these lipids measured at the sum composition level by five or more participating laboratories. These evaluated lipids detected in SRM 1950 serve as community-wide benchmarks for intra- and interlaboratory quality control and method validation. These analyses were performed using nonstandardized laboratory-independent workflows. The consensus locations were also compared with a previous examination of SRM 1950 by the LIPID MAPS consortium. While the central theme of the interlaboratory study was to provide values to help harmonize lipids, lipid mediators, and precursor measurements across the community, it was also initiated to stimulate a discussion regarding areas in need of improvement. Copyright © 2017 by the American Society for Biochemistry and Molecular Biology, Inc.

Measurement and validation of benchmark-quality thick-target tungsten X-ray spectra below 150 kVp.

PubMed

Mercier, J R; Kopp, D T; McDavid, W D; Dove, S B; Lancaster, J L; Tucker, D M

2000-11-01

Pulse-height distributions of two constant potential X-ray tubes with fixed anode tungsten targets were measured and unfolded. The measurements employed quantitative alignment of the beam, the use of two different semiconductor detectors (high-purity germanium and cadmium-zinc-telluride), two different ion chamber systems with beam-specific calibration factors, and various filter and tube potential combinations. Monte Carlo response matrices were generated for each detector for unfolding the pulse-height distributions into spectra incident on the detectors. These response matrices were validated for the low error bars assigned to the data. A significant aspect of the validation of spectra, and a detailed characterization of the X-ray tubes, involved measuring filtered and unfiltered beams at multiple tube potentials (30-150 kVp). Full corrections to ion chamber readings were employed to convert normalized fluence spectra into absolute fluence spectra. The characterization of fixed anode pitting and its dominance over exit window plating and/or detector dead layer was determined. An Appendix of tabulated benchmark spectra with assigned error ranges was developed for future reference.
Summary of BISON Development and Validation Activities - NEAMS FY16 Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williamson, R. L.; Pastore, G.; Gamble, K. A.

This summary report contains an overview of work performed under the work package en- titled “FY2016 NEAMS INL-Engineering Scale Fuel Performance (BISON)” A first chapter identifies the specific FY-16 milestones, providing a basic description of the associated work and references to related detailed documentation. Where applicable, a representative technical result is provided. A second chapter summarizes major additional accomplishments, which in- clude: 1) publication of a journal article on solution verification and validation of BISON for LWR fuel, 2) publication of a journal article on 3D Missing Pellet Surface (MPS) analysis of BWR fuel, 3) use of BISON to designmore » a unique 3D MPS validation experiment for future in- stallation in the Halden research reactor, 4) participation in an OECD benchmark on Pellet Clad Mechanical Interaction (PCMI), 5) participation in an OECD benchmark on Reactivity Insertion Accident (RIA) analysis, 6) participation in an OECD activity on uncertainity quantification and sensitivity analysis in nuclear fuel modeling and 7) major improvements to BISON’s fission gas behavior models. A final chapter outlines FY-17 future work.« less
Validation of optimization strategies using the linear structured production chains

NASA Astrophysics Data System (ADS)

Kusiak, Jan; Morkisz, Paweł; Oprocha, Piotr; Pietrucha, Wojciech; Sztangret, Łukasz

2017-06-01

Different optimization strategies applied to sequence of several stages of production chains were validated in this paper. Two benchmark problems described by ordinary differential equations (ODEs) were considered. A water tank and a passive CR-RC filter were used as the exemplary objects described by the first and the second order differential equations, respectively. Considered in the work optimization problems serve as the validators of strategies elaborated by the Authors. However, the main goal of research is selection of the best strategy for optimization of two real metallurgical processes which will be investigated in an on-going projects. The first problem will be the oxidizing roasting process of zinc sulphide concentrate where the sulphur from the input concentrate should be eliminated and the minimal concentration of sulphide sulphur in the roasted products has to be achieved. Second problem will be the lead refining process consisting of three stages: roasting to the oxide, oxide reduction to metal and the oxidizing refining. Strategies, which appear the most effective in considered benchmark problems will be candidates for optimization of the mentioned above industrial processes.
Benchmark Evaluation of Dounreay Prototype Fast Reactor Minor Actinide Depletion Measurements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hess, J. D.; Gauld, I. C.; Gulliford, J.

2017-01-01

Historic measurements of actinide samples in the Dounreay Prototype Fast Reactor (PFR) are of interest for modern nuclear data and simulation validation. Samples of various higher-actinide isotopes were irradiated for 492 effective full-power days and radiochemically assayed at Oak Ridge National Laboratory (ORNL) and Japan Atomic Energy Research Institute (JAERI). Limited data were available regarding the PFR irradiation; a six-group neutron spectra was available with some power history data to support a burnup depletion analysis validation study. Under the guidance of the Organisation for Economic Co-Operation and Development Nuclear Energy Agency (OECD NEA), the International Reactor Physics Experiment Evaluation Projectmore » (IRPhEP) and Spent Fuel Isotopic Composition (SFCOMPO) Project are collaborating to recover all measurement data pertaining to these measurements, including collaboration with the United Kingdom to obtain pertinent reactor physics design and operational history data. These activities will produce internationally peer-reviewed benchmark data to support validation of minor actinide cross section data and modern neutronic simulation of fast reactors with accompanying fuel cycle activities such as transportation, recycling, storage, and criticality safety.« less
A preliminary comparison of hydrodynamic approaches for flood inundation modeling of urban areas in Jakarta Ciliwung river basin

NASA Astrophysics Data System (ADS)

Rojali, Aditia; Budiaji, Abdul Somat; Pribadi, Yudhistira Satya; Fatria, Dita; Hadi, Tri Wahyu

2017-07-01

This paper addresses on the numerical modeling approaches for flood inundation in urban areas. Decisive strategy to choose between 1D, 2D or even a hybrid 1D-2D model is more than important to optimize flood inundation analyses. To find cost effective yet robust and accurate model has been our priority and motivation in the absence of available High Performance Computing facilities. The application of 1D, 1D/2D and full 2D modeling approach to river flood study in Jakarta Ciliwung river basin, and a comparison of approaches benchmarked for the inundation study are presented. This study demonstrate the successful use of 1D/2D and 2D system to model Jakarta Ciliwung river basin in terms of inundation results and computational aspect. The findings of the study provide an interesting comparison between modeling approaches, HEC-RAS 1D, 1D-2D, 2D, and ANUGA when benchmarked to the Manggarai water level measurement.
Benchmarking for Bayesian Reinforcement Learning

PubMed Central

Ernst, Damien; Couëtoux, Adrien

2016-01-01

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed. PMID:27304891
Benchmarking for Bayesian Reinforcement Learning.

PubMed

Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael

2016-01-01

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment while using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.
Performance evaluation of firefly algorithm with variation in sorting for non-linear benchmark problems

NASA Astrophysics Data System (ADS)

Umbarkar, A. J.; Balande, U. T.; Seth, P. D.

2017-06-01

The field of nature inspired computing and optimization techniques have evolved to solve difficult optimization problems in diverse fields of engineering, science and technology. The firefly attraction process is mimicked in the algorithm for solving optimization problems. In Firefly Algorithm (FA) sorting of fireflies is done by using sorting algorithm. The original FA is proposed with bubble sort for ranking the fireflies. In this paper, the quick sort replaces bubble sort to decrease the time complexity of FA. The dataset used is unconstrained benchmark functions from CEC 2005 [22]. The comparison of FA using bubble sort and FA using quick sort is performed with respect to best, worst, mean, standard deviation, number of comparisons and execution time. The experimental result shows that FA using quick sort requires less number of comparisons but requires more execution time. The increased number of fireflies helps to converge into optimal solution whereas by varying dimension for algorithm performed better at a lower dimension than higher dimension.
A Simplified Approach for the Rapid Generation of Transient Heat-Shield Environments

NASA Technical Reports Server (NTRS)

Wurster, Kathryn E.; Zoby, E. Vincent; Mills, Janelle C.; Kamhawi, Hilmi

2007-01-01

A simplified approach has been developed whereby transient entry heating environments are reliably predicted based upon a limited set of benchmark radiative and convective solutions. Heating, pressure and shear-stress levels, non-dimensionalized by an appropriate parameter at each benchmark condition are applied throughout the entry profile. This approach was shown to be valid based on the observation that the fully catalytic, laminar distributions examined were relatively insensitive to altitude as well as velocity throughout the regime of significant heating. In order to establish a best prediction by which to judge the results that can be obtained using a very limited benchmark set, predictions based on a series of benchmark cases along a trajectory are used. Solutions which rely only on the limited benchmark set, ideally in the neighborhood of peak heating, are compared against the resultant transient heating rates and total heat loads from the best prediction. Predictions based on using two or fewer benchmark cases at or near the trajectory peak heating condition, yielded results to within 5-10 percent of the best predictions. Thus, the method provides transient heating environments over the heat-shield face with sufficient resolution and accuracy for thermal protection system design and also offers a significant capability to perform rapid trade studies such as the effect of different trajectories, atmospheres, or trim angle of attack, on convective and radiative heating rates and loads, pressure, and shear-stress levels.
Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation

PubMed Central

Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B.

2016-01-01

Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing are now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarksand that uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations. With this dataset we hope to (1) promote meaningful comparison between algorithms in the field of neural computation, (2) allow comparison with conventional image recognition methods, (3) provide an assessment of the state of the art in spike-based visual recognition, and (4) help researchers identify future directions and advance the field. PMID:27853419
Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation.

PubMed

Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B

2016-01-01

Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing are now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarksand that uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations. With this dataset we hope to (1) promote meaningful comparison between algorithms in the field of neural computation, (2) allow comparison with conventional image recognition methods, (3) provide an assessment of the state of the art in spike-based visual recognition, and (4) help researchers identify future directions and advance the field.
Visual Inspection Research Project Report on Benchmark Inspections

DOT National Transportation Integrated Search

1996-10-01

Word document. Recognizing the importance of visual inspection in the maintenance of the civil air fleet, the FAA tasked the Aging Aircraft Nondestructive Inspection Validation Center (AANC) at Sandia National Labs in Albuquerque, NM, to establish a ...
[Benchmarking using different measurement instruments and the management of measurement variability].

PubMed

Blankers, M; Barendregt, M; Dekker, J J M

2016-01-01

In mental health care centres in the Netherlands outcome data are collected using a variety of outcome instruments. This may have implications for the comparability of outcome results between different centres. To discuss recent findings regarding the extent to which the eight instruments currently used in clinical practice report comparable results. Our study is based on a combination of literature review and empirical research. The results obtained with the eight instruments are not equivalent. Patients symptom reductions appear larger with some instruments than with others. The current practice of benchmarking in the Dutch mental health system would have greater validity if the number of different instruments would be reduced. State-of-the-art calibration studies are necessary to validate the comparability of the remaining instruments. Ideally, all mental health centres will soon use one instrument per care domain to measure treatment outcome.
A benchmark study of the sea-level equation in GIA modelling

NASA Astrophysics Data System (ADS)

Martinec, Zdenek; Klemann, Volker; van der Wal, Wouter; Riva, Riccardo; Spada, Giorgio; Simon, Karen; Blank, Bas; Sun, Yu; Melini, Daniele; James, Tom; Bradley, Sarah

2017-04-01

The sea-level load in glacial isostatic adjustment (GIA) is described by the so called sea-level equation (SLE), which represents the mass redistribution between ice sheets and oceans on a deforming earth. Various levels of complexity of SLE have been proposed in the past, ranging from a simple mean global sea level (the so-called eustatic sea level) to the load with a deforming ocean bottom, migrating coastlines and a changing shape of the geoid. Several approaches to solve the SLE have been derived, from purely analytical formulations to fully numerical methods. Despite various teams independently investigating GIA, there has been no systematic intercomparison amongst the solvers through which the methods may be validated. The goal of this paper is to present a series of benchmark experiments designed for testing and comparing numerical implementations of the SLE. Our approach starts with simple load cases even though the benchmark will not result in GIA predictions for a realistic loading scenario. In the longer term we aim for a benchmark with a realistic loading scenario, and also for benchmark solutions with rotational feedback. The current benchmark uses an earth model for which Love numbers have been computed and benchmarked in Spada et al (2011). In spite of the significant differences in the numerical methods employed, the test computations performed so far show a satisfactory agreement between the results provided by the participants. The differences found can often be attributed to the different approximations inherent to the various algorithms. Literature G. Spada, V. R. Barletta, V. Klemann, R. E. M. Riva, Z. Martinec, P. Gasperini, B. Lund, D. Wolf, L. L. A. Vermeersen, and M. A. King, 2011. A benchmark study for glacial isostatic adjustment codes. Geophys. J. Int. 185: 106-132 doi:10.1111/j.1365-
Comparison of Vocal Vibration-Dose Measures for Potential-Damage Risk Criteria

ERIC Educational Resources Information Center

Titze, Ingo R.; Hunter, Eric J.

2015-01-01

Purpose: School-teachers have become a benchmark population for the study of occupational voice use. A decade of vibration-dose studies on the teacher population allows a comparison to be made between specific dose measures for eventual assessment of damage risk. Method: Vibration dosimetry is reformulated with the inclusion of collision stress.…
The Geoid Slope Validation Survey 2014 and GRAV-D airborne gravity enhanced geoid comparison results in Iowa

NASA Astrophysics Data System (ADS)

Wang, Y. M.; Becker, C.; Mader, G.; Martin, D.; Li, X.; Jiang, T.; Breidenbach, S.; Geoghegan, C.; Winester, D.; Guillaume, S.; Bürki, B.

2017-10-01

Three Geoid Slope Validation Surveys were planned by the National Geodetic Survey for validating geoid improvement gained by incorporating airborne gravity data collected by the "Gravity for the Redefinition of the American Vertical Datum" (GRAV-D) project in flat, medium and rough topographic areas, respectively. The first survey GSVS11 over a flat topographic area in Texas confirmed that a 1-cm differential accuracy geoid over baseline lengths between 0.4 and 320 km is achievable with GRAV-D data included (Smith et al. in J Geod 87:885-907, 2013). The second survey, Geoid Slope Validation Survey 2014 (GSVS14) took place in Iowa in an area with moderate topography but significant gravity variation. Two sets of geoidal heights were computed from GPS/leveling data and observed astrogeodetic deflections of the vertical at 204 GSVS14 official marks. They agree with each other at a {± }1.2 cm level, which attests to the high quality of the GSVS14 data. In total, four geoid models were computed. Three models combined the GOCO03/5S satellite gravity model with terrestrial and GRAV-D gravity with different strategies. The fourth model, called xGEOID15A, had no airborne gravity data and served as the benchmark to quantify the contribution of GRAV-D to the geoid improvement. The comparisons show that each model agrees with the GPS/leveling geoid height by 1.5 cm in mark-by-mark comparisons. In differential comparisons, all geoid models have a predicted accuracy of 1-2 cm at baseline lengths from 1.6 to 247 km. The contribution of GRAV-D is not apparent due to a 9-cm slope in the western 50-km section of the traverse for all gravimetric geoid models, and it was determined that the slopes have been caused by a 5 mGal bias in the terrestrial gravity data. If that western 50-km section of the testing line is excluded in the comparisons, then the improvement with GRAV-D is clearly evident. In that case, 1-cm differential accuracy on baselines of any length is achieved with the GRAV-D-enhanced geoid models and exhibits a clear improvement over the geoid models without GRAV-D data. GSVS14 confirmed that the geoid differential accuracies are in the 1-2 cm range at various baseline lengths. The accuracy increases to 1 cm with GRAV-D gravity when the west 50 km line is not included. The data collected by the surveys have high accuracy and have the potential to be used for validation of other geodetic techniques, e.g., the chronometric leveling. To reach the 1-cm height differences of the GSVS data, a clock with frequency accuracy of 10^{-18} is required. Using the GSVS data, the accuracy of ellipsoidal height differences can also be estimated.
Performance Comparison of HPF and MPI Based NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash

1997-01-01

Compilers supporting High Performance Form (HPF) features first appeared in late 1994 and early 1995 from Applied Parallel Research (APR), Digital Equipment Corporation, and The Portland Group (PGI). IBM introduced an HPF compiler for the IBM RS/6000 SP2 in April of 1996. Over the past two years, these implementations have shown steady improvement in terms of both features and performance. The performance of various hardware/ programming model (HPF and MPI) combinations will be compared, based on latest NAS Parallel Benchmark results, thus providing a cross-machine and cross-model comparison. Specifically, HPF based NPB results will be compared with MPI based NPB results to provide perspective on performance currently obtainable using HPF versus MPI or versus hand-tuned implementations such as those supplied by the hardware vendors. In addition, we would also present NPB, (Version 1.0) performance results for the following systems: DEC Alpha Server 8400 5/440, Fujitsu CAPP Series (VX, VPP300, and VPP700), HP/Convex Exemplar SPP2000, IBM RS/6000 SP P2SC node (120 MHz), NEC SX-4/32, SGI/CRAY T3E, and SGI Origin2000. We would also present sustained performance per dollar for Class B LU, SP and BT benchmarks.
Theory comparison and numerical benchmarking on neoclassical toroidal viscosity torque

NASA Astrophysics Data System (ADS)

Wang, Zhirui; Park, Jong-Kyu; Liu, Yueqiang; Logan, Nikolas; Kim, Kimin; Menard, Jonathan E.

2014-04-01

Systematic comparison and numerical benchmarking have been successfully carried out among three different approaches of neoclassical toroidal viscosity (NTV) theory and the corresponding codes: IPEC-PENT is developed based on the combined NTV theory but without geometric simplifications [Park et al., Phys. Rev. Lett. 102, 065002 (2009)]; MARS-Q includes smoothly connected NTV formula [Shaing et al., Nucl. Fusion 50, 025022 (2010)] based on Shaing's analytic formulation in various collisionality regimes; MARS-K, originally computing the drift kinetic energy, is upgraded to compute the NTV torque based on the equivalence between drift kinetic energy and NTV torque [J.-K. Park, Phys. Plasma 18, 110702 (2011)]. The derivation and numerical results both indicate that the imaginary part of drift kinetic energy computed by MARS-K is equivalent to the NTV torque in IPEC-PENT. In the benchmark of precession resonance between MARS-Q and MARS-K/IPEC-PENT, the agreement and correlation between the connected NTV formula and the combined NTV theory in different collisionality regimes are shown for the first time. Additionally, both IPEC-PENT and MARS-K indicate the importance of the bounce harmonic resonance which can greatly enhance the NTV torque when E ×B drift frequency reaches the bounce resonance condition.
Aeroelasticity Benchmark Assessment: Subsonic Fixed Wing Program

NASA Technical Reports Server (NTRS)

Florance, Jennifer P.; Chwalowski, Pawel; Wieseman, Carol D.

2010-01-01

The fundamental technical challenge in computational aeroelasticity is the accurate prediction of unsteady aerodynamic phenomena and the effect on the aeroelastic response of a vehicle. Currently, a benchmarking standard for use in validating the accuracy of computational aeroelasticity codes does not exist. Many aeroelastic data sets have been obtained in wind-tunnel and flight testing throughout the world; however, none have been globally presented or accepted as an ideal data set. There are numerous reasons for this. One reason is that often, such aeroelastic data sets focus on the aeroelastic phenomena alone (flutter, for example) and do not contain associated information such as unsteady pressures and time-correlated structural dynamic deflections. Other available data sets focus solely on the unsteady pressures and do not address the aeroelastic phenomena. Other discrepancies can include omission of relevant data, such as flutter frequency and / or the acquisition of only qualitative deflection data. In addition to these content deficiencies, all of the available data sets present both experimental and computational technical challenges. Experimental issues include facility influences, nonlinearities beyond those being modeled, and data processing. From the computational perspective, technical challenges include modeling geometric complexities, coupling between the flow and the structure, grid issues, and boundary conditions. The Aeroelasticity Benchmark Assessment task seeks to examine the existing potential experimental data sets and ultimately choose the one that is viewed as the most suitable for computational benchmarking. An initial computational evaluation of that configuration will then be performed using the Langley-developed computational fluid dynamics (CFD) software FUN3D1 as part of its code validation process. In addition to the benchmarking activity, this task also includes an examination of future research directions. Researchers within the Aeroelasticity Branch will examine other experimental efforts within the Subsonic Fixed Wing (SFW) program (such as testing of the NASA Common Research Model (CRM)) and other NASA programs and assess aeroelasticity issues and research topics.
Precise Ages for the Benchmark Brown Dwarfs HD 19467 B and HD 4747 B

NASA Astrophysics Data System (ADS)

Wood, Charlotte; Boyajian, Tabetha; Crepp, Justin; von Braun, Kaspar; Brewer, John; Schaefer, Gail; Adams, Arthur; White, Tim

2018-01-01

Large uncertainty in the age of brown dwarfs, stemming from a mass-age degeneracy, makes it difficult to constrain substellar evolutionary models. To break the degeneracy, we need ''benchmark" brown dwarfs (found in binary systems) whose ages can be determined independent of their masses. HD~19467~B and HD~4747~B are two benchmark brown dwarfs detected through the TRENDS (TaRgeting bENchmark objects with Doppler Spectroscopy) high-contrast imaging program for which we have dynamical mass measurements. To constrain their ages independently through isochronal analysis, we measured the radii of the host stars with interferometry using the Center for High Angular Resolution Astronomy (CHARA) Array. Assuming the brown dwarfs have the same ages as their host stars, we use these results to distinguish between several substellar evolutionary models. In this poster, we present new age estimates for HD~19467 and HD~4747 that are more accurate and precise and show our preliminary comparisons to cooling models.

Benchmarking FEniCS for mantle convection simulations

NASA Astrophysics Data System (ADS)

Vynnytska, L.; Rognes, M. E.; Clark, S. R.

2013-01-01

This paper evaluates the usability of the FEniCS Project for mantle convection simulations by numerical comparison to three established benchmarks. The benchmark problems all concern convection processes in an incompressible fluid induced by temperature or composition variations, and cover three cases: (i) steady-state convection with depth- and temperature-dependent viscosity, (ii) time-dependent convection with constant viscosity and internal heating, and (iii) a Rayleigh-Taylor instability. These problems are modeled by the Stokes equations for the fluid and advection-diffusion equations for the temperature and composition. The FEniCS Project provides a novel platform for the automated solution of differential equations by finite element methods. In particular, it offers a significant flexibility with regard to modeling and numerical discretization choices; we have here used a discontinuous Galerkin method for the numerical solution of the advection-diffusion equations. Our numerical results are in agreement with the benchmarks, and demonstrate the applicability of both the discontinuous Galerkin method and FEniCS for such applications.
Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods

PubMed Central

Mu, John C.; Tootoonchi Afshar, Pegah; Mohiyuddin, Marghoob; Chen, Xi; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B.; Wong, Wing H.; Lam, Hugo Y. K.

2015-01-01

A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools. PMID:26412485
Effects of benchmarking on the quality of type 2 diabetes care: results of the OPTIMISE (Optimal Type 2 Diabetes Management Including Benchmarking and Standard Treatment) study in Greece

PubMed Central

Tsimihodimos, Vasilis; Kostapanos, Michael S.; Moulis, Alexandros; Nikas, Nikos; Elisaf, Moses S.

2015-01-01

Objectives: To investigate the effect of benchmarking on the quality of type 2 diabetes (T2DM) care in Greece. Methods: The OPTIMISE (Optimal Type 2 Diabetes Management Including Benchmarking and Standard Treatment) study [ClinicalTrials.gov identifier: NCT00681850] was an international multicenter, prospective cohort study. It included physicians randomized 3:1 to either receive benchmarking for glycated hemoglobin (HbA1c), systolic blood pressure (SBP) and low-density lipoprotein cholesterol (LDL-C) treatment targets (benchmarking group) or not (control group). The proportions of patients achieving the targets of the above-mentioned parameters were compared between groups after 12 months of treatment. Also, the proportions of patients achieving those targets at 12 months were compared with baseline in the benchmarking group. Results: In the Greek region, the OPTIMISE study included 797 adults with T2DM (570 in the benchmarking group). At month 12 the proportion of patients within the predefined targets for SBP and LDL-C was greater in the benchmarking compared with the control group (50.6 versus 35.8%, and 45.3 versus 36.1%, respectively). However, these differences were not statistically significant. No difference between groups was noted in the percentage of patients achieving the predefined target for HbA1c. At month 12 the increase in the percentage of patients achieving all three targets was greater in the benchmarking (5.9–15.0%) than in the control group (2.7–8.1%). In the benchmarking group more patients were on target regarding SBP (50.6% versus 29.8%), LDL-C (45.3% versus 31.3%) and HbA1c (63.8% versus 51.2%) at 12 months compared with baseline (p < 0.001 for all comparisons). Conclusion: Benchmarking may comprise a promising tool for improving the quality of T2DM care. Nevertheless, target achievement rates of each, and of all three, quality indicators were suboptimal, indicating there are still unmet needs in the management of T2DM. PMID:26445642
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres.

PubMed

van Lent, Wineke A M; de Beer, Relinde D; van Harten, Wim H

2010-08-31

Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC), three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. Per multiple case study a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments of case study 3, were considering implementing the recommendations.Additionally, success factors, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results and, adapt the identified better working methods to the own setting, were found. The improved benchmarking process and the success factors can produce relevant input to improve the operations management of specialty hospitals.
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres

PubMed Central

2010-01-01

Background Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Methods Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC), three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. Per multiple case study a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. Results We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments of case study 3, were considering implementing the recommendations. Additionally, success factors, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results and, adapt the identified better working methods to the own setting, were found. Conclusions The improved benchmarking process and the success factors can produce relevant input to improve the operations management of specialty hospitals. PMID:20807408
Linking Recognition Practices and National Qualifications Frameworks: International Benchmarking of Experiences and Strategies on the Recognition, Validation and Accreditation (RVA) of Non-Formal and Informal Learning

ERIC Educational Resources Information Center

Singh, Madhu, Ed.; Duvekot, Ruud, Ed.

2013-01-01

This publication is the outcome of the international conference organized by UNESCO Institute for Lifelong Learning (UIL), in collaboration with the Centre for Validation of Prior Learning at Inholland University of Applied Sciences, the Netherlands, and in partnership with the French National Commission for UNESCO that was held in Hamburg in…
Uncertainty Quantification Techniques of SCALE/TSUNAMI

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rearden, Bradley T; Mueller, Don

2011-01-01

The Standardized Computer Analysis for Licensing Evaluation (SCALE) code system developed at Oak Ridge National Laboratory (ORNL) includes Tools for Sensitivity and Uncertainty Analysis Methodology Implementation (TSUNAMI). The TSUNAMI code suite can quantify the predicted change in system responses, such as k{sub eff}, reactivity differences, or ratios of fluxes or reaction rates, due to changes in the energy-dependent, nuclide-reaction-specific cross-section data. Where uncertainties in the neutron cross-section data are available, the sensitivity of the system to the cross-section data can be applied to propagate the uncertainties in the cross-section data to an uncertainty in the system response. Uncertainty quantification ismore » useful for identifying potential sources of computational biases and highlighting parameters important to code validation. Traditional validation techniques often examine one or more average physical parameters to characterize a system and identify applicable benchmark experiments. However, with TSUNAMI correlation coefficients are developed by propagating the uncertainties in neutron cross-section data to uncertainties in the computed responses for experiments and safety applications through sensitivity coefficients. The bias in the experiments, as a function of their correlation coefficient with the intended application, is extrapolated to predict the bias and bias uncertainty in the application through trending analysis or generalized linear least squares techniques, often referred to as 'data adjustment.' Even with advanced tools to identify benchmark experiments, analysts occasionally find that the application models include some feature or material for which adequately similar benchmark experiments do not exist to support validation. For example, a criticality safety analyst may want to take credit for the presence of fission products in spent nuclear fuel. In such cases, analysts sometimes rely on 'expert judgment' to select an additional administrative margin to account for gap in the validation data or to conclude that the impact on the calculated bias and bias uncertainty is negligible. As a result of advances in computer programs and the evolution of cross-section covariance data, analysts can use the sensitivity and uncertainty analysis tools in the TSUNAMI codes to estimate the potential impact on the application-specific bias and bias uncertainty resulting from nuclides not represented in available benchmark experiments. This paper presents the application of methods described in a companion paper.« less
2-D Circulation Control Airfoil Benchmark Experiments Intended for CFD Code Validation

NASA Technical Reports Server (NTRS)

Englar, Robert J.; Jones, Gregory S.; Allan, Brian G.; Lin, Johb C.

2009-01-01

A current NASA Research Announcement (NRA) project being conducted by Georgia Tech Research Institute (GTRI) personnel and NASA collaborators includes the development of Circulation Control (CC) blown airfoils to improve subsonic aircraft high-lift and cruise performance. The emphasis of this program is the development of CC active flow control concepts for both high-lift augmentation, drag control, and cruise efficiency. A collaboration in this project includes work by NASA research engineers, whereas CFD validation and flow physics experimental research are part of NASA s systematic approach to developing design and optimization tools for CC applications to fixed-wing aircraft. The design space for CESTOL type aircraft is focusing on geometries that depend on advanced flow control technologies that include Circulation Control aerodynamics. The ability to consistently predict advanced aircraft performance requires improvements in design tools to include these advanced concepts. Validation of these tools will be based on experimental methods applied to complex flows that go beyond conventional aircraft modeling techniques. This paper focuses on recent/ongoing benchmark high-lift experiments and CFD efforts intended to provide 2-D CFD validation data sets related to NASA s Cruise Efficient Short Take Off and Landing (CESTOL) study. Both the experimental data and related CFD predictions are discussed.
Verification of MCNP6.2 for Nuclear Criticality Safety Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B.; Rising, Michael Evan; Alwin, Jennifer Louise

2017-05-10

Several suites of verification/validation benchmark problems were run in early 2017 to verify that the new production release of MCNP6.2 performs correctly for nuclear criticality safety applications (NCS). MCNP6.2 results for several NCS validation suites were compared to the results from MCNP6.1 [1] and MCNP6.1.1 [2]. MCNP6.1 is the production version of MCNP® released in 2013, and MCNP6.1.1 is the update released in 2014. MCNP6.2 includes all of the standard features for NCS calculations that have been available for the past 15 years, along with new features for sensitivity-uncertainty based methods for NCS validation [3]. Results from the benchmark suitesmore » were compared with results from previous verification testing [4-8]. Criticality safety analysts should consider testing MCNP6.2 on their particular problems and validation suites. No further development of MCNP5 is planned. MCNP6.1 is now 4 years old, and MCNP6.1.1 is now 3 years old. In general, released versions of MCNP are supported only for about 5 years, due to resource limitations. All future MCNP improvements, bug fixes, user support, and new capabilities are targeted only to MCNP6.2 and beyond.« less
Global benchmarking of medical student learning outcomes? Implementation and pilot results of the International Foundations of Medicine Clinical Sciences Exam at The University of Queensland, Australia.

PubMed

Wilkinson, David; Schafer, Jennifer; Hewett, David; Eley, Diann; Swanson, Dave

2014-01-01

To report pilot results for international benchmarking of learning outcomes among 426 final year medical students at the University of Queensland (UQ), Australia. Students took the International Foundations of Medicine (IFOM) Clinical Sciences Exam (CSE) developed by the National Board of Medical Examiners, USA, as a required formative assessment. IFOM CSE comprises 160 multiple-choice questions in medicine, surgery, obstetrics, paediatrics and mental health, taken over 4.5 hours. Significant implementation issues; IFOM scores and benchmarking with International Comparison Group (ICG) scores and United States Medical Licensing Exam (USMLE) Step 2 Clinical Knowledge (CK) scores; and correlation with UQ medical degree cumulative grade point average (GPA). Implementation as an online exam, under university-mandated conditions was successful. Mean IFOM score was 531.3 (maximum 779-minimum 200). The UQ cohort performed better (31% scored below 500) than the ICG (55% below 500). However 49% of the UQ cohort did not meet the USMLE Step 2 CK minimum score. Correlation between IFOM scores and UQ cumulative GPA was reasonable at 0.552 (p < 0.001). International benchmarking is feasible and provides a variety of useful benchmarking opportunities.
Predicting Cost/Performance Trade-Offs for Whitney: A Commodity Computing Cluster

NASA Technical Reports Server (NTRS)

Becker, Jeffrey C.; Nitzberg, Bill; VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)

1997-01-01

Recent advances in low-end processor and network technology have made it possible to build a "supercomputer" out of commodity components. We develop simple models of the NAS Parallel Benchmarks version 2 (NPB 2) to explore the cost/performance trade-offs involved in building a balanced parallel computer supporting a scientific workload. We develop closed form expressions detailing the number and size of messages sent by each benchmark. Coupling these with measured single processor performance, network latency, and network bandwidth, our models predict benchmark performance to within 30%. A comparison based on total system cost reveals that current commodity technology (200 MHz Pentium Pros with 100baseT Ethernet) is well balanced for the NPBs up to a total system cost of around $1,000,000.
RISC Processors and High Performance Computing

NASA Technical Reports Server (NTRS)

Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.
Assessing Discriminative Performance at External Validation of Clinical Prediction Models

PubMed Central

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

2016-01-01

Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated them in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
Assessing Discriminative Performance at External Validation of Clinical Prediction Models.

PubMed

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W

2016-01-01

External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated them in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
Benchmarking Controlled Trial--a novel concept covering all observational effectiveness studies.

PubMed

Malmivaara, Antti

2015-06-01

The Benchmarking Controlled Trial (BCT) is a novel concept which covers all observational studies aiming to assess effectiveness. BCTs provide evidence of the comparative effectiveness between health service providers, and of effectiveness due to particular features of the health and social care systems. BCTs complement randomized controlled trials (RCTs) as the sources of evidence on effectiveness. This paper presents a definition of the BCT; compares the position of BCTs in assessing effectiveness with that of RCTs; presents a checklist for assessing methodological validity of a BCT; and pilot-tests the checklist with BCTs published recently in the leading medical journals.
The Sensitivity of WRF Daily Summertime Simulations over West Africa to Alternative Parameterizations. Part 1: African Wave Circulation

NASA Technical Reports Server (NTRS)

Noble, Erik; Druyan, Leonard M.; Fulakeza, Matthew

2014-01-01

The performance of the NCAR Weather Research and Forecasting Model (WRF) as a West African regional-atmospheric model is evaluated. The study tests the sensitivity of WRF-simulated vorticity maxima associated with African easterly waves to 64 combinations of alternative parameterizations in a series of simulations in September. In all, 104 simulations of 12-day duration during 11 consecutive years are examined. The 64 combinations combine WRF parameterizations of cumulus convection, radiation transfer, surface hydrology, and PBL physics. Simulated daily and mean circulation results are validated against NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA) and NCEP/Department of Energy Global Reanalysis 2. Precipitation is considered in a second part of this two-part paper. A wide range of 700-hPa vorticity validation scores demonstrates the influence of alternative parameterizations. The best WRF performers achieve correlations against reanalysis of 0.40-0.60 and realistic amplitudes of spatiotemporal variability for the 2006 focus year while a parallel-benchmark simulation by the NASA Regional Model-3 (RM3) achieves higher correlations, but less realistic spatiotemporal variability. The largest favorable impact on WRF-vorticity validation is achieved by selecting the Grell-Devenyi cumulus convection scheme, resulting in higher correlations against reanalysis than simulations using the Kain-Fritch convection. Other parameterizations have less-obvious impact, although WRF configurations incorporating one surface model and PBL scheme consistently performed poorly. A comparison of reanalysis circulation against two NASA radiosonde stations confirms that both reanalyses represent observations well enough to validate the WRF results. Validation statistics for optimized WRF configurations simulating the parallel period during 10 additional years are less favorable than for 2006.
Development of a proficiency-based virtual reality simulation training curriculum for laparoscopic appendicectomy.

PubMed

Sirimanna, Pramudith; Gladman, Marc A

2017-10-01

Proficiency-based virtual reality (VR) training curricula improve intraoperative performance, but have not been developed for laparoscopic appendicectomy (LA). This study aimed to develop an evidence-based training curriculum for LA. A total of 10 experienced (>50 LAs), eight intermediate (10-30 LAs) and 20 inexperienced (<10 LAs) operators performed guided and unguided LA tasks on a high-fidelity VR simulator using internationally relevant techniques. The ability to differentiate levels of experience (construct validity) was measured using simulator-derived metrics. Learning curves were analysed. Proficiency benchmarks were defined by the performance of the experienced group. Intermediate and experienced participants completed a questionnaire to evaluate the realism (face validity) and relevance (content validity). Of 18 surgeons, 16 (89%) considered the VR model to be visually realistic and 17 (95%) believed that it was representative of actual practice. All 'guided' modules demonstrated construct validity (P < 0.05), with learning curves that plateaued between sessions 6 and 9 (P < 0.01). When comparing inexperienced to intermediates to experienced, the 'unguided' LA module demonstrated construct validity for economy of motion (5.00 versus 7.17 versus 7.84, respectively; P < 0.01) and task time (864.5 s versus 477.2 s versus 352.1 s, respectively, P < 0.01). Construct validity was also confirmed for number of movements, path length and idle time. Validated modules were used for curriculum construction, with proficiency benchmarks used as performance goals. A VR LA model was realistic and representative of actual practice and was validated as a training and assessment tool. Consequently, the first evidence-based internationally applicable training curriculum for LA was constructed, which facilitates skill acquisition to proficiency. © 2017 Royal Australasian College of Surgeons.
Performance Evaluation and Improvement of Ferroelectric Field-Effect Transistor Memory

NASA Astrophysics Data System (ADS)

Yu, Hyung Suk

Flash memory is reaching scaling limitations rapidly due to reduction of charge in floating gates, charge leakage and capacitive coupling between cells which cause threshold voltage fluctuations, short retention times, and interference. Many new memory technologies are being considered as alternatives to flash memory in an effort to overcome these limitations. Ferroelectric Field-Effect Transistor (FeFET) is one of the main emerging candidates because of its structural similarity to conventional FETs and fast switching speed. Nevertheless, the performance of FeFETs have not been systematically compared and analyzed against other competing technologies. In this work, we first benchmark the intrinsic performance of FeFETs and other memories by simulations in order to identify the strengths and weaknesses of FeFETs. To simulate realistic memory applications, we compare memories on an array structure. For the comparisons, we construct an accurate delay model and verify it by benchmarking against exact HSPICE simulations. Second, we propose an accurate model for FeFET memory window since the existing model has limitations. The existing model assumes symmetric operation voltages but it is not valid for the practical asymmetric operation voltages. In this modeling, we consider practical operation voltages and device dimensions. Also, we investigate realistic changes of memory window over time and retention time of FeFETs. Last, to improve memory window and subthreshold swing, we suggest nonplanar junctionless structures for FeFETs. Using the suggested structures, we study the dimensional dependences of crucial parameters like memory window and subthreshold swing and also analyze key interference mechanisms.
Neutron spectra measurement and calculations using data libraries CIELO, JEFF-3.2 and ENDF/B-VII.1 in iron benchmark assemblies

NASA Astrophysics Data System (ADS)

Jansky, Bohumil; Rejchrt, Jiri; Novak, Evzen; Losa, Evzen; Blokhin, Anatoly I.; Mitenkova, Elena

2017-09-01

The leakage neutron spectra measurements have been done on benchmark spherical assemblies - iron spheres with diameter of 20, 30, 50 and 100 cm. The Cf-252 neutron source was placed into the centre of iron sphere. The proton recoil method was used for neutron spectra measurement using spherical hydrogen proportional counters with diameter of 4 cm and with pressure of 400 and 1000 kPa. The neutron energy range of spectrometer is from 0.1 to 1.3 MeV. This energy interval represents about 85 % of all leakage neutrons from Fe sphere of diameter 50 cm and about of 74% for Fe sphere of diameter 100 cm. The adequate MCNP neutron spectra calculations based on data libraries CIELO, JEFF-3.2 and ENDF/B-VII.1 were done. Two calculations were done with CIELO library. The first one used data for all Fe-isotopes from CIELO and the second one (CIELO-56) used only Fe-56 data from CIELO and data for other Fe isotopes were from ENDF/B-VII.1. The energy structure used for calculations and measurements was 40 gpd (groups per decade) and 200 gpd. Structure 200 gpd represents lethargy step about of 1%. This relatively fine energy structure enables to analyze the Fe resonance neutron energy structure. The evaluated cross section data of Fe were validated on comparisons between the calculated and experimental spectra.
Neutron Deep Penetration Calculations in Light Water with Monte Carlo TRIPOLI-4® Variance Reduction Techniques

NASA Astrophysics Data System (ADS)

Lee, Yi-Kang

2017-09-01

Nuclear decommissioning takes place in several stages due to the radioactivity in the reactor structure materials. A good estimation of the neutron activation products distributed in the reactor structure materials impacts obviously on the decommissioning planning and the low-level radioactive waste management. Continuous energy Monte-Carlo radiation transport code TRIPOLI-4 has been applied on radiation protection and shielding analyses. To enhance the TRIPOLI-4 application in nuclear decommissioning activities, both experimental and computational benchmarks are being performed. To calculate the neutron activation of the shielding and structure materials of nuclear facilities, the knowledge of 3D neutron flux map and energy spectra must be first investigated. To perform this type of neutron deep penetration calculations with the Monte Carlo transport code, variance reduction techniques are necessary in order to reduce the uncertainty of the neutron activation estimation. In this study, variance reduction options of the TRIPOLI-4 code were used on the NAIADE 1 light water shielding benchmark. This benchmark document is available from the OECD/NEA SINBAD shielding benchmark database. From this benchmark database, a simplified NAIADE 1 water shielding model was first proposed in this work in order to make the code validation easier. Determination of the fission neutron transport was performed in light water for penetration up to 50 cm for fast neutrons and up to about 180 cm for thermal neutrons. Measurement and calculation results were benchmarked. Variance reduction options and their performance were discussed and compared.

Shallow water models as tool for tsunami current predictions in ports and harbors. Validation with Tohoku 2011 field data

NASA Astrophysics Data System (ADS)

Gonzalez Vida, J. M., Sr.; Macias Sanchez, J.; Castro, M. J.; Ortega, S.

2015-12-01

Model ability to compute and predict tsunami flow velocities is of importance in risk assessment and hazard mitigation. Substantial damage can be produced by high velocity flows, particularly in harbors and bays, even when the wave height is small. Besides, an accurate simulation of tsunami flow velocities and accelerations is fundamental for advancing in the study of tsunami sediment transport. These considerations made the National Tsunami Hazard Mitigation Program (NTHMP) proposing a benchmark exercise focused on modeling and simulating tsunami currents. Until recently, few direct measurements of tsunami velocities were available to compare and to validate model results. After Tohoku 2011 many current meters measurement were made, mainly in harbors and channels. In this work we present a part of the contribution made by the EDANYA group from the University of Malaga to the NTHMP workshop organized at Portland (USA), 9-10 of February 2015. We have selected three out of the five proposed benchmark problems. Two of them consist in real observed data from the Tohoku 2011 event, one at Hilo Habour (Hawaii) and the other at Tauranga Bay (New Zealand). The third one consists in laboratory experimental data for the inundation of Seaside City in Oregon. For this model validation the Tsunami-HySEA model, developed by EDANYA group, was used. The overall conclusion that we could extract from this validation exercise was that the Tsunami-HySEA model performed well in all benchmark problems proposed. The greater spatial variability in tsunami velocity than wave height makes it more difficult its precise numerical representation. The larger variability in velocities is likely a result of the behaviour of the flow as it is channelized and as it flows around bathymetric highs and structures. In the other hand wave height do not respond as strongly to chanelized flow as current velocity.
Benchmark On Sensitivity Calculation (Phase III)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ivanova, Tatiana; Laville, Cedric; Dyrda, James

2012-01-01

The sensitivities of the keff eigenvalue to neutron cross sections have become commonly used in similarity studies and as part of the validation algorithm for criticality safety assessments. To test calculations of the sensitivity coefficients, a benchmark study (Phase III) has been established by the OECD-NEA/WPNCS/EG UACSA (Expert Group on Uncertainty Analysis for Criticality Safety Assessment). This paper presents some sensitivity results generated by the benchmark participants using various computational tools based upon different computational methods: SCALE/TSUNAMI-3D and -1D, MONK, APOLLO2-MORET 5, DRAGON-SUSD3D and MMKKENO. The study demonstrates the performance of the tools. It also illustrates how model simplifications impactmore » the sensitivity results and demonstrates the importance of 'implicit' (self-shielding) sensitivities. This work has been a useful step towards verification of the existing and developed sensitivity analysis methods.« less
Anharmonic Vibrational Spectroscopy on Metal Transition Complexes

NASA Astrophysics Data System (ADS)

Latouche, Camille; Bloino, Julien; Barone, Vincenzo

2014-06-01

Advances in hardware performance and the availability of efficient and reliable computational models have made possible the application of computational spectroscopy to ever larger molecular systems. The systematic interpretation of experimental data and the full characterization of complex molecules can then be facilitated. Focusing on vibrational spectroscopy, several approaches have been proposed to simulate spectra beyond the double harmonic approximation, so that more details become available. However, a routine use of such tools requires the preliminary definition of a valid protocol with the most appropriate combination of electronic structure and nuclear calculation models. Several benchmark of anharmonic calculations frequency have been realized on organic molecules. Nevertheless, benchmarks of organometallics or inorganic metal complexes at this level are strongly lacking despite the interest of these systems due to their strong emission and vibrational properties. Herein we report the benchmark study realized with anharmonic calculations on simple metal complexes, along with some pilot applications on systems of direct technological or biological interest.
Fast Neutron Spectrum Potassium Worth for Space Power Reactor Design Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Marshall, Margaret A.; Briggs, J. Blair

2015-03-01

A variety of critical experiments were constructed of enriched uranium metal (oralloy ) during the 1960s and 1970s at the Oak Ridge Critical Experiments Facility (ORCEF) in support of criticality safety operations at the Y-12 Plant. The purposes of these experiments included the evaluation of storage, casting, and handling limits for the Y-12 Plant and providing data for verification of calculation methods and cross-sections for nuclear criticality safety applications. These included solid cylinders of various diameters, annuli of various inner and outer diameters, two and three interacting cylinders of various diameters, and graphite and polyethylene reflected cylinders and annuli. Ofmore » the hundreds of delayed critical experiments, one was performed that consisted of uranium metal annuli surrounding a potassium-filled, stainless steel can. The outer diameter of the annuli was approximately 13 inches (33.02 cm) with an inner diameter of 7 inches (17.78 cm). The diameter of the stainless steel can was 7 inches (17.78 cm). The critical height of the configurations was approximately 5.6 inches (14.224 cm). The uranium annulus consisted of multiple stacked rings, each with radial thicknesses of 1 inch (2.54 cm) and varying heights. A companion measurement was performed using empty stainless steel cans; the primary purpose of these experiments was to test the fast neutron cross sections of potassium as it was a candidate for coolant in some early space power reactor designs.The experimental measurements were performed on July 11, 1963, by J. T. Mihalczo and M. S. Wyatt (Ref. 1) with additional information in its corresponding logbook. Unreflected and unmoderated experiments with the same set of highly enriched uranium metal parts were performed at the Oak Ridge Critical Experiments Facility in the 1960s and are evaluated in the International Handbook for Evaluated Criticality Safety Benchmark Experiments (ICSBEP Handbook) with the identifier HEU MET FAST 051. Thin graphite reflected (2 inches or less) experiments also using the same set of highly enriched uranium metal parts are evaluated in HEU MET FAST 071. Polyethylene-reflected configurations are evaluated in HEU-MET-FAST-076. A stack of highly enriched metal discs with a thick beryllium top reflector is evaluated in HEU-MET-FAST-069, and two additional highly enriched uranium annuli with beryllium cores are evaluated in HEU-MET-FAST-059. Both detailed and simplified model specifications are provided in this evaluation. Both of these fast neutron spectra assemblies were determined to be acceptable benchmark experiments. The calculated eigenvalues for both the detailed and the simple benchmark models are within ~0.26 % of the benchmark values for Configuration 1 (calculations performed using MCNP6 with ENDF/B-VII.1 neutron cross section data), but under-calculate the benchmark values by ~7s because the uncertainty in the benchmark is very small: ~0.0004 (1s); for Configuration 2, the under-calculation is ~0.31 % and ~8s. Comparison of detailed and simple model calculations for the potassium worth measurement and potassium mass coefficient yield results approximately 70 – 80 % lower (~6s to 10s) than the benchmark values for the various nuclear data libraries utilized. Both the potassium worth and mass coefficient are also deemed to be acceptable benchmark experiment measurements.« less
Validation of the WIMSD4M cross-section generation code with benchmark results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deen, J.R.; Woodruff, W.L.; Leal, L.E.

1995-01-01

The WIMSD4 code has been adopted for cross-section generation in support of the Reduced Enrichment Research and Test Reactor (RERTR) program at Argonne National Laboratory (ANL). Subsequently, the code has undergone several updates, and significant improvements have been achieved. The capability of generating group-collapsed micro- or macroscopic cross sections from the ENDF/B-V library and the more recent evaluation, ENDF/B-VI, in the ISOTXS format makes the modified version of the WIMSD4 code, WIMSD4M, very attractive, not only for the RERTR program, but also for the reactor physics community. The intent of the present paper is to validate the WIMSD4M cross-section librariesmore » for reactor modeling of fresh water moderated cores. The results of calculations performed with multigroup cross-section data generated with the WIMSD4M code will be compared against experimental results. These results correspond to calculations carried out with thermal reactor benchmarks of the Oak Ridge National Laboratory (ORNL) unreflected HEU critical spheres, the TRX LEU critical experiments, and calculations of a modified Los Alamos HEU D{sub 2}O moderated benchmark critical system. The benchmark calculations were performed with the discrete-ordinates transport code, TWODANT, using WIMSD4M cross-section data. Transport calculations using the XSDRNPM module of the SCALE code system are also included. In addition to transport calculations, diffusion calculations with the DIF3D code were also carried out, since the DIF3D code is used in the RERTR program for reactor analysis and design. For completeness, Monte Carlo results of calculations performed with the VIM and MCNP codes are also presented.« less
Validation of the WIMSD4M cross-section generation code with benchmark results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leal, L.C.; Deen, J.R.; Woodruff, W.L.

1995-02-01

The WIMSD4 code has been adopted for cross-section generation in support of the Reduced Enrichment for Research and Test (RERTR) program at Argonne National Laboratory (ANL). Subsequently, the code has undergone several updates, and significant improvements have been achieved. The capability of generating group-collapsed micro- or macroscopic cross sections from the ENDF/B-V library and the more recent evaluation, ENDF/B-VI, in the ISOTXS format makes the modified version of the WIMSD4 code, WIMSD4M, very attractive, not only for the RERTR program, but also for the reactor physics community. The intent of the present paper is to validate the procedure to generatemore » cross-section libraries for reactor analyses and calculations utilizing the WIMSD4M code. To do so, the results of calculations performed with group cross-section data generated with the WIMSD4M code will be compared against experimental results. These results correspond to calculations carried out with thermal reactor benchmarks of the Oak Ridge National Laboratory(ORNL) unreflected critical spheres, the TRX critical experiments, and calculations of a modified Los Alamos highly-enriched heavy-water moderated benchmark critical system. The benchmark calculations were performed with the discrete-ordinates transport code, TWODANT, using WIMSD4M cross-section data. Transport calculations using the XSDRNPM module of the SCALE code system are also included. In addition to transport calculations, diffusion calculations with the DIF3D code were also carried out, since the DIF3D code is used in the RERTR program for reactor analysis and design. For completeness, Monte Carlo results of calculations performed with the VIM and MCNP codes are also presented.« less
A Focus Group Exploration of Automated Case-Finders to Identify High-Risk Heart Failure Patients Within an Urban Safety Net Hospital.

PubMed

Patterson, Mark E; Miranda, Derick; Schuman, Greg; Eaton, Christopher; Smith, Andrew; Silver, Brad

2016-01-01

Leveraging "big data" as a means of informing cost-effective care holds potential in triaging high-risk heart failure (HF) patients for interventions within hospitals seeking to reduce 30-day readmissions. Explore provider's beliefs and perceptions about using an electronic health record (EHR)-based tool that uses unstructured clinical notes to risk-stratify high-risk heart failure patients. Six providers from an inpatient HF clinic within an urban safety net hospital were recruited to participate in a semistructured focus group. A facilitator led a discussion on the feasibility and value of using an EHR tool driven by unstructured clinical notes to help identify high-risk patients. Data collected from transcripts were analyzed using a thematic analysis that facilitated drawing conclusions clustered around categories and themes. From six categories emerged two themes: (1) challenges of finding valid and accurate results, and (2) strategies used to overcome these challenges. Although employing a tool that uses electronic medical record (EMR) unstructured text as the benchmark by which to identify high-risk patients is efficient, choosing appropriate benchmark groups could be challenging given the multiple causes of readmission. Strategies to mitigate these challenges include establishing clear selection criteria to guide benchmark group composition, and quality outcome goals for the hospital. Prior to implementing into practice an innovative EMR-based case-finder driven by unstructured clinical notes, providers are advised to do the following: (1) define patient quality outcome goals, (2) establish criteria by which to guide benchmark selection, and (3) verify the tool's validity and reliability. Achieving consensus on these issues would be necessary for this innovative EHR-based tool to effectively improve clinical decision-making and in turn, decrease readmissions for high-risk patients.
Species management benchmarking: outcomes over outputs in a changing operating environment.

PubMed

Hogg, Carolyn J; Hibbard, Chris; Ford, Claire; Embury, Amanda

2013-03-01

Species management has been utilized by the zoo and aquarium industry, since the mid-1990s, to ensure the ongoing genetic and demographic viability of populations, which can be difficult to maintain in the ever-changing operating environments of zoos. In 2009, the Zoo and Aquarium Association Australasia reviewed their species management services, focusing on addressing issues that had arisen as a result of the managed programs maturing and operating environments evolving. In summary, the project examined resourcing, policies, processes, and species to be managed. As a result, a benchmarking tool was developed (Health Check Report, HCR), which evaluated the programs against a set of broad criteria. A comparison of managed programs (n = 98), between 2008 and 2011, was undertaken to ascertain the tool's effectiveness. There was a marked decrease in programs that were designated as weak (37 down to 13); and an increase in excellent programs (24 up to 49) between the 2 years. Further, there were significant improvements in the administration benchmarking area (submission of reports, captive management plan development) across a number of taxon advisory groups. This HCR comparison showed that a benchmarking tool enables a program's performance to be quickly assessed and any remedial measures applied. The increases observed in program health were mainly due to increased management goals being attained. The HCR will be an ongoing program, as the management of the programs increases and goals are achieved, criteria will be refined to better highlight ongoing issues and ways in which these can be resolved. © 2012 Wiley Periodicals, Inc.
Development and validation of a new pediatric resuscitation and trauma outcome (PRESTO) model using the U.S. National Trauma Data Bank.

PubMed

St-Louis, Etienne; Bracco, David; Hanley, James; Razek, Tarek; Baird, Robert

2017-10-12

There is a need for a pediatric trauma outcomes benchmarking model that is adapted for Low-and-Middle-Income Countries (LMICs). We used the National-Trauma-Data-Bank (NTDB) and applied constraints specific to resource-poor environments to develop and validate an LMIC-specific pediatric trauma score. We selected a sample of pediatric trauma patients aged 0-14years in the NTDB from 2007 to 2012. Primary outcome was in-hospital death. Logistic regression was used to create the Pediatric Resuscitation and Trauma Outcome (PRESTO) score, which includes only low-tech predictor variables - those easily obtainable at point-of-care. Internal validation was performed using 10-fold cross-validation. External validation compared PRESTO to TRISS using ROC analyses. Among 651,030 patients, there were 64% males. Median age was 7. In-hospital mortality-rate was 1.2%. Mean TRISS predicted mortality was 0.04% (range 0%-43%). Independent predictors included in PRESTO (p<0.01) were age, blood pressure, neurologic status, need for supplemental oxygen, pulse, and oxygen saturation. The sensitivity and specificity of PRESTO were 95.7% and 94.0%. The resulting model had an AUC of 0.98 compared to 0.89 for TRISS. PRESTO satisfies the requirements of low-resource settings and is inherently adapted to children, allowing for benchmarking and eventual quality improvement initiatives. Further research is necessary for in-situ validation using prospectively collected LMIC data. Level III - Case-Control (Prognostic) Study. Copyright © 2017 Elsevier Inc. All rights reserved.
Validation of mechanical models for reinforced concrete structures: Presentation of the French project ``Benchmark des Poutres de la Rance''

NASA Astrophysics Data System (ADS)

L'Hostis, V.; Brunet, C.; Poupard, O.; Petre-Lazar, I.

2006-11-01

Several ageing models are available for the prediction of the mechanical consequences of rebar corrosion. They are used for service life prediction of reinforced concrete structures. Concerning corrosion diagnosis of reinforced concrete, some Non Destructive Testing (NDT) tools have been developed, and have been in use for some years. However, these developments require validation on existing concrete structures. The French project “Benchmark des Poutres de la Rance” contributes to this aspect. It has two main objectives: (i) validation of mechanical models to estimate the influence of rebar corrosion on the load bearing capacity of a structure, (ii) qualification of the use of the NDT results to collect information on steel corrosion within reinforced-concrete structures. Ten French and European institutions from both academic research laboratories and industrial companies contributed during the years 2004 and 2005. This paper presents the project that was divided into several work packages: (i) the reinforced concrete beams were characterized from non-destructive testing tools, (ii) the mechanical behaviour of the beams was experimentally tested, (iii) complementary laboratory analysis were performed and (iv) finally numerical simulations results were compared to the experimental results obtained with the mechanical tests.
Cancer Detection in Microarray Data Using a Modified Cat Swarm Optimization Clustering Approach

PubMed

M, Pandi; R, Balamurugan; N, Sadhasivam

2017-12-29

Objective: A better understanding of functional genomics can be obtained by extracting patterns hidden in gene expression data. This could have paramount implications for cancer diagnosis, gene treatments and other domains. Clustering may reveal natural structures and identify interesting patterns in underlying data. The main objective of this research was to derive a heuristic approach to detection of highly co-expressed genes related to cancer from gene expression data with minimum Mean Squared Error (MSE). Methods: A modified CSO algorithm using Harmony Search (MCSO-HS) for clustering cancer gene expression data was applied. Experiment results are analyzed using two cancer gene expression benchmark datasets, namely for leukaemia and for breast cancer. Result: The results indicated MCSO-HS to be better than HS and CSO, 13% and 9% with the leukaemia dataset. For breast cancer dataset improvement was by 22% and 17%, respectively, in terms of MSE. Conclusion: The results showed MCSO-HS to outperform HS and CSO with both benchmark datasets. To validate the clustering results, this work was tested with internal and external cluster validation indices. Also this work points to biological validation of clusters with gene ontology in terms of function, process and component. Creative Commons Attribution License
Organic contaminants, trace and major elements, and nutrients in water and sediment sampled in response to the Deepwater Horizon oil spill

USGS Publications Warehouse

Nowell, Lisa H.; Ludtke, Amy S.; Mueller, David K.; Scott, Jonathon C.

2011-01-01

Considering all the information evaluated in this report, there were significant differences between pre-landfall and post-landfall samples for PAH concentrations in sediment. Pre-landfall and post-landfall samples did not differ significantly in concentrations or benchmark exceedances for most organics in water or trace elements in sediment. For trace elements in water, aquatic-life benchmarks were exceeded in almost 50 percent of samples, but the high and variable analytical reporting levels precluded statistical comparison of benchmark exceedances between sampling periods. Concentrations of several PAH compounds in sediment were significantly higher in post-landfall samples than pre-landfall samples, and five of seven sites with the largest differences in PAH concentrations also had diagnostic geochemical evidence of Deepwater Horizon Macondo-1 oil from Rosenbauer and others (2010).
The COA360: a tool for assessing the cultural competency of healthcare organizations.

PubMed

LaVeist, Thomas A; Relosa, Rachel; Sawaya, Nadia

2008-01-01

The U.S. Census Bureau projects that by 2050, non-Hispanic whites will be in the numerical minority. This rapid diversification requires healthcare organizations to pay closer attention to cross-cultural issues if they are to meet the healthcare needs of the nation and continue to maintain a high standard of care. Although scorecards and benchmarking are widely used to gauge healthcare organizations' performance in various areas, these tools have been underused in relation to cultural preparedness or initiatives. The likely reason for this is the lack of a validated tool specifically designed to examine cultural competency. Existing validated cultural competency instruments evaluate individuals, not organizations. In this article, we discuss a study to validate the Cultural Competency Organizational Assessment--360 or the COA360, an instrument designed to appraise a healthcare organization's cultural competence. The Office of Minority Health and the Joint Commission have each developed standards for measuring the cultural competency of organizations. The COA360 is designed to assess adherence to both of these sets of standards. For this validation study, we enlisted a panel of national experts. The panel rated each dimension of the COA360, and the combination of items for each of the scale's 14 dimensions was rated above 4.13 (on 5-point scale). Our conclusion points to the validity of the COA360. As such, it is a valuable tool not only for assessing a healthcare organization's cultural readiness but also for benchmarking its progress in addressing cultural and diversity issues.
Comparison of discrete ordinate and Monte Carlo simulations of polarized radiative transfer in two coupled slabs with different refractive indices.

PubMed

Cohen, D; Stamnes, S; Tanikawa, T; Sommersten, E R; Stamnes, J J; Lotsberg, J K; Stamnes, K

2013-04-22

A comparison is presented of two different methods for polarized radiative transfer in coupled media consisting of two adjacent slabs with different refractive indices, each slab being a stratified medium with no change in optical properties except in the direction of stratification. One of the methods is based on solving the integro-differential radiative transfer equation for the two coupled slabs using the discrete ordinate approximation. The other method is based on probabilistic and statistical concepts and simulates the propagation of polarized light using the Monte Carlo approach. The emphasis is on non-Rayleigh scattering for particles in the Mie regime. Comparisons with benchmark results available for a slab with constant refractive index show that both methods reproduce these benchmark results when the refractive index is set to be the same in the two slabs. Computed results for test cases with coupling (different refractive indices in the two slabs) show that the two methods produce essentially identical results for identical input in terms of absorption and scattering coefficients and scattering phase matrices.
New features and improved uncertainty analysis in the NEA nuclear data sensitivity tool (NDaST)

NASA Astrophysics Data System (ADS)

Dyrda, J.; Soppera, N.; Hill, I.; Bossant, M.; Gulliford, J.

2017-09-01

Following the release and initial testing period of the NEA's Nuclear Data Sensitivity Tool [1], new features have been designed and implemented in order to expand its uncertainty analysis capabilities. The aim is to provide a free online tool for integral benchmark testing, that is both efficient and comprehensive, meeting the needs of the nuclear data and benchmark testing communities. New features include access to P1 sensitivities for neutron scattering angular distribution [2] and constrained Chi sensitivities for the prompt fission neutron energy sampling. Both of these are compatible with covariance data accessed via the JANIS nuclear data software, enabling propagation of the resultant uncertainties in keff to a large series of integral experiment benchmarks. These capabilities are available using a number of different covariance libraries e.g., ENDF/B, JEFF, JENDL and TENDL, allowing comparison of the broad range of results it is possible to obtain. The IRPhE database of reactor physics measurements is now also accessible within the tool in addition to the criticality benchmarks from ICSBEP. Other improvements include the ability to determine and visualise the energy dependence of a given calculated result in order to better identify specific regions of importance or high uncertainty contribution. Sorting and statistical analysis of the selected benchmark suite is now also provided. Examples of the plots generated by the software are included to illustrate such capabilities. Finally, a number of analytical expressions, for example Maxwellian and Watt fission spectra will be included. This will allow the analyst to determine the impact of varying such distributions within the data evaluation, either through adjustment of parameters within the expressions, or by comparison to a more general probability distribution fitted to measured data. The impact of such changes is verified through calculations which are compared to a `direct' measurement found by adjustment of the original ENDF format file.
Unavailability of thymidine kinase does not preclude the use of German comprehensive prognostic index: results of an external validation analysis in early chronic lymphocytic leukemia and comparison with MD Anderson Cancer Center model.

PubMed

Molica, Stefano; Giannarelli, Diana; Mirabelli, Rosanna; Levato, Luciano; Russo, Antonio; Linardi, Maria; Gentile, Massimo; Morabito, Fortunato

2016-01-01

A comprehensive prognostic index that includes clinical (i.e., age, sex, ECOG performance status), serum (i.e., ß2-microglobulin, thymidine kinase [TK]), and molecular (i.e., IGVH mutational status, del 17p, del 11q) markers developed by the German CLL Study Group (GCLLSG) was externally validated in a prospective, community-based cohort consisting of 338 patients with early chronic lymphocytic leukemia (CLL) using as endpoint the time to first treatment (TTFT). Because serum TK was not available, a slightly modified version of the model based on seven instead of eight prognostic variables was used. By German index, 62.9% of patients were scored as having low-risk CLL (score 0-2), whereas 37.1% had intermediate-risk CLL (score 3-5). This stratification translated into a significant difference in the TTFT [HR = 4.21; 95% C.I. (2.71-6.53); P < 0.0001]. Also the 2007 MD Anderson Cancer Center (MDACC) score, barely based on traditional clinical parameters, showed comparable reliability [HR = 2.73; 95% C.I. (1.79-4.17); P < 0.0001]. A comparative performance assessment between the two models revealed that prediction of the TTFT was more accurate with German score. The c-statistic of the MDACC model was 0.65 (range, 0.53-0.78) a level below that of the German index [0.71 (range, 0.60-0.82)] and below the accepted 0.7 threshold necessary to have value at the individual patient level. Results of this external comparative validation analysis strongly support the German score as the benchmark for comparison of any novel prognostic scheme aimed at evaluating the TTFT in patients with early CLL even when a modified version which does not include TK is utilized. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Specific entrustable professional activities for undergraduate medical internships: a method compatible with the academic curriculum.

PubMed

Hamui-Sutton, Alicia; Monterrosas-Rojas, Ana María; Ortiz-Montalvo, Armando; Flores-Morones, Felipe; Torruco-García, Uri; Navarrete-Martínez, Andrea; Arrioja-Guerrero, Araceli

2017-08-25

Competency-based education has been considered the most important pedagogical trend in Medicine in the last two decades. In clinical contexts, competencies are implemented through Entrustable Professional Activities (EPAs) which are observable and measurable. The aim of this paper is to describe the methodology used in the design of educational tools to assess students´ competencies in clinical practice during their undergraduate internship (UI). In this paper, we present the construction of specific APROCs (Actividades Profesionales Confiables) in Surgery (S), Gynecology and Obstetrics (GO) and Family Medicine (FM) rotations with three levels of performance. The study considered a mixed method exploratory type design, a qualitative phase followed by a quantitative validation exercise. In the first stage data was obtained from three rotations (FM, GO and S) through focus groups about real and expected activities of medical interns. Triangulation with other sources was made to construct benchmarks. In the second stage, narrative descriptions with the three levels were validated by professors who teach the different subjects using the Delphi technique. The results may be described both curricular and methodological wise. From the curricular point of view, APROCs were identified in three UI rotations within clinical contexts in Mexico City, benchmarks were developed by levels and validated by experts' consensus. In regard to methodological issues, this research contributed to the development of a strategy, following six steps, to build APROCs using mixed methods. Developing benchmarks provides a regular and standardized language that helps to evaluate student's performance and define educational strategies efficiently and accurately. The university academic program was aligned with APROCs in clinical contexts to assure the acquisition of competencies by students.
Hazards of benchmarking complications with the National Trauma Data Bank: numerators in search of denominators.

PubMed

Kardooni, Shahrzad; Haut, Elliott R; Chang, David C; Pierce, Charles A; Efron, David T; Haider, Adil H; Pronovost, Peter J; Cornwell, Edward E

2008-02-01

Complication rates after trauma may serve as important indicators of quality of care. Meaningful performance benchmarks for complication rates require reference standards from valid and reliable data. Selection of appropriate numerators and denominators is a major consideration for data validity in performance improvement and benchmarking. We examined the suitability of the National Trauma Data Bank (NTDB) as a reference for benchmarking trauma center complication rates. We selected the five most commonly reported complications in the NTDB v. 6.1 (pneumonia, urinary tract infection, acute respiratory distress syndrome, deep vein thrombosis, myocardial infarction). We compared rates for each complication using three different denominators defined by different populations at risk. A-all patients from all 700 reporting facilities as the denominator (n = 1,466,887); B-only patients from the 441 hospitals reporting at least one complication (n = 1,307,729); C-patients from hospitals reporting at least one occurrence of each specific complication, giving a unique denominator for each complication (n range = 869,675-1,167,384). We also looked at differences in hospital characteristics between complication reporters and nonreporters. There was a 12.2% increase in the rate of each complication when patients from facilities not reporting any complications were excluded from the denominator. When rates were calculated using a unique denominator for each complication, rates increased 25% to 70%. The change from rate A to rate C produced a new rank order for the top five complications. When compared directly, rates B and C were also significantly different for all complications (all p < 0.01). Hospitals that reported complication information had significantly higher annual admissions and were more likely to be designated level I or II trauma centers and be university teaching hospitals. There is great variability in complication data reported in the NTDB that may introduce bias and significantly influence rates of complications reported. This potential for bias creates a challenge for appropriately interpreting complication rates for hospital performance benchmarking. We recognize the value of large aggregated registries such as the NTDB as a valuable tool for benchmarking and performance improvement purposes. However, we strongly advocate the need for conscientious selection of numerators and denominators that serve as the basic foundation for research.
Benchmarks for health expenditures, services and outcomes in Africa during the 1990s.

PubMed Central

Peters, D. H.; Elmendorf, A. E.; Kandola, K.; Chellaraj, G.

2000-01-01

There is limited information on national health expenditures, services, and outcomes in African countries during the 1990s. We intend to make statistical information available for national level comparisons. National level data were collected from numerous international databases, and supplemented by national household surveys and World Bank expenditure reviews. The results were tabulated and analysed in an exploratory fashion to provide benchmarks for groupings of African countries and individual country comparison. There is wide variation in scale and outcome of health care spending between African countries, with poorer countries tending to do worse than wealthier ones. From 1990-96, the median annual per capita government expenditure on health was nearly US$ 6, but averaged US$ 3 in the lowest-income countries, compared to US$ 72 in middle-income countries. Similar trends were found for health services and outcomes. Results from individual countries (particularly Ethiopia, Ghana, Côte d'Ivoire and Gabon) are used to indicate how the data can be used to identify areas of improvement in health system performance. Serious gaps in data, particularly concerning private sector delivery and financing, health service utilization, equity and efficiency measures, hinder more effective health management. Nonetheless, the data are useful for providing benchmarks for performance and for crudely identifying problem areas in health systems for individual countries. PMID:10916913
Theory comparison and numerical benchmarking on neoclassical toroidal viscosity torque

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Zhirui; Park, Jong-Kyu; Logan, Nikolas

Systematic comparison and numerical benchmarking have been successfully carried out among three different approaches of neoclassical toroidal viscosity (NTV) theory and the corresponding codes: IPEC-PENT is developed based on the combined NTV theory but without geometric simplifications [Park et al., Phys. Rev. Lett. 102, 065002 (2009)]; MARS-Q includes smoothly connected NTV formula [Shaing et al., Nucl. Fusion 50, 025022 (2010)] based on Shaing's analytic formulation in various collisionality regimes; MARS-K, originally computing the drift kinetic energy, is upgraded to compute the NTV torque based on the equivalence between drift kinetic energy and NTV torque [J.-K. Park, Phys. Plasma 18, 110702more » (2011)]. The derivation and numerical results both indicate that the imaginary part of drift kinetic energy computed by MARS-K is equivalent to the NTV torque in IPEC-PENT. In the benchmark of precession resonance between MARS-Q and MARS-K/IPEC-PENT, the agreement and correlation between the connected NTV formula and the combined NTV theory in different collisionality regimes are shown for the first time. Additionally, both IPEC-PENT and MARS-K indicate the importance of the bounce harmonic resonance which can greatly enhance the NTV torque when E×B drift frequency reaches the bounce resonance condition.« less

Supportive decision making at the point of care: refinement of a case-based reasoning application for use in nursing practice.

PubMed

DI Pietro, Tammie L; Doran, Diane M; McArthur, Gregory

2010-01-01

Variations in nursing care have been observed, affecting patient outcomes and quality of care. Case-based reasoners that benchmark for patient indicators can reduce variation through decision support. This study evaluated and validated a case-based reasoning application to establish benchmarks for nursing-sensitive patient outcomes of pain, fatigue, and toilet use, using patient characteristic variables for generating similar cases. Three graduate nursing students participated. Each ranked 25 patient cases using demographics of age, sex, diagnosis, and comorbidities against 10 patients from a database. Participant judgments of case similarity were compared with the case-based reasoning system. Feature weights for each indicator were adjusted to make the case-based reasoning system's similarity ranking correspond more closely to participant judgment. Small differences were noted between initial weights and weights generated from participants. For example, initial weight for comorbidities was 0.35, whereas weights generated by participants for pain, fatigue, and toilet use were 0.49, 0.42, and 0.48, respectively. For the same outcomes, the initial weight for sex was 0.15, but weights generated by the participants were 0.025, 0.002, and 0.000, respectively. Refinement of the case-based reasoning tool established valid benchmarks for patient outcomes in relation to participants and assisted in point-of-care decision making.
CANDO and the infinite drug discovery frontier

PubMed Central

Minie, Mark; Chopra, Gaurav; Sethi, Geetika; Horst, Jeremy; White, George; Roy, Ambrish; Hatti, Kaushik; Samudrala, Ram

2014-01-01

The Computational Analysis of Novel Drug Opportunities (CANDO) platform (http://protinfo.org/cando) uses similarity of compound–proteome interaction signatures to infer homology of compound/drug behavior. We constructed interaction signatures for 3733 human ingestible compounds covering 48,278 protein structures mapping to 2030 indications based on basic science methodologies to predict and analyze protein structure, function, and interactions developed by us and others. Our signature comparison and ranking approach yielded benchmarking accuracies of 12–25% for 1439 indications with at least two approved compounds. We prospectively validated 49/82 ‘high value’ predictions from nine studies covering seven indications, with comparable or better activity to existing drugs, which serve as novel repurposed therapeutics. Our approach may be generalized to compounds beyond those approved by the FDA, and can also consider mutations in protein structures to enable personalization. Our platform provides a holistic multiscale modeling framework of complex atomic, molecular, and physiological systems with broader applications in medicine and engineering. PMID:24980786
A novel metaheuristic for continuous optimization problems: Virus optimization algorithm

NASA Astrophysics Data System (ADS)

Liang, Yun-Chia; Rodolfo Cuevas Juarez, Josue

2016-01-01

A novel metaheuristic for continuous optimization problems, named the virus optimization algorithm (VOA), is introduced and investigated. VOA is an iteratively population-based method that imitates the behaviour of viruses attacking a living cell. The number of viruses grows at each replication and is controlled by an immune system (a so-called 'antivirus') to prevent the explosive growth of the virus population. The viruses are divided into two classes (strong and common) to balance the exploitation and exploration effects. The performance of the VOA is validated through a set of eight benchmark functions, which are also subject to rotation and shifting effects to test its robustness. Extensive comparisons were conducted with over 40 well-known metaheuristic algorithms and their variations, such as artificial bee colony, artificial immune system, differential evolution, evolutionary programming, evolutionary strategy, genetic algorithm, harmony search, invasive weed optimization, memetic algorithm, particle swarm optimization and simulated annealing. The results showed that the VOA is a viable solution for continuous optimization.
Security Event Recognition for Visual Surveillance

NASA Astrophysics Data System (ADS)

Liao, W.; Yang, C.; Yang, M. Ying; Rosenhahn, B.

2017-05-01

With rapidly increasing deployment of surveillance cameras, the reliable methods for automatically analyzing the surveillance video and recognizing special events are demanded by different practical applications. This paper proposes a novel effective framework for security event analysis in surveillance videos. First, convolutional neural network (CNN) framework is used to detect objects of interest in the given videos. Second, the owners of the objects are recognized and monitored in real-time as well. If anyone moves any object, this person will be verified whether he/she is its owner. If not, this event will be further analyzed and distinguished between two different scenes: moving the object away or stealing it. To validate the proposed approach, a new video dataset consisting of various scenarios is constructed for more complex tasks. For comparison purpose, the experiments are also carried out on the benchmark databases related to the task on abandoned luggage detection. The experimental results show that the proposed approach outperforms the state-of-the-art methods and effective in recognizing complex security events.
Optimum stacking sequence design of laminated composite circular plates with curvilinear fibres by a layer-wise optimization method

NASA Astrophysics Data System (ADS)

Guenanou, A.; Houmat, A.

2018-05-01

The optimum stacking sequence design for the maximum fundamental frequency of symmetrically laminated composite circular plates with curvilinear fibres is investigated for the first time using a layer-wise optimization method. The design variables are two fibre orientation angles per layer. The fibre paths are constructed using the method of shifted paths. The first-order shear deformation plate theory and a curved square p-element are used to calculate the objective function. The blending function method is used to model accurately the geometry of the circular plate. The equations of motion are derived using Lagrange's method. The numerical results are validated by means of a convergence test and comparison with published values for symmetrically laminated composite circular plates with rectilinear fibres. The material parameters, boundary conditions, number of layers and thickness are shown to influence the optimum solutions to different extents. The results should serve as a benchmark for optimum stacking sequences of symmetrically laminated composite circular plates with curvilinear fibres.
iMODFIT: efficient and robust flexible fitting based on vibrational analysis in internal coordinates.

PubMed

Lopéz-Blanco, José Ramón; Chacón, Pablo

2013-11-01

Here, we employed the collective motions extracted from Normal Mode Analysis (NMA) in internal coordinates (torsional space) for the flexible fitting of atomic-resolution structures into electron microscopy (EM) density maps. The proposed methodology was validated using a benchmark of simulated cases, highlighting its robustness over the full range of EM resolutions and even over coarse-grained representations. A systematic comparison with other methods further showcased the advantages of this proposed methodology, especially at medium to lower resolutions. Using this method, computational costs and potential overfitting problems are naturally reduced by constraining the search in low-frequency NMA space, where covalent geometry is implicitly maintained. This method also effectively captures the macromolecular changes of a representative set of experimental test cases. We believe that this novel approach will extend the currently available EM hybrid methods to the atomic-level interpretation of large conformational changes and their functional implications. Copyright © 2013 Elsevier Inc. All rights reserved.
[2-year results of leg amputation in Hungary based on a nationwide data base].

PubMed

Kullmann, L; Belicza, E; László, G

1997-09-14

Authors review nation-wide hospital data of amputees over two years in order to make comparison with similar data gathered about 20 years ago. Data were provided by the National Medical Records Centre and processed by own developed programmes. The quality level of data validity is slightly criticised. The cause of amputation was most often vascular disease, amputees were elderly and large majority of amputation surgery was carried out on the lower limb. The rate of below-knee amputation has gone up favourably over the 20 years, but there are large regional differences within the country. Mortality parameters remarkably exceed those of foreign countries. Regarding compromised data accuracy, still there are ways of exploiting data in favour of quality improvement of care, e.g. improve below-knee amputation rate, reduce mortality. Publication of data can be of bench-marking importance for hospitals by enabling them to compare own results with those of other hospitals, as well as develop and improve performance.
Weighted interior penalty discretization of fully nonlinear and weakly dispersive free surface shallow water flows

NASA Astrophysics Data System (ADS)

Di Pietro, Daniele A.; Marche, Fabien

2018-02-01

In this paper, we further investigate the use of a fully discontinuous Finite Element discrete formulation for the study of shallow water free surface flows in the fully nonlinear and weakly dispersive flow regime. We consider a decoupling strategy in which we approximate the solutions of the classical shallow water equations supplemented with a source term globally accounting for the non-hydrostatic effects. This source term can be computed through the resolution of elliptic second-order linear sub-problems, which only involve second order partial derivatives in space. We then introduce an associated Symmetric Weighted Internal Penalty discrete bilinear form, allowing to deal with the discontinuous nature of the elliptic problem's coefficients in a stable and consistent way. Similar discrete formulations are also introduced for several recent optimized fully nonlinear and weakly dispersive models. These formulations are validated again several benchmarks involving h-convergence, p-convergence and comparisons with experimental data, showing optimal convergence properties.
TRIPOLI-4® - MCNP5 ITER A-lite neutronic model benchmarking

NASA Astrophysics Data System (ADS)

Jaboulay, J.-C.; Cayla, P.-Y.; Fausser, C.; Lee, Y.-K.; Trama, J.-C.; Li-Puma, A.

2014-06-01

The aim of this paper is to present the capability of TRIPOLI-4®, the CEA Monte Carlo code, to model a large-scale fusion reactor with complex neutron source and geometry. In the past, numerous benchmarks were conducted for TRIPOLI-4® assessment on fusion applications. Experiments (KANT, OKTAVIAN, FNG) analysis and numerical benchmarks (between TRIPOLI-4® and MCNP5) on the HCLL DEMO2007 and ITER models were carried out successively. In this previous ITER benchmark, nevertheless, only the neutron wall loading was analyzed, its main purpose was to present MCAM (the FDS Team CAD import tool) extension for TRIPOLI-4®. Starting from this work a more extended benchmark has been performed about the estimation of neutron flux, nuclear heating in the shielding blankets and tritium production rate in the European TBMs (HCLL and HCPB) and it is presented in this paper. The methodology to build the TRIPOLI-4® A-lite model is based on MCAM and the MCNP A-lite model (version 4.1). Simplified TBMs (from KIT) have been integrated in the equatorial-port. Comparisons of neutron wall loading, flux, nuclear heating and tritium production rate show a good agreement between the two codes. Discrepancies are mainly included in the Monte Carlo codes statistical error.
WWTP dynamic disturbance modelling--an essential module for long-term benchmarking development.

PubMed

Gernaey, K V; Rosen, C; Jeppsson, U

2006-01-01

Intensive use of the benchmark simulation model No. 1 (BSM1), a protocol for objective comparison of the effectiveness of control strategies in biological nitrogen removal activated sludge plants, has also revealed a number of limitations. Preliminary definitions of the long-term benchmark simulation model No. 1 (BSM1_LT) and the benchmark simulation model No. 2 (BSM2) have been made to extend BSM1 for evaluation of process monitoring methods and plant-wide control strategies, respectively. Influent-related disturbances for BSM1_LT/BSM2 are to be generated with a model, and this paper provides a general overview of the modelling methods used. Typical influent dynamic phenomena generated with the BSM1_LT/BSM2 influent disturbance model, including diurnal, weekend, seasonal and holiday effects, as well as rainfall, are illustrated with simulation results. As a result of the work described in this paper, a proposed influent model/file has been released to the benchmark developers for evaluation purposes. Pending this evaluation, a final BSM1_LT/BSM2 influent disturbance model definition is foreseen. Preliminary simulations with dynamic influent data generated by the influent disturbance model indicate that default BSM1 activated sludge plant control strategies will need extensions for BSM1_LT/BSM2 to efficiently handle 1 year of influent dynamics.
Accelerating progress in Artificial General Intelligence: Choosing a benchmark for natural world interaction

NASA Astrophysics Data System (ADS)

Rohrer, Brandon

2010-12-01

Measuring progress in the field of Artificial General Intelligence (AGI) can be difficult without commonly accepted methods of evaluation. An AGI benchmark would allow evaluation and comparison of the many computational intelligence algorithms that have been developed. In this paper I propose that a benchmark for natural world interaction would possess seven key characteristics: fitness, breadth, specificity, low cost, simplicity, range, and task focus. I also outline two benchmark examples that meet most of these criteria. In the first, the direction task, a human coach directs a machine to perform a novel task in an unfamiliar environment. The direction task is extremely broad, but may be idealistic. In the second, the AGI battery, AGI candidates are evaluated based on their performance on a collection of more specific tasks. The AGI battery is designed to be appropriate to the capabilities of currently existing systems. Both the direction task and the AGI battery would require further definition before implementing. The paper concludes with a description of a task that might be included in the AGI battery: the search and retrieve task.
Benchmarking NNWSI flow and transport codes: COVE 1 results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hayden, N.K.

1985-06-01

The code verification (COVE) activity of the Nevada Nuclear Waste Storage Investigations (NNWSI) Project is the first step in certification of flow and transport codes used for NNWSI performance assessments of a geologic repository for disposing of high-level radioactive wastes. The goals of the COVE activity are (1) to demonstrate and compare the numerical accuracy and sensitivity of certain codes, (2) to identify and resolve problems in running typical NNWSI performance assessment calculations, and (3) to evaluate computer requirements for running the codes. This report describes the work done for COVE 1, the first step in benchmarking some of themore » codes. Isothermal calculations for the COVE 1 benchmarking have been completed using the hydrologic flow codes SAGUARO, TRUST, and GWVIP; the radionuclide transport codes FEMTRAN and TRUMP; and the coupled flow and transport code TRACR3D. This report presents the results of three cases of the benchmarking problem solved for COVE 1, a comparison of the results, questions raised regarding sensitivities to modeling techniques, and conclusions drawn regarding the status and numerical sensitivities of the codes. 30 refs.« less
The National Practice Benchmark for oncology, 2014 report on 2013 data.

PubMed

Towle, Elaine L; Barr, Thomas R; Senese, James L

2014-11-01

The National Practice Benchmark (NPB) is a unique tool to measure oncology practices against others across the country in a way that allows meaningful comparisons despite differences in practice size or setting. In today's economic environment every oncology practice, regardless of business structure or affiliation, should be able to produce, monitor, and benchmark basic metrics to meet current business pressures for increased efficiency and efficacy of care. Although we recognize that the NPB survey results do not capture the experience of all oncology practices, practices that can and do participate demonstrate exceptional managerial capability, and this year those practices are recognized for their participation. In this report, we continue to emphasize the methodology introduced last year in which we reported medical revenue net of the cost of the drugs as net medical revenue for the hematology/oncology product line. The effect of this is to capture only the gross margin attributable to drugs as revenue. New this year, we introduce six measures of clinical data density and expand the radiation oncology benchmarks. Copyright © 2014 by American Society of Clinical Oncology.
Public Interest Energy Research (PIER) Program Development of a Computer-based Benchmarking and Analytical Tool. Benchmarking and Energy & Water Savings Tool in Dairy Plants (BEST-Dairy)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Tengfang; Flapper, Joris; Ke, Jing

The overall goal of the project is to develop a computer-based benchmarking and energy and water savings tool (BEST-Dairy) for use in the California dairy industry - including four dairy processes - cheese, fluid milk, butter, and milk powder. BEST-Dairy tool developed in this project provides three options for the user to benchmark each of the dairy product included in the tool, with each option differentiated based on specific detail level of process or plant, i.e., 1) plant level; 2) process-group level, and 3) process-step level. For each detail level, the tool accounts for differences in production and other variablesmore » affecting energy use in dairy processes. The dairy products include cheese, fluid milk, butter, milk powder, etc. The BEST-Dairy tool can be applied to a wide range of dairy facilities to provide energy and water savings estimates, which are based upon the comparisons with the best available reference cases that were established through reviewing information from international and national samples. We have performed and completed alpha- and beta-testing (field testing) of the BEST-Dairy tool, through which feedback from voluntary users in the U.S. dairy industry was gathered to validate and improve the tool's functionality. BEST-Dairy v1.2 was formally published in May 2011, and has been made available for free downloads from the internet (i.e., http://best-dairy.lbl.gov). A user's manual has been developed and published as the companion documentation for use with the BEST-Dairy tool. In addition, we also carried out technology transfer activities by engaging the dairy industry in the process of tool development and testing, including field testing, technical presentations, and technical assistance throughout the project. To date, users from more than ten countries in addition to those in the U.S. have downloaded the BEST-Dairy from the LBNL website. It is expected that the use of BEST-Dairy tool will advance understanding of energy and water usage in individual dairy plants, augment benchmarking activities in the market places, and facilitate implementation of efficiency measures and strategies to save energy and water usage in the dairy industry. Industrial adoption of this emerging tool and technology in the market is expected to benefit dairy plants, which are important customers of California utilities. Further demonstration of this benchmarking tool is recommended, for facilitating its commercialization and expansion in functions of the tool. Wider use of this BEST-Dairy tool and its continuous expansion (in functionality) will help to reduce the actual consumption of energy and water in the dairy industry sector. The outcomes comply very well with the goals set by the AB 1250 for PIER program.« less
A shallow water table fluctuation model in response to precipitation with consideration of unsaturated gravitational flow

NASA Astrophysics Data System (ADS)

Park, E.; Jeong, J.

2017-12-01

A precise estimation of groundwater fluctuation is studied by considering delayed recharge flux (DRF) and unsaturated zone drainage (UZD). Both DRF and UZD are due to gravitational flow impeded in the unsaturated zone, which may nonnegligibly affect groundwater level changes. In the validation, a previous model without the consideration of unsaturated flow is benchmarked where the actual groundwater level and precipitation data are divided into three periods based on the climatic condition. The estimation capability of the new model is superior to the benchmarked model as indicated by the significantly improved representation of groundwater level with physically interpretable model parameters.
Overview of the 2014 Edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook)

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; J. Blair Briggs; Jim Gulliford

2014-10-01

The International Reactor Physics Experiment Evaluation Project (IRPhEP) is a widely recognized world class program. The work of the IRPhEP is documented in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook). Integral data from the IRPhEP Handbook is used by reactor safety and design, nuclear data, criticality safety, and analytical methods development specialists, worldwide, to perform necessary validations of their calculational techniques. The IRPhEP Handbook is among the most frequently quoted reference in the nuclear industry and is expected to be a valuable resource for future decades.
Benchmarking Controlled Trial—a novel concept covering all observational effectiveness studies

PubMed Central

Malmivaara, Antti

2015-01-01

Abstract The Benchmarking Controlled Trial (BCT) is a novel concept which covers all observational studies aiming to assess effectiveness. BCTs provide evidence of the comparative effectiveness between health service providers, and of effectiveness due to particular features of the health and social care systems. BCTs complement randomized controlled trials (RCTs) as the sources of evidence on effectiveness. This paper presents a definition of the BCT; compares the position of BCTs in assessing effectiveness with that of RCTs; presents a checklist for assessing methodological validity of a BCT; and pilot-tests the checklist with BCTs published recently in the leading medical journals. PMID:25965700
Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data.

PubMed

Rohrer, Sebastian G; Baumann, Knut

2009-02-01

Refined nearest neighbor analysis was recently introduced for the analysis of virtual screening benchmark data sets. It constitutes a technique from the field of spatial statistics and provides a mathematical framework for the nonparametric analysis of mapped point patterns. Here, refined nearest neighbor analysis is used to design benchmark data sets for virtual screening based on PubChem bioactivity data. A workflow is devised that purges data sets of compounds active against pharmaceutically relevant targets from unselective hits. Topological optimization using experimental design strategies monitored by refined nearest neighbor analysis functions is applied to generate corresponding data sets of actives and decoys that are unbiased with regard to analogue bias and artificial enrichment. These data sets provide a tool for Maximum Unbiased Validation (MUV) of virtual screening methods. The data sets and a software package implementing the MUV design workflow are freely available at http://www.pharmchem.tu-bs.de/lehre/baumann/MUV.html.
Benchmark Comparison of Dual- and Quad-Core Processor Linux Clusters with Two Global Climate Modeling Workloads

NASA Technical Reports Server (NTRS)

McGalliard, James

2008-01-01

This viewgraph presentation details the science and systems environments that NASA High End computing program serves. Included is a discussion of the workload that is involved in the processing for the Global Climate Modeling. The Goddard Earth Observing System Model, Version 5 (GEOS-5) is a system of models integrated using the Earth System Modeling Framework (ESMF). The GEOS-5 system was used for the Benchmark tests, and the results of the tests are shown and discussed. Tests were also run for the Cubed Sphere system, results for these test are also shown.
Results of the 2013 UT modeling benchmark obtained with models implemented in CIVA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Toullelan, Gwénaël; Raillon, Raphaële; Chatillon, Sylvain

The 2013 Ultrasonic Testing (UT) modeling benchmark concerns direct echoes from side drilled holes (SDH), flat bottom holes (FBH) and corner echoes from backwall breaking artificial notches inspected with a matrix phased array probe. This communication presents the results obtained with the models implemented in the CIVA software: the pencilmodel is used to compute the field radiated by the probe, the Kirchhoff approximation is applied to predict the response of FBH and notches and the SOV (Separation Of Variables) model is used for the SDH responses. The comparison between simulated and experimental results are presented and discussed.

Benchmarking for the competitive marketplace.

PubMed

Clarke, R W; Sucher, T O

1999-07-01

One would get little argument these days regarding the importance of performance measurement in the health care industry. The traditional approach has been the straightforward use of measurable units such as financial comparisons and clinical indicators (e.g., length of stay). Also we in the health care industry have traditionally benchmarked our performance and strategies against those most like ourselves. Today's competitive market demands a more customer-focused set of performance measures that go beyond traditional approaches such as customer service. The most important task in today's environment is to study the customers' emerging priorities and adjust our business to meet those priorities.
[Does implementation of benchmarking in quality circles improve the quality of care of patients with asthma and reduce drug interaction?].

PubMed

Kaufmann-Kolle, Petra; Szecsenyi, Joachim; Broge, Björn; Haefeli, Walter Emil; Schneider, Antonius

2011-01-01

The purpose of this cluster-randomised controlled trial was to evaluate the efficacy of quality circles (QCs) working either with general data-based feedback or with an open benchmark within the field of asthma care and drug-drug interactions. Twelve QCs, involving 96 general practitioners from 85 practices, were randomised. Six QCs worked with traditional anonymous feedback and six with an open benchmark. Two QC meetings supported with feedback reports were held covering the topics "drug-drug interactions" and "asthma"; in both cases discussions were guided by a trained moderator. Outcome measures included health-related quality of life and patient satisfaction with treatment, asthma severity and number of potentially inappropriate drug combinations as well as the general practitioners' satisfaction in relation to the performance of the QC. A significant improvement in the treatment of asthma was observed in both trial arms. However, there was only a slight improvement regarding inappropriate drug combinations. There were no relevant differences between the group with open benchmark (B-QC) and traditional quality circles (T-QC). The physicians' satisfaction with the QC performance was significantly higher in the T-QCs. General practitioners seem to take a critical perspective about open benchmarking in quality circles. Caution should be used when implementing benchmarking in a quality circle as it did not improve healthcare when compared to the traditional procedure with anonymised comparisons. Copyright © 2011. Published by Elsevier GmbH.
A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database.

PubMed

Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin

2015-12-01

Face recognition with still face images has been widely studied, while the research on video-based face recognition is inadequate relatively, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Videoto-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively, taking video or still image as query or target. To the best of our knowledge, few datasets and evaluation protocols have benchmarked for all the three scenarios. In order to facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX(1) Face DB. Specifically, we make three contributions. First, we collect and release a largescale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more efforts, and our COX Face DB is a good benchmark database for evaluation.
Comparative modeling and benchmarking data sets for human histone deacetylases and sirtuin families.

PubMed

Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

2015-02-23

Histone deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases, and other types of diseases. Virtual screening (VS) has become fairly effective approaches for drug discovery of novel and highly selective histone deacetylase inhibitors (HDACIs). To facilitate the process, we constructed maximal unbiased benchmarking data sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs cover all four classes including Class III (Sirtuins family) and 14 HDAC isoforms, composed of 631 inhibitors and 24609 unbiased decoys. Its ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matching with ligands and maximal unbiased in terms of "artificial enrichment" and "analogue bias". We also conducted comparative studies with DUD-E and DEKOIS 2.0 sets against HDAC2 and HDAC8 targets and demonstrate that our MUBD-HDACs are unique in that they can be applied unbiasedly to both LBVS and SBVS approaches. In addition, we defined a novel metric, i.e. NLBScore, to detect the "2D bias" and "LBVS favorable" effect within the benchmarking sets. In summary, MUBD-HDACs are the only comprehensive and maximal-unbiased benchmark data sets for HDACs (including Sirtuins) that are available so far. MUBD-HDACs are freely available at http://www.xswlab.org/ .
Benchmarking all-atom simulations using hydrogen exchange

DOE Office of Scientific and Technical Information (OSTI.GOV)

Skinner, John J.; Yu, Wookyung; Gichana, Elizabeth K.

We are now able to fold small proteins reversibly to their native structures [Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) Science 334(6055):517–520] using long-time molecular dynamics (MD) simulations. Our results indicate that modern force fields can reproduce the energy surface near the native structure. In this paper, to test how well the force fields recapitulate the other regions of the energy surface, MD trajectories for a variant of protein G are compared with data from site-resolved hydrogen exchange (HX) and other biophysical measurements. Because HX monitors the breaking of individual H-bonds, this experimental technique identifies the stability andmore » H-bond content of excited states, thus enabling quantitative comparison with the simulations. Contrary to experimental findings of a cooperative, all-or-none unfolding process, the simulated denatured state ensemble, on average, is highly collapsed with some transient or persistent native 2° structure. The MD trajectories of this protein G variant and other small proteins exhibit excessive intramolecular H-bonding even for the most expanded conformations, suggesting that the force fields require improvements in describing H-bonding and backbone hydration. Finally and moreover, these comparisons provide a general protocol for validating the ability of simulations to accurately capture rare structural fluctuations.« less
Low-lying excited states of model proteins: Performances of the CC2 method versus multireference methods

NASA Astrophysics Data System (ADS)

Ben Amor, Nadia; Hoyau, Sophie; Maynau, Daniel; Brenner, Valérie

2018-05-01

A benchmark set of relevant geometries of a model protein, the N-acetylphenylalanylamide, is presented to assess the validity of the approximate second-order coupled cluster (CC2) method in studying low-lying excited states of such bio-relevant systems. The studies comprise investigations of basis-set dependence as well as comparison with two multireference methods, the multistate complete active space 2nd order perturbation theory (MS-CASPT2) and the multireference difference dedicated configuration interaction (DDCI) methods. First of all, the applicability and the accuracy of the quasi-linear multireference difference dedicated configuration interaction method have been demonstrated on bio-relevant systems by comparison with the results obtained by the standard MS-CASPT2. Second, both the nature and excitation energy of the first low-lying excited state obtained at the CC2 level are very close to the Davidson corrected CAS+DDCI ones, the mean absolute deviation on the excitation energy being equal to 0.1 eV with a maximum of less than 0.2 eV. Finally, for the following low-lying excited states, if the nature is always well reproduced at the CC2 level, the differences on excitation energies become more important and can depend on the geometry.
Low-lying excited states of model proteins: Performances of the CC2 method versus multireference methods.

PubMed

Ben Amor, Nadia; Hoyau, Sophie; Maynau, Daniel; Brenner, Valérie

2018-05-14

A benchmark set of relevant geometries of a model protein, the N-acetylphenylalanylamide, is presented to assess the validity of the approximate second-order coupled cluster (CC2) method in studying low-lying excited states of such bio-relevant systems. The studies comprise investigations of basis-set dependence as well as comparison with two multireference methods, the multistate complete active space 2nd order perturbation theory (MS-CASPT2) and the multireference difference dedicated configuration interaction (DDCI) methods. First of all, the applicability and the accuracy of the quasi-linear multireference difference dedicated configuration interaction method have been demonstrated on bio-relevant systems by comparison with the results obtained by the standard MS-CASPT2. Second, both the nature and excitation energy of the first low-lying excited state obtained at the CC2 level are very close to the Davidson corrected CAS+DDCI ones, the mean absolute deviation on the excitation energy being equal to 0.1 eV with a maximum of less than 0.2 eV. Finally, for the following low-lying excited states, if the nature is always well reproduced at the CC2 level, the differences on excitation energies become more important and can depend on the geometry.
A comprehensive comparison of tools for differential ChIP-seq analysis

PubMed Central

Steinhauser, Sebastian; Kurzawa, Nils; Eils, Roland

2016-01-01

ChIP-seq has become a widely adopted genomic assay in recent years to determine binding sites for transcription factors or enrichments for specific histone modifications. Beside detection of enriched or bound regions, an important question is to determine differences between conditions. While this is a common analysis for gene expression, for which a large number of computational approaches have been validated, the same question for ChIP-seq is particularly challenging owing to the complexity of ChIP-seq data in terms of noisiness and variability. Many different tools have been developed and published in recent years. However, a comprehensive comparison and review of these tools is still missing. Here, we have reviewed 14 tools, which have been developed to determine differential enrichment between two conditions. They differ in their algorithmic setups, and also in the range of applicability. Hence, we have benchmarked these tools on real data sets for transcription factors and histone modifications, as well as on simulated data sets to quantitatively evaluate their performance. Overall, there is a great variety in the type of signal detected by these tools with a surprisingly low level of agreement. Depending on the type of analysis performed, the choice of method will crucially impact the outcome. PMID:26764273
Benchmarking all-atom simulations using hydrogen exchange

DOE PAGES

Skinner, John J.; Yu, Wookyung; Gichana, Elizabeth K.; ...

2014-10-27

We are now able to fold small proteins reversibly to their native structures [Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) Science 334(6055):517–520] using long-time molecular dynamics (MD) simulations. Our results indicate that modern force fields can reproduce the energy surface near the native structure. In this paper, to test how well the force fields recapitulate the other regions of the energy surface, MD trajectories for a variant of protein G are compared with data from site-resolved hydrogen exchange (HX) and other biophysical measurements. Because HX monitors the breaking of individual H-bonds, this experimental technique identifies the stability andmore » H-bond content of excited states, thus enabling quantitative comparison with the simulations. Contrary to experimental findings of a cooperative, all-or-none unfolding process, the simulated denatured state ensemble, on average, is highly collapsed with some transient or persistent native 2° structure. The MD trajectories of this protein G variant and other small proteins exhibit excessive intramolecular H-bonding even for the most expanded conformations, suggesting that the force fields require improvements in describing H-bonding and backbone hydration. Finally and moreover, these comparisons provide a general protocol for validating the ability of simulations to accurately capture rare structural fluctuations.« less
Setup for investigating gold nanoparticle penetration through reconstructed skin and comparison to published human skin data

NASA Astrophysics Data System (ADS)

Labouta, Hagar I.; Thude, Sibylle; Schneider, Marc

2013-06-01

Owing to the limited source of human skin (HS) and the ethical restrictions of using animals in experiments, in vitro skin equivalents are a possible alternative for conducting particle penetration experiments. The conditions for conducting penetration experiments with model particles, 15-nm gold nanoparticles (AuNP), through nonsealed skin equivalents are described for the first time. These conditions include experimental setup, sterility conditions, effective applied dose determination, skin sectioning, and skin integrity check. Penetration at different exposure times (two and 24 h) and after tissue fixation (fixed versus unfixed skin) are examined to establish a benchmark in comparison to HS in an attempt to get similar results to HS experiments presented earlier. Multiphoton microscopy is used to detect gold luminescence in skin sections. λex=800 nm is used for excitation of AuNP and skin samples, allowing us to determine a relative index for particle penetration. Despite the observed overpredictability of penetration into skin equivalents, they could serve as a first fast screen for testing the behavior of nanoparticles and extrapolate their penetration behavior into HS. Further investigations are required to test a wide range of particles of different physicochemical properties to validate the skin equivalent-human skin particle penetration relationship.
Benchmarking for On-Scalp MEG Sensors.

PubMed

Xie, Minshu; Schneiderman, Justin F; Chukharkin, Maxim L; Kalabukhov, Alexei; Riaz, Bushra; Lundqvist, Daniel; Whitmarsh, Stephen; Hamalainen, Matti; Jousmaki, Veikko; Oostenveld, Robert; Winkler, Dag

2017-06-01

We present a benchmarking protocol for quantitatively comparing emerging on-scalp magnetoencephalography (MEG) sensor technologies to their counterparts in state-of-the-art MEG systems. As a means of validation, we compare a high-critical-temperature superconducting quantum interference device (high T c SQUID) with the low- T c SQUIDs of an Elekta Neuromag TRIUX system in MEG recordings of auditory and somatosensory evoked fields (SEFs) on one human subject. We measure the expected signal gain for the auditory-evoked fields (deeper sources) and notice some unfamiliar features in the on-scalp sensor-based recordings of SEFs (shallower sources). The experimental results serve as a proof of principle for the benchmarking protocol. This approach is straightforward, general to various on-scalp MEG sensors, and convenient to use on human subjects. The unexpected features in the SEFs suggest on-scalp MEG sensors may reveal information about neuromagnetic sources that is otherwise difficult to extract from state-of-the-art MEG recordings. As the first systematically established on-scalp MEG benchmarking protocol, magnetic sensor developers can employ this method to prove the utility of their technology in MEG recordings. Further exploration of the SEFs with on-scalp MEG sensors may reveal unique information about their sources.
Turbulent Dispersion Modelling in a Complex Urban Environment - Data Analysis and Model Development

DTIC Science & Technology

2010-02-01

Technology Laboratory (Dstl) is used as a benchmark for comparison. Comparisons are also made with some more practically oriented computational fluid dynamics...predictions. To achieve clarity in the range of approaches available for practical models of con- taminant dispersion in urban areas, an overview of...complexity of those methods is simplified to a degree that allows straightforward practical implementation and application. Using these results as a
Testing and validating environmental models

USGS Publications Warehouse

Kirchner, J.W.; Hooper, R.P.; Kendall, C.; Neal, C.; Leavesley, G.

1996-01-01

Generally accepted standards for testing and validating ecosystem models would benefit both modellers and model users. Universally applicable test procedures are difficult to prescribe, given the diversity of modelling approaches and the many uses for models. However, the generally accepted scientific principles of documentation and disclosure provide a useful framework for devising general standards for model evaluation. Adequately documenting model tests requires explicit performance criteria, and explicit benchmarks against which model performance is compared. A model's validity, reliability, and accuracy can be most meaningfully judged by explicit comparison against the available alternatives. In contrast, current practice is often characterized by vague, subjective claims that model predictions show 'acceptable' agreement with data; such claims provide little basis for choosing among alternative models. Strict model tests (those that invalid models are unlikely to pass) are the only ones capable of convincing rational skeptics that a model is probably valid. However, 'false positive' rates as low as 10% can substantially erode the power of validation tests, making them insufficiently strict to convince rational skeptics. Validation tests are often undermined by excessive parameter calibration and overuse of ad hoc model features. Tests are often also divorced from the conditions under which a model will be used, particularly when it is designed to forecast beyond the range of historical experience. In such situations, data from laboratory and field manipulation experiments can provide particularly effective tests, because one can create experimental conditions quite different from historical data, and because experimental data can provide a more precisely defined 'target' for the model to hit. We present a simple demonstration showing that the two most common methods for comparing model predictions to environmental time series (plotting model time series against data time series, and plotting predicted versus observed values) have little diagnostic power. We propose that it may be more useful to statistically extract the relationships of primary interest from the time series, and test the model directly against them.
Stress Response as a Function of Task Relevance

DTIC Science & Technology

2010-12-01

be benchmarked for validity and reliability. The State-Trait Anxiety Index (or STAI; Spielberger and Sydeman, 1994) is a popular self-report...and human performance. In J.E. Driskell & E. Salas (Eds.), Stress and Human Performance Spielberger , C.D. and Sydeman, S.J. (1994). State-Trait
User-Centric Approach for Benchmark RDF Data Generator in Big Data Performance Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Purohit, Sumit; Paulson, Patrick R.; Rodriguez, Luke R.

This research focuses on user-centric approach of building such tools and proposes a flexible, extensible, and easy to use framework to support performance analysis of Big Data systems. Finally, case studies from two different domains are presented to validate the framework.
Analytical validation and establishment of reference intervals for a 'high-sensitivity' cardiac troponin-T assay in horses.

PubMed

Shields, E; Seiden-Long, I; Massie, S; Passante, S; Leguillette, R

2016-06-13

Cardiac troponin-I assays have been validated in horses.'High-sensitivity' cardiac troponin assays are now the standard in human cardiology. Appropriately validate the'high-sensitivity' cardiac Troponin-T (hscTnT) assay for clinical use in horses, establish reference intervals, determine the biological variation, and demonstrate assay utility in selected clinical cases. Analytical validation of the Roche hscTnT assay included within- and between-run precision, linear dose response, limit of quantitation (LoQ), stability, and comparison with cTn-I (iSTAT). Reference intervals and biological variation were determined using adult, healthy, Non-Competition Horses (N = 125) and Racing-Thoroughbreds (N = 178). HscTnT levels were measured in two horses with cardiac pathology. The hscTnT demonstrates acceptable within-run (L1 = 6.5 ng/L, CV 14.9 %, L2 = 10.1 ng/L, CV 8.7 %, L3 = 15.3 ng/L, CV 5.4 %) and between-run precision (L1 = 12.2 ng/L, CV 8.4 %, L2 = 57.0 ng/L, CV 8.4 %, L3 = 256.0 ng/L, CV 9.0 %). The assay was linear from 3 to 391 ng/L. The LoQ was validated at 3 ng/L. Samples demonstrated insignificant decay over freeze-thaw cycle. Comparison with cTnI assay showed excellent correlation (range: 8.0-3535.0 ng/L, R(2) = 0.9996). Reference intervals: The upper 95(th) and 99(th) percentile of the hscTnT population distribution were 6.8 and 16.2 ng/L in Non-Competition Horses, and 14.0 and 23.2 ng/L in Racing-Thoroughbreds. Between-breed, diurnal effect, and between-day variation was below LoQ. Two clinical cases with presumed cardiac pathology had hscTnT levels of 220.9 ng/L and 5723.0 ng/L. This benchmark study is the first to comply with CLSI guidelines, thus further establishing the performance characteristics of the hscTnT assay, and reference intervals in healthy horses. Two clinical cases demonstrated further the clinical utility of the assay.
Benchmarks for single-phase flow in fractured porous media

NASA Astrophysics Data System (ADS)

Flemisch, Bernd; Berre, Inga; Boon, Wietse; Fumagalli, Alessio; Schwenck, Nicolas; Scotti, Anna; Stefansson, Ivar; Tatomir, Alexandru

2018-01-01

This paper presents several test cases intended to be benchmarks for numerical schemes for single-phase fluid flow in fractured porous media. A number of solution strategies are compared, including a vertex and two cell-centred finite volume methods, a non-conforming embedded discrete fracture model, a primal and a dual extended finite element formulation, and a mortar discrete fracture model. The proposed benchmarks test the schemes by increasing the difficulties in terms of network geometry, e.g. intersecting fractures, and physical parameters, e.g. low and high fracture-matrix permeability ratio as well as heterogeneous fracture permeabilities. For each problem, the results presented are the number of unknowns, the approximation errors in the porous matrix and in the fractures with respect to a reference solution, and the sparsity and condition number of the discretized linear system. All data and meshes used in this study are publicly available for further comparisons.
A note on bound constraints handling for the IEEE CEC'05 benchmark function suite.

PubMed

Liao, Tianjun; Molina, Daniel; de Oca, Marco A Montes; Stützle, Thomas

2014-01-01

The benchmark functions and some of the algorithms proposed for the special session on real parameter optimization of the 2005 IEEE Congress on Evolutionary Computation (CEC'05) have played and still play an important role in the assessment of the state of the art in continuous optimization. In this article, we show that if bound constraints are not enforced for the final reported solutions, state-of-the-art algorithms produce infeasible best candidate solutions for the majority of functions of the IEEE CEC'05 benchmark function suite. This occurs even though the optima of the CEC'05 functions are within the specified bounds. This phenomenon has important implications on algorithm comparisons, and therefore on algorithm designs. This article's goal is to draw the attention of the community to the fact that some authors might have drawn wrong conclusions from experiments using the CEC'05 problems.
Comparing health system performance assessment and management approaches in the Netherlands and Ontario, Canada

PubMed Central

Tawfik-Shukor, Ali R; Klazinga, Niek S; Arah, Onyebuchi A

2007-01-01

Background Given the proliferation and the growing complexity of performance measurement initiatives in many health systems, the Netherlands and Ontario, Canada expressed interests in cross-national comparisons in an effort to promote knowledge transfer and best practise. To support this cross-national learning, a study was undertaken to compare health system performance approaches in The Netherlands with Ontario, Canada. Methods We explored the performance assessment framework and system of each constituency, the embeddedness of performance data in management and policy processes, and the interrelationships between the frameworks. Methods used included analysing governmental strategic planning and policy documents, literature and internet searches, comparative descriptive tables, and schematics. Data collection and analysis took place in Ontario and The Netherlands. A workshop to validate and discuss the findings was conducted in Toronto, adding important insights to the study. Results Both Ontario and The Netherlands conceive health system performance within supportive frameworks. However they differ in their assessment approaches. Ontario's Scorecard links performance measurement with strategy, aimed at health system integration. The Dutch Health Care Performance Report (Zorgbalans) does not explicitly link performance with strategy, and focuses on the technical quality of healthcare by measuring dimensions of quality, access, and cost against healthcare needs. A backbone 'five diamond' framework maps both frameworks and articulates the interrelations and overlap between their goals, themes, dimensions and indicators. The workshop yielded more contextual insights and further validated the comparative values of each constituency's performance assessment system. Conclusion To compare the health system performance approaches between The Netherlands and Ontario, Canada, several important conceptual and contextual issues must be addressed, before even attempting any future content comparisons and benchmarking. Such issues would lend relevant interpretational credibility to international comparative assessments of the two health systems. PMID:17319947
Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas (GPS - TTBP) Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chame, Jacqueline

2011-05-27

The goal of this project is the development of the Gyrokinetic Toroidal Code (GTC) Framework and its applications to problems related to the physics of turbulence and turbulent transport in tokamaks,. The project involves physics studies, code development, noise effect mitigation, supporting computer science efforts, diagnostics and advanced visualizations, verification and validation. Its main scientific themes are mesoscale dynamics and non-locality effects on transport, the physics of secondary structures such as zonal flows, and strongly coherent wave-particle interaction phenomena at magnetic precession resonances. Special emphasis is placed on the implications of these themes for rho-star and current scalings and formore » the turbulent transport of momentum. GTC-TTBP also explores applications to electron thermal transport, particle transport; ITB formation and cross-cuts such as edge-core coupling, interaction of energetic particles with turbulence and neoclassical tearing mode trigger dynamics. Code development focuses on major initiatives in the development of full-f formulations and the capacity to simulate flux-driven transport. In addition to the full-f -formulation, the project includes the development of numerical collision models and methods for coarse graining in phase space. Verification is pursued by linear stability study comparisons with the FULL and HD7 codes and by benchmarking with the GKV, GYSELA and other gyrokinetic simulation codes. Validation of gyrokinetic models of ion and electron thermal transport is pursed by systematic stressing comparisons with fluctuation and transport data from the DIII-D and NSTX tokamaks. The physics and code development research programs are supported by complementary efforts in computer sciences, high performance computing, and data management.« less

31 CFR 30.11 - Q-11: Are TARP recipients required to meet any other standards under the executive compensation...

Code of Federal Regulations, 2011 CFR

2011-07-01

..., including any “benchmarking” or comparisons employed to identify certain percentile levels of compensation (for example, entities used for benchmarking and a justification for using these entities and the...
MIMoSA: An Automated Method for Intermodal Segmentation Analysis of Multiple Sclerosis Brain Lesions.

PubMed

Valcarcel, Alessandra M; Linn, Kristin A; Vandekar, Simon N; Satterthwaite, Theodore D; Muschelli, John; Calabresi, Peter A; Pham, Dzung L; Martin, Melissa Lynne; Shinohara, Russell T

2018-03-08

Magnetic resonance imaging (MRI) is crucial for in vivo detection and characterization of white matter lesions (WMLs) in multiple sclerosis. While WMLs have been studied for over two decades using MRI, automated segmentation remains challenging. Although the majority of statistical techniques for the automated segmentation of WMLs are based on single imaging modalities, recent advances have used multimodal techniques for identifying WMLs. Complementary modalities emphasize different tissue properties, which help identify interrelated features of lesions. Method for Inter-Modal Segmentation Analysis (MIMoSA), a fully automatic lesion segmentation algorithm that utilizes novel covariance features from intermodal coupling regression in addition to mean structure to model the probability lesion is contained in each voxel, is proposed. MIMoSA was validated by comparison with both expert manual and other automated segmentation methods in two datasets. The first included 98 subjects imaged at Johns Hopkins Hospital in which bootstrap cross-validation was used to compare the performance of MIMoSA against OASIS and LesionTOADS, two popular automatic segmentation approaches. For a secondary validation, a publicly available data from a segmentation challenge were used for performance benchmarking. In the Johns Hopkins study, MIMoSA yielded average Sørensen-Dice coefficient (DSC) of .57 and partial AUC of .68 calculated with false positive rates up to 1%. This was superior to performance using OASIS and LesionTOADS. The proposed method also performed competitively in the segmentation challenge dataset. MIMoSA resulted in statistically significant improvements in lesion segmentation performance compared with LesionTOADS and OASIS, and performed competitively in an additional validation study. Copyright © 2018 by the American Society of Neuroimaging.
Dose specification for hippocampal sparing whole brain radiotherapy (HS WBRT): considerations from the UK HIPPO trial QA programme.

PubMed

Megias, Daniel; Phillips, Mark; Clifton-Hadley, Laura; Harron, Elizabeth; Eaton, David J; Sanghera, Paul; Whitfield, Gillian

2017-03-01

The HIPPO trial is a UK randomized Phase II trial of hippocampal sparing (HS) vs conventional whole-brain radiotherapy after surgical resection or radiosurgery in patients with favourable prognosis with 1-4 brain metastases. Each participating centre completed a planning benchmark case as part of the dedicated radiotherapy trials quality assurance programme (RTQA), promoting the safe and effective delivery of HS intensity-modulated radiotherapy (IMRT) in a multicentre trial setting. Submitted planning benchmark cases were reviewed using visualization for radiotherapy software (VODCA) evaluating plan quality and compliance in relation to the HIPPO radiotherapy planning and delivery guidelines. Comparison of the planning benchmark data highlighted a plan specified using dose to medium as an outlier by comparison with those specified using dose to water. Further evaluation identified that the reported plan statistics for dose to medium were lower as a result of the dose calculated at regions of PTV inclusive of bony cranium being lower relative to brain. Specification of dose to water or medium remains a source of potential ambiguity and it is essential that as part of a multicentre trial, consideration is given to reported differences, particularly in the presence of bone. Evaluation of planning benchmark data as part of an RTQA programme has highlighted an important feature of HS IMRT dosimetry dependent on dose being specified to water or medium, informing the development and undertaking of HS IMRT as part of the HIPPO trial. Advances in knowledge: The potential clinical impact of differences between dose to medium and dose to water are demonstrated for the first time, in the setting of HS whole-brain radiotherapy.
ORANGE: a Monte Carlo dose engine for radiotherapy.

PubMed

van der Zee, W; Hogenbirk, A; van der Marck, S C

2005-02-21

This study presents data for the verification of ORANGE, a fast MCNP-based dose engine for radiotherapy treatment planning. In order to verify the new algorithm, it has been benchmarked against DOSXYZ and against measurements. For the benchmarking, first calculations have been done using the ICCR-XIII benchmark. Next, calculations have been done with DOSXYZ and ORANGE in five different phantoms (one homogeneous, two with bone equivalent inserts and two with lung equivalent inserts). The calculations have been done with two mono-energetic photon beams (2 MeV and 6 MeV) and two mono-energetic electron beams (10 MeV and 20 MeV). Comparison of the calculated data (from DOSXYZ and ORANGE) against measurements was possible for a realistic 10 MV photon beam and a realistic 15 MeV electron beam in a homogeneous phantom only. For the comparison of the calculated dose distributions and dose distributions against measurements, the concept of the confidence limit (CL) has been used. This concept reduces the difference between two data sets to a single number, which gives the deviation for 90% of the dose distributions. Using this concept, it was found that ORANGE was always within the statistical bandwidth with DOSXYZ and the measurements. The ICCR-XIII benchmark showed that ORANGE is seven times faster than DOSXYZ, a result comparable with other accelerated Monte Carlo dose systems when no variance reduction is used. As shown for XVMC, using variance reduction techniques has the potential for further acceleration. Using modern computer hardware, this brings the total calculation time for a dose distribution with 1.5% (statistical) accuracy within the clinical range (less then 10 min). This means that ORANGE can be a candidate for a dose engine in radiotherapy treatment planning.
Effect of response format on cognitive reflection: Validating a two- and four-option multiple choice question version of the Cognitive Reflection Test.

PubMed

Sirota, Miroslav; Juanchich, Marie

2018-03-27

The Cognitive Reflection Test, measuring intuition inhibition and cognitive reflection, has become extremely popular because it reliably predicts reasoning performance, decision-making, and beliefs. Across studies, the response format of CRT items sometimes differs, based on the assumed construct equivalence of tests with open-ended versus multiple-choice items (the equivalence hypothesis). Evidence and theoretical reasons, however, suggest that the cognitive processes measured by these response formats and their associated performances might differ (the nonequivalence hypothesis). We tested the two hypotheses experimentally by assessing the performance in tests with different response formats and by comparing their predictive and construct validity. In a between-subjects experiment (n = 452), participants answered stem-equivalent CRT items in an open-ended, a two-option, or a four-option response format and then completed tasks on belief bias, denominator neglect, and paranormal beliefs (benchmark indicators of predictive validity), as well as on actively open-minded thinking and numeracy (benchmark indicators of construct validity). We found no significant differences between the three response formats in the numbers of correct responses, the numbers of intuitive responses (with the exception of the two-option version, which had a higher number than the other tests), and the correlational patterns of the indicators of predictive and construct validity. All three test versions were similarly reliable, but the multiple-choice formats were completed more quickly. We speculate that the specific nature of the CRT items helps build construct equivalence among the different response formats. We recommend using the validated multiple-choice version of the CRT presented here, particularly the four-option CRT, for practical and methodological reasons. Supplementary materials and data are available at https://osf.io/mzhyc/ .
Experimental program for real gas flow code validation at NASA Ames Research Center

NASA Technical Reports Server (NTRS)

Deiwert, George S.; Strawa, Anthony W.; Sharma, Surendra P.; Park, Chul

1989-01-01

The experimental program for validating real gas hypersonic flow codes at NASA Ames Rsearch Center is described. Ground-based test facilities used include ballistic ranges, shock tubes and shock tunnels, arc jet facilities and heated-air hypersonic wind tunnels. Also included are large-scale computer systems for kinetic theory simulations and benchmark code solutions. Flight tests consist of the Aeroassist Flight Experiment, the Space Shuttle, Project Fire 2, and planetary probes such as Galileo, Pioneer Venus, and PAET.
The use of National Weather Service Data to Compute the Dose to the MEOI.

PubMed

Vickers, Linda

2018-05-01

The Turner method is the "benchmark method" for computing the stability class that is used to compute the X/Q (s m). The Turner method should be used to ascertain the validity of X/Q results determined by other methods. This paper used site-specific meteorological data obtained from the National Weather Service. The Turner method described herein is simple, quick, accurate, and transparent because all of the data, calculations, and results are visible for verification and validation with published literature.
A New Performance Improvement Model: Adding Benchmarking to the Analysis of Performance Indicator Data.

PubMed

Al-Kuwaiti, Ahmed; Homa, Karen; Maruthamuthu, Thennarasu

2016-01-01

A performance improvement model was developed that focuses on the analysis and interpretation of performance indicator (PI) data using statistical process control and benchmarking. PIs are suitable for comparison with benchmarks only if the data fall within the statistically accepted limit-that is, show only random variation. Specifically, if there is no significant special-cause variation over a period of time, then the data are ready to be benchmarked. The proposed Define, Measure, Control, Internal Threshold, and Benchmark model is adapted from the Define, Measure, Analyze, Improve, Control (DMAIC) model. The model consists of the following five steps: Step 1. Define the process; Step 2. Monitor and measure the variation over the period of time; Step 3. Check the variation of the process; if stable (no significant variation), go to Step 4; otherwise, control variation with the help of an action plan; Step 4. Develop an internal threshold and compare the process with it; Step 5.1. Compare the process with an internal benchmark; and Step 5.2. Compare the process with an external benchmark. The steps are illustrated through the use of health care-associated infection (HAI) data collected for 2013 and 2014 from the Infection Control Unit, King Fahd Hospital, University of Dammam, Saudi Arabia. Monitoring variation is an important strategy in understanding and learning about a process. In the example, HAI was monitored for variation in 2013, and the need to have a more predictable process prompted the need to control variation by an action plan. The action plan was successful, as noted by the shift in the 2014 data, compared to the historical average, and, in addition, the variation was reduced. The model is subject to limitations: For example, it cannot be used without benchmarks, which need to be calculated the same way with similar patient populations, and it focuses only on the "Analyze" part of the DMAIC model.
Trophic dilution of cyclic volatile methylsiloxanes (cVMS) in the pelagic marine food web of Tokyo Bay, Japan.

PubMed

Powell, David E; Suganuma, Noriyuki; Kobayashi, Keiji; Nakamura, Tsutomu; Ninomiya, Kouzo; Matsumura, Kozaburo; Omura, Naoki; Ushioka, Satoshi

2017-02-01

Bioaccumulation and trophic transfer of cyclic volatile methylsiloxanes (cVMS), specifically octamethylcyclotetrasiloxane (D4), decamethylcyclopentasiloxane (D5), and dodecamethylcyclohexasiloxane (D6), were evaluated in the pelagic marine food web of Tokyo Bay, Japan. Polychlorinated biphenyl (PCB) congeners that are "legacy" chemicals known to bioaccumulate in aquatic organisms and biomagnify across aquatic food webs were used as a benchmark chemical (CB-180) to calibrate the sampled food web and as a reference chemical (CB-153) to validate the results. Trophic magnification factors (TMFs) were calculated from slopes of ordinary least-squares (OLS) regression models and slopes of bootstrap regression models, which were used as robust alternatives to the OLS models. Various regression models were developed that incorporated benchmarking to control bias associated with experimental design, food web dynamics, and trophic level structure. There was no evidence from any of the regression models to suggest biomagnification of cVMS in Tokyo Bay. Rather, the regression models indicated that trophic dilution of cVMS, not trophic magnification, occurred across the sampled food web. Comparison of results for Tokyo Bay to results from other studies indicated that bioaccumulation of cVMS was not related to type of food web (pelagic vs demersal), environment (marine vs freshwater), species composition, or location. Rather, results suggested that differences between study areas was likely related to food web dynamics and variable conditions of exposure resulting from non-uniform patterns of organism movement across spatial concentration gradients. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Comparison of PubMed and Google Scholar literature searches.

PubMed

Anders, Michael E; Evans, Dennis P

2010-05-01

Literature searches are essential to evidence-based respiratory care. To conduct literature searches, respiratory therapists rely on search engines to retrieve information, but there is a dearth of literature on the comparative efficiencies of search engines for researching clinical questions in respiratory care. To compare PubMed and Google Scholar search results for clinical topics in respiratory care to that of a benchmark. We performed literature searches with PubMed and Google Scholar, on 3 clinical topics. In PubMed we used the Clinical Queries search filter. In Google Scholar we used the search filters in the Advanced Scholar Search option. We used the reference list of a related Cochrane Collaboration evidence-based systematic review as the benchmark for each of the search results. We calculated recall (sensitivity) and precision (positive predictive value) with 2 x 2 contingency tables. We compared the results with the chi-square test of independence and Fisher's exact test. PubMed and Google Scholar had similar recall for both overall search results (71% vs 69%) and full-text results (43% vs 51%). PubMed had better precision than Google Scholar for both overall search results (13% vs 0.07%, P < .001) and full-text results (8% vs 0.05%, P < .001). Our results suggest that PubMed searches with the Clinical Queries filter are more precise than with the Advanced Scholar Search in Google Scholar for respiratory care topics. PubMed appears to be more practical to conduct efficient, valid searches for informing evidence-based patient-care protocols, for guiding the care of individual patients, and for educational purposes.
The mark of vegetation change on Earth's surface energy balance: data-driven diagnostics and model validation

NASA Astrophysics Data System (ADS)

Cescatti, A.; Duveiller, G.; Hooker, J.

2017-12-01

Changing vegetation cover not only affects the atmospheric concentration of greenhouse gases but also alters the radiative and non-radiative properties of the surface. The result of competing biophysical processes on Earth's surface energy balance varies spatially and seasonally, and can lead to warming or cooling depending on the specific vegetation change and on the background climate. To date these effects are not accounted for in land-based climate policies because of the complexity of the phenomena, contrasting model predictions and the lack of global data-driven assessments. To overcome the limitations of available observation-based diagnostics and of the on-going model inter-comparison, here we present a new benchmarking dataset derived from satellite remote sensing. This global dataset provides the potential changes induced by multiple vegetation transitions on the single terms of the surface energy balance. We used this dataset for two major goals: 1) Quantify the impact of actual vegetation changes that occurred during the decade 2000-2010, showing the overwhelming role of tropical deforestation in warming the surface by reducing evapotranspiration despite the concurrent brightening of the Earth. 2) Benchmark a series of ESMs against data-driven metrics of the land cover change impacts on the various terms of the surface energy budget and on the surface temperature. We anticipate that the dataset could be also used to evaluate future scenarios of land cover change and to develop the monitoring, reporting and verification guidelines required for the implementation of mitigation plans that account for biophysical land processes.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Marck, Steven C. van der, E-mail: vandermarck@nrg.eu

Recent releases of three major world nuclear reaction data libraries, ENDF/B-VII.1, JENDL-4.0, and JEFF-3.1.1, have been tested extensively using benchmark calculations. The calculations were performed with the latest release of the continuous energy Monte Carlo neutronics code MCNP, i.e. MCNP6. Three types of benchmarks were used, viz. criticality safety benchmarks, (fusion) shielding benchmarks, and reference systems for which the effective delayed neutron fraction is reported. For criticality safety, more than 2000 benchmarks from the International Handbook of Criticality Safety Benchmark Experiments were used. Benchmarks from all categories were used, ranging from low-enriched uranium, compound fuel, thermal spectrum ones (LEU-COMP-THERM), tomore » mixed uranium-plutonium, metallic fuel, fast spectrum ones (MIX-MET-FAST). For fusion shielding many benchmarks were based on IAEA specifications for the Oktavian experiments (for Al, Co, Cr, Cu, LiF, Mn, Mo, Si, Ti, W, Zr), Fusion Neutronics Source in Japan (for Be, C, N, O, Fe, Pb), and Pulsed Sphere experiments at Lawrence Livermore National Laboratory (for {sup 6}Li, {sup 7}Li, Be, C, N, O, Mg, Al, Ti, Fe, Pb, D2O, H2O, concrete, polyethylene and teflon). The new functionality in MCNP6 to calculate the effective delayed neutron fraction was tested by comparison with more than thirty measurements in widely varying systems. Among these were measurements in the Tank Critical Assembly (TCA in Japan) and IPEN/MB-01 (Brazil), both with a thermal spectrum, two cores in Masurca (France) and three cores in the Fast Critical Assembly (FCA, Japan), all with fast spectra. The performance of the three libraries, in combination with MCNP6, is shown to be good. The results for the LEU-COMP-THERM category are on average very close to the benchmark value. Also for most other categories the results are satisfactory. Deviations from the benchmark values do occur in certain benchmark series, or in isolated cases within benchmark series. Such instances can often be related to nuclear data for specific non-fissile elements, such as C, Fe, or Gd. Indications are that the intermediate and mixed spectrum cases are less well described. The results for the shielding benchmarks are generally good, with very similar results for the three libraries in the majority of cases. Nevertheless there are, in certain cases, strong deviations between calculated and benchmark values, such as for Co and Mg. Also, the results show discrepancies at certain energies or angles for e.g. C, N, O, Mo, and W. The functionality of MCNP6 to calculate the effective delayed neutron fraction yields very good results for all three libraries.« less
A Meta-Analysis of Reliability Coefficients in Second Language Research

ERIC Educational Resources Information Center

Plonsky, Luke; Derrick, Deirdre J.

2016-01-01

Ensuring internal validity in quantitative research requires, among other conditions, reliable instrumentation. Unfortunately, however, second language (L2) researchers often fail to report and even more often fail to interpret reliability estimates beyond generic benchmarks for acceptability. As a means to guide interpretations of such estimates,…
ICT Proficiency and Gender: A Validation on Training and Development

ERIC Educational Resources Information Center

Lin, Shinyi; Shih, Tse-Hua; Lu, Ruiling

2013-01-01

Use of innovative learning/instruction mode, embedded in the Certification Pathway System (CPS) developed by Certiport TM, is geared toward Internet and Computing Benchmark & Mentor specifically for IC[superscript 3] certification. The Internet and Computing Core Certification (IC[superscript 3]), as an industry-based credentialing program,…
‘Wasteaware’ benchmark indicators for integrated sustainable waste management in cities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wilson, David C., E-mail: waste@davidcwilson.com; Rodic, Ljiljana; Cowing, Michael J.

Highlights: • Solid waste management (SWM) is a key utility service, but data is often lacking. • Measuring their SWM performance helps a city establish priorities for action. • The Wasteaware benchmark indicators: measure both technical and governance aspects. • Have been developed over 5 years and tested in more than 50 cities on 6 continents. • Enable consistent comparison between cities and countries and monitoring progress. - Abstract: This paper addresses a major problem in international solid waste management, which is twofold: a lack of data, and a lack of consistent data to allow comparison between cities. The papermore » presents an indicator set for integrated sustainable waste management (ISWM) in cities both North and South, to allow benchmarking of a city’s performance, comparing cities and monitoring developments over time. It builds on pioneering work for UN-Habitat’s solid waste management in the World’s cities. The comprehensive analytical framework of a city’s solid waste management system is divided into two overlapping ‘triangles’ – one comprising the three physical components, i.e. collection, recycling, and disposal, and the other comprising three governance aspects, i.e. inclusivity; financial sustainability; and sound institutions and proactive policies. The indicator set includes essential quantitative indicators as well as qualitative composite indicators. This updated and revised ‘Wasteaware’ set of ISWM benchmark indicators is the cumulative result of testing various prototypes in more than 50 cities around the world. This experience confirms the utility of indicators in allowing comprehensive performance measurement and comparison of both ‘hard’ physical components and ‘soft’ governance aspects; and in prioritising ‘next steps’ in developing a city’s solid waste management system, by identifying both local strengths that can be built on and weak points to be addressed. The Wasteaware ISWM indicators are applicable to a broad range of cities with very different levels of income and solid waste management practices. Their wide application as a standard methodology will help to fill the historical data gap.« less
OECD/NEA expert group on uncertainty analysis for criticality safety assessment: Results of benchmark on sensitivity calculation (phase III)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ivanova, T.; Laville, C.; Dyrda, J.

2012-07-01

The sensitivities of the k{sub eff} eigenvalue to neutron cross sections have become commonly used in similarity studies and as part of the validation algorithm for criticality safety assessments. To test calculations of the sensitivity coefficients, a benchmark study (Phase III) has been established by the OECD-NEA/WPNCS/EG UACSA (Expert Group on Uncertainty Analysis for Criticality Safety Assessment). This paper presents some sensitivity results generated by the benchmark participants using various computational tools based upon different computational methods: SCALE/TSUNAMI-3D and -1D, MONK, APOLLO2-MORET 5, DRAGON-SUSD3D and MMKKENO. The study demonstrates the performance of the tools. It also illustrates how model simplificationsmore » impact the sensitivity results and demonstrates the importance of 'implicit' (self-shielding) sensitivities. This work has been a useful step towards verification of the existing and developed sensitivity analysis methods. (authors)« less
A Simulation Environment for Benchmarking Sensor Fusion-Based Pose Estimators.

PubMed

Ligorio, Gabriele; Sabatini, Angelo Maria

2015-12-19

In-depth analysis and performance evaluation of sensor fusion-based estimators may be critical when performed using real-world sensor data. For this reason, simulation is widely recognized as one of the most powerful tools for algorithm benchmarking. In this paper, we present a simulation framework suitable for assessing the performance of sensor fusion-based pose estimators. The systems used for implementing the framework were magnetic/inertial measurement units (MIMUs) and a camera, although the addition of further sensing modalities is straightforward. Typical nuisance factors were also included for each sensor. The proposed simulation environment was validated using real-life sensor data employed for motion tracking. The higher mismatch between real and simulated sensors was about 5% of the measured quantity (for the camera simulation), whereas a lower correlation was found for an axis of the gyroscope (0.90). In addition, a real benchmarking example of an extended Kalman filter for pose estimation from MIMU and camera data is presented.
A Systematic Method for Verification and Validation of Gyrokinetic Microstability Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bravenec, Ronald

My original proposal for the period Feb. 15, 2014 through Feb. 14, 2017 called for an integrated validation and verification effort carried out by myself with collaborators. The validation component would require experimental profile and power-balance analysis. In addition, it would require running the gyrokinetic codes varying the input profiles within experimental uncertainties to seek agreement with experiment before discounting a code as invalidated. Therefore, validation would require a major increase of effort over my previous grant periods which covered only code verification (code benchmarking). Consequently, I had requested full-time funding. Instead, I am being funded at somewhat less thanmore » half time (5 calendar months per year). As a consequence, I decided to forego the validation component and to only continue the verification efforts.« less
Selecting concepts for a concept-based curriculum: application of a benchmark approach.

PubMed

Giddens, Jean Foret; Wright, Mary; Gray, Irene

2012-09-01

In response to a transformational movement in nursing education, faculty across the country are considering changes to curricula and approaches to teaching. As a result, an emerging trend in many nursing programs is the adoption of a concept-based curriculum. As part of the curriculum development process, the selection of concepts, competencies, and exemplars on which to build courses and base content is needed. This article presents a benchmark approach used to validate and finalize concept selection among educators developing a concept-based curriculum for a statewide nursing consortium. These findings are intended to inform other nurse educators who are currently involved with or are considering this curriculum approach. Copyright 2012, SLACK Incorporated.
A comparison of common programming languages used in bioinformatics

PubMed Central

Fourment, Mathieu; Gillings, Michael R

2008-01-01

Background The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Results Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from Conclusion This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language. PMID:18251993

Simulation of guided-wave ultrasound propagation in composite laminates: Benchmark comparisons of numerical codes and experiment.

PubMed

Leckey, Cara A C; Wheeler, Kevin R; Hafiychuk, Vasyl N; Hafiychuk, Halyna; Timuçin, Doğan A

2018-03-01

Ultrasonic wave methods constitute the leading physical mechanism for nondestructive evaluation (NDE) and structural health monitoring (SHM) of solid composite materials, such as carbon fiber reinforced polymer (CFRP) laminates. Computational models of ultrasonic wave excitation, propagation, and scattering in CFRP composites can be extremely valuable in designing practicable NDE and SHM hardware, software, and methodologies that accomplish the desired accuracy, reliability, efficiency, and coverage. The development and application of ultrasonic simulation approaches for composite materials is an active area of research in the field of NDE. This paper presents comparisons of guided wave simulations for CFRP composites implemented using four different simulation codes: the commercial finite element modeling (FEM) packages ABAQUS, ANSYS, and COMSOL, and a custom code executing the Elastodynamic Finite Integration Technique (EFIT). Benchmark comparisons are made between the simulation tools and both experimental laser Doppler vibrometry data and theoretical dispersion curves. A pristine and a delamination type case (Teflon insert in the experimental specimen) is studied. A summary is given of the accuracy of simulation results and the respective computational performance of the four different simulation tools. Published by Elsevier B.V.
[Benchmarking in ambulatory care practices--The European Practice Assessment (EPA)].

PubMed

Szecsenyi, Joachim; Broge, Björn; Willms, Sara; Brodowski, Marc; Götz, Katja

2011-01-01

The European Practice Assessment (EPA) is a comprehensive quality management which consists of 220 indicators covering 5 domains (infrastructure, people, information, finance, and quality and safety). The aim of the project presented was to evaluate EPA as an instrument for benchmarking in ambulatory care practices. A before-and-after design with a comparison group was chosen. One hundred and two practices conducted EPA at baseline (t1) and at the 3-year follow-up (t2). A further 209 practices began EPA at t2 (comparison group). Since both practice groups differed in several variables (age of GP, location and size of practice), a matched-pair design based on propensity scores was applied leading to a subgroup of 102 comparable practices (out of the 209 practices). Data analysis was carried out using Z scores of the EPA domains. The results showed significant improvements in all domains between t1 and t2 as well as between the comparison group and t2. Furthermore, the results demonstrate that the implementation of total quality management and the re-assessment of the EPA procedure can lead to significant improvements in almost all domains. Copyright © 2011. Published by Elsevier GmbH.
A community resource benchmarking predictions of peptide binding to MHC-I molecules.

PubMed

Peters, Bjoern; Bui, Huynh-Hoa; Frankild, Sune; Nielson, Morten; Lundegaard, Claus; Kostem, Emrah; Basch, Derek; Lamberth, Kasper; Harndahl, Mikkel; Fleri, Ward; Wilson, Stephen S; Sidney, John; Lund, Ole; Buus, Soren; Sette, Alessandro

2006-06-09

Recognition of peptides bound to major histocompatibility complex (MHC) class I molecules by T lymphocytes is an essential part of immune surveillance. Each MHC allele has a characteristic peptide binding preference, which can be captured in prediction algorithms, allowing for the rapid scan of entire pathogen proteomes for peptide likely to bind MHC. Here we make public a large set of 48,828 quantitative peptide-binding affinity measurements relating to 48 different mouse, human, macaque, and chimpanzee MHC class I alleles. We use this data to establish a set of benchmark predictions with one neural network method and two matrix-based prediction methods extensively utilized in our groups. In general, the neural network outperforms the matrix-based predictions mainly due to its ability to generalize even on a small amount of data. We also retrieved predictions from tools publicly available on the internet. While differences in the data used to generate these predictions hamper direct comparisons, we do conclude that tools based on combinatorial peptide libraries perform remarkably well. The transparent prediction evaluation on this dataset provides tool developers with a benchmark for comparison of newly developed prediction methods. In addition, to generate and evaluate our own prediction methods, we have established an easily extensible web-based prediction framework that allows automated side-by-side comparisons of prediction methods implemented by experts. This is an advance over the current practice of tool developers having to generate reference predictions themselves, which can lead to underestimating the performance of prediction methods they are not as familiar with as their own. The overall goal of this effort is to provide a transparent prediction evaluation allowing bioinformaticians to identify promising features of prediction methods and providing guidance to immunologists regarding the reliability of prediction tools.
A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

1998-01-01

Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand written NAB Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools: an interactive computer aided parallelization too] that generates message passing code, 2) the Portland Group's HPF compiler and 3) using compiler directives with the native FORTAN77 compiler on the SGI Origin2000.
Aluminum-Mediated Formation of Cyclic Carbonates: Benchmarking Catalytic Performance Metrics.

PubMed

Rintjema, Jeroen; Kleij, Arjan W

2017-03-22

We report a comparative study on the activity of a series of fifteen binary catalysts derived from various reported aluminum-based complexes. A benchmarking of their initial rates in the coupling of various terminal and internal epoxides in the presence of three different nucleophilic additives was carried out, providing for the first time a useful comparison of activity metrics in the area of cyclic organic carbonate formation. These investigations provide a useful framework for how to realistically valorize relative reactivities and which features are important when considering the ideal operational window of each binary catalyst system. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
What are the assets and weaknesses of HFO detectors? A benchmark framework based on realistic simulations

PubMed Central

Pizzo, Francesca; Bartolomei, Fabrice; Wendling, Fabrice; Bénar, Christian-George

2017-01-01

High-frequency oscillations (HFO) have been suggested as biomarkers of epileptic tissues. While visual marking of these short and small oscillations is tedious and time-consuming, automatic HFO detectors have not yet met a large consensus. Even though detectors have been shown to perform well when validated against visual marking, the large number of false detections due to their lack of robustness hinder their clinical application. In this study, we developed a validation framework based on realistic and controlled simulations to quantify precisely the assets and weaknesses of current detectors. We constructed a dictionary of synthesized elements—HFOs and epileptic spikes—from different patients and brain areas by extracting these elements from the original data using discrete wavelet transform coefficients. These elements were then added to their corresponding simulated background activity (preserving patient- and region- specific spectra). We tested five existing detectors against this benchmark. Compared to other studies confronting detectors, we did not only ranked them according their performance but we investigated the reasons leading to these results. Our simulations, thanks to their realism and their variability, enabled us to highlight unreported issues of current detectors: (1) the lack of robust estimation of the background activity, (2) the underestimated impact of the 1/f spectrum, and (3) the inadequate criteria defining an HFO. We believe that our benchmark framework could be a valuable tool to translate HFOs into a clinical environment. PMID:28406919
Ground truth and benchmarks for performance evaluation

NASA Astrophysics Data System (ADS)

Takeuchi, Ayako; Shneier, Michael; Hong, Tsai Hong; Chang, Tommy; Scrapper, Christopher; Cheok, Geraldine S.

2003-09-01

Progress in algorithm development and transfer of results to practical applications such as military robotics requires the setup of standard tasks, of standard qualitative and quantitative measurements for performance evaluation and validation. Although the evaluation and validation of algorithms have been discussed for over a decade, the research community still faces a lack of well-defined and standardized methodology. The range of fundamental problems include a lack of quantifiable measures of performance, a lack of data from state-of-the-art sensors in calibrated real-world environments, and a lack of facilities for conducting realistic experiments. In this research, we propose three methods for creating ground truth databases and benchmarks using multiple sensors. The databases and benchmarks will provide researchers with high quality data from suites of sensors operating in complex environments representing real problems of great relevance to the development of autonomous driving systems. At NIST, we have prototyped a High Mobility Multi-purpose Wheeled Vehicle (HMMWV) system with a suite of sensors including a Riegl ladar, GDRS ladar, stereo CCD, several color cameras, Global Position System (GPS), Inertial Navigation System (INS), pan/tilt encoders, and odometry . All sensors are calibrated with respect to each other in space and time. This allows a database of features and terrain elevation to be built. Ground truth for each sensor can then be extracted from the database. The main goal of this research is to provide ground truth databases for researchers and engineers to evaluate algorithms for effectiveness, efficiency, reliability, and robustness, thus advancing the development of algorithms.
Statistical process control as a tool for controlling operating room performance: retrospective analysis and benchmarking.

PubMed

Chen, Tsung-Tai; Chang, Yun-Jau; Ku, Shei-Ling; Chung, Kuo-Piao

2010-10-01

There is much research using statistical process control (SPC) to monitor surgical performance, including comparisons among groups to detect small process shifts, but few of these studies have included a stabilization process. This study aimed to analyse the performance of surgeons in operating room (OR) and set a benchmark by SPC after stabilized process. The OR profile of 499 patients who underwent laparoscopic cholecystectomy performed by 16 surgeons at a tertiary hospital in Taiwan during 2005 and 2006 were recorded. SPC was applied to analyse operative and non-operative times using the following five steps: first, the times were divided into two segments; second, they were normalized; third, they were evaluated as individual processes; fourth, the ARL(0) was calculated;, and fifth, the different groups (surgeons) were compared. Outliers were excluded to ensure stability for each group and to facilitate inter-group comparison. The results showed that in the stabilized process, only one surgeon exhibited a significantly shorter total process time (including operative time and non-operative time). In this study, we use five steps to demonstrate how to control surgical and non-surgical time in phase I. There are some measures that can be taken to prevent skew and instability in the process. Also, using SPC, one surgeon can be shown to be a real benchmark. © 2010 Blackwell Publishing Ltd.
A benchmark for comparison of cell tracking algorithms

PubMed Central

Maška, Martin; Ulman, Vladimír; Svoboda, David; Matula, Pavel; Matula, Petr; Ederra, Cristina; Urbiola, Ainhoa; España, Tomás; Venkatesan, Subramanian; Balak, Deepak M.W.; Karas, Pavel; Bolcková, Tereza; Štreitová, Markéta; Carthel, Craig; Coraluppi, Stefano; Harder, Nathalie; Rohr, Karl; Magnusson, Klas E. G.; Jaldén, Joakim; Blau, Helen M.; Dzyubachyk, Oleh; Křížek, Pavel; Hagen, Guy M.; Pastor-Escuredo, David; Jimenez-Carretero, Daniel; Ledesma-Carbayo, Maria J.; Muñoz-Barrutia, Arrate; Meijering, Erik; Kozubek, Michal; Ortiz-de-Solorzano, Carlos

2014-01-01

Motivation: Automatic tracking of cells in multidimensional time-lapse fluorescence microscopy is an important task in many biomedical applications. A novel framework for objective evaluation of cell tracking algorithms has been established under the auspices of the IEEE International Symposium on Biomedical Imaging 2013 Cell Tracking Challenge. In this article, we present the logistics, datasets, methods and results of the challenge and lay down the principles for future uses of this benchmark. Results: The main contributions of the challenge include the creation of a comprehensive video dataset repository and the definition of objective measures for comparison and ranking of the algorithms. With this benchmark, six algorithms covering a variety of segmentation and tracking paradigms have been compared and ranked based on their performance on both synthetic and real datasets. Given the diversity of the datasets, we do not declare a single winner of the challenge. Instead, we present and discuss the results for each individual dataset separately. Availability and implementation: The challenge Web site (http://www.codesolorzano.com/celltrackingchallenge) provides access to the training and competition datasets, along with the ground truth of the training videos. It also provides access to Windows and Linux executable files of the evaluation software and most of the algorithms that competed in the challenge. Contact: codesolorzano@unav.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24526711
A benchmark for comparison of dental radiography analysis algorithms.

PubMed

Wang, Ching-Wei; Huang, Cheng-Ta; Lee, Jia-Hong; Li, Chung-Hsing; Chang, Sheng-Wei; Siao, Ming-Jhih; Lai, Tat-Ming; Ibragimov, Bulat; Vrtovec, Tomaž; Ronneberger, Olaf; Fischer, Philipp; Cootes, Tim F; Lindner, Claudia

2016-07-01

Dental radiography plays an important role in clinical diagnosis, treatment and surgery. In recent years, efforts have been made on developing computerized dental X-ray image analysis systems for clinical usages. A novel framework for objective evaluation of automatic dental radiography analysis algorithms has been established under the auspices of the IEEE International Symposium on Biomedical Imaging 2015 Bitewing Radiography Caries Detection Challenge and Cephalometric X-ray Image Analysis Challenge. In this article, we present the datasets, methods and results of the challenge and lay down the principles for future uses of this benchmark. The main contributions of the challenge include the creation of the dental anatomy data repository of bitewing radiographs, the creation of the anatomical abnormality classification data repository of cephalometric radiographs, and the definition of objective quantitative evaluation for comparison and ranking of the algorithms. With this benchmark, seven automatic methods for analysing cephalometric X-ray image and two automatic methods for detecting bitewing radiography caries have been compared, and detailed quantitative evaluation results are presented in this paper. Based on the quantitative evaluation results, we believe automatic dental radiography analysis is still a challenging and unsolved problem. The datasets and the evaluation software will be made available to the research community, further encouraging future developments in this field. (http://www-o.ntust.edu.tw/~cweiwang/ISBI2015/). Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Lecture Notes on Criticality Safety Validation Using MCNP & Whisper

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B.; Rising, Michael Evan; Alwin, Jennifer Louise

Training classes for nuclear criticality safety, MCNP documentation. The need for, and problems surrounding, validation of computer codes and data area considered first. Then some background for MCNP & Whisper is given--best practices for Monte Carlo criticality calculations, neutron spectra, S(α,β) thermal neutron scattering data, nuclear data sensitivities, covariance data, and correlation coefficients. Whisper is computational software designed to assist the nuclear criticality safety analyst with validation studies with the Monte Carlo radiation transport package MCNP. Whisper's methodology (benchmark selection – C k's, weights; extreme value theory – bias, bias uncertainty; MOS for nuclear data uncertainty – GLLS) and usagemore » are discussed.« less
Automated benchmarking of peptide-MHC class I binding predictions.

PubMed

Trolle, Thomas; Metushi, Imir G; Greenbaum, Jason A; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

2015-07-01

Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. mniel@cbs.dtu.dk or bpeters@liai.org Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Establishing objective benchmarks in robotic virtual reality simulation at the level of a competent surgeon using the RobotiX Mentor simulator.

PubMed

Watkinson, William; Raison, Nicholas; Abe, Takashige; Harrison, Patrick; Khan, Shamim; Van der Poel, Henk; Dasgupta, Prokar; Ahmed, Kamran

2018-05-01

To establish objective benchmarks at the level of a competent robotic surgeon across different exercises and metrics for the RobotiX Mentor virtual reality (VR) simulator suitable for use within a robotic surgical training curriculum. This retrospective observational study analysed results from multiple data sources, all of which used the RobotiX Mentor VR simulator. 123 participants with varying experience from novice to expert completed the exercises. Competency was established as the 25th centile of the mean advanced intermediate score. Three basic skill exercises and two advanced skill exercises were used. King's College London. 84 Novice, 26 beginner intermediates, 9 advanced intermediates and 4 experts were used in this retrospective observational study. Objective benchmarks derived from the 25th centile of the mean scores of the advanced intermediates provided suitably challenging yet also achievable targets for training surgeons. The disparity in scores was greatest for the advanced exercises. Novice surgeons are able to achieve the benchmarks across all exercises in the majority of metrics. We have successfully created this proof-of-concept study, which requires validation in a larger cohort. Objective benchmarks obtained from the 25th centile of the mean scores of advanced intermediates provide clinically relevant benchmarks at the standard of a competent robotic surgeon that are challenging yet also attainable. That can be used within a VR training curriculum allowing participants to track and monitor their progress in a structured and progressional manner through five exercises. Providing clearly defined targets, ensuring that a universal training standard has been achieved across training surgeons. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
PFLOTRAN Verification: Development of a Testing Suite to Ensure Software Quality

NASA Astrophysics Data System (ADS)

Hammond, G. E.; Frederick, J. M.

2016-12-01

In scientific computing, code verification ensures the reliability and numerical accuracy of a model simulation by comparing the simulation results to experimental data or known analytical solutions. The model is typically defined by a set of partial differential equations with initial and boundary conditions, and verification ensures whether the mathematical model is solved correctly by the software. Code verification is especially important if the software is used to model high-consequence systems which cannot be physically tested in a fully representative environment [Oberkampf and Trucano (2007)]. Justified confidence in a particular computational tool requires clarity in the exercised physics and transparency in its verification process with proper documentation. We present a quality assurance (QA) testing suite developed by Sandia National Laboratories that performs code verification for PFLOTRAN, an open source, massively-parallel subsurface simulator. PFLOTRAN solves systems of generally nonlinear partial differential equations describing multiphase, multicomponent and multiscale reactive flow and transport processes in porous media. PFLOTRAN's QA test suite compares the numerical solutions of benchmark problems in heat and mass transport against known, closed-form, analytical solutions, including documentation of the exercised physical process models implemented in each PFLOTRAN benchmark simulation. The QA test suite development strives to follow the recommendations given by Oberkampf and Trucano (2007), which describes four essential elements in high-quality verification benchmark construction: (1) conceptual description, (2) mathematical description, (3) accuracy assessment, and (4) additional documentation and user information. Several QA tests within the suite will be presented, including details of the benchmark problems and their closed-form analytical solutions, implementation of benchmark problems in PFLOTRAN simulations, and the criteria used to assess PFLOTRAN's performance in the code verification procedure. References Oberkampf, W. L., and T. G. Trucano (2007), Verification and Validation Benchmarks, SAND2007-0853, 67 pgs., Sandia National Laboratories, Albuquerque, NM.
Automated benchmarking of peptide-MHC class I binding predictions

PubMed Central

Trolle, Thomas; Metushi, Imir G.; Greenbaum, Jason A.; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

2015-01-01

Motivation: Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. Results: The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Availability and implementation: Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. Contact: mniel@cbs.dtu.dk or bpeters@liai.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25717196
Toward Establishing a Realistic Benchmark for Airframe Noise Research: Issues and Challenges

NASA Technical Reports Server (NTRS)

Khorrami, Mehdi R.

2010-01-01

The availability of realistic benchmark configurations is essential to enable the validation of current Computational Aeroacoustic (CAA) methodologies and to further the development of new ideas and concepts that will foster the technologies of the next generation of CAA tools. The selection of a real-world configuration, the subsequent design and fabrication of an appropriate model for testing, and the acquisition of the necessarily comprehensive aeroacoustic data base are critical steps that demand great care and attention. In this paper, a brief account of the nose landing-gear configuration, being proposed jointly by NASA and the Gulfstream Aerospace Company as an airframe noise benchmark, is provided. The underlying thought processes and the resulting building block steps that were taken during the development of this benchmark case are given. Resolution of critical, yet conflicting issues is discussed - the desire to maintain geometric fidelity versus model modifications required to accommodate instrumentation; balancing model scale size versus Reynolds number effects; and time, cost, and facility availability versus important parameters like surface finish and installation effects. The decisions taken during the experimental phase of a study can significantly affect the ability of a CAA calculation to reproduce the prevalent flow conditions and associated measurements. For the nose landing gear, the most critical of such issues are highlighted and the compromises made to resolve them are discussed. The results of these compromises will be summarized by examining the positive attributes and shortcomings of this particular benchmark case.
A new numerical benchmark for variably saturated variable-density flow and transport in porous media

NASA Astrophysics Data System (ADS)

Guevara, Carlos; Graf, Thomas

2016-04-01

In subsurface hydrological systems, spatial and temporal variations in solute concentration and/or temperature may affect fluid density and viscosity. These variations could lead to potentially unstable situations, in which a dense fluid overlies a less dense fluid. These situations could produce instabilities that appear as dense plume fingers migrating downwards counteracted by vertical upwards flow of freshwater (Simmons et al., Transp. Porous Medium, 2002). As a result of unstable variable-density flow, solute transport rates are increased over large distances and times as compared to constant-density flow. The numerical simulation of variable-density flow in saturated and unsaturated media requires corresponding benchmark problems against which a computer model is validated (Diersch and Kolditz, Adv. Water Resour, 2002). Recorded data from a laboratory-scale experiment of variable-density flow and solute transport in saturated and unsaturated porous media (Simmons et al., Transp. Porous Medium, 2002) is used to define a new numerical benchmark. The HydroGeoSphere code (Therrien et al., 2004) coupled with PEST (www.pesthomepage.org) are used to obtain an optimized parameter set capable of adequately representing the data set by Simmons et al., (2002). Fingering in the numerical model is triggered using random hydraulic conductivity fields. Due to the inherent randomness, a large number of simulations were conducted in this study. The optimized benchmark model adequately predicts the plume behavior and the fate of solutes. This benchmark is useful for model verification of variable-density flow problems in saturated and/or unsaturated media.
Comparative Modeling and Benchmarking Data Sets for Human Histone Deacetylases and Sirtuin Families

PubMed Central

Xia, Jie; Tilahun, Ermias Lemma; Kebede, Eyob Hailu; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

2015-01-01

Histone Deacetylases (HDACs) are an important class of drug targets for the treatment of cancers, neurodegenerative diseases and other types of diseases. Virtual screening (VS) has become fairly effective approaches for drug discovery of novel and highly selective Histone Deacetylases Inhibitors (HDACIs). To facilitate the process, we constructed the Maximal Unbiased Benchmarking Data Sets for HDACs (MUBD-HDACs) using our recently published methods that were originally developed for building unbiased benchmarking sets for ligand-based virtual screening (LBVS). The MUBD-HDACs covers all 4 Classes including Class III (Sirtuins family) and 14 HDACs isoforms, composed of 631 inhibitors and 24,609 unbiased decoys. Its ligand sets have been validated extensively as chemically diverse, while the decoy sets were shown to be property-matching with ligands and maximal unbiased in terms of “artificial enrichment” and “analogue bias”. We also conducted comparative studies with DUD-E and DEKOIS 2.0 sets against HDAC2 and HDAC8 targets, and demonstrate that our MUBD-HDACs is unique in that it can be applied unbiasedly to both LBVS and SBVS approaches. In addition, we defined a novel metric, i.e. NLBScore, to detect the “2D bias” and “LBVS favorable” effect within the benchmarking sets. In summary, MUBD-HDACs is the only comprehensive and maximal-unbiased benchmark data sets for HDACs (including Sirtuins) that is available so far. MUBD-HDACs is freely available at http://www.xswlab.org/. PMID:25633490
Successful implementation of diabetes audits in Australia: the Australian National Diabetes Information Audit and Benchmarking (ANDIAB) initiative.

PubMed

Lee, A S; Colagiuri, S; Flack, J R

2018-04-06

We developed and implemented a national audit and benchmarking programme to describe the clinical status of people with diabetes attending specialist diabetes services in Australia. The Australian National Diabetes Information Audit and Benchmarking (ANDIAB) initiative was established as a quality audit activity. De-identified data on demographic, clinical, biochemical and outcome items were collected from specialist diabetes services across Australia to provide cross-sectional data on people with diabetes attending specialist centres at least biennially during the years 1998 to 2011. In total, 38 155 sets of data were collected over the eight ANDIAB audits. Each ANDIAB audit achieved its primary objective to collect, collate, analyse, audit and report clinical diabetes data in Australia. Each audit resulted in the production of a pooled data report, as well as individual site reports allowing comparison and benchmarking against other participating sites. The ANDIAB initiative resulted in the largest cross-sectional national de-identified dataset describing the clinical status of people with diabetes attending specialist diabetes services in Australia. ANDIAB showed that people treated by specialist services had a high burden of diabetes complications. This quality audit activity provided a framework to guide planning of healthcare services. © 2018 Diabetes UK.
Processing and validation of JEFF-3.1.1 and ENDF/B-VII.0 group-wise cross section libraries for shielding calculations

NASA Astrophysics Data System (ADS)

Pescarini, M.; Sinitsa, V.; Orsi, R.; Frisoni, M.

2013-03-01

This paper presents a synthesis of the ENEA-Bologna Nuclear Data Group programme dedicated to generate and validate group-wise cross section libraries for shielding and radiation damage deterministic calculations in nuclear fission reactors, following the data processing methodology recommended in the ANSI/ANS-6.1.2-1999 (R2009) American Standard. The VITJEFF311.BOLIB and VITENDF70.BOLIB finegroup coupled n-γ (199 n + 42 γ - VITAMIN-B6 structure) multi-purpose cross section libraries, based on the Bondarenko method for neutron resonance self-shielding and respectively on JEFF-3.1.1 and ENDF/B-VII.0 evaluated nuclear data, were produced in AMPX format using the NJOY-99.259 and the ENEA-Bologna 2007 Revision of the SCAMPI nuclear data processing systems. Two derived broad-group coupled n-γ (47 n + 20 γ - BUGLE-96 structure) working cross section libraries in FIDO-ANISN format for LWR shielding and pressure vessel dosimetry calculations, named BUGJEFF311.BOLIB and BUGENDF70.BOLIB, were generated by the revised version of SCAMPI, through problem-dependent cross section collapsing and self-shielding from the cited fine-group libraries. The validation results on the criticality safety benchmark experiments for the fine-group libraries and the preliminary validation results for the broad-group working libraries on the PCA-Replica and VENUS-3 engineering neutron shielding benchmark experiments are reported in synthesis.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.