bcsc performance benchmarks: Topics by Science.gov

Sample records for bcsc performance benchmarks

National Performance Benchmarks for Modern Screening Digital Mammography: Update from the Breast Cancer Surveillance Consortium.

PubMed

Lehman, Constance D; Arao, Robert F; Sprague, Brian L; Lee, Janie M; Buist, Diana S M; Kerlikowske, Karla; Henderson, Louise M; Onega, Tracy; Tosteson, Anna N A; Rauscher, Garth H; Miglioretti, Diana L

2017-04-01

Purpose To establish performance benchmarks for modern screening digital mammography and assess performance trends over time in U.S. community practice. Materials and Methods This HIPAA-compliant, institutional review board-approved study measured the performance of digital screening mammography interpreted by 359 radiologists across 95 facilities in six Breast Cancer Surveillance Consortium (BCSC) registries. The study included 1 682 504 digital screening mammograms performed between 2007 and 2013 in 792 808 women. Performance measures were calculated according to the American College of Radiology Breast Imaging Reporting and Data System, 5th edition, and were compared with published benchmarks by the BCSC, the National Mammography Database, and performance recommendations by expert opinion. Benchmarks were derived from the distribution of performance metrics across radiologists and were presented as 50th (median), 10th, 25th, 75th, and 90th percentiles, with graphic presentations using smoothed curves. Results Mean screening performance measures were as follows: abnormal interpretation rate (AIR), 11.6 (95% confidence interval [CI]: 11.5, 11.6); cancers detected per 1000 screens, or cancer detection rate (CDR), 5.1 (95% CI: 5.0, 5.2); sensitivity, 86.9% (95% CI: 86.3%, 87.6%); specificity, 88.9% (95% CI: 88.8%, 88.9%); false-negative rate per 1000 screens, 0.8 (95% CI: 0.7, 0.8); positive predictive value (PPV) 1, 4.4% (95% CI: 4.3%, 4.5%); PPV2, 25.6% (95% CI: 25.1%, 26.1%); PPV3, 28.6% (95% CI: 28.0%, 29.3%); cancers stage 0 or 1, 76.9%; minimal cancers, 57.7%; and node-negative invasive cancers, 79.4%. Recommended CDRs were achieved by 92.1% of radiologists in community practice, and 97.1% achieved recommended ranges for sensitivity. Only 59.0% of radiologists achieved recommended AIRs, and only 63.0% achieved recommended levels of specificity. Conclusion The majority of radiologists in the BCSC surpass cancer detection recommendations for screening mammography; however, AIRs continue to be higher than the recommended rate for almost half of radiologists interpreting screening mammograms. © RSNA, 2016 Online supplemental material is available for this article.
National Performance Benchmarks for Modern Diagnostic Digital Mammography: Update from the Breast Cancer Surveillance Consortium.

PubMed

Sprague, Brian L; Arao, Robert F; Miglioretti, Diana L; Henderson, Louise M; Buist, Diana S M; Onega, Tracy; Rauscher, Garth H; Lee, Janie M; Tosteson, Anna N A; Kerlikowske, Karla; Lehman, Constance D

2017-04-01

Purpose To establish contemporary performance benchmarks for diagnostic digital mammography with use of recent data from the Breast Cancer Surveillance Consortium (BCSC). Materials and Methods Institutional review board approval was obtained for active or passive consenting processes or to obtain a waiver of consent to enroll participants, link data, and perform analyses. Data were obtained from six BCSC registries (418 radiologists, 92 radiology facilities). Mammogram indication and assessments were prospectively collected for women undergoing diagnostic digital mammography and linked with cancer diagnoses from state cancer registries. The study included 401 548 examinations conducted from 2007 to 2013 in 265 360 women. Results Overall diagnostic performance measures were as follows: cancer detection rate, 34.7 per 1000 (95% confidence interval [CI]: 34.1, 35.2); abnormal interpretation rate, 12.6% (95% CI: 12.5%, 12.7%); positive predictive value (PPV) of a biopsy recommendation (PPV 2 ), 27.5% (95% CI: 27.1%, 27.9%); PPV of biopsies performed (PPV 3 ), 30.4% (95% CI: 29.9%, 30.9%); false-negative rate, 4.8 per 1000 (95% CI: 4.6, 5.0); sensitivity, 87.8% (95% CI: 87.3%, 88.4%); and specificity, 90.5% (95% CI: 90.4%, 90.6%). Among cancers detected, 63.4% were stage 0 or 1 cancers, 45.6% were minimal cancers, the mean size of invasive cancers was 21.2 mm, and 69.6% of invasive cancers were node negative. Performance metrics varied widely across diagnostic indications, with cancer detection rate (64.5 per 1000) and abnormal interpretation rate (18.7%) highest for diagnostic mammograms obtained to evaluate a breast problem with a lump. Compared with performance during the screen-film mammography era, diagnostic digital performance showed increased abnormal interpretation and cancer detection rates and decreasing PPVs, with less than 70% of radiologists within acceptable ranges for PPV 2 and PPV 3 . Conclusion These performance measures can serve as national benchmarks that may help transform the marked variation in radiologists' diagnostic performance into targeted quality improvement efforts. © RSNA, 2017 Online supplemental material is available for this article.
Breast cancer risk prediction using a clinical risk model and polygenic risk score.

PubMed

Shieh, Yiwey; Hu, Donglei; Ma, Lin; Huntsman, Scott; Gard, Charlotte C; Leung, Jessica W T; Tice, Jeffrey A; Vachon, Celine M; Cummings, Steven R; Kerlikowske, Karla; Ziv, Elad

2016-10-01

Breast cancer risk assessment can inform the use of screening and prevention modalities. We investigated the performance of the Breast Cancer Surveillance Consortium (BCSC) risk model in combination with a polygenic risk score (PRS) comprised of 83 single nucleotide polymorphisms identified from genome-wide association studies. We conducted a nested case-control study of 486 cases and 495 matched controls within a screening cohort. The PRS was calculated using a Bayesian approach. The contributions of the PRS and variables in the BCSC model to breast cancer risk were tested using conditional logistic regression. Discriminatory accuracy of the models was compared using the area under the receiver operating characteristic curve (AUROC). Increasing quartiles of the PRS were positively associated with breast cancer risk, with OR 2.54 (95 % CI 1.69-3.82) for breast cancer in the highest versus lowest quartile. In a multivariable model, the PRS, family history, and breast density remained strong risk factors. The AUROC of the PRS was 0.60 (95 % CI 0.57-0.64), and an Asian-specific PRS had AUROC 0.64 (95 % CI 0.53-0.74). A combined model including the BCSC risk factors and PRS had better discrimination than the BCSC model (AUROC 0.65 versus 0.62, p = 0.01). The BCSC-PRS model classified 18 % of cases as high-risk (5-year risk ≥3 %), compared with 7 % using the BCSC model. The PRS improved discrimination of the BCSC risk model and classified more cases as high-risk. Further consideration of the PRS's role in decision-making around screening and prevention strategies is merited.
Breast Cancer Risk Prediction Using a Clinical Risk Model and Polygenic Risk Score

PubMed Central

Shieh, Yiwey; Hu, Donglei; Ma, Lin; Huntsman, Scott; Gard, Charlotte C.; Leung, Jessica W.T.; Tice, Jeffrey A.; Vachon, Celine M.; Cummings, Steven R.; Kerlikowske, Karla; Ziv, Elad

2016-01-01

Purpose Breast cancer risk assessment can inform the use of screening and prevention modalities. We investigated the performance of the Breast Cancer Surveillance Consortium (BCSC) risk model in combination with a polygenic risk score (PRS) comprised of 83 single nucleotide polymorphisms identified from genome wide association studies. Methods We conducted a nested case-control study of 486 cases and 495 matched controls within a screening cohort. The PRS was calculated using a Bayesian approach. The contributions of the PRS and variables in the BCSC model to breast cancer risk were tested using conditional logistic regression. Discriminatory accuracy of the models was compared using the area under the receiver operating characteristic curve (AUROC). Results Increasing quartiles of the PRS were positively associated with breast cancer risk, with OR 2.54 (95% CI 1.69-3.82) for breast cancer in the highest versus lowest quartile. In a multivariable model, the PRS, family history, and breast density remained strong risk factors. The AUROC of the PRS was 0.60 (95% CI 0.57-0.64), and an Asian-specific PRS had AUROC 0.64 (95% CI 0.53-0.74). A combined model including the BCSC risk factors and PRS had better discrimination than the BCSC model (AUROC 0.65 versus 0.62, p = 0.01). The BCSC-PRS model classified 18% of cases as high-risk (5-year risk ≥ 3%), compared with 7% using the BCSC model. Conclusion The PRS improved discrimination of the BCSC risk model and classified more cases as high-risk. Impact Further consideration of the PRS's role in decision-making around screening and prevention strategies is merited. PMID:27565998
Nanoparticle Delivery of miR-34a Eradicates Long-term-cultured Breast Cancer Stem Cells via Targeting C22ORF28 Directly

PubMed Central

Lin, Xiaoti; Chen, Weiyu; Wei, Fengqin; Zhou, Binhua P.; Hung, Mien-Chie; Xie, Xiaoming

2017-01-01

Rationale: Cancer stem cells (CSCs) have been implicated as the seeds of therapeutic resistance and metastasis, due to their unique abilities of self-renew, wide differentiation potentials and resistance to most conventional therapies. It is a proactive strategy for cancer therapy to eradicate CSCs. Methods: Tumor tissue-derived breast CSCs (BCSC), including XM322 and XM607, were isolated by fluorescence-activated cell sorting (FACS); while cell line-derived BCSC, including MDA-MB-231.SC and MCF-7.SC, were purified by magnetic-activated cell sorting (MACS). Analyses of microRNA and mRNA expression array profiles were performed in multiple breast cell lines. The mentioned nanoparticles were constructed following the standard molecular cloning protocol. Tissue microarray analysis has been used to study 217 cases of clinical breast cancer specimens. Results: Here, we have successfully established four long-term maintenance BCSC that retain their tumor-initiating biological properties. Our analyses of microarray and qRT-PCR explored that miR-34a is the most pronounced microRNA for investigation of BCSC. We establish hTERT promoter-driven VISA delivery of miR-34a (TV-miR-34a) plasmid that can induce high throughput of miR-34a expression in BCSC. TV-miR-34a significantly inhibited the tumor-initiating properties of long-term-cultured BCSC in vitro and reduced the proliferation of BCSC in vivo by an efficient and safe way. TV-miR-34a synergizes with docetaxel, a standard therapy for invasive breast cancer, to act as a BCSC inhibitor. Further mechanistic investigation indicates that TV-miR-34a directly prevents C22ORF28 accumulation, which abrogates clonogenicity and tumor growth and correlates with low miR-34 and high C22ORF28 levels in breast cancer patients. Conclusion: Taken together, we generated four long-term maintenance BCSC derived from either clinical specimens or cell lines, which would be greatly beneficial to the research progress in breast cancer patients. We further developed the non-viral TV-miR-34a plasmid, which has a great potential to be applied as a clinical application for breast cancer therapy. PMID:29187905
Association of BCSC-1 and MMP-14 with human breast cancer.

PubMed

Di, Dalin; Chen, Lei; Guo, Yingying; Wang, Lina; Wang, Huidong; Ju, Jiyu

2018-04-01

Breast cancer suppressor candidate-1 (BCSC-1) is a candidate tumor suppressor gene that was identified recently. Decreased levels of BCSC-1 have been detected in a variety of cancer types in previous studies. Matrix metalloproteinase (MMP)-14 is a membrane-type MMP that plays an important role in tumor progression and prognosis. Previous research has indicated that MMP-14 is highly expressed in different cancer types and promotes tumor invasion or metastasis by remodeling the extracellular matrix. However, there have been few reports on BCSC-1 and MMP-14 in human breast cancer in recent years. In the present study, the association of BCSC-1 and MMP-14 with human breast cancer was investigated. The immunohistochemical analysis results revealed reduced expression of BCSC-1 and overexpression of MMP-14 in breast cancer tissues compared with adjacent normal breast tissues. Quantitative polymerase chain reaction and western blot analyses also showed that BCSC-1 was expressed at significantly lower levels, and that MMP-14 was expressed at significantly higher levels in breast cancer tissues compared with healthy breast tissue. Furthermore, decreased expression of BCSC-1 and overexpression of MMP-14 were associated with tumor cellular differentiation, lymph node metastasis and distant metastasis. A correlational analysis between BCSC-1 and MMP-14 was also conducted, and the results indicated a negative correlation between the two. In conclusion, the current findings indicate that BCSC-1 is downregulated, while MMP-14 is overexpressed in human breast cancer. These two genes may play important roles during the process of human breast cancer development.
Prolyl isomerase Pin1 acts downstream of miR-200 to promote cancer stem-like cell traits in breast cancer

PubMed Central

Luo, Man-Li; Gong, Chang; Chen, Chun-Hau; Lee, Daniel Y.; Hu, Hai; Huang, Pengyu; Yao, Yandan; Guo, Wenjun; Reinhardt, Ferenc; Wulf, Gerburg; Lieberman, Judy; Zhou, Xiao Zhen; Song, Erwei; Lu, Kun Ping

2014-01-01

Breast cancer stem-like cells (BCSC) have been implicated in tumor growth, metastasis, drug resistance and relapse but druggable targets in appropriate subsets of this cell population have yet to be identified. Here we identify a fundamental role for the prolyl isomerase Pin1 in driving BCSC expansion, invasiveness and tumorigenicity, defining it as a key target of miR-200c which is known to be a critical regluator in BSCS. Pin1 overexpression expanded the growth and tumorigenicity of BCSC and triggered epithelial-mesenchymal transition (EMT). Conversely, genetic or pharmaacological inhibition of Pin1 reduced the abundance and self-renewal activity of BCSC. Moreover, moderate overexpression of miR-200c-resistant Pin1 rescued the BCSC defect in miR-200c-expressing cells. Genetic deletion of Pin1 also decreased the abundance and repopulating capability of normal mouse mammary stem cells. In human cells freshly isolated from reduction mammoplasty tissues, Pin1 overexpression endowed BCSC traits to normal breast epithelial cells, expanding both luminal and basal/myoepithelial lineages in these cells. In contrast, Pin1 silencing in primary breast cancer cells isolated from clinical samples inhibited the expansion, self-renewal activity and tumorigenesis of BCSC in vitro and in vivo. Overall, our work demonstrated that Pin1 is a pivotal regulator acting downstream of miR-200c to drive BCSC and breast tumorigenicity, highlighting a new therapeutic target to eradicate BCSC. PMID:24786790
Comparing sensitivity and specificity of screening mammography in the United States and Denmark

PubMed Central

Jacobsen, Katja Kemp; O'Meara, Ellen S.; Key, Dustin; Buist, Diana SM; Kerlikowske, Karla; Vejborg, Ilse; Sprague, Brian L.; Lynge, Elsebeth; von Euler-Chelpin, My

2015-01-01

Delivery of screening mammography differs substantially between the United States (US) and Denmark. We evaluate whether there are differences in screening sensitivity and specificity. We included screens from women screened at age 50-69 years during 1996-2008/2009 in the US Breast Cancer Surveillance Consortium (BCSC) (n=2,872,791), and from two population-based mammography screening programs in Denmark (Copenhagen, n=148,156 and Funen, n=275,553). Women were followed for one year. For initial screens, recall rate was significantly higher in BCSC (17.6%) than in Copenhagen (4.3%) and Funen (3.1%). Sensitivity was fairly similar in BCSC (91.8%) and Copenhagen (90.5%) and Funen (92.5%). At subsequent screens, recall rates were 8.8%, 1.8% and 1.4% in BCSC, Copenhagen and Funen, respectively. The BCSC sensitivity (82.3%) was lower compared to Copenhagen (88.9%) and Funen (86.9%), but when stratified by time since last screen, the sensitivity was similar. For both initial and subsequent screening, the specificity of screening in BCSC (83.2 and 91.6%) was significantly lower than in Copenhagen (96.6 and 98.8%) and Funen. (97.9 and 99.2%). Taking time since last screen into account, American and Danish women had the same probability of having their asymptomatic cancers detected at screening. However, the majority of women free of asymptomatic cancers experienced more harms in terms of false-positive findings in the US than in Denmark. PMID:25944711
Relationship of Predicted Risk of Developing Invasive Breast Cancer, as Assessed with Three Models, and Breast Cancer Mortality among Breast Cancer Patients

PubMed Central

Pfeiffer, Ruth M.; Miglioretti, Diana L.; Kerlikowske, Karla; Tice, Jeffery; Vacek, Pamela M.; Gierach, Gretchen L.

2016-01-01

Purpose Breast cancer risk prediction models are used to plan clinical trials and counsel women; however, relationships of predicted risks of breast cancer incidence and prognosis after breast cancer diagnosis are unknown. Methods Using largely pre-diagnostic information from the Breast Cancer Surveillance Consortium (BCSC) for 37,939 invasive breast cancers (1996–2007), we estimated 5-year breast cancer risk (<1%; 1–1.66%; ≥1.67%) with three models: BCSC 1-year risk model (BCSC-1; adapted to 5-year predictions); Breast Cancer Risk Assessment Tool (BCRAT); and BCSC 5-year risk model (BCSC-5). Breast cancer-specific mortality post-diagnosis (range: 1–13 years; median: 5.4–5.6 years) was related to predicted risk of developing breast cancer using unadjusted Cox proportional hazards models, and in age-stratified (35–44; 45–54; 55–69; 70–89 years) models adjusted for continuous age, BCSC registry, calendar period, income, mode of presentation, stage and treatment. Mean age at diagnosis was 60 years. Results Of 6,021 deaths, 2,993 (49.7%) were ascribed to breast cancer. In unadjusted case-only analyses, predicted breast cancer risk ≥1.67% versus <1.0% was associated with lower risk of breast cancer death; BCSC-1: hazard ratio (HR) = 0.82 (95% CI = 0.75–0.90); BCRAT: HR = 0.72 (95% CI = 0.65–0.81) and BCSC-5: HR = 0.84 (95% CI = 0.75–0.94). Age-stratified, adjusted models showed similar, although mostly non-significant HRs. Among women ages 55–69 years, HRs approximated 1.0. Generally, higher predicted risk was inversely related to percentages of cancers with unfavorable prognostic characteristics, especially among women 35–44 years. Conclusions Among cases assessed with three models, higher predicted risk of developing breast cancer was not associated with greater risk of breast cancer death; thus, these models would have limited utility in planning studies to evaluate breast cancer mortality reduction strategies. Further, when offering women counseling, it may be useful to note that high predicted risk of developing breast cancer does not imply that if cancer develops it will behave aggressively. PMID:27560501
Cloning of the Escherichia coli endo-1,4-D-glucanase gene and identification of its product.

PubMed

Park, Y W; Yun, H D

1999-03-01

A plasmid (pYP17) containing a genomic DNA insert from Escherichia coli K-12 that confers the ability to hydrolyze carboxymethylcellulose (CMC) was isolated from a genomic library constructed in the cosmid vector pLAFR3 in E. coli DH5alpha. A small 1.65-kb fragment, designated bcsC (pYP300), was sequenced and found to contain an ORF of 1,104 bp encoding a protein of 368 amino acid residues, with a calculated molecular weight of 41,700 Da. BcsC carries a typical prokaryotic signal peptide of 21 amino acid residues. The predicted amino acid sequence of the BcsC protein is similar to that of CelY of Erwinia chrysanthemi, CMCase of Cellulomonas uda, EngX of Acetobacter xylinum, and CelC of Agrobacterium tumefaciens. Based on these sequence similarities, we propose that the bcsC gene is a member of glycosyl hydrolase family 8. The apparent molecular mass of the protein, when expressed in E. coli, is approximately 40 kDa, and the CMCase activity is found mainly in the extracellular space. The enzyme is optimally active at pH 7 and a temperature of 40 degrees C.
Breast Density and Benign Breast Disease: Risk Assessment to Identify Women at High Risk of Breast Cancer.

PubMed

Tice, Jeffrey A; Miglioretti, Diana L; Li, Chin-Shang; Vachon, Celine M; Gard, Charlotte C; Kerlikowske, Karla

2015-10-01

Women with proliferative breast lesions are candidates for primary prevention, but few risk models incorporate benign findings to assess breast cancer risk. We incorporated benign breast disease (BBD) diagnoses into the Breast Cancer Surveillance Consortium (BCSC) risk model, the only breast cancer risk assessment tool that uses breast density. We developed and validated a competing-risk model using 2000 to 2010 SEER data for breast cancer incidence and 2010 vital statistics to adjust for the competing risk of death. We used Cox proportional hazards regression to estimate the relative hazards for age, race/ethnicity, family history of breast cancer, history of breast biopsy, BBD diagnoses, and breast density in the BCSC. We included 1,135,977 women age 35 to 74 years undergoing mammography with no history of breast cancer; 17% of the women had a prior breast biopsy. During a mean follow-up of 6.9 years, 17,908 women were diagnosed with invasive breast cancer. The BCSC BBD model slightly overpredicted risk (expected-to-observed ratio, 1.04; 95% CI, 1.03 to 1.06) and had modest discriminatory accuracy (area under the receiver operator characteristic curve, 0.665). Among women with proliferative findings, adding BBD to the model increased the proportion of women with an estimated 5-year risk of 3% or higher from 9.3% to 27.8% (P<.001). The BCSC BBD model accurately estimates women's risk for breast cancer using breast density and BBD diagnoses. Greater numbers of high-risk women eligible for primary prevention after BBD diagnosis are identified using the BCSC BBD model. © 2015 by American Society of Clinical Oncology.
Breast Density and Benign Breast Disease: Risk Assessment to Identify Women at High Risk of Breast Cancer

PubMed Central

Tice, Jeffrey A.; Miglioretti, Diana L.; Li, Chin-Shang; Vachon, Celine M.; Gard, Charlotte C.; Kerlikowske, Karla

2015-01-01

Purpose Women with proliferative breast lesions are candidates for primary prevention, but few risk models incorporate benign findings to assess breast cancer risk. We incorporated benign breast disease (BBD) diagnoses into the Breast Cancer Surveillance Consortium (BCSC) risk model, the only breast cancer risk assessment tool that uses breast density. Methods We developed and validated a competing-risk model using 2000 to 2010 SEER data for breast cancer incidence and 2010 vital statistics to adjust for the competing risk of death. We used Cox proportional hazards regression to estimate the relative hazards for age, race/ethnicity, family history of breast cancer, history of breast biopsy, BBD diagnoses, and breast density in the BCSC. Results We included 1,135,977 women age 35 to 74 years undergoing mammography with no history of breast cancer; 17% of the women had a prior breast biopsy. During a mean follow-up of 6.9 years, 17,908 women were diagnosed with invasive breast cancer. The BCSC BBD model slightly overpredicted risk (expected-to-observed ratio, 1.04; 95% CI, 1.03 to 1.06) and had modest discriminatory accuracy (area under the receiver operator characteristic curve, 0.665). Among women with proliferative findings, adding BBD to the model increased the proportion of women with an estimated 5-year risk of 3% or higher from 9.3% to 27.8% (P < .001). Conclusion The BCSC BBD model accurately estimates women's risk for breast cancer using breast density and BBD diagnoses. Greater numbers of high-risk women eligible for primary prevention after BBD diagnosis are identified using the BCSC BBD model. PMID:26282663
Joint relative risks for estrogen receptor-positive breast cancer from a clinical model, polygenic risk score, and sex hormones.

PubMed

Shieh, Yiwey; Hu, Donglei; Ma, Lin; Huntsman, Scott; Gard, Charlotte C; Leung, Jessica W T; Tice, Jeffrey A; Ziv, Elad; Kerlikowske, Karla; Cummings, Steven R

2017-11-01

Models that predict the risk of estrogen receptor (ER)-positive breast cancers may improve our ability to target chemoprevention. We investigated the contributions of sex hormones to the discrimination of the Breast Cancer Surveillance Consortium (BCSC) risk model and a polygenic risk score comprised of 83 single nucleotide polymorphisms. We conducted a nested case-control study of 110 women with ER-positive breast cancers and 214 matched controls within a mammography screening cohort. Participants were postmenopausal and not on hormonal therapy. The associations of estradiol, estrone, testosterone, and sex hormone binding globulin with ER-positive breast cancer were evaluated using conditional logistic regression. We assessed the individual and combined discrimination of estradiol, the BCSC risk score, and polygenic risk score using the area under the receiver operating characteristic curve (AUROC). Of the sex hormones assessed, estradiol (OR 3.64, 95% CI 1.64-8.06 for top vs bottom quartile), and to a lesser degree estrone, was most strongly associated with ER-positive breast cancer in unadjusted analysis. The BCSC risk score (OR 1.32, 95% CI 1.00-1.75 per 1% increase) and polygenic risk score (OR 1.58, 95% CI 1.06-2.36 per standard deviation) were also associated with ER-positive cancers. A model containing the BCSC risk score, polygenic risk score, and estradiol levels showed good discrimination for ER-positive cancers (AUROC 0.72, 95% CI 0.65-0.79), representing a significant improvement over the BCSC risk score (AUROC 0.58, 95% CI 0.50-0.65). Adding estradiol and a polygenic risk score to a clinical risk model improves discrimination for postmenopausal ER-positive breast cancers.
Hypoxia-inducible factors regulate pluripotency factor expression by ZNF217- and ALKBH5-mediated modulation of RNA methylation in breast cancer cells.

PubMed

Zhang, Chuanzhao; Zhi, Wanqing Iris; Lu, Haiquan; Samanta, Debangshu; Chen, Ivan; Gabrielson, Edward; Semenza, Gregg L

2016-10-04

Exposure of breast cancer cells to hypoxia increases the percentage of breast cancer stem cells (BCSCs), which are required for tumor initiation and metastasis, and this response is dependent on the activity of hypoxia-inducible factors (HIFs). We previously reported that exposure of breast cancer cells to hypoxia induces the ALKBH5-mediated demethylation of N6-methyladenosine (m6A) in NANOG mRNA leading to increased expression of NANOG, which is a pluripotency factor that promotes BCSC specification. Here we report that exposure of breast cancer cells to hypoxia also induces ZNF217-dependent inhibition of m6A methylation of mRNAs encoding NANOG and KLF4, which is another pluripotency factor that mediates BCSC specification. Although hypoxia induced the BCSC phenotype in all breast-cancer cell lines analyzed, it did so through variable induction of pluripotency factors and ALKBH5 or ZNF217. However, in every breast cancer line, the hypoxic induction of pluripotency factor and ALKBH5 or ZNF217 expression was HIF-dependent. Immunohistochemistry revealed that expression of HIF-1α and ALKBH5 was concordant in all human breast cancer biopsies analyzed. ALKBH5 knockdown in MDA-MB-231 breast cancer cells significantly decreased metastasis from breast to lungs in immunodeficient mice. Thus, HIFs stimulate pluripotency factor expression and BCSC specification by negative regulation of RNA methylation.
Inhibition of Notch signaling alters the phenotype of orthotopic tumors formed from glioblastoma multiforme neurosphere cells but does not hamper intracranial tumor growth regardless of endogene Notch pathway signature.

PubMed

Kristoffersen, Karina; Nedergaard, Mette Kjølhede; Villingshøj, Mette; Borup, Rehannah; Broholm, Helle; Kjær, Andreas; Poulsen, Hans Skovgaard; Stockhausen, Marie-Thérése

2014-07-01

Brain cancer stem-like cells (bCSC) are cancer cells with neural stem cell (NSC)-like properties found in the devastating brain tumor glioblastoma multiforme (GBM). bCSC are proposed a central role in tumor initiation, progression, treatment resistance and relapse and as such present a promising target in GBM research. The Notch signaling pathway is often deregulated in GBM and we have previously characterized GBM-derived bCSC cultures based on their expression of the Notch-1 receptor and found that it could be used as predictive marker for the effect of Notch inhibition. The aim of the present project was therefore to further elucidate the significance of Notch pathway activity for the tumorigenic properties of GBM-derived bCSC. Human-derived GBM xenograft cells previously established as NSC-like neurosphere cultures were used. Notch inhibition was accomplished by exposing the cells to the gamma-secretase inhibitor DAPT prior to gene expression analysis and intracranial injection into immunocompromised mice. By analyzing the expression of several Notch pathway components, we found that the cultures indeed displayed different Notch pathway signatures. However, when DAPT-treated neurosphere cells were injected into the brain of immunocompromised mice, no increase in survival was obtained regardless of Notch pathway signature and Notch inhibition. We did however observe a decrease in the expression of the stem cell marker Nestin, an increase in the proliferative marker Ki-67 and an increased number of abnormal vessels in tumors formed from DAPT-treated, high Notch-1 expressing cultures, when compared with the control. Based on the presented results we propose that Notch inhibition partly induces differentiation of bCSC, and selects for a cell type that more strongly induces angiogenesis if the treatment is not sustained. However, this more differentiated cell type might prove to be more sensitive to conventional therapies.
Combining quantitative and qualitative breast density measures to assess breast cancer risk.

PubMed

Kerlikowske, Karla; Ma, Lin; Scott, Christopher G; Mahmoudzadeh, Amir P; Jensen, Matthew R; Sprague, Brian L; Henderson, Louise M; Pankratz, V Shane; Cummings, Steven R; Miglioretti, Diana L; Vachon, Celine M; Shepherd, John A

2017-08-22

Accurately identifying women with dense breasts (Breast Imaging Reporting and Data System [BI-RADS] heterogeneously or extremely dense) who are at high breast cancer risk will facilitate discussions of supplemental imaging and primary prevention. We examined the independent contribution of dense breast volume and BI-RADS breast density to predict invasive breast cancer and whether dense breast volume combined with Breast Cancer Surveillance Consortium (BCSC) risk model factors (age, race/ethnicity, family history of breast cancer, history of breast biopsy, and BI-RADS breast density) improves identifying women with dense breasts at high breast cancer risk. We conducted a case-control study of 1720 women with invasive cancer and 3686 control subjects. We calculated ORs and 95% CIs for the effect of BI-RADS breast density and Volpara™ automated dense breast volume on invasive cancer risk, adjusting for other BCSC risk model factors plus body mass index (BMI), and we compared C-statistics between models. We calculated BCSC 5-year breast cancer risk, incorporating the adjusted ORs associated with dense breast volume. Compared with women with BI-RADS scattered fibroglandular densities and second-quartile dense breast volume, women with BI-RADS extremely dense breasts and third- or fourth-quartile dense breast volume (75% of women with extremely dense breasts) had high breast cancer risk (OR 2.87, 95% CI 1.84-4.47, and OR 2.56, 95% CI 1.87-3.52, respectively), whereas women with extremely dense breasts and first- or second-quartile dense breast volume were not at significantly increased breast cancer risk (OR 1.53, 95% CI 0.75-3.09, and OR 1.50, 95% CI 0.82-2.73, respectively). Adding continuous dense breast volume to a model with BCSC risk model factors and BMI increased discriminatory accuracy compared with a model with only BCSC risk model factors (C-statistic 0.639, 95% CI 0.623-0.654, vs. C-statistic 0.614, 95% CI 0.598-0.630, respectively; P < 0.001). Women with dense breasts and fourth-quartile dense breast volume had a BCSC 5-year risk of 2.5%, whereas women with dense breasts and first-quartile dense breast volume had a 5-year risk ≤ 1.8%. Risk models with automated dense breast volume combined with BI-RADS breast density may better identify women with dense breasts at high breast cancer risk than risk models with either measure alone.
The contributions of breast density and common genetic variation to breast cancer risk.

PubMed

Vachon, Celine M; Pankratz, V Shane; Scott, Christopher G; Haeberle, Lothar; Ziv, Elad; Jensen, Matthew R; Brandt, Kathleen R; Whaley, Dana H; Olson, Janet E; Heusinger, Katharina; Hack, Carolin C; Jud, Sebastian M; Beckmann, Matthias W; Schulz-Wendtland, Ruediger; Tice, Jeffrey A; Norman, Aaron D; Cunningham, Julie M; Purrington, Kristen S; Easton, Douglas F; Sellers, Thomas A; Kerlikowske, Karla; Fasching, Peter A; Couch, Fergus J

2015-05-01

We evaluated whether a 76-locus polygenic risk score (PRS) and Breast Imaging Reporting and Data System (BI-RADS) breast density were independent risk factors within three studies (1643 case patients, 2397 control patients) using logistic regression models. We incorporated the PRS odds ratio (OR) into the Breast Cancer Surveillance Consortium (BCSC) risk-prediction model while accounting for its attributable risk and compared five-year absolute risk predictions between models using area under the curve (AUC) statistics. All statistical tests were two-sided. BI-RADS density and PRS were independent risk factors across all three studies (P interaction = .23). Relative to those with scattered fibroglandular densities and average PRS (2(nd) quartile), women with extreme density and highest quartile PRS had 2.7-fold (95% confidence interval [CI] = 1.74 to 4.12) increased risk, while those with low density and PRS had reduced risk (OR = 0.30, 95% CI = 0.18 to 0.51). PRS added independent information (P < .001) to the BCSC model and improved discriminatory accuracy from AUC = 0.66 to AUC = 0.69. Although the BCSC-PRS model was well calibrated in case-control data, independent cohort data are needed to test calibration in the general population. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
H19/let-7/LIN28 reciprocal negative regulatory circuit promotes breast cancer stem cell maintenance

PubMed Central

Peng, Fei; Li, Ting-Ting; Wang, Kai-Li; Xiao, Guo-Qing; Wang, Ju-Hong; Zhao, Hai-Dong; Kang, Zhi-Jie; Fan, Wen-Jun; Zhu, Li-Li; Li, Mei; Cui, Bai; Zheng, Fei-Meng; Wang, Hong-Jiang; Lam, Eric W-F; Wang, Bo; Xu, Jie; Liu, Quentin

2017-01-01

Long noncoding RNA-H19 (H19), an imprinted oncofetal gene, has a central role in carcinogenesis. Hitherto, the mechanism by which H19 regulates cancer stem cells, remains elusive. Here we show that breast cancer stem cells (BCSCs) express high levels of H19, and ectopic overexpression of H19 significantly promotes breast cancer cell clonogenicity, migration and mammosphere-forming ability. Conversely, silencing of H19 represses these BCSC properties. In concordance, knockdown of H19 markedly inhibits tumor growth and suppresses tumorigenesis in nude mice. Mechanistically, we found that H19 functions as a competing endogenous RNA to sponge miRNA let-7, leading to an increase in expression of a let-7 target, the core pluripotency factor LIN28, which is enriched in BCSC populations and breast patient samples. Intriguingly, this gain of LIN28 expression can also feedback to reverse the H19 loss-mediated suppression of BCSC properties. Our data also reveal that LIN28 blocks mature let-7 production and, thereby, de-represses H19 expression in breast cancer cells. Appropriately, H19 and LIN28 expression exhibits strong correlations in primary breast carcinomas. Collectively, these findings reveal that lncRNA H19, miRNA let-7 and transcriptional factor LIN28 form a double-negative feedback loop, which has a critical role in the maintenance of BCSCs. Consequently, disrupting this pathway provides a novel therapeutic strategy for breast cancer. PMID:28102845
Notch-1-PTEN-ERK1/2 signaling axis promotes HER2+ breast cancer cell proliferation and stem cell survival.

PubMed

Baker, Andrew; Wyatt, Debra; Bocchetta, Maurizio; Li, Jun; Filipovic, Aleksandra; Green, Andrew; Peiffer, Daniel S; Fuqua, Suzanne; Miele, Lucio; Albain, Kathy S; Osipo, Clodia

2018-05-10

Trastuzumab targets the HER2 receptor on breast cancer cells to attenuate HER2-driven tumor growth. However, resistance to trastuzumab-based therapy remains a major clinical problem for women with HER2+ breast cancer. Breast cancer stem cells (BCSCs) are suggested to be responsible for drug resistance and tumor recurrence. Notch signaling has been shown to promote BCSC survival and self-renewal. Trastuzumab-resistant cells have increased Notch-1 expression. Notch signaling drives cell proliferation in vitro and is required for tumor recurrence in vivo. We demonstrate herein a mechanism by which Notch-1 is required for trastuzumab resistance by repressing PTEN expression to contribute to activation of ERK1/2 signaling. Furthermore, Notch-1-mediated inhibition of PTEN is necessary for BCSC survival in vitro and in vivo. Inhibition of MEK1/2-ERK1/2 signaling in trastuzumab-resistant breast cancer cells mimics effects of Notch-1 knockdown on bulk cell proliferation and BCSC survival. These findings suggest that Notch-1 contributes to trastuzumab resistance by repressing PTEN and this may lead to hyperactivation of ERK1/2 signaling. Furthermore, high Notch-1 and low PTEN mRNA expression may predict poorer overall survival in women with breast cancer. Notch-1 protein expression predicts poorer survival in women with HER2+ breast cancer. These results support a potential future clinical trial combining anti-Notch-1 and anti-MEK/ERK therapy for trastuzumab-resistant breast cancer.
Long non-coding RNA FEZF1-AS1 promotes breast cancer stemness and tumorigenesis via targeting miR-30a/Nanog axis.

PubMed

Zhang, Zhi; Sun, Liwei; Zhang, Yixuan; Lu, Guanming; Li, Yongqiang; Wei, Zhongheng

2018-05-24

Long non-coding RNAs (lncRNAs) have been verified to modulate the tumorigenesis of breast cancer at multiple levels. In present study, we aim to investigate the role of lncRNA FEZF1-AS1 on breast cancer-stem like cells (BCSC) and the potential regulatory mechanism. In breast cancer tissue, lncRNA FEZF1-AS1 was up-regulated compared with controls and indicated poor prognosis of breast cancer patients. In vitro experiments, FEZF1-AS1 was significantly over-expressed in breast cancer cells, especially in sphere subpopulation compared with parental subpopulation. Loss-of-functional indicated that, in BCSC cells (MDA-MB-231 CSC, MCF-7 CSC), FEZF1-AS1 knockdown reduced the CD44 + /CD24 - rate, the mammosphere-forming ability, stem factors (Nanog, Oct4, SOX2), and inhibited the proliferation, migration and invasion. In vivo, FEZF1-AS1 knockdown inhibited the breast cancer cells growth. Bioinformatics analysis tools and series of validation experiments confirmed that FEZF1-AS1 modulated BCSC and Nanog expression through sponging miR-30a, suggesting the regulation of FEZF1-AS1/miR-30a/Nanog. In summary, our study validate the important role of FEZF1-AS1/miR-30a/Nanog in breast cancer stemness and tumorigenesis, providing a novel insight and treatment strategy for breast cancer. © 2018 The Authors. Journal of Cellular Physiology Published by Wiley Periodicals Inc.

Elimination of epithelial-like and mesenchymal-like breast cancer stem cells to inhibit metastasis following nanoparticle-mediated photothermal therapy.

PubMed

Paholak, Hayley J; Stevers, Nicholas O; Chen, Hongwei; Burnett, Joseph P; He, Miao; Korkaya, Hasan; McDermott, Sean P; Deol, Yadwinder; Clouthier, Shawn G; Luther, Tahra; Li, Qiao; Wicha, Max S; Sun, Duxin

2016-10-01

Increasing evidence suggesting breast cancer stem cells (BCSCs) drive metastasis and evade traditional therapies underscores a critical need to exploit the untapped potential of nanotechnology to develop innovative therapies that will significantly improve patient survival. Photothermal therapy (PTT) to induce localized hyperthermia is one of few nanoparticle-based treatments to enter clinical trials in human cancer patients, and has recently gained attention for its ability to induce a systemic immune response targeting distal cancer cells in mouse models. Here, we first conduct classic cancer stem cell (CSC) assays, both in vitro and in immune-compromised mice, to demonstrate that PTT mediated by highly crystallized iron oxide nanoparticles effectively eliminates BCSCs in translational models of triple negative breast cancer. PTT in vitro preferentially targets epithelial-like ALDH + BCSCs, followed by mesenchymal-like CD44+/CD24- BCSCs, compared to bulk cancer cells. PTT inhibits BCSC self-renewal through reduction of mammosphere formation in primary and secondary generations. Secondary implantation in NOD/SCID mice reveals the ability of PTT to impede BCSC-driven tumor formation. Next, we explore the translational potential of PTT using metastatic and immune-competent mouse models. PTT to inhibit BCSCs significantly reduces metastasis to the lung and lymph nodes. In immune-competent BALB/c mice, PTT effectively eliminates ALDH + BCSCs. These results suggest the feasibility of incorporating PTT into standard clinical treatments such as surgery to enhance BCSC destruction and inhibit metastasis, and the potential of such combination therapy to improve long-term survival in patients with metastatic breast cancer. Copyright © 2016 Elsevier Ltd. All rights reserved.
JAK/STAT3-Regulated Fatty Acid β-Oxidation Is Critical for Breast Cancer Stem Cell Self-Renewal and Chemoresistance.

PubMed

Wang, Tianyi; Fahrmann, Johannes Francois; Lee, Heehyoung; Li, Yi-Jia; Tripathi, Satyendra C; Yue, Chanyu; Zhang, Chunyan; Lifshitz, Veronica; Song, Jieun; Yuan, Yuan; Somlo, George; Jandial, Rahul; Ann, David; Hanash, Samir; Jove, Richard; Yu, Hua

2018-01-09

Cancer stem cells (CSCs) are critical for cancer progression and chemoresistance. How lipid metabolism regulates CSCs and chemoresistance remains elusive. Here, we demonstrate that JAK/STAT3 regulates lipid metabolism, which promotes breast CSCs (BCSCs) and cancer chemoresistance. Inhibiting JAK/STAT3 blocks BCSC self-renewal and expression of diverse lipid metabolic genes, including carnitine palmitoyltransferase 1B (CPT1B), which encodes the critical enzyme for fatty acid β-oxidation (FAO). Moreover, mammary-adipocyte-derived leptin upregulates STAT3-induced CPT1B expression and FAO activity in BCSCs. Human breast-cancer-derived data suggest that the STAT3-CPT1B-FAO pathway promotes cancer cell stemness and chemoresistance. Blocking FAO and/or leptin re-sensitizes them to chemotherapy and inhibits BCSCs in mouse breast tumors in vivo. We identify a critical pathway for BCSC maintenance and breast cancer chemoresistance. Copyright © 2017 Elsevier Inc. All rights reserved.
BAG3 promotes stem cell-like phenotype in breast cancer by upregulation of CXCR4 via interaction with its transcript.

PubMed

Liu, Bao-Qin; Zhang, Song; Li, Si; An, Ming-Xin; Li, Chao; Yan, Jing; Wang, Jia-Mei; Wang, Hua-Qin

2017-07-13

BAG3 is an evolutionarily conserved co-chaperone expressed at high levels and has a prosurvival role in many tumor types. The current study reported that BAG3 was induced under specific floating culture conditions that enrich breast cancer stem cell (BCSC)-like cells in spheres. Ectopic BAG3 overexpression increased CD44 + /CD24 - CSC subpopulations, first-generation and second-generation mammosphere formation, indicating that BAG3 promotes CSC self-renewal and maintenance in breast cancer. We further demonstrated that mechanically, BAG3 upregulated CXCR4 expression at the post-transcriptional level. Further studies showed that BAG3 interacted with CXCR4 mRNA and promoted its expression via its coding and 3'-untranslational regions. BAG3 was also found to be positively correlated with CXCR4 expression and unfavorable prognosis in patients with breast cancer. Taken together, our data demonstrate that BAG3 promotes BCSC-like phenotype through CXCR4 via interaction with its transcript. Therefore, this study establishes BAG3 as a potential adverse prognostic factor and a therapeutic target of breast cancer.
Identifying women with dense breasts at high risk for interval cancer: a cohort study.

PubMed

Kerlikowske, Karla; Zhu, Weiwei; Tosteson, Anna N A; Sprague, Brian L; Tice, Jeffrey A; Lehman, Constance D; Miglioretti, Diana L

2015-05-19

Twenty-one states have laws requiring that women be notified if they have dense breasts and that they be advised to discuss supplemental imaging with their provider. To better direct discussions of supplemental imaging by determining which combinations of breast cancer risk and Breast Imaging Reporting and Data System (BI-RADS) breast density categories are associated with high interval cancer rates. Prospective cohort. Breast Cancer Surveillance Consortium (BCSC) breast imaging facilities. 365,426 women aged 40 to 74 years who had 831,455 digital screening mammography examinations. BI-RADS breast density, BCSC 5-year breast cancer risk, and interval cancer rate (invasive cancer ≤12 months after a normal mammography result) per 1000 mammography examinations. High interval cancer rate was defined as more than 1 case per 1000 examinations. High interval cancer rates were observed for women with 5-year risk of 1.67% or greater and extremely dense breasts or 5-year risk of 2.50% or greater and heterogeneously dense breasts (24% of all women with dense breasts). The interval rate of advanced-stage disease was highest (>0.4 case per 1000 examinations) among women with 5-year risk of 2.50% or greater and heterogeneously or extremely dense breasts (21% of all women with dense breasts). Five-year risk was low to average (0% to 1.66%) for 51.0% of women with heterogeneously dense breasts and 52.5% with extremely dense breasts, with interval cancer rates of 0.58 to 0.63 and 0.72 to 0.89 case per 1000 examinations, respectively. The benefit of supplemental imaging was not assessed. Breast density should not be the sole criterion for deciding whether supplemental imaging is justified because not all women with dense breasts have high interval cancer rates. BCSC 5-year risk combined with BI-RADS breast density can identify women at high risk for interval cancer to inform patient-provider discussions about alternative screening strategies. National Cancer Institute.
Equilibrium of adsorption of mixed milk protein/surfactant solutions at the water/air interface.

PubMed

Kotsmar, C; Grigoriev, D O; Xu, F; Aksenenko, E V; Fainerman, V B; Leser, M E; Miller, R

2008-12-16

Ellipsometry and surface profile analysis tensiometry were used to study and compare the adsorption behavior of beta-lactoglobulin (BLG)/C10DMPO, beta-casein (BCS)/C10DMPO and BCS/C12DMPO mixtures at the air/solution interface. The adsorption from protein/surfactant mixed solutions is of competitive nature. The obtained adsorption isotherms suggest a gradual replacement of the protein molecules at the interface with increasing surfactant concentration for all studied mixed systems. The thickness, refractive index, and the adsorbed amount of the respective adsorption layers, determined by ellipsometry, decrease monotonically and reach values close to those for a surface covered only by surfactant molecules, indicating the absence of proteins from a certain surfactant concentration on. These results correlate with the surface tension data. A continuous increase of adsorption layer thickness was observed up to this concentration, caused by the desorption of segments of the protein and transforming the thin surface layer into a rather diffuse and thick one. Replacement and structural changes of the protein molecules are discussed in terms of protein structure and surface activity of surfactant molecules. Theoretical models derived recently were used for the quantitative description of the equilibrium state of the mixed surface layers.
BAG3 promotes stem cell-like phenotype in breast cancer by upregulation of CXCR4 via interaction with its transcript

PubMed Central

Liu, Bao-Qin; Zhang, Song; Li, Si; An, Ming-Xin; Li, Chao; Yan, Jing; Wang, Jia-Mei; Wang, Hua-Qin

2017-01-01

BAG3 is an evolutionarily conserved co-chaperone expressed at high levels and has a prosurvival role in many tumor types. The current study reported that BAG3 was induced under specific floating culture conditions that enrich breast cancer stem cell (BCSC)-like cells in spheres. Ectopic BAG3 overexpression increased CD44+/CD24− CSC subpopulations, first-generation and second-generation mammosphere formation, indicating that BAG3 promotes CSC self-renewal and maintenance in breast cancer. We further demonstrated that mechanically, BAG3 upregulated CXCR4 expression at the post-transcriptional level. Further studies showed that BAG3 interacted with CXCR4 mRNA and promoted its expression via its coding and 3′-untranslational regions. BAG3 was also found to be positively correlated with CXCR4 expression and unfavorable prognosis in patients with breast cancer. Taken together, our data demonstrate that BAG3 promotes BCSC-like phenotype through CXCR4 via interaction with its transcript. Therefore, this study establishes BAG3 as a potential adverse prognostic factor and a therapeutic target of breast cancer. PMID:28703799
HIF2α/EFEMP1 cascade mediates hypoxic effects on breast cancer stem cell hierarchy.

PubMed

Kwak, Ji-Hye; Lee, Na-Hee; Lee, Hwa-Yong; Hong, In-Sun; Nam, Jeong-Seok

2016-07-12

Breast cancer stem cells (BCSCs) have been shown to contribute to tumor growth, metastasis, and recurrence. They are also markedly resistant to conventional cancer treatments, such as chemotherapy and radiation. Recent studies have suggested that hypoxia is one of the prominent micro-environmental factors that increase the self-renewal ability of BCSCs, partially by enhancing CSC phenotypes. Thus, the identification and development of new therapeutic approaches based on targeting the hypoxia-dependent responses in BCSCs is urgent. Through various in vitro studies, we found that hypoxia specifically up-regulates BCSC sphere formation and a subset of CD44+/CD24-/low CSCs. Hypoxia inducible factors 2α (HIF2α) depletion suppressed CSC-like phenotypes and CSC-mediated drug resistance in breast cancer. Furthermore, the stimulatory effects of hypoxia-induced HIF2α on BCSC sphere formation were successfully attenuated by epidermal growth factor-containing fibulin-like extracellular matrix protein 1 (EFEMP1) knockdown. Taken together, these data suggest that HIF2α mediates hypoxia-induced cancer growth/metastasis and that EFEMP1 is a downstream effector of hypoxia-induced HIF2α during breast tumorigenesis.
Diverse profiles of N-acyl-homoserine lactones in biofilm forming strains of Cronobacter sakazakii.

PubMed

Singh, Niharika; Patil, Amrita; Prabhune, Asmita A; Raghav, Mamta; Goel, Gunjan

2017-04-03

The present study investigates the role of quorum sensing (QS) molecules expressed by C. sakazakii in biofilm formation and extracellular polysaccharide expression. The QS signaling was detected using Chromobacterium violaceum 026 and Agrobacterium tumefaciens NTL4(pZLR4) based bioassay. Long chain N-acyl-homoserine lactones (AHLs) with C6- C18 chain length were identified using High Performance Liquid Chromatography and Liquid Chromatography-High Resolution Mass Spectrometry. A higher Specific Biofilm Formation (SBF) index (p < 0.05) with the presence of genes associated with cellulose biosynthesis (bcsA, bcsC and bcsG) was observed in the strains. AHLs and their mechanisms can serve as novel targets for developing technologies to eradicate and prevent biofilm formation by C. sakazakii.
External validation of Medicare claims codes for digital mammography and computer-aided detection.

PubMed

Fenton, Joshua J; Zhu, Weiwei; Balch, Steven; Smith-Bindman, Rebecca; Lindfors, Karen K; Hubbard, Rebecca A

2012-08-01

While Medicare claims are a potential resource for clinical mammography research or quality monitoring, the validity of key data elements remains uncertain. Claims codes for digital mammography and computer-aided detection (CAD), for example, have not been validated against a credible external reference standard. We matched Medicare mammography claims for women who received bilateral mammograms from 2003 to 2006 to corresponding mammography data from the Breast Cancer Surveillance Consortium (BCSC) registries in four U.S. states (N = 253,727 mammograms received by 120,709 women). We assessed the accuracy of the claims-based classifications of bilateral mammograms as either digital versus film and CAD versus non-CAD relative to a reference standard derived from BCSC data. Claims data correctly classified the large majority of film and digital mammograms (97.2% and 97.3%, respectively), yielding excellent agreement beyond chance (κ = 0.90). Claims data correctly classified the large majority of CAD mammograms (96.6%) but a lower percentage of non-CAD mammograms (86.7%). Agreement beyond chance remained high for CAD classification (κ = 0.83). From 2003 to 2006, the predictive values of claims-based digital and CAD classifications increased as the sample prevalences of each technology increased. Medicare claims data can accurately distinguish film and digital bilateral mammograms and mammograms conducted with and without CAD. The validity of Medicare claims data regarding film versus digital mammography and CAD suggests that these data elements can be useful in research and quality improvement. ©2012 AACR.
Benchmarking and Performance Measurement.

ERIC Educational Resources Information Center

Town, J. Stephen

This paper defines benchmarking and its relationship to quality management, describes a project which applied the technique in a library context, and explores the relationship between performance measurement and benchmarking. Numerous benchmarking methods contain similar elements: deciding what to benchmark; identifying partners; gathering…
How to achieve and prove performance improvement - 15 years of experience in German wastewater benchmarking.

PubMed

Bertzbach, F; Franz, T; Möller, K

2012-01-01

This paper shows the results of performance improvement, which have been achieved in benchmarking projects in the wastewater industry in Germany over the last 15 years. A huge number of changes in operational practice and also in achieved annual savings can be shown, induced in particular by benchmarking at process level. Investigation of this question produces some general findings for the inclusion of performance improvement in a benchmarking project and for the communication of its results. Thus, we elaborate on the concept of benchmarking at both utility and process level, which is still a necessary distinction for the integration of performance improvement into our benchmarking approach. To achieve performance improvement via benchmarking it should be made quite clear that this outcome depends, on one hand, on a well conducted benchmarking programme and, on the other, on the individual situation within each participating utility.
Developing integrated benchmarks for DOE performance measurement

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barancik, J.I.; Kramer, C.F.; Thode, Jr. H.C.

1992-09-30

The objectives of this task were to describe and evaluate selected existing sources of information on occupational safety and health with emphasis on hazard and exposure assessment, abatement, training, reporting, and control identifying for exposure and outcome in preparation for developing DOE performance benchmarks. Existing resources and methodologies were assessed for their potential use as practical performance benchmarks. Strengths and limitations of current data resources were identified. Guidelines were outlined for developing new or improved performance factors, which then could become the basis for selecting performance benchmarks. Data bases for non-DOE comparison populations were identified so that DOE performance couldmore » be assessed relative to non-DOE occupational and industrial groups. Systems approaches were described which can be used to link hazards and exposure, event occurrence, and adverse outcome factors, as needed to generate valid, reliable, and predictive performance benchmarks. Data bases were identified which contain information relevant to one or more performance assessment categories . A list of 72 potential performance benchmarks was prepared to illustrate the kinds of information that can be produced through a benchmark development program. Current information resources which may be used to develop potential performance benchmarks are limited. There is need to develop an occupational safety and health information and data system in DOE, which is capable of incorporating demonstrated and documented performance benchmarks prior to, or concurrent with the development of hardware and software. A key to the success of this systems approach is rigorous development and demonstration of performance benchmark equivalents to users of such data before system hardware and software commitments are institutionalized.« less
Breast cancer stem cells, EMT and therapeutic targets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kotiyal, Srishti; Bhattacharya, Susinjan, E-mail: s.bhattacharya@jiit.ac.in

Highlights: • Therapeutic targeting or inhibition of the key molecules of signaling pathways can control growth of breast cancer stem cells (BCSCs). • Development of BCSCs also involves miRNA interactions. • Therapeutic achievement can be done by targeting identified targets in the BCSC pathways. - Abstract: A small heterogeneous population of breast cancer cells acts as seeds to induce new tumor growth. These seeds or breast cancer stem cells (BCSCs) exhibit great phenotypical plasticity which allows them to undergo “epithelial to mesenchymal transition” (EMT) at the site of primary tumor and a future reverse transition. Apart from metastasis they aremore » also responsible for maintaining the tumor and conferring it with drug and radiation resistance and a tendency for post-treatment relapse. Many of the signaling pathways involved in induction of EMT are involved in CSC generation and regulation. Here we are briefly reviewing the mechanism of TGF-β, Wnt, Notch, TNF-α, NF-κB, RTK signalling pathways which are involved in EMT as well as BCSCs maintenance. Therapeutic targeting or inhibition of the key/accessory players of these pathways could control growth of BCSCs and hence malignant cancer. Additionally several miRNAs are dysregulated in cancer stem cells indicating their roles as oncogenes or tumor suppressors. This review also lists the miRNA interactions identified in BCSCs and discusses on some newly identified targets in the BCSC regulatory pathways like SHIP2, nicastrin, Pin 1, IGF-1R, pro-inflammatory cytokines and syndecan which can be targeted for therapeutic achievements.« less
PDS: A Performance Database Server

DOE PAGES

Berry, Michael W.; Dongarra, Jack J.; Larose, Brian H.; ...

1994-01-01

The process of gathering, archiving, and distributing computer benchmark data is a cumbersome task usually performed by computer users and vendors with little coordination. Most important, there is no publicly available central depository of performance data for all ranges of machines from personal computers to supercomputers. We present an Internet-accessible performance database server (PDS) that can be used to extract current benchmark data and literature. As an extension to the X-Windows-based user interface (Xnetlib) to the Netlib archival system, PDS provides an on-line catalog of public domain computer benchmarks such as the LINPACK benchmark, Perfect benchmarks, and the NAS parallelmore » benchmarks. PDS does not reformat or present the benchmark data in any way that conflicts with the original methodology of any particular benchmark; it is thereby devoid of any subjective interpretations of machine performance. We believe that all branches (research laboratories, academia, and industry) of the general computing community can use this facility to archive performance metrics and make them readily available to the public. PDS can provide a more manageable approach to the development and support of a large dynamic database of published performance metrics.« less
Benchmarking: your performance measurement and improvement tool.

PubMed

Senn, G F

2000-01-01

Many respected professional healthcare organizations and societies today are seeking to establish data-driven performance measurement strategies such as benchmarking. Clinicians are, however, resistant to "benchmarking" that is based on financial data alone, concerned that it may be adverse to the patients' best interests. Benchmarking of clinical procedures that uses physician's codes such as Current Procedural Terminology (CPTs) has greater credibility with practitioners. Better Performers, organizations that can perform procedures successfully at lower cost and in less time, become the "benchmark" against which other organizations can measure themselves. The Better Performers' strategies can be adopted by other facilities to save time or money while maintaining quality patient care.
Benchmarking reference services: an introduction.

PubMed

Marshall, J G; Buchanan, H S

1995-01-01

Benchmarking is based on the common sense idea that someone else, either inside or outside of libraries, has found a better way of doing certain things and that your own library's performance can be improved by finding out how others do things and adopting the best practices you find. Benchmarking is one of the tools used for achieving continuous improvement in Total Quality Management (TQM) programs. Although benchmarking can be done on an informal basis, TQM puts considerable emphasis on formal data collection and performance measurement. Used to its full potential, benchmarking can provide a common measuring stick to evaluate process performance. This article introduces the general concept of benchmarking, linking it whenever possible to reference services in health sciences libraries. Data collection instruments that have potential application in benchmarking studies are discussed and the need to develop common measurement tools to facilitate benchmarking is emphasized.
Transaction Processing Performance Council (TPC): State of the Council 2010

NASA Astrophysics Data System (ADS)

Nambiar, Raghunath; Wakou, Nicholas; Carman, Forrest; Majdalany, Michael

The Transaction Processing Performance Council (TPC) is a non-profit corporation founded to define transaction processing and database benchmarks and to disseminate objective, verifiable performance data to the industry. Established in August 1988, the TPC has been integral in shaping the landscape of modern transaction processing and database benchmarks over the past twenty-two years. This paper provides an overview of the TPC's existing benchmark standards and specifications, introduces two new TPC benchmarks under development, and examines the TPC's active involvement in the early creation of additional future benchmarks.
A review on the benchmarking concept in Malaysian construction safety performance

NASA Astrophysics Data System (ADS)

Ishak, Nurfadzillah; Azizan, Muhammad Azizi

2018-02-01

Construction industry is one of the major industries that propels Malaysia's economy in highly contributes to our nation's GDP growth, yet the high fatality rates on construction sites have caused concern among safety practitioners and the stakeholders. Hence, there is a need of benchmarking in performance of Malaysia's construction industry especially in terms of safety. This concept can create a fertile ground for ideas, but only in a receptive environment, organization that share good practices and compare their safety performance against other benefit most to establish improvement in safety culture. This research was conducted to study the awareness important, evaluate current practice and improvement, and also identify the constraint in implement of benchmarking on safety performance in our industry. Additionally, interviews with construction professionals were come out with different views on this concept. Comparison has been done to show the different understanding of benchmarking approach and how safety performance can be benchmarked. But, it's viewed as one mission, which to evaluate objectives identified through benchmarking that will improve the organization's safety performance. Finally, the expected result from this research is to help Malaysia's construction industry implement best practice in safety performance management through the concept of benchmarking.
Proposed biopsy performance benchmarks for MRI based on an audit of a large academic center.

PubMed

Sedora Román, Neda I; Mehta, Tejas S; Sharpe, Richard E; Slanetz, Priscilla J; Venkataraman, Shambhavi; Fein-Zachary, Valerie; Dialani, Vandana

2018-05-01

Performance benchmarks exist for mammography (MG); however, performance benchmarks for magnetic resonance imaging (MRI) are not yet fully developed. The purpose of our study was to perform an MRI audit based on established MG and screening MRI benchmarks and to review whether these benchmarks can be applied to an MRI practice. An IRB approved retrospective review of breast MRIs was performed at our center from 1/1/2011 through 12/31/13. For patients with biopsy recommendation, core biopsy and surgical pathology results were reviewed. The data were used to derive mean performance parameter values, including abnormal interpretation rate (AIR), positive predictive value (PPV), cancer detection rate (CDR), percentage of minimal cancers and axillary node negative cancers and compared with MG and screening MRI benchmarks. MRIs were also divided by screening and diagnostic indications to assess for differences in performance benchmarks amongst these two groups. Of the 2455 MRIs performed over 3-years, 1563 were performed for screening indications and 892 for diagnostic indications. With the exception of PPV2 for screening breast MRIs from 2011 to 2013, PPVs were met for our screening and diagnostic populations when compared to the MRI screening benchmarks established by the Breast Imaging Reporting and Data System (BI-RADS) 5 Atlas ® . AIR and CDR were lower for screening indications as compared to diagnostic indications. New MRI screening benchmarks can be used for screening MRI audits while the American College of Radiology (ACR) desirable goals for diagnostic MG can be used for diagnostic MRI audits. Our study corroborates established findings regarding differences in AIR and CDR amongst screening versus diagnostic indications. © 2017 Wiley Periodicals, Inc.
How to Advance TPC Benchmarks with Dependability Aspects

NASA Astrophysics Data System (ADS)

Almeida, Raquel; Poess, Meikel; Nambiar, Raghunath; Patil, Indira; Vieira, Marco

Transactional systems are the core of the information systems of most organizations. Although there is general acknowledgement that failures in these systems often entail significant impact both on the proceeds and reputation of companies, the benchmarks developed and managed by the Transaction Processing Performance Council (TPC) still maintain their focus on reporting bare performance. Each TPC benchmark has to pass a list of dependability-related tests (to verify ACID properties), but not all benchmarks require measuring their performances. While TPC-E measures the recovery time of some system failures, TPC-H and TPC-C only require functional correctness of such recovery. Consequently, systems used in TPC benchmarks are tuned mostly for performance. In this paper we argue that nowadays systems should be tuned for a more comprehensive suite of dependability tests, and that a dependability metric should be part of TPC benchmark publications. The paper discusses WHY and HOW this can be achieved. Two approaches are introduced and discussed: augmenting each TPC benchmark in a customized way, by extending each specification individually; and pursuing a more unified approach, defining a generic specification that could be adjoined to any TPC benchmark.

Statistical Analysis of NAS Parallel Benchmarks and LINPACK Results

NASA Technical Reports Server (NTRS)

Meuer, Hans-Werner; Simon, Horst D.; Strohmeier, Erich; Lasinski, T. A. (Technical Monitor)

1994-01-01

In the last three years extensive performance data have been reported for parallel machines both based on the NAS Parallel Benchmarks, and on LINPACK. In this study we have used the reported benchmark results and performed a number of statistical experiments using factor, cluster, and regression analyses. In addition to the performance results of LINPACK and the eight NAS parallel benchmarks, we have also included peak performance of the machine, and the LINPACK n and n(sub 1/2) values. Some of the results and observations can be summarized as follows: 1) All benchmarks are strongly correlated with peak performance. 2) LINPACK and EP have each a unique signature. 3) The remaining NPB can grouped into three groups as follows: (CG and IS), (LU and SP), and (MG, FT, and BT). Hence three (or four with EP) benchmarks are sufficient to characterize the overall NPB performance. Our poster presentation will follow a standard poster format, and will present the data of our statistical analysis in detail.
Performance analysis of fusion nuclear-data benchmark experiments for light to heavy materials in MeV energy region with a neutron spectrum shifter

NASA Astrophysics Data System (ADS)

Murata, Isao; Ohta, Masayuki; Miyamaru, Hiroyuki; Kondo, Keitaro; Yoshida, Shigeo; Iida, Toshiyuki; Ochiai, Kentaro; Konno, Chikara

2011-10-01

Nuclear data are indispensable for development of fusion reactor candidate materials. However, benchmarking of the nuclear data in MeV energy region is not yet adequate. In the present study, benchmark performance in the MeV energy region was investigated theoretically for experiments by using a 14 MeV neutron source. We carried out a systematical analysis for light to heavy materials. As a result, the benchmark performance for the neutron spectrum was confirmed to be acceptable, while for gamma-rays it was not sufficiently accurate. Consequently, a spectrum shifter has to be applied. Beryllium had the best performance as a shifter. Moreover, a preliminary examination of whether it is really acceptable that only the spectrum before the last collision is considered in the benchmark performance analysis. It was pointed out that not only the last collision but also earlier collisions should be considered equally in the benchmark performance analysis.
Research on computer systems benchmarking

NASA Technical Reports Server (NTRS)

Smith, Alan Jay (Principal Investigator)

1996-01-01

This grant addresses the topic of research on computer systems benchmarking and is more generally concerned with performance issues in computer systems. This report reviews work in those areas during the period of NASA support under this grant. The bulk of the work performed concerned benchmarking and analysis of CPUs, compilers, caches, and benchmark programs. The first part of this work concerned the issue of benchmark performance prediction. A new approach to benchmarking and machine characterization was reported, using a machine characterizer that measures the performance of a given system in terms of a Fortran abstract machine. Another report focused on analyzing compiler performance. The performance impact of optimization in the context of our methodology for CPU performance characterization was based on the abstract machine model. Benchmark programs are analyzed in another paper. A machine-independent model of program execution was developed to characterize both machine performance and program execution. By merging these machine and program characterizations, execution time can be estimated for arbitrary machine/program combinations. The work was continued into the domain of parallel and vector machines, including the issue of caches in vector processors and multiprocessors. All of the afore-mentioned accomplishments are more specifically summarized in this report, as well as those smaller in magnitude supported by this grant.
Method and system for benchmarking computers

DOEpatents

Gustafson, John L.

1993-09-14

A testing system and method for benchmarking computer systems. The system includes a store containing a scalable set of tasks to be performed to produce a solution in ever-increasing degrees of resolution as a larger number of the tasks are performed. A timing and control module allots to each computer a fixed benchmarking interval in which to perform the stored tasks. Means are provided for determining, after completion of the benchmarking interval, the degree of progress through the scalable set of tasks and for producing a benchmarking rating relating to the degree of progress for each computer.
Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias;

2006-01-01

The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray XI, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks are run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.

Electric-Drive Vehicle Thermal Performance Benchmarking | Transportation

Science.gov Websites

studies are as follows: Characterize the thermal resistance and conductivity of various layers in the Research | NREL Electric-Drive Vehicle Thermal Performance Benchmarking Electric-Drive Vehicle Thermal Performance Benchmarking A photo of the internal components of an automotive inverter. NREL
Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Ye; Ma, Xiaosong; Liu, Qing Gary

2015-01-01

Parallel application benchmarks are indispensable for evaluating/optimizing HPC software and hardware. However, it is very challenging and costly to obtain high-fidelity benchmarks reflecting the scale and complexity of state-of-the-art parallel applications. Hand-extracted synthetic benchmarks are time-and labor-intensive to create. Real applications themselves, while offering most accurate performance evaluation, are expensive to compile, port, reconfigure, and often plainly inaccessible due to security or ownership concerns. This work contributes APPRIME, a novel tool for trace-based automatic parallel benchmark generation. Taking as input standard communication-I/O traces of an application's execution, it couples accurate automatic phase identification with statistical regeneration of event parameters tomore » create compact, portable, and to some degree reconfigurable parallel application benchmarks. Experiments with four NAS Parallel Benchmarks (NPB) and three real scientific simulation codes confirm the fidelity of APPRIME benchmarks. They retain the original applications' performance characteristics, in particular the relative performance across platforms.« less
Performance Against WELCOA's Worksite Health Promotion Benchmarks Across Years Among Selected US Organizations.

PubMed

Weaver, GracieLee M; Mendenhall, Brandon N; Hunnicutt, David; Picarella, Ryan; Leffelman, Brittanie; Perko, Michael; Bibeau, Daniel L

2018-05-01

The purpose of this study was to quantify the performance of organizations' worksite health promotion (WHP) activities against the benchmarking criteria included in the Well Workplace Checklist (WWC). The Wellness Council of America (WELCOA) developed a tool to assess WHP with its 100-item WWC, which represents WELCOA's 7 performance benchmarks. Workplaces. This study includes a convenience sample of organizations who completed the checklist from 2008 to 2015. The sample size was 4643 entries from US organizations. The WWC includes demographic questions, general questions about WHP programs, and scales to measure the performance against the WELCOA 7 benchmarks. Descriptive analyses of WWC items were completed separately for each year of the study period. The majority of the organizations represented each year were multisite, multishift, medium- to large-sized companies mostly in the services industry. Despite yearly changes in participating organizations, results across the WELCOA 7 benchmark scores were consistent year to year. Across all years, benchmarks that organizations performed the lowest were senior-level support, data collection, and programming; wellness teams and supportive environments were the highest scoring benchmarks. In an era marked with economic swings and health-care reform, it appears that organizations are staying consistent in their performance across these benchmarks. The WWC could be useful for organizations, practitioners, and researchers in assessing the quality of WHP programs.
Cross-industry benchmarking: is it applicable to the operating room?

PubMed

Marco, A P; Hart, S

2001-01-01

The use of benchmarking has been growing in nonmedical industries. This concept is being increasingly applied to medicine as the industry strives to improve quality and improve financial performance. Benchmarks can be either internal (set by the institution) or external (use other's performance as a goal). In some industries, benchmarking has crossed industry lines to identify breakthroughs in thinking. In this article, we examine whether the airline industry can be used as a source of external process benchmarking for the operating room.
Evaluating the Effect of Labeled Benchmarks on Children’s Number Line Estimation Performance and Strategy Use

PubMed Central

Peeters, Dominique; Sekeris, Elke; Verschaffel, Lieven; Luwel, Koen

2017-01-01

Some authors argue that age-related improvements in number line estimation (NLE) performance result from changes in strategy use. More specifically, children’s strategy use develops from only using the origin of the number line, to using the origin and the endpoint, to eventually also relying on the midpoint of the number line. Recently, Peeters et al. (unpublished) investigated whether the provision of additional unlabeled benchmarks at 25, 50, and 75% of the number line, positively affects third and fifth graders’ NLE performance and benchmark-based strategy use. It was found that only the older children benefitted from the presence of these benchmarks at the quartiles of the number line (i.e., 25 and 75%), as they made more use of these benchmarks, leading to more accurate estimates. A possible explanation for this lack of improvement in third graders might be their inability to correctly link the presented benchmarks with their corresponding numerical values. In the present study, we investigated whether labeling these benchmarks with their corresponding numerical values, would have a positive effect on younger children’s NLE performance and quartile-based strategy use as well. Third and sixth graders were assigned to one of three conditions: (a) a control condition with an empty number line bounded by 0 at the origin and 1,000 at the endpoint, (b) an unlabeled condition with three additional external benchmarks without numerical labels at 25, 50, and 75% of the number line, and (c) a labeled condition in which these benchmarks were labeled with 250, 500, and 750, respectively. Results indicated that labeling the benchmarks has a positive effect on third graders’ NLE performance and quartile-based strategy use, whereas sixth graders already benefited from the mere provision of unlabeled benchmarks. These findings imply that children’s benchmark-based strategy use can be stimulated by adding additional externally provided benchmarks on the number line, but that, depending on children’s age and familiarity with the number range, these additional external benchmarks might need to be labeled. PMID:28713302
Evaluating the Effect of Labeled Benchmarks on Children's Number Line Estimation Performance and Strategy Use.

PubMed

Peeters, Dominique; Sekeris, Elke; Verschaffel, Lieven; Luwel, Koen

2017-01-01

Some authors argue that age-related improvements in number line estimation (NLE) performance result from changes in strategy use. More specifically, children's strategy use develops from only using the origin of the number line, to using the origin and the endpoint, to eventually also relying on the midpoint of the number line. Recently, Peeters et al. (unpublished) investigated whether the provision of additional unlabeled benchmarks at 25, 50, and 75% of the number line, positively affects third and fifth graders' NLE performance and benchmark-based strategy use. It was found that only the older children benefitted from the presence of these benchmarks at the quartiles of the number line (i.e., 25 and 75%), as they made more use of these benchmarks, leading to more accurate estimates. A possible explanation for this lack of improvement in third graders might be their inability to correctly link the presented benchmarks with their corresponding numerical values. In the present study, we investigated whether labeling these benchmarks with their corresponding numerical values, would have a positive effect on younger children's NLE performance and quartile-based strategy use as well. Third and sixth graders were assigned to one of three conditions: (a) a control condition with an empty number line bounded by 0 at the origin and 1,000 at the endpoint, (b) an unlabeled condition with three additional external benchmarks without numerical labels at 25, 50, and 75% of the number line, and (c) a labeled condition in which these benchmarks were labeled with 250, 500, and 750, respectively. Results indicated that labeling the benchmarks has a positive effect on third graders' NLE performance and quartile-based strategy use, whereas sixth graders already benefited from the mere provision of unlabeled benchmarks. These findings imply that children's benchmark-based strategy use can be stimulated by adding additional externally provided benchmarks on the number line, but that, depending on children's age and familiarity with the number range, these additional external benchmarks might need to be labeled.
Benchmarking child and adolescent mental health organizations.

PubMed

Brann, Peter; Walter, Garry; Coombs, Tim

2011-04-01

This paper describes aspects of the child and adolescent benchmarking forums that were part of the National Mental Health Benchmarking Project (NMHBP). These forums enabled participating child and adolescent mental health organizations to benchmark themselves against each other, with a view to understanding variability in performance against a range of key performance indicators (KPIs). Six child and adolescent mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against relevant KPIs. They also undertook two special projects designed to help them understand the variation in performance on given KPIs. There was considerable inter-organization variability on many of the KPIs. Even within organizations, there was often substantial variability over time. The variability in indicator data raised many questions for participants. This challenged participants to better understand and describe their local processes, prompted them to collect additional data, and stimulated them to make organizational comparisons. These activities fed into a process of reflection about their performance. Benchmarking has the potential to illuminate intra- and inter-organizational performance in the child and adolescent context.
Full Chain Benchmarking for Open Architecture Airborne ISR Systems: A Case Study for GMTI Radar Applications

DTIC Science & Technology

2015-09-15

middleware implementations via a common object-oriented software hierarchy, with library -specific implementations of the five GMTI benchmark ...Full-Chain Benchmarking for Open Architecture Airborne ISR Systems A Case Study for GMTI Radar Applications Matthias Beebe, Matthew Alexander...time performance, effective benchmarks are necessary to ensure that an ARP system can meet the mission constraints and performance requirements of
Key performance indicators to benchmark hospital information systems - a delphi study.

PubMed

Hübner-Bloder, G; Ammenwerth, E

2009-01-01

To identify the key performance indicators for hospital information systems (HIS) that can be used for HIS benchmarking. A Delphi survey with one qualitative and two quantitative rounds. Forty-four HIS experts from health care IT practice and academia participated in all three rounds. Seventy-seven performance indicators were identified and organized into eight categories: technical quality, software quality, architecture and interface quality, IT vendor quality, IT support and IT department quality, workflow support quality, IT outcome quality, and IT costs. The highest ranked indicators are related to clinical workflow support and user satisfaction. Isolated technical indicators or cost indicators were not seen as useful. The experts favored an interdisciplinary group of all the stakeholders, led by hospital management, to conduct the HIS benchmarking. They proposed benchmarking activities both in regular (annual) intervals as well as at defined events (for example after IT introduction). Most of the experts stated that in their institutions no HIS benchmarking activities are being performed at the moment. In the context of IT governance, IT benchmarking is gaining importance in the healthcare area. The found indicators reflect the view of health care IT professionals and researchers. Research is needed to further validate and operationalize key performance indicators, to provide an IT benchmarking framework, and to provide open repositories for a comparison of the HIS benchmarks of different hospitals.
SP2Bench: A SPARQL Performance Benchmark

NASA Astrophysics Data System (ADS)

Schmidt, Michael; Hornung, Thomas; Meier, Michael; Pinkel, Christoph; Lausen, Georg

A meaningful analysis and comparison of both existing storage schemes for RDF data and evaluation approaches for SPARQL queries necessitates a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is settled in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror vital key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries and performance metrics.
A Methodology for Benchmarking Relational Database Machines,

DTIC Science & Technology

1984-01-01

user benchmarks is to compare the multiple users to the best-case performance The data for each query classification coll and the performance...called a benchmark. The term benchmark originates from the markers used by sur - veyors in establishing common reference points for their measure...formatted databases. In order to further simplify the problem, we restrict our study to those DBMs which support the relational model. A sur - vey
Evaluation of control strategies using an oxidation ditch benchmark.

PubMed

Abusam, A; Keesman, K J; Spanjers, H; van, Straten G; Meinema, K

2002-01-01

This paper presents validation and implementation results of a benchmark developed for a specific full-scale oxidation ditch wastewater treatment plant. A benchmark is a standard simulation procedure that can be used as a tool in evaluating various control strategies proposed for wastewater treatment plants. It is based on model and performance criteria development. Testing of this benchmark, by comparing benchmark predictions to real measurements of the electrical energy consumptions and amounts of disposed sludge for a specific oxidation ditch WWTP, has shown that it can (reasonably) be used for evaluating the performance of this WWTP. Subsequently, the validated benchmark was then used in evaluating some basic and advanced control strategies. Some of the interesting results obtained are the following: (i) influent flow splitting ratio, between the first and the fourth aerated compartments of the ditch, has no significant effect on the TN concentrations in the effluent, and (ii) for evaluation of long-term control strategies, future benchmarks need to be able to assess settlers' performance.
Accumulo/Hadoop, MongoDB, and Elasticsearch Performance for Semi Structured Intrusion Detection (IDS) Data

DTIC Science & Technology

2016-11-01

iii Contents List of Figures v 1. Introduction 1 2. Background 1 3. Yahoo ! Cloud Serving Benchmark (YCSB) 2 3.1 Data Loading and Performance...transactional system. 3. Yahoo ! Cloud Serving Benchmark (YCSB) 3.1 Data Loading and Performance Testing Framework When originally setting out to perform the...that referred to a data loading and performance testing framework, Yahoo ! Cloud Serving Benchmark (YCSB).12 This framework is freely available and
OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE - A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arnis Judzis

2003-01-01

This document details the progress to date on the ''OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE -- A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING'' contract for the quarter starting October 2002 through December 2002. Even though we are awaiting the optimization portion of the testing program, accomplishments included the following: (1) Smith International participated in the DOE Mud Hammer program through full scale benchmarking testing during the week of 4 November 2003. (2) TerraTek acknowledges Smith International, BP America, PDVSA, and ConocoPhillips for cost-sharing the Smith benchmarking tests allowing extension of the contract to add to themore » benchmarking testing program. (3) Following the benchmark testing of the Smith International hammer, representatives from DOE/NETL, TerraTek, Smith International and PDVSA met at TerraTek in Salt Lake City to review observations, performance and views on the optimization step for 2003. (4) The December 2002 issue of Journal of Petroleum Technology (Society of Petroleum Engineers) highlighted the DOE fluid hammer testing program and reviewed last years paper on the benchmark performance of the SDS Digger and Novatek hammers. (5) TerraTek's Sid Green presented a technical review for DOE/NETL personnel in Morgantown on ''Impact Rock Breakage'' and its importance on improving fluid hammer performance. Much discussion has taken place on the issues surrounding mud hammer performance at depth conditions.« less
Benchmark for Strategic Performance Improvement.

ERIC Educational Resources Information Center

Gohlke, Annette

1997-01-01

Explains benchmarking, a total quality management tool used to measure and compare the work processes in a library with those in other libraries to increase library performance. Topics include the main groups of upper management, clients, and staff; critical success factors for each group; and benefits of benchmarking. (Author/LRW)

OWL2 benchmarking for the evaluation of knowledge based systems.

PubMed

Khan, Sher Afgun; Qadir, Muhammad Abdul; Abbas, Muhammad Azeem; Afzal, Muhammad Tanvir

2017-01-01

OWL2 semantics are becoming increasingly popular for the real domain applications like Gene engineering and health MIS. The present work identifies the research gap that negligible attention has been paid to the performance evaluation of Knowledge Base Systems (KBS) using OWL2 semantics. To fulfil this identified research gap, an OWL2 benchmark for the evaluation of KBS is proposed. The proposed benchmark addresses the foundational blocks of an ontology benchmark i.e. data schema, workload and performance metrics. The proposed benchmark is tested on memory based, file based, relational database and graph based KBS for performance and scalability measures. The results show that the proposed benchmark is able to evaluate the behaviour of different state of the art KBS on OWL2 semantics. On the basis of the results, the end users (i.e. domain expert) would be able to select a suitable KBS appropriate for his domain.
Benchmarking--Measuring and Comparing for Continuous Improvement.

ERIC Educational Resources Information Center

Henczel, Sue

2002-01-01

Discussion of benchmarking focuses on the use of internal and external benchmarking by special librarians. Highlights include defining types of benchmarking; historical development; benefits, including efficiency, improved performance, increased competitiveness, and better decision making; problems, including inappropriate adaptation; developing a…
Establishing Benchmarks for Outcome Indicators: A Statistical Approach to Developing Performance Standards.

ERIC Educational Resources Information Center

Henry, Gary T.; And Others

1992-01-01

A statistical technique is presented for developing performance standards based on benchmark groups. The benchmark groups are selected using a multivariate technique that relies on a squared Euclidean distance method. For each observation unit (a school district in the example), a unique comparison group is selected. (SLD)
Beyond Benchmarking: Value-Adding Metrics

ERIC Educational Resources Information Center

Fitz-enz, Jac

2007-01-01

HR metrics has grown up a bit over the past two decades, moving away from simple benchmarking practices and toward a more inclusive approach to measuring institutional performance and progress. In this article, the acknowledged "father" of human capital performance benchmarking provides an overview of several aspects of today's HR metrics…
The NAS kernel benchmark program

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barton, J. T.

1985-01-01

A collection of benchmark test kernels that measure supercomputer performance has been developed for the use of the NAS (Numerical Aerodynamic Simulation) program at the NASA Ames Research Center. This benchmark program is described in detail and the specific ground rules are given for running the program as a performance test.
LASL benchmark performance 1978. [CDC STAR-100, 6600, 7600, Cyber 73, and CRAY-1

DOE Office of Scientific and Technical Information (OSTI.GOV)

McKnight, A.L.

1979-08-01

This report presents the results of running several benchmark programs on a CDC STAR-100, a Cray Research CRAY-1, a CDC 6600, a CDC 7600, and a CDC Cyber 73. The benchmark effort included CRAY-1's at several installations running different operating systems and compilers. This benchmark is part of an ongoing program at Los Alamos Scientific Laboratory to collect performance data and monitor the development trend of supercomputers. 3 tables.
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, David (Editor); Barton, John (Editor); Lasinski, Thomas (Editor); Simon, Horst (Editor)

1993-01-01

A new set of benchmarks was developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of a set of kernels, the 'Parallel Kernels,' and a simulated application benchmark. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification - all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
Toward Scalable Benchmarks for Mass Storage Systems

NASA Technical Reports Server (NTRS)

Miller, Ethan L.

1996-01-01

This paper presents guidelines for the design of a mass storage system benchmark suite, along with preliminary suggestions for programs to be included. The benchmarks will measure both peak and sustained performance of the system as well as predicting both short- and long-term behavior. These benchmarks should be both portable and scalable so they may be used on storage systems from tens of gigabytes to petabytes or more. By developing a standard set of benchmarks that reflect real user workload, we hope to encourage system designers and users to publish performance figures that can be compared with those of other systems. This will allow users to choose the system that best meets their needs and give designers a tool with which they can measure the performance effects of improvements to their systems.
Determining the sample size required to establish whether a medical device is non-inferior to an external benchmark.

PubMed

Sayers, Adrian; Crowther, Michael J; Judge, Andrew; Whitehouse, Michael R; Blom, Ashley W

2017-08-28

The use of benchmarks to assess the performance of implants such as those used in arthroplasty surgery is a widespread practice. It provides surgeons, patients and regulatory authorities with the reassurance that implants used are safe and effective. However, it is not currently clear how or how many implants should be statistically compared with a benchmark to assess whether or not that implant is superior, equivalent, non-inferior or inferior to the performance benchmark of interest.We aim to describe the methods and sample size required to conduct a one-sample non-inferiority study of a medical device for the purposes of benchmarking. Simulation study. Simulation study of a national register of medical devices. We simulated data, with and without a non-informative competing risk, to represent an arthroplasty population and describe three methods of analysis (z-test, 1-Kaplan-Meier and competing risks) commonly used in surgical research. We evaluate the performance of each method using power, bias, root-mean-square error, coverage and CI width. 1-Kaplan-Meier provides an unbiased estimate of implant net failure, which can be used to assess if a surgical device is non-inferior to an external benchmark. Small non-inferiority margins require significantly more individuals to be at risk compared with current benchmarking standards. A non-inferiority testing paradigm provides a useful framework for determining if an implant meets the required performance defined by an external benchmark. Current contemporary benchmarking standards have limited power to detect non-inferiority, and substantially larger samples sizes, in excess of 3200 procedures, are required to achieve a power greater than 60%. It is clear when benchmarking implant performance, net failure estimated using 1-KM is preferential to crude failure estimated by competing risk models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
The Isprs Benchmark on Indoor Modelling

NASA Astrophysics Data System (ADS)

Khoshelham, K.; Díaz Vilariño, L.; Peter, M.; Kang, Z.; Acharya, D.

2017-09-01

Automated generation of 3D indoor models from point cloud data has been a topic of intensive research in recent years. While results on various datasets have been reported in literature, a comparison of the performance of different methods has not been possible due to the lack of benchmark datasets and a common evaluation framework. The ISPRS benchmark on indoor modelling aims to address this issue by providing a public benchmark dataset and an evaluation framework for performance comparison of indoor modelling methods. In this paper, we present the benchmark dataset comprising several point clouds of indoor environments captured by different sensors. We also discuss the evaluation and comparison of indoor modelling methods based on manually created reference models and appropriate quality evaluation criteria. The benchmark dataset is available for download at: http://www2.isprs.org/commissions/comm4/wg5/benchmark-on-indoor-modelling.html.
Benchmarking Using Basic DBMS Operations

NASA Astrophysics Data System (ADS)

Crolotte, Alain; Ghazal, Ahmad

The TPC-H benchmark proved to be successful in the decision support area. Many commercial database vendors and their related hardware vendors used these benchmarks to show the superiority and competitive edge of their products. However, over time, the TPC-H became less representative of industry trends as vendors keep tuning their database to this benchmark-specific workload. In this paper, we present XMarq, a simple benchmark framework that can be used to compare various software/hardware combinations. Our benchmark model is currently composed of 25 queries that measure the performance of basic operations such as scans, aggregations, joins and index access. This benchmark model is based on the TPC-H data model due to its maturity and well-understood data generation capability. We also propose metrics to evaluate single-system performance and compare two systems. Finally we illustrate the effectiveness of this model by showing experimental results comparing two systems under different conditions.
Benchmarks--Standards Comparisons. Math Competencies: EFF Benchmarks Comparison [and] Reading Competencies: EFF Benchmarks Comparison [and] Writing Competencies: EFF Benchmarks Comparison.

ERIC Educational Resources Information Center

Kent State Univ., OH. Ohio Literacy Resource Center.

This document is intended to show the relationship between Ohio's Standards and Competencies, Equipped for the Future's (EFF's) Standards and Components of Performance, and Ohio's Revised Benchmarks. The document is divided into three parts, with Part 1 covering mathematics instruction, Part 2 covering reading instruction, and Part 3 covering…
Benchmarking can add up for healthcare accounting.

PubMed

Czarnecki, M T

1994-09-01

In 1993, a healthcare accounting and finance benchmarking survey of hospital and nonhospital organizations gathered statistics about key common performance areas. A low response did not allow for statistically significant findings, but the survey identified performance measures that can be used in healthcare financial management settings. This article explains the benchmarking process and examines some of the 1993 study's findings.
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

1991-01-01

A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification-all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Munro, J.F.; Kristal, J.; Thompson, G.

The Office of Environmental Management is bringing Headquarters and the Field together to implement process improvements throughout the Complex through a systematic process of organizational learning called benchmarking. Simply stated, benchmarking is a process of continuously comparing and measuring practices, processes, or methodologies with those of other private and public organizations. The EM benchmarking program, which began as the result of a recommendation from Xerox Corporation, is building trust and removing barriers to performance enhancement across the DOE organization. The EM benchmarking program is designed to be field-centered with Headquarters providing facilitatory and integrative functions on an ``as needed`` basis.more » One of the main goals of the program is to assist Field Offices and their associated M&O/M&I contractors develop the capabilities to do benchmarking for themselves. In this regard, a central precept is that in order to realize tangible performance benefits, program managers and staff -- the ones closest to the work - must take ownership of the studies. This avoids the ``check the box`` mentality associated with some third party studies. This workshop will provide participants with a basic level of understanding why the EM benchmarking team was developed and the nature and scope of its mission. Participants will also begin to understand the types of study levels and the particular methodology the EM benchmarking team is using to conduct studies. The EM benchmarking team will also encourage discussion on ways that DOE (both Headquarters and the Field) can team with its M&O/M&I contractors to conduct additional benchmarking studies. This ``introduction to benchmarking`` is intended to create a desire to know more and a greater appreciation of how benchmarking processes could be creatively employed to enhance performance.« less
Congenital Heart Surgery Case Mix Across North American Centers and Impact on Performance Assessment.

PubMed

Pasquali, Sara K; Wallace, Amelia S; Gaynor, J William; Jacobs, Marshall L; O'Brien, Sean M; Hill, Kevin D; Gaies, Michael G; Romano, Jennifer C; Shahian, David M; Mayer, John E; Jacobs, Jeffrey P

2016-11-01

Performance assessment in congenital heart surgery is challenging due to the wide heterogeneity of disease. We describe current case mix across centers, evaluate methodology inclusive of all cardiac operations versus the more homogeneous subset of Society of Thoracic Surgeons benchmark operations, and describe implications regarding performance assessment. Centers (n = 119) participating in the Society of Thoracic Surgeons Congenital Heart Surgery Database (2010 through 2014) were included. Index operation type and frequency across centers were described. Center performance (risk-adjusted operative mortality) was evaluated and classified when including the benchmark versus all eligible operations. Overall, 207 types of operations were performed during the study period (112,140 total cases). Few operations were performed across all centers; only 25% were performed at least once by 75% or more of centers. There was 7.9-fold variation across centers in the proportion of total cases comprising high-complexity cases (STAT 5). In contrast, the benchmark operations made up 36% of cases, and all but 2 were performed by at least 90% of centers. When evaluating performance based on benchmark versus all operations, 15% of centers changed performance classification; 85% remained unchanged. Benchmark versus all operation methodology was associated with lower power, with 35% versus 78% of centers meeting sample size thresholds. There is wide variation in congenital heart surgery case mix across centers. Metrics based on benchmark versus all operations are associated with strengths (less heterogeneity) and weaknesses (lower power), and lead to differing performance classification for some centers. These findings have implications for ongoing efforts to optimize performance assessment, including choice of target population and appropriate interpretation of reported metrics. Copyright © 2016 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
The MCNP6 Analytic Criticality Benchmark Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, Forrest B.

2016-06-16

Analytical benchmarks provide an invaluable tool for verifying computer codes used to simulate neutron transport. Several collections of analytical benchmark problems [1-4] are used routinely in the verification of production Monte Carlo codes such as MCNP® [5,6]. Verification of a computer code is a necessary prerequisite to the more complex validation process. The verification process confirms that a code performs its intended functions correctly. The validation process involves determining the absolute accuracy of code results vs. nature. In typical validations, results are computed for a set of benchmark experiments using a particular methodology (code, cross-section data with uncertainties, and modeling)more » and compared to the measured results from the set of benchmark experiments. The validation process determines bias, bias uncertainty, and possibly additional margins. Verification is generally performed by the code developers, while validation is generally performed by code users for a particular application space. The VERIFICATION_KEFF suite of criticality problems [1,2] was originally a set of 75 criticality problems found in the literature for which exact analytical solutions are available. Even though the spatial and energy detail is necessarily limited in analytical benchmarks, typically to a few regions or energy groups, the exact solutions obtained can be used to verify that the basic algorithms, mathematics, and methods used in complex production codes perform correctly. The present work has focused on revisiting this benchmark suite. A thorough review of the problems resulted in discarding some of them as not suitable for MCNP benchmarking. For the remaining problems, many of them were reformulated to permit execution in either multigroup mode or in the normal continuous-energy mode for MCNP. Execution of the benchmarks in continuous-energy mode provides a significant advance to MCNP verification methods.« less
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather; Robinson, William H.; Rech, Paolo

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for themore » hardware and software benchmarks.« less
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE PAGES

Quinn, Heather; Robinson, William H.; Rech, Paolo; ...

2015-12-17

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for themore » hardware and software benchmarks.« less
Benchmarking biology research organizations using a new, dedicated tool.

PubMed

van Harten, Willem H; van Bokhorst, Leonard; van Luenen, Henri G A M

2010-02-01

International competition forces fundamental research organizations to assess their relative performance. We present a benchmark tool for scientific research organizations where, contrary to existing models, the group leader is placed in a central position within the organization. We used it in a pilot benchmark study involving six research institutions. Our study shows that data collection and data comparison based on this new tool can be achieved. It proved possible to compare relative performance and organizational characteristics and to generate suggestions for improvement for most participants. However, strict definitions of the parameters used for the benchmark and a thorough insight into the organization of each of the benchmark partners is required to produce comparable data and draw firm conclusions.

Algorithm and Architecture Independent Benchmarking with SEAK

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tallent, Nathan R.; Manzano Franco, Joseph B.; Gawande, Nitin A.

2016-05-23

Many applications of high performance embedded computing are limited by performance or power bottlenecks. We have designed the Suite for Embedded Applications & Kernels (SEAK), a new benchmark suite, (a) to capture these bottlenecks in a way that encourages creative solutions; and (b) to facilitate rigorous, objective, end-user evaluation for their solutions. To avoid biasing solutions toward existing algorithms, SEAK benchmarks use a mission-centric (abstracted from a particular algorithm) and goal-oriented (functional) specification. To encourage solutions that are any combination of software or hardware, we use an end-user black-box evaluation that can capture tradeoffs between performance, power, accuracy, size, andmore » weight. The tradeoffs are especially informative for procurement decisions. We call our benchmarks future proof because each mission-centric interface and evaluation remains useful despite shifting algorithmic preferences. It is challenging to create both concise and precise goal-oriented specifications for mission-centric problems. This paper describes the SEAK benchmark suite and presents an evaluation of sample solutions that highlights power and performance tradeoffs.« less
Comprehensive Benchmark Suite for Simulation of Particle Laden Flows Using the Discrete Element Method with Performance Profiles from the Multiphase Flow with Interface eXchanges (MFiX) Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Peiyuan; Brown, Timothy; Fullmer, William D.

Five benchmark problems are developed and simulated with the computational fluid dynamics and discrete element model code MFiX. The benchmark problems span dilute and dense regimes, consider statistically homogeneous and inhomogeneous (both clusters and bubbles) particle concentrations and a range of particle and fluid dynamic computational loads. Several variations of the benchmark problems are also discussed to extend the computational phase space to cover granular (particles only), bidisperse and heat transfer cases. A weak scaling analysis is performed for each benchmark problem and, in most cases, the scalability of the code appears reasonable up to approx. 103 cores. Profiling ofmore » the benchmark problems indicate that the most substantial computational time is being spent on particle-particle force calculations, drag force calculations and interpolating between discrete particle and continuum fields. Hardware performance analysis was also carried out showing significant Level 2 cache miss ratios and a rather low degree of vectorization. These results are intended to serve as a baseline for future developments to the code as well as a preliminary indicator of where to best focus performance optimizations.« less
Benchmarking forensic mental health organizations.

PubMed

Coombs, Tim; Taylor, Monica; Pirkis, Jane

2011-04-01

This paper describes the forensic mental health forums that were conducted as part of the National Mental Health Benchmarking Project (NMHBP). These forums encouraged participating organizations to compare their performance on a range of key performance indicators (KPIs) with that of their peers. Four forensic mental health organizations took part in the NMHBP. Representatives from these organizations attended eight benchmarking forums at which they documented their performance against previously agreed KPIs. They also undertook three special projects which explored some of the factors that might explain inter-organizational variation in performance. The inter-organizational range for many of the indicators was substantial. Observing this led participants to conduct the special projects to explore three factors which might help explain the variability - seclusion practices, delivery of community mental health services, and provision of court liaison services. The process of conducting the special projects gave participants insights into the practices and structures employed by their counterparts, and provided them with some important lessons for quality improvement. The forensic mental health benchmarking forums have demonstrated that benchmarking is feasible and likely to be useful in improving service performance and quality.
Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Gaeke, Brian R.; Husbands, Parry; Li, Xiaoye S.; Oliker, Leonid; Yelick, Katherine A.; Biegel, Bryan (Technical Monitor)

2002-01-01

The increasing gap between processor and memory performance has lead to new architectural models for memory-intensive applications. In this paper, we explore the performance of a set of memory-intensive benchmarks and use them to compare the performance of conventional cache-based microprocessors to a mixed logic and DRAM processor called VIRAM. The benchmarks are based on problem statements, rather than specific implementations, and in each case we explore the fundamental hardware requirements of the problem, as well as alternative algorithms and data structures that can help expose fine-grained parallelism or simplify memory access patterns. The benchmarks are characterized by their memory access patterns, their basic control structures, and the ratio of computation to memory operation.
Measuring Distribution Performance? Benchmarking Warrants Your Attention

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ericson, Sean J; Alvarez, Paul

Identifying, designing, and measuring performance metrics is critical to securing customer value, but can be a difficult task. This article examines the use of benchmarks based on publicly available performance data to set challenging, yet fair, metrics and targets.
Competency based training in robotic surgery: benchmark scores for virtual reality robotic simulation.

PubMed

Raison, Nicholas; Ahmed, Kamran; Fossati, Nicola; Buffi, Nicolò; Mottrie, Alexandre; Dasgupta, Prokar; Van Der Poel, Henk

2017-05-01

To develop benchmark scores of competency for use within a competency based virtual reality (VR) robotic training curriculum. This longitudinal, observational study analysed results from nine European Association of Urology hands-on-training courses in VR simulation. In all, 223 participants ranging from novice to expert robotic surgeons completed 1565 exercises. Competency was set at 75% of the mean expert score. Benchmark scores for all general performance metrics generated by the simulator were calculated. Assessment exercises were selected by expert consensus and through learning-curve analysis. Three basic skill and two advanced skill exercises were identified. Benchmark scores based on expert performance offered viable targets for novice and intermediate trainees in robotic surgery. Novice participants met the competency standards for most basic skill exercises; however, advanced exercises were significantly more challenging. Intermediate participants performed better across the seven metrics but still did not achieve the benchmark standard in the more difficult exercises. Benchmark scores derived from expert performances offer relevant and challenging scores for trainees to achieve during VR simulation training. Objective feedback allows both participants and trainers to monitor educational progress and ensures that training remains effective. Furthermore, the well-defined goals set through benchmarking offer clear targets for trainees and enable training to move to a more efficient competency based curriculum. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
Benchmarking in national health service procurement in Scotland.

PubMed

Walker, Scott; Masson, Ron; Telford, Ronnie; White, David

2007-11-01

The paper reports the results of a study on benchmarking activities undertaken by the procurement organization within the National Health Service (NHS) in Scotland, namely National Procurement (previously Scottish Healthcare Supplies Contracts Branch). NHS performance is of course politically important, and benchmarking is increasingly seen as a means to improve performance, so the study was carried out to determine if the current benchmarking approaches could be enhanced. A review of the benchmarking activities used by the private sector, local government and NHS organizations was carried out to establish a framework of the motivations, benefits, problems and costs associated with benchmarking. This framework was used to carry out the research through case studies and a questionnaire survey of NHS procurement organizations both in Scotland and other parts of the UK. Nine of the 16 Scottish Health Boards surveyed reported carrying out benchmarking during the last three years. The findings of the research were that there were similarities in approaches between local government and NHS Scotland Health, but differences between NHS Scotland and other UK NHS procurement organizations. Benefits were seen as significant and it was recommended that National Procurement should pursue the formation of a benchmarking group with members drawn from NHS Scotland and external benchmarking bodies to establish measures to be used in benchmarking across the whole of NHS Scotland.
Benchmarking high performance computing architectures with CMS’ skeleton framework

NASA Astrophysics Data System (ADS)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-10-01

In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta, Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.
Structural Benchmark Creep Testing for Microcast MarM-247 Advanced Stirling Convertor E2 Heater Head Test Article SN18

NASA Technical Reports Server (NTRS)

Krause, David L.; Brewer, Ethan J.; Pawlik, Ralph

2013-01-01

This report provides test methodology details and qualitative results for the first structural benchmark creep test of an Advanced Stirling Convertor (ASC) heater head of ASC-E2 design heritage. The test article was recovered from a flight-like Microcast MarM-247 heater head specimen previously used in helium permeability testing. The test article was utilized for benchmark creep test rig preparation, wall thickness and diametral laser scan hardware metrological developments, and induction heater custom coil experiments. In addition, a benchmark creep test was performed, terminated after one week when through-thickness cracks propagated at thermocouple weld locations. Following this, it was used to develop a unique temperature measurement methodology using contact thermocouples, thereby enabling future benchmark testing to be performed without the use of conventional welded thermocouples, proven problematic for the alloy. This report includes an overview of heater head structural benchmark creep testing, the origin of this particular test article, test configuration developments accomplished using the test article, creep predictions for its benchmark creep test, qualitative structural benchmark creep test results, and a short summary.
Simple Benchmark Specifications for Space Radiation Protection

NASA Technical Reports Server (NTRS)

Singleterry, Robert C. Jr.; Aghara, Sukesh K.

2013-01-01

This report defines space radiation benchmark specifications. This specification starts with simple, monoenergetic, mono-directional particles on slabs and progresses to human models in spacecraft. This report specifies the models and sources needed to what the team performing the benchmark needs to produce in a report. Also included are brief descriptions of how OLTARIS, the NASA Langley website for space radiation analysis, performs its analysis.
Performance of Multi-chaotic PSO on a shifted benchmark functions set

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pluhacek, Michal; Senkerik, Roman; Zelinka, Ivan

2015-03-10

In this paper the performance of Multi-chaotic PSO algorithm is investigated using two shifted benchmark functions. The purpose of shifted benchmark functions is to simulate the time-variant real-world problems. The results of chaotic PSO are compared with canonical version of the algorithm. It is concluded that using the multi-chaotic approach can lead to better results in optimization of shifted functions.
Benchmarking routine psychological services: a discussion of challenges and methods.

PubMed

Delgadillo, Jaime; McMillan, Dean; Leach, Chris; Lucock, Mike; Gilbody, Simon; Wood, Nick

2014-01-01

Policy developments in recent years have led to important changes in the level of access to evidence-based psychological treatments. Several methods have been used to investigate the effectiveness of these treatments in routine care, with different approaches to outcome definition and data analysis. To present a review of challenges and methods for the evaluation of evidence-based treatments delivered in routine mental healthcare. This is followed by a case example of a benchmarking method applied in primary care. High, average and poor performance benchmarks were calculated through a meta-analysis of published data from services working under the Improving Access to Psychological Therapies (IAPT) Programme in England. Pre-post treatment effect sizes (ES) and confidence intervals were estimated to illustrate a benchmarking method enabling services to evaluate routine clinical outcomes. High, average and poor performance ES for routine IAPT services were estimated to be 0.91, 0.73 and 0.46 for depression (using PHQ-9) and 1.02, 0.78 and 0.52 for anxiety (using GAD-7). Data from one specific IAPT service exemplify how to evaluate and contextualize routine clinical performance against these benchmarks. The main contribution of this report is to summarize key recommendations for the selection of an adequate set of psychometric measures, the operational definition of outcomes, and the statistical evaluation of clinical performance. A benchmarking method is also presented, which may enable a robust evaluation of clinical performance against national benchmarks. Some limitations concerned significant heterogeneity among data sources, and wide variations in ES and data completeness.
Unstructured Adaptive (UA) NAS Parallel Benchmark. Version 1.0

NASA Technical Reports Server (NTRS)

Feng, Huiyu; VanderWijngaart, Rob; Biswas, Rupak; Mavriplis, Catherine

2004-01-01

We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.
Toxicological Benchmarks for Screening Potential Contaminants of Concern for Effects on Soil and Litter Invertebrates and Heterotrophic Process

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.

1994-01-01

This report presents a standard method for deriving benchmarks for the purpose of ''contaminant screening,'' performed by comparing measured ambient concentrations of chemicals. The work was performed under Work Breakdown Structure 1.4.12.2.3.04.07.02 (Activity Data Sheet 8304). In addition, this report presents sets of data concerning the effects of chemicals in soil on invertebrates and soil microbial processes, benchmarks for chemicals potentially associated with United States Department of Energy sites, and literature describing the experiments from which data were drawn for benchmark derivation.
75 FR 6368 - Submission for OMB Review; Comment Request

Federal Register 2010, 2011, 2012, 2013, 2014

2010-02-09

... benchmarking U.S. performance in mathematics and science at the fourth- and eighth-grade levels against other... assessment data for internationally benchmarking U.S. performance in fourth-grade reading. NCES has received...
Benchmarking high performance computing architectures with CMS’ skeleton framework

DOE PAGES

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-11-23

Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta,more » Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.« less
Benchmarking high performance computing architectures with CMS’ skeleton framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta,more » Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.« less
Implementation of the NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan (Technical Monitor)

2002-01-01

Several features make Java an attractive choice for High Performance Computing (HPC). In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for CFD applications.
Performance and Scalability of the NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Schultz, Matthew; Jin, Haoqiang; Yan, Jerry; Biegel, Bryan A. (Technical Monitor)

2002-01-01

Several features make Java an attractive choice for scientific applications. In order to gauge the applicability of Java to Computational Fluid Dynamics (CFD), we have implemented the NAS (NASA Advanced Supercomputing) Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would position Java closer to Fortran in the competition for scientific applications.
An automated protocol for performance benchmarking a widefield fluorescence microscope.

PubMed

Halter, Michael; Bier, Elianna; DeRose, Paul C; Cooksey, Gregory A; Choquette, Steven J; Plant, Anne L; Elliott, John T

2014-11-01

Widefield fluorescence microscopy is a highly used tool for visually assessing biological samples and for quantifying cell responses. Despite its widespread use in high content analysis and other imaging applications, few published methods exist for evaluating and benchmarking the analytical performance of a microscope. Easy-to-use benchmarking methods would facilitate the use of fluorescence imaging as a quantitative analytical tool in research applications, and would aid the determination of instrumental method validation for commercial product development applications. We describe and evaluate an automated method to characterize a fluorescence imaging system's performance by benchmarking the detection threshold, saturation, and linear dynamic range to a reference material. The benchmarking procedure is demonstrated using two different materials as the reference material, uranyl-ion-doped glass and Schott 475 GG filter glass. Both are suitable candidate reference materials that are homogeneously fluorescent and highly photostable, and the Schott 475 GG filter glass is currently commercially available. In addition to benchmarking the analytical performance, we also demonstrate that the reference materials provide for accurate day to day intensity calibration. Published 2014 Wiley Periodicals Inc. Published 2014 Wiley Periodicals Inc. This article is a US government work and, as such, is in the public domain in the United States of America.

Comparison of Origin 2000 and Origin 3000 Using NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Turney, Raymond D.

2001-01-01

This report describes results of benchmark tests on the Origin 3000 system currently being installed at the NASA Ames National Advanced Supercomputing facility. This machine will ultimately contain 1024 R14K processors. The first part of the system, installed in November, 2000 and named mendel, is an Origin 3000 with 128 R12K processors. For comparison purposes, the tests were also run on lomax, an Origin 2000 with R12K processors. The BT, LU, and SP application benchmarks in the NAS Parallel Benchmark Suite and the kernel benchmark FT were chosen to determine system performance and measure the impact of changes on the machine as it evolves. Having been written to measure performance on Computational Fluid Dynamics applications, these benchmarks are assumed appropriate to represent the NAS workload. Since the NAS runs both message passing (MPI) and shared-memory, compiler directive type codes, both MPI and OpenMP versions of the benchmarks were used. The MPI versions used were the latest official release of the NAS Parallel Benchmarks, version 2.3. The OpenMP versiqns used were PBN3b2, a beta version that is in the process of being released. NPB 2.3 and PBN 3b2 are technically different benchmarks, and NPB results are not directly comparable to PBN results.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Cohen, J; Dossa, D; Gokhale, M

Critical data science applications requiring frequent access to storage perform poorly on today's computing architectures. This project addresses efficient computation of data-intensive problems in national security and basic science by exploring, advancing, and applying a new form of computing called storage-intensive supercomputing (SISC). Our goal is to enable applications that simply cannot run on current systems, and, for a broad range of data-intensive problems, to deliver an order of magnitude improvement in price/performance over today's data-intensive architectures. This technical report documents much of the work done under LDRD 07-ERD-063 Storage Intensive Supercomputing during the period 05/07-09/07. The following chapters describe:more » (1) a new file I/O monitoring tool iotrace developed to capture the dynamic I/O profiles of Linux processes; (2) an out-of-core graph benchmark for level-set expansion of scale-free graphs; (3) an entity extraction benchmark consisting of a pipeline of eight components; and (4) an image resampling benchmark drawn from the SWarp program in the LSST data processing pipeline. The performance of the graph and entity extraction benchmarks was measured in three different scenarios: data sets residing on the NFS file server and accessed over the network; data sets stored on local disk; and data sets stored on the Fusion I/O parallel NAND Flash array. The image resampling benchmark compared performance of software-only to GPU-accelerated. In addition to the work reported here, an additional text processing application was developed that used an FPGA to accelerate n-gram profiling for language classification. The n-gram application will be presented at SC07 at the High Performance Reconfigurable Computing Technologies and Applications Workshop. The graph and entity extraction benchmarks were run on a Supermicro server housing the NAND Flash 40GB parallel disk array, the Fusion-io. The Fusion system specs are as follows: SuperMicro X7DBE Xeon Dual Socket Blackford Server Motherboard; 2 Intel Xeon Dual-Core 2.66 GHz processors; 1 GB DDR2 PC2-5300 RAM (2 x 512); 80GB Hard Drive (Seagate SATA II Barracuda). The Fusion board is presently capable of 4X in a PCIe slot. The image resampling benchmark was run on a dual Xeon workstation with NVIDIA graphics card (see Chapter 5 for full specification). An XtremeData Opteron+FPGA was used for the language classification application. We observed that these benchmarks are not uniformly I/O intensive. The only benchmark that showed greater that 50% of the time in I/O was the graph algorithm when it accessed data files over NFS. When local disk was used, the graph benchmark spent at most 40% of its time in I/O. The other benchmarks were CPU dominated. The image resampling benchmark and language classification showed order of magnitude speedup over software by using co-processor technology to offload the CPU-intensive kernels. Our experiments to date suggest that emerging hardware technologies offer significant benefit to boosting the performance of data-intensive algorithms. Using GPU and FPGA co-processors, we were able to improve performance by more than an order of magnitude on the benchmark algorithms, eliminating the processor bottleneck of CPU-bound tasks. Experiments with a prototype solid state nonvolative memory available today show 10X better throughput on random reads than disk, with a 2X speedup on a graph processing benchmark when compared to the use of local SATA disk.« less
SU-E-T-148: Benchmarks and Pre-Treatment Reviews: A Study of Quality Assurance Effectiveness

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lowenstein, J; Nguyen, H; Roll, J

Purpose: To determine the impact benchmarks and pre-treatment reviews have on improving the quality of submitted clinical trial data. Methods: Benchmarks are used to evaluate a site’s ability to develop a treatment that meets a specific protocol’s treatment guidelines prior to placing their first patient on the protocol. A pre-treatment review is an actual patient placed on the protocol in which the dosimetry and contour volumes are evaluated to be per protocol guidelines prior to allowing the beginning of the treatment. A key component of these QA mechanisms is that sites are provided timely feedback to educate them on howmore » to plan per the protocol and prevent protocol deviations on patients accrued to a protocol. For both benchmarks and pre-treatment reviews a dose volume analysis (DVA) was performed using MIM softwareTM. For pre-treatment reviews a volume contour evaluation was also performed. Results: IROC Houston performed a QA effectiveness analysis of a protocol which required both benchmarks and pre-treatment reviews. In 70 percent of the patient cases submitted, the benchmark played an effective role in assuring that the pre-treatment review of the cases met protocol requirements. The 35 percent of sites failing the benchmark subsequently modified there planning technique to pass the benchmark before being allowed to submit a patient for pre-treatment review. However, in 30 percent of the submitted cases the pre-treatment review failed where the majority (71 percent) failed the DVA. 20 percent of sites submitting patients failed to correct their dose volume discrepancies indicated by the benchmark case. Conclusion: Benchmark cases and pre-treatment reviews can be an effective QA tool to educate sites on protocol guidelines and to minimize deviations. Without the benchmark cases it is possible that 65 percent of the cases undergoing a pre-treatment review would have failed to meet the protocols requirements.Support: U24-CA-180803.« less
Thought Experiment to Examine Benchmark Performance for Fusion Nuclear Data

NASA Astrophysics Data System (ADS)

Murata, Isao; Ohta, Masayuki; Kusaka, Sachie; Sato, Fuminobu; Miyamaru, Hiroyuki

2017-09-01

There are many benchmark experiments carried out so far with DT neutrons especially aiming at fusion reactor development. These integral experiments seemed vaguely to validate the nuclear data below 14 MeV. However, no precise studies exist now. The author's group thus started to examine how well benchmark experiments with DT neutrons can play a benchmarking role for energies below 14 MeV. Recently, as a next phase, to generalize the above discussion, the energy range was expanded to the entire region. In this study, thought experiments with finer energy bins have thus been conducted to discuss how to generally estimate performance of benchmark experiments. As a result of thought experiments with a point detector, the sensitivity for a discrepancy appearing in the benchmark analysis is "equally" due not only to contribution directly conveyed to the deterctor, but also due to indirect contribution of neutrons (named (A)) making neutrons conveying the contribution, indirect controbution of neutrons (B) making the neutrons (A) and so on. From this concept, it would become clear from a sensitivity analysis in advance how well and which energy nuclear data could be benchmarked with a benchmark experiment.
Benchmarking Strategies for Measuring the Quality of Healthcare: Problems and Prospects

PubMed Central

Lovaglio, Pietro Giorgio

2012-01-01

Over the last few years, increasing attention has been directed toward the problems inherent to measuring the quality of healthcare and implementing benchmarking strategies. Besides offering accreditation and certification processes, recent approaches measure the performance of healthcare institutions in order to evaluate their effectiveness, defined as the capacity to provide treatment that modifies and improves the patient's state of health. This paper, dealing with hospital effectiveness, focuses on research methods for effectiveness analyses within a strategy comparing different healthcare institutions. The paper, after having introduced readers to the principle debates on benchmarking strategies, which depend on the perspective and type of indicators used, focuses on the methodological problems related to performing consistent benchmarking analyses. Particularly, statistical methods suitable for controlling case-mix, analyzing aggregate data, rare events, and continuous outcomes measured with error are examined. Specific challenges of benchmarking strategies, such as the risk of risk adjustment (case-mix fallacy, underreporting, risk of comparing noncomparable hospitals), selection bias, and possible strategies for the development of consistent benchmarking analyses, are discussed. Finally, to demonstrate the feasibility of the illustrated benchmarking strategies, an application focused on determining regional benchmarks for patient satisfaction (using 2009 Lombardy Region Patient Satisfaction Questionnaire) is proposed. PMID:22666140
Benchmarking strategies for measuring the quality of healthcare: problems and prospects.

PubMed

Lovaglio, Pietro Giorgio

2012-01-01

Over the last few years, increasing attention has been directed toward the problems inherent to measuring the quality of healthcare and implementing benchmarking strategies. Besides offering accreditation and certification processes, recent approaches measure the performance of healthcare institutions in order to evaluate their effectiveness, defined as the capacity to provide treatment that modifies and improves the patient's state of health. This paper, dealing with hospital effectiveness, focuses on research methods for effectiveness analyses within a strategy comparing different healthcare institutions. The paper, after having introduced readers to the principle debates on benchmarking strategies, which depend on the perspective and type of indicators used, focuses on the methodological problems related to performing consistent benchmarking analyses. Particularly, statistical methods suitable for controlling case-mix, analyzing aggregate data, rare events, and continuous outcomes measured with error are examined. Specific challenges of benchmarking strategies, such as the risk of risk adjustment (case-mix fallacy, underreporting, risk of comparing noncomparable hospitals), selection bias, and possible strategies for the development of consistent benchmarking analyses, are discussed. Finally, to demonstrate the feasibility of the illustrated benchmarking strategies, an application focused on determining regional benchmarks for patient satisfaction (using 2009 Lombardy Region Patient Satisfaction Questionnaire) is proposed.
Benchmarking in emergency health systems.

PubMed

Kennedy, Marcus P; Allen, Jacqueline; Allen, Greg

2002-12-01

This paper discusses the role of benchmarking as a component of quality management. It describes the historical background of benchmarking, its competitive origin and the requirement in today's health environment for a more collaborative approach. The classical 'functional and generic' types of benchmarking are discussed with a suggestion to adopt a different terminology that describes the purpose and practicalities of benchmarking. Benchmarking is not without risks. The consequence of inappropriate focus and the need for a balanced overview of process is explored. The competition that is intrinsic to benchmarking is questioned and the negative impact it may have on improvement strategies in poorly performing organizations is recognized. The difficulty in achieving cross-organizational validity in benchmarking is emphasized, as is the need to scrutinize benchmarking measures. The cost effectiveness of benchmarking projects is questioned and the concept of 'best value, best practice' in an environment of fixed resources is examined.
Benchmarking Multilayer-HySEA model for landslide generated tsunami. HTHMP validation process.

NASA Astrophysics Data System (ADS)

Macias, J.; Escalante, C.; Castro, M. J.

2017-12-01

Landslide tsunami hazard may be dominant along significant parts of the coastline around the world, in particular in the USA, as compared to hazards from other tsunamigenic sources. This fact motivated NTHMP about the need of benchmarking models for landslide generated tsunamis, following the same methodology already used for standard tsunami models when the source is seismic. To perform the above-mentioned validation process, a set of candidate benchmarks were proposed. These benchmarks are based on a subset of available laboratory data sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) close the list of proposed benchmarks. A total of 7 benchmarks. The Multilayer-HySEA model including non-hydrostatic effects has been used to perform all the benchmarking problems dealing with laboratory experiments proposed in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017 by NTHMP. The aim of this presentation is to show some of the latest numerical results obtained with the Multilayer-HySEA (non-hydrostatic) model in the framework of this validation effort.Acknowledgements. This research has been partially supported by the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and University of Malaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
Clomp

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gylenhaal, J.; Bronevetsky, G.

2007-05-25

CLOMP is the C version of the Livermore OpenMP benchmark deeloped to measure OpenMP overheads and other performance impacts due to threading (like NUMA memory layouts, memory contention, cache effects, etc.) in order to influence future system design. Current best-in-class implementations of OpenMP have overheads at least ten times larger than is required by many of our applications for effective use of OpenMP. This benchmark shows the significant negative performance impact of these relatively large overheads and of other thread effects. The CLOMP benchmark highly configurable to allow a variety of problem sizes and threading effects to be studied andmore » it carefully checks its results to catch many common threading errors. This benchmark is expected to be included as part of the Sequoia Benchmark suite for the Sequoia procurement.« less
All inclusive benchmarking.

PubMed

Ellis, Judith

2006-07-01

The aim of this article is to review published descriptions of benchmarking activity and synthesize benchmarking principles to encourage the acceptance and use of Essence of Care as a new benchmarking approach to continuous quality improvement, and to promote its acceptance as an integral and effective part of benchmarking activity in health services. The Essence of Care, was launched by the Department of Health in England in 2001 to provide a benchmarking tool kit to support continuous improvement in the quality of fundamental aspects of health care, for example, privacy and dignity, nutrition and hygiene. The tool kit is now being effectively used by some frontline staff. However, use is inconsistent, with the value of the tool kit, or the support clinical practice benchmarking requires to be effective, not always recognized or provided by National Health Service managers, who are absorbed with the use of quantitative benchmarking approaches and measurability of comparative performance data. This review of published benchmarking literature, was obtained through an ever-narrowing search strategy commencing from benchmarking within quality improvement literature through to benchmarking activity in health services and including access to not only published examples of benchmarking approaches and models used but the actual consideration of web-based benchmarking data. This supported identification of how benchmarking approaches have developed and been used, remaining true to the basic benchmarking principles of continuous improvement through comparison and sharing (Camp 1989). Descriptions of models and exemplars of quantitative and specifically performance benchmarking activity in industry abound (Camp 1998), with far fewer examples of more qualitative and process benchmarking approaches in use in the public services and then applied to the health service (Bullivant 1998). The literature is also in the main descriptive in its support of the effectiveness of benchmarking activity and although this does not seem to have restricted its popularity in quantitative activity, reticence about the value of the more qualitative approaches, for example Essence of Care, needs to be overcome in order to improve the quality of patient care and experiences. The perceived immeasurability and subjectivity of Essence of Care and clinical practice benchmarks means that these benchmarking approaches are not always accepted or supported by health service organizations as valid benchmarking activity. In conclusion, Essence of Care benchmarking is a sophisticated clinical practice benchmarking approach which needs to be accepted as an integral part of health service benchmarking activity to support improvement in the quality of patient care and experiences.
Implementation of NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Frumkin, Michael; Schultz, Matthew; Jin, Hao-Qiang; Yan, Jerry

2000-01-01

A number of features make Java an attractive but a debatable choice for High Performance Computing (HPC). In order to gauge the applicability of Java to the Computational Fluid Dynamics (CFD) we have implemented NAS Parallel Benchmarks in Java. The performance and scalability of the benchmarks point out the areas where improvement in Java compiler technology and in Java thread implementation would move Java closer to Fortran in the competition for CFD applications.
Benchmarking, Total Quality Management, and Libraries.

ERIC Educational Resources Information Center

Shaughnessy, Thomas W.

1993-01-01

Discussion of the use of Total Quality Management (TQM) in higher education and academic libraries focuses on the identification, collection, and use of reliable data. Methods for measuring quality, including benchmarking, are described; performance measures are considered; and benchmarking techniques are examined. (11 references) (MES)
Developing and Trialling an independent, scalable and repeatable IT-benchmarking procedure for healthcare organisations.

PubMed

Liebe, J D; Hübner, U

2013-01-01

Continuous improvements of IT-performance in healthcare organisations require actionable performance indicators, regularly conducted, independent measurements and meaningful and scalable reference groups. Existing IT-benchmarking initiatives have focussed on the development of reliable and valid indicators, but less on the questions about how to implement an environment for conducting easily repeatable and scalable IT-benchmarks. This study aims at developing and trialling a procedure that meets the afore-mentioned requirements. We chose a well established, regularly conducted (inter-) national IT-survey of healthcare organisations (IT-Report Healthcare) as the environment and offered the participants of the 2011 survey (CIOs of hospitals) to enter a benchmark. The 61 structural and functional performance indicators covered among others the implementation status and integration of IT-systems and functions, global user satisfaction and the resources of the IT-department. Healthcare organisations were grouped by size and ownership. The benchmark results were made available electronically and feedback on the use of these results was requested after several months. Fifty-ninehospitals participated in the benchmarking. Reference groups consisted of up to 141 members depending on the number of beds (size) and the ownership (public vs. private). A total of 122 charts showing single indicator frequency views were sent to each participant. The evaluation showed that 94.1% of the CIOs who participated in the evaluation considered this benchmarking beneficial and reported that they would enter again. Based on the feedback of the participants we developed two additional views that provide a more consolidated picture. The results demonstrate that establishing an independent, easily repeatable and scalable IT-benchmarking procedure is possible and was deemed desirable. Based on these encouraging results a new benchmarking round which includes process indicators is currently conducted.
Using benchmarking techniques and the 2011 maternity practices infant nutrition and care (mPINC) survey to improve performance among peer groups across the United States.

PubMed

Edwards, Roger A; Dee, Deborah; Umer, Amna; Perrine, Cria G; Shealy, Katherine R; Grummer-Strawn, Laurence M

2014-02-01

A substantial proportion of US maternity care facilities engage in practices that are not evidence-based and that interfere with breastfeeding. The CDC Survey of Maternity Practices in Infant Nutrition and Care (mPINC) showed significant variation in maternity practices among US states. The purpose of this article is to use benchmarking techniques to identify states within relevant peer groups that were top performers on mPINC survey indicators related to breastfeeding support. We used 11 indicators of breastfeeding-related maternity care from the 2011 mPINC survey and benchmarking techniques to organize and compare hospital-based maternity practices across the 50 states and Washington, DC. We created peer categories for benchmarking first by region (grouping states by West, Midwest, South, and Northeast) and then by size (grouping states by the number of maternity facilities and dividing each region into approximately equal halves based on the number of facilities). Thirty-four states had scores high enough to serve as benchmarks, and 32 states had scores low enough to reflect the lowest score gap from the benchmark on at least 1 indicator. No state served as the benchmark on more than 5 indicators and no state was furthest from the benchmark on more than 7 indicators. The small peer group benchmarks in the South, West, and Midwest were better than the large peer group benchmarks on 91%, 82%, and 36% of the indicators, respectively. In the West large, the Midwest large, the Midwest small, and the South large peer groups, 4-6 benchmarks showed that less than 50% of hospitals have ideal practice in all states. The evaluation presents benchmarks for peer group state comparisons that provide potential and feasible targets for improvement.
Hospital benchmarking: are U.S. eye hospitals ready?

PubMed

de Korne, Dirk F; van Wijngaarden, Jeroen D H; Sol, Kees J C A; Betz, Robert; Thomas, Richard C; Schein, Oliver D; Klazinga, Niek S

2012-01-01

Benchmarking is increasingly considered a useful management instrument to improve quality in health care, but little is known about its applicability in hospital settings. The aims of this study were to assess the applicability of a benchmarking project in U.S. eye hospitals and compare the results with an international initiative. We evaluated multiple cases by applying an evaluation frame abstracted from the literature to five U.S. eye hospitals that used a set of 10 indicators for efficiency benchmarking. Qualitative analysis entailed 46 semistructured face-to-face interviews with stakeholders, document analyses, and questionnaires. The case studies only partially met the conditions of the evaluation frame. Although learning and quality improvement were stated as overall purposes, the benchmarking initiative was at first focused on efficiency only. No ophthalmic outcomes were included, and clinicians were skeptical about their reporting relevance and disclosure. However, in contrast with earlier findings in international eye hospitals, all U.S. hospitals worked with internal indicators that were integrated in their performance management systems and supported benchmarking. Benchmarking can support performance management in individual hospitals. Having a certain number of comparable institutes provide similar services in a noncompetitive milieu seems to lay fertile ground for benchmarking. International benchmarking is useful only when these conditions are not met nationally. Although the literature focuses on static conditions for effective benchmarking, our case studies show that it is a highly iterative and learning process. The journey of benchmarking seems to be more important than the destination. Improving patient value (health outcomes per unit of cost) requires, however, an integrative perspective where clinicians and administrators closely cooperate on both quality and efficiency issues. If these worlds do not share such a relationship, the added "public" value of benchmarking in health care is questionable.
A benchmarking program to reduce red blood cell outdating: implementation, evaluation, and a conceptual framework.

PubMed

Barty, Rebecca L; Gagliardi, Kathleen; Owens, Wendy; Lauzon, Deborah; Scheuermann, Sheena; Liu, Yang; Wang, Grace; Pai, Menaka; Heddle, Nancy M

2015-07-01

Benchmarking is a quality improvement tool that compares an organization's performance to that of its peers for selected indicators, to improve practice. Processes to develop evidence-based benchmarks for red blood cell (RBC) outdating in Ontario hospitals, based on RBC hospital disposition data from Canadian Blood Services, have been previously reported. These benchmarks were implemented in 160 hospitals provincewide with a multifaceted approach, which included hospital education, inventory management tools and resources, summaries of best practice recommendations, recognition of high-performing sites, and audit tools on the Transfusion Ontario website (http://transfusionontario.org). In this study we describe the implementation process and the impact of the benchmarking program on RBC outdating. A conceptual framework for continuous quality improvement of a benchmarking program was also developed. The RBC outdating rate for all hospitals trended downward continuously from April 2006 to February 2012, irrespective of hospitals' transfusion rates or their distance from the blood supplier. The highest annual outdating rate was 2.82%, at the beginning of the observation period. Each year brought further reductions, with a nadir outdating rate of 1.02% achieved in 2011. The key elements of the successful benchmarking strategy included dynamic targets, a comprehensive and evidence-based implementation strategy, ongoing information sharing, and a robust data system to track information. The Ontario benchmarking program for RBC outdating resulted in continuous and sustained quality improvement. Our conceptual iterative framework for benchmarking provides a guide for institutions implementing a benchmarking program. © 2015 AABB.
Benchmarking Helps Measure Union Programs, Operations.

ERIC Educational Resources Information Center

Mann, Jerry

2001-01-01

Explores three examples of benchmarking by college student unions. Focuses on how a union can collect information from other unions for use as benchmarking standards for the purposes of selling a concept or justifying program increases, or for comparing a union's financial performance to other unions. (EV)
Performance Characteristics of the Multi-Zone NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Jin, Haoqiang; VanderWijngaart, Rob F.

2003-01-01

We describe a new suite of computational benchmarks that models applications featuring multiple levels of parallelism. Such parallelism is often available in realistic flow computations on systems of grids, but had not previously been captured in bench-marks. The new suite, named NPB Multi-Zone, is extended from the NAS Parallel Benchmarks suite, and involves solving the application benchmarks LU, BT and SP on collections of loosely coupled discretization meshes. The solutions on the meshes are updated independently, but after each time step they exchange boundary value information. This strategy provides relatively easily exploitable coarse-grain parallelism between meshes. Three reference implementations are available: one serial, one hybrid using the Message Passing Interface (MPI) and OpenMP, and another hybrid using a shared memory multi-level programming model (SMP+OpenMP). We examine the effectiveness of hybrid parallelization paradigms in these implementations on three different parallel computers. We also use an empirical formula to investigate the performance characteristics of the multi-zone benchmarks.
Analysis of 2D Torus and Hub Topologies of 100Mb/s Ethernet for the Whitney Commodity Computing Testbed

NASA Technical Reports Server (NTRS)

Pedretti, Kevin T.; Fineberg, Samuel A.; Kutler, Paul (Technical Monitor)

1997-01-01

A variety of different network technologies and topologies are currently being evaluated as part of the Whitney Project. This paper reports on the implementation and performance of a Fast Ethernet network configured in a 4x4 2D torus topology in a testbed cluster of 'commodity' Pentium Pro PCs. Several benchmarks were used for performance evaluation: an MPI point to point message passing benchmark, an MPI collective communication benchmark, and the NAS Parallel Benchmarks version 2.2 (NPB2). Our results show that for point to point communication on an unloaded network, the hub and 1 hop routes on the torus have about the same bandwidth and latency. However, the bandwidth decreases and the latency increases on the torus for each additional route hop. Collective communication benchmarks show that the torus provides roughly four times more aggregate bandwidth and eight times faster MPI barrier synchronizations than a hub based network for 16 processor systems. Finally, the SOAPBOX benchmarks, which simulate real-world CFD applications, generally demonstrated substantially better performance on the torus than on the hub. In the few cases the hub was faster, the difference was negligible. In total, our experimental results lead to the conclusion that for Fast Ethernet networks, the torus topology has better performance and scales better than a hub based network.
Issues in benchmarking human reliability analysis methods : a literature review.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lois, Erasmia; Forester, John Alan; Tran, Tuan Q.

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessment (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study is currently underway that compares HRA methods with each other and against operator performance in simulator studies. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted,more » reviewing past benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.« less

Issues in Benchmarking Human Reliability Analysis Methods: A Literature Review

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ronald L. Boring; Stacey M. L. Hendrickson; John A. Forester

There is a diversity of human reliability analysis (HRA) methods available for use in assessing human performance within probabilistic risk assessments (PRA). Due to the significant differences in the methods, including the scope, approach, and underlying models, there is a need for an empirical comparison investigating the validity and reliability of the methods. To accomplish this empirical comparison, a benchmarking study comparing and evaluating HRA methods in assessing operator performance in simulator experiments is currently underway. In order to account for as many effects as possible in the construction of this benchmarking study, a literature review was conducted, reviewing pastmore » benchmarking studies in the areas of psychology and risk assessment. A number of lessons learned through these studies are presented in order to aid in the design of future HRA benchmarking endeavors.« less
XWeB: The XML Warehouse Benchmark

NASA Astrophysics Data System (ADS)

Mahboubi, Hadj; Darmont, Jérôme

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.
Energy saving in WWTP: Daily benchmarking under uncertainty and data availability limitations.

PubMed

Torregrossa, D; Schutz, G; Cornelissen, A; Hernández-Sancho, F; Hansen, J

2016-07-01

Efficient management of Waste Water Treatment Plants (WWTPs) can produce significant environmental and economic benefits. Energy benchmarking can be used to compare WWTPs, identify targets and use these to improve their performance. Different authors have performed benchmark analysis on monthly or yearly basis but their approaches suffer from a time lag between an event, its detection, interpretation and potential actions. The availability of on-line measurement data on many WWTPs should theoretically enable the decrease of the management response time by daily benchmarking. Unfortunately this approach is often impossible because of limited data availability. This paper proposes a methodology to perform a daily benchmark analysis under database limitations. The methodology has been applied to the Energy Online System (EOS) developed in the framework of the project "INNERS" (INNovative Energy Recovery Strategies in the urban water cycle). EOS calculates a set of Key Performance Indicators (KPIs) for the evaluation of energy and process performances. In EOS, the energy KPIs take in consideration the pollutant load in order to enable the comparison between different plants. For example, EOS does not analyse the energy consumption but the energy consumption on pollutant load. This approach enables the comparison of performances for plants with different loads or for a single plant under different load conditions. The energy consumption is measured by on-line sensors, while the pollutant load is measured in the laboratory approximately every 14 days. Consequently, the unavailability of the water quality parameters is the limiting factor in calculating energy KPIs. In this paper, in order to overcome this limitation, the authors have developed a methodology to estimate the required parameters and manage the uncertainty in the estimation. By coupling the parameter estimation with an interval based benchmark approach, the authors propose an effective, fast and reproducible way to manage infrequent inlet measurements. Its use enables benchmarking on a daily basis and prepares the ground for further investigation. Copyright © 2016 Elsevier Inc. All rights reserved.
Benchmarking and the laboratory

PubMed Central

Galloway, M; Nadin, L

2001-01-01

This article describes how benchmarking can be used to assess laboratory performance. Two benchmarking schemes are reviewed, the Clinical Benchmarking Company's Pathology Report and the College of American Pathologists' Q-Probes scheme. The Clinical Benchmarking Company's Pathology Report is undertaken by staff based in the clinical management unit, Keele University with appropriate input from the professional organisations within pathology. Five annual reports have now been completed. Each report is a detailed analysis of 10 areas of laboratory performance. In this review, particular attention is focused on the areas of quality, productivity, variation in clinical practice, skill mix, and working hours. The Q-Probes scheme is part of the College of American Pathologists programme in studies of quality assurance. The Q-Probes scheme and its applicability to pathology in the UK is illustrated by reviewing two recent Q-Probe studies: routine outpatient test turnaround time and outpatient test order accuracy. The Q-Probes scheme is somewhat limited by the small number of UK laboratories that have participated. In conclusion, as a result of the government's policy in the UK, benchmarking is here to stay. Benchmarking schemes described in this article are one way in which pathologists can demonstrate that they are providing a cost effective and high quality service. Key Words: benchmarking • pathology PMID:11477112
Mathematics Content Standards Benchmarks and Performance Standards

ERIC Educational Resources Information Center

New Mexico Public Education Department, 2008

2008-01-01

New Mexico Mathematics Content Standards, Benchmarks, and Performance Standards identify what students should know and be able to do across all grade levels, forming a spiraling framework in the sense that many skills, once introduced, develop over time. While the Performance Standards are set forth at grade-specific levels, they do not exist as…
A comparison of five benchmarks

NASA Technical Reports Server (NTRS)

Huss, Janice E.; Pennline, James A.

1987-01-01

Five benchmark programs were obtained and run on the NASA Lewis CRAY X-MP/24. A comparison was made between the programs codes and between the methods for calculating performance figures. Several multitasking jobs were run to gain experience in how parallel performance is measured.
Benchmarking in the Two-Year Public Postsecondary Sector: A Learning Process

ERIC Educational Resources Information Center

Mitchell, Jennevieve

2015-01-01

The recession prompted reflection on how resource allocation decisions contribute to the performance of community colleges in the United States. Private benchmarking initiatives, most notably those established by the National Higher Education Benchmarking Institute, can only partially begin to address this question. Empirical and financial…
Machine characterization and benchmark performance prediction

NASA Technical Reports Server (NTRS)

Saavedra-Barrera, Rafael H.

1988-01-01

From runs of standard benchmarks or benchmark suites, it is not possible to characterize the machine nor to predict the run time of other benchmarks which have not been run. A new approach to benchmarking and machine characterization is reported. The creation and use of a machine analyzer is described, which measures the performance of a given machine on FORTRAN source language constructs. The machine analyzer yields a set of parameters which characterize the machine and spotlight its strong and weak points. Also described is a program analyzer, which analyzes FORTRAN programs and determines the frequency of execution of each of the same set of source language operations. It is then shown that by combining a machine characterization and a program characterization, we are able to predict with good accuracy the run time of a given benchmark on a given machine. Characterizations are provided for the Cray-X-MP/48, Cyber 205, IBM 3090/200, Amdahl 5840, Convex C-1, VAX 8600, VAX 11/785, VAX 11/780, SUN 3/50, and IBM RT-PC/125, and for the following benchmark programs or suites: Los Alamos (BMK8A1), Baskett, Linpack, Livermore Loops, Madelbrot Set, NAS Kernels, Shell Sort, Smith, Whetstone and Sieve of Erathostenes.
Closed-Loop Neuromorphic Benchmarks

PubMed Central

Stewart, Terrence C.; DeWolf, Travis; Kleinhans, Ashley; Eliasmith, Chris

2015-01-01

Evaluating the effectiveness and performance of neuromorphic hardware is difficult. It is even more difficult when the task of interest is a closed-loop task; that is, a task where the output from the neuromorphic hardware affects some environment, which then in turn affects the hardware's future input. However, closed-loop situations are one of the primary potential uses of neuromorphic hardware. To address this, we present a methodology for generating closed-loop benchmarks that makes use of a hybrid of real physical embodiment and a type of “minimal” simulation. Minimal simulation has been shown to lead to robust real-world performance, while still maintaining the practical advantages of simulation, such as making it easy for the same benchmark to be used by many researchers. This method is flexible enough to allow researchers to explicitly modify the benchmarks to identify specific task domains where particular hardware excels. To demonstrate the method, we present a set of novel benchmarks that focus on motor control for an arbitrary system with unknown external forces. Using these benchmarks, we show that an error-driven learning rule can consistently improve motor control performance across a randomly generated family of closed-loop simulations, even when there are up to 15 interacting joints to be controlled. PMID:26696820
Benchmarking image fusion system design parameters

NASA Astrophysics Data System (ADS)

Howell, Christopher L.

2013-06-01

A clear and absolute method for discriminating between image fusion algorithm performances is presented. This method can effectively be used to assist in the design and modeling of image fusion systems. Specifically, it is postulated that quantifying human task performance using image fusion should be benchmarked to whether the fusion algorithm, at a minimum, retained the performance benefit achievable by each independent spectral band being fused. The established benchmark would then clearly represent the threshold that a fusion system should surpass to be considered beneficial to a particular task. A genetic algorithm is employed to characterize the fused system parameters using a Matlab® implementation of NVThermIP as the objective function. By setting the problem up as a mixed-integer constraint optimization problem, one can effectively look backwards through the image acquisition process: optimizing fused system parameters by minimizing the difference between modeled task difficulty measure and the benchmark task difficulty measure. The results of an identification perception experiment are presented, where human observers were asked to identify a standard set of military targets, and used to demonstrate the effectiveness of the benchmarking process.
A New Performance Improvement Model: Adding Benchmarking to the Analysis of Performance Indicator Data.

PubMed

Al-Kuwaiti, Ahmed; Homa, Karen; Maruthamuthu, Thennarasu

2016-01-01

A performance improvement model was developed that focuses on the analysis and interpretation of performance indicator (PI) data using statistical process control and benchmarking. PIs are suitable for comparison with benchmarks only if the data fall within the statistically accepted limit-that is, show only random variation. Specifically, if there is no significant special-cause variation over a period of time, then the data are ready to be benchmarked. The proposed Define, Measure, Control, Internal Threshold, and Benchmark model is adapted from the Define, Measure, Analyze, Improve, Control (DMAIC) model. The model consists of the following five steps: Step 1. Define the process; Step 2. Monitor and measure the variation over the period of time; Step 3. Check the variation of the process; if stable (no significant variation), go to Step 4; otherwise, control variation with the help of an action plan; Step 4. Develop an internal threshold and compare the process with it; Step 5.1. Compare the process with an internal benchmark; and Step 5.2. Compare the process with an external benchmark. The steps are illustrated through the use of health care-associated infection (HAI) data collected for 2013 and 2014 from the Infection Control Unit, King Fahd Hospital, University of Dammam, Saudi Arabia. Monitoring variation is an important strategy in understanding and learning about a process. In the example, HAI was monitored for variation in 2013, and the need to have a more predictable process prompted the need to control variation by an action plan. The action plan was successful, as noted by the shift in the 2014 data, compared to the historical average, and, in addition, the variation was reduced. The model is subject to limitations: For example, it cannot be used without benchmarks, which need to be calculated the same way with similar patient populations, and it focuses only on the "Analyze" part of the DMAIC model.
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark application.« less
Performance of Landslide-HySEA tsunami model for NTHMP benchmarking validation process

NASA Astrophysics Data System (ADS)

Macias, Jorge

2017-04-01

In its FY2009 Strategic Plan, the NTHMP required that all numerical tsunami inundation models be verified as accurate and consistent through a model benchmarking process. This was completed in 2011, but only for seismic tsunami sources and in a limited manner for idealized solid underwater landslides. Recent work by various NTHMP states, however, has shown that landslide tsunami hazard may be dominant along significant parts of the US coastline, as compared to hazards from other tsunamigenic sources. To perform the above-mentioned validation process, a set of candidate benchmarks were proposed. These benchmarks are based on a subset of available laboratory date sets for solid slide experiments and deformable slide experiments, and include both submarine and subaerial slides. A benchmark based on a historic field event (Valdez, AK, 1964) close the list of proposed benchmarks. The Landslide-HySEA model has participated in the workshop that was organized at Texas A&M University - Galveston, on January 9-11, 2017. The aim of this presentation is to show some of the numerical results obtained for Landslide-HySEA in the framework of this benchmarking validation/verification effort. Acknowledgements. This research has been partially supported by the Junta de Andalucía research project TESELA (P11-RNM7069), the Spanish Government Research project SIMURISK (MTM2015-70490-C02-01-R) and Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech. The GPU computations were performed at the Unit of Numerical Methods (University of Malaga).
7 CFR 245.12 - State agencies and direct certification requirements.

Code of Federal Regulations, 2014 CFR

2014-01-01

... NUTRITION SERVICE, DEPARTMENT OF AGRICULTURE CHILD NUTRITION PROGRAMS DETERMINING ELIGIBILITY FOR FREE AND... performance benchmarks set forth in paragraph (b) of this section for directly certifying children who are.... State agencies must meet performance benchmarks for directly certifying for free school meals children...
Performance Analysis of the ARL Linux Networx Cluster

DTIC Science & Technology

2004-06-01

OVERFLOW, used processors selected by SGE. All benchmarks on the GAMESS, COBALT, LSDYNA and FLUENT. Each code Origin 3800 were executed using IRIX cpusets...scheduler. for these benchmarks defines a missile with grid fins consisting of seventeen million cells [31. 4. Application Performance Results and
The Suite for Embedded Applications and Kernels

DOE Office of Scientific and Technical Information (OSTI.GOV)

2016-05-10

Many applications of high performance embedded computing are limited by performance or power bottlenecks. We havedesigned SEAK, a new benchmark suite, (a) to capture these bottlenecks in a way that encourages creative solutions to these bottlenecks? and (b) to facilitate rigorous, objective, end-user evaluation for their solutions. To avoid biasing solutions toward existing algorithms, SEAK benchmarks use a mission-centric (abstracted from a particular algorithm) andgoal-oriented (functional) specification. To encourage solutions that are any combination of software or hardware, we use an end-user blackbox evaluation that can capture tradeoffs between performance, power, accuracy, size, and weight. The tradeoffs are especially informativemore » for procurement decisions. We call our benchmarks future proof because each mission-centric interface and evaluation remains useful despite shifting algorithmic preferences. It is challenging to create both concise and precise goal-oriented specifications for mission-centric problems. This paper describes the SEAK benchmark suite and presents an evaluation of sample solutions that highlights power and performance tradeoffs.« less
An Exploration of the Gap between Highest and Lowest Ability Readers across 20 Countries

ERIC Educational Resources Information Center

Alivernini, Fabio

2013-01-01

The aim of the present study, based on data from 20 countries, is to identify the pattern of variables (at country, school and student levels), which are typical of students performing below the low international benchmark compared to students performing at the advanced performance benchmark, in the Progress in International Reading Literacy Study…
Benchmark problems for numerical implementations of phase field models

DOE PAGES

Jokisaari, A. M.; Voorhees, P. W.; Guyer, J. E.; ...

2016-10-01

Here, we present the first set of benchmark problems for phase field models that are being developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST). While many scientific research areas use a limited set of well-established software, the growing phase field community continues to develop a wide variety of codes and lacks benchmark problems to consistently evaluate the numerical performance of new implementations. Phase field modeling has become significantly more popular as computational power has increased and is now becoming mainstream, driving the need for benchmark problems to validate and verifymore » new implementations. We follow the example set by the micromagnetics community to develop an evolving set of benchmark problems that test the usability, computational resources, numerical capabilities and physical scope of phase field simulation codes. In this paper, we propose two benchmark problems that cover the physics of solute diffusion and growth and coarsening of a second phase via a simple spinodal decomposition model and a more complex Ostwald ripening model. We demonstrate the utility of benchmark problems by comparing the results of simulations performed with two different adaptive time stepping techniques, and we discuss the needs of future benchmark problems. The development of benchmark problems will enable the results of quantitative phase field models to be confidently incorporated into integrated computational materials science and engineering (ICME), an important goal of the Materials Genome Initiative.« less
Neutron Deep Penetration Calculations in Light Water with Monte Carlo TRIPOLI-4® Variance Reduction Techniques

NASA Astrophysics Data System (ADS)

Lee, Yi-Kang

2017-09-01

Nuclear decommissioning takes place in several stages due to the radioactivity in the reactor structure materials. A good estimation of the neutron activation products distributed in the reactor structure materials impacts obviously on the decommissioning planning and the low-level radioactive waste management. Continuous energy Monte-Carlo radiation transport code TRIPOLI-4 has been applied on radiation protection and shielding analyses. To enhance the TRIPOLI-4 application in nuclear decommissioning activities, both experimental and computational benchmarks are being performed. To calculate the neutron activation of the shielding and structure materials of nuclear facilities, the knowledge of 3D neutron flux map and energy spectra must be first investigated. To perform this type of neutron deep penetration calculations with the Monte Carlo transport code, variance reduction techniques are necessary in order to reduce the uncertainty of the neutron activation estimation. In this study, variance reduction options of the TRIPOLI-4 code were used on the NAIADE 1 light water shielding benchmark. This benchmark document is available from the OECD/NEA SINBAD shielding benchmark database. From this benchmark database, a simplified NAIADE 1 water shielding model was first proposed in this work in order to make the code validation easier. Determination of the fission neutron transport was performed in light water for penetration up to 50 cm for fast neutrons and up to about 180 cm for thermal neutrons. Measurement and calculation results were benchmarked. Variance reduction options and their performance were discussed and compared.
Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

PubMed Central

Sczyrba, Alexander; Hofmann, Peter; Belmann, Peter; Koslicki, David; Janssen, Stefan; Dröge, Johannes; Gregor, Ivan; Majda, Stephan; Fiedler, Jessika; Dahms, Eik; Bremges, Andreas; Fritz, Adrian; Garrido-Oter, Ruben; Jørgensen, Tue Sparholt; Shapiro, Nicole; Blood, Philip D.; Gurevich, Alexey; Bai, Yang; Turaev, Dmitrij; DeMaere, Matthew Z.; Chikhi, Rayan; Nagarajan, Niranjan; Quince, Christopher; Meyer, Fernando; Balvočiūtė, Monika; Hansen, Lars Hestbjerg; Sørensen, Søren J.; Chia, Burton K. H.; Denis, Bertrand; Froula, Jeff L.; Wang, Zhong; Egan, Robert; Kang, Dongwan Don; Cook, Jeffrey J.; Deltel, Charles; Beckstette, Michael; Lemaitre, Claire; Peterlongo, Pierre; Rizk, Guillaume; Lavenier, Dominique; Wu, Yu-Wei; Singer, Steven W.; Jain, Chirag; Strous, Marc; Klingenberg, Heiner; Meinicke, Peter; Barton, Michael; Lingner, Thomas; Lin, Hsin-Hung; Liao, Yu-Chieh; Silva, Genivaldo Gueiros Z.; Cuevas, Daniel A.; Edwards, Robert A.; Saha, Surya; Piro, Vitor C.; Renard, Bernhard Y.; Pop, Mihai; Klenk, Hans-Peter; Göker, Markus; Kyrpides, Nikos C.; Woyke, Tanja; Vorholt, Julia A.; Schulze-Lefert, Paul; Rubin, Edward M.; Darling, Aaron E.; Rattei, Thomas; McHardy, Alice C.

2018-01-01

In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions. PMID:28967888

Family History and Breast Cancer Risk Among Older Women in the Breast Cancer Surveillance Consortium Cohort.

PubMed

Braithwaite, Dejana; Miglioretti, Diana L; Zhu, Weiwei; Demb, Joshua; Trentham-Dietz, Amy; Sprague, Brian; Tice, Jeffrey A; Onega, Tracy; Henderson, Louise M; Buist, Diana S M; Ziv, Elad; Walter, Louise C; Kerlikowske, Karla

2018-04-01

First-degree family history is a strong risk factor for breast cancer, but controversy exists about the magnitude of the association among older women. To determine whether first-degree family history is associated with increased risk of breast cancer among older women, and identify whether the association varies by breast density. Prospective cohort study between 1996 and 2012 from 7 Breast Cancer Surveillance Consortium (BCSC) registries located in New Hampshire, North Carolina, San Francisco Bay area, western Washington state, New Mexico, Colorado, and Vermont. During a mean (SD) follow-up of 6.3 (3.2) years, 10 929 invasive breast cancers were diagnosed in a cohort of 403 268 women 65 years and older with data from 472 220 mammography examinations. We estimated the 5-year cumulative incidence of invasive breast cancer by first-degree family history, breast density, and age groups. Cox proportional hazards models were fit to estimate the association of first-degree family history with risk of invasive breast cancer (after adjustment for breast density, BCSC registry, race/ethnicity, body mass index, postmenopausal hormone therapy use, and benign breast disease for age groups 65 to 74 years and 75 years and older, separately). Data analyses were performed between June 2016 and June 2017. First-degree family history of breast cancer. Incident breast cancer. In 403 268 women 65 years and older, first-degree family history was associated with an increased risk of breast cancer among women ages 65 to 74 years (hazard ratio [HR], 1.48; 95% CI, 1.35-1.61) and 75 years and older (HR, 1.44; 95% CI, 1.28-1.62). Estimates were similar for women 65 to 74 years with first-degree relative's diagnosis age younger than 50 years (HR, 1.47; 95% CI, 1.25-1.73) vs 50 years and older (HR, 1.33; 95% CI, 1.17-1.51) and for women ages 75 years and older with the relative's diagnosis age younger than 50 years (HR, 1.31; 95% CI, 1.05-1.63) vs 50 years and older (HR, 1.55; 95% CI, 1.33-1.81). Among women ages 65 to 74 years, the risk associated with first-degree family history was highest among those with fatty breasts (HR, 1.67; 95% CI, 1.27-2.21), whereas in women 75 years and older the risk associated with family history was highest among those with dense breasts (HR, 1.55; 95% CI, 1.29-1.87). First-degree family history was associated with increased risk of invasive breast cancer in all subgroups of older women irrespective of a relative's age at diagnosis.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Van Der Marck, S. C.

Three nuclear data libraries have been tested extensively using criticality safety benchmark calculations. The three libraries are the new release of the US library ENDF/B-VII.1 (2011), the new release of the Japanese library JENDL-4.0 (2011), and the OECD/NEA library JEFF-3.1 (2006). All calculations were performed with the continuous-energy Monte Carlo code MCNP (version 4C3, as well as version 6-beta1). Around 2000 benchmark cases from the International Handbook of Criticality Safety Benchmark Experiments (ICSBEP) were used. The results were analyzed per ICSBEP category, and per element. Overall, the three libraries show similar performance on most criticality safety benchmarks. The largest differencesmore » are probably caused by elements such as Be, C, Fe, Zr, W. (authors)« less
High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems

DOE PAGES

Dongarra, Jack; Heroux, Michael A.; Luszczek, Piotr

2015-08-17

Here, we describe a new high-performance conjugate-gradient (HPCG) benchmark. HPCG is composed of computations and data-access patterns commonly found in scientific applications. HPCG strives for a better correlation to existing codes from the computational science domain and to be representative of their performance. Furthermore, HPCG is meant to help drive the computer system design and implementation in directions that will better impact future performance improvement.
Using Benchmarking Techniques and the 2011 Maternity Practices Infant Nutrition and Care (mPINC) Survey to Improve Performance among Peer Groups across the United States

PubMed Central

Edwards, Roger A.; Dee, Deborah; Umer, Amna; Perrine, Cria G.; Shealy, Katherine R.; Grummer-Strawn, Laurence M.

2015-01-01

Background A substantial proportion of US maternity care facilities engage in practices that are not evidence-based and that interfere with breastfeeding. The CDC Survey of Maternity Practices in Infant Nutrition and Care (mPINC) showed significant variation in maternity practices among US states. Objective The purpose of this article is to use benchmarking techniques to identify states within relevant peer groups that were top performers on mPINC survey indicators related to breastfeeding support. Methods We used 11 indicators of breastfeeding-related maternity care from the 2011 mPINC survey and benchmarking techniques to organize and compare hospital-based maternity practices across the 50 states and Washington, DC. We created peer categories for benchmarking first by region (grouping states by West, Midwest, South, and Northeast) and then by size (grouping states by the number of maternity facilities and dividing each region into approximately equal halves based on the number of facilities). Results Thirty-four states had scores high enough to serve as benchmarks, and 32 states had scores low enough to reflect the lowest score gap from the benchmark on at least 1 indicator. No state served as the benchmark on more than 5 indicators and no state was furthest from the benchmark on more than 7 indicators. The small peer group benchmarks in the South, West, and Midwest were better than the large peer group benchmarks on 91%, 82%, and 36% of the indicators, respectively. In the West large, the Midwest large, the Midwest small, and the South large peer groups, 4–6 benchmarks showed that less than 50% of hospitals have ideal practice in all states. Conclusion The evaluation presents benchmarks for peer group state comparisons that provide potential and feasible targets for improvement. PMID:24394963
NAS Parallel Benchmark Results 11-96. 1.0

NASA Technical Reports Server (NTRS)

Bailey, David H.; Bailey, David; Chancellor, Marisa K. (Technical Monitor)

1997-01-01

The NAS Parallel Benchmarks have been developed at NASA Ames Research Center to study the performance of parallel supercomputers. The eight benchmark problems are specified in a "pencil and paper" fashion. In other words, the complete details of the problem to be solved are given in a technical document, and except for a few restrictions, benchmarkers are free to select the language constructs and implementation techniques best suited for a particular system. These results represent the best results that have been reported to us by the vendors for the specific 3 systems listed. In this report, we present new NPB (Version 1.0) performance results for the following systems: DEC Alpha Server 8400 5/440, Fujitsu VPP Series (VX, VPP300, and VPP700), HP/Convex Exemplar SPP2000, IBM RS/6000 SP P2SC node (120 MHz), NEC SX-4/32, SGI/CRAY T3E, SGI Origin200, and SGI Origin2000. We also report High Performance Fortran (HPF) based NPB results for IBM SP2 Wide Nodes, HP/Convex Exemplar SPP2000, and SGI/CRAY T3D. These results have been submitted by Applied Parallel Research (APR) and Portland Group Inc. (PGI). We also present sustained performance per dollar for Class B LU, SP and BT benchmarks.
Benchmarking Big Data Systems and the BigData Top100 List.

PubMed

Baru, Chaitanya; Bhandarkar, Milind; Nambiar, Raghunath; Poess, Meikel; Rabl, Tilmann

2013-03-01

"Big data" has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost on a weekly basis. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TCP), there is neither a clear definition of the performance of big data systems nor a generally agreed upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has become established in order to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts that have been undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, through this article, we also solicit community input into this process.
New NAS Parallel Benchmarks Results

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Saphir, William; VanderWijngaart, Rob; Woo, Alex; Kutler, Paul (Technical Monitor)

1997-01-01

NPB2 (NAS (NASA Advanced Supercomputing) Parallel Benchmarks 2) is an implementation, based on Fortran and the MPI (message passing interface) message passing standard, of the original NAS Parallel Benchmark specifications. NPB2 programs are run with little or no tuning, in contrast to NPB vendor implementations, which are highly optimized for specific architectures. NPB2 results complement, rather than replace, NPB results. Because they have not been optimized by vendors, NPB2 implementations approximate the performance a typical user can expect for a portable parallel program on distributed memory parallel computers. Together these results provide an insightful comparison of the real-world performance of high-performance computers. New NPB2 features: New implementation (CG), new workstation class problem sizes, new serial sample versions, more performance statistics.
Simulation-based comprehensive benchmarking of RNA-seq aligners

PubMed Central

Baruzzo, Giacomo; Hayer, Katharina E; Kim, Eun Ji; Di Camillo, Barbara; FitzGerald, Garret A; Grant, Gregory R

2018-01-01

Alignment is the first step in most RNA-seq analysis pipelines, and the accuracy of downstream analyses depends heavily on it. Unlike most steps in the pipeline, alignment is particularly amenable to benchmarking with simulated data. We performed a comprehensive benchmarking of 14 common splice-aware aligners for base, read, and exon junction-level accuracy and compared default with optimized parameters. We found that performance varied by genome complexity, and accuracy and popularity were poorly correlated. The most widely cited tool underperforms for most metrics, particularly when using default settings. PMID:27941783
Iowa's Community College Adult Literacy Annual Report. Program Year 2007, July 1, 2006-June 30, 2007

ERIC Educational Resources Information Center

Division of Community Colleges and Workforce Preparation, Iowa Department of Education, 2007

2007-01-01

This comprehensive document replaces the previously published Benchmark Report, Benchmark Report Executive Summary, Iowa's Community College Basic Literacy Skills Credential Report, Iowa GED Statistical Report, GED Annual Performance Report and Iowa's Adult Literacy Program National Reporting System Annual Performance Report (Graphic…
Benchmarking Universities' Efficiency Indicators in the Presence of Internal Heterogeneity

ERIC Educational Resources Information Center

Agasisti, Tommaso; Bonomi, Francesca

2014-01-01

When benchmarking its performance, a university is usually considered as a single strategic unit. According to the evidence, however, lower levels within an organisation (such as faculties, departments and schools) play a significant role in institutional governance, affecting the overall performance. In this article, an empirical analysis was…
Benchmarking infrastructure for mutation text mining

PubMed Central

2014-01-01

Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600
Benchmarking infrastructure for mutation text mining.

PubMed

Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

2014-02-25

Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
Shift Verification and Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pandya, Tara M.; Evans, Thomas M.; Davidson, Gregory G

2016-09-07

This documentation outlines the verification and validation of Shift for the Consortium for Advanced Simulation of Light Water Reactors (CASL). Five main types of problems were used for validation: small criticality benchmark problems; full-core reactor benchmarks for light water reactors; fixed-source coupled neutron-photon dosimetry benchmarks; depletion/burnup benchmarks; and full-core reactor performance benchmarks. We compared Shift results to measured data and other simulated Monte Carlo radiation transport code results, and found very good agreement in a variety of comparison measures. These include prediction of critical eigenvalue, radial and axial pin power distributions, rod worth, leakage spectra, and nuclide inventories over amore » burn cycle. Based on this validation of Shift, we are confident in Shift to provide reference results for CASL benchmarking.« less
FY 2004 Top 200 Users Survey Report

DTIC Science & Technology

2004-10-01

Rating 82% 79% 80% ACSI Federal Government Benchmark* 71.1% 70.2% 70.9% DTIC Excels by +10.9 +8.8 +9.1 *ACSI is the official service quality benchmark...ACSI Federal Government Benchmark* 71.1% 70.2% 70.9% DTIC Excels by +10.9 +8.8 +9.1 *ACSI is the official service quality benchmark for the Federal...Users Survey % DTIC’s Other Overall Product/ Service Quality and Performance: Not only were we able to ascertain product and service usage data from
Benchmarking for Excellence and the Nursing Process

NASA Technical Reports Server (NTRS)

Sleboda, Claire

1999-01-01

Nursing is a service profession. The services provided are essential to life and welfare. Therefore, setting the benchmark for high quality care is fundamental. Exploring the definition of a benchmark value will help to determine a best practice approach. A benchmark is the descriptive statement of a desired level of performance against which quality can be judged. It must be sufficiently well understood by managers and personnel in order that it may serve as a standard against which to measure value.
Results Oriented Benchmarking: The Evolution of Benchmarking at NASA from Competitive Comparisons to World Class Space Partnerships

NASA Technical Reports Server (NTRS)

Bell, Michael A.

1999-01-01

Informal benchmarking using personal or professional networks has taken place for many years at the Kennedy Space Center (KSC). The National Aeronautics and Space Administration (NASA) recognized early on, the need to formalize the benchmarking process for better utilization of resources and improved benchmarking performance. The need to compete in a faster, better, cheaper environment has been the catalyst for formalizing these efforts. A pioneering benchmarking consortium was chartered at KSC in January 1994. The consortium known as the Kennedy Benchmarking Clearinghouse (KBC), is a collaborative effort of NASA and all major KSC contractors. The charter of this consortium is to facilitate effective benchmarking, and leverage the resulting quality improvements across KSC. The KBC acts as a resource with experienced facilitators and a proven process. One of the initial actions of the KBC was to develop a holistic methodology for Center-wide benchmarking. This approach to Benchmarking integrates the best features of proven benchmarking models (i.e., Camp, Spendolini, Watson, and Balm). This cost-effective alternative to conventional Benchmarking approaches has provided a foundation for consistent benchmarking at KSC through the development of common terminology, tools, and techniques. Through these efforts a foundation and infrastructure has been built which allows short duration benchmarking studies yielding results gleaned from world class partners that can be readily implemented. The KBC has been recognized with the Silver Medal Award (in the applied research category) from the International Benchmarking Clearinghouse.
Benchmarking the American Society of Breast Surgeon Member Performance for More Than a Million Quality Measure-Patient Encounters.

PubMed

Landercasper, Jeffrey; Fayanju, Oluwadamilola M; Bailey, Lisa; Berry, Tiffany S; Borgert, Andrew J; Buras, Robert; Chen, Steven L; Degnim, Amy C; Froman, Joshua; Gass, Jennifer; Greenberg, Caprice; Mautner, Starr Koslow; Krontiras, Helen; Ramirez, Luis D; Sowden, Michelle; Wexelman, Barbara; Wilke, Lee; Rao, Roshni

2018-02-01

Nine breast cancer quality measures (QM) were selected by the American Society of Breast Surgeons (ASBrS) for the Centers for Medicare and Medicaid Services (CMS) Quality Payment Programs (QPP) and other performance improvement programs. We report member performance. Surgeons entered QM data into an electronic registry. For each QM, aggregate "performance met" (PM) was reported (median, range and percentiles) and benchmarks (target goals) were calculated by CMS methodology, specifically, the Achievable Benchmark of Care™ (ABC) method. A total of 1,286,011 QM encounters were captured from 2011-2015. For 7 QM, first and last PM rates were as follows: (1) needle biopsy (95.8, 98.5%), (2) specimen imaging (97.9, 98.8%), (3) specimen orientation (98.5, 98.3%), (4) sentinel node use (95.1, 93.4%), (5) antibiotic selection (98.0, 99.4%), (6) antibiotic duration (99.0, 99.8%), and (7) no surgical site infection (98.8, 98.9%); all p values < 0.001 for trends. Variability and reasons for noncompliance by surgeon for each QM were identified. The CMS-calculated target goals (ABC™ benchmarks) for PM for 6 QM were 100%, suggesting that not meeting performance is a "never should occur" event. Surgeons self-reported a large number of specialty-specific patient-measure encounters into a registry for self-assessment and participation in QPP. Despite high levels of performance demonstrated initially in 2011 with minimal subsequent change, the ASBrS concluded "perfect" performance was not a realistic goal for QPP. Thus, after review of our normative performance data, the ASBrS recommended different benchmarks than CMS for each QM.
Visual Arts Performance Standards at Grades 4, 8 and 12 for North Dakota Visual Art Standards and Benchmarks.

ERIC Educational Resources Information Center

Shaw-Elgin, Linda; Jackson, Jane; Kurkowski, Bob; Riehl, Lori; Syvertson, Karen; Whitney, Linda

This document outlines the performance standards for visual arts in North Dakota public schools, grades K-12. Four levels of performance are provided for each benchmark by North Dakota educators for K-4, 5-8, and 9-12 grade levels. Level 4 describes advanced proficiency; Level 3, proficiency; Level 2, partial proficiency; and Level 1, novice. Each…
Standardised Benchmarking in the Quest for Orthologs

PubMed Central

Altenhoff, Adrian M.; Boeckmann, Brigitte; Capella-Gutierrez, Salvador; Dalquen, Daniel A.; DeLuca, Todd; Forslund, Kristoffer; Huerta-Cepas, Jaime; Linard, Benjamin; Pereira, Cécile; Pryszcz, Leszek P.; Schreiber, Fabian; Sousa da Silva, Alan; Szklarczyk, Damian; Train, Clément-Marie; Bork, Peer; Lecompte, Odile; von Mering, Christian; Xenarios, Ioannis; Sjölander, Kimmen; Juhl Jensen, Lars; Martin, Maria J.; Muffato, Matthieu; Gabaldón, Toni; Lewis, Suzanna E.; Thomas, Paul D.; Sonnhammer, Erik; Dessimoz, Christophe

2016-01-01

The identification of evolutionarily related genes across different species—orthologs in particular—forms the backbone of many comparative, evolutionary, and functional genomic analyses. Achieving high accuracy in orthology inference is thus essential. Yet the true evolutionary history of genes, required to ascertain orthology, is generally unknown. Furthermore, orthologs are used for very different applications across different phyla, with different requirements in terms of the precision-recall trade-off. As a result, assessing the performance of orthology inference methods remains difficult for both users and method developers. Here, we present a community effort to establish standards in orthology benchmarking and facilitate orthology benchmarking through an automated web-based service (http://orthology.benchmarkservice.org). Using this new service, we characterise the performance of 15 well-established orthology inference methods and resources on a battery of 20 different benchmarks. Standardised benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimal requirement for new tools and resources, and guides the development of more accurate orthology inference methods. PMID:27043882
Benchmarking CRISPR on-target sgRNA design.

PubMed

Yan, Jifang; Chuai, Guohui; Zhou, Chi; Zhu, Chenyu; Yang, Jing; Zhang, Chao; Gu, Feng; Xu, Han; Wei, Jia; Liu, Qi

2017-02-15

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-based gene editing has been widely implemented in various cell types and organisms. A major challenge in the effective application of the CRISPR system is the need to design highly efficient single-guide RNA (sgRNA) with minimal off-target cleavage. Several tools are available for sgRNA design, while limited tools were compared. In our opinion, benchmarking the performance of the available tools and indicating their applicable scenarios are important issues. Moreover, whether the reported sgRNA design rules are reproducible across different sgRNA libraries, cell types and organisms remains unclear. In our study, a systematic and unbiased benchmark of the sgRNA predicting efficacy was performed on nine representative on-target design tools, based on six benchmark data sets covering five different cell types. The benchmark study presented here provides novel quantitative insights into the available CRISPR tools. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Improving Federal Education Programs through an Integrated Performance and Benchmarking System.

ERIC Educational Resources Information Center

Department of Education, Washington, DC. Office of the Under Secretary.

This document highlights the problems with current federal education program data collection activities and lists several factors that make movement toward a possible solution, then discusses the vision for the Integrated Performance and Benchmarking System (IPBS), a vision of an Internet-based system for harvesting information from states about…
Benchmarking in Academic Pharmacy Departments

PubMed Central

Chisholm-Burns, Marie; Nappi, Jean; Gubbins, Paul O.; Ross, Leigh Ann

2010-01-01

Benchmarking in academic pharmacy, and recommendations for the potential uses of benchmarking in academic pharmacy departments are discussed in this paper. Benchmarking is the process by which practices, procedures, and performance metrics are compared to an established standard or best practice. Many businesses and industries use benchmarking to compare processes and outcomes, and ultimately plan for improvement. Institutions of higher learning have embraced benchmarking practices to facilitate measuring the quality of their educational and research programs. Benchmarking is used internally as well to justify the allocation of institutional resources or to mediate among competing demands for additional program staff or space. Surveying all chairs of academic pharmacy departments to explore benchmarking issues such as department size and composition, as well as faculty teaching, scholarly, and service productivity, could provide valuable information. To date, attempts to gather this data have had limited success. We believe this information is potentially important, urge that efforts to gather it should be continued, and offer suggestions to achieve full participation. PMID:21179251
Benchmarking in academic pharmacy departments.

PubMed

Bosso, John A; Chisholm-Burns, Marie; Nappi, Jean; Gubbins, Paul O; Ross, Leigh Ann

2010-10-11

Benchmarking in academic pharmacy, and recommendations for the potential uses of benchmarking in academic pharmacy departments are discussed in this paper. Benchmarking is the process by which practices, procedures, and performance metrics are compared to an established standard or best practice. Many businesses and industries use benchmarking to compare processes and outcomes, and ultimately plan for improvement. Institutions of higher learning have embraced benchmarking practices to facilitate measuring the quality of their educational and research programs. Benchmarking is used internally as well to justify the allocation of institutional resources or to mediate among competing demands for additional program staff or space. Surveying all chairs of academic pharmacy departments to explore benchmarking issues such as department size and composition, as well as faculty teaching, scholarly, and service productivity, could provide valuable information. To date, attempts to gather this data have had limited success. We believe this information is potentially important, urge that efforts to gather it should be continued, and offer suggestions to achieve full participation.
Predicting Cost/Performance Trade-Offs for Whitney: A Commodity Computing Cluster

NASA Technical Reports Server (NTRS)

Becker, Jeffrey C.; Nitzberg, Bill; VanderWijngaart, Rob F.; Kutler, Paul (Technical Monitor)

1997-01-01

Recent advances in low-end processor and network technology have made it possible to build a "supercomputer" out of commodity components. We develop simple models of the NAS Parallel Benchmarks version 2 (NPB 2) to explore the cost/performance trade-offs involved in building a balanced parallel computer supporting a scientific workload. We develop closed form expressions detailing the number and size of messages sent by each benchmark. Coupling these with measured single processor performance, network latency, and network bandwidth, our models predict benchmark performance to within 30%. A comparison based on total system cost reveals that current commodity technology (200 MHz Pentium Pros with 100baseT Ethernet) is well balanced for the NPBs up to a total system cost of around $1,000,000.
Performance benchmark of LHCb code on state-of-the-art x86 architectures

NASA Astrophysics Data System (ADS)

Campora Perez, D. H.; Neufeld, N.; Schwemmer, R.

2015-12-01

For Run 2 of the LHC, LHCb is replacing a significant part of its event filter farm with new compute nodes. For the evaluation of the best performing solution, we have developed a method to convert our high level trigger application into a stand-alone, bootable benchmark image. With additional instrumentation we turned it into a self-optimising benchmark which explores techniques such as late forking, NUMA balancing and optimal number of threads, i.e. it automatically optimises box-level performance. We have run this procedure on a wide range of Haswell-E CPUs and numerous other architectures from both Intel and AMD, including also the latest Intel micro-blade servers. We present results in terms of performance, power consumption, overheads and relative cost.
RISC Processors and High Performance Computing

NASA Technical Reports Server (NTRS)

Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.
Benchmarks for target tracking

NASA Astrophysics Data System (ADS)

Dunham, Darin T.; West, Philip D.

2011-09-01

The term benchmark originates from the chiseled horizontal marks that surveyors made, into which an angle-iron could be placed to bracket ("bench") a leveling rod, thus ensuring that the leveling rod can be repositioned in exactly the same place in the future. A benchmark in computer terms is the result of running a computer program, or a set of programs, in order to assess the relative performance of an object by running a number of standard tests and trials against it. This paper will discuss the history of simulation benchmarks that are being used by multiple branches of the military and agencies of the US government. These benchmarks range from missile defense applications to chemical biological situations. Typically, a benchmark is used with Monte Carlo runs in order to tease out how algorithms deal with variability and the range of possible inputs. We will also describe problems that can be solved by a benchmark.
Benchmarking. Issues in the Design and Implementation of a Benchmarking System for Employment and Training Programs for Young People.

ERIC Educational Resources Information Center

Coughlin, David C.; Bielen, Rhonda P.

This paper has been prepared to assist the United States Department of Labor to explore new approaches to evaluating and measuring the performance of employment and training activities for youth. As one of several tools for evaluating success of local youth training programs, "benchmarking" provides a system for measuring the development…
Developing a benchmark for emotional analysis of music

PubMed Central

Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

Music emotion recognition (MER) field rapidly expanded in the last decade. Many new methods and new audio features are developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the data representation diversity and scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the ‘Emotion in Music’ task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER. PMID:28282400
Developing a benchmark for emotional analysis of music.

PubMed

Aljanaki, Anna; Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

Music emotion recognition (MER) field rapidly expanded in the last decade. Many new methods and new audio features are developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the data representation diversity and scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the 'Emotion in Music' task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER.
A large-scale benchmark of gene prioritization methods.

PubMed

Guala, Dimitri; Sonnhammer, Erik L L

2017-04-21

In order to maximize the use of results from high-throughput experimental studies, e.g. GWAS, for identification and diagnostics of new disease-associated genes, it is important to have properly analyzed and benchmarked gene prioritization tools. While prospective benchmarks are underpowered to provide statistically significant results in their attempt to differentiate the performance of gene prioritization tools, a strategy for retrospective benchmarking has been missing, and new tools usually only provide internal validations. The Gene Ontology(GO) contains genes clustered around annotation terms. This intrinsic property of GO can be utilized in construction of robust benchmarks, objective to the problem domain. We demonstrate how this can be achieved for network-based gene prioritization tools, utilizing the FunCoup network. We use cross-validation and a set of appropriate performance measures to compare state-of-the-art gene prioritization algorithms: three based on network diffusion, NetRank and two implementations of Random Walk with Restart, and MaxLink that utilizes network neighborhood. Our benchmark suite provides a systematic and objective way to compare the multitude of available and future gene prioritization tools, enabling researchers to select the best gene prioritization tool for the task at hand, and helping to guide the development of more accurate methods.
Paediatric International Nursing Study: using person-centred key performance indicators to benchmark children's services.

PubMed

McCance, Tanya; Wilson, Val; Kornman, Kelly

2016-07-01

The aim of the Paediatric International Nursing Study was to explore the utility of key performance indicators in developing person-centred practice across a range of services provided to sick children. The objective addressed in this paper was evaluating the use of these indicators to benchmark services internationally. This study builds on primary research, which produced indicators that were considered novel both in terms of their positive orientation and use in generating data that privileges the patient voice. This study extends this research through wider testing on an international platform within paediatrics. The overall methodological approach was a realistic evaluation used to evaluate the implementation of the key performance indicators, which combined an integrated development and evaluation methodology. The study involved children's wards/hospitals in Australia (six sites across three states) and Europe (seven sites across four countries). Qualitative and quantitative methods were used during the implementation process, however, this paper reports the quantitative data only, which used survey, observations and documentary review. The findings demonstrate the quality of care being delivered to children and their families across different international sites. The benchmarking does, however, highlight some differences between paediatric and general hospitals, and between the different key performance indicators across all the sites. The findings support the use of the key performance indicators as a novel method to benchmark services internationally. Whilst the data collected across 20 paediatric sites suggest services are more similar than different, benchmarking illuminates variations that encourage a critical dialogue about what works and why. The transferability of the key performance indicators and measurement framework across different settings has significant implications for practice. The findings offer an approach to benchmarking and celebrating the successes within practice, while learning from partners across the globe in further developing person-centred cultures. © 2016 John Wiley & Sons Ltd.
Benchmark experiments at ASTRA facility on definition of space distribution of {sup 235}U fission reaction rate

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobrov, A. A.; Boyarinov, V. F.; Glushkov, A. E.

2012-07-01

Results of critical experiments performed at five ASTRA facility configurations modeling the high-temperature helium-cooled graphite-moderated reactors are presented. Results of experiments on definition of space distribution of {sup 235}U fission reaction rate performed at four from these five configurations are presented more detail. Analysis of available information showed that all experiments on criticality at these five configurations are acceptable for use them as critical benchmark experiments. All experiments on definition of space distribution of {sup 235}U fission reaction rate are acceptable for use them as physical benchmark experiments. (authors)
Sensitivity Analysis of OECD Benchmark Tests in BISON

DOE Office of Scientific and Technical Information (OSTI.GOV)

Swiler, Laura Painton; Gamble, Kyle; Schmidt, Rodney C.

2015-09-01

This report summarizes a NEAMS (Nuclear Energy Advanced Modeling and Simulation) project focused on sensitivity analysis of a fuels performance benchmark problem. The benchmark problem was defined by the Uncertainty Analysis in Modeling working group of the Nuclear Science Committee, part of the Nuclear Energy Agency of the Organization for Economic Cooperation and Development (OECD ). The benchmark problem involv ed steady - state behavior of a fuel pin in a Pressurized Water Reactor (PWR). The problem was created in the BISON Fuels Performance code. Dakota was used to generate and analyze 300 samples of 17 input parameters defining coremore » boundary conditions, manuf acturing tolerances , and fuel properties. There were 24 responses of interest, including fuel centerline temperatures at a variety of locations and burnup levels, fission gas released, axial elongation of the fuel pin, etc. Pearson and Spearman correlatio n coefficients and Sobol' variance - based indices were used to perform the sensitivity analysis. This report summarizes the process and presents results from this study.« less
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
PMLB: a large benchmark suite for machine learning evaluation and comparison.

PubMed

Olson, Randal S; La Cava, William; Orzechowski, Patryk; Urbanowicz, Ryan J; Moore, Jason H

2017-01-01

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
75 FR 68606 - Notice of Submission for OMB Review

Federal Register 2010, 2011, 2012, 2013, 2014

2010-11-08

... provides data for internationally benchmarking U.S. performance in mathematics and science at the fourth... years in more than 50 countries and provides assessment data for internationally benchmarking U.S...
Utilizing a Trauma Systems Approach to Benchmark and Improve Combat Casualty Care

DTIC Science & Technology

2010-07-01

modern battlefield utilizing evidence - based medicine . The development of injury care benchmarks enhanced the evolution of the combat casualty care performance improvement process within the trauma system.
Small drinking water systems under spatiotemporal water quality variability: a risk-based performance benchmarking framework.

PubMed

Bereskie, Ty; Haider, Husnain; Rodriguez, Manuel J; Sadiq, Rehan

2017-08-23

Traditional approaches for benchmarking drinking water systems are binary, based solely on the compliance and/or non-compliance of one or more water quality performance indicators against defined regulatory guidelines/standards. The consequence of water quality failure is dependent on location within a water supply system as well as time of the year (i.e., season) with varying levels of water consumption. Conventional approaches used for water quality comparison purposes fail to incorporate spatiotemporal variability and degrees of compliance and/or non-compliance. This can lead to misleading or inaccurate performance assessment data used in the performance benchmarking process. In this research, a hierarchical risk-based water quality performance benchmarking framework is proposed to evaluate small drinking water systems (SDWSs) through cross-comparison amongst similar systems. The proposed framework (R WQI framework) is designed to quantify consequence associated with seasonal and location-specific water quality issues in a given drinking water supply system to facilitate more efficient decision-making for SDWSs striving for continuous performance improvement. Fuzzy rule-based modelling is used to address imprecision associated with measuring performance based on singular water quality guidelines/standards and the uncertainties present in SDWS operations and monitoring. This proposed R WQI framework has been demonstrated using data collected from 16 SDWSs in Newfoundland and Labrador and Quebec, Canada, and compared to the Canadian Council of Ministers of the Environment WQI, a traditional, guidelines/standard-based approach. The study found that the R WQI framework provides an in-depth state of water quality and benchmarks SDWSs more rationally based on the frequency of occurrence and consequence of failure events.
Benchmarking gate-based quantum computers

NASA Astrophysics Data System (ADS)

Michielsen, Kristel; Nocon, Madita; Willsch, Dennis; Jin, Fengping; Lippert, Thomas; De Raedt, Hans

2017-11-01

With the advent of public access to small gate-based quantum processors, it becomes necessary to develop a benchmarking methodology such that independent researchers can validate the operation of these processors. We explore the usefulness of a number of simple quantum circuits as benchmarks for gate-based quantum computing devices and show that circuits performing identity operations are very simple, scalable and sensitive to gate errors and are therefore very well suited for this task. We illustrate the procedure by presenting benchmark results for the IBM Quantum Experience, a cloud-based platform for gate-based quantum computing.

Managing for Results in America's Great City Schools 2014: Results from Fiscal Year 2012-13. A Report of the Performance Measurement and Benchmarking Project

ERIC Educational Resources Information Center

Council of the Great City Schools, 2014

2014-01-01

In 2002 the "Council of the Great City Schools" and its members set out to develop performance measures that could be used to improve business operations in urban public school districts. The Council launched the "Performance Measurement and Benchmarking Project" to achieve these objectives. The purposes of the project was to:…
Developing a Benchmarking Process in Perfusion: A Report of the Perfusion Downunder Collaboration

PubMed Central

Baker, Robert A.; Newland, Richard F.; Fenton, Carmel; McDonald, Michael; Willcox, Timothy W.; Merry, Alan F.

2012-01-01

Abstract: Improving and understanding clinical practice is an appropriate goal for the perfusion community. The Perfusion Downunder Collaboration has established a multi-center perfusion focused database aimed at achieving these goals through the development of quantitative quality indicators for clinical improvement through benchmarking. Data were collected using the Perfusion Downunder Collaboration database from procedures performed in eight Australian and New Zealand cardiac centers between March 2007 and February 2011. At the Perfusion Downunder Meeting in 2010, it was agreed by consensus, to report quality indicators (QI) for glucose level, arterial outlet temperature, and pCO2 management during cardiopulmonary bypass. The values chosen for each QI were: blood glucose ≥4 mmol/L and ≤10 mmol/L; arterial outlet temperature ≤37°C; and arterial blood gas pCO2 ≥ 35 and ≤45 mmHg. The QI data were used to derive benchmarks using the Achievable Benchmark of Care (ABC™) methodology to identify the incidence of QIs at the best performing centers. Five thousand four hundred and sixty-five procedures were evaluated to derive QI and benchmark data. The incidence of the blood glucose QI ranged from 37–96% of procedures, with a benchmark value of 90%. The arterial outlet temperature QI occurred in 16–98% of procedures with the benchmark of 94%; while the arterial pCO2 QI occurred in 21–91%, with the benchmark value of 80%. We have derived QIs and benchmark calculations for the management of several key aspects of cardiopulmonary bypass to provide a platform for improving the quality of perfusion practice. PMID:22730861
[Benchmarking and other functions of ROM: back to basics].

PubMed

Barendregt, M

2015-01-01

Since 2011 outcome data in the Dutch mental health care have been collected on a national scale. This has led to confusion about the position of benchmarking in the system known as routine outcome monitoring (rom). To provide insight into the various objectives and uses of aggregated outcome data. A qualitative review was performed and the findings were analysed. Benchmarking is a strategy for finding best practices and for improving efficacy and it belongs to the domain of quality management. Benchmarking involves comparing outcome data by means of instrumentation and is relatively tolerant with regard to the validity of the data. Although benchmarking is a function of rom, it must be differentiated form other functions from rom. Clinical management, public accountability, research, payment for performance and information for patients are all functions of rom which require different ways of data feedback and which make different demands on the validity of the underlying data. Benchmarking is often wrongly regarded as being simply a synonym for 'comparing institutions'. It is, however, a method which includes many more factors; it can be used to improve quality and has a more flexible approach to the validity of outcome data and is less concerned than other rom functions about funding and the amount of information given to patients. Benchmarking can make good use of currently available outcome data.
Benchmarking of venous thromboembolism prophylaxis practice with ENT.UK guidelines.

PubMed

Al-Qahtani, Ali S

2017-05-01

The aim of this study was to benchmark our guidelines of prevention of venous thromboembolism (VTE) in ENT surgical population against ENT.UK guidelines, and also to encourage healthcare providers to utilize benchmarking as an effective method of improving performance. The study design is prospective descriptive analysis. The setting of this study is tertiary referral centre (Assir Central Hospital, Abha, Saudi Arabia). In this study, we are benchmarking our practice guidelines of the prevention of VTE in the ENT surgical population against that of ENT.UK guidelines to mitigate any gaps. ENT guidelines 2010 were downloaded from the ENT.UK Website. Our guidelines were compared with the possibilities that either our performance meets or fall short of ENT.UK guidelines. Immediate corrective actions will take place if there is quality chasm between the two guidelines. ENT.UK guidelines are evidence-based and updated which may serve as role-model for adoption and benchmarking. Our guidelines were accordingly amended to contain all factors required in providing a quality service to ENT surgical patients. While not given appropriate attention, benchmarking is a useful tool in improving quality of health care. It allows learning from others' practices and experiences, and works towards closing any quality gaps. In addition, benchmarking clinical outcomes is critical for quality improvement and informing decisions concerning service provision. It is recommended to be included on the list of quality improvement methods of healthcare services.
How Are You Doing? Key Performance Indicators and Benchmarking

ERIC Educational Resources Information Center

Fahey, John P.

2011-01-01

School business officials need to "know and show" that their operations are well managed. To do so, they ask themselves questions, such as "How are they doing? How do they compare with others? Are they making progress fast enough? Are they using the best practices?" Using key performance indicators (KPIs) and benchmarking as regular parts of their…
Is Higher Better? Determinants and Comparisons of Performance on the Major Field Test in Business

ERIC Educational Resources Information Center

Bielinska-Kwapisz, Agnieszka; Brown, F. William; Semenik, Richard

2012-01-01

Student performance on the Major Field Achievement Test in Business is an important benchmark for college of business programs. The authors' results indicate that such benchmarking can only be meaningful if certain student characteristics are taken into account. The differences in achievement between cohorts are explored in detail by separating…
Access to a simulator is not enough: the benefits of virtual reality training based on peer-group-derived benchmarks--a randomized controlled trial.

PubMed

von Websky, Martin W; Raptis, Dimitri A; Vitz, Martina; Rosenthal, Rachel; Clavien, P A; Hahnloser, Dieter

2013-11-01

Virtual reality (VR) simulators are widely used to familiarize surgical novices with laparoscopy, but VR training methods differ in efficacy. In the present trial, self-controlled basic VR training (SC-training) was tested against training based on peer-group-derived benchmarks (PGD-training). First, novice laparoscopic residents were randomized into a SC group (n = 34), and a group using PGD-benchmarks (n = 34) for basic laparoscopic training. After completing basic training, both groups performed 60 VR laparoscopic cholecystectomies for performance analysis. Primary endpoints were simulator metrics; secondary endpoints were program adherence, trainee motivation, and training efficacy. Altogether, 66 residents completed basic training, and 3,837 of 3,960 (96.8 %) cholecystectomies were available for analysis. Course adherence was good, with only two dropouts, both in the SC-group. The PGD-group spent more time and repetitions in basic training until the benchmarks were reached and subsequently showed better performance in the readout cholecystectomies: Median time (gallbladder extraction) showed significant differences of 520 s (IQR 354-738 s) in SC-training versus 390 s (IQR 278-536 s) in the PGD-group (p < 0.001) and 215 s (IQR 175-276 s) in experts, respectively. Path length of the right instrument also showed significant differences, again with the PGD-training group being more efficient. Basic VR laparoscopic training based on PGD benchmarks with external assessment is superior to SC training, resulting in higher trainee motivation and better performance in simulated laparoscopic cholecystectomies. We recommend such a basic course based on PGD benchmarks before advancing to more elaborate VR training.
Validation of tsunami inundation model TUNA-RP using OAR-PMEL-135 benchmark problem set

NASA Astrophysics Data System (ADS)

Koh, H. L.; Teh, S. Y.; Tan, W. K.; Kh'ng, X. Y.

2017-05-01

A standard set of benchmark problems, known as OAR-PMEL-135, is developed by the US National Tsunami Hazard Mitigation Program for tsunami inundation model validation. Any tsunami inundation model must be tested for its accuracy and capability using this standard set of benchmark problems before it can be gainfully used for inundation simulation. The authors have previously developed an in-house tsunami inundation model known as TUNA-RP. This inundation model solves the two-dimensional nonlinear shallow water equations coupled with a wet-dry moving boundary algorithm. This paper presents the validation of TUNA-RP against the solutions provided in the OAR-PMEL-135 benchmark problem set. This benchmark validation testing shows that TUNA-RP can indeed perform inundation simulation with accuracy consistent with that in the tested benchmark problem set.
Preliminary Results for the OECD/NEA Time Dependent Benchmark using Rattlesnake, Rattlesnake-IQS and TDKENO

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeHart, Mark D.; Mausolff, Zander; Weems, Zach

2016-08-01

One goal of the MAMMOTH M&S project is to validate the analysis capabilities within MAMMOTH. Historical data has shown limited value for validation of full three-dimensional (3D) multi-physics methods. Initial analysis considered the TREAT startup minimum critical core and one of the startup transient tests. At present, validation is focusing on measurements taken during the M8CAL test calibration series. These exercises will valuable in preliminary assessment of the ability of MAMMOTH to perform coupled multi-physics calculations; calculations performed to date are being used to validate the neutron transport solver Rattlesnake\\cite{Rattlesnake} and the fuels performance code BISON. Other validation projects outsidemore » of TREAT are available for single-physics benchmarking. Because the transient solution capability of Rattlesnake is one of the key attributes that makes it unique for TREAT transient simulations, validation of the transient solution of Rattlesnake using other time dependent kinetics benchmarks has considerable value. The Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) has recently developed a computational benchmark for transient simulations. This benchmark considered both two-dimensional (2D) and 3D configurations for a total number of 26 different transients. All are negative reactivity insertions, typically returning to the critical state after some time.« less
Python/Lua Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Busby, L.

This is an adaptation of the pre-existing Scimark benchmark code to a variety of Python and Lua implementations. It also measures performance of the Fparser expression parser and C and C++ code on a variety of simple scientific expressions.
Kaiser Permanente's performance improvement system, Part 1: From benchmarking to executing on strategic priorities.

PubMed

Schilling, Lisa; Chase, Alide; Kehrli, Sommer; Liu, Amy Y; Stiefel, Matt; Brentari, Ruth

2010-11-01

By 2004, senior leaders at Kaiser Permanente, the largest not-for-profit health plan in the United States, recognizing variations across service areas in quality, safety, service, and efficiency, began developing a performance improvement (PI) system to realizing best-in-class quality performance across all 35 medical centers. MEASURING SYSTEMWIDE PERFORMANCE: In 2005, a Web-based data dashboard, "Big Q," which tracks the performance of each medical center and service area against external benchmarks and internal goals, was created. PLANNING FOR PI AND BENCHMARKING PERFORMANCE: In 2006, Kaiser Permanente national and regional continued planning the PI system, and in 2007, quality, medical group, operations, and information technology leaders benchmarked five high-performing organizations to identify capabilities required to achieve consistent best-in-class organizational performance. THE PI SYSTEM: The PI system addresses the six capabilities: leadership priority setting, a systems approach to improvement, measurement capability, a learning organization, improvement capacity, and a culture of improvement. PI "deep experts" (mentors) consult with national, regional, and local leaders, and more than 500 improvement advisors are trained to manage portfolios of 90-120 day improvement initiatives at medical centers. Between the second quarter of 2008 and the first quarter of 2009, performance across all Kaiser Permanente medical centers improved on the Big Q metrics. The lessons learned in implementing and sustaining PI as it becomes fully integrated into all levels of Kaiser Permanente can be generalized to other health care systems, hospitals, and other health care organizations.
ERK/p38 MAPK inhibition reduces radio-resistance to a pulsed proton beam in breast cancer stem cells

NASA Astrophysics Data System (ADS)

Jung, Myung-Hwan; Park, Jeong Chan

2015-10-01

Recent studies have identified highly tumorigenic cells with stem cell-like characteristics, termed cancer stem cells (CSCs) in human cancers. CSCs are resistant to conventional radiotherapy and chemotherapy owing to their high DNA repair ability and oncogene overexpression. However, the mechanisms regulating CSC radio-resistance, particularly proton beam resistance, remain unclear. We isolated CSCs from the breast cancer cell lines MCF-7 and MDA-MB-231, which expressed the characteristic breast CSC membrane protein markers CD44+/CD24-/ low , and irradiated the CSCs with pulsed proton beams. We confirmed that CSCs were resistant to pulsed proton beams and showed that treatment with p38 and ERK inhibitors reduced CSC radio-resistance. Based on these results, BCSC radio-resistance can be reduced during proton beam therapy by co-treatment with ERK1/2 or p38 inhibitors, a novel approach to breast cancer therapy.
Medical school benchmarking - from tools to programmes.

PubMed

Wilkinson, Tim J; Hudson, Judith N; Mccoll, Geoffrey J; Hu, Wendy C Y; Jolly, Brian C; Schuwirth, Lambert W T

2015-02-01

Benchmarking among medical schools is essential, but may result in unwanted effects. To apply a conceptual framework to selected benchmarking activities of medical schools. We present an analogy between the effects of assessment on student learning and the effects of benchmarking on medical school educational activities. A framework by which benchmarking can be evaluated was developed and applied to key current benchmarking activities in Australia and New Zealand. The analogy generated a conceptual framework that tested five questions to be considered in relation to benchmarking: what is the purpose? what are the attributes of value? what are the best tools to assess the attributes of value? what happens to the results? and, what is the likely "institutional impact" of the results? If the activities were compared against a blueprint of desirable medical graduate outcomes, notable omissions would emerge. Medical schools should benchmark their performance on a range of educational activities to ensure quality improvement and to assure stakeholders that standards are being met. Although benchmarking potentially has positive benefits, it could also result in perverse incentives with unforeseen and detrimental effects on learning if it is undertaken using only a few selected assessment tools.
Toxicological benchmarks for screening potential contaminants of concern for effects on terrestrial plants: 1994 revision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.; Suter, G.W. II

1994-09-01

One of the initial stages in ecological risk assessment for hazardous waste sites is screening contaminants to determine which of them are worthy of further consideration as contaminants of potential concern. This process is termed contaminant screening. It is performed by comparing measured ambient concentrations of chemicals to benchmark concentrations. Currently, no standard benchmark concentrations exist for assessing contaminants in soil with respect to their toxicity to plants. This report presents a standard method for deriving benchmarks for this purpose (phytotoxicity benchmarks), a set of data concerning effects of chemicals in soil or soil solution on plants, and a setmore » of phytotoxicity benchmarks for 38 chemicals potentially associated with United States Department of Energy (DOE) sites. In addition, background information on the phytotoxicity and occurrence of the chemicals in soils is presented, and literature describing the experiments from which data were drawn for benchmark derivation is reviewed. Chemicals that are found in soil at concentrations exceeding both the phytotoxicity benchmark and the background concentration for the soil type should be considered contaminants of potential concern.« less
Toxicological Benchmarks for Screening Potential Contaminants of Concern for Effects on Terrestrial Plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Suter, G.W. II

1993-01-01

One of the initial stages in ecological risk assessment for hazardous waste sites is screening contaminants to determine which of them are worthy of further consideration as contaminants of potential concern. This process is termed contaminant screening. It is performed by comparing measured ambient concentrations of chemicals to benchmark concentrations. Currently, no standard benchmark concentrations exist for assessing contaminants in soil with respect to their toxicity to plants. This report presents a standard method for deriving benchmarks for this purpose (phytotoxicity benchmarks), a set of data concerning effects of chemicals in soil or soil solution on plants, and a setmore » of phytotoxicity benchmarks for 38 chemicals potentially associated with United States Department of Energy (DOE) sites. In addition, background information on the phytotoxicity and occurrence of the chemicals in soils is presented, and literature describing the experiments from which data were drawn for benchmark derivation is reviewed. Chemicals that are found in soil at concentrations exceeding both the phytotoxicity benchmark and the background concentration for the soil type should be considered contaminants of potential concern.« less
A benchmarking method to measure dietary absorption efficiency of chemicals by fish.

PubMed

Xiao, Ruiyang; Adolfsson-Erici, Margaretha; Åkerman, Gun; McLachlan, Michael S; MacLeod, Matthew

2013-12-01

Understanding the dietary absorption efficiency of chemicals in the gastrointestinal tract of fish is important from both a scientific and a regulatory point of view. However, reported fish absorption efficiencies for well-studied chemicals are highly variable. In the present study, the authors developed and exploited an internal chemical benchmarking method that has the potential to reduce uncertainty and variability and, thus, to improve the precision of measurements of fish absorption efficiency. The authors applied the benchmarking method to measure the gross absorption efficiency for 15 chemicals with a wide range of physicochemical properties and structures. They selected 2,2',5,6'-tetrachlorobiphenyl (PCB53) and decabromodiphenyl ethane as absorbable and nonabsorbable benchmarks, respectively. Quantities of chemicals determined in fish were benchmarked to the fraction of PCB53 recovered in fish, and quantities of chemicals determined in feces were benchmarked to the fraction of decabromodiphenyl ethane recovered in feces. The performance of the benchmarking procedure was evaluated based on the recovery of the test chemicals and precision of absorption efficiency from repeated tests. Benchmarking did not improve the precision of the measurements; after benchmarking, however, the median recovery for 15 chemicals was 106%, and variability of recoveries was reduced compared with before benchmarking, suggesting that benchmarking could account for incomplete extraction of chemical in fish and incomplete collection of feces from different tests. © 2013 SETAC.
A Privacy-Preserving Platform for User-Centric Quantitative Benchmarking

NASA Astrophysics Data System (ADS)

Herrmann, Dominik; Scheuer, Florian; Feustel, Philipp; Nowey, Thomas; Federrath, Hannes

We propose a centralised platform for quantitative benchmarking of key performance indicators (KPI) among mutually distrustful organisations. Our platform offers users the opportunity to request an ad-hoc benchmarking for a specific KPI within a peer group of their choice. Architecture and protocol are designed to provide anonymity to its users and to hide the sensitive KPI values from other clients and the central server. To this end, we integrate user-centric peer group formation, exchangeable secure multi-party computation protocols, short-lived ephemeral key pairs as pseudonyms, and attribute certificates. We show by empirical evaluation of a prototype that the performance is acceptable for reasonably sized peer groups.
Benchmark Testing of a New 56Fe Evaluation for Criticality Safety Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leal, Luiz C; Ivanov, E.

2015-01-01

The SAMMY code was used to evaluate resonance parameters of the 56Fe cross section in the resolved resonance energy range of 0–2 MeV using transmission data, capture, elastic, inelastic, and double differential elastic cross sections. The resonance analysis was performed with the code SAMMY that fits R-matrix resonance parameters using the generalized least-squares technique (Bayes’ theory). The evaluation yielded a set of resonance parameters that reproduced the experimental data very well, along with a resonance parameter covariance matrix for data uncertainty calculations. Benchmark tests were conducted to assess the evaluation performance in benchmark calculations.
Benchmarking Diagnostic Algorithms on an Electrical Power System Testbed

NASA Technical Reports Server (NTRS)

Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Wright, Stephanie

2009-01-01

Diagnostic algorithms (DAs) are key to enabling automated health management. These algorithms are designed to detect and isolate anomalies of either a component or the whole system based on observations received from sensors. In recent years a wide range of algorithms, both model-based and data-driven, have been developed to increase autonomy and improve system reliability and affordability. However, the lack of support to perform systematic benchmarking of these algorithms continues to create barriers for effective development and deployment of diagnostic technologies. In this paper, we present our efforts to benchmark a set of DAs on a common platform using a framework that was developed to evaluate and compare various performance metrics for diagnostic technologies. The diagnosed system is an electrical power system, namely the Advanced Diagnostics and Prognostics Testbed (ADAPT) developed and located at the NASA Ames Research Center. The paper presents the fundamentals of the benchmarking framework, the ADAPT system, description of faults and data sets, the metrics used for evaluation, and an in-depth analysis of benchmarking results obtained from testing ten diagnostic algorithms on the ADAPT electrical power system testbed.
Generating Shifting Workloads to Benchmark Adaptability in Relational Database Systems

NASA Astrophysics Data System (ADS)

Rabl, Tilmann; Lang, Andreas; Hackl, Thomas; Sick, Bernhard; Kosch, Harald

A large body of research concerns the adaptability of database systems. Many commercial systems already contain autonomic processes that adapt configurations as well as data structures and data organization. Yet there is virtually no possibility for a just measurement of the quality of such optimizations. While standard benchmarks have been developed that simulate real-world database applications very precisely, none of them considers variations in workloads produced by human factors. Today’s benchmarks test the performance of database systems by measuring peak performance on homogeneous request streams. Nevertheless, in systems with user interaction access patterns are constantly shifting. We present a benchmark that simulates a web information system with interaction of large user groups. It is based on the analysis of a real online eLearning management system with 15,000 users. The benchmark considers the temporal dependency of user interaction. Main focus is to measure the adaptability of a database management system according to shifting workloads. We will give details on our design approach that uses sophisticated pattern analysis and data mining techniques.

Evaluation and optimization of virtual screening workflows with DEKOIS 2.0--a public library of challenging docking benchmark sets.

PubMed

Bauer, Matthias R; Ibrahim, Tamer M; Vogel, Simon M; Boeckler, Frank M

2013-06-24

The application of molecular benchmarking sets helps to assess the actual performance of virtual screening (VS) workflows. To improve the efficiency of structure-based VS approaches, the selection and optimization of various parameters can be guided by benchmarking. With the DEKOIS 2.0 library, we aim to further extend and complement the collection of publicly available decoy sets. Based on BindingDB bioactivity data, we provide 81 new and structurally diverse benchmark sets for a wide variety of different target classes. To ensure a meaningful selection of ligands, we address several issues that can be found in bioactivity data. We have improved our previously introduced DEKOIS methodology with enhanced physicochemical matching, now including the consideration of molecular charges, as well as a more sophisticated elimination of latent actives in the decoy set (LADS). We evaluate the docking performance of Glide, GOLD, and AutoDock Vina with our data sets and highlight existing challenges for VS tools. All DEKOIS 2.0 benchmark sets will be made accessible at http://www.dekois.com.
Surgeon-Specific Reports in General Surgery: Establishing Benchmarks for Peer Comparison Within a Single Hospital.

PubMed

Hatfield, Mark D; Ashton, Carol M; Bass, Barbara L; Shirkey, Beverly A

2016-02-01

Methods to assess a surgeon's individual performance based on clinically meaningful outcomes have not been fully developed, due to small numbers of adverse outcomes and wide variation in case volumes. The Achievable Benchmark of Care (ABC) method addresses these issues by identifying benchmark-setting surgeons with high levels of performance and greater case volumes. This method was used to help surgeons compare their surgical practice to that of their peers by using merged National Surgical Quality Improvement Program (NSQIP) and Metabolic and Bariatric Surgery Accreditation and Quality Improvement Program (MBSAQIP) data to generate surgeon-specific reports. A retrospective cohort study at a single institution's department of surgery was conducted involving 107 surgeons (8,660 cases) over 5.5 years. Stratification of more than 32,000 CPT codes into 16 CPT clusters served as the risk adjustment. Thirty-day outcomes of interest included surgical site infection (SSI), acute kidney injury (AKI), and mortality. Performance characteristics of the ABC method were explored by examining how many surgeons were identified as benchmark-setters in view of volume and outcome rates within CPT clusters. For the data captured, most surgeons performed cases spanning a median of 5 CPT clusters (range 1 to 15 clusters), with a median of 26 cases (range 1 to 776 cases) and a median of 2.8 years (range 0 to 5.5 years). The highest volume surgeon for that CPT cluster set the benchmark for 6 of 16 CPT clusters for SSIs, 8 of 16 CPT clusters for AKIs, and 9 of 16 CPT clusters for mortality. The ABC method appears to be a sound and useful approach to identifying benchmark-setting surgeons within a single institution. Such surgeons may be able to help their peers improve their performance. Copyright © 2016 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
Assessment of Static Delamination Propagation Capabilities in Commercial Finite Element Codes Using Benchmark Analysis

NASA Technical Reports Server (NTRS)

Orifici, Adrian C.; Krueger, Ronald

2010-01-01

With capabilities for simulating delamination growth in composite materials becoming available, the need for benchmarking and assessing these capabilities is critical. In this study, benchmark analyses were performed to assess the delamination propagation simulation capabilities of the VCCT implementations in Marc TM and MD NastranTM. Benchmark delamination growth results for Double Cantilever Beam, Single Leg Bending and End Notched Flexure specimens were generated using a numerical approach. This numerical approach was developed previously, and involves comparing results from a series of analyses at different delamination lengths to a single analysis with automatic crack propagation. Specimens were analyzed with three-dimensional and two-dimensional models, and compared with previous analyses using Abaqus . The results demonstrated that the VCCT implementation in Marc TM and MD Nastran(TradeMark) was capable of accurately replicating the benchmark delamination growth results and that the use of the numerical benchmarks offers advantages over benchmarking using experimental and analytical results.
Can data-driven benchmarks be used to set the goals of healthy people 2010?

PubMed Central

Allison, J; Kiefe, C I; Weissman, N W

1999-01-01

OBJECTIVES: Expert panels determined the public health goals of Healthy People 2000 subjectively. The present study examined whether data-driven benchmarks provide a better alternative. METHODS: We developed the "pared-mean" method to define from data the best achievable health care practices. We calculated the pared-mean benchmark for screening mammography from the 1994 National Health Interview Survey, using the metropolitan statistical area as the "provider" unit. Beginning with the best-performing provider and adding providers in descending sequence, we established the minimum provider subset that included at least 10% of all women surveyed on this question. The pared-mean benchmark is then the proportion of women in this subset who received mammography. RESULTS: The pared-mean benchmark for screening mammography was 71%, compared with the Healthy People 2000 goal of 60%. CONCLUSIONS: For Healthy People 2010, benchmarks derived from data reflecting the best available care provide viable alternatives to consensus-derived targets. We are currently pursuing additional refinements to the data-driven pared-mean benchmark approach. PMID:9987466
Excited, Proud, and Accomplished: Exploring the Effects of Feedback Supplemented with Web-Based Peer Benchmarking on Self-Regulated Learning in Marketing Classrooms

ERIC Educational Resources Information Center

Raska, David

2014-01-01

This research explores and tests the effect of an innovative performance feedback practice--feedback supplemented with web-based peer benchmarking--through a lens of social cognitive framework for self-regulated learning. The results suggest that providing performance feedback with references to exemplary peer output is positively associated with…
Maximizing Use of Extension Beef Cattle Benchmarks Data Derived from Cow Herd Appraisal Performance Software

ERIC Educational Resources Information Center

Ramsay, Jennifer M.; Hanna, Lauren L. Hulsman; Ringwall, Kris A.

2016-01-01

One goal of Extension is to provide practical information that makes a difference to producers. Cow Herd Appraisal Performance Software (CHAPS) has provided beef producers with production benchmarks for 30 years, creating a large historical data set. Many such large data sets contain useful information but are underutilized. Our goal was to create…
Performance Evaluation and Benchmarking of Next Intelligent Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

del Pobil, Angel; Madhavan, Raj; Bonsignorio, Fabio

Performance Evaluation and Benchmarking of Intelligent Systems presents research dedicated to the subject of performance evaluation and benchmarking of intelligent systems by drawing from the experiences and insights of leading experts gained both through theoretical development and practical implementation of intelligent systems in a variety of diverse application domains. This contributed volume offers a detailed and coherent picture of state-of-the-art, recent developments, and further research areas in intelligent systems. The chapters cover a broad range of applications, such as assistive robotics, planetary surveying, urban search and rescue, and line tracking for automotive assembly. Subsystems or components described in this bookmore » include human-robot interaction, multi-robot coordination, communications, perception, and mapping. Chapters are also devoted to simulation support and open source software for cognitive platforms, providing examples of the type of enabling underlying technologies that can help intelligent systems to propagate and increase in capabilities. Performance Evaluation and Benchmarking of Intelligent Systems serves as a professional reference for researchers and practitioners in the field. This book is also applicable to advanced courses for graduate level students and robotics professionals in a wide range of engineering and related disciplines including computer science, automotive, healthcare, manufacturing, and service robotics.« less
The Army Pollution Prevention Program: Improving Performance Through Benchmarking.

DTIC Science & Technology

1995-06-01

Washington, DC 20503. 1. AGENCY USE ONLY (Leave Blank) 2. REPORT DATE June 1995 3. REPORT TYPE AND DATES COVERED Final 4. TITLE AND SUBTITLE...unlimited 12b. DISTRIBUTION CODE 13. ABSTRACT (Maximum 200 words) This report investigates the feasibility of using benchmarking as a method for...could use to determine to what degree it should integrate benchmarking with other quality management tools to support the pollution prevention program
Benchmarking hypercube hardware and software

NASA Technical Reports Server (NTRS)

Grunwald, Dirk C.; Reed, Daniel A.

1986-01-01

It was long a truism in computer systems design that balanced systems achieve the best performance. Message passing parallel processors are no different. To quantify the balance of a hypercube design, an experimental methodology was developed and the associated suite of benchmarks was applied to several existing hypercubes. The benchmark suite includes tests of both processor speed in the absence of internode communication and message transmission speed as a function of communication patterns.
Funnel plot control limits to identify poorly performing healthcare providers when there is uncertainty in the value of the benchmark.

PubMed

Manktelow, Bradley N; Seaton, Sarah E; Evans, T Alun

2016-12-01

There is an increasing use of statistical methods, such as funnel plots, to identify poorly performing healthcare providers. Funnel plots comprise the construction of control limits around a benchmark and providers with outcomes falling outside the limits are investigated as potential outliers. The benchmark is usually estimated from observed data but uncertainty in this estimate is usually ignored when constructing control limits. In this paper, the use of funnel plots in the presence of uncertainty in the value of the benchmark is reviewed for outcomes from a Binomial distribution. Two methods to derive the control limits are shown: (i) prediction intervals; (ii) tolerance intervals Tolerance intervals formally include the uncertainty in the value of the benchmark while prediction intervals do not. The probability properties of 95% control limits derived using each method were investigated through hypothesised scenarios. Neither prediction intervals nor tolerance intervals produce funnel plot control limits that satisfy the nominal probability characteristics when there is uncertainty in the value of the benchmark. This is not necessarily to say that funnel plots have no role to play in healthcare, but that without the development of intervals satisfying the nominal probability characteristics they must be interpreted with care. © The Author(s) 2014.
CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction

PubMed Central

Puton, Tomasz; Kozlowski, Lukasz P.; Rother, Kristian M.; Bujnicki, Janusz M.

2013-01-01

We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks. PMID:23435231
International land Model Benchmarking (ILAMB) Package v002.00

DOE Data Explorer

Collier, Nathaniel [Oak Ridge National Laboratory; Hoffman, Forrest M. [Oak Ridge National Laboratory; Mu, Mingquan [University of California, Irvine; Randerson, James T. [University of California, Irvine; Riley, William J. [Lawrence Berkeley National Laboratory

2016-05-09

As a contribution to International Land Model Benchmarking (ILAMB) Project, we are providing new analysis approaches, benchmarking tools, and science leadership. The goal of ILAMB is to assess and improve the performance of land models through international cooperation and to inform the design of new measurement campaigns and field studies to reduce uncertainties associated with key biogeochemical processes and feedbacks. ILAMB is expected to be a primary analysis tool for CMIP6 and future model-data intercomparison experiments. This team has developed initial prototype benchmarking systems for ILAMB, which will be improved and extended to include ocean model metrics and diagnostics.
International land Model Benchmarking (ILAMB) Package v001.00

DOE Data Explorer

Mu, Mingquan [University of California, Irvine; Randerson, James T. [University of California, Irvine; Riley, William J. [Lawrence Berkeley National Laboratory; Hoffman, Forrest M. [Oak Ridge National Laboratory

2016-05-02

As a contribution to International Land Model Benchmarking (ILAMB) Project, we are providing new analysis approaches, benchmarking tools, and science leadership. The goal of ILAMB is to assess and improve the performance of land models through international cooperation and to inform the design of new measurement campaigns and field studies to reduce uncertainties associated with key biogeochemical processes and feedbacks. ILAMB is expected to be a primary analysis tool for CMIP6 and future model-data intercomparison experiments. This team has developed initial prototype benchmarking systems for ILAMB, which will be improved and extended to include ocean model metrics and diagnostics.
Engine Benchmarking - Final CRADA Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wallner, Thomas

Detailed benchmarking of the powertrains of three light-duty vehicles was performed. Results were presented and provided to CRADA partners. The vehicles included a MY2011 Audi A4, a MY2012 Mini Cooper and a MY2014 Nissan Versa.
Rural and urban transit district benchmarking : effectiveness and efficiency guidance document.

DOT National Transportation Integrated Search

2011-05-01

Rural and urban transit systems have sought ways to compare performance across agencies, : identifying successful service delivery strategies and applying these concepts to achieve : successful results within their agency. Benchmarking is a method us...
Present Status and Extensions of the Monte Carlo Performance Benchmark

NASA Astrophysics Data System (ADS)

Hoogenboom, J. Eduard; Petrovic, Bojan; Martin, William R.

2014-06-01

The NEA Monte Carlo Performance benchmark started in 2011 aiming to monitor over the years the abilities to perform a full-size Monte Carlo reactor core calculation with a detailed power production for each fuel pin with axial distribution. This paper gives an overview of the contributed results thus far. It shows that reaching a statistical accuracy of 1 % for most of the small fuel zones requires about 100 billion neutron histories. The efficiency of parallel execution of Monte Carlo codes on a large number of processor cores shows clear limitations for computer clusters with common type computer nodes. However, using true supercomputers the speedup of parallel calculations is increasing up to large numbers of processor cores. More experience is needed from calculations on true supercomputers using large numbers of processors in order to predict if the requested calculations can be done in a short time. As the specifications of the reactor geometry for this benchmark test are well suited for further investigations of full-core Monte Carlo calculations and a need is felt for testing other issues than its computational performance, proposals are presented for extending the benchmark to a suite of benchmark problems for evaluating fission source convergence for a system with a high dominance ratio, for coupling with thermal-hydraulics calculations to evaluate the use of different temperatures and coolant densities and to study the correctness and effectiveness of burnup calculations. Moreover, other contemporary proposals for a full-core calculation with realistic geometry and material composition will be discussed.
Benchmarking in pathology: development of a benchmarking complexity unit and associated key performance indicators.

PubMed

Neil, Amanda; Pfeffer, Sally; Burnett, Leslie

2013-01-01

This paper details the development of a new type of pathology laboratory productivity unit, the benchmarking complexity unit (BCU). The BCU provides a comparative index of laboratory efficiency, regardless of test mix. It also enables estimation of a measure of how much complex pathology a laboratory performs, and the identification of peer organisations for the purposes of comparison and benchmarking. The BCU is based on the theory that wage rates reflect productivity at the margin. A weighting factor for the ratio of medical to technical staff time was dynamically calculated based on actual participant site data. Given this weighting, a complexity value for each test, at each site, was calculated. The median complexity value (number of BCUs) for that test across all participating sites was taken as its complexity value for the Benchmarking in Pathology Program. The BCU allowed implementation of an unbiased comparison unit and test listing that was found to be a robust indicator of the relative complexity for each test. Employing the BCU data, a number of Key Performance Indicators (KPIs) were developed, including three that address comparative organisational complexity, analytical depth and performance efficiency, respectively. Peer groups were also established using the BCU combined with simple organisational and environmental metrics. The BCU has enabled productivity statistics to be compared between organisations. The BCU corrects for differences in test mix and workload complexity of different organisations and also allows for objective stratification into peer groups.
Using a visual plate waste study to monitor menu performance.

PubMed

Connors, Priscilla L; Rozell, Sarah B

2004-01-01

Two visual plate waste studies were conducted in 1-week phases over a 1-year period in an acute care hospital. A total of 383 trays were evaluated in the first phase and 467 in the second. Food items were ranked for consumption from a low (1) to high (6) score, with a score of 4.0 set as the benchmark denoting a minimum level of acceptable consumption. In the first phase two entrees, four starches, all of the vegetables, sliced white bread, and skim milk scored below the benchmark. As a result six menu items were replaced and one was modified. In the second phase all entrees scored at or above 4.0, as did seven vegetables, and a dinner roll that replaced sliced white bread. Skim milk continued to score below the benchmark. A visual plate waste study assists in benchmarking performance, planning menu changes, and assessing effectiveness.
Analysis of 100Mb/s Ethernet for the Whitney Commodity Computing Testbed

NASA Technical Reports Server (NTRS)

Fineberg, Samuel A.; Pedretti, Kevin T.; Kutler, Paul (Technical Monitor)

1997-01-01

We evaluate the performance of a Fast Ethernet network configured with a single large switch, a single hub, and a 4x4 2D torus topology in a testbed cluster of "commodity" Pentium Pro PCs. We also evaluated a mixed network composed of ethernet hubs and switches. An MPI collective communication benchmark, and the NAS Parallel Benchmarks version 2.2 (NPB2) show that the torus network performs best for all sizes that we were able to test (up to 16 nodes). For larger networks the ethernet switch outperforms the hub, though its performance is far less than peak. The hub/switch combination tests indicate that the NAS parallel benchmarks are relatively insensitive to hub densities of less than 7 nodes per hub.
FireHose Streaming Benchmarks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karl Anderson, Steve Plimpton

2015-01-27

The FireHose Streaming Benchmarks are a suite of stream-processing benchmarks defined to enable comparison of streaming software and hardware, both quantitatively vis-a-vis the rate at which they can process data, and qualitatively by judging the effort involved to implement and run the benchmarks. Each benchmark has two parts. The first is a generator which produces and outputs datums at a high rate in a specific format. The second is an analytic which reads the stream of datums and is required to perform a well-defined calculation on the collection of datums, typically to find anomalous datums that have been created inmore » the stream by the generator. The FireHose suite provides code for the generators, sample code for the analytics (which users are free to re-implement in their own custom frameworks), and a precise definition of each benchmark calculation.« less

Benchmarking of HEU Mental Annuli Critical Assemblies with Internally Reflected Graphite Cylinder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiaobo, Liu; Bess, John D.; Marshall, Margaret A.

Three experimental configurations of critical assemblies, performed in 1963 at the Oak Ridge Critical Experiment Facility, which are assembled using three different diameter HEU annuli (15-9 inches, 15-7 inches and 13-7 inches) metal annuli with internally reflected graphite cylinder are evaluated and benchmarked. The experimental uncertainties which are 0.00055, 0.00055 and 0.00055 respectively, and biases to the detailed benchmark models which are -0.00179, -0.00189 and -0.00114 respectively, were determined, and the experimental benchmark keff results were obtained for both detailed and simplified model. The calculation results for both detailed and simplified models using MCNP6-1.0 and ENDF VII.1 agree well tomore » the benchmark experimental results with a difference of less than 0.2%. These are acceptable benchmark experiments for inclusion in the ICSBEP Handbook.« less
A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madduri, Kamesh; Ediger, David; Jiang, Karl

2009-02-15

We present a new lock-free parallel algorithm for computing betweenness centralityof massive small-world networks. With minor changes to the data structures, ouralgorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 millionmore » vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.« less
A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madduri, Kamesh; Ediger, David; Jiang, Karl

2009-05-29

We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in the HPCS SSCA#2 Graph Analysis benchmark, which has been extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the ThreadStorm processor, and a single-socket Sun multicore server with the UltraSparc T2 processor.more » For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.« less
Optimization of Deep Drilling Performance - Development and Benchmark Testing of Advanced Diamond Product Drill Bits & HP/HT Fluids to Significantly Improve Rates of Penetration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alan Black; Arnis Judzis

2005-09-30

This document details the progress to date on the OPTIMIZATION OF DEEP DRILLING PERFORMANCE--DEVELOPMENT AND BENCHMARK TESTING OF ADVANCED DIAMOND PRODUCT DRILL BITS AND HP/HT FLUIDS TO SIGNIFICANTLY IMPROVE RATES OF PENETRATION contract for the year starting October 2004 through September 2005. The industry cost shared program aims to benchmark drilling rates of penetration in selected simulated deep formations and to significantly improve ROP through a team development of aggressive diamond product drill bit--fluid system technologies. Overall the objectives are as follows: Phase 1--Benchmark ''best in class'' diamond and other product drilling bits and fluids and develop concepts for amore » next level of deep drilling performance; Phase 2--Develop advanced smart bit-fluid prototypes and test at large scale; and Phase 3--Field trial smart bit--fluid concepts, modify as necessary and commercialize products. As of report date, TerraTek has concluded all Phase 1 testing and is planning Phase 2 development.« less
Reliable B Cell Epitope Predictions: Impacts of Method Development and Improved Benchmarking

PubMed Central

Kringelum, Jens Vindahl; Lundegaard, Claus; Lund, Ole; Nielsen, Morten

2012-01-01

The interaction between antibodies and antigens is one of the most important immune system mechanisms for clearing infectious organisms from the host. Antibodies bind to antigens at sites referred to as B-cell epitopes. Identification of the exact location of B-cell epitopes is essential in several biomedical applications such as; rational vaccine design, development of disease diagnostics and immunotherapeutics. However, experimental mapping of epitopes is resource intensive making in silico methods an appealing complementary approach. To date, the reported performance of methods for in silico mapping of B-cell epitopes has been moderate. Several issues regarding the evaluation data sets may however have led to the performance values being underestimated: Rarely, all potential epitopes have been mapped on an antigen, and antibodies are generally raised against the antigen in a given biological context not against the antigen monomer. Improper dealing with these aspects leads to many artificial false positive predictions and hence to incorrect low performance values. To demonstrate the impact of proper benchmark definitions, we here present an updated version of the DiscoTope method incorporating a novel spatial neighborhood definition and half-sphere exposure as surface measure. Compared to other state-of-the-art prediction methods, Discotope-2.0 displayed improved performance both in cross-validation and in independent evaluations. Using DiscoTope-2.0, we assessed the impact on performance when using proper benchmark definitions. For 13 proteins in the training data set where sufficient biological information was available to make a proper benchmark redefinition, the average AUC performance was improved from 0.791 to 0.824. Similarly, the average AUC performance on an independent evaluation data set improved from 0.712 to 0.727. Our results thus demonstrate that given proper benchmark definitions, B-cell epitope prediction methods achieve highly significant predictive performances suggesting these tools to be a powerful asset in rational epitope discovery. The updated version of DiscoTope is available at www.cbs.dtu.dk/services/DiscoTope-2.0. PMID:23300419
A KPI framework for process-based benchmarking of hospital information systems.

PubMed

Jahn, Franziska; Winter, Alfred

2011-01-01

Benchmarking is a major topic for monitoring, directing and elucidating the performance of hospital information systems (HIS). Current approaches neglect the outcome of the processes that are supported by the HIS and their contribution to the hospital's strategic goals. We suggest to benchmark HIS based on clinical documentation processes and their outcome. A framework consisting of a general process model and outcome criteria for clinical documentation processes is introduced.
Data Don't Drive: Building a Practitioner-Driven Culture of Inquiry to Assess Community College Performance. Lumina Foundation for Education Research Report

ERIC Educational Resources Information Center

Dowd, Alicia C.

2005-01-01

This report reviews the benchmarking practices that are presently being used at community colleges. It introduces the concept of a "culture of inquiry" as a means for judging their potential value. It classifies benchmarking efforts among three types--performance, diagnostic, and process--and characterizes each by its typical use. The…
Restaurant Energy Use Benchmarking Guideline

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hedrick, R.; Smith, V.; Field, K.

2011-07-01

A significant operational challenge for food service operators is defining energy use benchmark metrics to compare against the performance of individual stores. Without metrics, multiunit operators and managers have difficulty identifying which stores in their portfolios require extra attention to bring their energy performance in line with expectations. This report presents a method whereby multiunit operators may use their own utility data to create suitable metrics for evaluating their operations.
Subsequent Breast Cancer Risk Following Diagnosis of Atypical Ductal Hyperplasia on Needle Biopsy

PubMed Central

Menes, Tehillah S; Kerlikowske, Karla; Lange, Jane; Jaffer, Shabnam; Rosenberg, Robert; Miglioretti, Diana L.

2017-01-01

Background Atypical ductal hyperplasia (ADH) is a known strong risk factor for breast cancer. Published risk estimates are based on cohorts that included women diagnosed prior to the widespread use of screening mammograms and do not differentiate between the methods used to diagnose ADH, which may be related to size of the ADH focus. These risks may overestimate the risk of women currently diagnosed with ADH. We sought to examine the risk of invasive cancer associated with ADH diagnosed on core needle biopsy versus excisional biopsy. Design Cohort study comparing ten-year cumulative risk of invasive breast cancer in women undergoing mammography with and without a diagnosis of ADH. Setting Five breast imaging registries that participate in the National Cancer Institute–funded Breast Cancer Surveillance Consortium (BCSC). Participants Women undergoing mammography in the BCSC. Exposure Diagnosis of ADH on core needle biopsy or excisional biopsy in women undergoing mammography. Main outcome Ten-year cumulative risk of invasive breast cancer risk. Results The sample included 955,331 women with 1,727 diagnoses of ADH. From 1996 to 2012, the proportion of ADH diagnosed by core needle biopsy increased from 21% to 77%. Ten years following a diagnosis of ADH, the cumulative risk of invasive breast cancer was 2.6 (95% CI 2.0, 3.4) times higher than risk in women with no ADH. ADH diagnosed via excisional biopsy was associated with an adjusted HR of 3.0 (95% CI 2.0, 4.5), and via core needle biopsy, with an HR of 2.2 (95% CI 1.5, 3.4). Ten years after an ADH diagnosis, an estimated 5.7% (95% CI 4.3, 10.1) of women were diagnosed with invasive cancer. Women with ADH diagnosed on excisional biopsy had a slightly higher risk (6.7 %; 95% CI 3.0, 12.8) compared to those with ADH diagnosed via core needle biopsy (5.0%; 95% CI 2.2, 8.9). Conclusions Current 10-year risks of invasive breast cancer after a diagnosis of ADH may be lower than previously reported. The risk associated with ADH is slightly lower for women diagnosed by needle core biopsy as compared to excisional biopsy. PMID:27607465
Benchmarking: applications to transfusion medicine.

PubMed

Apelseth, Torunn Oveland; Molnar, Laura; Arnold, Emmy; Heddle, Nancy M

2012-10-01

Benchmarking is as a structured continuous collaborative process in which comparisons for selected indicators are used to identify factors that, when implemented, will improve transfusion practices. This study aimed to identify transfusion medicine studies reporting on benchmarking, summarize the benchmarking approaches used, and identify important considerations to move the concept of benchmarking forward in the field of transfusion medicine. A systematic review of published literature was performed to identify transfusion medicine-related studies that compared at least 2 separate institutions or regions with the intention of benchmarking focusing on 4 areas: blood utilization, safety, operational aspects, and blood donation. Forty-five studies were included: blood utilization (n = 35), safety (n = 5), operational aspects of transfusion medicine (n = 5), and blood donation (n = 0). Based on predefined criteria, 7 publications were classified as benchmarking, 2 as trending, and 36 as single-event studies. Three models of benchmarking are described: (1) a regional benchmarking program that collects and links relevant data from existing electronic sources, (2) a sentinel site model where data from a limited number of sites are collected, and (3) an institutional-initiated model where a site identifies indicators of interest and approaches other institutions. Benchmarking approaches are needed in the field of transfusion medicine. Major challenges include defining best practices and developing cost-effective methods of data collection. For those interested in initiating a benchmarking program, the sentinel site model may be most effective and sustainable as a starting point, although the regional model would be the ideal goal. Copyright © 2012 Elsevier Inc. All rights reserved.
Benchmarking, benchmarks, or best practices? Applying quality improvement principles to decrease surgical turnaround time.

PubMed

Mitchell, L

1996-01-01

The processes of benchmarking, benchmark data comparative analysis, and study of best practices are distinctly different. The study of best practices is explained with an example based on the Arthur Andersen & Co. 1992 "Study of Best Practices in Ambulatory Surgery". The results of a national best practices study in ambulatory surgery were used to provide our quality improvement team with the goal of improving the turnaround time between surgical cases. The team used a seven-step quality improvement problem-solving process to improve the surgical turnaround time. The national benchmark for turnaround times between surgical cases in 1992 was 13.5 minutes. The initial turnaround time at St. Joseph's Medical Center was 19.9 minutes. After the team implemented solutions, the time was reduced to an average of 16.3 minutes, an 18% improvement. Cost-benefit analysis showed a potential enhanced revenue of approximately $300,000, or a potential savings of $10,119. Applying quality improvement principles to benchmarking, benchmarks, or best practices can improve process performance. Understanding which form of benchmarking the institution wishes to embark on will help focus a team and use appropriate resources. Communicating with professional organizations that have experience in benchmarking will save time and money and help achieve the desired results.
Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry

1998-01-01

This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement based performance of these parallelized benchmarks from four perspectives: efficacy of parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized version of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives but realizing performance gains as predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods.

PubMed

Schaffter, Thomas; Marbach, Daniel; Floreano, Dario

2011-08-15

Over the last decade, numerous methods have been developed for inference of regulatory networks from gene expression data. However, accurate and systematic evaluation of these methods is hampered by the difficulty of constructing adequate benchmarks and the lack of tools for a differentiated analysis of network predictions on such benchmarks. Here, we describe a novel and comprehensive method for in silico benchmark generation and performance profiling of network inference methods available to the community as an open-source software called GeneNetWeaver (GNW). In addition to the generation of detailed dynamical models of gene regulatory networks to be used as benchmarks, GNW provides a network motif analysis that reveals systematic prediction errors, thereby indicating potential ways of improving inference methods. The accuracy of network inference methods is evaluated using standard metrics such as precision-recall and receiver operating characteristic curves. We show how GNW can be used to assess the performance and identify the strengths and weaknesses of six inference methods. Furthermore, we used GNW to provide the international Dialogue for Reverse Engineering Assessments and Methods (DREAM) competition with three network inference challenges (DREAM3, DREAM4 and DREAM5). GNW is available at http://gnw.sourceforge.net along with its Java source code, user manual and supporting data. Supplementary data are available at Bioinformatics online. dario.floreano@epfl.ch.
Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mishra, Alok; Li, Lingda; Kong, Martin

Here, the latest OpenMP standard offers automatic device offloading capabilities which facilitate GPU programming. Despite this, there remain many challenges. One of these is the unified memory feature introduced in recent GPUs. GPUs in current and future HPC systems have enhanced support for unified memory space. In such systems, CPU and GPU can access each other's memory transparently, that is, the data movement is managed automatically by the underlying system software and hardware. Memory over subscription is also possible in these systems. However, there is a significant lack of knowledge about how this mechanism will perform, and how programmers shouldmore » use it. We have modified several benchmarks codes, in the Rodinia benchmark suite, to study the behavior of OpenMP accelerator extensions and have used them to explore the impact of unified memory in an OpenMP context. We moreover modified the open source LLVM compiler to allow OpenMP programs to exploit unified memory. The results of our evaluation reveal that, while the performance of unified memory is comparable with that of normal GPU offloading for benchmarks with little data reuse, it suffers from significant overhead when GPU memory is over subcribed for benchmarks with large amount of data reuse. Based on these results, we provide several guidelines for programmers to achieve better performance with unified memory.« less
Benchmark calculation for radioactivity inventory using MAXS library based on JENDL-4.0 and JEFF-3.0/A for decommissioning BWR plants

NASA Astrophysics Data System (ADS)

Tanaka, Ken-ichi

2016-06-01

We performed benchmark calculation for radioactivity activated in a Primary Containment Vessel (PCV) of a Boiling Water Reactor (BWR) by using MAXS library, which was developed by collapsing with neutron energy spectra in the PCV of the BWR. Radioactivities due to neutron irradiation were measured by using activation foil detector of Gold (Au) and Nickel (Ni) at thirty locations in the PCV. We performed activation calculations of the foils with SCALE5.1/ORIGEN-S code with irradiation conditions of each foil location as the benchmark calculation. We compared calculations and measurements to estimate an effectiveness of MAXS library.
GEN-IV Benchmarking of Triso Fuel Performance Models under accident conditions modeling input data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collin, Blaise Paul

This document presents the benchmark plan for the calculation of particle fuel performance on safety testing experiments that are representative of operational accidental transients. The benchmark is dedicated to the modeling of fission product release under accident conditions by fuel performance codes from around the world, and the subsequent comparison to post-irradiation experiment (PIE) data from the modeled heating tests. The accident condition benchmark is divided into three parts: • The modeling of a simplified benchmark problem to assess potential numerical calculation issues at low fission product release. • The modeling of the AGR-1 and HFR-EU1bis safety testing experiments. •more » The comparison of the AGR-1 and HFR-EU1bis modeling results with PIE data. The simplified benchmark case, thereafter named NCC (Numerical Calculation Case), is derived from “Case 5” of the International Atomic Energy Agency (IAEA) Coordinated Research Program (CRP) on coated particle fuel technology [IAEA 2012]. It is included so participants can evaluate their codes at low fission product release. “Case 5” of the IAEA CRP-6 showed large code-to-code discrepancies in the release of fission products, which were attributed to “effects of the numerical calculation method rather than the physical model” [IAEA 2012]. The NCC is therefore intended to check if these numerical effects subsist. The first two steps imply the involvement of the benchmark participants with a modeling effort following the guidelines and recommendations provided by this document. The third step involves the collection of the modeling results by Idaho National Laboratory (INL) and the comparison of these results with the available PIE data. The objective of this document is to provide all necessary input data to model the benchmark cases, and to give some methodology guidelines and recommendations in order to make all results suitable for comparison with each other. The participants should read this document thoroughly to make sure all the data needed for their calculations is provided in the document. Missing data will be added to a revision of the document if necessary. 09/2016: Tables 6 and 8 updated. AGR-2 input data added« less
Generation IV benchmarking of TRISO fuel performance models under accident conditions: Modeling input data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collin, Blaise P.

2014-09-01

This document presents the benchmark plan for the calculation of particle fuel performance on safety testing experiments that are representative of operational accidental transients. The benchmark is dedicated to the modeling of fission product release under accident conditions by fuel performance codes from around the world, and the subsequent comparison to post-irradiation experiment (PIE) data from the modeled heating tests. The accident condition benchmark is divided into three parts: the modeling of a simplified benchmark problem to assess potential numerical calculation issues at low fission product release; the modeling of the AGR-1 and HFR-EU1bis safety testing experiments; and, the comparisonmore » of the AGR-1 and HFR-EU1bis modeling results with PIE data. The simplified benchmark case, thereafter named NCC (Numerical Calculation Case), is derived from ''Case 5'' of the International Atomic Energy Agency (IAEA) Coordinated Research Program (CRP) on coated particle fuel technology [IAEA 2012]. It is included so participants can evaluate their codes at low fission product release. ''Case 5'' of the IAEA CRP-6 showed large code-to-code discrepancies in the release of fission products, which were attributed to ''effects of the numerical calculation method rather than the physical model''[IAEA 2012]. The NCC is therefore intended to check if these numerical effects subsist. The first two steps imply the involvement of the benchmark participants with a modeling effort following the guidelines and recommendations provided by this document. The third step involves the collection of the modeling results by Idaho National Laboratory (INL) and the comparison of these results with the available PIE data. The objective of this document is to provide all necessary input data to model the benchmark cases, and to give some methodology guidelines and recommendations in order to make all results suitable for comparison with each other. The participants should read this document thoroughly to make sure all the data needed for their calculations is provided in the document. Missing data will be added to a revision of the document if necessary.« less
HPGMG 1.0: A Benchmark for Ranking High Performance Computing Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Adams, Mark; Brown, Jed; Shalf, John

2014-05-05

This document provides an overview of the benchmark ? HPGMG ? for ranking large scale general purpose computers for use on the Top500 list [8]. We provide a rationale for the need for a replacement for the current metric HPL, some background of the Top500 list and the challenges of developing such a metric; we discuss our design philosophy and methodology, and an overview of the specification of the benchmark. The primary documentation with maintained details on the specification can be found at hpgmg.org and the Wiki and benchmark code itself can be found in the repository https://bitbucket.org/hpgmg/hpgmg.
Ab initio calculations, structure, NBO and NCI analyses of Xsbnd H⋯π interactions

NASA Astrophysics Data System (ADS)

Wu, Qiyang; Su, He; Wang, Hongyan; Wang, Hui

2018-02-01

The performance of ab initio methods (MP2, DFT/B3LYP, random-phase approximation (RPA), CCSD(T) and QCISD(T)) in predicting interaction energy of Xsbnd H⋯π (Xsbnd H = HCCH, HCl, HF; π = C2H2, C2H4, C6H6) hydrogen complexes are assessed systematically. The CCSD(T)/CBS benchmarks of interaction energy are reported. It is found that RPA agrees well with CCSD(T)/CBS benchmarks and experimental results. CCSD(T) and QCISD(T) perform the best only when compared with CCSD(T)/CBS benchmarks, MP2 performs well only for experimental data. B3LYP provides the worst accuracy. Additionally, the equilibrium structure, interaction type of Xsbnd H⋯π hydrogen complexes are investigated by the natural bond orbital (NBO) and the non-covalent interaction index (NCI).
The Alpha consensus meeting on cryopreservation key performance indicators and benchmarks: proceedings of an expert meeting.

PubMed

2012-08-01

This proceedings report presents the outcomes from an international workshop designed to establish consensus on: definitions for key performance indicators (KPIs) for oocyte and embryo cryopreservation, using either slow freezing or vitrification; minimum performance level values for each KPI, representing basic competency; and aspirational benchmark values for each KPI, representing best practice goals. This report includes general presentations about current practice and factors for consideration in the development of KPIs. A total of 14 KPIs were recommended and benchmarks for each are presented. No recommendations were made regarding specific cryopreservation techniques or devices, or whether vitrification is 'better' than slow freezing, or vice versa, for any particular stage or application, as this was considered to be outside the scope of this workshop. Copyright © 2012 Reproductive Healthcare Ltd. Published by Elsevier Ltd. All rights reserved.

Interlaboratory Study Characterizing a Yeast Performance Standard for Benchmarking LC-MS Platform Performance*

PubMed Central

Paulovich, Amanda G.; Billheimer, Dean; Ham, Amy-Joan L.; Vega-Montoto, Lorenzo; Rudnick, Paul A.; Tabb, David L.; Wang, Pei; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Clauser, Karl R.; Kinsinger, Christopher R.; Schilling, Birgit; Tegeler, Tony J.; Variyath, Asokan Mulayath; Wang, Mu; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Fenyo, David; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Mesri, Mehdi; Neubert, Thomas A.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Stein, Stephen E.; Tempst, Paul; Liebler, Daniel C.

2010-01-01

Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize preanalytical and analytical variation in comparative proteomics experiments. PMID:19858499
Benchmarking Ada tasking on tightly coupled multiprocessor architectures

NASA Technical Reports Server (NTRS)

Collard, Philippe; Goforth, Andre; Marquardt, Matthew

1989-01-01

The development of benchmarks and performance measures for parallel Ada tasking is reported with emphasis on the macroscopic behavior of the benchmark across a set of load parameters. The application chosen for the study was the NASREM model for telerobot control, relevant to many NASA missions. The results of the study demonstrate the potential of parallel Ada in accomplishing the task of developing a control system for a system such as the Flight Telerobotic Servicer using the NASREM framework.
Toxicological benchmarks for screening potential contaminants of concern for effects on soil and litter invertebrates and heterotrophic process

DOE Office of Scientific and Technical Information (OSTI.GOV)

Will, M.E.; Suter, G.W. II

1994-09-01

One of the initial stages in ecological risk assessments for hazardous waste sites is the screening of contaminants to determine which of them are worthy of further consideration as {open_quotes}contaminants of potential concern.{close_quotes} This process is termed {open_quotes}contaminant screening.{close_quotes} It is performed by comparing measured ambient concentrations of chemicals to benchmark concentrations. Currently, no standard benchmark concentrations exist for assessing contaminants in soil with respect to their toxicity to soil- and litter-dwelling invertebrates, including earthworms, other micro- and macroinvertebrates, or heterotrophic bacteria and fungi. This report presents a standard method for deriving benchmarks for this purpose, sets of data concerningmore » effects of chemicals in soil on invertebrates and soil microbial processes, and benchmarks for chemicals potentially associated with United States Department of Energy sites. In addition, literature describing the experiments from which data were drawn for benchmark derivation. Chemicals that are found in soil at concentrations exceeding both the benchmarks and the background concentration for the soil type should be considered contaminants of potential concern.« less
Benchmarking Brain-Computer Interfaces Outside the Laboratory: The Cybathlon 2016

PubMed Central

Novak, Domen; Sigrist, Roland; Gerig, Nicolas J.; Wyss, Dario; Bauer, René; Götz, Ulrich; Riener, Robert

2018-01-01

This paper presents a new approach to benchmarking brain-computer interfaces (BCIs) outside the lab. A computer game was created that mimics a real-world application of assistive BCIs, with the main outcome metric being the time needed to complete the game. This approach was used at the Cybathlon 2016, a competition for people with disabilities who use assistive technology to achieve tasks. The paper summarizes the technical challenges of BCIs, describes the design of the benchmarking game, then describes the rules for acceptable hardware, software and inclusion of human pilots in the BCI competition at the Cybathlon. The 11 participating teams, their approaches, and their results at the Cybathlon are presented. Though the benchmarking procedure has some limitations (for instance, we were unable to identify any factors that clearly contribute to BCI performance), it can be successfully used to analyze BCI performance in realistic, less structured conditions. In the future, the parameters of the benchmarking game could be modified to better mimic different applications (e.g., the need to use some commands more frequently than others). Furthermore, the Cybathlon has the potential to showcase such devices to the general public. PMID:29375294
Encoding color information for visual tracking: Algorithms and benchmark.

PubMed

Liang, Pengpeng; Blasch, Erik; Ling, Haibin

2015-12-01

While color information is known to provide rich discriminative clues for visual inference, most modern visual trackers limit themselves to the grayscale realm. Despite recent efforts to integrate color in tracking, there is a lack of comprehensive understanding of the role color information can play. In this paper, we attack this problem by conducting a systematic study from both the algorithm and benchmark perspectives. On the algorithm side, we comprehensively encode 10 chromatic models into 16 carefully selected state-of-the-art visual trackers. On the benchmark side, we compile a large set of 128 color sequences with ground truth and challenge factor annotations (e.g., occlusion). A thorough evaluation is conducted by running all the color-encoded trackers, together with two recently proposed color trackers. A further validation is conducted on an RGBD tracking benchmark. The results clearly show the benefit of encoding color information for tracking. We also perform detailed analysis on several issues, including the behavior of various combinations between color model and visual tracker, the degree of difficulty of each sequence for tracking, and how different challenge factors affect the tracking performance. We expect the study to provide the guidance, motivation, and benchmark for future work on encoding color in visual tracking.
Performance Monitoring of Distributed Data Processing Systems

NASA Technical Reports Server (NTRS)

Ojha, Anand K.

2000-01-01

Test and checkout systems are essential components in ensuring safety and reliability of aircraft and related systems for space missions. A variety of systems, developed over several years, are in use at the NASA/KSC. Many of these systems are configured as distributed data processing systems with the functionality spread over several multiprocessor nodes interconnected through networks. To be cost-effective, a system should take the least amount of resource and perform a given testing task in the least amount of time. There are two aspects of performance evaluation: monitoring and benchmarking. While monitoring is valuable to system administrators in operating and maintaining, benchmarking is important in designing and upgrading computer-based systems. These two aspects of performance evaluation are the foci of this project. This paper first discusses various issues related to software, hardware, and hybrid performance monitoring as applicable to distributed systems, and specifically to the TCMS (Test Control and Monitoring System). Next, a comparison of several probing instructions are made to show that the hybrid monitoring technique developed by the NIST (National Institutes for Standards and Technology) is the least intrusive and takes only one-fourth of the time taken by software monitoring probes. In the rest of the paper, issues related to benchmarking a distributed system have been discussed and finally a prescription for developing a micro-benchmark for the TCMS has been provided.
User and Performance Impacts from Franklin Upgrades

DOE Office of Scientific and Technical Information (OSTI.GOV)

He, Yun

2009-05-10

The NERSC flagship computer Cray XT4 system"Franklin" has gone through three major upgrades: quad core upgrade, CLE 2.1 upgrade, and IO upgrade, during the past year. In this paper, we will discuss the various aspects of the user impacts such as user access, user environment, and user issues etc from these upgrades. The performance impacts on the kernel benchmarks and selected application benchmarks will also be presented.
Implementation of BT, SP, LU, and FT of NAS Parallel Benchmarks in Java

NASA Technical Reports Server (NTRS)

Schultz, Matthew; Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry

2000-01-01

A number of Java features make it an attractive but a debatable choice for High Performance Computing. We have implemented benchmarks working on single structured grid BT,SP,LU and FT in Java. The performance and scalability of the Java code shows that a significant improvement in Java compiler technology and in Java thread implementation are necessary for Java to compete with Fortran in HPC applications.
Automated benchmarking of peptide-MHC class I binding predictions.

PubMed

Trolle, Thomas; Metushi, Imir G; Greenbaum, Jason A; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

2015-07-01

Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. mniel@cbs.dtu.dk or bpeters@liai.org Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Validating Cellular Automata Lava Flow Emplacement Algorithms with Standard Benchmarks

NASA Astrophysics Data System (ADS)

Richardson, J. A.; Connor, L.; Charbonnier, S. J.; Connor, C.; Gallant, E.

2015-12-01

A major existing need in assessing lava flow simulators is a common set of validation benchmark tests. We propose three levels of benchmarks which test model output against increasingly complex standards. First, imulated lava flows should be morphologically identical, given changes in parameter space that should be inconsequential, such as slope direction. Second, lava flows simulated in simple parameter spaces can be tested against analytical solutions or empirical relationships seen in Bingham fluids. For instance, a lava flow simulated on a flat surface should produce a circular outline. Third, lava flows simulated over real world topography can be compared to recent real world lava flows, such as those at Tolbachik, Russia, and Fogo, Cape Verde. Success or failure of emplacement algorithms in these validation benchmarks can be determined using a Bayesian approach, which directly tests the ability of an emplacement algorithm to correctly forecast lava inundation. Here we focus on two posterior metrics, P(A|B) and P(¬A|¬B), which describe the positive and negative predictive value of flow algorithms. This is an improvement on less direct statistics such as model sensitivity and the Jaccard fitness coefficient. We have performed these validation benchmarks on a new, modular lava flow emplacement simulator that we have developed. This simulator, which we call MOLASSES, follows a Cellular Automata (CA) method. The code is developed in several interchangeable modules, which enables quick modification of the distribution algorithm from cell locations to their neighbors. By assessing several different distribution schemes with the benchmark tests, we have improved the performance of MOLASSES to correctly match early stages of the 2012-3 Tolbachik Flow, Kamchakta Russia, to 80%. We also can evaluate model performance given uncertain input parameters using a Monte Carlo setup. This illuminates sensitivity to model uncertainty.
Automated benchmarking of peptide-MHC class I binding predictions

PubMed Central

Trolle, Thomas; Metushi, Imir G.; Greenbaum, Jason A.; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

2015-01-01

Motivation: Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. Results: The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Availability and implementation: Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. Contact: mniel@cbs.dtu.dk or bpeters@liai.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25717196
A frontier analysis approach for benchmarking hospital performance in the treatment of acute myocardial infarction.

PubMed

Stanford, Robert E

2004-05-01

This paper uses a non-parametric frontier model and adaptations of the concepts of cross-efficiency and peer-appraisal to develop a formal methodology for benchmarking provider performance in the treatment of Acute Myocardial Infarction (AMI). Parameters used in the benchmarking process are the rates of proper recognition of indications of six standard treatment processes for AMI; the decision making units (DMUs) to be compared are the Medicare eligible hospitals of a particular state; the analysis produces an ordinal ranking of individual hospital performance scores. The cross-efficiency/peer-appraisal calculation process is constructed to accommodate DMUs that experience no patients in some of the treatment categories. While continuing to rate highly the performances of DMUs which are efficient in the Pareto-optimal sense, our model produces individual DMU performance scores that correlate significantly with good overall performance, as determined by a comparison of the sums of the individual DMU recognition rates for the six standard treatment processes. The methodology is applied to data collected from 107 state Medicare hospitals.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Marck, Steven C. van der, E-mail: vandermarck@nrg.eu

Recent releases of three major world nuclear reaction data libraries, ENDF/B-VII.1, JENDL-4.0, and JEFF-3.1.1, have been tested extensively using benchmark calculations. The calculations were performed with the latest release of the continuous energy Monte Carlo neutronics code MCNP, i.e. MCNP6. Three types of benchmarks were used, viz. criticality safety benchmarks, (fusion) shielding benchmarks, and reference systems for which the effective delayed neutron fraction is reported. For criticality safety, more than 2000 benchmarks from the International Handbook of Criticality Safety Benchmark Experiments were used. Benchmarks from all categories were used, ranging from low-enriched uranium, compound fuel, thermal spectrum ones (LEU-COMP-THERM), tomore » mixed uranium-plutonium, metallic fuel, fast spectrum ones (MIX-MET-FAST). For fusion shielding many benchmarks were based on IAEA specifications for the Oktavian experiments (for Al, Co, Cr, Cu, LiF, Mn, Mo, Si, Ti, W, Zr), Fusion Neutronics Source in Japan (for Be, C, N, O, Fe, Pb), and Pulsed Sphere experiments at Lawrence Livermore National Laboratory (for {sup 6}Li, {sup 7}Li, Be, C, N, O, Mg, Al, Ti, Fe, Pb, D2O, H2O, concrete, polyethylene and teflon). The new functionality in MCNP6 to calculate the effective delayed neutron fraction was tested by comparison with more than thirty measurements in widely varying systems. Among these were measurements in the Tank Critical Assembly (TCA in Japan) and IPEN/MB-01 (Brazil), both with a thermal spectrum, two cores in Masurca (France) and three cores in the Fast Critical Assembly (FCA, Japan), all with fast spectra. The performance of the three libraries, in combination with MCNP6, is shown to be good. The results for the LEU-COMP-THERM category are on average very close to the benchmark value. Also for most other categories the results are satisfactory. Deviations from the benchmark values do occur in certain benchmark series, or in isolated cases within benchmark series. Such instances can often be related to nuclear data for specific non-fissile elements, such as C, Fe, or Gd. Indications are that the intermediate and mixed spectrum cases are less well described. The results for the shielding benchmarks are generally good, with very similar results for the three libraries in the majority of cases. Nevertheless there are, in certain cases, strong deviations between calculated and benchmark values, such as for Co and Mg. Also, the results show discrepancies at certain energies or angles for e.g. C, N, O, Mo, and W. The functionality of MCNP6 to calculate the effective delayed neutron fraction yields very good results for all three libraries.« less
A Web Resource for Standardized Benchmark Datasets, Metrics, and Rosetta Protocols for Macromolecular Modeling and Design.

PubMed

Ó Conchúir, Shane; Barlow, Kyle A; Pache, Roland A; Ollikainen, Noah; Kundert, Kale; O'Meara, Matthew J; Smith, Colin A; Kortemme, Tanja

2015-01-01

The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a "best practice" set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.
Spherical harmonic results for the 3D Kobayashi Benchmark suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, P N; Chang, B; Hanebutte, U R

1999-03-02

Spherical harmonic solutions are presented for the Kobayashi benchmark suite. The results were obtained with Ardra, a scalable, parallel neutron transport code developed at Lawrence Livermore National Laboratory (LLNL). The calculations were performed on the IBM ASCI Blue-Pacific computer at LLNL.
The philosophy of benchmark testing a standards-based picture archiving and communications system.

PubMed

Richardson, N E; Thomas, J A; Lyche, D K; Romlein, J; Norton, G S; Dolecek, Q E

1999-05-01

The Department of Defense issued its requirements for a Digital Imaging Network-Picture Archiving and Communications System (DIN-PACS) in a Request for Proposals (RFP) to industry in January 1997, with subsequent contracts being awarded in November 1997 to the Agfa Division of Bayer and IBM Global Government Industry. The Government's technical evaluation process consisted of evaluating a written technical proposal as well as conducting a benchmark test of each proposed system at the vendor's test facility. The purpose of benchmark testing was to evaluate the performance of the fully integrated system in a simulated operational environment. The benchmark test procedures and test equipment were developed through a joint effort between the Government, academic institutions, and private consultants. Herein the authors discuss the resources required and the methods used to benchmark test a standards-based PACS.
An integrated data envelopment analysis-artificial neural network approach for benchmarking of bank branches

NASA Astrophysics Data System (ADS)

Shokrollahpour, Elsa; Hosseinzadeh Lotfi, Farhad; Zandieh, Mostafa

2016-06-01

Efficiency and quality of services are crucial to today's banking industries. The competition in this section has become increasingly intense, as a result of fast improvements in Technology. Therefore, performance analysis of the banking sectors attracts more attention these days. Even though data envelopment analysis (DEA) is a pioneer approach in the literature as of an efficiency measurement tool and finding benchmarks, it is on the other hand unable to demonstrate the possible future benchmarks. The drawback to it could be that the benchmarks it provides us with, may still be less efficient compared to the more advanced future benchmarks. To cover for this weakness, artificial neural network is integrated with DEA in this paper to calculate the relative efficiency and more reliable benchmarks of one of the Iranian commercial bank branches. Therefore, each branch could have a strategy to improve the efficiency and eliminate the cause of inefficiencies based on a 5-year time forecast.
Interactive visual optimization and analysis for RFID benchmarking.

PubMed

Wu, Yingcai; Chung, Ka-Kei; Qu, Huamin; Yuan, Xiaoru; Cheung, S C

2009-01-01

Radio frequency identification (RFID) is a powerful automatic remote identification technique that has wide applications. To facilitate RFID deployment, an RFID benchmarking instrument called aGate has been invented to identify the strengths and weaknesses of different RFID technologies in various environments. However, the data acquired by aGate are usually complex time varying multidimensional 3D volumetric data, which are extremely challenging for engineers to analyze. In this paper, we introduce a set of visualization techniques, namely, parallel coordinate plots, orientation plots, a visual history mechanism, and a 3D spatial viewer, to help RFID engineers analyze benchmark data visually and intuitively. With the techniques, we further introduce two workflow procedures (a visual optimization procedure for finding the optimum reader antenna configuration and a visual analysis procedure for comparing the performance and identifying the flaws of RFID devices) for the RFID benchmarking, with focus on the performance analysis of the aGate system. The usefulness and usability of the system are demonstrated in the user evaluation.
Measurement, Standards, and Peer Benchmarking: One Hospital's Journey.

PubMed

Martin, Brian S; Arbore, Mark

2016-04-01

Peer-to-peer benchmarking is an important component of rapid-cycle performance improvement in patient safety and quality-improvement efforts. Institutions should carefully examine critical success factors before engagement in peer-to-peer benchmarking in order to maximize growth and change opportunities. Solutions for Patient Safety has proven to be a high-yield engagement for Children's Hospital of Pittsburgh of University of Pittsburgh Medical Center, with measureable improvement in both organizational process and culture. Copyright © 2016 Elsevier Inc. All rights reserved.
Benchmark Evaluation of HTR-PROTEUS Pebble Bed Experimental Program

DOE PAGES

Bess, John D.; Montierth, Leland; Köberl, Oliver; ...

2014-10-09

Benchmark models were developed to evaluate 11 critical core configurations of the HTR-PROTEUS pebble bed experimental program. Various additional reactor physics measurements were performed as part of this program; currently only a total of 37 absorber rod worth measurements have been evaluated as acceptable benchmark experiments for Cores 4, 9, and 10. Dominant uncertainties in the experimental keff for all core configurations come from uncertainties in the ²³⁵U enrichment of the fuel, impurities in the moderator pebbles, and the density and impurity content of the radial reflector. Calculations of k eff with MCNP5 and ENDF/B-VII.0 neutron nuclear data aremore » greater than the benchmark values but within 1% and also within the 3σ uncertainty, except for Core 4, which is the only randomly packed pebble configuration. Repeated calculations of k eff with MCNP6.1 and ENDF/B-VII.1 are lower than the benchmark values and within 1% (~3σ) except for Cores 5 and 9, which calculate lower than the benchmark eigenvalues within 4σ. The primary difference between the two nuclear data libraries is the adjustment of the absorption cross section of graphite. Simulations of the absorber rod worth measurements are within 3σ of the benchmark experiment values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.« less

Systematic Benchmarking of Diagnostic Technologies for an Electrical Power System

NASA Technical Reports Server (NTRS)

Kurtoglu, Tolga; Jensen, David; Poll, Scott

2009-01-01

Automated health management is a critical functionality for complex aerospace systems. A wide variety of diagnostic algorithms have been developed to address this technical challenge. Unfortunately, the lack of support to perform large-scale V&V (verification and validation) of diagnostic technologies continues to create barriers to effective development and deployment of such algorithms for aerospace vehicles. In this paper, we describe a formal framework developed for benchmarking of diagnostic technologies. The diagnosed system is the Advanced Diagnostics and Prognostics Testbed (ADAPT), a real-world electrical power system (EPS), developed and maintained at the NASA Ames Research Center. The benchmarking approach provides a systematic, empirical basis to the testing of diagnostic software and is used to provide performance assessment for different diagnostic algorithms.
NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

PubMed

Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

2015-09-29

In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow to evaluate accurately and reproducibly those methods. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework that uses various datasets highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performances.
Open Rotor - Analysis of Diagnostic Data

NASA Technical Reports Server (NTRS)

Envia, Edmane

2011-01-01

NASA is researching open rotor propulsion as part of its technology research and development plan for addressing the subsonic transport aircraft noise, emission and fuel burn goals. The low-speed wind tunnel test for investigating the aerodynamic and acoustic performance of a benchmark blade set at the approach and takeoff conditions has recently concluded. A high-speed wind tunnel diagnostic test campaign has begun to investigate the performance of this benchmark open rotor blade set at the cruise condition. Databases from both speed regimes will comprise a comprehensive collection of benchmark open rotor data for use in assessing/validating aerodynamic and noise prediction tools (component & system level) as well as providing insights into the physics of open rotors to help guide the development of quieter open rotors.
Heterogeneous Distributed Computing for Computational Aerosciences

NASA Technical Reports Server (NTRS)

Sunderam, Vaidy S.

1998-01-01

The research supported under this award focuses on heterogeneous distributed computing for high-performance applications, with particular emphasis on computational aerosciences. The overall goal of this project was to and investigate issues in, and develop solutions to, efficient execution of computational aeroscience codes in heterogeneous concurrent computing environments. In particular, we worked in the context of the PVM[1] system and, subsequent to detailed conversion efforts and performance benchmarking, devising novel techniques to increase the efficacy of heterogeneous networked environments for computational aerosciences. Our work has been based upon the NAS Parallel Benchmark suite, but has also recently expanded in scope to include the NAS I/O benchmarks as specified in the NHT-1 document. In this report we summarize our research accomplishments under the auspices of the grant.
Relationship between the TCAP and the Pearson Benchmark Assessment in Elementary Students' Reading and Math Performance in a Northeastern Tennessee School District

ERIC Educational Resources Information Center

Dugger-Roberts, Cherith A.

2014-01-01

The purpose of this quantitative study was to determine if there was a relationship between the TCAP test and Pearson Benchmark assessment in elementary students' reading and language arts and math performance in a northeastern Tennessee school district. This study involved 3rd, 4th, 5th, and 6th grade students. The study focused on the following…
Report from the First CERT-RMM Users Group Workshop Series

DTIC Science & Technology

2012-04-01

deploy processes to support our programs – Benchmark our programs to determine current gaps – Complements current work in CMMI® and ISO 27001 19...benchmarking program performance through process analytics and Lean/Six Sigma activities to ensure Performance Excellence. • Provides ISO Standards...Office www.cmu.edu/ iso 29 Carnegie Mellon University • Est 1967 in Pittsburgh, PA • Global, private research university • Ranked 22nd • 15,000
Benchmarking database performance for genomic data.

PubMed

Khushi, Matloob

2015-06-01

Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing various genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations are common research needs. The data can be saved in a database system for easy management, however, there is no comprehensive database built-in algorithm at present to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although general searching capability of both databases was almost equivalent. In addition, using the algorithm pair-wise, overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1).Inc. © 2015 Wiley Periodicals, Inc.
Low Molecular Weight Polymethacrylates as Multi-Functional Lubricant Additives

DOE PAGES

Cosimbescu, Lelia; Vellore, Azhar; Shantini Ramasamy, Uma; ...

2018-04-24

In this study, low molecular weight, moderately polar polymethacrylate polymers are explored as potential multi-functional lubricant additives. The performance of these novel additives in base oil is evaluated in terms of their viscosity index, shear stability, and friction-and-wear. The new compounds are compared to two benchmarks, a typical polymeric viscosity modifier and a fully-formulated oil. Results show that the best performing of the new polymers exhibit viscosity index and friction comparable to that of both benchmarks, far superior shear stability to either benchmark (as much as 15x lower shear loss), and wear reduction significantly better than a typical viscosity modifiermore » (lower wear volume by a factor of 2-3). The findings also suggest that the polarity and molecular weight of the polymers affect their performance which suggests future synthetic strategies may enable this new class of additives to replace multiple additives in typical lubricant formulations.« less
Low Molecular Weight Polymethacrylates as Multi-Functional Lubricant Additives

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cosimbescu, Lelia; Vellore, Azhar; Shantini Ramasamy, Uma

In this study, low molecular weight, moderately polar polymethacrylate polymers are explored as potential multi-functional lubricant additives. The performance of these novel additives in base oil is evaluated in terms of their viscosity index, shear stability, and friction-and-wear. The new compounds are compared to two benchmarks, a typical polymeric viscosity modifier and a fully-formulated oil. Results show that the best performing of the new polymers exhibit viscosity index and friction comparable to that of both benchmarks, far superior shear stability to either benchmark (as much as 15x lower shear loss), and wear reduction significantly better than a typical viscosity modifiermore » (lower wear volume by a factor of 2-3). The findings also suggest that the polarity and molecular weight of the polymers affect their performance which suggests future synthetic strategies may enable this new class of additives to replace multiple additives in typical lubricant formulations.« less
Final Report of the NASA Office of Safety and Mission Assurance Agile Benchmarking Team

NASA Technical Reports Server (NTRS)

Wetherholt, Martha

2016-01-01

To ensure that the NASA Safety and Mission Assurance (SMA) community remains in a position to perform reliable Software Assurance (SA) on NASAs critical software (SW) systems with the software industry rapidly transitioning from waterfall to Agile processes, Terry Wilcutt, Chief, Safety and Mission Assurance, Office of Safety and Mission Assurance (OSMA) established the Agile Benchmarking Team (ABT). The Team's tasks were: 1. Research background literature on current Agile processes, 2. Perform benchmark activities with other organizations that are involved in software Agile processes to determine best practices, 3. Collect information on Agile-developed systems to enable improvements to the current NASA standards and processes to enhance their ability to perform reliable software assurance on NASA Agile-developed systems, 4. Suggest additional guidance and recommendations for updates to those standards and processes, as needed. The ABT's findings and recommendations for software management, engineering and software assurance are addressed herein.
Analysing the performance of personal computers based on Intel microprocessors for sequence aligning bioinformatics applications.

PubMed

Nair, Pradeep S; John, Eugene B

2007-01-01

Aligning specific sequences against a very large number of other sequences is a central aspect of bioinformatics. With the widespread availability of personal computers in biology laboratories, sequence alignment is now often performed locally. This makes it necessary to analyse the performance of personal computers for sequence aligning bioinformatics benchmarks. In this paper, we analyse the performance of a personal computer for the popular BLAST and FASTA sequence alignment suites. Results indicate that these benchmarks have a large number of recurring operations and use memory operations extensively. It seems that the performance can be improved with a bigger L1-cache.
Performance Comparison of HPF and MPI Based NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash

1997-01-01

Compilers supporting High Performance Form (HPF) features first appeared in late 1994 and early 1995 from Applied Parallel Research (APR), Digital Equipment Corporation, and The Portland Group (PGI). IBM introduced an HPF compiler for the IBM RS/6000 SP2 in April of 1996. Over the past two years, these implementations have shown steady improvement in terms of both features and performance. The performance of various hardware/ programming model (HPF and MPI) combinations will be compared, based on latest NAS Parallel Benchmark results, thus providing a cross-machine and cross-model comparison. Specifically, HPF based NPB results will be compared with MPI based NPB results to provide perspective on performance currently obtainable using HPF versus MPI or versus hand-tuned implementations such as those supplied by the hardware vendors. In addition, we would also present NPB, (Version 1.0) performance results for the following systems: DEC Alpha Server 8400 5/440, Fujitsu CAPP Series (VX, VPP300, and VPP700), HP/Convex Exemplar SPP2000, IBM RS/6000 SP P2SC node (120 MHz), NEC SX-4/32, SGI/CRAY T3E, and SGI Origin2000. We would also present sustained performance per dollar for Class B LU, SP and BT benchmarks.
A Programming Model Performance Study Using the NAS Parallel Benchmarks

DOE PAGES

Shan, Hongzhang; Blagojević, Filip; Min, Seung-Jai; ...

2010-01-01

Harnessing the power of multicore platforms is challenging due to the additional levels of parallelism present. In this paper we use the NAS Parallel Benchmarks to study three programming models, MPI, OpenMP and PGAS to understand their performance and memory usage characteristics on current multicore architectures. To understand these characteristics we use the Integrated Performance Monitoring tool and other ways to measure communication versus computation time, as well as the fraction of the run time spent in OpenMP. The benchmarks are run on two different Cray XT5 systems and an Infiniband cluster. Our results show that in general the threemore » programming models exhibit very similar performance characteristics. In a few cases, OpenMP is significantly faster because it explicitly avoids communication. For these particular cases, we were able to re-write the UPC versions and achieve equal performance to OpenMP. Using OpenMP was also the most advantageous in terms of memory usage. Also we compare performance differences between the two Cray systems, which have quad-core and hex-core processors. We show that at scale the performance is almost always slower on the hex-core system because of increased contention for network resources.« less
Benchmarking of trauma care worldwide: the potential value of an International Trauma Data Bank (ITDB).

PubMed

Haider, Adil H; Hashmi, Zain G; Gupta, Sonia; Zafar, Syed Nabeel; David, Jean-Stephane; Efron, David T; Stevens, Kent A; Zafar, Hasnain; Schneider, Eric B; Voiglio, Eric; Coimbra, Raul; Haut, Elliott R

2014-08-01

National trauma registries have helped improve patient outcomes across the world. Recently, the idea of an International Trauma Data Bank (ITDB) has been suggested to establish global comparative assessments of trauma outcomes. The objective of this study was to determine whether global trauma data could be combined to perform international outcomes benchmarking. We used observed/expected (O/E) mortality ratios to compare two trauma centers [European high-income country (HIC) and Asian lower-middle income country (LMIC)] with centers in the North American National Trauma Data Bank (NTDB). Patients (≥16 years) with blunt/penetrating injuries were included. Multivariable logistic regression, adjusting for known predictors of trauma mortality, was performed. Estimates were used to predict the expected deaths at each center and to calculate O/E mortality ratios for benchmarking. A total of 375,433 patients from 301 centers were included from the NTDB (2002-2010). The LMIC trauma center had 806 patients (2002-2010), whereas the HIC reported 1,003 patients (2002-2004). The most important known predictors of trauma mortality were adequately recorded in all datasets. Mortality benchmarking revealed that the HIC center performed similarly to the NTDB centers [O/E = 1.11 (95% confidence interval (CI) 0.92-1.35)], whereas the LMIC center showed significantly worse survival [O/E = 1.52 (1.23-1.88)]. Subset analyses of patients with blunt or penetrating injury showed similar results. Using only a few key covariates, aggregated global trauma data can be used to adequately perform international trauma center benchmarking. The creation of the ITDB is feasible and recommended as it may be a pivotal step towards improving global trauma outcomes.
The Royal Australian and New Zealand College of Radiologists (RANZCR) relative value unit workload model, its limitations and the evolution to a safety, quality and performance framework.

PubMed

Pitman, A; Jones, D N; Stuart, D; Lloydhope, K; Mallitt, K; O'Rourke, P

2009-10-01

The study reports on the evolution of the Australian radiologist relative value unit (RVU) model of measuring radiologist reporting workloads in teaching hospital departments, and aims to outline a way forward for the development of a broad national safety, quality and performance framework that enables value mapping, measurement and benchmarking. The Radiology International Benchmarking Project of Queensland Health provided a suitable high-level national forum where the existing Pitman-Jones RVU model was applied to contemporaneous data, and its shortcomings and potential avenues for future development were analysed. Application of the Pitman-Jones model to Queensland data and also a Victorian benchmark showed that the original recommendation of 40,000 crude RVU per full-time equivalent consultant radiologist (97-98 baseline level) has risen only moderately, to now lie around 45,000 crude RVU/full-time equivalent. Notwithstanding this, the model has a number of weaknesses and is becoming outdated, as it cannot capture newer time-consuming examinations particularly in CT. A significant re-evaluation of the value of medical imaging is required, and is now occurring. We must rethink how we measure, benchmark, display and continually improve medical imaging safety, quality and performance, throughout the imaging care cycle and beyond. It will be necessary to ensure alignment with patient needs, as well as clinical and organisational objectives. Clear recommendations for the development of an updated national reporting workload RVU system are available, and an opportunity now exists for developing a much broader national model. A more sophisticated and balanced multidimensional safety, quality and performance framework that enables measurement and benchmarking of all important elements of health-care service is needed.
Benchmarking the ATLAS software through the Kit Validation engine

NASA Astrophysics Data System (ADS)

De Salvo, Alessandro; Brasolin, Franco

2010-04-01

The measurement of the experiment software performance is a very important metric in order to choose the most effective resources to be used and to discover the bottlenecks of the code implementation. In this work we present the benchmark techniques used to measure the ATLAS software performance through the ATLAS offline testing engine Kit Validation and the online portal Global Kit Validation. The performance measurements, the data collection, the online analysis and display of the results will be presented. The results of the measurement on different platforms and architectures will be shown, giving a full report on the CPU power and memory consumption of the Monte Carlo generation, simulation, digitization and reconstruction of the most CPU-intensive channels. The impact of the multi-core computing on the ATLAS software performance will also be presented, comparing the behavior of different architectures when increasing the number of concurrent processes. The benchmark techniques described in this paper have been used in the HEPiX group since the beginning of 2008 to help defining the performance metrics for the High Energy Physics applications, based on the real experiment software.
On the predictability of land surface fluxes from meteorological variables

NASA Astrophysics Data System (ADS)

Haughton, Ned; Abramowitz, Gab; Pitman, Andy J.

2018-01-01

Previous research has shown that land surface models (LSMs) are performing poorly when compared with relatively simple empirical models over a wide range of metrics and environments. Atmospheric driving data appear to provide information about land surface fluxes that LSMs are not fully utilising. Here, we further quantify the information available in the meteorological forcing data that are used by LSMs for predicting land surface fluxes, by interrogating FLUXNET data, and extending the benchmarking methodology used in previous experiments. We show that substantial performance improvement is possible for empirical models using meteorological data alone, with no explicit vegetation or soil properties, thus setting lower bounds on a priori expectations on LSM performance. The process also identifies key meteorological variables that provide predictive power. We provide an ensemble of empirical benchmarks that are simple to reproduce and provide a range of behaviours and predictive performance, acting as a baseline benchmark set for future studies. We reanalyse previously published LSM simulations and show that there is more diversity between LSMs than previously indicated, although it remains unclear why LSMs are broadly performing so much worse than simple empirical models.
Benchmark of four popular virtual screening programs: construction of the active/decoy dataset remains a major determinant of measured performance.

PubMed

Chaput, Ludovic; Martinez-Sanz, Juan; Saettel, Nicolas; Mouawad, Liliane

2016-01-01

In a structure-based virtual screening, the choice of the docking program is essential for the success of a hit identification. Benchmarks are meant to help in guiding this choice, especially when undertaken on a large variety of protein targets. Here, the performance of four popular virtual screening programs, Gold, Glide, Surflex and FlexX, is compared using the Directory of Useful Decoys-Enhanced database (DUD-E), which includes 102 targets with an average of 224 ligands per target and 50 decoys per ligand, generated to avoid biases in the benchmarking. Then, a relationship between these program performances and the properties of the targets or the small molecules was investigated. The comparison was based on two metrics, with three different parameters each. The BEDROC scores with α = 80.5, indicated that, on the overall database, Glide succeeded (score > 0.5) for 30 targets, Gold for 27, FlexX for 14 and Surflex for 11. The performance did not depend on the hydrophobicity nor the openness of the protein cavities, neither on the families to which the proteins belong. However, despite the care in the construction of the DUD-E database, the small differences that remain between the actives and the decoys likely explain the successes of Gold, Surflex and FlexX. Moreover, the similarity between the actives of a target and its crystal structure ligand seems to be at the basis of the good performance of Glide. When all targets with significant biases are removed from the benchmarking, a subset of 47 targets remains, for which Glide succeeded for only 5 targets, Gold for 4 and FlexX and Surflex for 2. The performance dramatic drop of all four programs when the biases are removed shows that we should beware of virtual screening benchmarks, because good performances may be due to wrong reasons. Therefore, benchmarking would hardly provide guidelines for virtual screening experiments, despite the tendency that is maintained, i.e., Glide and Gold display better performance than FlexX and Surflex. We recommend to always use several programs and combine their results. Graphical AbstractSummary of the results obtained by virtual screening with the four programs, Glide, Gold, Surflex and FlexX, on the 102 targets of the DUD-E database. The percentage of targets with successful results, i.e., with BDEROC(α = 80.5) > 0.5, when the entire database is considered are in Blue, and when targets with biased chemical libraries are removed are in Red.
[Does implementation of benchmarking in quality circles improve the quality of care of patients with asthma and reduce drug interaction?].

PubMed

Kaufmann-Kolle, Petra; Szecsenyi, Joachim; Broge, Björn; Haefeli, Walter Emil; Schneider, Antonius

2011-01-01

The purpose of this cluster-randomised controlled trial was to evaluate the efficacy of quality circles (QCs) working either with general data-based feedback or with an open benchmark within the field of asthma care and drug-drug interactions. Twelve QCs, involving 96 general practitioners from 85 practices, were randomised. Six QCs worked with traditional anonymous feedback and six with an open benchmark. Two QC meetings supported with feedback reports were held covering the topics "drug-drug interactions" and "asthma"; in both cases discussions were guided by a trained moderator. Outcome measures included health-related quality of life and patient satisfaction with treatment, asthma severity and number of potentially inappropriate drug combinations as well as the general practitioners' satisfaction in relation to the performance of the QC. A significant improvement in the treatment of asthma was observed in both trial arms. However, there was only a slight improvement regarding inappropriate drug combinations. There were no relevant differences between the group with open benchmark (B-QC) and traditional quality circles (T-QC). The physicians' satisfaction with the QC performance was significantly higher in the T-QCs. General practitioners seem to take a critical perspective about open benchmarking in quality circles. Caution should be used when implementing benchmarking in a quality circle as it did not improve healthcare when compared to the traditional procedure with anonymised comparisons. Copyright © 2011. Published by Elsevier GmbH.
Energy benchmarking of commercial buildings: a low-cost pathway toward urban sustainability

NASA Astrophysics Data System (ADS)

Cox, Matt; Brown, Marilyn A.; Sun, Xiaojing

2013-09-01

US cities are beginning to experiment with a regulatory approach to address information failures in the real estate market by mandating the energy benchmarking of commercial buildings. Understanding how a commercial building uses energy has many benefits; for example, it helps building owners and tenants identify poor-performing buildings and subsystems and it enables high-performing buildings to achieve greater occupancy rates, rents, and property values. This paper estimates the possible impacts of a national energy benchmarking mandate through analysis chiefly utilizing the Georgia Tech version of the National Energy Modeling System (GT-NEMS). Correcting input discount rates results in a 4.0% reduction in projected energy consumption for seven major classes of equipment relative to the reference case forecast in 2020, rising to 8.7% in 2035. Thus, the official US energy forecasts appear to overestimate future energy consumption by underestimating investments in energy-efficient equipment. Further discount rate reductions spurred by benchmarking policies yield another 1.3-1.4% in energy savings in 2020, increasing to 2.2-2.4% in 2035. Benchmarking would increase the purchase of energy-efficient equipment, reducing energy bills, CO2 emissions, and conventional air pollution. Achieving comparable CO2 savings would require more than tripling existing US solar capacity. Our analysis suggests that nearly 90% of the energy saved by a national benchmarking policy would benefit metropolitan areas, and the policy’s benefits would outweigh its costs, both to the private sector and society broadly.

Comparing Hospital Processes and Outcomes in California Medicare Beneficiaries: Simulation Prompts Reconsideration.

PubMed

Escobar, Gabriel J; Baker, Jennifer M; Turk, Benjamin J; Draper, David; Liu, Vincent; Kipnis, Patricia

2017-01-01

This article is not a traditional research report. It describes how conducting a specific set of benchmarking analyses led us to broader reflections on hospital benchmarking. We reexamined an issue that has received far less attention from researchers than in the past: How variations in the hospital admission threshold might affect hospital rankings. Considering this threshold made us reconsider what benchmarking is and what future benchmarking studies might be like. Although we recognize that some of our assertions are speculative, they are based on our reading of the literature and previous and ongoing data analyses being conducted in our research unit. We describe the benchmarking analyses that led to these reflections. The Centers for Medicare and Medicaid Services' Hospital Compare Web site includes data on fee-for-service Medicare beneficiaries but does not control for severity of illness, which requires physiologic data now available in most electronic medical records.To address this limitation, we compared hospital processes and outcomes among Kaiser Permanente Northern California's (KPNC) Medicare Advantage beneficiaries and non-KPNC California Medicare beneficiaries between 2009 and 2010. We assigned a simulated severity of illness measure to each record and explored the effect of having the additional information on outcomes. We found that if the admission severity of illness in non-KPNC hospitals increased, KPNC hospitals' mortality performance would appear worse; conversely, if admission severity at non-KPNC hospitals' decreased, KPNC hospitals' performance would appear better. Future hospital benchmarking should consider the impact of variation in admission thresholds.
Benchmarking initiatives in the water industry.

PubMed

Parena, R; Smeets, E

2001-01-01

Customer satisfaction and service care are every day pushing professionals in the water industry to seek to improve their performance, lowering costs and increasing the provided service level. Process Benchmarking is generally recognised as a systematic mechanism of comparing one's own utility with other utilities or businesses with the intent of self-improvement by adopting structures or methods used elsewhere. The IWA Task Force on Benchmarking, operating inside the Statistics and Economics Committee, has been committed to developing a general accepted concept of Process Benchmarking to support water decision-makers in addressing issues of efficiency. In a first step the Task Force disseminated among the Committee members a questionnaire focused on providing suggestions about the kind, the evolution degree and the main concepts of Benchmarking adopted in the represented Countries. A comparison among the guidelines adopted in The Netherlands and Scandinavia has recently challenged the Task Force in drafting a methodology for a worldwide process benchmarking in water industry. The paper provides a framework of the most interesting benchmarking experiences in the water sector and describes in detail both the final results of the survey and the methodology focused on identification of possible improvement areas.
Conceptual Models, Choices, and Benchmarks for Building Quality Work Cultures.

ERIC Educational Resources Information Center

Acker-Hocevar, Michele

1996-01-01

The two models in Florida's Educational Quality Benchmark System represent a new way of thinking about developing schools' work culture. The Quality Performance System Model identifies nine dimensions of work within a quality system. The Change Process Model provides a theoretical framework for changing existing beliefs, attitudes, and behaviors…
Children's Services Statistical Neighbour Benchmarking Tool. Practitioner User Guide

ERIC Educational Resources Information Center

National Foundation for Educational Research, 2007

2007-01-01

Statistical neighbour models provide one method for benchmarking progress. For each local authority (LA), these models designate a number of other LAs deemed to have similar characteristics. These designated LAs are known as statistical neighbours. Any LA may compare its performance (as measured by various indicators) against its statistical…
Benchmarks Momentum on Increase

ERIC Educational Resources Information Center

McNeil, Michele

2008-01-01

No longer content with the patchwork quilt of assessments used to measure states' K-12 performance, top policy groups are pushing states toward international benchmarking as a way to better prepare students for a competitive global economy. The National Governors Association, the Council of Chief State School Officers, and the standards-advocacy…
Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived frommore » calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.« less
Toward benchmarking in catalysis science: Best practices, challenges, and opportunities

DOE PAGES

Bligaard, Thomas; Bullock, R. Morris; Campbell, Charles T.; ...

2016-03-07

Benchmarking is a community-based and (preferably) community-driven activity involving consensus-based decisions on how to make reproducible, fair, and relevant assessments. In catalysis science, important catalyst performance metrics include activity, selectivity, and the deactivation profile, which enable comparisons between new and standard catalysts. Benchmarking also requires careful documentation, archiving, and sharing of methods and measurements, to ensure that the full value of research data can be realized. Beyond these goals, benchmarking presents unique opportunities to advance and accelerate understanding of complex reaction systems by combining and comparing experimental information from multiple, in situ and operando techniques with theoretical insights derived frommore » calculations characterizing model systems. This Perspective describes the origins and uses of benchmarking and its applications in computational catalysis, heterogeneous catalysis, molecular catalysis, and electrocatalysis. As a result, it also discusses opportunities and challenges for future developments in these fields.« less
Phase field benchmark problems for dendritic growth and linear elasticity

DOE PAGES

Jokisaari, Andrea M.; Voorhees, P. W.; Guyer, Jonathan E.; ...

2018-03-26

We present the second set of benchmark problems for phase field models that are being jointly developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST) along with input from other members in the phase field community. As the integrated computational materials engineering (ICME) approach to materials design has gained traction, there is an increasing need for quantitative phase field results. New algorithms and numerical implementations increase computational capabilities, necessitating standard problems to evaluate their impact on simulated microstructure evolution as well as their computational performance. We propose one benchmark problem formore » solidifiication and dendritic growth in a single-component system, and one problem for linear elasticity via the shape evolution of an elastically constrained precipitate. We demonstrate the utility and sensitivity of the benchmark problems by comparing the results of 1) dendritic growth simulations performed with different time integrators and 2) elastically constrained precipitate simulations with different precipitate sizes, initial conditions, and elastic moduli. As a result, these numerical benchmark problems will provide a consistent basis for evaluating different algorithms, both existing and those to be developed in the future, for accuracy and computational efficiency when applied to simulate physics often incorporated in phase field models.« less
Phase field benchmark problems for dendritic growth and linear elasticity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jokisaari, Andrea M.; Voorhees, P. W.; Guyer, Jonathan E.

We present the second set of benchmark problems for phase field models that are being jointly developed by the Center for Hierarchical Materials Design (CHiMaD) and the National Institute of Standards and Technology (NIST) along with input from other members in the phase field community. As the integrated computational materials engineering (ICME) approach to materials design has gained traction, there is an increasing need for quantitative phase field results. New algorithms and numerical implementations increase computational capabilities, necessitating standard problems to evaluate their impact on simulated microstructure evolution as well as their computational performance. We propose one benchmark problem formore » solidifiication and dendritic growth in a single-component system, and one problem for linear elasticity via the shape evolution of an elastically constrained precipitate. We demonstrate the utility and sensitivity of the benchmark problems by comparing the results of 1) dendritic growth simulations performed with different time integrators and 2) elastically constrained precipitate simulations with different precipitate sizes, initial conditions, and elastic moduli. As a result, these numerical benchmark problems will provide a consistent basis for evaluating different algorithms, both existing and those to be developed in the future, for accuracy and computational efficiency when applied to simulate physics often incorporated in phase field models.« less
A benchmarking tool to evaluate computer tomography perfusion infarct core predictions against a DWI standard.

PubMed

Cereda, Carlo W; Christensen, Søren; Campbell, Bruce Cv; Mishra, Nishant K; Mlynash, Michael; Levi, Christopher; Straka, Matus; Wintermark, Max; Bammer, Roland; Albers, Gregory W; Parsons, Mark W; Lansberg, Maarten G

2016-10-01

Differences in research methodology have hampered the optimization of Computer Tomography Perfusion (CTP) for identification of the ischemic core. We aim to optimize CTP core identification using a novel benchmarking tool. The benchmarking tool consists of an imaging library and a statistical analysis algorithm to evaluate the performance of CTP. The tool was used to optimize and evaluate an in-house developed CTP-software algorithm. Imaging data of 103 acute stroke patients were included in the benchmarking tool. Median time from stroke onset to CT was 185 min (IQR 180-238), and the median time between completion of CT and start of MRI was 36 min (IQR 25-79). Volumetric accuracy of the CTP-ROIs was optimal at an rCBF threshold of <38%; at this threshold, the mean difference was 0.3 ml (SD 19.8 ml), the mean absolute difference was 14.3 (SD 13.7) ml, and CTP was 67% sensitive and 87% specific for identification of DWI positive tissue voxels. The benchmarking tool can play an important role in optimizing CTP software as it provides investigators with a novel method to directly compare the performance of alternative CTP software packages. © The Author(s) 2015.
OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE - A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arnis Judzis

2002-10-01

This document details the progress to date on the OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE -- A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING contract for the quarter starting July 2002 through September 2002. Even though we are awaiting the optimization portion of the testing program, accomplishments include the following: (1) Smith International agreed to participate in the DOE Mud Hammer program. (2) Smith International chromed collars for upcoming benchmark tests at TerraTek, now scheduled for 4Q 2002. (3) ConocoPhillips had a field trial of the Smith fluid hammer offshore Vietnam. The hammer functioned properly, though themore » well encountered hole conditions and reaming problems. ConocoPhillips plan another field trial as a result. (4) DOE/NETL extended the contract for the fluid hammer program to allow Novatek to ''optimize'' their much delayed tool to 2003 and to allow Smith International to add ''benchmarking'' tests in light of SDS Digger Tools' current financial inability to participate. (5) ConocoPhillips joined the Industry Advisors for the mud hammer program. (6) TerraTek acknowledges Smith International, BP America, PDVSA, and ConocoPhillips for cost-sharing the Smith benchmarking tests allowing extension of the contract to complete the optimizations.« less
Toxicological Benchmarks for Screening of Potential Contaminants of Concern for Effects on Aquatic Biota on the Oak Ridge Reservation, Oak Ridge, Tennessee

DOE Office of Scientific and Technical Information (OSTI.GOV)

Suter, G.W., II

1993-01-01

One of the initial stages in ecological risk assessment of hazardous waste sites is the screening of contaminants to determine which, if any, of them are worthy of further consideration; this process is termed contaminant screening. Screening is performed by comparing concentrations in ambient media to benchmark concentrations that are either indicative of a high likelihood of significant effects (upper screening benchmarks) or of a very low likelihood of significant effects (lower screening benchmarks). Exceedance of an upper screening benchmark indicates that the chemical in question is clearly of concern and remedial actions are likely to be needed. Exceedance ofmore » a lower screening benchmark indicates that a contaminant is of concern unless other information indicates that the data are unreliable or the comparison is inappropriate. Chemicals with concentrations below the lower benchmark are not of concern if the ambient data are judged to be adequate. This report presents potential screening benchmarks for protection of aquatic life from contaminants in water. Because there is no guidance for screening benchmarks, a set of alternative benchmarks is presented herein. The alternative benchmarks are based on different conceptual approaches to estimating concentrations causing significant effects. For the upper screening benchmark, there are the acute National Ambient Water Quality Criteria (NAWQC) and the Secondary Acute Values (SAV). The SAV concentrations are values estimated with 80% confidence not to exceed the unknown acute NAWQC for those chemicals with no NAWQC. The alternative chronic benchmarks are the chronic NAWQC, the Secondary Chronic Value (SCV), the lowest chronic values for fish and daphnids, the lowest EC20 for fish and daphnids from chronic toxicity tests, the estimated EC20 for a sensitive species, and the concentration estimated to cause a 20% reduction in the recruit abundance of largemouth bass. It is recommended that ambient chemical concentrations be compared to all of these benchmarks. If NAWQC are exceeded, the chemicals must be contaminants of concern because the NAWQC are applicable or relevant and appropriate requirements (ARARs). If NAWQC are not exceeded, but other benchmarks are, contaminants should be selected on the basis of the number of benchmarks exceeded and the conservatism of the particular benchmark values, as discussed in the text. To the extent that toxicity data are available, this report presents the alternative benchmarks for chemicals that have been detected on the Oak Ridge Reservation. It also presents the data used to calculate the benchmarks and the sources of the data. It compares the benchmarks and discusses their relative conservatism and utility. This report supersedes a prior aquatic benchmarks report (Suter and Mabrey 1994). It adds two new types of benchmarks. It also updates the benchmark values where appropriate, adds some new benchmark values, replaces secondary sources with primary sources, and provides more complete documentation of the sources and derivation of all values.« less
Benchmarking road safety performance: Identifying a meaningful reference (best-in-class).

PubMed

Chen, Faan; Wu, Jiaorong; Chen, Xiaohong; Wang, Jianjun; Wang, Di

2016-01-01

For road safety improvement, comparing and benchmarking performance are widely advocated as the emerging and preferred approaches. However, there is currently no universally agreed upon approach for the process of road safety benchmarking, and performing the practice successfully is by no means easy. This is especially true for the two core activities of which: (1) developing a set of road safety performance indicators (SPIs) and combining them into a composite index; and (2) identifying a meaningful reference (best-in-class), one which has already obtained outstanding road safety practices. To this end, a scientific technique that can combine the multi-dimensional safety performance indicators (SPIs) into an overall index, and subsequently can identify the 'best-in-class' is urgently required. In this paper, the Entropy-embedded RSR (Rank-sum ratio), an innovative, scientific and systematic methodology is investigated with the aim of conducting the above two core tasks in an integrative and concise procedure, more specifically in a 'one-stop' way. Using a combination of results from other methods (e.g. the SUNflower approach) and other measures (e.g. Human Development Index) as a relevant reference, a given set of European countries are robustly ranked and grouped into several classes based on the composite Road Safety Index. Within each class the 'best-in-class' is then identified. By benchmarking road safety performance, the results serve to promote best practice, encourage the adoption of successful road safety strategies and measures and, more importantly, inspire the kind of political leadership needed to create a road transport system that maximizes safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
Design and Implementation of a Web-Based Reporting and Benchmarking Center for Inpatient Glucometrics

PubMed Central

Schnipper, Jeffrey Lawrence; Messler, Jordan; Ramos, Pedro; Kulasa, Kristen; Nolan, Ann; Rogers, Kendall

2014-01-01

Background: Insulin is a top source of adverse drug events in the hospital, and glycemic control is a focus of improvement efforts across the country. Yet, the majority of hospitals have no data to gauge their performance on glycemic control, hypoglycemia rates, or hypoglycemic management. Current tools to outsource glucometrics reports are limited in availability or function. Methods: Society of Hospital Medicine (SHM) faculty designed and implemented a web-based data and reporting center that calculates glucometrics on blood glucose data files securely uploaded by users. Unit labels, care type (critical care, non–critical care), and unit type (eg, medical, surgical, mixed, pediatrics) are defined on upload allowing for robust, flexible reporting. Reports for any date range, care type, unit type, or any combination of units are available on demand for review or downloading into a variety of file formats. Four reports with supporting graphics depict glycemic control, hypoglycemia, and hypoglycemia management by patient day or patient stay. Benchmarking and performance ranking reports are generated periodically for all hospitals in the database. Results: In all, 76 hospitals have uploaded at least 12 months of data for non–critical care areas and 67 sites have uploaded critical care data. Critical care benchmarking reveals wide variability in performance. Some hospitals achieve top quartile performance in both glycemic control and hypoglycemia parameters. Conclusions: This new web-based glucometrics data and reporting tool allows hospitals to track their performance with a flexible reporting system, and provides them with external benchmarking. Tools like this help to establish standardized glucometrics and performance standards. PMID:24876426
Design and implementation of a web-based reporting and benchmarking center for inpatient glucometrics.

PubMed

Maynard, Greg; Schnipper, Jeffrey Lawrence; Messler, Jordan; Ramos, Pedro; Kulasa, Kristen; Nolan, Ann; Rogers, Kendall

2014-07-01

Insulin is a top source of adverse drug events in the hospital, and glycemic control is a focus of improvement efforts across the country. Yet, the majority of hospitals have no data to gauge their performance on glycemic control, hypoglycemia rates, or hypoglycemic management. Current tools to outsource glucometrics reports are limited in availability or function. Society of Hospital Medicine (SHM) faculty designed and implemented a web-based data and reporting center that calculates glucometrics on blood glucose data files securely uploaded by users. Unit labels, care type (critical care, non-critical care), and unit type (eg, medical, surgical, mixed, pediatrics) are defined on upload allowing for robust, flexible reporting. Reports for any date range, care type, unit type, or any combination of units are available on demand for review or downloading into a variety of file formats. Four reports with supporting graphics depict glycemic control, hypoglycemia, and hypoglycemia management by patient day or patient stay. Benchmarking and performance ranking reports are generated periodically for all hospitals in the database. In all, 76 hospitals have uploaded at least 12 months of data for non-critical care areas and 67 sites have uploaded critical care data. Critical care benchmarking reveals wide variability in performance. Some hospitals achieve top quartile performance in both glycemic control and hypoglycemia parameters. This new web-based glucometrics data and reporting tool allows hospitals to track their performance with a flexible reporting system, and provides them with external benchmarking. Tools like this help to establish standardized glucometrics and performance standards. © 2014 Diabetes Technology Society.
International benchmarking and best practice management: in search of health care and hospital excellence.

PubMed

von Eiff, Wilfried

2015-01-01

Hospitals worldwide are facing the same opportunities and threats: the demographics of an aging population; steady increases in chronic diseases and severe illnesses; and a steadily increasing demand for medical services with more intensive treatment for multi-morbid patients. Additionally, patients are becoming more demanding. They expect high quality medicine within a dignity-driven and painless healing environment. The severe financial pressures that these developments entail oblige care providers to more and more cost-containment and to apply process reengineering, as well as continuous performance improvement measures, so as to achieve future financial sustainability. At the same time, regulators are calling for improved patient outcomes. Benchmarking and best practice management are successfully proven performance improvement tools for enabling hospitals to achieve a higher level of clinical output quality, enhanced patient satisfaction, and care delivery capability, while simultaneously containing and reducing costs. This chapter aims to clarify what benchmarking is and what it is not. Furthermore, it is stated that benchmarking is a powerful managerial tool for improving decision-making processes that can contribute to the above-mentioned improvement measures in health care delivery. The benchmarking approach described in this chapter is oriented toward the philosophy of an input-output model and is explained based on practical international examples from different industries in various countries. Benchmarking is not a project with a defined start and end point, but a continuous initiative of comparing key performance indicators, process structures, and best practices from best-in-class companies inside and outside industry. Benchmarking is an ongoing process of measuring and searching for best-in-class performance: Measure yourself with yourself over time against key performance indicators. Measure yourself against others. Identify best practices. Equal or exceed this best practice in your institution. Focus on simple and effective ways to implement solutions. Comparing only figures, such as average length of stay, costs of procedures, infection rates, or out-of-stock rates, can lead easily to wrong conclusions and decision making with often-disastrous consequences. Just looking at figures and ratios is not the basis for detecting potential excellence. It is necessary to look beyond the numbers to understand how processes work and contribute to best-in-class results. Best practices from even quite different industries can enable hospitals to leapfrog results in patient orientation, clinical excellence, and cost-effectiveness. Despite common benchmarking approaches, it is pointed out that a comparison without "looking behind the figures" (what it means to be familiar with the process structure, process dynamic and drivers, process institutions/rules and process-related incentive components) will be extremely limited referring to reliability and quality of findings. In order to demonstrate transferability of benchmarking results between different industries practical examples from health care, automotive, and hotel service have been selected. Additionally, it is depicted that international comparisons between hospitals providing medical services in different health care systems do have a great potential for achieving leapfrog results in medical quality, organization of service provision, effective work structures, purchasing and logistics processes, or management, etc.
Object-Oriented Implementation of the NAS Parallel Benchmarks using Charm++

NASA Technical Reports Server (NTRS)

Krishnan, Sanjeev; Bhandarkar, Milind; Kale, Laxmikant V.

1996-01-01

This report describes experiences with implementing the NAS Computational Fluid Dynamics benchmarks using a parallel object-oriented language, Charm++. Our main objective in implementing the NAS CFD kernel benchmarks was to develop a code that could be used to easily experiment with different domain decomposition strategies and dynamic load balancing. We also wished to leverage the object-orientation provided by the Charm++ parallel object-oriented language, to develop reusable abstractions that would simplify the process of developing parallel applications. We first describe the Charm++ parallel programming model and the parallel object array abstraction, then go into detail about each of the Scalar Pentadiagonal (SP) and Lower/Upper Triangular (LU) benchmarks, along with performance results. Finally we conclude with an evaluation of the methodology used.
Technical Report: Benchmarking for Quasispecies Abundance Inference with Confidence Intervals from Metagenomic Sequence Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

McLoughlin, K.

2016-01-22

The software application “MetaQuant” was developed by our group at Lawrence Livermore National Laboratory (LLNL). It is designed to profile microbial populations in a sample using data from whole-genome shotgun (WGS) metagenomic DNA sequencing. Several other metagenomic profiling applications have been described in the literature. We ran a series of benchmark tests to compare the performance of MetaQuant against that of a few existing profiling tools, using real and simulated sequence datasets. This report describes our benchmarking procedure and results.
[Do you mean benchmarking?].

PubMed

Bonnet, F; Solignac, S; Marty, J

2008-03-01

The purpose of benchmarking is to settle improvement processes by comparing the activities to quality standards. The proposed methodology is illustrated by benchmark business cases performed inside medical plants on some items like nosocomial diseases or organization of surgery facilities. Moreover, the authors have built a specific graphic tool, enhanced with balance score numbers and mappings, so that the comparison between different anesthesia-reanimation services, which are willing to start an improvement program, is easy and relevant. This ready-made application is even more accurate as far as detailed tariffs of activities are implemented.
EVA Health and Human Performance Benchmarking Study

NASA Technical Reports Server (NTRS)

Abercromby, A. F.; Norcross, J.; Jarvis, S. L.

2016-01-01

Multiple HRP Risks and Gaps require detailed characterization of human health and performance during exploration extravehicular activity (EVA) tasks; however, a rigorous and comprehensive methodology for characterizing and comparing the health and human performance implications of current and future EVA spacesuit designs does not exist. This study will identify and implement functional tasks and metrics, both objective and subjective, that are relevant to health and human performance, such as metabolic expenditure, suit fit, discomfort, suited postural stability, cognitive performance, and potentially biochemical responses for humans working inside different EVA suits doing functional tasks under the appropriate simulated reduced gravity environments. This study will provide health and human performance benchmark data for humans working in current EVA suits (EMU, Mark III, and Z2) as well as shirtsleeves using a standard set of tasks and metrics with quantified reliability. Results and methodologies developed during this test will provide benchmark data against which future EVA suits, and different suit configurations (eg, varied pressure, mass, CG) may be reliably compared in subsequent tests. Results will also inform fitness for duty standards as well as design requirements and operations concepts for future EVA suits and other exploration systems.

Surflex-Dock: Docking benchmarks and real-world application

NASA Astrophysics Data System (ADS)

Spitzer, Russell; Jain, Ajay N.

2012-06-01

Benchmarks for molecular docking have historically focused on re-docking the cognate ligand of a well-determined protein-ligand complex to measure geometric pose prediction accuracy, and measurement of virtual screening performance has been focused on increasingly large and diverse sets of target protein structures, cognate ligands, and various types of decoy sets. Here, pose prediction is reported on the Astex Diverse set of 85 protein ligand complexes, and virtual screening performance is reported on the DUD set of 40 protein targets. In both cases, prepared structures of targets and ligands were provided by symposium organizers. The re-prepared data sets yielded results not significantly different than previous reports of Surflex-Dock on the two benchmarks. Minor changes to protein coordinates resulting from complex pre-optimization had large effects on observed performance, highlighting the limitations of cognate ligand re-docking for pose prediction assessment. Docking protocols developed for cross-docking, which address protein flexibility and produce discrete families of predicted poses, produced substantially better performance for pose prediction. Performance on virtual screening performance was shown to benefit by employing and combining multiple screening methods: docking, 2D molecular similarity, and 3D molecular similarity. In addition, use of multiple protein conformations significantly improved screening enrichment.
How does audit and feedback influence intentions of health professionals to improve practice? A laboratory experiment and field study in cardiac rehabilitation.

PubMed

Gude, Wouter T; van Engen-Verheul, Mariëtte M; van der Veer, Sabine N; de Keizer, Nicolette F; Peek, Niels

2017-04-01

To identify factors that influence the intentions of health professionals to improve their practice when confronted with clinical performance feedback, which is an essential first step in the audit and feedback mechanism. We conducted a theory-driven laboratory experiment with 41 individual professionals, and a field study in 18 centres in the context of a cluster-randomised trial of electronic audit and feedback in cardiac rehabilitation. Feedback reports were provided through a web-based application, and included performance scores and benchmark comparisons (high, intermediate or low performance) for a set of process and outcome indicators. From each report participants selected indicators for improvement into their action plan. Our unit of observation was an indicator presented in a feedback report (selected yes/no); we considered selecting an indicator to reflect an intention to improve. We analysed 767 observations in the laboratory experiment and 614 in the field study, respectively. Each 10% decrease in performance score increased the probability of an indicator being selected by 54% (OR, 1.54; 95% CI 1.29% to 1.83%) in the laboratory experiment, and 25% (OR, 1.25; 95% CI 1.13% to 1.39%) in the field study. Also, performance being benchmarked as low and intermediate increased this probability in laboratory settings. Still, participants ignored the benchmarks in 34% (laboratory experiment) and 48% (field study) of their selections. When confronted with clinical performance feedback, performance scores and benchmark comparisons influenced health professionals' intentions to improve practice. However, there was substantial variation in these intentions, because professionals disagreed with benchmarks, deemed improvement unfeasible or did not consider the indicator an essential aspect of care quality. These phenomena impede intentions to improve practice, and are thus likely to dilute the effects of audit and feedback interventions. NTR3251, pre-results. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets

PubMed Central

2013-01-01

Background While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability of establishing bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, a novel protein descriptor set (termed ProtFP (4 variants)), and in addition we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences ( > 0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last. Conclusions While amino acid descriptor sets capture different aspects of amino acids their ability to be used for bioactivity modeling is still – on average – surprisingly similar. Still, combining sets describing complementary information consistently leads to small but consistent improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared thereby underlining that choosing an appropriate descriptor set is of fundamental for bioactivity modeling, both from the ligand- as well as the protein side. PMID:24059743
Inclusion of Highest Glasgow Coma Scale Motor Component Score in Mortality Risk Adjustment for Benchmarking of Trauma Center Performance.

PubMed

Gomez, David; Byrne, James P; Alali, Aziz S; Xiong, Wei; Hoeft, Chris; Neal, Melanie; Subacius, Harris; Nathens, Avery B

2017-12-01

The Glasgow Coma Scale (GCS) is the most widely used measure of traumatic brain injury (TBI) severity. Currently, the arrival GCS motor component (mGCS) score is used in risk-adjustment models for external benchmarking of mortality. However, there is evidence that the highest mGCS score in the first 24 hours after injury might be a better predictor of death. Our objective was to evaluate the impact of including the highest mGCS score on the performance of risk-adjustment models and subsequent external benchmarking results. Data were derived from the Trauma Quality Improvement Program analytic dataset (January 2014 through March 2015) and were limited to the severe TBI cohort (16 years or older, isolated head injury, GCS ≤8). Risk-adjustment models were created that varied in the mGCS covariates only (initial score, highest score, or both initial and highest mGCS scores). Model performance and fit, as well as external benchmarking results, were compared. There were 6,553 patients with severe TBI across 231 trauma centers included. Initial and highest mGCS scores were different in 47% of patients (n = 3,097). Model performance and fit improved when both initial and highest mGCS scores were included, as evidenced by improved C-statistic, Akaike Information Criterion, and adjusted R-squared values. Three-quarters of centers changed their adjusted odds ratio decile, 2.6% of centers changed outlier status, and 45% of centers exhibited a ≥0.5-SD change in the odds ratio of death after including highest mGCS score in the model. This study supports the concept that additional clinical information has the potential to not only improve the performance of current risk-adjustment models, but can also have a meaningful impact on external benchmarking strategies. Highest mGCS score is a good potential candidate for inclusion in additional models. Copyright © 2017 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
A Prototype Tool to Enable Farmers to Measure and Improve the Welfare Performance of the Farm Animal Enterprise: The Unified Field Index

PubMed Central

Colditz, Ian G.; Ferguson, Drewe M.; Collins, Teresa; Matthews, Lindsay; Hemsworth, Paul H.

2014-01-01

Simple Summary Benchmarking is a tool widely used in agricultural industries that harnesses the experience of farmers to generate knowledge of practices that lead to better on-farm productivity and performance. We propose, by analogy with production performance, a method for measuring the animal welfare performance of an enterprise and describe a tool for farmers to monitor and improve the animal welfare performance of their business. A general framework is outlined for assessing and monitoring risks to animal welfare based on measures of animals, the environment they are kept in and how they are managed. The tool would enable farmers to continually improve animal welfare. Abstract Schemes for the assessment of farm animal welfare and assurance of welfare standards have proliferated in recent years. An acknowledged short-coming has been the lack of impact of these schemes on the welfare standards achieved on farm due in part to sociological factors concerning their implementation. Here we propose the concept of welfare performance based on a broad set of performance attributes of an enterprise and describe a tool based on risk assessment and benchmarking methods for measuring and managing welfare performance. The tool termed the Unified Field Index is presented in a general form comprising three modules addressing animal, resource, and management factors. Domains within these modules accommodate the principle conceptual perspectives for welfare assessment: biological functioning; emotional states; and naturalness. Pan-enterprise analysis in any livestock sector could be used to benchmark welfare performance of individual enterprises and also provide statistics of welfare performance for the livestock sector. An advantage of this concept of welfare performance is its use of continuous scales of measurement rather than traditional pass/fail measures. Through the feedback provided via benchmarking, the tool should help farmers better engage in on-going improvement of farm practices that affect animal welfare. PMID:26480317
[Therapeutic strategies targeting brain tumor stem cells].

PubMed

Toda, Masahiro

2009-07-01

Progress in stem cell research reveals cancer stem cells to be present in a variety of malignant tumors. Since they exhibit resistance to anticancer drugs and radiotherapy, analysis of their properties has been rapidly carried forward as an important target for the treatment of intractable malignancies, including brain tumors. In fact, brain cancer stem cells (BCSCs) have been isolated from brain tumor tissue and brain tumor cell lines by using neural stem cell culture methods and isolation methods for side population (SP) cells, which have high drug-efflux capacity. Although the analysis of the properties of BCSCs is the most important to developing methods in treating BCSCs, the absence of BCSC purification methods should be remedied by taking it up as an important research task in the immediate future. Thus far, there are no effective treatment methods for BCSCs, and several treatment methods have been proposed based on the cell biology characteristics of BCSCs. In this article, I outline potential treatment methods damaging treatment-resistant BCSCs, including immunotherapy which is currently a topic of our research.
Performance Comparison of NAMI DANCE and FLOW-3D® Models in Tsunami Propagation, Inundation and Currents using NTHMP Benchmark Problems

NASA Astrophysics Data System (ADS)

Velioglu Sogut, Deniz; Yalciner, Ahmet Cevdet

2018-06-01

Field observations provide valuable data regarding nearshore tsunami impact, yet only in inundation areas where tsunami waves have already flooded. Therefore, tsunami modeling is essential to understand tsunami behavior and prepare for tsunami inundation. It is necessary that all numerical models used in tsunami emergency planning be subject to benchmark tests for validation and verification. This study focuses on two numerical codes, NAMI DANCE and FLOW-3D®, for validation and performance comparison. NAMI DANCE is an in-house tsunami numerical model developed by the Ocean Engineering Research Center of Middle East Technical University, Turkey and Laboratory of Special Research Bureau for Automation of Marine Research, Russia. FLOW-3D® is a general purpose computational fluid dynamics software, which was developed by scientists who pioneered in the design of the Volume-of-Fluid technique. The codes are validated and their performances are compared via analytical, experimental and field benchmark problems, which are documented in the ``Proceedings and Results of the 2011 National Tsunami Hazard Mitigation Program (NTHMP) Model Benchmarking Workshop'' and the ``Proceedings and Results of the NTHMP 2015 Tsunami Current Modeling Workshop". The variations between the numerical solutions of these two models are evaluated through statistical error analysis.
Benchmarking a Soil Moisture Data Assimilation System for Agricultural Drought Monitoring

NASA Technical Reports Server (NTRS)

Hun, Eunjin; Crow, Wade T.; Holmes, Thomas; Bolten, John

2014-01-01

Despite considerable interest in the application of land surface data assimilation systems (LDAS) for agricultural drought applications, relatively little is known about the large-scale performance of such systems and, thus, the optimal methodological approach for implementing them. To address this need, this paper evaluates an LDAS for agricultural drought monitoring by benchmarking individual components of the system (i.e., a satellite soil moisture retrieval algorithm, a soil water balance model and a sequential data assimilation filter) against a series of linear models which perform the same function (i.e., have the same basic inputoutput structure) as the full system component. Benchmarking is based on the calculation of the lagged rank cross-correlation between the normalized difference vegetation index (NDVI) and soil moisture estimates acquired for various components of the system. Lagged soil moistureNDVI correlations obtained using individual LDAS components versus their linear analogs reveal the degree to which non-linearities andor complexities contained within each component actually contribute to the performance of the LDAS system as a whole. Here, a particular system based on surface soil moisture retrievals from the Land Parameter Retrieval Model (LPRM), a two-layer Palmer soil water balance model and an Ensemble Kalman filter (EnKF) is benchmarked. Results suggest significant room for improvement in each component of the system.
Thermal Performance Benchmarking: Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moreno, Gilbert

2016-04-08

The goal for this project is to thoroughly characterize the performance of state-of-the-art (SOA) automotive power electronics and electric motor thermal management systems. Information obtained from these studies will be used to: Evaluate advantages and disadvantages of different thermal management strategies; establish baseline metrics for the thermal management systems; identify methods of improvement to advance the SOA; increase the publicly available information related to automotive traction-drive thermal management systems; help guide future electric drive technologies (EDT) research and development (R&D) efforts. The performance results combined with component efficiency and heat generation information obtained by Oak Ridge National Laboratory (ORNL) maymore » then be used to determine the operating temperatures for the EDT components under drive-cycle conditions. In FY15, the 2012 Nissan LEAF power electronics and electric motor thermal management systems were benchmarked. Testing of the 2014 Honda Accord Hybrid power electronics thermal management system started in FY15; however, due to time constraints it was not possible to include results for this system in this report. The focus of this project is to benchmark the thermal aspects of the systems. ORNL's benchmarking of electric and hybrid electric vehicle technology reports provide detailed descriptions of the electrical and packaging aspects of these automotive systems.« less
Benchmarking and Self-Assessment in the Wine Industry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Galitsky, Christina; Radspieler, Anthony; Worrell, Ernst

2005-12-01

Not all industrial facilities have the staff or theopportunity to perform a detailed audit of their operations. The lack ofknowledge of energy efficiency opportunities provides an importantbarrier to improving efficiency. Benchmarking programs in the U.S. andabroad have shown to improve knowledge of the energy performance ofindustrial facilities and buildings and to fuel energy managementpractices. Benchmarking provides a fair way to compare the energyintensity of plants, while accounting for structural differences (e.g.,the mix of products produced, climate conditions) between differentfacilities. In California, the winemaking industry is not only one of theeconomic pillars of the economy; it is also a large energymore » consumer, witha considerable potential for energy-efficiency improvement. LawrenceBerkeley National Laboratory and Fetzer Vineyards developed the firstbenchmarking tool for the California wine industry called "BEST(Benchmarking and Energy and water Savings Tool) Winery". BEST Wineryenables a winery to compare its energy efficiency to a best practicereference winery. Besides overall performance, the tool enables the userto evaluate the impact of implementing efficiency measures. The toolfacilitates strategic planning of efficiency measures, based on theestimated impact of the measures, their costs and savings. The tool willraise awareness of current energy intensities and offer an efficient wayto evaluate the impact of future efficiency measures.« less
Processor Emulator with Benchmark Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lloyd, G. Scott; Pearce, Roger; Gokhale, Maya

2015-11-13

A processor emulator and a suite of benchmark applications have been developed to assist in characterizing the performance of data-centric workloads on current and future computer architectures. Some of the applications have been collected from other open source projects. For more details on the emulator and an example of its usage, see reference [1].
Using Web-Based Peer Benchmarking to Manage the Client-Based Project

ERIC Educational Resources Information Center

Raska, David; Keller, Eileen Weisenbach; Shaw, Doris

2013-01-01

The complexities of integrating client-based projects into marketing courses provide challenges for the instructor but produce richness of context and active learning for the student. This paper explains the integration of Web-based peer benchmarking as a means of improving student performance on client-based projects within a single semester in…
Taking Aims: New CASE Study Benchmarks Advancement Investments and Returns

ERIC Educational Resources Information Center

Goldsmith, Rae

2012-01-01

Advancement professionals have always been thirsty for information that will help them understand how their programs compare with those of their peers. But in recent years the demand for benchmarking data has exploded as budgets have become leaner, leaders have become more business minded, and terms like "performance metrics and return on…
Can Human Capital Metrics Effectively Benchmark Higher Education with For-Profit Companies?

ERIC Educational Resources Information Center

Hagedorn, Kathy; Forlaw, Blair

2007-01-01

Last fall, Saint Louis University participated in St. Louis, Missouri's, first Human Capital Performance Study alongside several of the region's largest for-profit employers. The university also participated this year in the benchmarking of employee engagement factors conducted by the St. Louis Business Journal in its effort to quantify and select…
The OpenMP Implementation of NAS Parallel Benchmarks and its Performance

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry

1999-01-01

As the new ccNUMA architecture became popular in recent years, parallel programming with compiler directives on these machines has evolved to accommodate new needs. In this study, we examine the effectiveness of OpenMP directives for parallelizing the NAS Parallel Benchmarks. Implementation details will be discussed and performance will be compared with the MPI implementation. We have demonstrated that OpenMP can achieve very good results for parallelization on a shared memory system, but effective use of memory and cache is very important.
Multi-Core Processor Memory Contention Benchmark Analysis Case Study

NASA Technical Reports Server (NTRS)

Simon, Tyler; McGalliard, James

2009-01-01

Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.
Accelerating progress in Artificial General Intelligence: Choosing a benchmark for natural world interaction

NASA Astrophysics Data System (ADS)

Rohrer, Brandon

2010-12-01

Measuring progress in the field of Artificial General Intelligence (AGI) can be difficult without commonly accepted methods of evaluation. An AGI benchmark would allow evaluation and comparison of the many computational intelligence algorithms that have been developed. In this paper I propose that a benchmark for natural world interaction would possess seven key characteristics: fitness, breadth, specificity, low cost, simplicity, range, and task focus. I also outline two benchmark examples that meet most of these criteria. In the first, the direction task, a human coach directs a machine to perform a novel task in an unfamiliar environment. The direction task is extremely broad, but may be idealistic. In the second, the AGI battery, AGI candidates are evaluated based on their performance on a collection of more specific tasks. The AGI battery is designed to be appropriate to the capabilities of currently existing systems. Both the direction task and the AGI battery would require further definition before implementing. The paper concludes with a description of a task that might be included in the AGI battery: the search and retrieve task.
Avoiding unintended incentives in ACO payment models.

PubMed

Douven, Rudy; McGuire, Thomas G; McWilliams, J Michael

2015-01-01

One goal of the Medicare Shared Savings Program for accountable care organizations (ACOs) is to reduce Medicare spending for ACOs' patients relative to the organizations' spending history. However, we found that current rules for setting ACO spending targets (or benchmarks) diminish ACOs' incentives to generate savings and may even encourage higher instead of lower Medicare spending. Spending in the three years before ACOs enter or renew a contract is weighted unequally in the benchmark calculation, with a high weight of 0.6 given to the year just before a new contract starts. Thus, ACOs have incentives to increase spending in that year to inflate their benchmark for future years and thereby make it easier to obtain shared savings from Medicare in the new contract period. We suggest strategies to improve incentives for ACOs, including changes to the weights used to determine benchmarks and new payment models that base an ACO's spending target not only on its own past performance but also on the performance of other ACOs or Medicare providers. Project HOPE—The People-to-People Health Foundation, Inc.
Benchmarking NNWSI flow and transport codes: COVE 1 results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hayden, N.K.

1985-06-01

The code verification (COVE) activity of the Nevada Nuclear Waste Storage Investigations (NNWSI) Project is the first step in certification of flow and transport codes used for NNWSI performance assessments of a geologic repository for disposing of high-level radioactive wastes. The goals of the COVE activity are (1) to demonstrate and compare the numerical accuracy and sensitivity of certain codes, (2) to identify and resolve problems in running typical NNWSI performance assessment calculations, and (3) to evaluate computer requirements for running the codes. This report describes the work done for COVE 1, the first step in benchmarking some of themore » codes. Isothermal calculations for the COVE 1 benchmarking have been completed using the hydrologic flow codes SAGUARO, TRUST, and GWVIP; the radionuclide transport codes FEMTRAN and TRUMP; and the coupled flow and transport code TRACR3D. This report presents the results of three cases of the benchmarking problem solved for COVE 1, a comparison of the results, questions raised regarding sensitivities to modeling techniques, and conclusions drawn regarding the status and numerical sensitivities of the codes. 30 refs.« less
Benchmarking for the competitive marketplace.

PubMed

Clarke, R W; Sucher, T O

1999-07-01

One would get little argument these days regarding the importance of performance measurement in the health care industry. The traditional approach has been the straightforward use of measurable units such as financial comparisons and clinical indicators (e.g., length of stay). Also we in the health care industry have traditionally benchmarked our performance and strategies against those most like ourselves. Today's competitive market demands a more customer-focused set of performance measures that go beyond traditional approaches such as customer service. The most important task in today's environment is to study the customers' emerging priorities and adjust our business to meet those priorities.

Comparing Hospital Processes and Outcomes in California Medicare Beneficiaries: Simulation Prompts Reconsideration

PubMed Central

Escobar, Gabriel J; Baker, Jennifer M; Turk, Benjamin J; Draper, David; Liu, Vincent; Kipnis, Patricia

2017-01-01

Introduction This article is not a traditional research report. It describes how conducting a specific set of benchmarking analyses led us to broader reflections on hospital benchmarking. We reexamined an issue that has received far less attention from researchers than in the past: How variations in the hospital admission threshold might affect hospital rankings. Considering this threshold made us reconsider what benchmarking is and what future benchmarking studies might be like. Although we recognize that some of our assertions are speculative, they are based on our reading of the literature and previous and ongoing data analyses being conducted in our research unit. We describe the benchmarking analyses that led to these reflections. Objectives The Centers for Medicare and Medicaid Services’ Hospital Compare Web site includes data on fee-for-service Medicare beneficiaries but does not control for severity of illness, which requires physiologic data now available in most electronic medical records. To address this limitation, we compared hospital processes and outcomes among Kaiser Permanente Northern California’s (KPNC) Medicare Advantage beneficiaries and non-KPNC California Medicare beneficiaries between 2009 and 2010. Methods We assigned a simulated severity of illness measure to each record and explored the effect of having the additional information on outcomes. Results We found that if the admission severity of illness in non-KPNC hospitals increased, KPNC hospitals’ mortality performance would appear worse; conversely, if admission severity at non-KPNC hospitals’ decreased, KPNC hospitals’ performance would appear better. Conclusion Future hospital benchmarking should consider the impact of variation in admission thresholds. PMID:29035176
Benchmarking Heavy Ion Transport Codes FLUKA, HETC-HEDS MARS15, MCNPX, and PHITS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ronningen, Reginald Martin; Remec, Igor; Heilbronn, Lawrence H.

Powerful accelerators such as spallation neutron sources, muon-collider/neutrino facilities, and rare isotope beam facilities must be designed with the consideration that they handle the beam power reliably and safely, and they must be optimized to yield maximum performance relative to their design requirements. The simulation codes used for design purposes must produce reliable results. If not, component and facility designs can become costly, have limited lifetime and usefulness, and could even be unsafe. The objective of this proposal is to assess the performance of the currently available codes PHITS, FLUKA, MARS15, MCNPX, and HETC-HEDS that could be used for designmore » simulations involving heavy ion transport. We plan to access their performance by performing simulations and comparing results against experimental data of benchmark quality. Quantitative knowledge of the biases and the uncertainties of the simulations is essential as this potentially impacts the safe, reliable and cost effective design of any future radioactive ion beam facility. Further benchmarking of heavy-ion transport codes was one of the actions recommended in the Report of the 2003 RIA R&D Workshop".« less
A fast elitism Gaussian estimation of distribution algorithm and application for PID optimization.

PubMed

Xu, Qingyang; Zhang, Chengjin; Zhang, Li

2014-01-01

Estimation of distribution algorithm (EDA) is an intelligent optimization algorithm based on the probability statistics theory. A fast elitism Gaussian estimation of distribution algorithm (FEGEDA) is proposed in this paper. The Gaussian probability model is used to model the solution distribution. The parameters of Gaussian come from the statistical information of the best individuals by fast learning rule. A fast learning rule is used to enhance the efficiency of the algorithm, and an elitism strategy is used to maintain the convergent performance. The performances of the algorithm are examined based upon several benchmarks. In the simulations, a one-dimensional benchmark is used to visualize the optimization process and probability model learning process during the evolution, and several two-dimensional and higher dimensional benchmarks are used to testify the performance of FEGEDA. The experimental results indicate the capability of FEGEDA, especially in the higher dimensional problems, and the FEGEDA exhibits a better performance than some other algorithms and EDAs. Finally, FEGEDA is used in PID controller optimization of PMSM and compared with the classical-PID and GA.
A Fast Elitism Gaussian Estimation of Distribution Algorithm and Application for PID Optimization

PubMed Central

Xu, Qingyang; Zhang, Chengjin; Zhang, Li

2014-01-01

Estimation of distribution algorithm (EDA) is an intelligent optimization algorithm based on the probability statistics theory. A fast elitism Gaussian estimation of distribution algorithm (FEGEDA) is proposed in this paper. The Gaussian probability model is used to model the solution distribution. The parameters of Gaussian come from the statistical information of the best individuals by fast learning rule. A fast learning rule is used to enhance the efficiency of the algorithm, and an elitism strategy is used to maintain the convergent performance. The performances of the algorithm are examined based upon several benchmarks. In the simulations, a one-dimensional benchmark is used to visualize the optimization process and probability model learning process during the evolution, and several two-dimensional and higher dimensional benchmarks are used to testify the performance of FEGEDA. The experimental results indicate the capability of FEGEDA, especially in the higher dimensional problems, and the FEGEDA exhibits a better performance than some other algorithms and EDAs. Finally, FEGEDA is used in PID controller optimization of PMSM and compared with the classical-PID and GA. PMID:24892059
MPI, HPF or OpenMP: A Study with the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Hribar, Michelle; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1999-01-01

Porting applications to new high performance parallel and distributed platforms is a challenging task. Writing parallel code by hand is time consuming and costly, but the task can be simplified by high level languages and would even better be automated by parallelizing tools and compilers. The definition of HPF (High Performance Fortran, based on data parallel model) and OpenMP (based on shared memory parallel model) standards has offered great opportunity in this respect. Both provide simple and clear interfaces to language like FORTRAN and simplify many tedious tasks encountered in writing message passing programs. In our study we implemented the parallel versions of the NAS Benchmarks with HPF and OpenMP directives. Comparison of their performance with the MPI implementation and pros and cons of different approaches will be discussed along with experience of using computer-aided tools to help parallelize these benchmarks. Based on the study,potentials of applying some of the techniques to realistic aerospace applications will be presented
MPI, HPF or OpenMP: A Study with the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Hribar, M.; Waheed, A.; Yan, J.; Saini, Subhash (Technical Monitor)

1999-01-01

Porting applications to new high performance parallel and distributed platforms is a challenging task. Writing parallel code by hand is time consuming and costly, but this task can be simplified by high level languages and would even better be automated by parallelizing tools and compilers. The definition of HPF (High Performance Fortran, based on data parallel model) and OpenMP (based on shared memory parallel model) standards has offered great opportunity in this respect. Both provide simple and clear interfaces to language like FORTRAN and simplify many tedious tasks encountered in writing message passing programs. In our study, we implemented the parallel versions of the NAS Benchmarks with HPF and OpenMP directives. Comparison of their performance with the MPI implementation and pros and cons of different approaches will be discussed along with experience of using computer-aided tools to help parallelize these benchmarks. Based on the study, potentials of applying some of the techniques to realistic aerospace applications will be presented.
A benchmark for statistical microarray data analysis that preserves actual biological and technical variance.

PubMed

De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric

2010-01-11

Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously.We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
A Simulation Environment for Benchmarking Sensor Fusion-Based Pose Estimators.

PubMed

Ligorio, Gabriele; Sabatini, Angelo Maria

2015-12-19

In-depth analysis and performance evaluation of sensor fusion-based estimators may be critical when performed using real-world sensor data. For this reason, simulation is widely recognized as one of the most powerful tools for algorithm benchmarking. In this paper, we present a simulation framework suitable for assessing the performance of sensor fusion-based pose estimators. The systems used for implementing the framework were magnetic/inertial measurement units (MIMUs) and a camera, although the addition of further sensing modalities is straightforward. Typical nuisance factors were also included for each sensor. The proposed simulation environment was validated using real-life sensor data employed for motion tracking. The higher mismatch between real and simulated sensors was about 5% of the measured quantity (for the camera simulation), whereas a lower correlation was found for an axis of the gyroscope (0.90). In addition, a real benchmarking example of an extended Kalman filter for pose estimation from MIMU and camera data is presented.
RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

NASA Astrophysics Data System (ADS)

Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.
Commercial Building Energy Saver, Web App

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, Tianzhen; Piette, Mary; Lee, Sang Hoon

The CBES App is a web-based toolkit for use by small businesses and building owners and operators of small and medium size commercial buildings to perform energy benchmarking and retrofit analysis for buildings. The CBES App analyzes the energy performance of user's building for pre-and posto-retrofit, in conjunction with user's input data, to identify recommended retrofit measures, energy savings and economic analysis for the selected measures. The CBES App provides energy benchmarking, including getting an EnergyStar score using EnergyStar API and benchmarking against California peer buildings using the EnergyIQ API. The retrofit analysis includes a preliminary analysis by looking upmore » retrofit measures from a pre-simulated database DEEP, and a detailed analysis creating and running EnergyPlus models to calculate energy savings of retrofit measures. The CBES App builds upon the LBNL CBES API.« less
Benchmarking study of the MCNP code against cold critical experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sitaraman, S.

1991-01-01

The purpose of this study was to benchmark the widely used Monte Carlo code MCNP against a set of cold critical experiments with a view to using the code as a means of independently verifying the performance of faster but less accurate Monte Carlo and deterministic codes. The experiments simulated consisted of both fast and thermal criticals as well as fuel in a variety of chemical forms. A standard set of benchmark cold critical experiments was modeled. These included the two fast experiments, GODIVA and JEZEBEL, the TRX metallic uranium thermal experiments, the Babcock and Wilcox oxide and mixed oxidemore » experiments, and the Oak Ridge National Laboratory (ORNL) and Pacific Northwest Laboratory (PNL) nitrate solution experiments. The principal case studied was a small critical experiment that was performed with boiling water reactor bundles.« less
Assessing and benchmarking multiphoton microscopes for biologists

PubMed Central

Corbin, Kaitlin; Pinkard, Henry; Peck, Sebastian; Beemiller, Peter; Krummel, Matthew F.

2017-01-01

Multiphoton microscopy has become staple tool for tracking cells within tissues and organs due to superior depth of penetration, low excitation volumes, and reduced phototoxicity. Many factors, ranging from laser pulse width to relay optics to detectors and electronics, contribute to the overall ability of these microscopes to excite and detect fluorescence deep within tissues. However, we have found that there are few standard ways already described in the literature to distinguish between microscopes or to benchmark existing microscopes to measure the overall quality and efficiency of these instruments. Here, we discuss some simple parameters and methods that can either be used within a multiphoton facility or by a prospective purchaser to benchmark performance. This can both assist in identifying decay in microscope performance and in choosing features of a scope that are suited to experimental needs. PMID:24974026
Revel8or: Model Driven Capacity Planning Tool Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Liming; Liu, Yan; Bui, Ngoc B.

2007-05-31

Designing complex multi-tier applications that must meet strict performance requirements is a challenging software engineering problem. Ideally, the application architect could derive accurate performance predictions early in the project life-cycle, leveraging initial application design-level models and a description of the target software and hardware platforms. To this end, we have developed a capacity planning tool suite for component-based applications, called Revel8tor. The tool adheres to the model driven development paradigm and supports benchmarking and performance prediction for J2EE, .Net and Web services platforms. The suite is composed of three different tools: MDAPerf, MDABench and DSLBench. MDAPerf allows annotation of designmore » diagrams and derives performance analysis models. MDABench allows a customized benchmark application to be modeled in the UML 2.0 Testing Profile and automatically generates a deployable application, with measurement automatically conducted. DSLBench allows the same benchmark modeling and generation to be conducted using a simple performance engineering Domain Specific Language (DSL) in Microsoft Visual Studio. DSLBench integrates with Visual Studio and reuses its load testing infrastructure. Together, the tool suite can assist capacity planning across platforms in an automated fashion.« less
Cloud-Based Evaluation of Anatomical Structure Segmentation and Landmark Detection Algorithms: VISCERAL Anatomy Benchmarks.

PubMed

Jimenez-Del-Toro, Oscar; Muller, Henning; Krenn, Markus; Gruenberg, Katharina; Taha, Abdel Aziz; Winterstein, Marianne; Eggel, Ivan; Foncubierta-Rodriguez, Antonio; Goksel, Orcun; Jakab, Andras; Kontokotsios, Georgios; Langs, Georg; Menze, Bjoern H; Salas Fernandez, Tomas; Schaer, Roger; Walleyo, Anna; Weber, Marc-Andre; Dicente Cid, Yashin; Gass, Tobias; Heinrich, Mattias; Jia, Fucang; Kahl, Fredrik; Kechichian, Razmig; Mai, Dominic; Spanier, Assaf B; Vincent, Graham; Wang, Chunliang; Wyeth, Daniel; Hanbury, Allan

2016-11-01

Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease. Automatic tools can help automate parts of this manual process. A cloud-based evaluation framework is presented in this paper including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud where participants can only access the training data and can be run privately by the benchmark administrators to objectively compare their performance in an unseen common test set. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed with automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and Silver Corpus generated with the fusion of the participant algorithms on a larger set of non-manually-annotated medical images are available to the research community.
Deepthi Vaidhynathan | NREL

Science.gov Websites

Complex Systems Simulation and Optimization Group on performance analysis and benchmarking latest . Research Interests High Performance Computing|Embedded System |Microprocessors & Microcontrollers
Performance effects of irregular communications patterns on massively parallel multiprocessors

NASA Technical Reports Server (NTRS)

Saltz, Joel; Petiton, Serge; Berryman, Harry; Rifkin, Adam

1991-01-01

A detailed study of the performance effects of irregular communications patterns on the CM-2 was conducted. The communications capabilities of the CM-2 were characterized under a variety of controlled conditions. In the process of carrying out the performance evaluation, extensive use was made of a parameterized synthetic mesh. In addition, timings with unstructured meshes generated for aerodynamic codes and a set of sparse matrices with banded patterns on non-zeroes were performed. This benchmarking suite stresses the communications capabilities of the CM-2 in a range of different ways. Benchmark results demonstrate that it is possible to make effective use of much of the massive concurrency available in the communications network.
Improving HEI Productivity and Performance through Project Management: Implications from a Benchmarking Case Study

ERIC Educational Resources Information Center

Bryde, David; Leighton, Diana

2009-01-01

As higher education institutions (HEIs) look to be more commercial in their outlook they are likely to become more dependent on the successful implementation of projects. This article reports a benchmarking survey of PM maturity in a HEI, with the purpose of assessing its capability to implement projects. Data were collected via questionnaires…
Highly Enriched Uranium Metal Cylinders Surrounded by Various Reflector Materials

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bernard Jones; J. Blair Briggs; Leland Monteirth

A series of experiments was performed at Los Alamos Scientific Laboratory in 1958 to determine critical masses of cylinders of Oralloy (Oy) reflected by a number of materials. The experiments were all performed on the Comet Universal Critical Assembly Machine, and consisted of discs of highly enriched uranium (93.3 wt.% 235U) reflected by half-inch and one-inch-thick cylindrical shells of various reflector materials. The experiments were performed by members of Group N-2, particularly K. W. Gallup, G. E. Hansen, H. C. Paxton, and R. H. White. This experiment was intended to ascertain critical masses for criticality safety purposes, as well asmore » to compare neutron transport cross sections to those obtained from danger coefficient measurements with the Topsy Oralloy-Tuballoy reflected and Godiva unreflected critical assemblies. The reflector materials examined in this series of experiments are as follows: magnesium, titanium, aluminum, graphite, mild steel, nickel, copper, cobalt, molybdenum, natural uranium, tungsten, beryllium, aluminum oxide, molybdenum carbide, and polythene (polyethylene). Also included are two special configurations of composite beryllium and iron reflectors. Analyses were performed in which uncertainty associated with six different parameters was evaluated; namely, extrapolation to the uranium critical mass, uranium density, 235U enrichment, reflector density, reflector thickness, and reflector impurities. In addition to the idealizations made by the experimenters (removal of the platen and diaphragm), two simplifications were also made to the benchmark models that resulted in a small bias and additional uncertainty. First of all, since impurities in core and reflector materials are only estimated, they are not included in the benchmark models. Secondly, the room, support structure, and other possible surrounding equipment were not included in the model. Bias values that result from these two simplifications were determined and associated uncertainty in the bias values were included in the overall uncertainty in benchmark keff values. Bias values were very small, ranging from 0.0004 ?k low to 0.0007 ?k low. Overall uncertainties range from ? 0.0018 to ? 0.0030. Major contributors to the overall uncertainty include uncertainty in the extrapolation to the uranium critical mass and the uranium density. Results are summarized in Figure 1. Figure 1. Experimental, Benchmark-Model, and MCNP/KENO Calculated Results The 32 configurations described and evaluated under ICSBEP Identifier HEU-MET-FAST-084 are judged to be acceptable for use as criticality safety benchmark experiments and should be valuable integral benchmarks for nuclear data testing of the various reflector materials. Details of the benchmark models, uncertainty analyses, and final results are given in this paper.« less
A new deadlock resolution protocol and message matching algorithm for the extreme-scale simulator

DOE PAGES

Engelmann, Christian; Naughton, III, Thomas J.

2016-03-22

Investigating the performance of parallel applications at scale on future high-performance computing (HPC) architectures and the performance impact of different HPC architecture choices is an important component of HPC hardware/software co-design. The Extreme-scale Simulator (xSim) is a simulation toolkit for investigating the performance of parallel applications at scale. xSim scales to millions of simulated Message Passing Interface (MPI) processes. The overhead introduced by a simulation tool is an important performance and productivity aspect. This paper documents two improvements to xSim: (1)~a new deadlock resolution protocol to reduce the parallel discrete event simulation overhead and (2)~a new simulated MPI message matchingmore » algorithm to reduce the oversubscription management overhead. The results clearly show a significant performance improvement. The simulation overhead for running the NAS Parallel Benchmark suite was reduced from 102% to 0% for the embarrassingly parallel (EP) benchmark and from 1,020% to 238% for the conjugate gradient (CG) benchmark. xSim offers a highly accurate simulation mode for better tracking of injected MPI process failures. Furthermore, with highly accurate simulation, the overhead was reduced from 3,332% to 204% for EP and from 37,511% to 13,808% for CG.« less
Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

NASA Astrophysics Data System (ADS)

Kaskhedikar, Apoorva Prakash

According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United State's energy consumption of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy Benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, correlations which were significant were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers namely, the ENERGY STAR's Portfolio Manager. This tool relies on the standard Linear Regression methods which is only able to handle continuous variables. The model proposed uses data mining technique and was found to perform slightly better than the Portfolio Manager. The broader impacts of the new benchmarking methodology proposed is that it allows for identifying important categorical variables, and then incorporating them in a local, as against a global, model framework for EUI pertinent to the building type. The ability to identify and rank the important variables is of great importance in practical implementation of the benchmarking tools which rely on query-based building and HVAC variable filters specified by the user.

Methodology and issues of integral experiments selection for nuclear data validation

NASA Astrophysics Data System (ADS)

Tatiana, Ivanova; Ivanov, Evgeny; Hill, Ian

2017-09-01

Nuclear data validation involves a large suite of Integral Experiments (IEs) for criticality, reactor physics and dosimetry applications. [1] Often benchmarks are taken from international Handbooks. [2, 3] Depending on the application, IEs have different degrees of usefulness in validation, and usually the use of a single benchmark is not advised; indeed, it may lead to erroneous interpretation and results. [1] This work aims at quantifying the importance of benchmarks used in application dependent cross section validation. The approach is based on well-known General Linear Least Squared Method (GLLSM) extended to establish biases and uncertainties for given cross sections (within a given energy interval). The statistical treatment results in a vector of weighting factors for the integral benchmarks. These factors characterize the value added by a benchmark for nuclear data validation for the given application. The methodology is illustrated by one example, selecting benchmarks for 239Pu cross section validation. The studies were performed in the framework of Subgroup 39 (Methods and approaches to provide feedback from nuclear and covariance data adjustment for improvement of nuclear data files) established at the Working Party on International Nuclear Data Evaluation Cooperation (WPEC) of the Nuclear Science Committee under the Nuclear Energy Agency (NEA/OECD).
Decoys Selection in Benchmarking Datasets: Overview and Perspectives

PubMed Central

Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

2018-01-01

Virtual Screening (VS) is designed to prospectively help identifying potential hits, i.e., compounds capable of interacting with a given target and potentially modulate its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is the most adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compounds subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds that has considerably changed over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoys selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets. PMID:29416509
Electric load shape benchmarking for small- and medium-sized commercial buildings

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Xuan; Hong, Tianzhen; Chen, Yixing

Small- and medium-sized commercial buildings owners and utility managers often look for opportunities for energy cost savings through energy efficiency and energy waste minimization. However, they currently lack easy access to low-cost tools that help interpret the massive amount of data needed to improve understanding of their energy use behaviors. Benchmarking is one of the techniques used in energy audits to identify which buildings are priorities for an energy analysis. Traditional energy performance indicators, such as the energy use intensity (annual energy per unit of floor area), consider only the total annual energy consumption, lacking consideration of the fluctuation ofmore » energy use behavior over time, which reveals the time of use information and represents distinct energy use behaviors during different time spans. To fill the gap, this study developed a general statistical method using 24-hour electric load shape benchmarking to compare a building or business/tenant space against peers. Specifically, the study developed new forms of benchmarking metrics and data analysis methods to infer the energy performance of a building based on its load shape. We first performed a data experiment with collected smart meter data using over 2,000 small- and medium-sized businesses in California. We then conducted a cluster analysis of the source data, and determined and interpreted the load shape features and parameters with peer group analysis. Finally, we implemented the load shape benchmarking feature in an open-access web-based toolkit (the Commercial Building Energy Saver) to provide straightforward and practical recommendations to users. The analysis techniques were generic and flexible for future datasets of other building types and in other utility territories.« less
Electric load shape benchmarking for small- and medium-sized commercial buildings

DOE PAGES

Luo, Xuan; Hong, Tianzhen; Chen, Yixing; ...

2017-07-28

Small- and medium-sized commercial buildings owners and utility managers often look for opportunities for energy cost savings through energy efficiency and energy waste minimization. However, they currently lack easy access to low-cost tools that help interpret the massive amount of data needed to improve understanding of their energy use behaviors. Benchmarking is one of the techniques used in energy audits to identify which buildings are priorities for an energy analysis. Traditional energy performance indicators, such as the energy use intensity (annual energy per unit of floor area), consider only the total annual energy consumption, lacking consideration of the fluctuation ofmore » energy use behavior over time, which reveals the time of use information and represents distinct energy use behaviors during different time spans. To fill the gap, this study developed a general statistical method using 24-hour electric load shape benchmarking to compare a building or business/tenant space against peers. Specifically, the study developed new forms of benchmarking metrics and data analysis methods to infer the energy performance of a building based on its load shape. We first performed a data experiment with collected smart meter data using over 2,000 small- and medium-sized businesses in California. We then conducted a cluster analysis of the source data, and determined and interpreted the load shape features and parameters with peer group analysis. Finally, we implemented the load shape benchmarking feature in an open-access web-based toolkit (the Commercial Building Energy Saver) to provide straightforward and practical recommendations to users. The analysis techniques were generic and flexible for future datasets of other building types and in other utility territories.« less
GROWTH OF THE INTERNATIONAL CRITICALITY SAFETY AND REACTOR PHYSICS EXPERIMENT EVALUATION PROJECTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. Blair Briggs; John D. Bess; Jim Gulliford

2011-09-01

Since the International Conference on Nuclear Criticality Safety (ICNC) 2007, the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and the International Reactor Physics Experiment Evaluation Project (IRPhEP) have continued to expand their efforts and broaden their scope. Eighteen countries participated on the ICSBEP in 2007. Now, there are 20, with recent contributions from Sweden and Argentina. The IRPhEP has also expanded from eight contributing countries in 2007 to 16 in 2011. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Criticality Safety Benchmark Experiments1' have increased from 442 evaluations (38000 pages), containing benchmark specifications for 3955 critical ormore » subcritical configurations to 516 evaluations (nearly 55000 pages), containing benchmark specifications for 4405 critical or subcritical configurations in the 2010 Edition of the ICSBEP Handbook. The contents of the Handbook have also increased from 21 to 24 criticality-alarm-placement/shielding configurations with multiple dose points for each, and from 20 to 200 configurations categorized as fundamental physics measurements relevant to criticality safety applications. Approximately 25 new evaluations and 150 additional configurations are expected to be added to the 2011 edition of the Handbook. Since ICNC 2007, the contents of the 'International Handbook of Evaluated Reactor Physics Benchmark Experiments2' have increased from 16 different experimental series that were performed at 12 different reactor facilities to 53 experimental series that were performed at 30 different reactor facilities in the 2011 edition of the Handbook. Considerable effort has also been made to improve the functionality of the searchable database, DICE (Database for the International Criticality Benchmark Evaluation Project) and verify the accuracy of the data contained therein. DICE will be discussed in separate papers at ICNC 2011. The status of the ICSBEP and the IRPhEP will be discussed in the full paper, selected benchmarks that have been added to the ICSBEP Handbook will be highlighted, and a preview of the new benchmarks that will appear in the September 2011 edition of the Handbook will be provided. Accomplishments of the IRPhEP will also be highlighted and the future of both projects will be discussed. REFERENCES (1) International Handbook of Evaluated Criticality Safety Benchmark Experiments, NEA/NSC/DOC(95)03/I-IX, Organisation for Economic Co-operation and Development-Nuclear Energy Agency (OECD-NEA), September 2010 Edition, ISBN 978-92-64-99140-8. (2) International Handbook of Evaluated Reactor Physics Benchmark Experiments, NEA/NSC/DOC(2006)1, Organisation for Economic Co-operation and Development-Nuclear Energy Agency (OECD-NEA), March 2011 Edition, ISBN 978-92-64-99141-5.« less
Benchmark for Peak Detection Algorithms in Fiber Bragg Grating Interrogation and a New Neural Network for its Performance Improvement

PubMed Central

Negri, Lucas; Nied, Ademir; Kalinowski, Hypolito; Paterno, Aleksander

2011-01-01

This paper presents a benchmark for peak detection algorithms employed in fiber Bragg grating spectrometric interrogation systems. The accuracy, precision, and computational performance of currently used algorithms and those of a new proposed artificial neural network algorithm are compared. Centroid and gaussian fitting algorithms are shown to have the highest precision but produce systematic errors that depend on the FBG refractive index modulation profile. The proposed neural network displays relatively good precision with reduced systematic errors and improved computational performance when compared to other networks. Additionally, suitable algorithms may be chosen with the general guidelines presented. PMID:22163806
Polarization Control with Piezoelectric and LiNbO3 Transducers

NASA Astrophysics Data System (ADS)

Bradley, E.; Miles, E.; Loginov, B.; Vu, N.

Several Polarization control transducers have appeared on the market, and now automated, endless polarization control systems using these transducers are becoming available. Unfortunately it is not entirely clear what benchmark performance tests a polarization control system must pass, and the polarization disturbances a system must handle are open to some debate. We present quantitative measurements of realistic polarization disturbances and two benchmark tests we have successfully used to evaluate the performance of an automated, endless polarization control system. We use these tests to compare the performance of a system using piezoelectric transducers to that of a system using LiNbO3 transducers.
Note: The performance of new density functionals for a recent blind test of non-covalent interactions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mardirossian, Narbe; Head-Gordon, Martin

Benchmark datasets of non-covalent interactions are essential for assessing the performance of density functionals and other quantum chemistry approaches. In a recent blind test, Taylor et al. benchmarked 14 methods on a new dataset consisting of 10 dimer potential energy curves calculated using coupled cluster with singles, doubles, and perturbative triples (CCSD(T)) at the complete basis set (CBS) limit (80 data points in total). Finally, the dataset is particularly interesting because compressed, near-equilibrium, and stretched regions of the potential energy surface are extensively sampled.
Better Medicare Cost Report data are needed to help hospitals benchmark costs and performance.

PubMed

Magnus, S A; Smith, D G

2000-01-01

To evaluate costs and achieve cost control in the face of new technology and demands for efficiency from both managed care and governmental payers, hospitals need to benchmark their costs against those of other comparable hospitals. Since they typically use Medicare Cost Report (MCR) data for this purpose, a variety of cost accounting problems with the MCR may hamper hospitals' understanding of their relative costs and performance. Managers and researchers alike need to investigate the validity, accuracy, and timeliness of the MCR's cost accounting data.
Note: The performance of new density functionals for a recent blind test of non-covalent interactions

DOE PAGES

Mardirossian, Narbe; Head-Gordon, Martin

2016-11-09

Benchmark datasets of non-covalent interactions are essential for assessing the performance of density functionals and other quantum chemistry approaches. In a recent blind test, Taylor et al. benchmarked 14 methods on a new dataset consisting of 10 dimer potential energy curves calculated using coupled cluster with singles, doubles, and perturbative triples (CCSD(T)) at the complete basis set (CBS) limit (80 data points in total). Finally, the dataset is particularly interesting because compressed, near-equilibrium, and stretched regions of the potential energy surface are extensively sampled.
Revenue cycle management.

PubMed

Manley, Ray; Satiani, Bhagwan

2009-11-01

With the widening gap between overhead expenses and reimbursement, management of the revenue cycle is a critical part of a successful vascular surgery practice. It is important to review the data on all the components of the revenue cycle: payer contracting, appointment scheduling, preregistration, registration process, coding and capturing charges, proper billing of patients and insurers, follow-up of accounts receivable, and finally using appropriate benchmarking. The industry benchmarks used should be those of peers in identical groups. Warning signs of poor performance are discussed enabling the practice to formulate a performance improvement plan.
An automated benchmarking platform for MHC class II binding prediction methods.

PubMed

Andreatta, Massimo; Trolle, Thomas; Yan, Zhen; Greenbaum, Jason A; Peters, Bjoern; Nielsen, Morten

2018-05-01

Computational methods for the prediction of peptide-MHC binding have become an integral and essential component for candidate selection in experimental T cell epitope discovery studies. The sheer amount of published prediction methods-and often discordant reports on their performance-poses a considerable quandary to the experimentalist who needs to choose the best tool for their research. With the goal to provide an unbiased, transparent evaluation of the state-of-the-art in the field, we created an automated platform to benchmark peptide-MHC class II binding prediction tools. The platform evaluates the absolute and relative predictive performance of all participating tools on data newly entered into the Immune Epitope Database (IEDB) before they are made public, thereby providing a frequent, unbiased assessment of available prediction tools. The benchmark runs on a weekly basis, is fully automated, and displays up-to-date results on a publicly accessible website. The initial benchmark described here included six commonly used prediction servers, but other tools are encouraged to join with a simple sign-up procedure. Performance evaluation on 59 data sets composed of over 10 000 binding affinity measurements suggested that NetMHCIIpan is currently the most accurate tool, followed by NN-align and the IEDB consensus method. Weekly reports on the participating methods can be found online at: http://tools.iedb.org/auto_bench/mhcii/weekly/. mniel@bioinformatics.dtu.dk. Supplementary data are available at Bioinformatics online.
Benchmark CCSD(T) and DFT study of binding energies in Be7 - 12: in search of reliable DFT functional for beryllium clusters

NASA Astrophysics Data System (ADS)

Labanc, Daniel; Šulka, Martin; Pitoňák, Michal; Černušák, Ivan; Urban, Miroslav; Neogrády, Pavel

2018-05-01

We present a computational study of the stability of small homonuclear beryllium clusters Be7 - 12 in singlet electronic states. Our predictions are based on highly correlated CCSD(T) coupled cluster calculations. Basis set convergence towards the complete basis set limit as well as the role of the 1s core electron correlation are carefully examined. Our CCSD(T) data for binding energies of Be7 - 12 clusters serve as a benchmark for performance assessment of several density functional theory (DFT) methods frequently used in beryllium cluster chemistry. We observe that, from Be10 clusters on, the deviation from CCSD(T) benchmarks is stable with respect to size, and fluctuating within 0.02 eV error bar for most examined functionals. This opens up the possibility of scaling the DFT binding energies for large Be clusters using CCSD(T) benchmark values for smaller clusters. We also tried to find analogies between the performance of DFT functionals for Be clusters and for the valence-isoelectronic Mg clusters investigated recently in Truhlar's group. We conclude that it is difficult to find DFT functionals that perform reasonably well for both beryllium and magnesium clusters. Out of 12 functionals examined, only the M06-2X functional gives reasonably accurate and balanced binding energies for both Be and Mg clusters.
Optimized selection of benchmark test parameters for image watermark algorithms based on Taguchi methods and corresponding influence on design decisions for real-world applications

NASA Astrophysics Data System (ADS)

Rodriguez, Tony F.; Cushman, David A.

2003-06-01

With the growing commercialization of watermarking techniques in various application scenarios it has become increasingly important to quantify the performance of watermarking products. The quantification of relative merits of various products is not only essential in enabling further adoption of the technology by society as a whole, but will also drive the industry to develop testing plans/methodologies to ensure quality and minimize cost (to both vendors & customers.) While the research community understands the theoretical need for a publicly available benchmarking system to quantify performance, there has been less discussion on the practical application of these systems. By providing a standard set of acceptance criteria, benchmarking systems can dramatically increase the quality of a particular watermarking solution, validating the product performances if they are used efficiently and frequently during the design process. In this paper we describe how to leverage specific design of experiments techniques to increase the quality of a watermarking scheme, to be used with the benchmark tools being developed by the Ad-Hoc Watermark Verification Group. A Taguchi Loss Function is proposed for an application and orthogonal arrays used to isolate optimal levels for a multi-factor experimental situation. Finally, the results are generalized to a population of cover works and validated through an exhaustive test.
PFLOTRAN Verification: Development of a Testing Suite to Ensure Software Quality

NASA Astrophysics Data System (ADS)

Hammond, G. E.; Frederick, J. M.

2016-12-01

In scientific computing, code verification ensures the reliability and numerical accuracy of a model simulation by comparing the simulation results to experimental data or known analytical solutions. The model is typically defined by a set of partial differential equations with initial and boundary conditions, and verification ensures whether the mathematical model is solved correctly by the software. Code verification is especially important if the software is used to model high-consequence systems which cannot be physically tested in a fully representative environment [Oberkampf and Trucano (2007)]. Justified confidence in a particular computational tool requires clarity in the exercised physics and transparency in its verification process with proper documentation. We present a quality assurance (QA) testing suite developed by Sandia National Laboratories that performs code verification for PFLOTRAN, an open source, massively-parallel subsurface simulator. PFLOTRAN solves systems of generally nonlinear partial differential equations describing multiphase, multicomponent and multiscale reactive flow and transport processes in porous media. PFLOTRAN's QA test suite compares the numerical solutions of benchmark problems in heat and mass transport against known, closed-form, analytical solutions, including documentation of the exercised physical process models implemented in each PFLOTRAN benchmark simulation. The QA test suite development strives to follow the recommendations given by Oberkampf and Trucano (2007), which describes four essential elements in high-quality verification benchmark construction: (1) conceptual description, (2) mathematical description, (3) accuracy assessment, and (4) additional documentation and user information. Several QA tests within the suite will be presented, including details of the benchmark problems and their closed-form analytical solutions, implementation of benchmark problems in PFLOTRAN simulations, and the criteria used to assess PFLOTRAN's performance in the code verification procedure. References Oberkampf, W. L., and T. G. Trucano (2007), Verification and Validation Benchmarks, SAND2007-0853, 67 pgs., Sandia National Laboratories, Albuquerque, NM.
Optimally stopped variational quantum algorithms

NASA Astrophysics Data System (ADS)

Vinci, Walter; Shabani, Alireza

2018-04-01

Quantum processors promise a paradigm shift in high-performance computing which needs to be assessed by accurate benchmarking measures. In this article, we introduce a benchmark for the variational quantum algorithm (VQA), recently proposed as a heuristic algorithm for small-scale quantum processors. In VQA, a classical optimization algorithm guides the processor's quantum dynamics to yield the best solution for a given problem. A complete assessment of the scalability and competitiveness of VQA should take into account both the quality and the time of dynamics optimization. The method of optimal stopping, employed here, provides such an assessment by explicitly including time as a cost factor. Here, we showcase this measure for benchmarking VQA as a solver for some quadratic unconstrained binary optimization. Moreover, we show that a better choice for the cost function of the classical routine can significantly improve the performance of the VQA algorithm and even improve its scaling properties.
An Integrated Development Environment for Adiabatic Quantum Programming

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humble, Travis S; McCaskey, Alex; Bennink, Ryan S

2014-01-01

Adiabatic quantum computing is a promising route to the computational power afforded by quantum information processing. The recent availability of adiabatic hardware raises the question of how well quantum programs perform. Benchmarking behavior is challenging since the multiple steps to synthesize an adiabatic quantum program are highly tunable. We present an adiabatic quantum programming environment called JADE that provides control over all the steps taken during program development. JADE captures the workflow needed to rigorously benchmark performance while also allowing a variety of problem types, programming techniques, and processor configurations. We have also integrated JADE with a quantum simulation enginemore » that enables program profiling using numerical calculation. The computational engine supports plug-ins for simulation methodologies tailored to various metrics and computing resources. We present the design, integration, and deployment of JADE and discuss its use for benchmarking adiabatic quantum programs.« less
Benchmarking Memory Performance with the Data Cube Operator

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Shabanov, Leonid V.

2004-01-01

Data movement across a computer memory hierarchy and across computational grids is known to be a limiting factor for applications processing large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark capabilities of computers and of computational grids to handle large distributed data sets. We present a prototype implementation of a parallel algorithm for computation of the operatol: The algorithm follows a known approach for computing views from the smallest parent. The ADC stresses all levels of grid memory and storage by producing some of 2d views of an Arithmetic Data Set of d-tuples described by a small number of integers. We control data intensity of the ADC by selecting the tuple parameters, the sizes of the views, and the number of realized views. Benchmarking results of memory performance of a number of computer architectures and of a small computational grid are presented.
Space network scheduling benchmark: A proof-of-concept process for technology transfer

NASA Technical Reports Server (NTRS)

Moe, Karen; Happell, Nadine; Hayden, B. J.; Barclay, Cathy

1993-01-01

This paper describes a detailed proof-of-concept activity to evaluate flexible scheduling technology as implemented in the Request Oriented Scheduling Engine (ROSE) and applied to Space Network (SN) scheduling. The criteria developed for an operational evaluation of a reusable scheduling system is addressed including a methodology to prove that the proposed system performs at least as well as the current system in function and performance. The improvement of the new technology must be demonstrated and evaluated against the cost of making changes. Finally, there is a need to show significant improvement in SN operational procedures. Successful completion of a proof-of-concept would eventually lead to an operational concept and implementation transition plan, which is outside the scope of this paper. However, a high-fidelity benchmark using actual SN scheduling requests has been designed to test the ROSE scheduling tool. The benchmark evaluation methodology, scheduling data, and preliminary results are described.
The impact of a scheduling change on ninth grade high school performance on biology benchmark exams and the California Standards Test

NASA Astrophysics Data System (ADS)

Leonardi, Marcelo

The primary purpose of this study was to examine the impact of a scheduling change from a trimester 4x4 block schedule to a modified hybrid schedule on student achievement in ninth grade biology courses. This study examined the impact of the scheduling change on student achievement through teacher created benchmark assessments in Genetics, DNA, and Evolution and on the California Standardized Test in Biology. The secondary purpose of this study examined the ninth grade biology teacher perceptions of ninth grade biology student achievement. Using a mixed methods research approach, data was collected both quantitatively and qualitatively as aligned to research questions. Quantitative methods included gathering data from departmental benchmark exams and California Standardized Test in Biology and conducting multiple analysis of covariance and analysis of covariance to determine significance differences. Qualitative methods include journal entries questions and focus group interviews. The results revealed a statistically significant increase in scores on both the DNA and Evolution benchmark exams. DNA and Evolution benchmark exams showed significant improvements from a change in scheduling format. The scheduling change was responsible for 1.5% of the increase in DNA benchmark scores and 2% of the increase in Evolution benchmark scores. The results revealed a statistically significant decrease in scores on the Genetics Benchmark exam as a result of the scheduling change. The scheduling change was responsible for 1% of the decrease in Genetics benchmark scores. The results also revealed a statistically significant increase in scores on the CST Biology exam. The scheduling change was responsible for .7% of the increase in CST Biology scores. Results of the focus group discussions indicated that all teachers preferred the modified hybrid schedule over the trimester schedule and that it improved student achievement.

Elementary School Students' Science Talk Ability in Inquiry-Oriented Settings in Taiwan: Test Development, Verification, and Performance Benchmarks

ERIC Educational Resources Information Center

Lin, Sheau-Wen; Liu, Yu; Chen, Shin-Feng; Wang, Jing-Ru; Kao, Huey-Lien

2016-01-01

The purpose of this study was to develop a computer-based measure of elementary students' science talk and to report students' benchmarks. The development procedure had three steps: defining the framework of the test, collecting and identifying key reference sets of science talk, and developing and verifying the science talk instrument. The…
75 FR 43554 - Notice of Lodging of Consent Decree Under the Federal Water Pollution Control Act (“Clean Water...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-07-26

... Benchmark Engineering Corp., Civil Action No. 10-40131 was lodged with the United States District Court for... requires Defendants to pay a civil penalty of $150,000, perform a Supplemental Environmental Project, and.... Fafard Real Estate and Development Corp., FRE Building Co. Inc., and Benchmark Engineering Corp., D.J...
PSO algorithm enhanced with Lozi Chaotic Map - Tuning experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pluhacek, Michal; Senkerik, Roman; Zelinka, Ivan

2015-03-10

In this paper it is investigated the effect of tuning of control parameters of the Lozi Chaotic Map employed as a chaotic pseudo-random number generator for the particle swarm optimization algorithm. Three different benchmark functions are selected from the IEEE CEC 2013 competition benchmark set. The Lozi map is extensively tuned and the performance of PSO is evaluated.
Design and development of a community carbon cycle benchmarking system for CMIP5 models

NASA Astrophysics Data System (ADS)

Mu, M.; Hoffman, F. M.; Lawrence, D. M.; Riley, W. J.; Keppel-Aleks, G.; Randerson, J. T.

2013-12-01

Benchmarking has been widely used to assess the ability of atmosphere, ocean, sea ice, and land surface models to capture the spatial and temporal variability of observations during the historical period. For the carbon cycle and terrestrial ecosystems, the design and development of an open-source community platform has been an important goal as part of the International Land Model Benchmarking (ILAMB) project. Here we designed and developed a software system that enables the user to specify the models, benchmarks, and scoring systems so that results can be tailored to specific model intercomparison projects. We used this system to evaluate the performance of CMIP5 Earth system models (ESMs). Our scoring system used information from four different aspects of climate, including the climatological mean spatial pattern of gridded surface variables, seasonal cycle dynamics, the amplitude of interannual variability, and long-term decadal trends. We used this system to evaluate burned area, global biomass stocks, net ecosystem exchange, gross primary production, and ecosystem respiration from CMIP5 historical simulations. Initial results indicated that the multi-model mean often performed better than many of the individual models for most of the observational constraints.
Toward Automated Benchmarking of Atomistic Force Fields: Neat Liquid Densities and Static Dielectric Constants from the ThermoML Data Archive.

PubMed

Beauchamp, Kyle A; Behr, Julie M; Rustenburg, Ariën S; Bayly, Christopher I; Kroenlein, Kenneth; Chodera, John D

2015-10-08

Atomistic molecular simulations are a powerful way to make quantitative predictions, but the accuracy of these predictions depends entirely on the quality of the force field employed. Although experimental measurements of fundamental physical properties offer a straightforward approach for evaluating force field quality, the bulk of this information has been tied up in formats that are not machine-readable. Compiling benchmark data sets of physical properties from non-machine-readable sources requires substantial human effort and is prone to the accumulation of human errors, hindering the development of reproducible benchmarks of force-field accuracy. Here, we examine the feasibility of benchmarking atomistic force fields against the NIST ThermoML data archive of physicochemical measurements, which aggregates thousands of experimental measurements in a portable, machine-readable, self-annotating IUPAC-standard format. As a proof of concept, we present a detailed benchmark of the generalized Amber small-molecule force field (GAFF) using the AM1-BCC charge model against experimental measurements (specifically, bulk liquid densities and static dielectric constants at ambient pressure) automatically extracted from the archive and discuss the extent of data available for use in larger scale (or continuously performed) benchmarks. The results of even this limited initial benchmark highlight a general problem with fixed-charge force fields in the representation low-dielectric environments, such as those seen in binding cavities or biological membranes.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bailey, David H.

The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, althoughmore » the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, LeoDagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.« less
Career performance trajectories of Olympic swimmers: benchmarks for talent development.

PubMed

Allen, Sian V; Vandenbogaerde, Tom J; Hopkins, William G

2014-01-01

The age-related progression of elite athletes to their career-best performances can provide benchmarks for talent development. The purpose of this study was to model career performance trajectories of Olympic swimmers to develop these benchmarks. We searched the Web for annual best times of swimmers who were top 16 in pool events at the 2008 or 2012 Olympics, from each swimmer's earliest available competitive performance through to 2012. There were 6959 times in the 13 events for each sex, for 683 swimmers, with 10 ± 3 performances per swimmer (mean ± s). Progression to peak performance was tracked with individual quadratic trajectories derived using a mixed linear model that included adjustments for better performance in Olympic years and for the use of full-body polyurethane swimsuits in 2009. Analysis of residuals revealed appropriate fit of quadratic trends to the data. The trajectories provided estimates of age of peak performance and the duration of the age window of trivial improvement and decline around the peak. Men achieved peak performance later than women (24.2 ± 2.1 vs. 22.5 ± 2.4 years), while peak performance occurred at later ages for the shorter distances for both sexes (∼1.5-2.0 years between sprint and distance-event groups). Men and women had a similar duration in the peak-performance window (2.6 ± 1.5 years) and similar progressions to peak performance over four years (2.4 ± 1.2%) and eight years (9.5 ± 4.8%). These data provide performance targets for swimmers aiming to achieve elite-level performance.
Validating the performance of correlated fission multiplicity implementation in radiation transport codes with subcritical neutron multiplication benchmark experiments

DOE PAGES

Arthur, Jennifer; Bahran, Rian; Hutchinson, Jesson; ...

2018-06-14

Historically, radiation transport codes have uncorrelated fission emissions. In reality, the particles emitted by both spontaneous and induced fissions are correlated in time, energy, angle, and multiplicity. This work validates the performance of various current Monte Carlo codes that take into account the underlying correlated physics of fission neutrons, specifically neutron multiplicity distributions. The performance of 4 Monte Carlo codes - MCNP®6.2, MCNP®6.2/FREYA, MCNP®6.2/CGMF, and PoliMi - was assessed using neutron multiplicity benchmark experiments. In addition, MCNP®6.2 simulations were run using JEFF-3.2 and JENDL-4.0, rather than ENDF/B-VII.1, data for 239Pu and 240Pu. The sensitive benchmark parameters that in this workmore » represent the performance of each correlated fission multiplicity Monte Carlo code include the singles rate, the doubles rate, leakage multiplication, and Feynman histograms. Although it is difficult to determine which radiation transport code shows the best overall performance in simulating subcritical neutron multiplication inference benchmark measurements, it is clear that correlations exist between the underlying nuclear data utilized by (or generated by) the various codes, and the correlated neutron observables of interest. This could prove useful in nuclear data validation and evaluation applications, in which a particular moment of the neutron multiplicity distribution is of more interest than the other moments. It is also quite clear that, because transport is handled by MCNP®6.2 in 3 of the 4 codes, with the 4th code (PoliMi) being based on an older version of MCNP®, the differences in correlated neutron observables of interest are most likely due to the treatment of fission event generation in each of the different codes, as opposed to the radiation transport.« less
Validating the performance of correlated fission multiplicity implementation in radiation transport codes with subcritical neutron multiplication benchmark experiments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arthur, Jennifer; Bahran, Rian; Hutchinson, Jesson

Historically, radiation transport codes have uncorrelated fission emissions. In reality, the particles emitted by both spontaneous and induced fissions are correlated in time, energy, angle, and multiplicity. This work validates the performance of various current Monte Carlo codes that take into account the underlying correlated physics of fission neutrons, specifically neutron multiplicity distributions. The performance of 4 Monte Carlo codes - MCNP®6.2, MCNP®6.2/FREYA, MCNP®6.2/CGMF, and PoliMi - was assessed using neutron multiplicity benchmark experiments. In addition, MCNP®6.2 simulations were run using JEFF-3.2 and JENDL-4.0, rather than ENDF/B-VII.1, data for 239Pu and 240Pu. The sensitive benchmark parameters that in this workmore » represent the performance of each correlated fission multiplicity Monte Carlo code include the singles rate, the doubles rate, leakage multiplication, and Feynman histograms. Although it is difficult to determine which radiation transport code shows the best overall performance in simulating subcritical neutron multiplication inference benchmark measurements, it is clear that correlations exist between the underlying nuclear data utilized by (or generated by) the various codes, and the correlated neutron observables of interest. This could prove useful in nuclear data validation and evaluation applications, in which a particular moment of the neutron multiplicity distribution is of more interest than the other moments. It is also quite clear that, because transport is handled by MCNP®6.2 in 3 of the 4 codes, with the 4th code (PoliMi) being based on an older version of MCNP®, the differences in correlated neutron observables of interest are most likely due to the treatment of fission event generation in each of the different codes, as opposed to the radiation transport.« less
Performance of exchange-correlation functionals in density functional theory calculations for liquid metal: A benchmark test for sodium.

PubMed

Han, Jeong-Hwan; Oda, Takuji

2018-04-14

The performance of exchange-correlation functionals in density-functional theory (DFT) calculations for liquid metal has not been sufficiently examined. In the present study, benchmark tests of Perdew-Burke-Ernzerhof (PBE), Armiento-Mattsson 2005 (AM05), PBE re-parameterized for solids, and local density approximation (LDA) functionals are conducted for liquid sodium. The pair correlation function, equilibrium atomic volume, bulk modulus, and relative enthalpy are evaluated at 600 K and 1000 K. Compared with the available experimental data, the errors range from -11.2% to 0.0% for the atomic volume, from -5.2% to 22.0% for the bulk modulus, and from -3.5% to 2.5% for the relative enthalpy depending on the DFT functional. The generalized gradient approximation functionals are superior to the LDA functional, and the PBE and AM05 functionals exhibit the best performance. In addition, we assess whether the error tendency in liquid simulations is comparable to that in solid simulations, which would suggest that the atomic volume and relative enthalpy performances are comparable between solid and liquid states but that the bulk modulus performance is not. These benchmark test results indicate that the results of liquid simulations are significantly dependent on the exchange-correlation functional and that the DFT functional performance in solid simulations can be used to roughly estimate the performance in liquid simulations.
Performance of exchange-correlation functionals in density functional theory calculations for liquid metal: A benchmark test for sodium

NASA Astrophysics Data System (ADS)

Han, Jeong-Hwan; Oda, Takuji

2018-04-01

The performance of exchange-correlation functionals in density-functional theory (DFT) calculations for liquid metal has not been sufficiently examined. In the present study, benchmark tests of Perdew-Burke-Ernzerhof (PBE), Armiento-Mattsson 2005 (AM05), PBE re-parameterized for solids, and local density approximation (LDA) functionals are conducted for liquid sodium. The pair correlation function, equilibrium atomic volume, bulk modulus, and relative enthalpy are evaluated at 600 K and 1000 K. Compared with the available experimental data, the errors range from -11.2% to 0.0% for the atomic volume, from -5.2% to 22.0% for the bulk modulus, and from -3.5% to 2.5% for the relative enthalpy depending on the DFT functional. The generalized gradient approximation functionals are superior to the LDA functional, and the PBE and AM05 functionals exhibit the best performance. In addition, we assess whether the error tendency in liquid simulations is comparable to that in solid simulations, which would suggest that the atomic volume and relative enthalpy performances are comparable between solid and liquid states but that the bulk modulus performance is not. These benchmark test results indicate that the results of liquid simulations are significantly dependent on the exchange-correlation functional and that the DFT functional performance in solid simulations can be used to roughly estimate the performance in liquid simulations.
Information processing using a single dynamical node as complex system

PubMed Central

Appeltant, L.; Soriano, M.C.; Van der Sande, G.; Danckaert, J.; Massar, S.; Dambre, J.; Schrauwen, B.; Mirasso, C.R.; Fischer, I.

2011-01-01

Novel methods for information processing are highly desired in our information-driven society. Inspired by the brain's ability to process information, the recently introduced paradigm known as 'reservoir computing' shows that complex networks can efficiently perform computation. Here we introduce a novel architecture that reduces the usually required large number of elements to a single nonlinear node with delayed feedback. Through an electronic implementation, we experimentally and numerically demonstrate excellent performance in a speech recognition benchmark. Complementary numerical studies also show excellent performance for a time series prediction benchmark. These results prove that delay-dynamical systems, even in their simplest manifestation, can perform efficient information processing. This finding paves the way to feasible and resource-efficient technological implementations of reservoir computing. PMID:21915110
Performance and scalability evaluation of "Big Memory" on Blue Gene Linux.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoshii, K.; Iskra, K.; Naik, H.

2011-05-01

We address memory performance issues observed in Blue Gene Linux and discuss the design and implementation of 'Big Memory' - an alternative, transparent memory space introduced to eliminate the memory performance issues. We evaluate the performance of Big Memory using custom memory benchmarks, NAS Parallel Benchmarks, and the Parallel Ocean Program, at a scale of up to 4,096 nodes. We find that Big Memory successfully resolves the performance issues normally encountered in Blue Gene Linux. For the ocean simulation program, we even find that Linux with Big Memory provides better scalability than does the lightweight compute node kernel designed solelymore » for high-performance applications. Originally intended exclusively for compute node tasks, our new memory subsystem dramatically improves the performance of certain I/O node applications as well. We demonstrate this performance using the central processor of the LOw Frequency ARray radio telescope as an example.« less
High Performance Computing at NASA

NASA Technical Reports Server (NTRS)

Bailey, David H.; Cooper, D. M. (Technical Monitor)

1994-01-01

The speaker will give an overview of high performance computing in the U.S. in general and within NASA in particular, including a description of the recently signed NASA-IBM cooperative agreement. The latest performance figures of various parallel systems on the NAS Parallel Benchmarks will be presented. The speaker was one of the authors of the NAS (National Aerospace Standards) Parallel Benchmarks, which are now widely cited in the industry as a measure of sustained performance on realistic high-end scientific applications. It will be shown that significant progress has been made by the highly parallel supercomputer industry during the past year or so, with several new systems, based on high-performance RISC processors, that now deliver superior performance per dollar compared to conventional supercomputers. Various pitfalls in reporting performance will be discussed. The speaker will then conclude by assessing the general state of the high performance computing field.
Benchmarking multimedia performance

NASA Astrophysics Data System (ADS)

Zandi, Ahmad; Sudharsanan, Subramania I.

1998-03-01

With the introduction of faster processors and special instruction sets tailored to multimedia, a number of exciting applications are now feasible on the desktops. Among these is the DVD playback consisting, among other things, of MPEG-2 video and Dolby digital audio or MPEG-2 audio. Other multimedia applications such as video conferencing and speech recognition are also becoming popular on computer systems. In view of this tremendous interest in multimedia, a group of major computer companies have formed, Multimedia Benchmarks Committee as part of Standard Performance Evaluation Corp. to address the performance issues of multimedia applications. The approach is multi-tiered with three tiers of fidelity from minimal to full compliant. In each case the fidelity of the bitstream reconstruction as well as quality of the video or audio output are measured and the system is classified accordingly. At the next step the performance of the system is measured. In many multimedia applications such as the DVD playback the application needs to be run at a specific rate. In this case the measurement of the excess processing power, makes all the difference. All these make a system level, application based, multimedia benchmark very challenging. Several ideas and methodologies for each aspect of the problems will be presented and analyzed.
A comparison of common programming languages used in bioinformatics.

PubMed

Fourment, Mathieu; Gillings, Michael R

2008-02-05

The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.
Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool

PubMed Central

Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi

2016-01-01

Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405
Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

PubMed

Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

2015-11-01

Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.
Benchmarking government action for obesity prevention--an innovative advocacy strategy.

PubMed

Martin, J; Peeters, A; Honisett, S; Mavoa, H; Swinburn, B; de Silva-Sanigorski, A

2014-01-01

Successful obesity prevention will require a leading role for governments, but internationally they have been slow to act. League tables of benchmark indicators of action can be a valuable advocacy and evaluation tool. To develop a benchmarking tool for government action on obesity prevention, implement it across Australian jurisdictions and to publicly award the best and worst performers. A framework was developed which encompassed nine domains, reflecting best practice government action on obesity prevention: whole-of-government approaches; marketing restrictions; access to affordable, healthy food; school food and physical activity; food in public facilities; urban design and transport; leisure and local environments; health services, and; social marketing. A scoring system was used by non-government key informants to rate the performance of their government. National rankings were generated and the results were communicated to all Premiers/Chief Ministers, the media and the national obesity research and practice community. Evaluation of the initial tool in 2010 showed it to be feasible to implement and able to discriminate the better and worse performing governments. Evaluation of the rubric in 2011 confirmed this to be a robust and useful method. In relation to government action, the best performing governments were those with whole-of-government approaches, had extended common initiatives and demonstrated innovation and strong political will. This new benchmarking tool, the Obesity Action Award, has enabled identification of leading government action on obesity prevention and the key characteristics associated with their success. We recommend this tool for other multi-state/country comparisons. Copyright © 2013 Asian Oceanian Association for the Study of Obesity. Published by Elsevier Ltd. All rights reserved.
BACT Simulation User Guide (Version 7.0)

NASA Technical Reports Server (NTRS)

Waszak, Martin R.

1997-01-01

This report documents the structure and operation of a simulation model of the Benchmark Active Control Technology (BACT) Wind-Tunnel Model. The BACT system was designed, built, and tested at NASA Langley Research Center as part of the Benchmark Models Program and was developed to perform wind-tunnel experiments to obtain benchmark quality data to validate computational fluid dynamics and computational aeroelasticity codes, to verify the accuracy of current aeroservoelasticity design and analysis tools, and to provide an active controls testbed for evaluating new and innovative control algorithms for flutter suppression and gust load alleviation. The BACT system has been especially valuable as a control system testbed.

Hazards of benchmarking complications with the National Trauma Data Bank: numerators in search of denominators.

PubMed

Kardooni, Shahrzad; Haut, Elliott R; Chang, David C; Pierce, Charles A; Efron, David T; Haider, Adil H; Pronovost, Peter J; Cornwell, Edward E

2008-02-01

Complication rates after trauma may serve as important indicators of quality of care. Meaningful performance benchmarks for complication rates require reference standards from valid and reliable data. Selection of appropriate numerators and denominators is a major consideration for data validity in performance improvement and benchmarking. We examined the suitability of the National Trauma Data Bank (NTDB) as a reference for benchmarking trauma center complication rates. We selected the five most commonly reported complications in the NTDB v. 6.1 (pneumonia, urinary tract infection, acute respiratory distress syndrome, deep vein thrombosis, myocardial infarction). We compared rates for each complication using three different denominators defined by different populations at risk. A-all patients from all 700 reporting facilities as the denominator (n = 1,466,887); B-only patients from the 441 hospitals reporting at least one complication (n = 1,307,729); C-patients from hospitals reporting at least one occurrence of each specific complication, giving a unique denominator for each complication (n range = 869,675-1,167,384). We also looked at differences in hospital characteristics between complication reporters and nonreporters. There was a 12.2% increase in the rate of each complication when patients from facilities not reporting any complications were excluded from the denominator. When rates were calculated using a unique denominator for each complication, rates increased 25% to 70%. The change from rate A to rate C produced a new rank order for the top five complications. When compared directly, rates B and C were also significantly different for all complications (all p < 0.01). Hospitals that reported complication information had significantly higher annual admissions and were more likely to be designated level I or II trauma centers and be university teaching hospitals. There is great variability in complication data reported in the NTDB that may introduce bias and significantly influence rates of complications reported. This potential for bias creates a challenge for appropriately interpreting complication rates for hospital performance benchmarking. We recognize the value of large aggregated registries such as the NTDB as a valuable tool for benchmarking and performance improvement purposes. However, we strongly advocate the need for conscientious selection of numerators and denominators that serve as the basic foundation for research.
Improving patient safety culture in Saudi Arabia (2012-2015): trending, improvement and benchmarking.

PubMed

Alswat, Khalid; Abdalla, Rawia Ahmad Mustafa; Titi, Maher Abdelraheim; Bakash, Maram; Mehmood, Faiza; Zubairi, Beena; Jamal, Diana; El-Jardali, Fadi

2017-08-02

Measuring patient safety culture can provide insight into areas for improvement and help monitor changes over time. This study details the findings of a re-assessment of patient safety culture in a multi-site Medical City in Riyadh, Kingdom of Saudi Arabia (KSA). Results were compared to an earlier assessment conducted in 2012 and benchmarked with regional and international studies. Such assessments can provide hospital leadership with insight on how their hospital is performing on patient safety culture composites as a result of quality improvement plans. This paper also explored the association between patient safety culture predictors and patient safety grade, perception of patient safety, frequency of events reported and number of events reported. We utilized a customized version of the patient safety culture survey developed by the Agency for Healthcare Research and Quality. The Medical City is a tertiary care teaching facility composed of two sites (total capacity of 904 beds). Data was analyzed using SPSS 24 at a significance level of 0.05. A t-Test was used to compare results from the 2012 survey to that conducted in 2015. Two adopted Generalized Estimating Equations in addition to two linear models were used to assess the association between composites and patient safety culture outcomes. Results were also benchmarked against similar initiatives in Lebanon, Palestine and USA. Areas of strength in 2015 included Teamwork within units, and Organizational Learning-Continuous Improvement; areas requiring improvement included Non-Punitive Response to Error, and Staffing. Comparing results to the 2012 survey revealed improvement on some areas but non-punitive response to error and Staffing remained the lowest scoring composites in 2015. Regression highlighted significant association between managerial support, organizational learning and feedback and improved survey outcomes. Comparison to international benchmarks revealed that the hospital is performing at or better than benchmark on several composites. The Medical City has made significant progress on several of the patient safety culture composites despite still having areas requiring additional improvement. Patient safety culture outcomes are evidently linked to better performance on specific composites. While results are comparable with regional and international benchmarks, findings confirm that regular assessment can allow hospitals to better understand and visualize changes in their performance and identify additional areas for improvement.
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres.

PubMed

van Lent, Wineke A M; de Beer, Relinde D; van Harten, Wim H

2010-08-31

Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC), three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. Per multiple case study a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments of case study 3, were considering implementing the recommendations.Additionally, success factors, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results and, adapt the identified better working methods to the own setting, were found. The improved benchmarking process and the success factors can produce relevant input to improve the operations management of specialty hospitals.
International benchmarking of specialty hospitals. A series of case studies on comprehensive cancer centres

PubMed Central

2010-01-01

Background Benchmarking is one of the methods used in business that is applied to hospitals to improve the management of their operations. International comparison between hospitals can explain performance differences. As there is a trend towards specialization of hospitals, this study examines the benchmarking process and the success factors of benchmarking in international specialized cancer centres. Methods Three independent international benchmarking studies on operations management in cancer centres were conducted. The first study included three comprehensive cancer centres (CCC), three chemotherapy day units (CDU) were involved in the second study and four radiotherapy departments were included in the final study. Per multiple case study a research protocol was used to structure the benchmarking process. After reviewing the multiple case studies, the resulting description was used to study the research objectives. Results We adapted and evaluated existing benchmarking processes through formalizing stakeholder involvement and verifying the comparability of the partners. We also devised a framework to structure the indicators to produce a coherent indicator set and better improvement suggestions. Evaluating the feasibility of benchmarking as a tool to improve hospital processes led to mixed results. Case study 1 resulted in general recommendations for the organizations involved. In case study 2, the combination of benchmarking and lean management led in one CDU to a 24% increase in bed utilization and a 12% increase in productivity. Three radiotherapy departments of case study 3, were considering implementing the recommendations. Additionally, success factors, such as a well-defined and small project scope, partner selection based on clear criteria, stakeholder involvement, simple and well-structured indicators, analysis of both the process and its results and, adapt the identified better working methods to the own setting, were found. Conclusions The improved benchmarking process and the success factors can produce relevant input to improve the operations management of specialty hospitals. PMID:20807408
FX-87 performance measurements: data-flow implementation. Technical report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hammel, R.T.; Gifford, D.K.

1988-11-01

This report documents a series of experiments performed to explore the thesis that the FX-87 effect system permits a compiler to schedule imperative programs (i.e., programs that may contain side-effects) for execution on a parallel computer. The authors analyze how much the FX-87 static effect system can improve the execution times of five benchmark programs on a parallel graph interpreter. Three of their benchmark programs do not use side-effects (factorial, fibonacci, and polynomial division) and thus did not have any effect-induced constraints. Their FX-87 performance was comparable to their performance in a purely functional language. Two of the benchmark programsmore » use side effects (DNA sequence matching and Scheme interpretation) and the compiler was able to use effect information to reduce their execution times by factors of 1.7 to 5.4 when compared with sequential execution times. These results support the thesis that a static effect system is a powerful tool for compilation to multiprocessor computers. However, the graph interpreter we used was based on unrealistic assumptions, and thus our results may not accurately reflect the performance of a practical FX-87 implementation. The results also suggest that conventional loop analysis would complement the FX-87 effect system« less
Benchmarking and audit of breast units improves quality of care

PubMed Central

van Dam, P.A.; Verkinderen, L.; Hauspy, J.; Vermeulen, P.; Dirix, L.; Huizing, M.; Altintas, S.; Papadimitriou, K.; Peeters, M.; Tjalma, W.

2013-01-01

Quality Indicators (QIs) are measures of health care quality that make use of readily available hospital inpatient administrative data. Assessment quality of care can be performed on different levels: national, regional, on a hospital basis or on an individual basis. It can be a mandatory or voluntary system. In all cases development of an adequate database for data extraction, and feedback of the findings is of paramount importance. In the present paper we performed a Medline search on “QIs and breast cancer” and “benchmarking and breast cancer care”, and we have added some data from personal experience. The current data clearly show that the use of QIs for breast cancer care, regular internal and external audit of performance of breast units, and benchmarking are effective to improve quality of care. Adherence to guidelines improves markedly (particularly regarding adjuvant treatment) and there are data emerging showing that this results in a better outcome. As quality assurance benefits patients, it will be a challenge for the medical and hospital community to develop affordable quality control systems, which are not leading to excessive workload. PMID:24753926
Intercomparison of Monte Carlo radiation transport codes to model TEPC response in low-energy neutron and gamma-ray fields.

PubMed

Ali, F; Waker, A J; Waller, E J

2014-10-01

Tissue-equivalent proportional counters (TEPC) can potentially be used as a portable and personal dosemeter in mixed neutron and gamma-ray fields, but what hinders this use is their typically large physical size. To formulate compact TEPC designs, the use of a Monte Carlo transport code is necessary to predict the performance of compact designs in these fields. To perform this modelling, three candidate codes were assessed: MCNPX 2.7.E, FLUKA 2011.2 and PHITS 2.24. In each code, benchmark simulations were performed involving the irradiation of a 5-in. TEPC with monoenergetic neutron fields and a 4-in. wall-less TEPC with monoenergetic gamma-ray fields. The frequency and dose mean lineal energies and dose distributions calculated from each code were compared with experimentally determined data. For the neutron benchmark simulations, PHITS produces data closest to the experimental values and for the gamma-ray benchmark simulations, FLUKA yields data closest to the experimentally determined quantities. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images.

PubMed

Vázquez, David; Bernal, Jorge; Sánchez, F Javier; Fernández-Esparrach, Gloria; López, Antonio M; Romero, Adriana; Drozdzal, Michal; Courville, Aaron

2017-01-01

Colorectal cancer (CRC) is the third cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss rate and the inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing decision support systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endoluminal scene, targeting different clinical needs. Together with the dataset and taking advantage of advances in semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCNs). We perform a comparative study to show that FCNs significantly outperform, without any further postprocessing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.
Semi-active control of a cable-stayed bridge under multiple-support excitations.

PubMed

Dai, Ze-Bing; Huang, Jin-Zhi; Wang, Hong-Xia

2004-03-01

This paper presents a semi-active strategy for seismic protection of a benchmark cable-stayed bridge with consideration of multiple-support excitations. In this control strategy, Magnetorheological (MR) dampers are proposed as control devices, a LQG-clipped-optimal control algorithm is employed. An active control strategy, shown in previous researches to perform well at controlling the benchmark bridge when uniform earthquake motion was assumed, is also used in this study to control this benchmark bridge with consideration of multiple-support excitations. The performance of active control system is compared to that of the presented semi-active control strategy. Because the MR fluid damper is a controllable energy- dissipation device that cannot add mechanical energy to the structural system, the proposed control strategy is fail-safe in that bounded-input, bounded-output stability of the controlled structure is guaranteed. The numerical results demonstrated that the performance of the presented control design is nearly the same as that of the active control system; and that the MR dampers can effectively be used to control seismically excited cable-stayed bridges with multiple-support excitations.
RASSP Benchmark 4 Technical Description.

DTIC Science & Technology

1998-01-09

be carried out. Based on results of the study, an implementation of all, or part, of the system described in this benchmark technical description...validate interface and timing constraints. The ISA level of modeling defines the limit of detail expected in the VHDL virtual prototype. It does not...develop a set of candidate architectures and perform an architecture trade-off study. Candidate proces- sor implementations must then be examined for
Middle Level Teachers' Perceptions of Interim Reading Assessments: An Exploratory Study of Data-Based Decision Making

ERIC Educational Resources Information Center

Reed, Deborah K.

2015-01-01

This study explored the data-based decision making of 12 teachers in grades 6-8 who were asked about their perceptions and use of three required interim measures of reading performance: oral reading fluency (ORF), retell, and a benchmark comprised of released state test items. Focus group participants reported they did not believe the benchmark or…
New data tool provides wealth of clinical, financial benchmarks by census region.

PubMed

1998-08-01

Data Library: Compare your departmental expenses, administrative expense ratio, length of stay, and other clinical-financial data to benchmarks for your census region. A new CD-rom product that provides access to four years of Medicare Cost Report data for every reporting hospital in the nation allows users to slice and dice the data by more than 200 different performance measures.
Benchmarking the Performance of Employment and Training Programs: A Pilot Effort of the Annie E. Casey Foundation's Jobs Initiative.

ERIC Educational Resources Information Center

Welch, Doug

As part of its Jobs Initiative (JI) program in six metropolitan areas Denver, Milwaukee, New Orleans, Philadelphia, St. Louis, and Seattle the Annie E. Casey Foundation sought to develop and test a method for establishing benchmarks for workforce development agencies. Data collected from 10 projects in the JI from April through March, 2000,…
Integrated Sensing Processor, Phase 2

DTIC Science & Technology

2005-12-01

performance analysis for several baseline classifiers including neural nets, linear classifiers, and kNN classifiers. Use of CCDR as a preprocessing step...below the level of the benchmark non-linear classifier for this problem ( kNN ). Furthermore, the CCDR preconditioned kNN achieved a 10% improvement over...the benchmark kNN without CCDR. Finally, we found an important connection between intrinsic dimension estimation via entropic graphs and the optimal
Quality assessment in head and neck oncologic surgery in a Brazilian cancer center compared with MD Anderson Cancer Center benchmarks.

PubMed

Lira, Renan Bezerra; de Carvalho, André Ywata; de Carvalho, Genival Barbosa; Lewis, Carol M; Weber, Randal S; Kowalski, Luiz Paulo

2016-07-01

Quality assessment is a major tool for evaluation of health care delivery. In head and neck surgery, the University of Texas MD Anderson Cancer Center (MD Anderson) has defined quality standards by publishing benchmarks. We conducted an analysis of 360 head and neck surgeries performed at the AC Camargo Cancer Center (AC Camargo). The procedures were stratified into low-acuity procedures (LAPs) or high-acuity procedures (HAPs) and outcome indicators where compared to MD Anderson benchmarks. In the 360 cases, there were 332 LAPs (92.2%) and 28 HAPs (7.8%). Patients with any comorbid condition had a higher incidence of negative outcome indicators (p = .005). In the LAPs, we achieved the MD Anderson benchmarks in all outcome indicators. In HAPs, the rate of surgical site infection and length of hospital stay were higher than what is established by the benchmarks. Quality assessment of head and neck surgery is possible and should be disseminated, improving effectiveness in health care delivery. © 2015 Wiley Periodicals, Inc. Head Neck 38: 1002-1007, 2016. © 2015 Wiley Periodicals, Inc.
Benchmarking facilities providing care: An international overview of initiatives

PubMed Central

Thonon, Frédérique; Watson, Jonathan; Saghatchian, Mahasti

2015-01-01

We performed a literature review of existing benchmarking projects of health facilities to explore (1) the rationales for those projects, (2) the motivation for health facilities to participate, (3) the indicators used and (4) the success and threat factors linked to those projects. We studied both peer-reviewed and grey literature. We examined 23 benchmarking projects of different medical specialities. The majority of projects used a mix of structure, process and outcome indicators. For some projects, participants had a direct or indirect financial incentive to participate (such as reimbursement by Medicaid/Medicare or litigation costs related to quality of care). A positive impact was reported for most projects, mainly in terms of improvement of practice and adoption of guidelines and, to a lesser extent, improvement in communication. Only 1 project reported positive impact in terms of clinical outcomes. Success factors and threats are linked to both the benchmarking process (such as organisation of meetings, link with existing projects) and indicators used (such as adjustment for diagnostic-related groups). The results of this review will help coordinators of a benchmarking project to set it up successfully. PMID:26770800
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are usedmore » to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.« less
Results of the GABLS3 diurnal-cycle benchmark for wind energy applications

DOE PAGES

Rodrigo, J. Sanz; Allaerts, D.; Avila, M.; ...

2017-06-13

We present results of the GABLS3 model intercomparison benchmark revisited for wind energy applications. The case consists of a diurnal cycle, measured at the 200-m tall Cabauw tower in the Netherlands, including a nocturnal low-level jet. The benchmark includes a sensitivity analysis of WRF simulations using two input meteorological databases and five planetary boundary-layer schemes. A reference set of mesoscale tendencies is used to drive microscale simulations using RANS k-ϵ and LES turbulence models. The validation is based on rotor-based quantities of interest. Cycle-integrated mean absolute errors are used to quantify model performance. The results of the benchmark are usedmore » to discuss input uncertainties from mesoscale modelling, different meso-micro coupling strategies (online vs offline) and consistency between RANS and LES codes when dealing with boundary-layer mean flow quantities. Altogether, all the microscale simulations produce a consistent coupling with mesoscale forcings.« less
Benchmarking and Hardware-In-The-Loop Operation of a ...

EPA Pesticide Factsheets

Engine Performance evaluation in support of LD MTE. EPA used elements of its ALPHA model to apply hardware-in-the-loop (HIL) controls to the SKYACTIV engine test setup to better understand how the engine would operate in a chassis test after combined with future leading edge technologies, advanced high-efficiency transmission, reduced mass, and reduced roadload. Predict future vehicle performance with Atkinson engine. As part of its technology assessment for the upcoming midterm evaluation of the 2017-2025 LD vehicle GHG emissions regulation, EPA has been benchmarking engines and transmissions to generate inputs for use in its ALPHA model
Benchmarking the neurology practice.

PubMed

Henderson, William S

2010-05-01

A medical practice, whether operated by a solo physician or by a group, is a business. For a neurology practice to be successful, it must meet performance measures that ensure its viability. The best method of doing this is to benchmark the practice, both against itself over time and against other practices. Crucial medical practice metrics that should be measured are financial performance, staffing efficiency, physician productivity, and patient access. Such measures assist a physician or practice in achieving the goals and objectives that each determines are important to providing quality health care to patients. Copyright 2010 Elsevier Inc. All rights reserved.

Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

NASA Astrophysics Data System (ADS)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited by the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared with the performance using benchmark software and the metric was FLoting-point Operations Per Seconds (FLOPS) which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore system? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPs and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
Benchmarking the Multidimensional Stellar Implicit Code MUSIC

NASA Astrophysics Data System (ADS)

Goffrey, T.; Pratt, J.; Viallet, M.; Baraffe, I.; Popov, M. V.; Walder, R.; Folini, D.; Geroux, C.; Constantino, T.

2017-04-01

We present the results of a numerical benchmark study for the MUltidimensional Stellar Implicit Code (MUSIC) based on widely applicable two- and three-dimensional compressible hydrodynamics problems relevant to stellar interiors. MUSIC is an implicit large eddy simulation code that uses implicit time integration, implemented as a Jacobian-free Newton Krylov method. A physics based preconditioning technique which can be adjusted to target varying physics is used to improve the performance of the solver. The problems used for this benchmark study include the Rayleigh-Taylor and Kelvin-Helmholtz instabilities, and the decay of the Taylor-Green vortex. Additionally we show a test of hydrostatic equilibrium, in a stellar environment which is dominated by radiative effects. In this setting the flexibility of the preconditioning technique is demonstrated. This work aims to bridge the gap between the hydrodynamic test problems typically used during development of numerical methods and the complex flows of stellar interiors. A series of multidimensional tests were performed and analysed. Each of these test cases was analysed with a simple, scalar diagnostic, with the aim of enabling direct code comparisons. As the tests performed do not have analytic solutions, we verify MUSIC by comparing it to established codes including ATHENA and the PENCIL code. MUSIC is able to both reproduce behaviour from established and widely-used codes as well as results expected from theoretical predictions. This benchmarking study concludes a series of papers describing the development of the MUSIC code and provides confidence in future applications.
Proficiency performance benchmarks for removal of simulated brain tumors using a virtual reality simulator NeuroTouch.

PubMed

AlZhrani, Gmaan; Alotaibi, Fahad; Azarnoush, Hamed; Winkler-Schwartz, Alexander; Sabbagh, Abdulrahman; Bajunaid, Khalid; Lajoie, Susanne P; Del Maestro, Rolando F

2015-01-01

Assessment of neurosurgical technical skills involved in the resection of cerebral tumors in operative environments is complex. Educators emphasize the need to develop and use objective and meaningful assessment tools that are reliable and valid for assessing trainees' progress in acquiring surgical skills. The purpose of this study was to develop proficiency performance benchmarks for a newly proposed set of objective measures (metrics) of neurosurgical technical skills performance during simulated brain tumor resection using a new virtual reality simulator (NeuroTouch). Each participant performed the resection of 18 simulated brain tumors of different complexity using the NeuroTouch platform. Surgical performance was computed using Tier 1 and Tier 2 metrics derived from NeuroTouch simulator data consisting of (1) safety metrics, including (a) volume of surrounding simulated normal brain tissue removed, (b) sum of forces utilized, and (c) maximum force applied during tumor resection; (2) quality of operation metric, which involved the percentage of tumor removed; and (3) efficiency metrics, including (a) instrument total tip path lengths and (b) frequency of pedal activation. All studies were conducted in the Neurosurgical Simulation Research Centre, Montreal Neurological Institute and Hospital, McGill University, Montreal, Canada. A total of 33 participants were recruited, including 17 experts (board-certified neurosurgeons) and 16 novices (7 senior and 9 junior neurosurgery residents). The results demonstrated that "expert" neurosurgeons resected less surrounding simulated normal brain tissue and less tumor tissue than residents. These data are consistent with the concept that "experts" focused more on safety of the surgical procedure compared with novices. By analyzing experts' neurosurgical technical skills performance on these different metrics, we were able to establish benchmarks for goal proficiency performance training of neurosurgery residents. This study furthers our understanding of expert neurosurgical performance during the resection of simulated virtual reality tumors and provides neurosurgical trainees with predefined proficiency performance benchmarks designed to maximize the learning of specific surgical technical skills. Copyright © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Overall ED efficiency is associated with decreased time to percutaneous coronary intervention for ST-segment elevation myocardial infarction.

PubMed

Jones, Christopher W; Sonnad, Seema S; Augustine, James J; Reese, Charles L

2014-10-01

Performance of percutaneous coronary intervention (PCI) within 90 minutes of hospital arrival for ST-segment elevation myocardial infarction patients is a commonly cited clinical quality measure. The Centers for Medicare and Medicaid Services use this measure to adjust hospital reimbursement via the Value-Based Purchasing Program. This study investigated the relationship between hospital performance on this quality measure and emergency department (ED) operational efficiency. Hospital-level data from Centers for Medicare and Medicaid Services on PCI quality measure performance was linked to information on operational performance from 272 US EDs obtained from the Emergency Department Benchmarking Alliance annual operations survey. Standard metrics of ED size, acuity, and efficiency were compared across hospitals grouped by performance on the door-to-balloon time quality measure. Mean hospital performance on the 90-minute arrival to PCI measure was 94.0% (range, 42-100). Among hospitals failing to achieve the door-to-balloon time performance standard, median ED length of stay was 209 minutes, compared with 173 minutes among those hospitals meeting the benchmark standard (P < .001). Similarly, median time from ED patient arrival to physician evaluation was 39 minutes for hospitals below the performance standard and 23 minutes for hospitals at the benchmark standard (P < .001). Markers of ED size and acuity, including annual patient volume, admission rate, and the percentage of patients arriving via ambulance did not vary with door-to-balloon time. Better performance on measures associated with ED efficiency is associated with more timely PCI performance. Copyright © 2014 Elsevier Inc. All rights reserved.
An Application-Based Performance Characterization of the Columbia Supercluster

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Djomehri, Jahed M.; Hood, Robert; Jin, Hoaqiang; Kiris, Cetin; Saini, Subhash

2005-01-01

Columbia is a 10,240-processor supercluster consisting of 20 Altix nodes with 512 processors each, and currently ranked as the second-fastest computer in the world. In this paper, we present the performance characteristics of Columbia obtained on up to four computing nodes interconnected via the InfiniBand and/or NUMAlink4 communication fabrics. We evaluate floating-point performance, memory bandwidth, message passing communication speeds, and compilers using a subset of the HPC Challenge benchmarks, and some of the NAS Parallel Benchmarks including the multi-zone versions. We present detailed performance results for three scientific applications of interest to NASA, one from molecular dynamics, and two from computational fluid dynamics. Our results show that both the NUMAlink4 and the InfiniBand hold promise for application scaling to a large number of processors.
Benchmarking is associated with improved quality of care in type 2 diabetes: the OPTIMISE randomized, controlled trial.

PubMed

Hermans, Michel P; Elisaf, Moses; Michel, Georges; Muls, Erik; Nobels, Frank; Vandenberghe, Hans; Brotons, Carlos

2013-11-01

To assess prospectively the effect of benchmarking on quality of primary care for patients with type 2 diabetes by using three major modifiable cardiovascular risk factors as critical quality indicators. Primary care physicians treating patients with type 2 diabetes in six European countries were randomized to give standard care (control group) or standard care with feedback benchmarked against other centers in each country (benchmarking group). In both groups, laboratory tests were performed every 4 months. The primary end point was the percentage of patients achieving preset targets of the critical quality indicators HbA1c, LDL cholesterol, and systolic blood pressure (SBP) after 12 months of follow-up. Of 4,027 patients enrolled, 3,996 patients were evaluable and 3,487 completed 12 months of follow-up. Primary end point of HbA1c target was achieved in the benchmarking group by 58.9 vs. 62.1% in the control group (P = 0.398) after 12 months; 40.0 vs. 30.1% patients met the SBP target (P < 0.001); 54.3 vs. 49.7% met the LDL cholesterol target (P = 0.006). Percentages of patients meeting all three targets increased during the study in both groups, with a statistically significant increase observed in the benchmarking group. The percentage of patients achieving all three targets at month 12 was significantly larger in the benchmarking group than in the control group (12.5 vs. 8.1%; P < 0.001). In this prospective, randomized, controlled study, benchmarking was shown to be an effective tool for increasing achievement of critical quality indicators and potentially reducing patient cardiovascular residual risk profile.
Benchmarking Is Associated With Improved Quality of Care in Type 2 Diabetes

PubMed Central

Hermans, Michel P.; Elisaf, Moses; Michel, Georges; Muls, Erik; Nobels, Frank; Vandenberghe, Hans; Brotons, Carlos

2013-01-01

OBJECTIVE To assess prospectively the effect of benchmarking on quality of primary care for patients with type 2 diabetes by using three major modifiable cardiovascular risk factors as critical quality indicators. RESEARCH DESIGN AND METHODS Primary care physicians treating patients with type 2 diabetes in six European countries were randomized to give standard care (control group) or standard care with feedback benchmarked against other centers in each country (benchmarking group). In both groups, laboratory tests were performed every 4 months. The primary end point was the percentage of patients achieving preset targets of the critical quality indicators HbA1c, LDL cholesterol, and systolic blood pressure (SBP) after 12 months of follow-up. RESULTS Of 4,027 patients enrolled, 3,996 patients were evaluable and 3,487 completed 12 months of follow-up. Primary end point of HbA1c target was achieved in the benchmarking group by 58.9 vs. 62.1% in the control group (P = 0.398) after 12 months; 40.0 vs. 30.1% patients met the SBP target (P < 0.001); 54.3 vs. 49.7% met the LDL cholesterol target (P = 0.006). Percentages of patients meeting all three targets increased during the study in both groups, with a statistically significant increase observed in the benchmarking group. The percentage of patients achieving all three targets at month 12 was significantly larger in the benchmarking group than in the control group (12.5 vs. 8.1%; P < 0.001). CONCLUSIONS In this prospective, randomized, controlled study, benchmarking was shown to be an effective tool for increasing achievement of critical quality indicators and potentially reducing patient cardiovascular residual risk profile. PMID:23846810
A benchmark study of the sea-level equation in GIA modelling

NASA Astrophysics Data System (ADS)

Martinec, Zdenek; Klemann, Volker; van der Wal, Wouter; Riva, Riccardo; Spada, Giorgio; Simon, Karen; Blank, Bas; Sun, Yu; Melini, Daniele; James, Tom; Bradley, Sarah

2017-04-01

The sea-level load in glacial isostatic adjustment (GIA) is described by the so called sea-level equation (SLE), which represents the mass redistribution between ice sheets and oceans on a deforming earth. Various levels of complexity of SLE have been proposed in the past, ranging from a simple mean global sea level (the so-called eustatic sea level) to the load with a deforming ocean bottom, migrating coastlines and a changing shape of the geoid. Several approaches to solve the SLE have been derived, from purely analytical formulations to fully numerical methods. Despite various teams independently investigating GIA, there has been no systematic intercomparison amongst the solvers through which the methods may be validated. The goal of this paper is to present a series of benchmark experiments designed for testing and comparing numerical implementations of the SLE. Our approach starts with simple load cases even though the benchmark will not result in GIA predictions for a realistic loading scenario. In the longer term we aim for a benchmark with a realistic loading scenario, and also for benchmark solutions with rotational feedback. The current benchmark uses an earth model for which Love numbers have been computed and benchmarked in Spada et al (2011). In spite of the significant differences in the numerical methods employed, the test computations performed so far show a satisfactory agreement between the results provided by the participants. The differences found can often be attributed to the different approximations inherent to the various algorithms. Literature G. Spada, V. R. Barletta, V. Klemann, R. E. M. Riva, Z. Martinec, P. Gasperini, B. Lund, D. Wolf, L. L. A. Vermeersen, and M. A. King, 2011. A benchmark study for glacial isostatic adjustment codes. Geophys. J. Int. 185: 106-132 doi:10.1111/j.1365-
Validating vignette and conjoint survey experiments against real-world behavior

PubMed Central

Hainmueller, Jens; Hangartner, Dominik; Yamamoto, Teppei

2015-01-01

Survey experiments, like vignette and conjoint analyses, are widely used in the social sciences to elicit stated preferences and study how humans make multidimensional choices. However, there is a paucity of research on the external validity of these methods that examines whether the determinants that explain hypothetical choices made by survey respondents match the determinants that explain what subjects actually do when making similar choices in real-world situations. This study compares results from conjoint and vignette analyses on which immigrant attributes generate support for naturalization with closely corresponding behavioral data from a natural experiment in Switzerland, where some municipalities used referendums to decide on the citizenship applications of foreign residents. Using a representative sample from the same population and the official descriptions of applicant characteristics that voters received before each referendum as a behavioral benchmark, we find that the effects of the applicant attributes estimated from the survey experiments perform remarkably well in recovering the effects of the same attributes in the behavioral benchmark. We also find important differences in the relative performances of the different designs. Overall, the paired conjoint design, where respondents evaluate two immigrants side by side, comes closest to the behavioral benchmark; on average, its estimates are within 2% percentage points of the effects in the behavioral benchmark. PMID:25646415
Benchmarking of Neutron Production of Heavy-Ion Transport Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models andmore » codes and additional benchmarking are required.« less
Benchmarking of Heavy Ion Transport Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence

Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in designing and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models andmore » codes and additional benchmarking are required.« less
Subgroup Benchmark Calculations for the Intra-Pellet Nonuniform Temperature Cases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Kang Seog; Jung, Yeon Sang; Liu, Yuxuan

A benchmark suite has been developed by Seoul National University (SNU) for intrapellet nonuniform temperature distribution cases based on the practical temperature profiles according to the thermal power levels. Though a new subgroup capability for nonuniform temperature distribution was implemented in MPACT, no validation calculation has been performed for the new capability. This study focuses on bench-marking the new capability through a code-to-code comparison. Two continuous-energy Monte Carlo codes, McCARD and CE-KENO, are engaged in obtaining reference solutions, and the MPACT results are compared to the SNU nTRACER using a similar cross section library and subgroup method to obtain self-shieldedmore » cross sections.« less
Fast Neutron Spectrum Potassium Worth for Space Power Reactor Design Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Marshall, Margaret A.; Briggs, J. Blair

2015-03-01

A variety of critical experiments were constructed of enriched uranium metal (oralloy ) during the 1960s and 1970s at the Oak Ridge Critical Experiments Facility (ORCEF) in support of criticality safety operations at the Y-12 Plant. The purposes of these experiments included the evaluation of storage, casting, and handling limits for the Y-12 Plant and providing data for verification of calculation methods and cross-sections for nuclear criticality safety applications. These included solid cylinders of various diameters, annuli of various inner and outer diameters, two and three interacting cylinders of various diameters, and graphite and polyethylene reflected cylinders and annuli. Ofmore » the hundreds of delayed critical experiments, one was performed that consisted of uranium metal annuli surrounding a potassium-filled, stainless steel can. The outer diameter of the annuli was approximately 13 inches (33.02 cm) with an inner diameter of 7 inches (17.78 cm). The diameter of the stainless steel can was 7 inches (17.78 cm). The critical height of the configurations was approximately 5.6 inches (14.224 cm). The uranium annulus consisted of multiple stacked rings, each with radial thicknesses of 1 inch (2.54 cm) and varying heights. A companion measurement was performed using empty stainless steel cans; the primary purpose of these experiments was to test the fast neutron cross sections of potassium as it was a candidate for coolant in some early space power reactor designs.The experimental measurements were performed on July 11, 1963, by J. T. Mihalczo and M. S. Wyatt (Ref. 1) with additional information in its corresponding logbook. Unreflected and unmoderated experiments with the same set of highly enriched uranium metal parts were performed at the Oak Ridge Critical Experiments Facility in the 1960s and are evaluated in the International Handbook for Evaluated Criticality Safety Benchmark Experiments (ICSBEP Handbook) with the identifier HEU MET FAST 051. Thin graphite reflected (2 inches or less) experiments also using the same set of highly enriched uranium metal parts are evaluated in HEU MET FAST 071. Polyethylene-reflected configurations are evaluated in HEU-MET-FAST-076. A stack of highly enriched metal discs with a thick beryllium top reflector is evaluated in HEU-MET-FAST-069, and two additional highly enriched uranium annuli with beryllium cores are evaluated in HEU-MET-FAST-059. Both detailed and simplified model specifications are provided in this evaluation. Both of these fast neutron spectra assemblies were determined to be acceptable benchmark experiments. The calculated eigenvalues for both the detailed and the simple benchmark models are within ~0.26 % of the benchmark values for Configuration 1 (calculations performed using MCNP6 with ENDF/B-VII.1 neutron cross section data), but under-calculate the benchmark values by ~7s because the uncertainty in the benchmark is very small: ~0.0004 (1s); for Configuration 2, the under-calculation is ~0.31 % and ~8s. Comparison of detailed and simple model calculations for the potassium worth measurement and potassium mass coefficient yield results approximately 70 – 80 % lower (~6s to 10s) than the benchmark values for the various nuclear data libraries utilized. Both the potassium worth and mass coefficient are also deemed to be acceptable benchmark experiment measurements.« less
Using relative survival measures for cross-sectional and longitudinal benchmarks of countries, states, and districts: the BenchRelSurv- and BenchRelSurvPlot-macros

PubMed Central

2013-01-01

Background The objective of screening programs is to discover life threatening diseases in as many patients as early as possible and to increase the chance of survival. To be able to compare aspects of health care quality, methods are needed for benchmarking that allow comparisons on various health care levels (regional, national, and international). Objectives Applications and extensions of algorithms can be used to link the information on disease phases with relative survival rates and to consolidate them in composite measures. The application of the developed SAS-macros will give results for benchmarking of health care quality. Data examples for breast cancer care are given. Methods A reference scale (expected, E) must be defined at a time point at which all benchmark objects (observed, O) are measured. All indices are defined as O/E, whereby the extended standardized screening-index (eSSI), the standardized case-mix-index (SCI), the work-up-index (SWI), and the treatment-index (STI) address different health care aspects. The composite measures called overall-performance evaluation (OPE) and relative overall performance indices (ROPI) link the individual indices differently for cross-sectional or longitudinal analyses. Results Algorithms allow a time point and a time interval associated comparison of the benchmark objects in the indices eSSI, SCI, SWI, STI, OPE, and ROPI. Comparisons between countries, states and districts are possible. Exemplarily comparisons between two countries are made. The success of early detection and screening programs as well as clinical health care quality for breast cancer can be demonstrated while the population’s background mortality is concerned. Conclusions If external quality assurance programs and benchmark objects are based on population-based and corresponding demographic data, information of disease phase and relative survival rates can be combined to indices which offer approaches for comparative analyses between benchmark objects. Conclusions on screening programs and health care quality are possible. The macros can be transferred to other diseases if a disease-specific phase scale of prognostic value (e.g. stage) exists. PMID:23316692
The Dutch Hospital Standardised Mortality Ratio (HSMR) method and cardiac surgery: benchmarking in a national cohort using hospital administration data versus a clinical database

PubMed Central

Siregar, S; Pouw, M E; Moons, K G M; Versteegh, M I M; Bots, M L; van der Graaf, Y; Kalkman, C J; van Herwerden, L A; Groenwold, R H H

2014-01-01

Objective To compare the accuracy of data from hospital administration databases and a national clinical cardiac surgery database and to compare the performance of the Dutch hospital standardised mortality ratio (HSMR) method and the logistic European System for Cardiac Operative Risk Evaluation, for the purpose of benchmarking of mortality across hospitals. Methods Information on all patients undergoing cardiac surgery between 1 January 2007 and 31 December 2010 in 10 centres was extracted from The Netherlands Association for Cardio-Thoracic Surgery database and the Hospital Discharge Registry. The number of cardiac surgery interventions was compared between both databases. The European System for Cardiac Operative Risk Evaluation and hospital standardised mortality ratio models were updated in the study population and compared using the C-statistic, calibration plots and the Brier-score. Results The number of cardiac surgery interventions performed could not be assessed using the administrative database as the intervention code was incorrect in 1.4–26.3%, depending on the type of intervention. In 7.3% no intervention code was registered. The updated administrative model was inferior to the updated clinical model with respect to discrimination (c-statistic of 0.77 vs 0.85, p<0.001) and calibration (Brier Score of 2.8% vs 2.6%, p<0.001, maximum score 3.0%). Two average performing hospitals according to the clinical model became outliers when benchmarking was performed using the administrative model. Conclusions In cardiac surgery, administrative data are less suitable than clinical data for the purpose of benchmarking. The use of either administrative or clinical risk-adjustment models can affect the outlier status of hospitals. Risk-adjustment models including procedure-specific clinical risk factors are recommended. PMID:24334377
Collected notes from the Benchmarks and Metrics Workshop

NASA Technical Reports Server (NTRS)

Drummond, Mark E.; Kaelbling, Leslie P.; Rosenschein, Stanley J.

1991-01-01

In recent years there has been a proliferation of proposals in the artificial intelligence (AI) literature for integrated agent architectures. Each architecture offers an approach to the general problem of constructing an integrated agent. Unfortunately, the ways in which one architecture might be considered better than another are not always clear. There has been a growing realization that many of the positive and negative aspects of an architecture become apparent only when experimental evaluation is performed and that to progress as a discipline, we must develop rigorous experimental methods. In addition to the intrinsic intellectual interest of experimentation, rigorous performance evaluation of systems is also a crucial practical concern to our research sponsors. DARPA, NASA, and AFOSR (among others) are actively searching for better ways of experimentally evaluating alternative approaches to building intelligent agents. One tool for experimental evaluation involves testing systems on benchmark tasks in order to assess their relative performance. As part of a joint DARPA and NASA funded project, NASA-Ames and Teleos Research are carrying out a research effort to establish a set of benchmark tasks and evaluation metrics by which the performance of agent architectures may be determined. As part of this project, we held a workshop on Benchmarks and Metrics at the NASA Ames Research Center on June 25, 1990. The objective of the workshop was to foster early discussion on this important topic. We did not achieve a consensus, nor did we expect to. Collected here is some of the information that was exchanged at the workshop. Given here is an outline of the workshop, a list of the participants, notes taken on the white-board during open discussions, position papers/notes from some participants, and copies of slides used in the presentations.
A Machine-to-Machine protocol benchmark for eHealth applications - Use case: Respiratory rehabilitation.

PubMed

Talaminos-Barroso, Alejandro; Estudillo-Valderrama, Miguel A; Roa, Laura M; Reina-Tosina, Javier; Ortega-Ruiz, Francisco

2016-06-01

M2M (Machine-to-Machine) communications represent one of the main pillars of the new paradigm of the Internet of Things (IoT), and is making possible new opportunities for the eHealth business. Nevertheless, the large number of M2M protocols currently available hinders the election of a suitable solution that satisfies the requirements that can demand eHealth applications. In the first place, to develop a tool that provides a benchmarking analysis in order to objectively select among the most relevant M2M protocols for eHealth solutions. In the second place, to validate the tool with a particular use case: the respiratory rehabilitation. A software tool, called Distributed Computing Framework (DFC), has been designed and developed to execute the benchmarking tests and facilitate the deployment in environments with a large number of machines, with independence of the protocol and performance metrics selected. DDS, MQTT, CoAP, JMS, AMQP and XMPP protocols were evaluated considering different specific performance metrics, including CPU usage, memory usage, bandwidth consumption, latency and jitter. The results obtained allowed to validate a case of use: respiratory rehabilitation of chronic obstructive pulmonary disease (COPD) patients in two scenarios with different types of requirement: Home-Based and Ambulatory. The results of the benchmark comparison can guide eHealth developers in the choice of M2M technologies. In this regard, the framework presented is a simple and powerful tool for the deployment of benchmark tests under specific environments and conditions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The MPC&A Questionnaire

DOE Office of Scientific and Technical Information (OSTI.GOV)

Powell, Danny H; Elwood Jr, Robert H

The questionnaire is the instrument used for recording performance data on the nuclear material protection, control, and accountability (MPC&A) system at a nuclear facility. The performance information provides a basis for evaluating the effectiveness of the MPC&A system. The goal for the questionnaire is to provide an accurate representation of the performance of the MPC&A system as it currently exists in the facility. Performance grades for all basic MPC&A functions should realistically reflect the actual level of performance at the time the survey is conducted. The questionnaire was developed after testing and benchmarking the material control and accountability (MC&A) systemmore » effectiveness tool (MSET) in the United States. The benchmarking exercise at the Idaho National Laboratory (INL) proved extremely valuable for improving the content and quality of the early versions of the questionnaire. Members of the INL benchmark team identified many areas of the questionnaire where questions should be clarified and areas where additional questions should be incorporated. The questionnaire addresses all elements of the MC&A system. Specific parts pertain to the foundation for the facility's overall MPC&A system, and other parts pertain to the specific functions of the operational MPC&A system. The questionnaire includes performance metrics for each of the basic functions or tasks performed in the operational MPC&A system. All of those basic functions or tasks are represented as basic events in the MPC&A fault tree. Performance metrics are to be used during completion of the questionnaire to report what is actually being done in relation to what should be done in the performance of MPC&A functions.« less
Implementation of Benchmarking Transportation Logistics Practices and Future Benchmarking Organizations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thrower, A.W.; Patric, J.; Keister, M.

2008-07-01

The purpose of the Office of Civilian Radioactive Waste Management's (OCRWM) Logistics Benchmarking Project is to identify established government and industry practices for the safe transportation of hazardous materials which can serve as a yardstick for design and operation of OCRWM's national transportation system for shipping spent nuclear fuel and high-level radioactive waste to the proposed repository at Yucca Mountain, Nevada. The project will present logistics and transportation practices and develop implementation recommendations for adaptation by the national transportation system. This paper will describe the process used to perform the initial benchmarking study, highlight interim findings, and explain how thesemore » findings are being implemented. It will also provide an overview of the next phase of benchmarking studies. The benchmarking effort will remain a high-priority activity throughout the planning and operational phases of the transportation system. The initial phase of the project focused on government transportation programs to identify those practices which are most clearly applicable to OCRWM. These Federal programs have decades of safe transportation experience, strive for excellence in operations, and implement effective stakeholder involvement, all of which parallel OCRWM's transportation mission and vision. The initial benchmarking project focused on four business processes that are critical to OCRWM's mission success, and can be incorporated into OCRWM planning and preparation in the near term. The processes examined were: transportation business model, contract management/out-sourcing, stakeholder relations, and contingency planning. More recently, OCRWM examined logistics operations of AREVA NC's Business Unit Logistics in France. The next phase of benchmarking will focus on integrated domestic and international commercial radioactive logistic operations. The prospective companies represent large scale shippers and have vast experience in safely and efficiently shipping spent nuclear fuel and other radioactive materials. Additional business processes may be examined in this phase. The findings of these benchmarking efforts will help determine the organizational structure and requirements of the national transportation system. (authors)« less
OPTIMIZATION OF DEEP DRILLING PERFORMANCE--DEVELOPMENT AND BENCHMARK TESTING OF ADVANCED DIAMOND PRODUCT DRILL BITS & HP/HT FLUIDS TO SIGNIFICANTLY IMPROVE RATES OF PENETRATION

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alan Black; Arnis Judzis

2004-10-01

The industry cost shared program aims to benchmark drilling rates of penetration in selected simulated deep formations and to significantly improve ROP through a team development of aggressive diamond product drill bit--fluid system technologies. Overall the objectives are as follows: Phase 1--Benchmark ''best in class'' diamond and other product drilling bits and fluids and develop concepts for a next level of deep drilling performance; Phase 2--Develop advanced smart bit-fluid prototypes and test at large scale; and Phase 3--Field trial smart bit-fluid concepts, modify as necessary and commercialize products. As of report date, TerraTek has concluded all major preparations for themore » high pressure drilling campaign. Baker Hughes encountered difficulties in providing additional pumping capacity before TerraTek's scheduled relocation to another facility, thus the program was delayed further to accommodate the full testing program.« less

Comparing the performance of two CBIRS indexing schemes

NASA Astrophysics Data System (ADS)

Mueller, Wolfgang; Robbert, Guenter; Henrich, Andreas

2003-01-01

Content based image retrieval (CBIR) as it is known today has to deal with a number of challenges. Quickly summarized, the main challenges are firstly, to bridge the semantic gap between high-level concepts and low-level features using feedback, secondly to provide performance under adverse conditions. High-dimensional spaces, as well as a demanding machine learning task make the right way of indexing an important issue. When indexing multimedia data, most groups opt for extraction of high-dimensional feature vectors from the data, followed by dimensionality reduction like PCA (Principal Components Analysis) or LSI (Latent Semantic Indexing). The resulting vectors are indexed using spatial indexing structures such as kd-trees or R-trees, for example. Other projects, such as MARS and Viper propose the adaptation of text indexing techniques, notably the inverted file. Here, the Viper system is the most direct adaptation of text retrieval techniques to quantized vectors. However, while the Viper query engine provides decent performance together with impressive user-feedback behavior, as well as the possibility for easy integration of long-term learning algorithms, and support for potentially infinite feature vectors, there has been no comparison of vector-based methods and inverted-file-based methods under similar conditions. In this publication, we compare a CBIR query engine that uses inverted files (Bothrops, a rewrite of the Viper query engine based on a relational database), and a CBIR query engine based on LSD (Local Split Decision) trees for spatial indexing using the same feature sets. The Benchathlon initiative works on providing a set of images and ground truth for simulating image queries by example and corresponding user feedback. When performing the Benchathlon benchmark on a CBIR system (the System Under Test, SUT), a benchmarking harness connects over internet to the SUT, performing a number of queries using an agreed-upon protocol, the multimedia retrieval markup language (MRML). Using this benchmark one can measure the quality of retrieval, as well as the overall (speed) performance of the benchmarked system. Our Benchmarks will draw on the Benchathlon"s work for documenting the retrieval performance of both inverted file-based and LSD tree based techniques. However in addition to these results, we will present statistics, that can be obtained only inside the system under test. These statistics will include the number of complex mathematical operations, as well as the amount of data that has to be read from disk during operation of a query.
Toward real-time performance benchmarks for Ada

NASA Technical Reports Server (NTRS)

Clapp, Russell M.; Duchesneau, Louis; Volz, Richard A.; Mudge, Trevor N.; Schultze, Timothy

1986-01-01

The issue of real-time performance measurements for the Ada programming language through the use of benchmarks is addressed. First, the Ada notion of time is examined and a set of basic measurement techniques are developed. Then a set of Ada language features believed to be important for real-time performance are presented and specific measurement methods discussed. In addition, other important time related features which are not explicitly part of the language but are part of the run-time related features which are not explicitly part of the language but are part of the run-time system are also identified and measurement techniques developed. The measurement techniques are applied to the language and run-time system features and the results are presented.
[Benchmarking of performance of Mexican states with effective coverage].

PubMed

Lozano, Rafael; Soliz, Patricia; Gakidou, Emmanuela; Abbott-Klafter, Jesse; Feehan, Dennis M; Vidal, Cecilia; Ortiz, Juan Pablo; Murray, Christopher J L

2007-01-01

Benchmarking of the performance of states, provinces, or districts in a decentralised health system is important for fostering of accountability, monitoring of progress, identification of determinants of success and failure, and creation of a culture of evidence. The Mexican Ministry of Health has, since 2001, used a benchmarking approach based on the World Health Organization (WHO) concept of effective coverage of an intervention, which is defined as the proportion of potential health gain that could be delivered by the health system to that which is actually delivered. Using data collection systems, including state representative examination surveys, vital registration, and hospital discharge registries, we have monitored the delivery of 14 interventions for 2005-06. Overall effective coverage ranges from 54.0% in Chiapas, a poor state, to 65.1% in the Federal District. Effective coverage for maternal and child health interventions is substantially higher than that for interventions that target other health problems. Effective coverage for the lowest wealth quintile is 52% compared with 61% for the highest quintile. Effective coverage is closely related to public-health spending per head across states; this relation is stronger for interventions that are not related to maternal and child health than those for maternal and child health. Considerable variation also exists in effective coverage at similar amounts of spending. We discuss the implications of these issues for the further development of the Mexican health-information system. Benchmarking of performance by measuring effective coverage encourages decision-makers to focus on quality service provision, not only service availability. The effective coverage calculation is an important device for health-system stewardship. In adopting this approach, other countries should select interventions to be measured on the basis of the criteria of affordability, effect on population health, effect on health inequalities, and capacity to measure the effects of the intervention. The national institutions undertaking this benchmarking must have the mandate, skills, resources, and independence to succeed.
Benchmarking of performance of Mexican states with effective coverage.

PubMed

Lozano, Rafael; Soliz, Patricia; Gakidou, Emmanuela; Abbott-Klafter, Jesse; Feehan, Dennis M; Vidal, Cecilia; Ortiz, Juan Pablo; Murray, Christopher J L

2006-11-11

Benchmarking of the performance of states, provinces, or districts in a decentralised health system is important for fostering of accountability, monitoring of progress, identification of determinants of success and failure, and creation of a culture of evidence. The Mexican Ministry of Health has, since 2001, used a benchmarking approach based on the WHO concept of effective coverage of an intervention, which is defined as the proportion of potential health gain that could be delivered by the health system to that which is actually delivered. Using data collection systems, including state representative examination surveys, vital registration, and hospital discharge registries, we have monitored the delivery of 14 interventions for 2005-06. Overall effective coverage ranges from 54.0% in Chiapas, a poor state, to 65.1% in the Federal District. Effective coverage for maternal and child health interventions is substantially higher than that for interventions that target other health problems. Effective coverage for the lowest wealth quintile is 52% compared with 61% for the highest quintile. Effective coverage is closely related to public-health spending per head across states; this relation is stronger for interventions that are not related to maternal and child health than those for maternal and child health. Considerable variation also exists in effective coverage at similar amounts of spending. We discuss the implications of these issues for the further development of the Mexican health-information system. Benchmarking of performance by measuring effective coverage encourages decision-makers to focus on quality service provision, not only service availability. The effective coverage calculation is an important device for health-system stewardship. In adopting this approach, other countries should select interventions to be measured on the basis of the criteria of affordability, effect on population health, effect on health inequalities, and capacity to measure the effects of the intervention. The national institutions undertaking this benchmarking must have the mandate, skills, resources, and independence to succeed.
Benchmarking short sequence mapping tools

PubMed Central

2013-01-01

Background The development of next-generation sequencing instruments has led to the generation of millions of short sequences in a single run. The process of aligning these reads to a reference genome is time consuming and demands the development of fast and accurate alignment tools. However, the current proposed tools make different compromises between the accuracy and the speed of mapping. Moreover, many important aspects are overlooked while comparing the performance of a newly developed tool to the state of the art. Therefore, there is a need for an objective evaluation method that covers all the aspects. In this work, we introduce a benchmarking suite to extensively analyze sequencing tools with respect to various aspects and provide an objective comparison. Results We applied our benchmarking tests on 9 well known mapping tools, namely, Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST) using synthetic data and real RNA-Seq data. MAQ and RMAP are based on building hash tables for the reads, whereas the remaining tools are based on indexing the reference genome. The benchmarking tests reveal the strengths and weaknesses of each tool. The results show that no single tool outperforms all others in all metrics. However, Bowtie maintained the best throughput for most of the tests while BWA performed better for longer read lengths. The benchmarking tests are not restricted to the mentioned tools and can be further applied to others. Conclusion The mapping process is still a hard problem that is affected by many factors. In this work, we provided a benchmarking suite that reveals and evaluates the different factors affecting the mapping process. Still, there is no tool that outperforms all of the others in all the tests. Therefore, the end user should clearly specify his needs in order to choose the tool that provides the best results. PMID:23758764
OPTIMIZATION OF MUD HAMMER DRILLING PERFORMANCE - A PROGRAM TO BENCHMARK THE VIABILITY OF ADVANCED MUD HAMMER DRILLING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alan Black; Arnis Judzis

2003-01-01

Progress during current reporting year 2002 by quarter--Progress during Q1 2002: (1) In accordance to Task 7.0 (D. No.2 Technical Publications) TerraTek, NETL, and the Industry Contributors successfully presented a paper detailing Phase 1 testing results at the February 2002 IADC/SPE Drilling Conference, a prestigious venue for presenting DOE and private sector drilling technology advances. The full reference is as follows: IADC/SPE 74540 ''World's First Benchmarking of Drilling Mud Hammer Performance at Depth Conditions'' authored by Gordon A. Tibbitts, TerraTek; Roy C. Long, US Department of Energy, Brian E. Miller, BP America, Inc.; Arnis Judzis, TerraTek; and Alan D. Black,more » TerraTek. Gordon Tibbitts, TerraTek, will presented the well-attended paper in February of 2002. The full text of the Mud Hammer paper was included in the last quarterly report. (2) The Phase 2 project planning meeting (Task 6) was held at ExxonMobil's Houston Greenspoint offices on February 22, 2002. In attendance were representatives from TerraTek, DOE, BP, ExxonMobil, PDVSA, Novatek, and SDS Digger Tools. (3) PDVSA has joined the advisory board to this DOE mud hammer project. PDVSA's commitment of cash and in-kind contributions were reported during the last quarter. (4) Strong Industry support remains for the DOE project. Both Andergauge and Smith Tools have expressed an interest in participating in the ''optimization'' phase of the program. The potential for increased testing with additional Industry cash support was discussed at the planning meeting in February 2002. Progress during Q2 2002: (1) Presentation material was provided to the DOE/NETL project manager (Dr. John Rogers) for the DOE exhibit at the 2002 Offshore Technology Conference. (2) Two meeting at Smith International and one at Andergauge in Houston were held to investigate their interest in joining the Mud Hammer Performance study. (3) SDS Digger Tools (Task 3 Benchmarking participant) apparently has not negotiated a commercial deal with Halliburton on the supply of fluid hammers to the oil and gas business. (4) TerraTek is awaiting progress by Novatek (a DOE contractor) on the redesign and development of their next hammer tool. Their delay will require an extension to TerraTek's contracted program. (5) Smith International has sufficient interest in the program to start engineering and chroming of collars for testing at TerraTek. (6) Shell's Brian Tarr has agreed to join the Industry Advisory Group for the DOE project. The addition of Brian Tarr is welcomed as he has numerous years of experience with the Novatek tool and was involved in the early tests in Europe while with Mobil Oil. (7) Conoco's field trial of the Smith fluid hammer for an application in Vietnam was organized and has contributed to the increased interest in their tool. Progress during Q3 2002: (1) Smith International agreed to participate in the DOE Mud Hammer program. (2) Smith International chromed collars for upcoming benchmark tests at TerraTek, now scheduled for 4Q 2002. (3) ConocoPhillips had a field trial of the Smith fluid hammer offshore Vietnam. The hammer functioned properly, though the well encountered hole conditions and reaming problems. ConocoPhillips plan another field trial as a result. (4) DOE/NETL extended the contract for the fluid hammer program to allow Novatek to ''optimize'' their much delayed tool to 2003 and to allow Smith International to add ''benchmarking'' tests in light of SDS Digger Tools' current financial inability to participate. (5) ConocoPhillips joined the Industry Advisors for the mud hammer program. Progress during Q4 2002: (1) Smith International participated in the DOE Mud Hammer program through full scale benchmarking testing during the week of 4 November 2003. (2) TerraTek acknowledges Smith International, BP America, PDVSA, and ConocoPhillips for cost-sharing the Smith benchmarking tests allowing extension of the contract to add to the benchmarking testing program. (3) Following the benchmark testing of the Smith International hammer, representatives from DOE/NETL, TerraTek, Smith International and PDVSA met at TerraTek in Salt Lake City to review observations, performance and views on the optimization step for 2003. (4) The December 2002 issue of Journal of Petroleum Technology (Society of Petroleum Engineers) highlighted the DOE fluid hammer testing program and reviewed last years paper on the benchmark performance of the SDS Digger and Novatek hammers. (5) TerraTek's Sid Green presented a technical review for DOE/NETL personnel in Morgantown on ''Impact Rock Breakage'' and its importance on improving fluid hammer performance. Much discussion has taken place on the issues surrounding mud hammer performance at depth conditions.« less
Classifying Higher Education Institutions in Korea: A Performance-Based Approach

ERIC Educational Resources Information Center

Shin, Jung Cheol

2009-01-01

The purpose of this study was to classify higher education institutions according to institutional performance rather than predetermined benchmarks. Institutional performance was defined as research performance and classified using Hierarchical Cluster Analysis, a statistical method that classifies objects according to specified classification…
Principles for Developing Benchmark Criteria for Staff Training in Responsible Gambling.

PubMed

Oehler, Stefan; Banzer, Raphaela; Gruenerbl, Agnes; Malischnig, Doris; Griffiths, Mark D; Haring, Christian

2017-03-01

One approach to minimizing the negative consequences of excessive gambling is staff training to reduce the rate of the development of new cases of harm or disorder within their customers. The primary goal of the present study was to assess suitable benchmark criteria for the training of gambling employees at casinos and lottery retailers. The study utilised the Delphi Method, a survey with one qualitative and two quantitative phases. A total of 21 invited international experts in the responsible gambling field participated in all three phases. A total of 75 performance indicators were outlined and assigned to six categories: (1) criteria of content, (2) modelling, (3) qualification of trainer, (4) framework conditions, (5) sustainability and (6) statistical indicators. Nine of the 75 indicators were rated as very important by 90 % or more of the experts. Unanimous support for importance was given to indicators such as (1) comprehensibility and (2) concrete action-guidance for handling with problem gamblers, Additionally, the study examined the implementation of benchmarking, when it should be conducted, and who should be responsible. Results indicated that benchmarking should be conducted every 1-2 years regularly and that one institution should be clearly defined and primarily responsible for benchmarking. The results of the present study provide the basis for developing a benchmarking for staff training in responsible gambling.
A Simplified Approach for the Rapid Generation of Transient Heat-Shield Environments

NASA Technical Reports Server (NTRS)

Wurster, Kathryn E.; Zoby, E. Vincent; Mills, Janelle C.; Kamhawi, Hilmi

2007-01-01

A simplified approach has been developed whereby transient entry heating environments are reliably predicted based upon a limited set of benchmark radiative and convective solutions. Heating, pressure and shear-stress levels, non-dimensionalized by an appropriate parameter at each benchmark condition are applied throughout the entry profile. This approach was shown to be valid based on the observation that the fully catalytic, laminar distributions examined were relatively insensitive to altitude as well as velocity throughout the regime of significant heating. In order to establish a best prediction by which to judge the results that can be obtained using a very limited benchmark set, predictions based on a series of benchmark cases along a trajectory are used. Solutions which rely only on the limited benchmark set, ideally in the neighborhood of peak heating, are compared against the resultant transient heating rates and total heat loads from the best prediction. Predictions based on using two or fewer benchmark cases at or near the trajectory peak heating condition, yielded results to within 5-10 percent of the best predictions. Thus, the method provides transient heating environments over the heat-shield face with sufficient resolution and accuracy for thermal protection system design and also offers a significant capability to perform rapid trade studies such as the effect of different trajectories, atmospheres, or trim angle of attack, on convective and radiative heating rates and loads, pressure, and shear-stress levels.
MoleculeNet: a benchmark for molecular machine learning† †Electronic supplementary information (ESI) available. See DOI: 10.1039/c7sc02664a

PubMed Central

Wu, Zhenqin; Ramsundar, Bharath; Feinberg, Evan N.; Gomes, Joseph; Geniesse, Caleb; Pappu, Aneesh S.; Leswing, Karl

2017-01-01

Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular properties. However, algorithmic progress has been limited due to the lack of a standard benchmark to compare the efficacy of proposed methods; most new algorithms are benchmarked on different datasets making it challenging to gauge the quality of proposed methods. This work introduces MoleculeNet, a large scale benchmark for molecular machine learning. MoleculeNet curates multiple public datasets, establishes metrics for evaluation, and offers high quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms (released as part of the DeepChem open source library). MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance. However, this result comes with caveats. Learnable representations still struggle to deal with complex tasks under data scarcity and highly imbalanced classification. For quantum mechanical and biophysical datasets, the use of physics-aware featurizations can be more important than choice of particular learning algorithm. PMID:29629118
A new enhanced index tracking model in portfolio optimization with sum weighted approach

NASA Astrophysics Data System (ADS)

Siew, Lam Weng; Jaaman, Saiful Hafizah; Hoe, Lam Weng

2017-04-01

Index tracking is a portfolio management which aims to construct the optimal portfolio to achieve similar return with the benchmark index return at minimum tracking error without purchasing all the stocks that make up the index. Enhanced index tracking is an improved portfolio management which aims to generate higher portfolio return than the benchmark index return besides minimizing the tracking error. The objective of this paper is to propose a new enhanced index tracking model with sum weighted approach to improve the existing index tracking model for tracking the benchmark Technology Index in Malaysia. The optimal portfolio composition and performance of both models are determined and compared in terms of portfolio mean return, tracking error and information ratio. The results of this study show that the optimal portfolio of the proposed model is able to generate higher mean return than the benchmark index at minimum tracking error. Besides that, the proposed model is able to outperform the existing model in tracking the benchmark index. The significance of this study is to propose a new enhanced index tracking model with sum weighted apporach which contributes 67% improvement on the portfolio mean return as compared to the existing model.
Evaluation of Neutron Radiography Reactor LEU-Core Start-Up Measurements

DOE PAGES

Bess, John D.; Maddock, Thomas L.; Smolinski, Andrew T.; ...

2014-11-04

Benchmark models were developed to evaluate the cold-critical start-up measurements performed during the fresh core reload of the Neutron Radiography (NRAD) reactor with Low Enriched Uranium (LEU) fuel. Experiments include criticality, control-rod worth measurements, shutdown margin, and excess reactivity for four core loadings with 56, 60, 62, and 64 fuel elements. The worth of four graphite reflector block assemblies and an empty dry tube used for experiment irradiations were also measured and evaluated for the 60-fuel-element core configuration. Dominant uncertainties in the experimental k eff come from uncertainties in the manganese content and impurities in the stainless steel fuel claddingmore » as well as the 236U and erbium poison content in the fuel matrix. Calculations with MCNP5 and ENDF/B-VII.0 neutron nuclear data are approximately 1.4% (9σ) greater than the benchmark model eigenvalues, which is commonly seen in Monte Carlo simulations of other TRIGA reactors. Simulations of the worth measurements are within the 2σ uncertainty for most of the benchmark experiment worth values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.« less
Evaluation of Neutron Radiography Reactor LEU-Core Start-Up Measurements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Maddock, Thomas L.; Smolinski, Andrew T.

Benchmark models were developed to evaluate the cold-critical start-up measurements performed during the fresh core reload of the Neutron Radiography (NRAD) reactor with Low Enriched Uranium (LEU) fuel. Experiments include criticality, control-rod worth measurements, shutdown margin, and excess reactivity for four core loadings with 56, 60, 62, and 64 fuel elements. The worth of four graphite reflector block assemblies and an empty dry tube used for experiment irradiations were also measured and evaluated for the 60-fuel-element core configuration. Dominant uncertainties in the experimental k eff come from uncertainties in the manganese content and impurities in the stainless steel fuel claddingmore » as well as the 236U and erbium poison content in the fuel matrix. Calculations with MCNP5 and ENDF/B-VII.0 neutron nuclear data are approximately 1.4% (9σ) greater than the benchmark model eigenvalues, which is commonly seen in Monte Carlo simulations of other TRIGA reactors. Simulations of the worth measurements are within the 2σ uncertainty for most of the benchmark experiment worth values. The complete benchmark evaluation details are available in the 2014 edition of the International Handbook of Evaluated Reactor Physics Benchmark Experiments.« less
Global benchmarking of medical student learning outcomes? Implementation and pilot results of the International Foundations of Medicine Clinical Sciences Exam at The University of Queensland, Australia.

PubMed

Wilkinson, David; Schafer, Jennifer; Hewett, David; Eley, Diann; Swanson, Dave

2014-01-01

To report pilot results for international benchmarking of learning outcomes among 426 final year medical students at the University of Queensland (UQ), Australia. Students took the International Foundations of Medicine (IFOM) Clinical Sciences Exam (CSE) developed by the National Board of Medical Examiners, USA, as a required formative assessment. IFOM CSE comprises 160 multiple-choice questions in medicine, surgery, obstetrics, paediatrics and mental health, taken over 4.5 hours. Significant implementation issues; IFOM scores and benchmarking with International Comparison Group (ICG) scores and United States Medical Licensing Exam (USMLE) Step 2 Clinical Knowledge (CK) scores; and correlation with UQ medical degree cumulative grade point average (GPA). Implementation as an online exam, under university-mandated conditions was successful. Mean IFOM score was 531.3 (maximum 779-minimum 200). The UQ cohort performed better (31% scored below 500) than the ICG (55% below 500). However 49% of the UQ cohort did not meet the USMLE Step 2 CK minimum score. Correlation between IFOM scores and UQ cumulative GPA was reasonable at 0.552 (p < 0.001). International benchmarking is feasible and provides a variety of useful benchmarking opportunities.
Unstructured Adaptive Meshes: Bad for Your Memory?

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Feng, Hui-Yu; VanderWijngaart, Rob

2003-01-01

This viewgraph presentation explores the need for a NASA Advanced Supercomputing (NAS) parallel benchmark for problems with irregular dynamical memory access. This benchmark is important and necessary because: 1) Problems with localized error source benefit from adaptive nonuniform meshes; 2) Certain machines perform poorly on such problems; 3) Parallel implementation may provide further performance improvement but is difficult. Some examples of problems which use irregular dynamical memory access include: 1) Heat transfer problem; 2) Heat source term; 3) Spectral element method; 4) Base functions; 5) Elemental discrete equations; 6) Global discrete equations. Nonconforming Mesh and Mortar Element Method are covered in greater detail in this presentation.
A Comparison of Automatic Parallelization Tools/Compilers on the SGI Origin 2000 Using the NAS Benchmarks

NASA Technical Reports Server (NTRS)

Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry

1998-01-01

Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand written NAB Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools: an interactive computer aided parallelization too] that generates message passing code, 2) the Portland Group's HPF compiler and 3) using compiler directives with the native FORTAN77 compiler on the SGI Origin2000.
District Heating Systems Performance Analyses. Heat Energy Tariff

NASA Astrophysics Data System (ADS)

Ziemele, Jelena; Vigants, Girts; Vitolins, Valdis; Blumberga, Dagnija; Veidenbergs, Ivars

2014-12-01

The paper addresses an important element of the European energy sector: the evaluation of district heating (DH) system operations from the standpoint of increasing energy efficiency and increasing the use of renewable energy resources. This has been done by developing a new methodology for the evaluation of the heat tariff. The paper presents an algorithm of this methodology, which includes not only a data base and calculation equation systems, but also an integrated multi-criteria analysis module using MADM/MCDM (Multi-Attribute Decision Making / Multi-Criteria Decision Making) based on TOPSIS (Technique for Order Performance by Similarity to Ideal Solution). The results of the multi-criteria analysis are used to set the tariff benchmarks. The evaluation methodology has been tested for Latvian heat tariffs, and the obtained results show that only half of heating companies reach a benchmark value equal to 0.5 for the efficiency closeness to the ideal solution indicator. This means that the proposed evaluation methodology would not only allow companies to determine how they perform with regard to the proposed benchmark, but also to identify their need to restructure so that they may reach the level of a low-carbon business.
Benchmarking health IT among OECD countries: better data for better policy

PubMed Central

Adler-Milstein, Julia; Ronchi, Elettra; Cohen, Genna R; Winn, Laura A Pannella; Jha, Ashish K

2014-01-01

Objective To develop benchmark measures of health information and communication technology (ICT) use to facilitate cross-country comparisons and learning. Materials and methods The effort is led by the Organisation for Economic Co-operation and Development (OECD). Approaches to definition and measurement within four ICT domains were compared across seven OECD countries in order to identify functionalities in each domain. These informed a set of functionality-based benchmark measures, which were refined in collaboration with representatives from more than 20 OECD and non-OECD countries. We report on progress to date and remaining work to enable countries to begin to collect benchmark data. Results The four benchmarking domains include provider-centric electronic record, patient-centric electronic record, health information exchange, and tele-health. There was broad agreement on functionalities in the provider-centric electronic record domain (eg, entry of core patient data, decision support), and less agreement in the other three domains in which country representatives worked to select benchmark functionalities. Discussion Many countries are working to implement ICTs to improve healthcare system performance. Although many countries are looking to others as potential models, the lack of consistent terminology and approach has made cross-national comparisons and learning difficult. Conclusions As countries develop and implement strategies to increase the use of ICTs to promote health goals, there is a historic opportunity to enable cross-country learning. To facilitate this learning and reduce the chances that individual countries flounder, a common understanding of health ICT adoption and use is needed. The OECD-led benchmarking process is a crucial step towards achieving this. PMID:23721983
Contributions to Integral Nuclear Data in ICSBEP and IRPhEP since ND 2013

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bess, John D.; Briggs, J. Blair; Gulliford, Jim

2016-09-01

The status of the International Criticality Safety Benchmark Evaluation Project (ICSBEP) and the International Reactor Physics Experiment Evaluation Project (IRPhEP) was last discussed directly with the international nuclear data community at ND2013. Since ND2013, integral benchmark data that are available for nuclear data testing has continued to increase. The status of the international benchmark efforts and the latest contributions to integral nuclear data for testing is discussed. Select benchmark configurations that have been added to the ICSBEP and IRPhEP Handbooks since ND2013 are highlighted. The 2015 edition of the ICSBEP Handbook now contains 567 evaluations with benchmark specifications for 4,874more » critical, near-critical, or subcritical configurations, 31 criticality alarm placement/shielding configuration with multiple dose points apiece, and 207 configurations that have been categorized as fundamental physics measurements that are relevant to criticality safety applications. The 2015 edition of the IRPhEP Handbook contains data from 143 different experimental series that were performed at 50 different nuclear facilities. Currently 139 of the 143 evaluations are published as approved benchmarks with the remaining four evaluations published in draft format only. Measurements found in the IRPhEP Handbook include criticality, buckling and extrapolation length, spectral characteristics, reactivity effects, reactivity coefficients, kinetics, reaction-rate distributions, power distributions, isotopic compositions, and/or other miscellaneous types of measurements for various types of reactor systems. Annual technical review meetings for both projects were held in April 2016; additional approved benchmark evaluations will be included in the 2016 editions of these handbooks.« less
Benchmarking health IT among OECD countries: better data for better policy.

PubMed

Adler-Milstein, Julia; Ronchi, Elettra; Cohen, Genna R; Winn, Laura A Pannella; Jha, Ashish K

2014-01-01

To develop benchmark measures of health information and communication technology (ICT) use to facilitate cross-country comparisons and learning. The effort is led by the Organisation for Economic Co-operation and Development (OECD). Approaches to definition and measurement within four ICT domains were compared across seven OECD countries in order to identify functionalities in each domain. These informed a set of functionality-based benchmark measures, which were refined in collaboration with representatives from more than 20 OECD and non-OECD countries. We report on progress to date and remaining work to enable countries to begin to collect benchmark data. The four benchmarking domains include provider-centric electronic record, patient-centric electronic record, health information exchange, and tele-health. There was broad agreement on functionalities in the provider-centric electronic record domain (eg, entry of core patient data, decision support), and less agreement in the other three domains in which country representatives worked to select benchmark functionalities. Many countries are working to implement ICTs to improve healthcare system performance. Although many countries are looking to others as potential models, the lack of consistent terminology and approach has made cross-national comparisons and learning difficult. As countries develop and implement strategies to increase the use of ICTs to promote health goals, there is a historic opportunity to enable cross-country learning. To facilitate this learning and reduce the chances that individual countries flounder, a common understanding of health ICT adoption and use is needed. The OECD-led benchmarking process is a crucial step towards achieving this.

Vendors' Corner: System Performance. A Forum.

ERIC Educational Resources Information Center

Sugnet, Chris, Ed.

1986-01-01

Representatives of six vendors of online library systems address key issues related to system performance: designing, configuring and sizing systems; establishing performance criteria; using benchmark and acceptance tests; the risks of miscalculations; vendor, consultant, and library roles; and related topics. (Author/EM)
The grout/glass performance assessment code system (GPACS) with verification and benchmarking

DOE Office of Scientific and Technical Information (OSTI.GOV)

Piepho, M.G.; Sutherland, W.H.; Rittmann, P.D.

1994-12-01

GPACS is a computer code system for calculating water flow (unsaturated or saturated), solute transport, and human doses due to the slow release of contaminants from a waste form (in particular grout or glass) through an engineered system and through a vadose zone to an aquifer, well and river. This dual-purpose document is intended to serve as a user`s guide and verification/benchmark document for the Grout/Glass Performance Assessment Code system (GPACS). GPACS can be used for low-level-waste (LLW) Glass Performance Assessment and many other applications including other low-level-waste performance assessments and risk assessments. Based on all the cses presented, GPACSmore » is adequate (verified) for calculating water flow and contaminant transport in unsaturated-zone sediments and for calculating human doses via the groundwater pathway.« less
Comparing and contrasting poverty reduction performance of social welfare programs across jurisdictions in Canada using Data Envelopment Analysis (DEA): an exploratory study of the era of devolution.

PubMed

Habibov, Nazim N; Fan, Lida

2010-11-01

In the mid-1990s, the responsibilities to design, implement, and evaluate social welfare programs were transferred from federal to local jurisdictions in many countries of North America and Europe through devolution processes. Devolution has caused the need for a technique to measure and compare the performances of social welfare programs across multiple jurisdictions. This paper utilizes Data Envelopment Analysis (DEA) for a comparison of poverty reduction performances of jurisdictional social welfare programs across Canadian provinces. From the theoretical perspective, findings of this paper demonstrates that DEA is a promising method to evaluate, compare, and benchmark poverty reduction performance across multiple jurisdictions using multiple inputs and outputs. This paper demonstrates that DEA generates easy to comprehend composite rankings of provincial performances, identifies appropriate benchmarks for each inefficient province, and estimates sources and amounts of improvement needed to make the provinces efficient. From a practical perspective the empirical results presented in this paper indicate that Newfoundland, Prince Edwards Island, and Alberta achieve better efficiency in poverty reduction than other provinces. Policy makers and social administrators of the ineffective provinces across Canada may find benefit in selecting one of the effective provinces as a benchmark for improving their own performance based on similar size and structure of population, size of the budget for social programs, and traditions with administering particular types of social programs. Copyright (c) 2009 Elsevier Ltd. All rights reserved.
Population-Attributable Risk Proportion of Clinical Risk Factors for Breast Cancer.

PubMed

Engmann, Natalie J; Golmakani, Marzieh K; Miglioretti, Diana L; Sprague, Brian L; Kerlikowske, Karla

2017-09-01

Many established breast cancer risk factors are used in clinical risk prediction models, although the proportion of breast cancers explained by these factors is unknown. To determine the population-attributable risk proportion (PARP) for breast cancer associated with clinical breast cancer risk factors among premenopausal and postmenopausal women. Case-control study with 1:10 matching on age, year of risk factor assessment, and Breast Cancer Surveillance Consortium (BCSC) registry. Risk factor data were collected prospectively from January 1, 1996, through October 31, 2012, from BCSC community-based breast imaging facilities. A total of 18 437 women with invasive breast cancer or ductal carcinoma in situ were enrolled as cases and matched to 184 309 women without breast cancer, with a total of 58 146 premenopausal and 144 600 postmenopausal women enrolled in the study. Breast Imaging Reporting and Data System (BI-RADS) breast density (heterogeneously or extremely dense vs scattered fibroglandular densities), first-degree family history of breast cancer, body mass index (>25 vs 18.5-25), history of benign breast biopsy, and nulliparity or age at first birth (≥30 years vs <30 years). Population-attributable risk proportion of breast cancer. Of the 18 437 women with breast cancer, the mean (SD) age was 46.3 (3.7) years among premenopausal women and 61.7 (7.2) years among the postmenopausal women. Overall, 4747 (89.8%) premenopausal and 12 502 (95.1%) postmenopausal women with breast cancer had at least 1 breast cancer risk factor. The combined PARP of all risk factors was 52.7% (95% CI, 49.1%-56.3%) among premenopausal women and 54.7% (95% CI, 46.5%-54.7%) among postmenopausal women. Breast density was the most prevalent risk factor for both premenopausal and postmenopausal women and had the largest effect on the PARP; 39.3% (95% CI, 36.6%-42.0%) of premenopausal and 26.2% (95% CI, 24.4%-28.0%) of postmenopausal breast cancers could potentially be averted if all women with heterogeneously or extremely dense breasts shifted to scattered fibroglandular breast density. Among postmenopausal women, 22.8% (95% CI, 18.3%-27.3%) of breast cancers could potentially be averted if all overweight and obese women attained a body mass index of less than 25. Most women with breast cancer have at least 1 breast cancer risk factor routinely documented at the time of mammography, and more than half of premenopausal and postmenopausal breast cancers are explained by these factors. These easily assessed risk factors should be incorporated into risk prediction models to stratify breast cancer risk and promote risk-based screening and targeted prevention efforts.
Hsp27 participates in the maintenance of breast cancer stem cells through regulation of epithelial-mesenchymal transition and nuclear factor-κB

PubMed Central

2011-01-01

Introduction Heat shock proteins (HSPs) are normally induced under environmental stress to serve as chaperones for maintenance of correct protein folding but they are often overexpressed in many cancers, including breast cancer. The expression of Hsp27, an ATP-independent small HSP, is associated with cell migration and drug resistance of breast cancer cells. Breast cancer stem cells (BCSCs) have been identified as a subpopulation of breast cancer cells with markers of CD24-CD44+ or high intracellular aldehyde dehydrogenase activity (ALDH+) and proved to be associated with radiation resistance and metastasis. However, the involvement of Hsp27 in the maintenance of BCSC is largely unknown. Methods Mitogen-activated protein kinase antibody array and Western blot were used to discover the expression of Hsp27 and its phosphorylation in ALDH + BCSCs. To study the involvement of Hsp27 in BCSC biology, siRNA mediated gene silencing and quercetin treatment were used to inhibit Hsp27 expression and the characters of BCSCs, which include ALDH+ population, mammosphere formation and cell migration, were analyzed simultaneously. The tumorigenicity of breast cancer cells after knockdown of Hsp27 was analyzed by xenograftment assay in NOD/SCID mice. The epithelial-mesenchymal transition (EMT) of breast cancer cells was analyzed by wound-healing assay and Western blot of snail, vimentin and E-cadherin expression. The activation of nuclear factor kappa B (NF-κB) was analyzed by luciferase-based reporter assay and nuclear translocation. Results Hsp27 and its phosphorylation were increased in ALDH+ BCSCs in comparison with ALDH- non-BCSCs. Knockdown of Hsp27 in breast cancer cells decreased characters of BCSCs, such as ALDH+ population, mammosphere formation and cell migration. In addition, the in vivo CSC frequency could be diminished in Hsp27 knockdown breast cancer cells. The inhibitory effects could also be observed in cells treated with quercetin, a plant flavonoid inhibitor of Hsp27, and it could be reversed by overexpression of Hsp27. Knockdown of Hsp27 also suppressed EMT signatures, such as decreasing the expression of snail and vimentin and increasing the expression of E-cadherin. Furthermore, knockdown of Hsp27 decreased the nuclear translocation as well as the activity of NF-κB in ALDH + BCSCs, which resulted from increasing expression of IκBα. Restored activation of NF-κB by knockdown of IκBα could reverse the inhibitory effect of Hsp27 siRNA in suppression of ALDH+ cells. Conclusions Our data suggest that Hsp27 regulates the EMT process and NF-κB activity to contribute the maintenance of BCSCs. Targeting Hsp27 may be considered as a novel strategy in breast cancer therapy. PMID:22023707
Benchmarking of neutron production of heavy-ion transport codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Remec, I.; Ronningen, R. M.; Heilbronn, L.

Document available in abstract form only, full text of document follows: Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondarymore » neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required. (authors)« less
DE-NE0008277_PROTEUS final technical report 2018

DOE Office of Scientific and Technical Information (OSTI.GOV)

Enqvist, Andreas

This project details re-evaluations of experiments of gas-cooled fast reactor (GCFR) core designs performed in the 1970s at the PROTEUS reactor and create a series of International Reactor Physics Experiment Evaluation Project (IRPhEP) benchmarks. Currently there are no gas-cooled fast reactor (GCFR) experiments available in the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook). These experiments are excellent candidates for reanalysis and development of multiple benchmarks because these experiments provide high-quality integral nuclear data relevant to the validation and refinement of thorium, neptunium, uranium, plutonium, iron, and graphite cross sections. It would be cost prohibitive to reproduce suchmore » a comprehensive suite of experimental data to support any future GCFR endeavors.« less
A benchmark for fault tolerant flight control evaluation

NASA Astrophysics Data System (ADS)

Smaili, H.; Breeman, J.; Lombaerts, T.; Stroosma, O.

2013-12-01

A large transport aircraft simulation benchmark (REconfigurable COntrol for Vehicle Emergency Return - RECOVER) has been developed within the GARTEUR (Group for Aeronautical Research and Technology in Europe) Flight Mechanics Action Group 16 (FM-AG(16)) on Fault Tolerant Control (2004 2008) for the integrated evaluation of fault detection and identification (FDI) and reconfigurable flight control strategies. The benchmark includes a suitable set of assessment criteria and failure cases, based on reconstructed accident scenarios, to assess the potential of new adaptive control strategies to improve aircraft survivability. The application of reconstruction and modeling techniques, based on accident flight data, has resulted in high-fidelity nonlinear aircraft and fault models to evaluate new Fault Tolerant Flight Control (FTFC) concepts and their real-time performance to accommodate in-flight failures.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sample, B.E. Opresko, D.M. Suter, G.W.

Ecological risks of environmental contaminants are evaluated by using a two-tiered process. In the first tier, a screening assessment is performed where concentrations of contaminants in the environment are compared to no observed adverse effects level (NOAEL)-based toxicological benchmarks. These benchmarks represent concentrations of chemicals (i.e., concentrations presumed to be nonhazardous to the biota) in environmental media (water, sediment, soil, food, etc.). While exceedance of these benchmarks does not indicate any particular level or type of risk, concentrations below the benchmarks should not result in significant effects. In practice, when contaminant concentrations in food or water resources are less thanmore » these toxicological benchmarks, the contaminants may be excluded from further consideration. However, if the concentration of a contaminant exceeds a benchmark, that contaminant should be retained as a contaminant of potential concern (COPC) and investigated further. The second tier in ecological risk assessment, the baseline ecological risk assessment, may use toxicological benchmarks as part of a weight-of-evidence approach (Suter 1993). Under this approach, based toxicological benchmarks are one of several lines of evidence used to support or refute the presence of ecological effects. Other sources of evidence include media toxicity tests, surveys of biota (abundance and diversity), measures of contaminant body burdens, and biomarkers. This report presents NOAEL- and lowest observed adverse effects level (LOAEL)-based toxicological benchmarks for assessment of effects of 85 chemicals on 9 representative mammalian wildlife species (short-tailed shrew, little brown bat, meadow vole, white-footed mouse, cottontail rabbit, mink, red fox, and whitetail deer) or 11 avian wildlife species (American robin, rough-winged swallow, American woodcock, wild turkey, belted kingfisher, great blue heron, barred owl, barn owl, Cooper's hawk, and red-tailed hawk, osprey) (scientific names for both the mammalian and avian species are presented in Appendix B). [In this document, NOAEL refers to both dose (mg contaminant per kg animal body weight per day) and concentration (mg contaminant per kg of food or L of drinking water)]. The 20 wildlife species were chosen because they are widely distributed and provide a representative range of body sizes and diets. The chemicals are some of those that occur at U.S. Department of Energy (DOE) waste sites. The NOAEL-based benchmarks presented in this report represent values believed to be nonhazardous for the listed wildlife species; LOAEL-based benchmarks represent threshold levels at which adverse effects are likely to become evident. These benchmarks consider contaminant exposure through oral ingestion of contaminated media only. Exposure through inhalation and/or direct dermal exposure are not considered in this report.« less
Laparoscopic recurrent inguinal hernia repair during the learning curve: it can be done?

PubMed

Bracale, Umberto; Sciuto, Antonio; Andreuccetti, Jacopo; Merola, Giovanni; Pecchia, Leandro; Melillo, Paolo; Pignata, Giusto

2017-01-01

Trans-Abdominal Preperitoneal Patch (TAPP) repairs for Recurrent Hernia (RH) is a technically demanding procedure. It has to be performed only by surgeons with extensive experience in the laparoscopic approach. The purpose of this study is to evaluate the surgical safety and the efficacy of TAPP for RH performed in a tutoring program by surgeons in practice (SP). All TAPP repairs for RH performed by the same surgical team have been included in the study. We have evaluated the results of three SP during their learning curve in a tutoring program. Then these results have been compared to those of a highly experienced laparoscopic surgeon (Benchmark). A total of 530 TAPP repairs have been performed. Among these, 83 TAPP have been executed for RH, of which 43 by the Benchmark and 40 by the SP. When we have compared the outcomes of the Benchmark with those of SP, no significant difference has been observed about morbidity and recurrence while the operative time has been significantly longer for the SP. No intraoperative complications have occurred. International guidelines urge that TAPP repair for RH has to be performed only by surgeons with extensive experience in the laparoscopic approach. The results of the present study demonstrate that TAPP for RH could be performed also by surgeons in training during a learning program. We retain that an adequate tutoring program could lead a surgeon in practice to perform more complex hernia procedures without jeopardizing patient safety throughout the learning curve period. Laparoscopy, Learning Curve, Recurrent Hernia.
Identifying key genes in glaucoma based on a benchmarked dataset and the gene regulatory network.

PubMed

Chen, Xi; Wang, Qiao-Ling; Zhang, Meng-Hui

2017-10-01

The current study aimed to identify key genes in glaucoma based on a benchmarked dataset and gene regulatory network (GRN). Local and global noise was added to the gene expression dataset to produce a benchmarked dataset. Differentially-expressed genes (DEGs) between patients with glaucoma and normal controls were identified utilizing the Linear Models for Microarray Data (Limma) package based on benchmarked dataset. A total of 5 GRN inference methods, including Zscore, GeneNet, context likelihood of relatedness (CLR) algorithm, Partial Correlation coefficient with Information Theory (PCIT) and GEne Network Inference with Ensemble of Trees (Genie3) were evaluated using receiver operating characteristic (ROC) and precision and recall (PR) curves. The interference method with the best performance was selected to construct the GRN. Subsequently, topological centrality (degree, closeness and betweenness) was conducted to identify key genes in the GRN of glaucoma. Finally, the key genes were validated by performing reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A total of 176 DEGs were detected from the benchmarked dataset. The ROC and PR curves of the 5 methods were analyzed and it was determined that Genie3 had a clear advantage over the other methods; thus, Genie3 was used to construct the GRN. Following topological centrality analysis, 14 key genes for glaucoma were identified, including IL6 , EPHA2 and GSTT1 and 5 of these 14 key genes were validated by RT-qPCR. Therefore, the current study identified 14 key genes in glaucoma, which may be potential biomarkers to use in the diagnosis of glaucoma and aid in identifying the molecular mechanism of this disease.
[Benchmark experiment to verify radiation transport calculations for dosimetry in radiation therapy].

PubMed

Renner, Franziska

2016-09-01

Monte Carlo simulations are regarded as the most accurate method of solving complex problems in the field of dosimetry and radiation transport. In (external) radiation therapy they are increasingly used for the calculation of dose distributions during treatment planning. In comparison to other algorithms for the calculation of dose distributions, Monte Carlo methods have the capability of improving the accuracy of dose calculations - especially under complex circumstances (e.g. consideration of inhomogeneities). However, there is a lack of knowledge of how accurate the results of Monte Carlo calculations are on an absolute basis. A practical verification of the calculations can be performed by direct comparison with the results of a benchmark experiment. This work presents such a benchmark experiment and compares its results (with detailed consideration of measurement uncertainty) with the results of Monte Carlo calculations using the well-established Monte Carlo code EGSnrc. The experiment was designed to have parallels to external beam radiation therapy with respect to the type and energy of the radiation, the materials used and the kind of dose measurement. Because the properties of the beam have to be well known in order to compare the results of the experiment and the simulation on an absolute basis, the benchmark experiment was performed using the research electron accelerator of the Physikalisch-Technische Bundesanstalt (PTB), whose beam was accurately characterized in advance. The benchmark experiment and the corresponding Monte Carlo simulations were carried out for two different types of ionization chambers and the results were compared. Considering the uncertainty, which is about 0.7 % for the experimental values and about 1.0 % for the Monte Carlo simulation, the results of the simulation and the experiment coincide. Copyright © 2015. Published by Elsevier GmbH.
The mass storage testing laboratory at GSFC

NASA Technical Reports Server (NTRS)

Venkataraman, Ravi; Williams, Joel; Michaud, David; Gu, Heng; Kalluri, Atri; Hariharan, P. C.; Kobler, Ben; Behnke, Jeanne; Peavey, Bernard

1998-01-01

Industry-wide benchmarks exist for measuring the performance of processors (SPECmarks), and of database systems (Transaction Processing Council). Despite storage having become the dominant item in computing and IT (Information Technology) budgets, no such common benchmark is available in the mass storage field. Vendors and consultants provide services and tools for capacity planning and sizing, but these do not account for the complete set of metrics needed in today's archives. The availability of automated tape libraries, high-capacity RAID systems, and high- bandwidth interconnectivity between processor and peripherals has led to demands for services which traditional file systems cannot provide. File Storage and Management Systems (FSMS), which began to be marketed in the late 80's, have helped to some extent with large tape libraries, but their use has introduced additional parameters affecting performance. The aim of the Mass Storage Test Laboratory (MSTL) at Goddard Space Flight Center is to develop a test suite that includes not only a comprehensive check list to document a mass storage environment but also benchmark code. Benchmark code is being tested which will provide measurements for both baseline systems, i.e. applications interacting with peripherals through the operating system services, and for combinations involving an FSMS. The benchmarks are written in C, and are easily portable. They are initially being aimed at the UNIX Open Systems world. Measurements are being made using a Sun Ultra 170 Sparc with 256MB memory running Solaris 2.5.1 with the following configuration: 4mm tape stacker on SCSI 2 Fast/Wide; 4GB disk device on SCSI 2 Fast/Wide; and Sony Petaserve on Fast/Wide differential SCSI 2.
Benchmarking worker nodes using LHCb productions and comparing with HEPSpec06

NASA Astrophysics Data System (ADS)

Charpentier, P.

2017-10-01

In order to estimate the capabilities of a computing slot with limited processing time, it is necessary to know with a rather good precision its “power”. This allows for example pilot jobs to match a task for which the required CPU-work is known, or to define the number of events to be processed knowing the CPU-work per event. Otherwise one always has the risk that the task is aborted because it exceeds the CPU capabilities of the resource. It also allows a better accounting of the consumed resources. The traditional way the CPU power is estimated in WLCG since 2007 is using the HEP-Spec06 benchmark (HS06) suite that was verified at the time to scale properly with a set of typical HEP applications. However, the hardware architecture of processors has evolved, all WLCG experiments moved to using 64-bit applications and use different compilation flags from those advertised for running HS06. It is therefore interesting to check the scaling of HS06 with the HEP applications. For this purpose, we have been using CPU intensive massive simulation productions from the LHCb experiment and compared their event throughput to the HS06 rating of the worker nodes. We also compared it with a much faster benchmark script that is used by the DIRAC framework used by LHCb for evaluating at run time the performance of the worker nodes. This contribution reports on the finding of these comparisons: the main observation is that the scaling with HS06 is no longer fulfilled, while the fast benchmarks have a better scaling but are less precise. One can also clearly see that some hardware or software features when enabled on the worker nodes may enhance their performance beyond expectation from either benchmark, depending on external factors.
Multi-Complementary Model for Long-Term Tracking

PubMed Central

Zhang, Deng; Zhang, Junchang; Xia, Chenyang

2018-01-01

In recent years, video target tracking algorithms have been widely used. However, many tracking algorithms do not achieve satisfactory performance, especially when dealing with problems such as object occlusions, background clutters, motion blur, low illumination color images, and sudden illumination changes in real scenes. In this paper, we incorporate an object model based on contour information into a Staple tracker that combines the correlation filter model and color model to greatly improve the tracking robustness. Since each model is responsible for tracking specific features, the three complementary models combine for more robust tracking. In addition, we propose an efficient object detection model with contour and color histogram features, which has good detection performance and better detection efficiency compared to the traditional target detection algorithm. Finally, we optimize the traditional scale calculation, which greatly improves the tracking execution speed. We evaluate our tracker on the Object Tracking Benchmarks 2013 (OTB-13) and Object Tracking Benchmarks 2015 (OTB-15) benchmark datasets. With the OTB-13 benchmark datasets, our algorithm is improved by 4.8%, 9.6%, and 10.9% on the success plots of OPE, TRE and SRE, respectively, in contrast to another classic LCT (Long-term Correlation Tracking) algorithm. On the OTB-15 benchmark datasets, when compared with the LCT algorithm, our algorithm achieves 10.4%, 12.5%, and 16.1% improvement on the success plots of OPE, TRE, and SRE, respectively. At the same time, it needs to be emphasized that, due to the high computational efficiency of the color model and the object detection model using efficient data structures, and the speed advantage of the correlation filters, our tracking algorithm could still achieve good tracking speed. PMID:29425170
Do physiological measures predict selected CrossFit(®) benchmark performance?

PubMed

Butcher, Scotty J; Neyedly, Tyler J; Horvey, Karla J; Benko, Chad R

2015-01-01

CrossFit(®) is a new but extremely popular method of exercise training and competition that involves constantly varied functional movements performed at high intensity. Despite the popularity of this training method, the physiological determinants of CrossFit performance have not yet been reported. The purpose of this study was to determine whether physiological and/or muscle strength measures could predict performance on three common CrossFit "Workouts of the Day" (WODs). Fourteen CrossFit Open or Regional athletes completed, on separate days, the WODs "Grace" (30 clean and jerks for time), "Fran" (three rounds of thrusters and pull-ups for 21, 15, and nine repetitions), and "Cindy" (20 minutes of rounds of five pull-ups, ten push-ups, and 15 bodyweight squats), as well as the "CrossFit Total" (1 repetition max [1RM] back squat, overhead press, and deadlift), maximal oxygen consumption (VO2max), and Wingate anaerobic power/capacity testing. Performance of Grace and Fran was related to whole-body strength (CrossFit Total) (r=-0.88 and -0.65, respectively) and anaerobic threshold (r=-0.61 and -0.53, respectively); however, whole-body strength was the only variable to survive the prediction regression for both of these WODs (R (2)=0.77 and 0.42, respectively). There were no significant associations or predictors for Cindy. CrossFit benchmark WOD performance cannot be predicted by VO2max, Wingate power/capacity, or either respiratory compensation or anaerobic thresholds. Of the data measured, only whole-body strength can partially explain performance on Grace and Fran, although anaerobic threshold also exhibited association with performance. Along with their typical training, CrossFit athletes should likely ensure an adequate level of strength and aerobic endurance to optimize performance on at least some benchmark WODs.
Assessing Ecosystem Model Performance in Semiarid Systems

NASA Astrophysics Data System (ADS)

Thomas, A.; Dietze, M.; Scott, R. L.; Biederman, J. A.

2017-12-01

In ecosystem process modelling, comparing outputs to benchmark datasets observed in the field is an important way to validate models, allowing the modelling community to track model performance over time and compare models at specific sites. Multi-model comparison projects as well as models themselves have largely been focused on temperate forests and similar biomes. Semiarid regions, on the other hand, are underrepresented in land surface and ecosystem modelling efforts, and yet will be disproportionately impacted by disturbances such as climate change due to their sensitivity to changes in the water balance. Benchmarking models at semiarid sites is an important step in assessing and improving models' suitability for predicting the impact of disturbance on semiarid ecosystems. In this study, several ecosystem models were compared at a semiarid grassland in southwestern Arizona using PEcAn, or the Predictive Ecosystem Analyzer, an open-source eco-informatics toolbox ideal for creating the repeatable model workflows necessary for benchmarking. Models included SIPNET, DALEC, JULES, ED2, GDAY, LPJ-GUESS, MAESPA, CLM, CABLE, and FATES. Comparison between model output and benchmarks such as net ecosystem exchange (NEE) tended to produce high root mean square error and low correlation coefficients, reflecting poor simulation of seasonality and the tendency for models to create much higher carbon sources than observed. These results indicate that ecosystem models do not currently adequately represent semiarid ecosystem processes.
Consultants' Corner: System Performance. A Forum.

ERIC Educational Resources Information Center

Drabenstott, John, Ed.

1986-01-01

Five library consultants address issues that affect online system performance: options in system design that relate to diverse library requirements; criteria that most affect performance; benchmark tests and sizing criteria; minimalizing the risks of miscalculation; and the roles and responsibilities of vendors, libraries, and consultants.…
Benchmarking the performance of fixed-image receptor digital radiography systems. Part 2: system performance metric.

PubMed

Lee, Kam L; Bernardo, Michael; Ireland, Timothy A

2016-06-01

This is part two of a two-part study in benchmarking system performance of fixed digital radiographic systems. The study compares the system performance of seven fixed digital radiography systems based on quantitative metrics like modulation transfer function (sMTF), normalised noise power spectrum (sNNPS), detective quantum efficiency (sDQE) and entrance surface air kerma (ESAK). It was found that the most efficient image receptors (greatest sDQE) were not necessarily operating at the lowest ESAK. In part one of this study, sMTF is shown to depend on system configuration while sNNPS is shown to be relatively consistent across systems. Systems are ranked on their signal-to-noise ratio efficiency (sDQE) and their ESAK. Systems using the same equipment configuration do not necessarily have the same system performance. This implies radiographic practice at the site will have an impact on the overall system performance. In general, systems are more dose efficient at low dose settings.
Benchmarking in pathology: development of an activity-based costing model.

PubMed

Burnett, Leslie; Wilson, Roger; Pfeffer, Sally; Lowry, John

2012-12-01

Benchmarking in Pathology (BiP) allows pathology laboratories to determine the unit cost of all laboratory tests and procedures, and also provides organisational productivity indices allowing comparisons of performance with other BiP participants. We describe 14 years of progressive enhancement to a BiP program, including the implementation of 'avoidable costs' as the accounting basis for allocation of costs rather than previous approaches using 'total costs'. A hierarchical tree-structured activity-based costing model distributes 'avoidable costs' attributable to the pathology activities component of a pathology laboratory operation. The hierarchical tree model permits costs to be allocated across multiple laboratory sites and organisational structures. This has enabled benchmarking on a number of levels, including test profiles and non-testing related workload activities. The development of methods for dealing with variable cost inputs, allocation of indirect costs using imputation techniques, panels of tests, and blood-bank record keeping, have been successfully integrated into the costing model. A variety of laboratory management reports are produced, including the 'cost per test' of each pathology 'test' output. Benchmarking comparisons may be undertaken at any and all of the 'cost per test' and 'cost per Benchmarking Complexity Unit' level, 'discipline/department' (sub-specialty) level, or overall laboratory/site and organisational levels. We have completed development of a national BiP program. An activity-based costing methodology based on avoidable costs overcomes many problems of previous benchmarking studies based on total costs. The use of benchmarking complexity adjustment permits correction for varying test-mix and diagnostic complexity between laboratories. Use of iterative communication strategies with program participants can overcome many obstacles and lead to innovations.

Gaming in risk-adjusted mortality rates: effect of misclassification of risk factors in the benchmarking of cardiac surgery risk-adjusted mortality rates.

PubMed

Siregar, Sabrina; Groenwold, Rolf H H; Versteegh, Michel I M; Noyez, Luc; ter Burg, Willem Jan P P; Bots, Michiel L; van der Graaf, Yolanda; van Herwerden, Lex A

2013-03-01

Upcoding or undercoding of risk factors could affect the benchmarking of risk-adjusted mortality rates. The aim was to investigate the effect of misclassification of risk factors on the benchmarking of mortality rates after cardiac surgery. A prospective cohort was used comprising all adult cardiac surgery patients in all 16 cardiothoracic centers in The Netherlands from January 1, 2007, to December 31, 2009. A random effects model, including the logistic European system for cardiac operative risk evaluation (EuroSCORE) was used to benchmark the in-hospital mortality rates. We simulated upcoding and undercoding of 5 selected variables in the patients from 1 center. These patients were selected randomly (nondifferential misclassification) or by the EuroSCORE (differential misclassification). In the random patients, substantial misclassification was required to affect benchmarking: a 1.8-fold increase in prevalence of the 4 risk factors changed an underperforming center into an average performing one. Upcoding of 1 variable required even more. When patients with the greatest EuroSCORE were upcoded (ie, differential misclassification), a 1.1-fold increase was sufficient: moderate left ventricular function from 14.2% to 15.7%, poor left ventricular function from 8.4% to 9.3%, recent myocardial infarction from 7.9% to 8.6%, and extracardiac arteriopathy from 9.0% to 9.8%. Benchmarking using risk-adjusted mortality rates can be manipulated by misclassification of the EuroSCORE risk factors. Misclassification of random patients or of single variables will have little effect. However, limited upcoding of multiple risk factors in high-risk patients can greatly influence benchmarking. To minimize "gaming," the prevalence of all risk factors should be carefully monitored. Copyright © 2013 The American Association for Thoracic Surgery. Published by Mosby, Inc. All rights reserved.
Benchmark Simulation Model No 2: finalisation of plant layout and default control strategy.

PubMed

Nopens, I; Benedetti, L; Jeppsson, U; Pons, M-N; Alex, J; Copp, J B; Gernaey, K V; Rosen, C; Steyer, J-P; Vanrolleghem, P A

2010-01-01

The COST/IWA Benchmark Simulation Model No 1 (BSM1) has been available for almost a decade. Its primary purpose has been to create a platform for control strategy benchmarking of activated sludge processes. The fact that the research work related to the benchmark simulation models has resulted in more than 300 publications worldwide demonstrates the interest in and need of such tools within the research community. Recent efforts within the IWA Task Group on "Benchmarking of control strategies for WWTPs" have focused on an extension of the benchmark simulation model. This extension aims at facilitating control strategy development and performance evaluation at a plant-wide level and, consequently, includes both pretreatment of wastewater as well as the processes describing sludge treatment. The motivation for the extension is the increasing interest and need to operate and control wastewater treatment systems not only at an individual process level but also on a plant-wide basis. To facilitate the changes, the evaluation period has been extended to one year. A prolonged evaluation period allows for long-term control strategies to be assessed and enables the use of control handles that cannot be evaluated in a realistic fashion in the one week BSM1 evaluation period. In this paper, the finalised plant layout is summarised and, as was done for BSM1, a default control strategy is proposed. A demonstration of how BSM2 can be used to evaluate control strategies is also given.
Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pais Pitta de Lacerda Ruivo, Tiago; Bernabeu Altayo, Gerard; Garzoglio, Gabriele

2014-11-11

has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization towards minimizing the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux’s Hypervisor as well as the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56more » virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as w a MPI-intensive application (the HPL Linpack benchmark).« less
Field Performance of Photovoltaic Systems in the Tucson Desert

NASA Astrophysics Data System (ADS)

Orsburn, Sean; Brooks, Adria; Cormode, Daniel; Greenberg, James; Hardesty, Garrett; Lonij, Vincent; Salhab, Anas; St. Germaine, Tyler; Torres, Gabe; Cronin, Alexander

2011-10-01

At the Tucson Electric Power (TEP) solar test yard, over 20 different grid-connected photovoltaic (PV) systems are being tested. The goal at the TEP solar test yard is to measure and model real-world performance of PV systems and to benchmark new technologies such as holographic concentrators. By studying voltage and current produced by the PV systems as a function of incident irradiance, and module temperature, we can compare our measurements of field-performance (in a harsh desert environment) to manufacturer specifications (determined under laboratory conditions). In order to measure high-voltage and high-current signals, we designed and built reliable, accurate sensors that can handle extreme desert temperatures. We will present several benchmarks of sensors in a controlled environment, including shunt resistors and Hall-effect current sensors, to determine temperature drift and accuracy. Finally we will present preliminary field measurements of PV performance for several different PV technologies.
Statistical process control as a tool for controlling operating room performance: retrospective analysis and benchmarking.

PubMed

Chen, Tsung-Tai; Chang, Yun-Jau; Ku, Shei-Ling; Chung, Kuo-Piao

2010-10-01

There is much research using statistical process control (SPC) to monitor surgical performance, including comparisons among groups to detect small process shifts, but few of these studies have included a stabilization process. This study aimed to analyse the performance of surgeons in operating room (OR) and set a benchmark by SPC after stabilized process. The OR profile of 499 patients who underwent laparoscopic cholecystectomy performed by 16 surgeons at a tertiary hospital in Taiwan during 2005 and 2006 were recorded. SPC was applied to analyse operative and non-operative times using the following five steps: first, the times were divided into two segments; second, they were normalized; third, they were evaluated as individual processes; fourth, the ARL(0) was calculated;, and fifth, the different groups (surgeons) were compared. Outliers were excluded to ensure stability for each group and to facilitate inter-group comparison. The results showed that in the stabilized process, only one surgeon exhibited a significantly shorter total process time (including operative time and non-operative time). In this study, we use five steps to demonstrate how to control surgical and non-surgical time in phase I. There are some measures that can be taken to prevent skew and instability in the process. Also, using SPC, one surgeon can be shown to be a real benchmark. © 2010 Blackwell Publishing Ltd.
EVA Human Health and Performance Benchmarking Study Overview and Development of a Microgravity Protocol

NASA Technical Reports Server (NTRS)

Norcross, Jason; Jarvis, Sarah; Bekdash, Omar; Cupples, Scott; Abercromby, Andrew

2017-01-01

The primary objective of this study is to develop a protocol to reliably characterize human health and performance metrics for individuals working inside various EVA suits under realistic spaceflight conditions. Expected results and methodologies developed during this study will provide the baseline benchmarking data and protocols with which future EVA suits and suit configurations (e.g., varied pressure, mass, center of gravity [CG]) and different test subject populations (e.g., deconditioned crewmembers) may be reliably assessed and compared. Results may also be used, in conjunction with subsequent testing, to inform fitness-for-duty standards, as well as design requirements and operations concepts for future EVA suits and other exploration systems.
libvdwxc: a library for exchange-correlation functionals in the vdW-DF family

NASA Astrophysics Data System (ADS)

Hjorth Larsen, Ask; Kuisma, Mikael; Löfgren, Joakim; Pouillon, Yann; Erhart, Paul; Hyldgaard, Per

2017-09-01

We present libvdwxc, a general library for evaluating the energy and potential for the family of vdW-DF exchange-correlation functionals. libvdwxc is written in C and provides an efficient implementation of the vdW-DF method and can be interfaced with various general-purpose DFT codes. Currently, the Gpaw and Octopus codes implement interfaces to libvdwxc. The present implementation emphasizes scalability and parallel performance, and thereby enables ab initio calculations of nanometer-scale complexes. The numerical accuracy is benchmarked on the S22 test set whereas parallel performance is benchmarked on ligand-protected gold nanoparticles ({{Au}}144{({{SC}}11{{NH}}25)}60) up to 9696 atoms.
Seeding for pervasively overlapping communities

NASA Astrophysics Data System (ADS)

Lee, Conrad; Reid, Fergal; McDaid, Aaron; Hurley, Neil

2011-06-01

In some social and biological networks, the majority of nodes belong to multiple communities. It has recently been shown that a number of the algorithms specifically designed to detect overlapping communities do not perform well in such highly overlapping settings. Here, we consider one class of these algorithms, those which optimize a local fitness measure, typically by using a greedy heuristic to expand a seed into a community. We perform synthetic benchmarks which indicate that an appropriate seeding strategy becomes more important as the extent of community overlap increases. We find that distinct cliques provide the best seeds. We find further support for this seeding strategy with benchmarks on a Facebook network and the yeast interactome.
Benchmarking and performance analysis of the CM-2. [SIMD computer

NASA Technical Reports Server (NTRS)

Myers, David W.; Adams, George B., II

1988-01-01

A suite of benchmarking routines testing communication, basic arithmetic operations, and selected kernel algorithms written in LISP and PARIS was developed for the CM-2. Experiment runs are automated via a software framework that sequences individual tests, allowing for unattended overnight operation. Multiple measurements are made and treated statistically to generate well-characterized results from the noisy values given by cm:time. The results obtained provide a comparison with similar, but less extensive, testing done on a CM-1. Tests were chosen to aid the algorithmist in constructing fast, efficient, and correct code on the CM-2, as well as gain insight into what performance criteria are needed when evaluating parallel processing machines.
Health and productivity management: establishing key performance measures, benchmarks, and best practices.

PubMed

Goetzel, R Z; Guindon, A M; Turshen, I J; Ozminkowski, R J

2001-01-01

Major areas considered under the rubric of health and productivity management (HPM) in American business include absenteeism, employee turnover, and the use of medical, disability, and workers' compensation programs. Until recently, few normative data existed for most HPM areas. To meet the need for normative information in HPM, a series of Consortium Benchmarking Studies were conducted. In the most recent application of the study, 1998 HPM costs, incidence, duration, and other program data were collected from 43 employers on almost one million workers. The median HPM costs for these organizations were $9992 per employee, which were distributed among group health (47%), turnover (37%), unscheduled absence (8%), nonoccupational disability (5%), and workers' compensation programs (3%). Achieving "best-practice" levels of performance (operationally defined as the 25th percentile for program expenditures in each HPM area) would realize savings of $2562 per employee (a 26% reduction). The results indicate substantial opportunities for improvement through effective coordination and management of HPM programs. Examples of best-practice activities collated from on-site visits to "benchmark" organizations are also reviewed.
Reactor Physics Measurements and Benchmark Specifications for Oak Ridge Highly Enriched Uranium Sphere (ORSphere)

DOE PAGES

Marshall, Margaret A.

2014-11-04

In the early 1970s Dr. John T. Mihalczo (team leader), J.J. Lynn, and J.R. Taylor performed experiments at the Oak Ridge Critical Experiments Facility (ORCEF) with highly enriched uranium (HEU) metal (called Oak Ridge Alloy or ORALLOY) in an effort to recreate GODIVA I results with greater accuracy than those performed at Los Alamos National Laboratory in the 1950s. The purpose of the Oak Ridge ORALLOY Sphere (ORSphere) experiments was to estimate the unreflected and unmoderated critical mass of an idealized sphere of uranium metal corrected to a density, purity, and enrichment such that it could be compared with themore » GODIVA I experiments. Additionally, various material reactivity worths, the surface material worth coefficient, the delayed neutron fraction, the prompt neutron decay constant, relative fission density, and relative neutron importance were all measured. The critical assembly, material reactivity worths, the surface material worth coefficient, and the delayed neutron fraction were all evaluated as benchmark experiment measurements. The reactor physics measurements are the focus of this paper; although for clarity the critical assembly benchmark specifications are briefly discussed.« less
Benchmarking MARS (accident management software) with the Browns Ferry fire

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dawson, S.M.; Liu, L.Y.; Raines, J.C.

1992-01-01

The MAAP Accident Response System (MARS) is a userfriendly computer software developed to provide management and engineering staff with the most needed insights, during actual or simulated accidents, of the current and future conditions of the plant based on current plant data and its trends. To demonstrate the reliability of the MARS code in simulatng a plant transient, MARS is being benchmarked with the available reactor pressure vessel (RPV) pressure and level data from the Browns Ferry fire. The MRS software uses the Modular Accident Analysis Program (MAAP) code as its basis to calculate plant response under accident conditions. MARSmore » uses a limited set of plant data to initialize and track the accidnt progression. To perform this benchmark, a simulated set of plant data was constructed based on actual report data containing the information necessary to initialize MARS and keep track of plant system status throughout the accident progression. The initial Browns Ferry fire data were produced by performing a MAAP run to simulate the accident. The remaining accident simulation used actual plant data.« less
The Star Schema Benchmark and Augmented Fact Table Indexing

NASA Astrophysics Data System (ADS)

O'Neil, Patrick; O'Neil, Elizabeth; Chen, Xuedong; Revilak, Stephen

We provide a benchmark measuring star schema queries retrieving data from a fact table with Where clause column restrictions on dimension tables. Clustering is crucial to performance with modern disk technology, since retrievals with filter factors down to 0.0005 are now performed most efficiently by sequential table search rather than by indexed access. DB2’s Multi-Dimensional Clustering (MDC) provides methods to "dice" the fact table along a number of orthogonal "dimensions", but only when these dimensions are columns in the fact table. The diced cells cluster fact rows on several of these "dimensions" at once so queries restricting several such columns can access crucially localized data, with much faster query response. Unfortunately, columns of dimension tables of a star schema are not usually represented in the fact table. In this paper, we show a simple way to adjoin physical copies of dimension columns to the fact table, dicing data to effectively cluster query retrieval, and explain how such dicing can be achieved on database products other than DB2. We provide benchmark measurements to show successful use of this methodology on three commercial database products.
TRIPOLI-4® - MCNP5 ITER A-lite neutronic model benchmarking

NASA Astrophysics Data System (ADS)

Jaboulay, J.-C.; Cayla, P.-Y.; Fausser, C.; Lee, Y.-K.; Trama, J.-C.; Li-Puma, A.

2014-06-01

The aim of this paper is to present the capability of TRIPOLI-4®, the CEA Monte Carlo code, to model a large-scale fusion reactor with complex neutron source and geometry. In the past, numerous benchmarks were conducted for TRIPOLI-4® assessment on fusion applications. Experiments (KANT, OKTAVIAN, FNG) analysis and numerical benchmarks (between TRIPOLI-4® and MCNP5) on the HCLL DEMO2007 and ITER models were carried out successively. In this previous ITER benchmark, nevertheless, only the neutron wall loading was analyzed, its main purpose was to present MCAM (the FDS Team CAD import tool) extension for TRIPOLI-4®. Starting from this work a more extended benchmark has been performed about the estimation of neutron flux, nuclear heating in the shielding blankets and tritium production rate in the European TBMs (HCLL and HCPB) and it is presented in this paper. The methodology to build the TRIPOLI-4® A-lite model is based on MCAM and the MCNP A-lite model (version 4.1). Simplified TBMs (from KIT) have been integrated in the equatorial-port. Comparisons of neutron wall loading, flux, nuclear heating and tritium production rate show a good agreement between the two codes. Discrepancies are mainly included in the Monte Carlo codes statistical error.
Benchmark Evaluation of the HTR-PROTEUS Absorber Rod Worths (Core 4)

DOE Office of Scientific and Technical Information (OSTI.GOV)

John D. Bess; Leland M. Montierth

2014-06-01

PROTEUS was a zero-power research reactor at the Paul Scherrer Institute (PSI) in Switzerland. The critical assembly was constructed from a large graphite annulus surrounding a central cylindrical cavity. Various experimental programs were investigated in PROTEUS; during the years 1992 through 1996, it was configured as a pebble-bed reactor and designated HTR-PROTEUS. Various critical configurations were assembled with each accompanied by an assortment of reactor physics experiments including differential and integral absorber rod measurements, kinetics, reaction rate distributions, water ingress effects, and small sample reactivity effects [1]. Four benchmark reports were previously prepared and included in the March 2013 editionmore » of the International Handbook of Evaluated Reactor Physics Benchmark Experiments (IRPhEP Handbook) [2] evaluating eleven critical configurations. A summary of that effort was previously provided [3] and an analysis of absorber rod worth measurements for Cores 9 and 10 have been performed prior to this analysis and included in PROTEUS-GCR-EXP-004 [4]. In the current benchmark effort, absorber rod worths measured for Core Configuration 4, which was the only core with a randomly-packed pebble loading, have been evaluated for inclusion as a revision to the HTR-PROTEUS benchmark report PROTEUS-GCR-EXP-002.« less
Testing New Programming Paradigms with NAS Parallel Benchmarks

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

2000-01-01

Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with increasing complexity of real applications. Technologies have been developing to aim at scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. Recent years new effort has been made in defining new parallel programming paradigms. The best examples are: HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplify the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy, however, due to the immaturity of compiler technology its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promisses portability in a heterogeneous environment and offers possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage was applied to several benchmarks, noticeably BT and SP, resulting in better sequential performance. In order to overcome the lack of an HPF performance model and guide the development of the HPF codes, we employed an empirical performance model for several primitives found in the benchmarks. We encountered a few limitations of HPF, such as lack of supporting the "REDISTRIBUTION" directive and no easy way to handle irregular computation. The parallelization with OpenMP directives was done at the outer-most loop level to achieve the largest granularity. The performance of six HPF and OpenMP benchmarks is compared with their MPI counterparts for the Class-A problem size in the figure in next page. These results were obtained on an SGI Origin2000 (195MHz) with MIPSpro-f77 compiler 7.2.1 for OpenMP and MPI codes and PGI pghpf-2.4.3 compiler with MPI interface for HPF programs.
Nocturnal Visual Orientation in Flying Insects: A Benchmark for the Design of Vision-Based Sensors in Micro-Aerial Vehicles

DTIC Science & Technology

2011-03-09

anu.edu.au Nocturnal visual orientation in flying insects: a benchmark for the design of vision-based sensors in Micro-Aerial Vehicles Report...9 10 Technical horizon sensors Over the past few years, a remarkable proliferation of designs for micro-aerial vehicles (MAVs) has occurred...possible elevations, it may severely degrade the performance of sensors by local saturation. Therefore it is necessary to find a method whereby the effect
Preparation and benchmarking of ANSL-V cross sections for advanced neutron source reactor studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arwood, J.W.; Ford, W.E. III; Greene, N.M.

1987-01-01

Validity of selected data from the fine-group neutron library was satisfactorily tested in performance parameter calculations for the BAPL-1, TRX-1, and ZEEP-1 thermal lattice benchmarks. BAPL-2 is an H/sub 2/O moderated, uranium oxide lattice; TRX-1 is an H/sub 2/O moderated, 1.31 weight percent enriched uranium metal lattice; ZEEP-1 is a D/sub 2/O-moderated, natural uranium lattice. 26 refs., 1 tab.
Validation of Shielding Analysis Capability of SuperMC with SINBAD

NASA Astrophysics Data System (ADS)

Chen, Chaobin; Yang, Qi; Wu, Bin; Han, Yuncheng; Song, Jing

2017-09-01

Abstract: The shielding analysis capability of SuperMC was validated with the Shielding Integral Benchmark Archive Database (SINBAD). The SINBAD was compiled by RSICC and NEA, it includes numerous benchmark experiments performed with the D-T fusion neutron source facilities of OKTAVIAN, FNS, IPPE, etc. The results from SuperMC simulation were compared with experimental data and MCNP results. Very good agreement with deviation lower than 1% was achieved and it suggests that SuperMC is reliable in shielding calculation.
Evaluating the Information Power Grid using the NAS Grid Benchmarks

NASA Technical Reports Server (NTRS)

VanderWijngaartm Rob F.; Frumkin, Michael A.

2004-01-01

The NAS Grid Benchmarks (NGB) are a collection of synthetic distributed applications designed to rate the performance and functionality of computational grids. We compare several implementations of the NGB to determine programmability and efficiency of NASA's Information Power Grid (IPG), whose services are mostly based on the Globus Toolkit. We report on the overheads involved in porting existing NGB reference implementations to the IPG. No changes were made to the component tasks of the NGB can still be improved.

Aquarius Project: Research in the System Architecture of Accelerators for the High Performance Execution of Logic Programs.

DTIC Science & Technology

1991-05-31

benchmarks ............ .... . .. .. . . .. 220 Appendix G : Source code of the Aquarius Prolog compiler ........ . 224 Chapter I Introduction "You’re given...notation, a tool that is used throughout the compiler’s implementation. Appendix F lists the source code of the C and Prolog benchmarks. Appendix G lists the...source code of the compilcr. 5 "- standard form Prolog / a-sfomadon / head umrvln Convert to tmeikernel Prol g vrans~fonaon 1symbolic execution
Physician Performance Assessment: Prevention of Cardiovascular Disease

ERIC Educational Resources Information Center

Lipner, Rebecca S.; Weng, Weifeng; Caverzagie, Kelly J.; Hess, Brian J.

2013-01-01

Given the rising burden of healthcare costs, both patients and healthcare purchasers are interested in discerning which physicians deliver quality care. We proposed a methodology to assess physician clinical performance in preventive cardiology care, and determined a benchmark for minimally acceptable performance. We used data on eight…
Performance Measures, Benchmarking and Value.

ERIC Educational Resources Information Center

McGregor, Felicity

This paper discusses performance measurement in university libraries, based on examples from the University of Wollongong (UoW) in Australia. The introduction highlights the integration of information literacy into the curriculum and the outcomes of a 1998 UoW student satisfaction survey. The first section considers performance indicators in…
Construct Validity of Fresh Frozen Human Cadaver as a Training Model in Minimal Access Surgery

PubMed Central

Macafee, David; Pranesh, Nagarajan; Horgan, Alan F.

2012-01-01

Background: The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance. Methods: Junior surgical trainees, novices (<3 laparoscopic procedure performed) in laparoscopic surgery, performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts score was used to define a benchmark. Mann-Whitney U test was used for construct validity analysis and 1-sample t test to compare performances of the novice group with the benchmark safe score. Results: Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition. Conclusion: Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions. PMID:23318058
Validation of the WIMSD4M cross-section generation code with benchmark results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deen, J.R.; Woodruff, W.L.; Leal, L.E.

1995-01-01

The WIMSD4 code has been adopted for cross-section generation in support of the Reduced Enrichment Research and Test Reactor (RERTR) program at Argonne National Laboratory (ANL). Subsequently, the code has undergone several updates, and significant improvements have been achieved. The capability of generating group-collapsed micro- or macroscopic cross sections from the ENDF/B-V library and the more recent evaluation, ENDF/B-VI, in the ISOTXS format makes the modified version of the WIMSD4 code, WIMSD4M, very attractive, not only for the RERTR program, but also for the reactor physics community. The intent of the present paper is to validate the WIMSD4M cross-section librariesmore » for reactor modeling of fresh water moderated cores. The results of calculations performed with multigroup cross-section data generated with the WIMSD4M code will be compared against experimental results. These results correspond to calculations carried out with thermal reactor benchmarks of the Oak Ridge National Laboratory (ORNL) unreflected HEU critical spheres, the TRX LEU critical experiments, and calculations of a modified Los Alamos HEU D{sub 2}O moderated benchmark critical system. The benchmark calculations were performed with the discrete-ordinates transport code, TWODANT, using WIMSD4M cross-section data. Transport calculations using the XSDRNPM module of the SCALE code system are also included. In addition to transport calculations, diffusion calculations with the DIF3D code were also carried out, since the DIF3D code is used in the RERTR program for reactor analysis and design. For completeness, Monte Carlo results of calculations performed with the VIM and MCNP codes are also presented.« less
Validation of the WIMSD4M cross-section generation code with benchmark results

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leal, L.C.; Deen, J.R.; Woodruff, W.L.

1995-02-01

The WIMSD4 code has been adopted for cross-section generation in support of the Reduced Enrichment for Research and Test (RERTR) program at Argonne National Laboratory (ANL). Subsequently, the code has undergone several updates, and significant improvements have been achieved. The capability of generating group-collapsed micro- or macroscopic cross sections from the ENDF/B-V library and the more recent evaluation, ENDF/B-VI, in the ISOTXS format makes the modified version of the WIMSD4 code, WIMSD4M, very attractive, not only for the RERTR program, but also for the reactor physics community. The intent of the present paper is to validate the procedure to generatemore » cross-section libraries for reactor analyses and calculations utilizing the WIMSD4M code. To do so, the results of calculations performed with group cross-section data generated with the WIMSD4M code will be compared against experimental results. These results correspond to calculations carried out with thermal reactor benchmarks of the Oak Ridge National Laboratory(ORNL) unreflected critical spheres, the TRX critical experiments, and calculations of a modified Los Alamos highly-enriched heavy-water moderated benchmark critical system. The benchmark calculations were performed with the discrete-ordinates transport code, TWODANT, using WIMSD4M cross-section data. Transport calculations using the XSDRNPM module of the SCALE code system are also included. In addition to transport calculations, diffusion calculations with the DIF3D code were also carried out, since the DIF3D code is used in the RERTR program for reactor analysis and design. For completeness, Monte Carlo results of calculations performed with the VIM and MCNP codes are also presented.« less
Multiscale Modelling of Relationships between Protein Classes and Drug Behavior Across all Diseases Using the CANDO Platform

PubMed Central

Samudrala, Ram

2015-01-01

We have examined the effect of eight different protein classes (channels, GPCRs, kinases, ligases, nuclear receptors, proteases, phosphatases, transporters) on the benchmarking performance of the CANDO drug discovery and repurposing platform (http://protinfo.org/cando). The first version of the CANDO platform utilizes a matrix of predicted interactions between 48278 proteins and 3733 human ingestible compounds (including FDA approved drugs and supplements) that map to 2030 indications/diseases using a hierarchical chem and bio-informatic fragment based docking with dynamics protocol (> one billion predicted interactions considered). The platform uses similarity of compound-proteome interaction signatures as indicative of similar functional behavior and benchmarking accuracy is calculated across 1439 indications/diseases with more than one approved drug. The CANDO platform yields a significant correlation (0.99, p-value < 0.0001) between the number of proteins considered and benchmarking accuracy obtained indicating the importance of multitargeting for drug discovery. Average benchmarking accuracies range from 6.2 % to 7.6 % for the eight classes when the top 10 ranked compounds are considered, in contrast to a range of 5.5 % to 11.7 % obtained for the comparison/control sets consisting of 10, 100, 1000, and 10000 single best performing proteins. These results are generally two orders of magnitude better than the average accuracy of 0.2% obtained when randomly generated (fully scrambled) matrices are used. Different indications perform well when different classes are used but the best accuracies (up to 11.7% for the top 10 ranked compounds) are achieved when a combination of classes are used containing the broadest distribution of protein folds. Our results illustrate the utility of the CANDO approach and the consideration of different protein classes for devising indication specific protocols for drug repurposing as well as drug discovery. PMID:25694071
Assessment of composite motif discovery methods.

PubMed

Klepper, Kjetil; Sandve, Geir K; Abul, Osman; Johansen, Jostein; Drablos, Finn

2008-02-26

Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery - discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represents a suitable variety of challenges to most methods for module discovery.
Intravenous contrast extravasation during CT: a national data registry and practice quality improvement initiative.

PubMed

Dykes, Thomas M; Bhargavan-Chatfield, Mythreyi; Dyer, Raymond B

2015-02-01

Establish 3 performance benchmarks for intravenous contrast extravasation during CT examinations: extravasation frequency, distribution of extravasation volumes, and severity of injury. Evaluate the effectiveness of implementing practice quality improvement (PQI) methodology in improving performance for these 3 benchmarks. The Society of Abdominal Radiology and ACR developed a registry collecting data for contrast extravasation events. The project includes a PQI initiative allowing for process improvement. As of December 2013, a total of 58 radiology practices have participated in this project, and 32 practices have completed the 2-cycle PQI. There were a total of 454,497 contrast-enhanced CT exams and 1,085 extravasation events. The average extravasation rate is 0.24%. The median extravasation rate is 0.21%. Most extravasations (82.9%) were between 10 mL and 99 mL. The majority of injuries, 94.6%, are mild in severity, with 4.7% having moderate and 0.8% having severe injuries. Data from practices that completed the PQI process showed a change in the average extravasation rate from 0.28% in the first 6 months to 0.23% in the second 6 months, and the median extravasation rate dropped from 0.25% to 0.16%, neither statistically significant. The distribution of extravasation volumes and the severity of injury did not change between the first and second measurement periods. National performance benchmarks for contrast extravasation rate, distribution of volumes of extravasate, and distribution of severity of injury are established through this multi-institutional practice registry. The application of PQI failed to have a statistically significant positive impact on any of the 3 benchmarks. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Dynamic vehicle routing with time windows in theory and practice.

PubMed

Yang, Zhiwei; van Osta, Jan-Paul; van Veen, Barry; van Krevelen, Rick; van Klaveren, Richard; Stam, Andries; Kok, Joost; Bäck, Thomas; Emmerich, Michael

2017-01-01

The vehicle routing problem is a classical combinatorial optimization problem. This work is about a variant of the vehicle routing problem with dynamically changing orders and time windows. In real-world applications often the demands change during operation time. New orders occur and others are canceled. In this case new schedules need to be generated on-the-fly. Online optimization algorithms for dynamical vehicle routing address this problem but so far they do not consider time windows. Moreover, to match the scenarios found in real-world problems adaptations of benchmarks are required. In this paper, a practical problem is modeled based on the procedure of daily routing of a delivery company. New orders by customers are introduced dynamically during the working day and need to be integrated into the schedule. A multiple ant colony algorithm combined with powerful local search procedures is proposed to solve the dynamic vehicle routing problem with time windows. The performance is tested on a new benchmark based on simulations of a working day. The problems are taken from Solomon's benchmarks but a certain percentage of the orders are only revealed to the algorithm during operation time. Different versions of the MACS algorithm are tested and a high performing variant is identified. Finally, the algorithm is tested in situ: In a field study, the algorithm schedules a fleet of cars for a surveillance company. We compare the performance of the algorithm to that of the procedure used by the company and we summarize insights gained from the implementation of the real-world study. The results show that the multiple ant colony algorithm can get a much better solution on the academic benchmark problem and also can be integrated in a real-world environment.
Optimization of Deep Drilling Performance--Development and Benchmark Testing of Advanced Diamond Product Drill Bits & HP/HT Fluids to Significantly Improve Rates of Penetration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alan Black; Arnis Judzis

2003-10-01

This document details the progress to date on the OPTIMIZATION OF DEEP DRILLING PERFORMANCE--DEVELOPMENT AND BENCHMARK TESTING OF ADVANCED DIAMOND PRODUCT DRILL BITS AND HP/HT FLUIDS TO SIGNIFICANTLY IMPROVE RATES OF PENETRATION contract for the year starting October 2002 through September 2002. The industry cost shared program aims to benchmark drilling rates of penetration in selected simulated deep formations and to significantly improve ROP through a team development of aggressive diamond product drill bit--fluid system technologies. Overall the objectives are as follows: Phase 1--Benchmark ''best in class'' diamond and other product drilling bits and fluids and develop concepts for amore » next level of deep drilling performance; Phase 2--Develop advanced smart bit--fluid prototypes and test at large scale; and Phase 3--Field trial smart bit--fluid concepts, modify as necessary and commercialize products. Accomplishments to date include the following: 4Q 2002--Project started; Industry Team was assembled; Kick-off meeting was held at DOE Morgantown; 1Q 2003--Engineering meeting was held at Hughes Christensen, The Woodlands Texas to prepare preliminary plans for development and testing and review equipment needs; Operators started sending information regarding their needs for deep drilling challenges and priorities for large-scale testing experimental matrix; Aramco joined the Industry Team as DEA 148 objectives paralleled the DOE project; 2Q 2003--Engineering and planning for high pressure drilling at TerraTek commenced; 3Q 2003--Continuation of engineering and design work for high pressure drilling at TerraTek; Baker Hughes INTEQ drilling Fluids and Hughes Christensen commence planning for Phase 1 testing--recommendations for bits and fluids.« less
Summary of ORSphere critical and reactor physics measurements

NASA Astrophysics Data System (ADS)

Marshall, Margaret A.; Bess, John D.

2017-09-01

In the early 1970s Dr. John T. Mihalczo (team leader), J.J. Lynn, and J.R. Taylor performed experiments at the Oak Ridge Critical Experiments Facility (ORCEF) with highly enriched uranium (HEU) metal (called Oak Ridge Alloy or ORALLOY) to recreate GODIVA I results with greater accuracy than those performed at Los Alamos National Laboratory in the 1950s. The purpose of the Oak Ridge ORALLOY Sphere (ORSphere) experiments was to estimate the unreflected and unmoderated critical mass of an idealized sphere of uranium metal corrected to a density, purity, and enrichment such that it could be compared with the GODIVA I experiments. This critical configuration has been evaluated. Preliminary results were presented at ND2013. Since then, the evaluation was finalized and judged to be an acceptable benchmark experiment for the International Criticality Safety Benchmark Experiment Project (ICSBEP). Additionally, reactor physics measurements were performed to determine surface button worths, central void worth, delayed neutron fraction, prompt neutron decay constant, fission density and neutron importance. These measurements have been evaluated and found to be acceptable experiments and are discussed in full detail in the International Handbook of Evaluated Reactor Physics Benchmark Experiments. The purpose of this paper is to summarize all the evaluated critical and reactor physics measurements evaluations.
Struever Bros. Eccles & Rouse: Mixed, Humid Climate Region 40+% Energy Savings

DOE Office of Scientific and Technical Information (OSTI.GOV)

None

2009-08-14

This case study describes the Overlook at Clipper Hill community homes in Baltimore, Maryland, which have been designed to perform 40+% better than the Building America benchmark house by combining aesthetics, energy performance, and sustainability.
RASSP signal processing architectures

NASA Astrophysics Data System (ADS)

Shirley, Fred; Bassett, Bob; Letellier, J. P.

1995-06-01

The rapid prototyping of application specific signal processors (RASSP) program is an ARPA/tri-service effort to dramatically improve the process by which complex digital systems, particularly embedded signal processors, are specified, designed, documented, manufactured, and supported. The domain of embedded signal processing was chosen because it is important to a variety of military and commercial applications as well as for the challenge it presents in terms of complexity and performance demands. The principal effort is being performed by two major contractors, Lockheed Sanders (Nashua, NH) and Martin Marietta (Camden, NJ). For both, improvements in methodology are to be exercised and refined through the performance of individual 'Demonstration' efforts. The Lockheed Sanders' Demonstration effort is to develop an infrared search and track (IRST) processor. In addition, both contractors' results are being measured by a series of externally administered (by Lincoln Labs) six-month Benchmark programs that measure process improvement as a function of time. The first two Benchmark programs are designing and implementing a synthetic aperture radar (SAR) processor. Our demonstration team is using commercially available VME modules from Mercury Computer to assemble a multiprocessor system scalable from one to hundreds of Intel i860 microprocessors. Custom modules for the sensor interface and display driver are also being developed. This system implements either proprietary or Navy owned algorithms to perform the compute-intensive IRST function in real time in an avionics environment. Our Benchmark team is designing custom modules using commercially available processor ship sets, communication submodules, and reconfigurable logic devices. One of the modules contains multiple vector processors optimized for fast Fourier transform processing. Another module is a fiberoptic interface that accepts high-rate input data from the sensors and provides video-rate output data to a display. This paper discusses the impact of simulation on choosing signal processing algorithms and architectures, drawing from the experiences of the Demonstration and Benchmark inter-company teams at Lockhhed Sanders, Motorola, Hughes, and ISX.
Quantifying risk and benchmarking performance in the adult intensive care unit.

PubMed

Higgins, Thomas L

2007-01-01

Morbidity, mortality, and length-of-stay outcomes in patients receiving critical care are difficult to interpret unless they are risk-stratified for diagnosis, presenting severity of illness, and other patient characteristics. Acuity adjustment systems for adults include the Acute Physiology And Chronic Health Evaluation (APACHE), the Mortality Probability Model (MPM), and the Simplified Acute Physiology Score (SAPS). All have recently been updated and recalibrated to reflect contemporary results. Specialized scores are also available for patient subpopulations where general acuity scores have drawbacks. Demand for outcomes data is likely to grow with pay-for-performance initiatives as well as for routine clinical, prognostic, administrative, and research applications. It is important for clinicians to understand how these scores are derived and how they are properly applied to quantify patient severity of illness and benchmark intensive care unit performance.
An efficient parallel algorithm for matrix-vector multiplication

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hendrickson, B.; Leland, R.; Plimpton, S.

The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/[radical]p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in themore » well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.« less
Rethinking key–value store for parallel I/O optimization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kougkas, Anthony; Eslami, Hassan; Sun, Xian-He

2015-01-26

Key-value stores are being widely used as the storage system for large-scale internet services and cloud storage systems. However, they are rarely used in HPC systems, where parallel file systems are the dominant storage solution. In this study, we examine the architecture differences and performance characteristics of parallel file systems and key-value stores. We propose using key-value stores to optimize overall Input/Output (I/O) performance, especially for workloads that parallel file systems cannot handle well, such as the cases with intense data synchronization or heavy metadata operations. We conducted experiments with several synthetic benchmarks, an I/O benchmark, and a real application.more » We modeled the performance of these two systems using collected data from our experiments, and we provide a predictive method to identify which system offers better I/O performance given a specific workload. The results show that we can optimize the I/O performance in HPC systems by utilizing key-value stores.« less
Proposal of an innovative benchmark for comparison of the performance of contactless digitizers

NASA Astrophysics Data System (ADS)

Iuliano, Luca; Minetola, Paolo; Salmi, Alessandro

2010-10-01

Thanks to the improving performances of 3D optical scanners, in terms of accuracy and repeatability, reverse engineering applications have extended from CAD model design or reconstruction to quality control. Today, contactless digitizing devices constitute a good alternative to coordinate measuring machines (CMMs) for the inspection of certain parts. The German guideline VDI/VDE 2634 is the only reference to evaluate whether 3D optical measuring systems comply with the declared or required performance specifications. Nevertheless it is difficult to compare the performance of different scanners referring to such a guideline. An adequate novel benchmark is proposed in this paper: focusing on the inspection of production tools (moulds), the innovative test piece was designed using common geometries and free-form surfaces. The reference part is intended to be employed for the evaluation of the performance of several contactless digitizing devices in computer-aided inspection, considering dimensional and geometrical tolerances as well as other quantitative and qualitative criteria.
42 CFR 425.502 - Calculating the ACO quality performance score.

Code of Federal Regulations, 2014 CFR

2014-10-01

... four domains: (i) Patient/care giver experience. (ii) Care coordination/Patient safety. (iii... year. (1) For the first performance year of an ACO's agreement, CMS defines the quality performance... a point scale for the measures. (2)(i) CMS will define the quality benchmarks using fee-for-service...
Benchmarking Deep Learning Models on Large Healthcare Datasets.

PubMed

Purushotham, Sanjay; Meng, Chuizheng; Che, Zhengping; Liu, Yan

2018-06-04

Deep learning models (aka Deep Neural Networks) have revolutionized many fields including computer vision, natural language processing, speech recognition, and is being increasingly used in clinical healthcare applications. However, few works exist which have benchmarked the performance of the deep learning models with respect to the state-of-the-art machine learning models and prognostic scoring systems on publicly available healthcare datasets. In this paper, we present the benchmarking results for several clinical prediction tasks such as mortality prediction, length of stay prediction, and ICD-9 code group prediction using Deep Learning models, ensemble of machine learning models (Super Learner algorithm), SAPS II and SOFA scores. We used the Medical Information Mart for Intensive Care III (MIMIC-III) (v1.4) publicly available dataset, which includes all patients admitted to an ICU at the Beth Israel Deaconess Medical Center from 2001 to 2012, for the benchmarking tasks. Our results show that deep learning models consistently outperform all the other approaches especially when the 'raw' clinical time series data is used as input features to the models. Copyright © 2018 Elsevier Inc. All rights reserved.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.